General Notions of Statistical Depth Function

The document discusses general notions of statistical depth functions. It introduces four desirable properties for depth functions: affine invariance, maximality at center, monotonicity relative to deepest point, and vanishing at infinity. It evaluates several existing depth functions against these properties, finding that the halfspace depth possesses all four properties, while other functions like simplicial depth may lack some properties depending on the distribution. The document provides a framework for systematically selecting depth functions based on how well they satisfy important mathematical properties.

The Annals of Statistics

2000, Vol. 28, No. 2, 461–482

GENERAL NOTIONS OF STATISTICAL DEPTH FUNCTION

By Yijun Zuo and Robert Serfling¹

Arizona State University and University of Texas

Statistical depth functions are being formulated ad hoc with increasing
popularity in nonparametric inference for multivariate data. Here we introduce
several general structures for depth functions, classify many existing examples
as special cases, and establish results on the possession, or lack thereof, of
four key properties desirable for depth functions in general. Roughly speaking,
these properties may be described as: affine invariance, maximality at center,
monotonicity relative to deepest point, and vanishing at infinity. This provides
a more systematic basis for selection of a depth function. In particular, from
these and other considerations it is found that the halfspace depth behaves very
well overall in comparison with various competitors.

1. Introduction. Statistical depth functions have become increasingly
pursued as a useful tool in nonparametric inference for multivariate data.
Roughly speaking, for a distribution $P$ in $\mathbb{R}^d$, a corresponding depth
function is any function $D(x; P)$ which provides a $P$-based center-outward
ordering of points $x \in \mathbb{R}^d$. Tukey (1975) proposed a “halfspace” depth and
suggested its role in defining multivariate analogues of univariate rank and
order statistics via depth-induced “contours.” The halfspace depth (HD) of a
point $x$ in $\mathbb{R}^d$ with respect to a probability measure $P$ on $\mathbb{R}^d$ is defined
as the minimum probability mass carried by any closed halfspace containing $x$,
that is,

$$HD(x; P) = \inf\{P(H) : H \text{ a closed halfspace},\ x \in H\}, \quad x \in \mathbb{R}^d.$$

Based on this depth, Donoho and Gasko (1992) studied multivariate location
estimators and Yeh and Singh (1997) developed confidence regions. Properties
of the corresponding contours have been studied by various authors including
Eddy (1985), Nolan (1992), Donoho and Gasko (1992) and Massé and Theodor-
escu (1994). See Carrizosa (1996) for a characterization of halfspace depth
relating to problems of facility location analysis in the operations research
literature.
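
As a rough illustration of the halfspace depth definition, the following Python sketch approximates its sample version in $\mathbb{R}^2$ by replacing $P$ with the empirical measure of a data set and scanning a finite grid of directions. The function name `halfspace_depth_2d`, the parameter `n_dirs`, and the numerical tolerance are assumptions made for this illustration. A finite grid can only give an upper bound on the exact infimum, so this is a sketch rather than an exact algorithm.

```python
import numpy as np

def halfspace_depth_2d(x, data, n_dirs=360):
    """Approximate the sample halfspace depth of x in R^2.

    For each unit direction u, the closed halfspace {y : u.(y - x) >= 0}
    has x on its boundary. The returned value is the smallest empirical
    mass over the scanned directions (an upper bound on the exact depth).
    """
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    angles = np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False)
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])  # (n_dirs, 2)
    proj = (data - x) @ dirs.T                                # (n, n_dirs)
    # Small tolerance so that points on the boundary count as inside.
    mass = (proj >= -1e-12).mean(axis=0)
    return float(mass.min())

# Four corners of a square: the center is deep, a far point has depth 0.
corners = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
depth_center = halfspace_depth_2d([0, 0], corners)
depth_far = halfspace_depth_2d([10, 0], corners)
```

In this toy data set every closed halfspace through the origin contains at least two of the four corners, so the center has depth 1/2, while a point outside the convex hull has depth 0.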
The “center-outward ordering” interpretation of a depth function suggests
that (i) a relevant notion of “center” is available, and (ii) points near the cen-
ter should have higher depth. From this standpoint, the “center” consists of
the set of points globally maximizing depth, in which case a depth function
should tend to ignore multimodality features of the underlying distribution
P. If, on the other hand, sensitivity to multimodality is desirable, then the
“center” should include local maxima as well, in which case the notion of
center-outward ordering becomes compromised and “inner” points can have
low depth. It is thus important, in considering depth functions, to make a
choice on this issue. In the present paper we opt for the center to be given
by global maxima, with low depth corresponding to large distance from the
center. For further discussion, see Remark A.1 in Appendix A.

Received December 1998; revised December 1999.
¹Supported by NSF Grant DMS-97-05209.
AMS 1991 subject classifications. Primary 62H05; secondary 62G20.
Key words and phrases. Statistical depth functions, halfspace depth, simplicial
depth, multivariate symmetry.

Liu (1990) introduced a notion of “simplicial” depth and corresponding
multivariate location estimators. Namely, the simplicial depth (SD) of a point
$x$ in $\mathbb{R}^d$ with respect to a probability measure $P$ on $\mathbb{R}^d$ is defined to be
the probability that $x$ belongs to a random simplex in $\mathbb{R}^d$, that is,

$$SD(x; P) = P(x \in S[X_1, \ldots, X_{d+1}]), \quad x \in \mathbb{R}^d,$$

where $X_1, \ldots, X_{d+1}$ is a random sample from $P$ and $S[x_1, \ldots, x_{d+1}]$ denotes
the $d$-dimensional simplex with vertices $x_1, \ldots, x_{d+1}$, that is, the set of all
points in $\mathbb{R}^d$ that are convex combinations of $x_1, \ldots, x_{d+1}$.
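
To make the simplicial depth definition concrete, here is a minimal Python sketch of a sample version in $\mathbb{R}^2$: the fraction of closed triangles, formed from distinct triples of data points, that contain $x$. The names `simplicial_depth_2d` and `in_closed_triangle` are hypothetical. The sketch assumes the data are in general position (no three points collinear), since the orientation test below is not reliable for degenerate triangles.

```python
from itertools import combinations

def _orient(a, b, c):
    """Twice the signed area of the triangle (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_closed_triangle(x, a, b, c):
    """True if x lies in the closed, nondegenerate triangle with vertices a, b, c."""
    d1, d2, d3 = _orient(a, b, x), _orient(b, c, x), _orient(c, a, x)
    has_neg = min(d1, d2, d3) < 0
    has_pos = max(d1, d2, d3) > 0
    return not (has_neg and has_pos)   # all orientations share a sign (or vanish)

def simplicial_depth_2d(x, data):
    """Fraction of triangles over distinct data triples that contain x."""
    triples = list(combinations(data, 3))
    hits = sum(in_closed_triangle(x, a, b, c) for a, b, c in triples)
    return hits / len(triples)

points = [(0, 0), (4, 0), (0, 4), (1, 1)]
depth_inside = simplicial_depth_2d((1, 2), points)
depth_outside = simplicial_depth_2d((10, 10), points)
```

Averaging over distinct triples gives a U-statistic; averaging over all ordered triples with repetition would instead give the V-statistic form.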
Liu and Singh (1993) considered the above two depth functions and two
more, “Mahalanobis” depth and “majority” depth, which they applied in for-
mulating a “quality index” for use in connection with manufacturing processes.
Rousseeuw and Hubert (1999) introduced “regression depth” and Rousseeuw
and Ruts (1996), Ruts and Rousseeuw (1996) and Rousseeuw and Struyf (1998)
studied computing issues concerning depth functions and contours. Liu, Pare-
lius and Singh (1999) considered seven examples of depth function, including a
“convex hull peeling” version and a “likelihood” type, and developed methodol-
ogy for their practical use in exploratory statistical analysis. Likelihood-based
depth functions have also been considered by Fraiman and Meloche (1996) and
Fraiman, Liu and Meloche (1997). Koshevoy and Mosler (1997) introduced a
“zonoid” depth function based on “zonoid trimming.” Bartoszyński, Pearl and
Lawrence (1997) introduced a depth function based on interpoint distances in
the context of a multivariate goodness-of-fit test. Depth functions also arise
in the theory of social choice [see Caplin and Nalebuff (1988, 1991a, b)]. Non-
parametric notions of multivariate “scatter measure” and “more scattered”
based on general depth functions have been formulated and studied by Zuo
and Serfling (2000a). Mizera (1998) has introduced a differential calculus for
depth functions. Finally, Vardi and Zhang (1999) have introduced a method
for constructing depth functions from notions of multivariate median.
Depth functions thus have been introduced ad hoc in great variety, without
regard to whether they meet any particular set of criteria that ought to be
satisfied. Consequently, there is no systematic basis for preferring one such
function over another. In the present paper, we address this issue by asking:
(i) What desirable properties should a statistical depth function possess?
(ii) What constructive approaches lead to attractive depth functions?
(iii) Do existing depth functions possess all desired properties?
In Section 2 we list several desirable properties first introduced by Liu
(1990), on the basis of which we formulate a general definition of “statistical
depth function.” Roughly speaking, these properties may be described as:
affine invariance, maximality at center, monotonicity relative to deepest point,
and vanishing at infinity. Also, several distinct structures for construction of
depth functions are introduced and investigated with respect to possession
of these properties, and a number of presently popular depth functions are
classified with respect to these different structural types.
In Section 3 we evaluate and critically compare, from the above perspectives
as well as from robustness considerations, a number of existing depth func-
tions and some new ones introduced via the above-mentioned constructions. It
is found that the halfspace depth and a closely related “projection depth,” both
of which reflect projection pursuit methodology, are distinctly more attractive
than popular competitors.
Various supplementary notes are provided in Appendix A, including dis-
cussion of almost sure uniform convergence of sample depth functions to their
population counterparts. Finally, proofs of the results in Section 2 are provided
in Appendix B.

2. General notions of statistical depth. Here we consider general notions
of depth function on $\mathbb{R}^d$, defined with respect to arbitrary distributions
which may be either continuous or discrete. In the spirit of Liu (1990), Sec-
tion 2.1 presents four desirable properties that an ideal depth function should
possess. In Section 2.2 the halfspace and simplicial depth functions are exam-
ined with respect to these criteria, and it is found that the halfspace depth
possesses all four properties (see Theorem 2.1), whereas the simplicial depth
lacks certain properties in some cases (see Remark 2.1). In Section 2.3, sev-
eral general structures for depth functions are introduced and investigated
with respect to the four properties (see Theorems 2.2–2.11). Also, familiar ex-
isting versions of depth function as well as some new ones are reviewed in the
context of these structures.

2.1. Desirable properties and a general definition. We confine attention
to depth functions that are nonnegative and bounded. In order that a depth
function serve most effectively as a tool providing a center-outward ordering
of points in $\mathbb{R}^d$, it should ideally satisfy the following further properties, which
we state informally first and then more precisely in Definition 2.1.
P1. Affine invariance. The depth of a point $x \in \mathbb{R}^d$ should not depend on the
underlying coordinate system or, in particular, on the scales of the underlying
measurements.
P2. Maximality at center. For a distribution having a uniquely defined “center”
(e.g., the point of symmetry with respect to some notion of symmetry), the
depth function should attain maximum value at this center.
P3. Monotonicity relative to deepest point. As a point $x \in \mathbb{R}^d$ moves away from
the “deepest point” (the point at which the depth function attains maximum
value; in particular, for a symmetric distribution, the center) along any fixed
ray through the center, the depth at $x$ should decrease monotonically.
P4. Vanishing at infinity. The depth of a point $x$ should approach zero as $\|x\|$
approaches infinity.
We note that P1–P4 are introduced and investigated for the simplicial depth
in Liu (1990).
We now formally define “statistical depth function.” Denote by $\mathcal{F}$ the class
of distributions on the Borel sets of $\mathbb{R}^d$ and by $F_\xi$ the distribution of a given
random vector $\xi$.

Definition 2.1. Let the mapping $D(\cdot\,; \cdot) : \mathbb{R}^d \times \mathcal{F} \to \mathbb{R}^1$ be bounded,
nonnegative, and satisfy P1–P4. That is, assume:
(i) $D(Ax + b; F_{AX+b}) = D(x; F_X)$ holds for any random vector $X$ in $\mathbb{R}^d$,
any $d \times d$ nonsingular matrix $A$, and any $d$-vector $b$;
(ii) $D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$ holds for any $F \in \mathcal{F}$ having center $\theta$;
(iii) for any $F \in \mathcal{F}$ having deepest point $\theta$, $D(x; F) \le D(\theta + \alpha(x - \theta); F)$
holds for $\alpha \in [0, 1]$; and
(iv) $D(x; F) \to 0$ as $\|x\| \to \infty$, for each $F \in \mathcal{F}$.
Then $D(\cdot\,; F)$ is called a statistical depth function.

A sample version of $D(x; P)$, denoted by $D_n(x) \equiv D(x; \hat{P}_n)$, may be defined
by replacing $P$ by a suitable empirical measure $\hat{P}_n$.
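
The four conditions and the sample version can be checked by hand in the simplest setting. The Python sketch below is an illustration under stated assumptions, not part of the original development: it takes $d = 1$, uses the empirical measure of a small integer sample, and evaluates the sample halfspace depth, for which the infimum over closed halfspaces containing $x$ reduces to the smaller of the two tail masses at $x$. The function name `halfspace_depth_1d` is hypothetical.

```python
import numpy as np

def halfspace_depth_1d(x, sample):
    """Sample halfspace depth on the real line.

    The closed halfspaces containing x are (-inf, t] with t >= x and
    [t, inf) with t <= x, so the minimum empirical mass is attained at t = x.
    """
    s = np.asarray(sample, dtype=float)
    return float(min((s <= x).mean(), (s >= x).mean()))

sample = [1, 2, 3, 4, 5]
a, b = -3.0, 7.0                                   # an arbitrary affine map, a != 0
transformed = [a * v + b for v in sample]

depth_at_2 = halfspace_depth_1d(2, sample)                       # (i): compare with
depth_at_2_mapped = halfspace_depth_1d(a * 2 + b, transformed)   # the mapped point
depth_at_median = halfspace_depth_1d(3, sample)                  # (ii): the center
depths_along_ray = [halfspace_depth_1d(x, sample) for x in (3, 3.5, 4, 5, 6)]  # (iii)
depth_far_away = halfspace_depth_1d(1e6, sample)                 # (iv)
```

For this sample the depth is unchanged by the affine map, is largest at the median, does not increase along the ray from 3 to 6, and is 0 far from the data.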
In the above we have used the term “center” to denote a point of symmetry.
Various notions of multivariate symmetry are possible. In particular, a standard
notion widely used in the literature is that a random vector $X$ in $\mathbb{R}^d$ is
centrally symmetric about $\theta$ if $X - \theta \overset{d}{=} \theta - X$, where “$\overset{d}{=}$” denotes “equal in
distribution.” A broader notion due to Liu (1990) defines $X$ to be angularly
symmetric about $\theta$ if $(X - \theta)/\|X - \theta\|$ is centrally symmetric about the origin.
A still broader notion, which we here introduce, defines $X$ to be halfspace
symmetric about $\theta$ if $P(X \in H) \ge 1/2$ for every closed halfspace $H$ containing
$\theta$. In an obvious terminology, it is easily established that C-symmetry $\Rightarrow$
A-symmetry $\Rightarrow$ H-symmetry. For characterizations of H-symmetry motivating
its relevance in nonparametric multivariate location inference, see Zuo and
Serfling (2000c). Thus the most favorable manifestation of property P2 for a
depth function $D(\cdot\,; \cdot)$ is that maximality at center should hold for $D(\cdot\,; F)$ as
generally as possible, that is, for every H-symmetric $F$. A similar remark
holds with respect to property P3. [For further comparison of angular and
halfspace symmetry, and of these with notions in Beran and Millar (1997), see
Remark A.2 in Appendix A.]
One might view property P4 as rather too strict and thus instead consider
some weaker variant. If, for example, the depth function has a lower limit
$L > 0$, one might normalize the depth function by subtracting $L$. But when $L$
depends on $F$ (as for the majority depth when $d \ge 2$), this is computationally
and technically very burdensome.
Or one might require merely that $R(x; F) \to 0$ as $\|x\| \to \infty$, where
$R(x; F) = P_F(\{y : D(y; F) \le D(x; F)\})$, the proportion of the distribution $F$ having
depth ≤ the depth of $x$. [This quantity is used by Liu and Singh (1993) in
defining their “quality index.”] Under P2 and P3, however, convergence of
$R(x; F)$ to 0 is seen to hold already and thus does not offer anything productive
in addition to P2 and P3.
In Zuo and Serfling [(2000b), Theorem 3.1(iv)], the present form of P4 is useful
in establishing compactness of depth-trimmed regions. Further, it plays a
role in using truncation arguments to establish almost sure uniform convergence
of sample depth functions to population versions.

2.2. A further look at the halfspace and simplicial depth functions. We
now investigate whether the halfspace depth function $HD(x; P)$ and the
simplicial depth function $SD(x; P)$ are “statistical depth functions” in the sense
of Definition 2.1. These are treated, respectively, in the following theorem and
remark.

Theorem 2.1. The halfspace depth function $HD(x; P)$ is a statistical depth
function in the sense of Definition 2.1.

Remark 2.1. For continuous angularly symmetric distributions, it follows
from results of Liu (1990) that the simplicial depth function $SD(\cdot\,; P)$ is a
statistical depth function in the sense of Definition 2.1. For discrete distributions,
however, $SD(x; P)$ can for H-symmetric distributions fail to satisfy the
“maximality” property P2 and even for C-symmetric distributions fail to satisfy the
“monotonicity” property P3. This is seen from the following counterexamples.

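Counterexample 1. Let $d = 1$ and $P(X = 0) = 1/5$, $P(X = \pm 1) = 1/5$,
and $P(X = \pm 2) = 1/5$, so that each of the five support points carries mass 1/5.
Then clearly $X$ is centrally symmetric about 0. It is not difficult to show that
$SD(1/2; P) = 12/25$ and $SD(1; P) = 15/25$. The depth thus increases as $x$ moves
outward from 1/2 to 1 along a ray from the center, violating P3.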
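
The two values in Counterexample 1 can be reproduced by direct enumeration. The Python sketch below assumes, as the definition of SD suggests for $d = 1$, that the random simplex is the closed segment between two independent draws $X_1, X_2$ from $P$ (degenerate segments included). It uses exact rational arithmetic, and the function name `simplicial_depth_1d` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def simplicial_depth_1d(x, support, probs):
    """P(min(X1, X2) <= x <= max(X1, X2)) for X1, X2 i.i.d. on a finite support."""
    total = Fraction(0)
    atoms = list(zip(support, probs))
    for (a, pa), (b, pb) in product(atoms, repeat=2):
        if min(a, b) <= x <= max(a, b):
            total += pa * pb
    return total

support = [-2, -1, 0, 1, 2]
probs = [Fraction(1, 5)] * 5

sd_half = simplicial_depth_1d(Fraction(1, 2), support, probs)   # 12/25
sd_one = simplicial_depth_1d(1, support, probs)                 # 15/25
```

Equivalently, $SD(1/2; P) = 1 - (3/5)^2 - (2/5)^2 = 12/25$ and $SD(1; P) = 1 - (3/5)^2 - (1/5)^2 = 15/25$, by subtracting the probabilities that both draws fall strictly on the same side of the point.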

Counterexample 2. Let $d = 2$ and $P(X = (\pm 1, 0)) = P(X = (\pm 2, 0)) =
P(X = (0, \pm 1)) = 1/6$. Then $X$ is centrally symmetric about $(0, 0)$ and

$$SD((1, 0); P) - SD((1/2, 0); P) = 3! \cdot 2 \cdot (1/6)^3 = 1/18 > 0,$$

again violating P3.

Counterexample 3. Let $d = 2$ and $P(X = \theta = (0, 0)) = 19/40$,
$P(X = A = (-1, 1)) = 3/40$, and $P(X = B = (-1, -1)) = P(X = C = (1, 0)) = 1/40$.
Let $\overline{B\theta}$ intersect $\overline{AC}$ at $D$, $x$ be a point inside the triangle $\triangle A\theta D$, and
$P(X = x) = 16/40$. Then it is not difficult to verify, based on results established in
Zuo and Serfling (2000c), that $X$ is H-symmetric about $\theta$, which is thus the
center of the distribution. However, we have

$$SD(x; P) - SD(\theta; P) = \frac{3!}{40^3}\,(2 \times 16 \times 1 \times 3 - 3 \times 1 \times 19 + 1 \times 1 \times 19) > 0,$$

that is, the “maximality” property P2 fails to hold.

For the above two well-known notions of depth function, we thus have found
that one behaves well overall, while in some discrete cases the other is not
completely satisfactory. This leads one to investigate whether other attractive
statistical depth functions can be defined, indeed to explore general structures
for such functions and to seek to identify the more favorable types.

2.3. General structures for statistical depth functions. Four general
structures for construction of statistical depth functions are introduced and
investigated with respect to properties P1–P4. Various existing depth functions are
classified according to these types.
2.3.1. Type A depth functions. Let $h(x; x_1, \ldots, x_r)$ be any bounded
nonnegative function which in some sense measures the closeness of $x$ to the
points $x_1, \ldots, x_r$. A corresponding Type A depth function is then defined by
the average closeness of $x$ to a random sample of size $r$:
(1) $D(x; P) = E\,h(x; X_1, \ldots, X_r),$
where $X_1, \ldots, X_r$ is a random sample from $P$. For such depth functions
the corresponding sample versions $D(x; \hat{P}_n)$ turn out to be U-statistics or
V-statistics.
Taking $r = d + 1$ and $h(x; x_1, \ldots, x_{d+1}) = I\{x \in S[x_1, \ldots, x_{d+1}]\}$, we
obtain the simplicial depth, whose properties have been covered in Section 2.2.
Another example is the following.

Example 2.1 [Majority depth (Singh, 1991)]. For given points $x_1, \ldots, x_d$
in $\mathbb{R}^d$ which determine a unique hyperplane containing themselves, there
correspond two closed halfspaces with this hyperplane as boundary. Denote by
$H^P_{x_1, \ldots, x_d}$ the one which carries probability mass ≥ 1/2 under the distribution
$P$ on $\mathbb{R}^d$. Then the majority depth function is defined by
(2) $MJD(x; P) = P(x \in H^P_{X_1, \ldots, X_d}), \quad x \in \mathbb{R}^d,$
where $X_1, \ldots, X_d$ is a random sample from $P$. Clearly, the majority depth
function is of Type A with $r = d$ and $h(x; x_1, \ldots, x_d) \equiv I\{x \in H^P_{x_1, \ldots, x_d}\}$.
Let us explore the majority depth function with respect to properties P1–P4.
Clearly P1 is satisfied. Also, as remarked by Liu and Singh (1993), for any
A-symmetric distribution $P$, $MJD(x; P)$ decreases monotonically as $x$ moves
away from the center along any fixed ray originating from the center, that is,
P2 and P3 hold. Indeed, the following result establishes this more generally.

Theorem 2.2. For H-symmetric distributions $P$, $MJD(x; P)$ satisfies P2
and P3.

The majority depth fails to satisfy property P4, however. As a counterexample,
take $d = 2$ and define $P$ by $P(X = (\pm 1, 0)) = 1/3$ and $P(X = (0, 1)) = 1/3$.
Then it is easy to see that $\lim_{\|x\| \to \infty} MJD(x; P) = 2/3$. As another
counterexample, for $d = 1$ one can show for any $P$ that
$MJD(x; P) = 1/2 + \min\{P(x), 1 - P(x)\} \to 1/2$ as $|x| \to \infty$.

2.3.2. Type B depth functions. Let $h(x; x_1, \ldots, x_r)$ be an unbounded
nonnegative function which measures in some sense the distance of $x$ from the
points $x_1, \ldots, x_r$. A corresponding Type B depth function is then defined by
(3) $D(x; F) \equiv (1 + E\,h(x; X_1, \ldots, X_r))^{-1},$
for $X_1, \ldots, X_r$ a random sample from $F$. Closely related to (3), but not
equivalent, is the structure $E[(1 + h(x; X_1, \ldots, X_r))^{-1}]$, which is a further example
of the Type A structure. For the sake of tractability, we prefer the form (3).
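
The relation between form (3) and the related Type A structure can be made explicit. As a side remark that is not taken from the original text: since $t \mapsto (1 + t)^{-1}$ is convex on $[0, \infty)$, Jensen's inequality orders the two forms. Writing $h$ for $h(x; X_1, \ldots, X_r)$,

```latex
\[
  E\bigl[(1 + h)^{-1}\bigr] \;\ge\; \bigl(1 + E\,h\bigr)^{-1},
\]
```

with equality when $h$ is almost surely constant. The Type A variant therefore never assigns a smaller value than form (3) at the same point, and the two generally differ.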
As a measure of dispersion of a point cloud $\{x, x_1, \ldots, x_r\}$, the function
$h(x; x_1, \ldots, x_r)$ possibly may not possess the affine invariance property P1,
but in many such cases it satisfies at least rigid-body invariance, that is,
$h(Ax + b; Ax_1 + b, \ldots, Ax_r + b) = h(x; x_1, \ldots, x_r)$ for any $d \times d$ orthogonal
matrix $A$ and any vector $b \in \mathbb{R}^d$. For example, see the $L^p$ depth treated
below. Or, a suitable modification of the function $h$ sometimes yields an affine
invariant version, as in the case of the “simplicial volume depth” as well as the
$L^2$ depth treated below. Regarding properties P2–P4, Type B depth functions
are rather well behaved, as shown by the following examples and theorems.
Example 2.2 (Simplicial volume depth). Take

$$h(x; x_1, \ldots, x_d) = \Delta^\alpha(S[x, x_1, \ldots, x_d]),$$

where $\Delta(S[x, x_1, \ldots, x_d])$ denotes the volume of the $d$-dimensional simplex
$S[x, x_1, \ldots, x_d]$ and $\alpha > 0$. This is a measure of the dispersion of the point
cloud $\{x, x_1, \ldots, x_d\}$ and accordingly
(4) $\left(1 + E\left[\Delta^\alpha(S[x, X_1, \ldots, X_d])\right]\right)^{-1}$
defines a Type B depth function. This depth function usually is not affine
invariant, however, since

$$\Delta^\alpha(S[Ax + b, Ax_1 + b, \ldots, Ax_d + b]) = |\det(A)|^\alpha\, \Delta^\alpha(S[x, x_1, \ldots, x_d]),$$

where $b$ is any vector in $\mathbb{R}^d$, and the determinant $\det(A)$ of the nonsingular
matrix $A$ is not always equal to 1. This problem can be rectified by a
modification. Rather than (4), we define the simplicial volume depth function by
(5) $SVD^\alpha(x; F) \equiv \left(1 + E\left[\left(\dfrac{\Delta(S[x, X_1, \ldots, X_d])}{\sqrt{\det(\Sigma)}}\right)^{\alpha}\right]\right)^{-1},$
where $\Sigma$ is the covariance matrix of $F$. This version is affine invariant.
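
A sample analogue of the affine invariant version (5) can be sketched in Python for $d = 2$, where the simplex is a triangle with one vertex at $x$. The sketch below makes several assumptions that are not in the text: the sample covariance stands in for the covariance matrix of $F$, the expectation is replaced by an average over distinct pairs of data points, and the function name `simplicial_volume_depth_2d` is hypothetical.

```python
from itertools import combinations
import numpy as np

def simplicial_volume_depth_2d(x, data, alpha=1.0):
    """Sample sketch of the simplicial volume depth in R^2.

    Averages (triangle area / sqrt(det(cov)))**alpha over triangles with
    vertices x, X_i, X_j for distinct data points X_i, X_j.
    """
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    scale = np.sqrt(np.linalg.det(np.cov(data, rowvar=False)))
    diffs = data - x
    values = []
    for i, j in combinations(range(len(data)), 2):
        u, v = diffs[i], diffs[j]
        area = 0.5 * abs(u[0] * v[1] - u[1] * v[0])   # area of triangle (x, X_i, X_j)
        values.append((area / scale) ** alpha)
    return 1.0 / (1.0 + float(np.mean(values)))

rng = np.random.default_rng(0)
cloud = rng.standard_normal((30, 2))
A = np.array([[2.0, 1.0], [0.5, 3.0]])   # an arbitrary nonsingular matrix
b = np.array([1.0, -2.0])
x0 = np.array([0.3, -0.4])

depth_original = simplicial_volume_depth_2d(x0, cloud)
depth_mapped = simplicial_volume_depth_2d(A @ x0 + b, cloud @ A.T + b)
```

Under the affine map each triangle area is multiplied by $|\det(A)|$ and the square root of the determinant of the sample covariance is multiplied by the same factor, so the two depth values agree up to rounding.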

Remark 2.2. Oja (1983) introduced for C-symmetric distributions a family
of location measures utilizing simplicial volume, as follows. For each $\alpha > 0$, a
location measure $\mu_\alpha : \mathcal{F} \to \mathbb{R}^d$ is defined by

$$E\left[\Delta^\alpha(S[\mu_\alpha(F), X_1, \ldots, X_d])\right] = \inf_{\mu \in \mathbb{R}^d} E\left[\Delta^\alpha(S[\mu, X_1, \ldots, X_d])\right].$$

However, he did not develop it into a depth function, nor did he consider the
affine invariant version (5).

Example 2.3 [$L^p$ depth ($p > 0$)]. Another way to measure distance is via
the $L^p$ norm $\|\cdot\|_p$. Taking $h(x; x_1) = \|x - x_1\|_p$, a corresponding Type B depth
function is given by
(6) $L^pD(x; F) \equiv (1 + E\|x - X\|_p)^{-1}.$
Note that $L^pD(x; F)$ generally does not possess the affine invariance property,
however, since

$$E\|Ax + b - (AX + b)\|_p = E\|A(x - X)\|_p,$$

which is not equal to $E\|x - X\|_p$ for every nonsingular matrix $A$. On the other
hand, taking $p = 2$, it is easy to see that $L^2D(x; F)$ is rigid-body invariant.
Moreover, a modification of the $L^2$ norm yields an affine invariant version.
Following Rao (1988), for a positive definite $d \times d$ matrix $M$, define a norm
$\|\cdot\|_M$ as
(7) $\|x\|_M \equiv \sqrt{x' M x}, \quad \forall x \in \mathbb{R}^d.$
Then, for $p = 2$, the depth function defined in (6) may be modified to an affine
invariant version,
(8) $\widetilde{L^2}D(x; F) \equiv (1 + E\|x - X\|_{\Sigma^{-1}})^{-1},$
where $\Sigma$ is the covariance matrix of $F$.
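
The difference between (6) and the affine invariant modification (8) is easy to see numerically. The following Python sketch gives sample analogues of both, under the assumptions that the expectation is replaced by a sample mean and that the sample covariance stands in for the covariance matrix of $F$. The function names `lp_depth` and `l2_depth_affine` are hypothetical.

```python
import numpy as np

def lp_depth(x, data, p=2.0):
    """Sample analogue of (6): 1 / (1 + mean_i ||x - X_i||_p)."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    dists = np.linalg.norm(data - x, ord=p, axis=1)
    return 1.0 / (1.0 + float(dists.mean()))

def l2_depth_affine(x, data):
    """Sample analogue of (8), using the norm (7) with M = inverse covariance."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    cov = np.cov(data, rowvar=False)             # sample covariance (assumption)
    diffs = data - x                             # rows are X_i - x
    solved = np.linalg.solve(cov, diffs.T).T     # rows are cov^{-1} (X_i - x)
    dists = np.sqrt(np.einsum("ij,ij->i", diffs, solved))
    return 1.0 / (1.0 + float(dists.mean()))

rng = np.random.default_rng(0)
cloud = rng.standard_normal((50, 2))
A = np.array([[2.0, 1.0], [0.5, 3.0]])   # an arbitrary nonsingular matrix
b = np.array([1.0, -2.0])
x0 = np.array([0.3, -0.4])

plain_before = lp_depth(x0, cloud)
plain_after = lp_depth(A @ x0 + b, cloud @ A.T + b)
affine_before = l2_depth_affine(x0, cloud)
affine_after = l2_depth_affine(A @ x0 + b, cloud @ A.T + b)
```

The pair `affine_before` and `affine_after` agrees up to rounding, while `plain_before` and `plain_after` will generally differ for a non-orthogonal $A$.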

Under some conditions on $h(x; x_1, \ldots, x_r)$, Type B depth functions
necessarily satisfy P2 and P3, as shown in the following two results.

Theorem 2.3. Suppose $\theta$ is the point of symmetry of a distribution $F$ with
respect to a given notion of symmetry. Then Type B depth functions $D(x; F)$
possess the “maximality at center” property P2 if:
(i) $h(x + b; x_1 + b, \ldots, x_r + b) = h(x; x_1, \ldots, x_r)$,
(ii) $h(-x; -x_1, \ldots, -x_r) = h(x; x_1, \ldots, x_r)$,
(iii) $h(x; x_1, \ldots, x_r)$ is convex in the argument $x$ and
(iv) for $x$, $b$ and $x_1, \ldots, x_r$ arbitrary vectors in $\mathbb{R}^d$ and $X_1, \ldots, X_r$ a random
sample from $F$, the set

$$\arg\inf_{x \in \mathbb{R}^d} E\,h(x; X_1 - \theta, \ldots, X_r - \theta) \;\cap\; \arg\inf_{x \in \mathbb{R}^d} E\,h(x; \theta - X_1, \ldots, \theta - X_r)$$

is nonempty.

Remark 2.3. For any distribution C-symmetric about a point $\theta$ in $\mathbb{R}^d$, there
is always a point $y \in \mathbb{R}^d$ satisfying condition (iv) above.

Theorem 2.4. If $h(x; x_1, \ldots, x_r)$ is convex in $x$, then the corresponding
Type B depth function $D(x; F)$ decreases monotonically as $x$ moves outward
along any ray starting at a deepest point of $F$.

Equipped with the above two results, we now take a further look at
$SVD^\alpha(x; F)$ and $L^pD(x; F)$.

Corollary 2.1. For $\alpha \ge 1$, $SVD^\alpha(x; F)$ satisfies P3 and P4.

Since $\Delta^\alpha(S[x, x_1, \ldots, x_d])$ is convex and rigid-body invariant, according to
Theorem 2.3 we obtain

Corollary 2.2. For C-symmetric distributions and $\alpha \ge 1$, $SVD^\alpha(x; F)$
satisfies P2.

The affine invariance and Corollaries 2.1 and 2.2 thus yield:

Theorem 2.5. For C-symmetric distributions and $\alpha \ge 1$, $SVD^\alpha(x; F)$ is a
statistical depth function in the sense of Definition 2.1.

The next three results treat P2–P4 for $L^pD(x; F)$, $p \ge 1$, and $\widetilde{L^2}D(x; F)$.
Convexity of $h(x; x_1) = \|x - x_1\|_p$ in the argument $x$ follows in
straightforward fashion from Minkowski’s inequality. Thus Theorem 2.4 yields P3 for
$L^pD(x; F)$, while P4 is obvious. Thus we have

Corollary 2.3. For $p \ge 1$, $L^pD(x; F)$ satisfies P3 and P4.

Since $h(x; x_1)$ is location invariant and even, that is, $h(x + b; x_1 + b) =
h(x; x_1)$ for any vector $b \in \mathbb{R}^d$ and $h(-x; -x_1) = h(x; x_1)$, by the convexity
just established and Theorem 2.3 we obtain:

Corollary 2.4. For C-symmetric distributions and for $p \ge 1$, $L^pD(x; F)$
satisfies P2.

For $\widetilde{L^2}D(x; F)$ we have:

Theorem 2.6. For any distribution $F$ A-symmetric about a unique point
$\theta \in \mathbb{R}^d$, $\widetilde{L^2}D(x; F)$ defined in (8) is a statistical depth function in the sense of
Definition 2.1.

Remark 2.4. In the foregoing proof, condition (iv) of Theorem 2.3 was
established for $\widetilde{L^2}D(x; F)$ for all A-symmetric $F$. For the depth function $L^2D(x; F)$,
it follows from results established in Zuo and Serfling (2000c) that this
condition holds for all H-symmetric $F$.

2.3.3. Type C depth functions. Let $O(x; F)$ be a measure of the
outlyingness of the point $x$ in $\mathbb{R}^d$ with respect to the center or the deepest point of the
distribution $F$. Usually $O(x; F)$ is unbounded, but a corresponding bounded
depth function is defined by
(9) $D(x; F) \equiv (1 + O(x; F))^{-1}.$
We call these Type C depth functions.

Remark 2.5. Although Type B and Type C depth functions are clearly
similar in form, it is convenient to treat them separately, as they arise from
somewhat different conceptual points of view.

Example 2.4 (Projection depth). Define the outlyingness of a point $x$ to be
the worst case outlyingness of $x$ with respect to the one-dimensional median
in any one-dimensional projection, that is,
(10) $O(x; F) \equiv \sup_{\|u\| = 1} \dfrac{|u'x - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)},$
where $X$ has distribution $F$, Med denotes the univariate median, MAD
denotes the univariate median absolute deviation defined for univariate $Y$ as
$\mathrm{MAD}(Y) = \mathrm{Med}(|Y - \mathrm{Med}(Y)|)$, and $\|\cdot\|$ is the Euclidean norm. We call
the corresponding Type C depth function projection depth and denote it by
$PD(x; F)$, $x \in \mathbb{R}^d$.

Remark 2.6. For one-dimensional datasets $X = \{X_1, \ldots, X_n\}$,

$$O_n(x) \equiv |x - \mathrm{Med}_{1 \le i \le n}\{X_i\}| \,/\, \mathrm{MAD}_{1 \le i \le n}\{X_i\}$$

has long been used as a robust measure of outlyingness of $x \in \mathbb{R}$ with respect
to the center (median) of the dataset. See Mosteller and Tukey [(1977), pages
205–208]. Here

$$\mathrm{Med}_{1 \le i \le n}\{X_i\} = \tfrac{1}{2}\left(X_{(\lfloor (n+1)/2 \rfloor)} + X_{(\lfloor (n+2)/2 \rfloor)}\right),$$

$$\mathrm{MAD}_{1 \le i \le n}\{X_i\} = \mathrm{Med}_{1 \le i \le n}\{|X_i - \mathrm{Med}_{1 \le j \le n}\{X_j\}|\},$$

and $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered $X_1, \ldots, X_n$. Donoho and Gasko (1992)
generalized this to arbitrary dimension $d$, defining $O_n(x)$ to be the worst case
outlyingness of $x \in \mathbb{R}^d$ in any one-dimensional projection of $x$ and the dataset
$X$. A sample version of the projection depth function $PD(x; F)$ is thus given
by
(11) $PD_n(x) = (1 + O_n(x))^{-1}.$
Liu (1992) suggested the use of (11) as a data depth function, but did not
provide any treatment of it.
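
The sample version (11) involves a supremum over all unit directions, which is not available in closed form for $d \ge 2$. The Python sketch below replaces that supremum with a maximum over randomly drawn directions. This is an approximation adopted here for illustration only, and it can only underestimate the outlyingness and hence overestimate the depth. The function name `projection_depth` and the parameters `n_dirs` and `seed` are hypothetical, and the sketch assumes that the MAD of every projection is nonzero.

```python
import numpy as np

def projection_depth(x, data, n_dirs=1000, seed=0):
    """Approximate PD_n(x) = 1 / (1 + O_n(x)) using random unit directions."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((n_dirs, data.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)    # unit vectors u
    proj = data @ dirs.T                                   # u'X_i, shape (n, n_dirs)
    med = np.median(proj, axis=0)                          # Med of each projection
    mad = np.median(np.abs(proj - med), axis=0)            # MAD of each projection
    outlyingness = np.abs(x @ dirs.T - med) / mad
    return 1.0 / (1.0 + float(outlyingness.max()))

# In one dimension the only directions are +1 and -1, so the result is exact.
line_data = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
pd_at_median = projection_depth([0.0], line_data)   # median 0, MAD 1
pd_at_three = projection_depth([3.0], line_data)    # outlyingness 3
```

`np.median` averages the two middle order statistics for even $n$, which matches the definition of Med given in the remark.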

Example 2.5 (Mahalanobis depth). Mahalanobis (1936) introduced a
distance between two points $x$ and $y$ in $\mathbb{R}^d$, with respect to a positive definite
$d \times d$ matrix $M$, as

$$d_M^2(x, y) = (x - y)' M^{-1} (x - y).$$

Based on this Mahalanobis distance, one can define a Mahalanobis depth as
the corresponding Type C depth function,
(12) $MHD(x; F) = \left(1 + d^2_{\Sigma(F)}(x, \mu(F))\right)^{-1},$
where $F$ is a given distribution and $\mu(F)$ and $\Sigma(F)$ are any corresponding
location and covariance measures, respectively. The case that $\mu(F)$ and $\Sigma(F)$
are the mean and covariance matrix of $F$ was suggested by Liu (1992). For
these choices, however, $MHD(\cdot\,; F)$ is not “robust” [since $\mu(F)$ = mean is not
robust, as noted by Liu and Singh (1993)], and it can fail to achieve maximum
value at the center of A-symmetric distributions.
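
For the classical choice of mean and covariance matrix, a sample version of (12) takes only a few lines. The Python sketch below is an illustration under the assumption that $\mu(F)$ and $\Sigma(F)$ are replaced by the sample mean and sample covariance, which are the non-robust choices discussed in the example. The function name `mahalanobis_depth` is hypothetical.

```python
import numpy as np

def mahalanobis_depth(x, data):
    """Sample Mahalanobis depth with sample mean and covariance plugged in."""
    x = np.asarray(x, dtype=float)
    data = np.asarray(data, dtype=float)
    mu = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    diff = x - mu
    d2 = float(diff @ np.linalg.solve(cov, diff))   # (x - mu)' cov^{-1} (x - mu)
    return 1.0 / (1.0 + d2)

square = [[0, 0], [2, 0], [0, 2], [2, 2]]        # mean (1, 1), covariance (4/3) I
mhd_center = mahalanobis_depth([1, 1], square)   # squared distance 0
mhd_offset = mahalanobis_depth([3, 1], square)   # squared distance 4 / (4/3) = 3
```

Substituting robust, affine equivariant location and covariance estimates for `mu` and `cov` would change only those two lines.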
For Type C depth functions, the following analogues of Theorems 2.3 and
2.4 hold and can be proved similarly. It is convenient to write $O(x; X)$ for
$O(x; F_X)$.

Theorem 2.7. Suppose $\theta$ in $\mathbb{R}^d$ is the point of symmetry of a distribution $F$
with respect to a given notion of symmetry. The Type C depth functions $D(x; F)$
possess the “maximality at center” property P2 if for arbitrary vectors $x$, $b$ in
$\mathbb{R}^d$:
(i) $O(x + b; X + b) = O(x; X)$,
(ii) $O(-x; -X) = O(x; X)$,
(iii) $O(x; X)$ is convex in the argument $x$ and
(iv) the set

$$\arg\inf_{x \in \mathbb{R}^d} O(x; X - \theta) \;\cap\; \arg\inf_{x \in \mathbb{R}^d} O(x; \theta - X)$$

is nonempty.

Theorem 2.8. If $O(x; F)$ is convex in the argument $x$, then the corresponding
Type C depth function $D(x; F)$ decreases monotonically as $x$ moves outward
along any ray starting at a deepest point of $F$.

The following two theorems establish that $PD(x; F)$ and $MHD(x; F)$ are
proper statistical depth functions.

Theorem 2.9. The projection depth function $PD(x; F)$ is a statistical depth
function in the sense of Definition 2.1.

A location measure $\mu$ is affine equivariant if $\mu(AX + b) = A\mu(X) + b$ for
any affine transformation $AX + b$ of $X$. A covariance measure $\Sigma$ is affine
equivariant if $\Sigma(AX + b) = A\Sigma(X)A'$ for any affine transformation $AX + b$
of $X$.

Theorem 2.10. Let $F$ be symmetric. Then the Mahalanobis depth function
$MHD(x; F)$ is a statistical depth function in the sense of Definition 2.1 if $\mu$ and
$\Sigma$ are affine equivariant and $\mu(F)$ agrees with the point of symmetry of $F$.

The proof is straightforward.


2.3.4. Type D depth functions. One can interpret the “tailedness” of a point
with respect to a given distribution as an index related to its relative depth
with respect to the center or deepest point of the distribution. Let $\mathcal{C}$ be a class
of closed subsets of $\mathbb{R}^d$ and $P$ a probability measure on $\mathbb{R}^d$. A corresponding
Type D depth function is defined by
(13) $D(x; P, \mathcal{C}) \equiv \inf_{C}\{P(C) : x \in C \in \mathcal{C}\}.$
Thus the $\mathcal{C}$-depth of a point $x$ with respect to a probability measure $P$ on $\mathbb{R}^d$
is defined to be the minimum probability mass carried by a set $C$ in $\mathcal{C}$ that
contains $x$. In essence, this form of depth function is equivalent, via $D = 1 - I$,
to the “index function” $I(x; P, \mathcal{C})$ introduced by Small (1987) for measuring
the “tailedness” of points $x$ in some space. Such functions have antecedents in
game theoretical work of Hotelling (1929) and Chamberlin (1937).
We confine attention to classes $\mathcal{C}$ satisfying the following conditions:
C1. If $C \in \mathcal{C}$, then $\overline{C^c} \in \mathcal{C}$.
C2. For $C \in \mathcal{C}$ and $x \in C^\circ$, there exists $C_1 \in \mathcal{C}$ with $x \in \partial C_1$, $C_1 \subset C^\circ$,
where $\partial C$, $C^c$, $C^\circ$ and $\overline{C}$ denote, respectively, the boundary, complement,
interior and closure of $C$.
The class $\mathcal{H}$ of all closed halfspaces on $\mathbb{R}^d$ satisfies C1 and C2 and thus
the halfspace depth is a typical example of Type D depth function. As shown
in Theorem 2.1, $HD(x; P)$ is a statistical depth function. Useful further
properties of $HD(x; P)$ that in fact hold more generally are given in the following
result.

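Theorem 2.11. Let $\mathcal{C}$ be a class of closed Borel sets satisfying C1 and C2.
Further, for a given probability measure $P$ on $\mathbb{R}^d$, assume that if $x \in C \in \mathcal{C}$
and $P(C) < \alpha$, then there is a $C_1 \in \mathcal{C}$ such that $x \in C_1^\circ$ and $P(C_1) < \alpha$. Then:
(i) $D(x; P, \mathcal{C})$ is upper semicontinuous;
(ii) $D^\alpha \equiv \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\}$, $\alpha \in (0, 1]$, are compact and nested
(i.e., $D^{\alpha_1} \subset D^{\alpha_2}$ if $\alpha_1 > \alpha_2$); and
(iii) $D^\alpha$ is convex if every $C \in \mathcal{C}$ is convex.

Remark 2.7. If C2 is replaced by

C2′. $P(\partial C) = 0$, $\forall\, C \in \mathcal{C}$,

the above theorem remains true.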

3. Concluding remarks. Here we examine and compare a number of
depth functions with respect to the criteria given by properties P1–P4.
We begin with four cases having central importance because the corresponding
versions of multidimensional median generated by their points of maximal
depth are among the most popular competitors for nonparametric and robust
estimation of multidimensional location. These are the halfspace depth (Type
D, Section 2.3.4), the simplicial depth (Type A, Section 2.3.1), the simplicial
volume depth (Type B, Example 2.2), and the $L^2$ depth (Type B, Example
2.3), which generate, respectively, the so-called Tukey/Donoho halfspace
median (H), the Liu simplicial depth median (S), the Oja median (O) and the
spatial or $L^2$ median. [See Small (1990) for an overview of these and other
multidimensional medians.] With respect to affine invariance P1, all but the
$L^2$ version are fully satisfactory, the $L^2$ depth function being invariant only
under rotational and rigid-body transformations. The “maximality at center”
property P2 is satisfied by the halfspace depth function for H-symmetric dis-
tributions (see the proof of Theorem 2.1) and can be shown to be satisfied
by the L2 depth function for all H-symmetric distributions (see Remark 2.4)
and the simplicial volume depth function for C-symmetric distributions (see
Corollary 2.2). Also, P2 is satisfied by the simplicial depth function for contin-
uous A-symmetric distributions but not necessarily for discrete H-symmetric
distributions (see Remark 2.1). The “monotonicity relative to deepest point”
P3 is satisfied arbitrarily by the halfspace, simplicial volume, and L2 depth
functions, and also by the simplicial depth function except in some discrete
cases (see Theorem 2.1, Remark 2.1, and Corollaries 2.1 and 2.3). Finally,
“vanishing at infinity” P4 is satisfied by all four of these depth functions (see
Theorem 2.1 and Corollaries 2.1 and 2.3). Thus, from consideration of P1–P4,
the halfspace and simplicial volume depth functions appear to be the most
comprehensively attractive among these four competitors. If, however, we in
addition consider breakdown points of the corresponding location estimators
[for details, see Small (1990), Niinimaa, Oja and Tableman (1990), Donoho
and Gasko (1992) and Chen (1995)], we find that the estimator based on the
simplicial volume depth, unlike the others, has breakdown point 0, while that
based on the halfspace depth has breakdown point 1/3 for typical data sets,
leading us to prefer the halfspace depth function more exclusively.
Let us now consider the projection depth and the Mahalanobis depth. By
Theorems 2.9 and 2.10, these both satisfy properties P1–P4. Regarding robust-
ness, however, the multidimensional median corresponding to sample projec-
tion depth has large-sample breakdown point 1/2 [see Tyler (1994), page 1033,
and Zuo (1999)] as does the closely related Donoho-Stahel estimator [Stahel
(1981), Donoho (1982) and Donoho and Gasko (1992)], whereas the robustness
of the median generated by the Mahalanobis depth depends critically on the
choice of location and covariance measures in defining this depth. We antici-
pate that suitable choices exist which yield high breakdown point. Therefore,
we consider both of these depth functions to be competitive.
Another approach toward construction of depth functions consists of “peel-
ing” methods, such as convex hull peeling. This latter approach, however, not
only lacks a population analogue but also exhibits very unfavorable robust-
ness properties. See discussion of Donoho and Gasko (1992), Nolan (1992) and
Liu, Parelius and Singh (1999).
Likelihood-based depth functions have also been considered. See Fraiman
and Meloche (1996), Fraiman, Liu and Meloche (1997) and Liu, Parelius and
Singh (1999). These, however, fail to satisfy in general any of P1–P4, and
their effectiveness appears to be confined primarily to models with ellipsoidal
densities, or to situations where sensitivity to multimodality is paramount.
For further discussion, see Remark A.1 in Appendix A.
The zonoid depth function of Koshevoy and Mosler (1997) has some nice
properties but can fail to satisfy “maximality at center” P2 for A- or
H-symmetric distributions, because it always attains its maximum value at the
expectation $E(X)$ for any random variable $X$ in $\mathbb{R}^d$. Also, the sample zonoid depth
function is not robust, as a single corrupted data point can move the “center
point of zonoid data depth” to infinity.
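
The non-robustness noted above can be illustrated numerically. The following is a minimal sketch, assuming (as the paragraph states) that the deepest zonoid point of a sample is its mean; the function name `sample_mean` and the data are hypothetical and chosen only for illustration.

```python
def sample_mean(pts):
    """Coordinatewise mean of a list of 2D points (the assumed zonoid center)."""
    n = len(pts)
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)


def replace_last(pts, outlier):
    """Return a copy of pts in which the last observation is replaced by outlier."""
    return list(pts[:-1]) + [outlier]


# Four clean observations at the corners of the unit square (illustrative data).
clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Corrupting a single observation drags the center as far as we like.
for scale in (1e2, 1e4, 1e6):
    corrupted = replace_last(clean, (scale, scale))
    print(scale, sample_mean(corrupted))
```

As the corrupted coordinate grows, the printed center grows in proportion, which is the sense in which one bad point suffices to carry the center away.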
In conclusion, the halfspace and projection depth functions appear to repre-
sent very favorable choices. Both are implementations of the “projection pur-
suit” method, which utilizes all of the one-dimensional views of a dataset as a
foundation for data analysis, thus producing the advantage of great power at
extraction of information, although at the expense of a substantial computational
burden. Also, competitively, the L2 and Mahalanobis depth functions
appear to have strong potential for development.

APPENDIX A: SUPPLEMENTARY NOTES

Remark A.1. As pointed out and pictorially illustrated in Baggerly and
Scott (1999), the near convexity of the simplicial depth contours limits their
interpretability for multimodal data, whereas the likelihood depth contours
follow the multimodality structure. In the usual sense of “center-outward or-
dering,” and from the common standpoint of desiring connectedness of depth-
trimmed regions, the likelihood “depth” has less of a role as a depth function
than as simply what it is by definition: a density function, which keeps the
information on multimodality structure when present.

Remark A.2. As broadenings of central symmetry, angular and halfspace
symmetry are opposite in character and purpose to several notions of nonparametric
multivariate symmetry introduced by Beran and Millar (1997), which
in fact are narrowings; see their formula (17). Also, their use of halfspaces
is essentially for the purpose of indexing the empirical measure, rather than
as a fundamental element in defining symmetry.
As shown in Zuo and Serfling (2000c), halfspace symmetry of P about θ
reduces to angular symmetry about θ except when P is discrete with posi-
tive mass at θ. These exceptions are of practical relevance, since underlying
distributions for actually observed phenomena are invariably discrete (and
asymmetric), and it is reasonable to permit an approximating symmetric dis-
tribution to have mass at the center of symmetry.

Remark A.3. An important aspect of any depth function is whether its
sample version converges to the population counterpart. In particular, we desire
that almost surely [P]

$$(\mathrm{A.1})\qquad \sup_{x} \, |D_n(x) - D(x; P)| \to 0, \quad n \to \infty.$$

Besides carrying intrinsic interest, (A.1) plays a supporting role for other pur-
poses. For example, it underlies the convergence of sample depth contours to
their population counterparts, as in He and Wang (1997) especially for ellip-
tical models and in Zuo and Serfling (2000b) for more general models. In Liu
and Singh (1993), it is basic to the convergence of a certain “quality index”,
while in Liu, Parelius and Singh (1999) it supports various practical methods
such as “DD-plots.”
Results on (A.1) are now available for several cases of depth function.
Donoho and Gasko (1992) proved it for the sample halfspace depth,

$$HD_n(x) = \inf_{H} \{\hat{P}_n(H) : H \text{ a closed halfspace},\ x \in H\}, \quad x \in \mathbb{R}^d,$$

where $\hat{P}_n$ denotes the usual empirical measure, and Liu (1990), Dümbgen
(1990), and Arcones and Giné (1993) for the sample simplicial depth

$$SD_n(x) = \binom{n}{d+1}^{-1} \sum_{1 \le i_1 < \cdots < i_{d+1} \le n} I\big(x \in S[X_{i_1}, \ldots, X_{i_{d+1}}]\big), \quad x \in \mathbb{R}^d.$$

For the sample majority and Mahalanobis depths, under suitable conditions
on F, (A.1) is established by Liu and Singh (1993). For sample versions of
the “projection” depth function and the “Type D” depth functions introduced
above, (A.1) is established in Appendix B of Zuo and Serfling (2000b).
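
The two sample depths displayed in this remark can be computed directly in the plane. The following is a minimal sketch for $d = 2$ only, not an implementation taken from any of the cited papers: the function names are hypothetical, the halfspace depth uses a simple $O(n^2)$ scan over lines through $x$ and each data point, and the simplicial depth assumes the data are in general position (no three points collinear).

```python
from itertools import combinations


def cross(u, v):
    """z-component of the 2D cross product u x v."""
    return u[0] * v[1] - u[1] * v[0]


def halfspace_depth_2d(x, pts):
    """Smallest empirical mass of a closed halfplane containing x."""
    n = len(pts)
    vecs = [(p[0] - x[0], p[1] - x[1]) for p in pts]
    at_x = sum(1 for v in vecs if v == (0, 0))  # points equal to x always count
    best = n - at_x
    for v in vecs:
        if v == (0, 0):
            continue
        left = right = fwd = back = 0
        for w in vecs:
            if w == (0, 0):
                continue
            c = cross(v, w)
            if c > 0:
                left += 1
            elif c < 0:
                right += 1
            elif v[0] * w[0] + v[1] * w[1] > 0:
                fwd += 1   # on the line through x, same side of x as v
            else:
                back += 1  # on the line through x, opposite side of x
        # Tilting the line slightly either way keeps one open side plus one of
        # the two rays on the line; the minimum count occurs at such a tilt.
        best = min(best, left + back, right + fwd, left + fwd, right + back)
    return (at_x + best) / n


def simplicial_depth_2d(x, pts):
    """Fraction of closed triangles with data vertices that contain x.

    Assumes no three data points are collinear.
    """
    inside = total = 0
    for a, b, c in combinations(pts, 3):
        d1 = cross((b[0] - a[0], b[1] - a[1]), (x[0] - a[0], x[1] - a[1]))
        d2 = cross((c[0] - b[0], c[1] - b[1]), (x[0] - b[0], x[1] - b[1]))
        d3 = cross((a[0] - c[0], a[1] - c[1]), (x[0] - c[0], x[1] - c[1]))
        total += 1
        # x is in the closed triangle unless the three signs disagree.
        if not (min(d1, d2, d3) < 0 and max(d1, d2, d3) > 0):
            inside += 1
    return inside / total
```

With integer coordinates all comparisons are exact, so the sketch is convenient for checking small examples by hand; faster algorithms exist but are outside the scope of this illustration.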

APPENDIX B: PROOFS

Proof of Theorem 2.1. Clearly, $HD(x; P)$ is bounded and nonnegative.
We need only check P1–P4.
(a) Affine invariance. Straightforward.
(b) Maximality at center. Suppose that $P$ is H-symmetric about a unique
point $\theta \in \mathbb{R}^d$. By the definition of H-symmetry, we have $P(H_\theta) \ge 1/2$ for any
closed halfspace $H_\theta$ with $\theta \in \partial H_\theta$. It follows that $HD(\theta; P) \ge 1/2$. Now suppose
that there is a point $x_0 \in \mathbb{R}^d$, $x_0 \ne \theta$, such that $HD(x_0; P) > 1/2$. Then $P(H) >
1/2$ for any closed halfspace $H$ with $x_0 \in \partial H$, which implies that $P$ is also
H-symmetric about $x_0$, contradicting the assumption that $P$ is H-symmetric
about a unique point $\theta \in \mathbb{R}^d$. Therefore, $HD(\theta; P) = \sup_{x \in \mathbb{R}^d} HD(x; P)$.
(c) Monotonicity relative to deepest point. Suppose $\theta$ is a deepest point with
respect to the underlying distribution. To compare $HD(x; P)$ and
$HD(\theta + \alpha(x - \theta); P)$, we need only consider the infimum in the definition of HD over all closed
halfspaces which do not contain $\theta$. For any $H_{\theta + \alpha(x - \theta)}$ [closed halfspace with
$\theta + \alpha(x - \theta) \in \partial H$], by the separating hyperplane theorem there always exists
a closed halfspace $H_x$ such that $H_x \subset H_{\theta + \alpha(x - \theta)}$. It follows that $HD(x; P) \le
HD(\theta + \alpha(x - \theta); P)$, $\forall\, \alpha \in [0, 1]$.
(d) Vanishing at infinity. It is easy to see that $P(\|X\| \ge \|x\|) \to 0$ as $\|x\| \to
\infty$ and that for each $x$ and $X$ there exists a closed halfspace $H_x$ such that
$H_x \subset \{\|X\| \ge \|x\|\}$. Thus $HD(x; P) \to 0$ as $\|x\| \to \infty$, completing the proof.

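Proof of Theorem 2.2. (a) Let $\theta$ be the center of an H-symmetric distribution
$P$ and $x$ an arbitrary point in $\mathbb{R}^d$. Then, by the definition of H-symmetry,
for any random sample $X_1, \ldots, X_d$ from $P$ we have

$$x \in H^{P}_{X_1, \ldots, X_d} \implies \theta \in H^{P}_{X_1, \ldots, X_d},$$

and thus $MJD(\theta; P) = \sup_{x \in \mathbb{R}^d} MJD(x; P)$.
(b) Let $\lambda \in [0, 1]$ and $x_0 \equiv \lambda\theta + (1 - \lambda)x$. Then

$$\begin{aligned}
MJD(x_0; P) - MJD(x; P) &= P\big(x_0 \in H^{P}_{X_1, \ldots, X_d}\big) - P\big(x \in H^{P}_{X_1, \ldots, X_d}\big) \\
&= P\big(x_0 \in H^{P}_{X_1, \ldots, X_d} \text{ and } x \notin H^{P}_{X_1, \ldots, X_d}\big) \\
&\ge 0. \qquad ✷
\end{aligned}$$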

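Proof of Theorem 2.3. By (i) and (ii) we have

$$\begin{aligned}
E\,h(x; X_1 - \theta, \ldots, X_r - \theta) &= E\,h(\theta + x; X_1, \ldots, X_r), \\
E\,h(x; \theta - X_1, \ldots, \theta - X_r) &= E\,h(\theta - x; X_1, \ldots, X_r).
\end{aligned}$$

Let $y$ be a point in the set in (iv). It follows that

$$y \in \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta + x; X_1, \ldots, X_r) \;\cap\; \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta - x; X_1, \ldots, X_r).$$

The convexity of $h(x; x_1, \ldots, x_r)$ in $x$ now yields

$$h(\theta; X_1, \ldots, X_r) \le \tfrac{1}{2}\, h(\theta + y; X_1, \ldots, X_r) + \tfrac{1}{2}\, h(\theta - y; X_1, \ldots, X_r).$$

It follows that

$$\begin{aligned}
E\,h(\theta; X_1, \ldots, X_r) &\le \tfrac{1}{2}\, E\,h(\theta + y; X_1, \ldots, X_r) + \tfrac{1}{2}\, E\,h(\theta - y; X_1, \ldots, X_r) \\
&= \inf_{x \in \mathbb{R}^d} E\,h(\theta + x; X_1, \ldots, X_r) \\
&= \inf_{x \in \mathbb{R}^d} E\,h(x; X_1, \ldots, X_r).
\end{aligned}$$

Hence $D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$, completing the proof. ✷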

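Proof of Theorem 2.4. Let $\theta$ in $\mathbb{R}^d$ be a deepest point with respect to
the underlying distribution $F$, that is, $D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$. Let $x \ne \theta$
be an arbitrary point in $\mathbb{R}^d$, let $\lambda \in [0, 1]$ and set $x_0 \equiv \theta + \lambda(x - \theta)$. Then
$D(x; F) \le D(\theta; F)$. The convexity of $h(x; x_1, \ldots, x_r)$ in $x$ yields

$$h(x_0; X_1, \ldots, X_r) \le \lambda\, h(x; X_1, \ldots, X_r) + (1 - \lambda)\, h(\theta; X_1, \ldots, X_r).$$

Thus

$$\begin{aligned}
E\,h(x_0; X_1, \ldots, X_r) &\le \max\{E\,h(x; X_1, \ldots, X_r),\ E\,h(\theta; X_1, \ldots, X_r)\} \\
&= E\,h(x; X_1, \ldots, X_r),
\end{aligned}$$

and hence $D(x_0; F) \ge D(x; F)$, completing the proof. ✷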

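Proof of Corollary 2.1. (a) By Theorem 2.4, to show P3 we check convexity
of $\Delta^{\alpha}(S[x, x_1, \ldots, x_d])$ in the argument $x$ for $\alpha \in [1, \infty)$. Let $x$, $y$ be two
points in $\mathbb{R}^d$, take $\lambda \in [0, 1]$, and put $x_0 \equiv \lambda x + (1 - \lambda)y$. Then

$$\begin{aligned}
\Delta(S[x_0, x_1, \ldots, x_d])
&= \left| \frac{1}{d!} \det \begin{pmatrix}
1 & 1 & \cdots & 1 \\
x_{01} & x_{11} & \cdots & x_{d1} \\
\vdots & \vdots & & \vdots \\
x_{0d} & x_{1d} & \cdots & x_{dd}
\end{pmatrix} \right| \\
&= \left| \frac{1}{d!} \det \begin{pmatrix}
\lambda + (1 - \lambda) & 1 & \cdots & 1 \\
\lambda \tilde{x}_1 + (1 - \lambda)\tilde{y}_1 & x_{11} & \cdots & x_{d1} \\
\vdots & \vdots & & \vdots \\
\lambda \tilde{x}_d + (1 - \lambda)\tilde{y}_d & x_{1d} & \cdots & x_{dd}
\end{pmatrix} \right| \\
&\le \lambda\, \Delta(S[x, x_1, \ldots, x_d]) + (1 - \lambda)\, \Delta(S[y, x_1, \ldots, x_d]),
\end{aligned}$$

where $x = (\tilde{x}_1, \ldots, \tilde{x}_d)'$, $y = (\tilde{y}_1, \ldots, \tilde{y}_d)'$ and $x_i = (x_{i1}, \ldots, x_{id})'$ for $0 \le i \le d$.
Now the convexity of the function $x^{\alpha}$ for $0 < x < \infty$ and $\alpha \ge 1$ yields

$$\Delta^{\alpha}(S[x_0, x_1, \ldots, x_d]) \le \lambda\, \Delta^{\alpha}(S[x, x_1, \ldots, x_d]) + (1 - \lambda)\, \Delta^{\alpha}(S[y, x_1, \ldots, x_d]).$$

(b) It is obvious that $\Delta^{\alpha}(S[x, x_1, \ldots, x_d]) \to \infty$ as $\|x\| \to \infty$. Thus
$SVD^{\alpha}(x; F) \to 0$ as $\|x\| \to \infty$, completing the proof. ✷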
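
The determinant expression for the simplex volume used in part (a), and the convexity in the first vertex that it delivers, can be checked on small examples. The sketch below is illustrative only: the helper names `det` and `simplex_volume` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used so that the inequality is tested without rounding error.

```python
from fractions import Fraction
from math import factorial


def det(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)  # singular matrix
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return sign * result


def simplex_volume(vertices):
    """Volume of the d-simplex with d+1 vertices in R^d: |det| / d!.

    The matrix has a first row of ones and one column per vertex,
    matching the layout of the determinant in the proof.
    """
    d = len(vertices) - 1
    rows = [[1] * (d + 1)] + [[v[j] for v in vertices] for j in range(d)]
    return abs(det(rows)) / factorial(d)


# Convexity in the first vertex, with the other vertices held fixed.
x1, x2 = (0, 0), (1, 0)
x, y = (0, 2), (3, -4)
lam = Fraction(1, 2)
x0 = tuple(lam * xi + (1 - lam) * yi for xi, yi in zip(x, y))
lhs = simplex_volume([x0, x1, x2])
rhs = lam * simplex_volume([x, x1, x2]) + (1 - lam) * simplex_volume([y, x1, x2])
print(lhs, "<=", rhs)
```

The inequality holds because the determinant is linear in its first column, so its absolute value is convex in that column.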

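Proof of Theorem 2.6. Since $L^2D(x; F)$ defined in (8) is affine invariant,
and P4 is evident, we check P2 and P3.
(a) We first show that $\|\cdot\|_M$ is convex for any positive definite $d \times d$ matrix
$M$. Since $M$ is positive definite, there is a nonsingular matrix $S$ such that
$M = S'S$. Let $x$, $y$ be two points in $\mathbb{R}^d$ and $\lambda \in [0, 1]$. Then

$$\begin{aligned}
\|\lambda x + (1 - \lambda)y\|_M^2 &= (\lambda x + (1 - \lambda)y)' M (\lambda x + (1 - \lambda)y) \\
&= \lambda^2 x'Mx + 2\lambda(1 - \lambda)\, x'My + (1 - \lambda)^2 y'My \\
&= \lambda^2 x'Mx + 2\lambda(1 - \lambda)\, (Sx)'(Sy) + (1 - \lambda)^2 y'My.
\end{aligned}$$

The Schwarz inequality implies that

$$\begin{aligned}
\|\lambda x + (1 - \lambda)y\|_M^2 &\le \lambda^2 x'Mx + 2\lambda(1 - \lambda)\, \|Sx\|\,\|Sy\| + (1 - \lambda)^2 y'My \\
&= \lambda^2 \|x\|_M^2 + 2\lambda(1 - \lambda)\, \|x\|_M \|y\|_M + (1 - \lambda)^2 \|y\|_M^2 \\
&= \big(\lambda \|x\|_M + (1 - \lambda)\|y\|_M\big)^2.
\end{aligned}$$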
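It follows that

$$\|\lambda x + (1 - \lambda)y\|_M \le \lambda \|x\|_M + (1 - \lambda)\|y\|_M.$$

(b) Now we show that there is a point $y \in \mathbb{R}^d$ satisfying condition (4) of
Theorem 2.3. Equivalently, we need to show that

$$(\mathrm{B.1})\qquad \theta \in \arg\inf_{x \in \mathbb{R}^d} E\,\|x - X\|_{\Sigma^{-1}},$$

where $\Sigma$ is the covariance matrix of $F$.
We first show that

$$(\ast)\qquad E\left[\frac{\theta - X}{\|X - \theta\|_{\Sigma^{-1}}}\right] = 0.$$

Since $F$ is angularly symmetric about $\theta$, it can be shown [see Zuo and Serfling
(2000c)] that $P(X \in H_\theta) = P(X \in -H_\theta)$ for any closed halfspace $H_\theta$ with
$\theta$ on the boundary, where $-H_\theta$ is the reflection of $H_\theta$ about $\theta$. Since $\Sigma^{-1}$ is
positive definite, there is a nonsingular matrix $R$ such that $\Sigma^{-1} = R'R$. Thus

$$P(RX \in RH_\theta) = P(RX \in -RH_\theta)$$

for any closed halfspace $H_\theta$ with $\theta$ on the boundary. By nonsingularity and
results established in Zuo and Serfling (2000c), we conclude that $RX$ is angularly
symmetric about $R\theta$. Hence

$$\frac{R(X - \theta)}{\|R(X - \theta)\|} \overset{d}{=} \frac{R(\theta - X)}{\|R(\theta - X)\|},$$

which is equivalent to

$$\frac{R(X - \theta)}{\|X - \theta\|_{\Sigma^{-1}}} \overset{d}{=} \frac{R(\theta - X)}{\|\theta - X\|_{\Sigma^{-1}}}.$$

This implies $(\ast)$.
Now we show that (B.1) holds true. Consider the derivative of $E\,\|\mu - X\|_{\Sigma^{-1}}$
with respect to $\mu \in \mathbb{R}^d$. By vector differentiation, we have

$$\begin{aligned}
\frac{d\,E\,\|\mu - X\|_{\Sigma^{-1}}}{d\mu}
&= \frac{d \int_{\mathbb{R}^d} \|\mu - x\|_{\Sigma^{-1}}\, dF(x)}{d\mu} \\
&= \int_{\mathbb{R}^d} \frac{d\,\|\mu - x\|_{\Sigma^{-1}}}{d\mu}\, dF(x) \\
&= \int_{\mathbb{R}^d} \frac{\Sigma^{-1}(\mu - x)}{\|\mu - x\|_{\Sigma^{-1}}}\, dF(x) \\
&= \Sigma^{-1} E\left[\frac{\mu - X}{\|\mu - X\|_{\Sigma^{-1}}}\right].
\end{aligned}$$

Then by convexity and $(\ast)$ we conclude that (B.1) holds.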
The result now follows from Theorems 2.3 and 2.4. ✷
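
The ingredients of this proof can be checked numerically on a small centrally symmetric sample in the plane. The sketch below is a hedged illustration rather than the authors' procedure: it assumes the depth in (8) has the form $(1 + E\|x - X\|_M)^{-1}$, it uses a fixed positive definite matrix $M$ chosen arbitrarily in place of $\Sigma^{-1}$, and all function names are hypothetical.

```python
from math import sqrt


def m_norm(v, M):
    """||v||_M = sqrt(v' M v) for a symmetric positive definite 2x2 matrix M."""
    q = (v[0] * (M[0][0] * v[0] + M[0][1] * v[1])
         + v[1] * (M[1][0] * v[0] + M[1][1] * v[1]))
    return sqrt(q)


def mean_m_distance(mu, pts, M):
    """Sample analogue of E||mu - X||_M."""
    return sum(m_norm((mu[0] - p[0], mu[1] - p[1]), M) for p in pts) / len(pts)


def l2_depth(mu, pts, M):
    """Assumed form of the depth: 1 / (1 + average M-distance to the sample)."""
    return 1.0 / (1.0 + mean_m_distance(mu, pts, M))


def mean_unit_direction(mu, pts, M):
    """Sample analogue of E[(mu - X) / ||mu - X||_M], the quantity in (*)."""
    sx = sy = 0.0
    for p in pts:
        v = (mu[0] - p[0], mu[1] - p[1])
        r = m_norm(v, M)
        if r > 0:  # a sample point equal to mu contributes nothing
            sx += v[0] / r
            sy += v[1] / r
    return (sx / len(pts), sy / len(pts))


# Illustrative data: pairs of points reflected through theta = (1, 2).
theta = (1.0, 2.0)
pts = [(2.0, 2.0), (0.0, 2.0), (1.0, 5.0), (1.0, -1.0), (3.0, 3.0), (-1.0, 1.0)]
M = ((2.0, 1.0), (1.0, 3.0))  # arbitrary positive definite matrix

print(mean_unit_direction(theta, pts, M))            # vanishes at the center
print(l2_depth(theta, pts, M), l2_depth((3.0, 5.0), pts, M))
```

For such a sample the averaged unit directions cancel in pairs at $\theta$, which is the sample version of $(\ast)$, and the depth at $\theta$ exceeds the depth at any other point tried.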

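Proof of Theorem 2.9. Since $PD(x; F)$ is nonnegative and bounded, we
need only check P1–P4.
(a) Affine invariance. Straightforward.
(b) Maximality at center. Suppose that $F$ is H-symmetric about a unique
point $\theta \in \mathbb{R}^d$. Then [see Zuo and Serfling (2000c)] we have $\mathrm{Med}(u'X) = u'\theta$
for any unit vector $u \in \mathbb{R}^d$ and it follows that $PD(\theta; F) = \sup_{x \in \mathbb{R}^d} PD(x; F)$.
(c) Monotonicity relative to deepest point. We show that $O(x; X)$ is convex
in its first argument. Let $\theta$ and $x$ be two arbitrary points in $\mathbb{R}^d$, $0 < \alpha < 1$,
and put $x_0 \equiv (1 - \alpha)\theta + \alpha x$. Then we have

$$\begin{aligned}
|u'x_0 - \mathrm{Med}(u'X)| &= |u'((1 - \alpha)\theta + \alpha x) - \mathrm{Med}(u'X)| \\
&= |(1 - \alpha)(u'\theta - \mathrm{Med}(u'X)) + \alpha(u'x - \mathrm{Med}(u'X))| \\
&\le (1 - \alpha)\,|u'\theta - \mathrm{Med}(u'X)| + \alpha\,|u'x - \mathrm{Med}(u'X)|.
\end{aligned}$$

It follows that

$$\begin{aligned}
O(x_0; X) &= \sup_{\|u\| = 1} \frac{|u'x_0 - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)} \\
&\le \sup_{\|u\| = 1} \frac{(1 - \alpha)\,|u'\theta - \mathrm{Med}(u'X)| + \alpha\,|u'x - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)} \\
&\le (1 - \alpha)\, O(\theta; F) + \alpha\, O(x; F).
\end{aligned}$$

“Monotonicity” now follows from Theorem 2.8.
(d) Vanishing at infinity. Straightforward. ✷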
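
The outlyingness $O$ and the resulting projection depth can be approximated for planar data by replacing the supremum over all unit vectors with a maximum over a finite grid of directions. The sketch below makes that simplification explicitly, takes $PD = 1/(1 + O)$, and uses hypothetical function names; it is an approximation for illustration, not an exact algorithm.

```python
from math import cos, sin, pi
from statistics import median


def outlyingness_2d(x, pts, n_dirs=180):
    """Approximate sup over unit u of |u'x - Med(u'X)| / MAD(u'X).

    The supremum is replaced by a maximum over n_dirs equally spaced
    directions on the half circle (u and -u give the same ratio).
    """
    worst = 0.0
    for k in range(n_dirs):
        t = pi * k / n_dirs
        u = (cos(t), sin(t))
        proj = [u[0] * p[0] + u[1] * p[1] for p in pts]
        med = median(proj)
        mad = median(abs(v - med) for v in proj)
        if mad == 0:
            continue  # no spread in this direction; skipped in this sketch
        ratio = abs(u[0] * x[0] + u[1] * x[1] - med) / mad
        worst = max(worst, ratio)
    return worst


def projection_depth_2d(x, pts, n_dirs=180):
    """Projection depth taken as 1 / (1 + outlyingness)."""
    return 1.0 / (1.0 + outlyingness_2d(x, pts, n_dirs))


# Illustrative sample, centrally symmetric about the origin.
pts = [(1, 0), (-1, 0), (0, 2), (0, -2), (1, 1), (-1, -1), (0, 0)]
for x in [(0, 0), (1, 1), (2, 2), (1000, 1000)]:
    print(x, projection_depth_2d(x, pts))
```

Because a maximum of convex functions is convex, the grid approximation inherits the convexity shown in part (c), so the printed depths decrease along the ray from the center, in line with P3 and P4.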

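Proof of Theorem 2.11. (i) We first show that

$$(\ast)\qquad \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} = \bigcap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}.$$

(a) If $x \in \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\}$ and there exists a $C \in \mathcal{C}$ such that
$P(C) > 1 - \alpha$, $x \notin C$, then $x \in C^c$, $P(C^c) < \alpha$. By C1 and C2, there is a
$C_1 \in \mathcal{C}$ such that $x \in \partial C_1$, $C_1 \subset C^c$. It follows that $P(C_1) < \alpha$ and hence
$D(x; P, \mathcal{C}) < \alpha$, which is a contradiction to the assumption that $x \in \{x \in \mathbb{R}^d :
D(x; P, \mathcal{C}) \ge \alpha\}$. This implies

$$\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} \subset \bigcap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}.$$

(b) If $x \in \bigcap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}$, and there is a $C \in \mathcal{C}$ such that
$x \in C$, $P(C) < \alpha$, then by the condition given, there exists a $C_1 \in \mathcal{C}$ such that
$x \in C_1^{\circ}$, $P(C_1) < \alpha$ and thus $x \notin C_1^c$, $P(C_1^c) > 1 - \alpha$, which contradicts the
assumption that $x \in \bigcap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}$. This implies

$$\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} \supset \bigcap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}.$$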
Now (a) and (b) yield $(\ast)$, which implies that $D^{\alpha}$ is closed, and thus $D(x; P, \mathcal{C})$
is upper semicontinuous.
(ii) The nestedness of $D^{\alpha}$ is trivial. The boundedness of $D^{\alpha}$ follows from the
fact that $D(x; P, \mathcal{C}) \to 0$ as $\|x\| \to \infty$. The compactness of $D^{\alpha}$ now follows
from its being bounded and closed.
(iii) The convexity follows from $(\ast)$, since the intersection of convex sets is
convex. ✷

Acknowledgments. The authors greatly appreciate the thoughtful and
constructive remarks of an Associate Editor and two referees, which led to
distinctive improvements in the paper.

REFERENCES
Arcones, M. A. and Giné, E. (1993). Limit theorems for U-processes. Ann. Probab. 21 1494–1542.
Baggerly, K. A. and Scott, D. W. (1999). Comment on “Multivariate analysis by data depth:
Descriptive statistics, graphics and inference,” by R. Y. Liu, J. M. Parelius and K. Singh.
Ann. Statist. 27 843–844.
Bartoszyński, R., Pearl, D. K. and Lawrence, J. (1997). A multidimensional goodness-of-fit test
based on interpoint distances. J. Amer. Statist. Assoc. 92 577–586.
Beran, R. J. and Millar, P. W. (1997). Multivariate symmetry models. In Festschrift for Lucien
Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgerson and
G. L. Yang, eds.) 13–42. Springer, Berlin.
Caplin, A. and Nalebuff, B. (1988). On 64%-majority rule. Econometrica 56 787–814.
Caplin, A. and Nalebuff, B. (1991a). Aggregation and social choice: A mean voter theorem.
Econometrica 59 1–23.
Caplin, A. and Nalebuff, B. (1991b). Aggregation and imperfect competition: On the existence
of equilibrium. Econometrica 59 25–59.
Carrizosa, E. (1996). A characterization of halfspace depth. J. Multivariate Anal. 58 21–26.
Chamberlin, E. (1937). The Theory of Monopolistic Competition. Harvard Univ. Press.
Chen, Z. (1995). Bounds for the breakdown point of the simplicial median. J. Multivariate Anal.
55 1–13.
Donoho, D. L. (1982). Breakdown properties of multivariate location estimators. Ph. D. qualifying
paper, Dept. Statistics, Harvard Univ.
Donoho, D. L. and Gasko, M. (1992). Breakdown properties of location estimates based on half-
space depth and projected outlyingness. Ann. Statist. 20 1803–1827.
Dümbgen, L. (1990). Limit theorems for the empirical simplicial depth. Statist. Probab. Lett. 14
119–128.
Eddy, W. F. (1985). Ordering of multivariate data. In Computer Science and Statistics: The In-
terface (L. Billard, ed.) 25–30. North-Holland, Amsterdam.
Fraiman, R. and Meloche, J. (1996). Multivariate L-estimation. Preprint.
Fraiman, R., Liu, R. Y. and Meloche, J. (1997). Multivariate density estimation by probing
depth. In L1 -Statistical Procedures and Related Topics (Y. Dodge, ed.) 415–430. IMS,
Hayward, CA.
He, X. and Wang, G. (1997). Convergence of depth contours for multivariate datasets. Ann.
Statist. 25 495–504.
Hotelling, H. (1929). Stability in competition. Econom. J. 39 41–57.
Koshevoy, G. and Mosler, K. (1997). Zonoid trimming for multivariate distributions. Ann.
Statist. 25 1998–2017.
Liu, R. Y. (1990). On a notion of data depth based on random simplices. Ann. Statist. 18 405–414.
Liu, R. Y. (1992). Data depth and multivariate rank tests. In L1 -Statistics and Related Methods
(Y. Dodge, ed.) 279–294. North-Holland, Amsterdam.
Liu, R. Y., Parelius, J. M. and Singh, K. (1999). Multivariate analysis by data depth: Descriptive
statistics, graphics and inference (with discussion). Ann. Statist. 27 783–858.
Liu, R. Y. and Singh, K. (1993). A quality index based on data depth and multivariate rank tests.
J. Amer. Statist. Assoc. 88 252–260.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Acad. Sci. India
12 49–55.
Massé, J. C. and Theodorescu, R. (1994). Halfplane trimming for bivariate distributions. J.
Multivariate Anal. 48 188–202.
Mizera, I. (1998). On depth and deep points: a calculus. Preprint.
Mosteller, C. F. and Tukey, J. W. (1977). Data Analysis and Regression. Addison-Wesley, Read-
ing, MA.
Niinimaa, A., Oja, H. and Tableman, M. (1990). On the finite sample breakdown point of the Oja
bivariate median and of the corresponding half-samples version. Statist. Probab. Lett.
10 325–328.
Nolan, D. (1992). Asymptotics for multivariate trimming. Stochastic Process. Appl. 42 157–169.
Oja, H. (1983). Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 327–
333.
Rao, C. R. (1988). Methodology based on the L1 norm in statistical inference. Sankhyā Ser. A 50
289–313.
Rousseeuw, P. J. and Hubert, M. (1999). Regression depth (with discussion). J. Amer. Statist.
Assoc. 94 388–433.
Rousseeuw, P. J. and Ruts, I. (1996). Bivariate location depth. J. Roy. Statist. Soc. Ser. C 45
516–526.
Rousseeuw, P. J. and Struyf, A. (1998). Computing location depth and regression depth in
higher dimensions. Statist. Comput. 8 193–203.
Ruts, I. and Rousseeuw, P. J. (1996). Computing depth contours of bivariate point clouds. Com-
put. Statist. Data Anal. 23 153–168.
Serfling, R. (1980). Approximation Theorems of Mathematical Statistics. Wiley, New York.
Singh, K. (1991). A notion of majority depth. Preprint.
Small, C. G. (1987). Measures of centrality for multivariate and directional distributions. Canad.
J. Statist. 15 31–39.
Small, C. G. (1990). A survey of multidimensional medians. Internat. Statist. Inst. Rev. 58 263–
277.
Stahel, W. A. (1981). Robust estimation: infinitesimal optimality and covariance matrix estima-
tors. Ph. D thesis, ETH, Zurich (in German).
Tukey, J. W. (1975). Mathematics and picturing data. In Proceedings of the International
Congress on Mathematics (R. D. James, ed.) 2 523–531 Canadian Math. Congress.
Tyler, D. E. (1994). Finite sample breakdown points of projection based multivariate location
and scatter statistics. Ann. Statist. 22 1024–1044.
Vardi, Y. and Zhang, C.-H. (1999). The multivariate L1 -median and associated data depth.
Preprint.
Yeh, A. B. and Singh, K. (1997). Balanced confidence regions based on Tukey’s depth and the
bootstrap. J. Roy. Statist. Soc. Ser. B 59 639–652.
Zuo, Y. (1999). Affine equivariant multivariate location estimates with best possible breakdown
points. Preprint.
Zuo, Y. and Serfling, R. (2000a). Nonparametric notions of multivariate “scatter measure” and
“more scattered” based on statistical depth functions. J. Multivariate Anal. To appear.
Zuo, Y. and Serfling, R. (2000b). Structural properties and convergence results for contours of
sample statistical depth functions. Ann. Statist. 28 483–499.
Zuo, Y. and Serfling, R. (2000c). On the performance of some robust nonparametric location
measures relative to a general notion of multivariate symmetry. J. Statist. Plann. In-
ference 84 55–79.

Department of Mathematics
Arizona State University
Tempe, Arizona 85287-1804
E-mail: [email protected]

Department of Mathematical Sciences
University of Texas at Dallas
Richardson, Texas 75083-0688
E-mail: serfl[email protected]
