The Annals of Statistics

2000, Vol. 28, No. 2, 461–482

GENERAL NOTIONS OF STATISTICAL DEPTH FUNCTION

By Yijun Zuo and Robert Serfling¹

Arizona State University and University of Texas

Statistical depth functions are being formulated ad hoc with increasing popularity in nonparametric inference for multivariate data. Here we introduce several general structures for depth functions, classify many existing examples as special cases, and establish results on the possession, or lack thereof, of four key properties desirable for depth functions in general. Roughly speaking, these properties may be described as: affine invariance, maximality at center, monotonicity relative to deepest point, and vanishing at infinity. This provides a more systematic basis for selection of a depth function. In particular, from these and other considerations it is found that the halfspace depth behaves very well overall in comparison with various competitors.

1. Introduction. Statistical depth functions have become increasingly pursued as a useful tool in nonparametric inference for multivariate data. Roughly speaking, for a distribution $P$ in $\mathbb{R}^d$, a corresponding depth function is any function $D(x; P)$ which provides a $P$-based center-outward ordering of points $x \in \mathbb{R}^d$. Tukey (1975) proposed a “halfspace” depth and suggested its role in defining multivariate analogues of univariate rank and order statistics via depth-induced “contours.” The halfspace depth (HD) of a point $x$ in $\mathbb{R}^d$ with respect to a probability measure $P$ on $\mathbb{R}^d$ is defined as the minimum probability mass carried by any closed halfspace containing $x$, that is,

$$\mathrm{HD}(x; P) = \inf\{P(H) : H \text{ a closed halfspace},\ x \in H\}, \quad x \in \mathbb{R}^d.$$

Based on this depth, Donoho and Gasko (1992) studied multivariate location estimators and Yeh and Singh (1997) developed confidence regions. Properties of the corresponding contours have been studied by various authors including Eddy (1985), Nolan (1992), Donoho and Gasko (1992) and Massé and Theodorescu (1994). See Carrizosa (1996) for a characterization of halfspace depth relating to problems of facility location analysis in the operations research literature.
The “center-outward ordering” interpretation of a depth function suggests that (i) a relevant notion of “center” is available, and (ii) points near the center should have higher depth. From this standpoint, the “center” consists of the set of points globally maximizing depth, in which case a depth function should tend to ignore multimodality features of the underlying distribution $P$. If, on the other hand, sensitivity to multimodality is desirable, then the “center” should include local maxima as well, in which case the notion of center-outward ordering becomes compromised and “inner” points can have low depth. It is thus important, in considering depth functions, to make a choice on this issue. In the present paper we opt for the center to be given by global maxima, with low depth corresponding to large distance from the center. For further discussion, see Remark A.1 in Appendix A.

Received December 1998; revised December 1999.
¹Supported by NSF Grant DMS-97-05209.
AMS 1991 subject classifications. Primary 62H05; secondary 62G20.
Key words and phrases. Statistical depth functions, halfspace depth, simplicial depth, multivariate symmetry.

Liu (1990) introduced a notion of “simplicial” depth and corresponding multivariate location estimators. Namely, the simplicial depth (SD) of a point $x$ in $\mathbb{R}^d$ with respect to a probability measure $P$ on $\mathbb{R}^d$ is defined to be the probability that $x$ belongs to a random simplex in $\mathbb{R}^d$, that is,

$$\mathrm{SD}(x; P) = P(x \in S[X_1, \ldots, X_{d+1}]), \quad x \in \mathbb{R}^d,$$

where $X_1, \ldots, X_{d+1}$ is a random sample from $P$ and $S[x_1, \ldots, x_{d+1}]$ denotes the $d$-dimensional simplex with vertices $x_1, \ldots, x_{d+1}$, that is, the set of all points in $\mathbb{R}^d$ that are convex combinations of $x_1, \ldots, x_{d+1}$.
Liu and Singh (1993) considered the above two depth functions and two more, “Mahalanobis” depth and “majority” depth, which they applied in formulating a “quality index” for use in connection with manufacturing processes. Rousseeuw and Hubert (1999) introduced “regression depth” and Rousseeuw and Ruts (1996), Ruts and Rousseeuw (1996) and Rousseeuw and Struyf (1998) studied computing issues concerning depth functions and contours. Liu, Parelius and Singh (1999) considered seven examples of depth function, including a “convex hull peeling” version and a “likelihood” type, and developed methodology for their practical use in exploratory statistical analysis. Likelihood-based depth functions have also been considered by Fraiman and Meloche (1996) and Fraiman, Liu and Meloche (1997). Koshevoy and Mosler (1997) introduced a “zonoid” depth function based on “zonoid trimming.” Bartoszyński, Pearl and Lawrence (1997) introduced a depth function based on interpoint distances in the context of a multivariate goodness-of-fit test. Depth functions also arise in the theory of social choice [see Caplin and Nalebuff (1988, 1991a, b)]. Nonparametric notions of multivariate “scatter measure” and “more scattered” based on general depth functions have been formulated and studied by Zuo and Serfling (2000a). Mizera (1998) has introduced a differential calculus for depth functions. Finally, Vardi and Zhang (1999) have introduced a method for constructing depth functions from notions of multivariate median.
Depth functions thus have been introduced ad hoc in great variety, without
regard to whether they meet any particular set of criteria that ought to be
satisfied. Consequently, there is no systematic basis for preferring one such
function over another. In the present paper, we address this issue by asking:
(i) What desirable properties should a statistical depth function possess?
(ii) What constructive approaches lead to attractive depth functions?
(iii) Do existing depth functions possess all desired properties?
In Section 2 we list several desirable properties first introduced by Liu (1990), on the basis of which we formulate a general definition of “statistical depth function”. Roughly speaking, these properties may be described as: affine invariance, maximality at center, monotonicity relative to deepest point, and vanishing at infinity. Also, several distinct structures for construction of depth functions are introduced and investigated with respect to possession of these properties, and a number of presently popular depth functions are classified with respect to these different structural types.
In Section 3 we evaluate and critically compare, from the above perspectives as well as from robustness considerations, a number of existing depth functions and some new ones introduced via the above-mentioned constructions. It is found that the halfspace depth and a closely related “projection depth,” both of which reflect projection pursuit methodology, are distinctly more attractive than popular competitors.
Various supplementary notes are provided in Appendix A, including discussion of almost sure uniform convergence of sample depth functions to their population counterparts. Finally, proofs of the results in Section 2 are provided in Appendix B.

2. General notions of statistical depth. Here we consider general notions of depth function on $\mathbb{R}^d$, defined with respect to arbitrary distributions which may be either continuous or discrete. In the spirit of Liu (1990), Section 2.1 presents four desirable properties that an ideal depth function should possess. In Section 2.2 the halfspace and simplicial depth functions are examined with respect to these criteria, and it is found that the halfspace depth possesses all four properties (see Theorem 2.1), whereas the simplicial depth lacks certain properties in some cases (see Remark 2.1). In Section 2.3, several general structures for depth functions are introduced and investigated with respect to the four properties (see Theorems 2.2–2.11). Also, familiar existing versions of depth function as well as some new ones are reviewed in the context of these structures.

2.1. Desirable properties and a general definition. We confine attention to depth functions that are nonnegative and bounded. In order that a depth function serve most effectively as a tool providing a center-outward ordering of points in $\mathbb{R}^d$, it should ideally satisfy the following further properties, which we state informally first and then more precisely in Definition 2.1.
P1. Affine invariance. The depth of a point $x \in \mathbb{R}^d$ should not depend on the underlying coordinate system or, in particular, on the scales of the underlying measurements.
P2. Maximality at center. For a distribution having a uniquely defined “center” (e.g., the point of symmetry with respect to some notion of symmetry), the depth function should attain maximum value at this center.
P3. Monotonicity relative to deepest point. As a point $x \in \mathbb{R}^d$ moves away from the “deepest point” (the point at which the depth function attains maximum value; in particular, for a symmetric distribution, the center) along any fixed ray through the center, the depth at $x$ should decrease monotonically.
P4. Vanishing at infinity. The depth of a point $x$ should approach zero as $\|x\|$ approaches infinity.
We note that P1–P4 are introduced and investigated for the simplicial depth in Liu (1990).
We now formally define “statistical depth function”. Denote by $\mathcal{F}$ the class of distributions on the Borel sets of $\mathbb{R}^d$ and by $F_\xi$ the distribution of a given random vector $\xi$.

Definition 2.1. Let the mapping $D(\cdot\,;\cdot): \mathbb{R}^d \times \mathcal{F} \to \mathbb{R}^1$ be bounded, nonnegative, and satisfy P1–P4. That is, assume:
(i) $D(Ax + b; F_{AX+b}) = D(x; F_X)$ holds for any random vector $X$ in $\mathbb{R}^d$, any $d \times d$ nonsingular matrix $A$, and any $d$-vector $b$;
(ii) $D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$ holds for any $F \in \mathcal{F}$ having center $\theta$;
(iii) for any $F \in \mathcal{F}$ having deepest point $\theta$, $D(x; F) \le D(\theta + \alpha(x - \theta); F)$ holds for $\alpha \in [0, 1]$; and
(iv) $D(x; F) \to 0$ as $\|x\| \to \infty$, for each $F \in \mathcal{F}$.
Then $D(\cdot\,; F)$ is called a statistical depth function.

A sample version of $D(x; P)$, denoted by $D_n(x) \equiv D(x; \hat{P}_n)$, may be defined by replacing $P$ by a suitable empirical measure $\hat{P}_n$.
In the above we have used the term “center” to denote a point of symmetry. Various notions of multivariate symmetry are possible. In particular, a standard notion widely used in the literature is that a random vector $X$ in $\mathbb{R}^d$ is centrally symmetric about $\theta$ if $X - \theta \stackrel{d}{=} \theta - X$, where “$\stackrel{d}{=}$” denotes “equal in distribution.” A broader notion due to Liu (1990) defines $X$ to be angularly symmetric about $\theta$ if $(X - \theta)/\|X - \theta\|$ is centrally symmetric about the origin. A still broader notion, which we here introduce, defines $X$ to be halfspace symmetric about $\theta$ if $P(X \in H) \ge 1/2$ for every closed halfspace $H$ containing $\theta$. In an obvious terminology, it is easily established that C-symmetry $\Rightarrow$ A-symmetry $\Rightarrow$ H-symmetry. For characterizations of H-symmetry motivating its relevance in nonparametric multivariate location inference, see Zuo and Serfling (2000c). Thus the most favorable manifestation of property P2 for a depth function $D(\cdot\,;\cdot)$ is that maximality at center should hold for $D(\cdot\,; F)$ as generally as possible, that is, for every H-symmetric $F$. A similar remark holds with respect to property P3. [For further comparison of angular and halfspace symmetry, and of these with notions in Beran and Millar (1997), see Remark A.2 in Appendix A.]
One might view property P4 as rather too strict and thus instead consider some weaker variant. If, for example, the depth function has a lower limit $L > 0$, one might normalize the depth function by subtracting $L$. But when $L$ depends on $F$ (as for the majority depth when $d \ge 2$), this is computationally and technically very burdensome.
Or one might require merely that $R(x; F) \to 0$ as $\|x\| \to \infty$, where $R(x; F) = P_F(\{y : D(y; F) \le D(x; F)\})$, the proportion of the distribution $F$ having depth $\le$ the depth of $x$. [This quantity is used by Liu and Singh (1993) in defining their “quality index.”] Under P2 and P3, however, convergence of $R(x; F)$ to 0 is seen to hold already and thus does not offer anything productive in addition to P2 and P3.
In Zuo and Serfling [(2000b), Theorem 3.1(iv)], the present form of P4 is useful in establishing compactness of depth-trimmed regions. Further, it plays a role in using truncation arguments to establish almost sure uniform convergence of sample depth functions to population versions.

2.2. A further look at the halfspace and simplicial depth functions. We now investigate whether the halfspace depth function $\mathrm{HD}(x; P)$ and the simplicial depth function $\mathrm{SD}(x; P)$ are “statistical depth functions” in the sense of Definition 2.1. These are treated, respectively, in the following theorem and remark.

Theorem 2.1. The halfspace depth function $\mathrm{HD}(x; P)$ is a statistical depth function in the sense of Definition 2.1.

Remark 2.1. For continuous angularly symmetric distributions, it follows from results of Liu (1990) that the simplicial depth function $\mathrm{SD}(\cdot\,; P)$ is a statistical depth function in the sense of Definition 2.1. For discrete distributions, however, $\mathrm{SD}(x; P)$ can for H-symmetric distributions fail to satisfy the “maximality” property P2 and even for C-symmetric distributions fail to satisfy the “monotonicity” property P3. This is seen from the following counterexamples.

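Counterexample 1. Let $d = 1$ and $P(X = 0) = 1/5$, $P(X = \pm 1) = 1/5$, and $P(X = \pm 2) = 1/5$, so that each of the five points $-2, -1, 0, 1, 2$ carries mass $1/5$. Then clearly $X$ is centrally symmetric about 0. It is not difficult to show that $\mathrm{SD}(1/2; P) = 12/25$ and $\mathrm{SD}(1; P) = 15/25$, violating P3.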
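The two values in Counterexample 1 can be checked by exact arithmetic. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and it assumes that in $d = 1$ the random simplex is the closed segment spanned by two independent draws (sampling with replacement), which is the reading that reproduces $12/25$ and $15/25$. The halfspace depth is computed alongside for comparison.

```python
from fractions import Fraction

# Counterexample 1: mass 1/5 on each of the points -2, -1, 0, 1, 2.
pmf = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}

def simplicial_depth_1d(x, pmf):
    """P(x lies in the closed segment spanned by two independent draws)."""
    below = sum(p for a, p in pmf.items() if a < x)  # P(X < x)
    above = sum(p for a, p in pmf.items() if a > x)  # P(X > x)
    # x is missed only when both draws fall strictly on the same side of x.
    return 1 - below**2 - above**2

def halfspace_depth_1d(x, pmf):
    """min{P(X <= x), P(X >= x)}; closed halfspaces of the line are half-lines."""
    left = sum(p for a, p in pmf.items() if a <= x)
    right = sum(p for a, p in pmf.items() if a >= x)
    return min(left, right)

half = Fraction(1, 2)
sd_half = simplicial_depth_1d(half, pmf)  # 12/25
sd_one = simplicial_depth_1d(1, pmf)      # 15/25, larger although 1 is farther from 0
hd_values = [halfspace_depth_1d(x, pmf) for x in (0, half, 1)]  # 3/5, 2/5, 2/5
```

Under these assumptions the simplicial depth increases when moving from $1/2$ to $1$ along the ray from the center, while the halfspace depth does not increase along the same ray.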

Counterexample 2. Let $d = 2$ and $P(X = (\pm 1, 0)) = P(X = (\pm 2, 0)) = P(X = (0, \pm 1)) = 1/6$. Then $X$ is centrally symmetric about $(0, 0)$ and

$$\mathrm{SD}((1, 0); P) - \mathrm{SD}((1/2, 0); P) = 3! \cdot 2 \cdot (1/6)^3 = 1/18 > 0,$$

again violating P3.

Counterexample 3. Let $d = 2$ and $P(X = \theta = (0, 0)) = 19/40$, $P(X = A = (-1, 1)) = 3/40$, and $P(X = B = (-1, -1)) = P(X = C = (1, 0)) = 1/40$. Let the line $B\theta$ intersect $AC$ at $D$, let $x$ be a point inside the triangle $A\theta D$, and let $P(X = x) = 16/40$. Then it is not difficult to verify, based on results established in Zuo and Serfling (2000c), that $X$ is H-symmetric about $\theta$, which is thus the center of the distribution. However, we have

$$\mathrm{SD}(x; P) - \mathrm{SD}(\theta; P) = \frac{3!}{40^3}\,(2 \times 16 \times 1 \times 3 - 3 \times 1 \times 19 + 1 \times 1 \times 19) > 0,$$

that is, the “maximality” property P2 fails to hold.

For the above two well-known notions of depth function, we thus have found that one behaves well overall, while in some discrete cases the other is not completely satisfactory. This leads one to investigate whether other attractive statistical depth functions can be defined, indeed to explore general structures for such functions and to seek to identify the more favorable types.

2.3. General structures for statistical depth functions. Four general structures for construction of statistical depth functions are introduced and investigated with respect to properties P1–P4. Various existing depth functions are classified according to these types.

2.3.1. Type A depth functions. Let $h(x; x_1, \ldots, x_r)$ be any bounded nonnegative function which in some sense measures the closeness of $x$ to the points $x_1, \ldots, x_r$. A corresponding Type A depth function is then defined by the average closeness of $x$ to a random sample of size $r$:

$$D(x; P) = E\,h(x; X_1, \ldots, X_r), \tag{1}$$

where $X_1, \ldots, X_r$ is a random sample from $P$. For such depth functions the corresponding sample versions $D(x; \hat{P}_n)$ turn out to be U-statistics or V-statistics.
Taking $r = d + 1$ and $h(x; x_1, \ldots, x_{d+1}) = I\{x \in S[x_1, \ldots, x_{d+1}]\}$, we obtain the simplicial depth, whose properties have been covered in Section 2.2. Another example is the following.
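As an illustration of the Type A structure, the U-statistic sample version can be sketched as an average of the kernel $h$ over all $r$-subsets of the data. The code below is an illustrative sketch rather than the authors' implementation: `type_a_depth` and `in_closed_triangle` are hypothetical names, the dimension is fixed at $d = 2$, and the triangle test assumes the three vertices are not collinear.

```python
from itertools import combinations

def in_closed_triangle(x, a, b, c):
    """Kernel h for simplicial depth in the plane: is x in the closed triangle abc?

    Assumes a, b, c are not collinear (sample in general position).
    """
    def cross(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    d1, d2, d3 = cross(a, b, x), cross(b, c, x), cross(c, a, x)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)  # all signs agree (zeros allowed)

def type_a_depth(x, sample, kernel, r):
    """U-statistic version of (1): mean of kernel(x; x_1..x_r) over r-subsets."""
    subsets = list(combinations(sample, r))
    return sum(kernel(x, *s) for s in subsets) / len(subsets)

sample = [(0, 0), (4, 0), (0, 4), (1, 1)]
depth_inner = type_a_depth((1, 0.5), sample, in_closed_triangle, 3)
depth_far = type_a_depth((10, 10), sample, in_closed_triangle, 3)
```

Any other bounded kernel with the same signature can be passed in place of `in_closed_triangle`; a V-statistic version would average over all ordered $r$-tuples with repetition instead.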

Example 2.1. [Majority depth (Singh, 1991)] For given points $x_1, \ldots, x_d$ in $\mathbb{R}^d$ which determine a unique hyperplane containing themselves, there correspond two closed halfspaces with this hyperplane as boundary. Denote by $H^P_{x_1, \ldots, x_d}$ the one which carries probability mass $\ge 1/2$ under the distribution $P$ on $\mathbb{R}^d$. Then the majority depth function is defined by

$$\mathrm{MJD}(x; P) = P(x \in H^P_{X_1, \ldots, X_d}), \quad x \in \mathbb{R}^d, \tag{2}$$

where $X_1, \ldots, X_d$ is a random sample from $P$. Clearly, the majority depth function is of Type A with $r = d$ and $h(x; x_1, \ldots, x_d) \equiv I\{x \in H^P_{x_1, \ldots, x_d}\}$.
Let us explore the majority depth function with respect to properties P1–P4. Clearly P1 is satisfied. Also, as remarked by Liu and Singh (1993), for any A-symmetric distribution $P$, $\mathrm{MJD}(x; P)$ decreases monotonically as $x$ moves away from the center along any fixed ray originating from the center, that is, P2 and P3 hold. Indeed, the following result establishes this more generally.

Theorem 2.2. For H-symmetric distributions $P$, $\mathrm{MJD}(x; P)$ satisfies P2 and P3.

The majority depth fails to satisfy property P4, however. As a counterexample, take $d = 2$ and define $P$ by $P(X = (\pm 1, 0)) = 1/3$ and $P(X = (0, 1)) = 1/3$. Then it is easy to see that $\lim_{\|x\| \to \infty} \mathrm{MJD}(x; P) = 2/3$. As another counterexample, for $d = 1$ one can show for any $P$ that $\mathrm{MJD}(x; P) = 1/2 + \min\{P(x), 1 - P(x)\} \to 1/2$ as $|x| \to \infty$.

2.3.2. Type B depth functions. Let $h(x; x_1, \ldots, x_r)$ be an unbounded nonnegative function which measures in some sense the distance of $x$ from the points $x_1, \ldots, x_r$. A corresponding Type B depth function is then defined by

$$D(x; F) \equiv (1 + E\,h(x; X_1, \ldots, X_r))^{-1} \tag{3}$$

for $X_1, \ldots, X_r$ a random sample from $F$. Closely related to (3), but not equivalent, is the structure $E[(1 + h(x; X_1, \ldots, X_r))^{-1}]$, which is a further example of the Type A structure. For the sake of tractability, we prefer the form (3).
As a measure of dispersion of a point cloud $\{x, x_1, \ldots, x_r\}$, the function $h(x; x_1, \ldots, x_r)$ possibly may not possess the affine invariance property P1, but in many such cases it satisfies at least rigid-body invariance, that is, $h(Ax + b; Ax_1 + b, \ldots, Ax_r + b) = h(x; x_1, \ldots, x_r)$ for any $d \times d$ orthogonal matrix $A$ and any vector $b \in \mathbb{R}^d$. For example, see the $L^p$ depth treated below. Or, a suitable modification of the function $h$ sometimes yields an affine invariant version, as in the case of the “simplicial volume depth” as well as the $L^2$ depth treated below. Regarding properties P2–P4, Type B depth functions are rather well behaved, as shown by the following examples and theorems.
Example 2.2 (Simplicial volume depth). Take

$$h(x; x_1, \ldots, x_d) = \Delta^\alpha(S[x, x_1, \ldots, x_d]),$$

where $\Delta(S[x, x_1, \ldots, x_d])$ denotes the volume of the $d$-dimensional simplex $S[x, x_1, \ldots, x_d]$ and $\alpha > 0$. This is a measure of the dispersion of the point cloud $\{x, x_1, \ldots, x_d\}$ and accordingly

$$(1 + E[\Delta^\alpha(S[x, X_1, \ldots, X_d])])^{-1} \tag{4}$$

defines a Type B depth function. This depth function usually is not affine invariant, however, since

$$\Delta^\alpha(S[Ax + b, Ax_1 + b, \ldots, Ax_d + b]) = |\det(A)|^\alpha\, \Delta^\alpha(S[x, x_1, \ldots, x_d]),$$

where $b$ is any vector in $\mathbb{R}^d$, and the determinant $\det(A)$ of the nonsingular matrix $A$ is not always equal to 1. This problem can be rectified by a modification. Rather than (4), we define the simplicial volume depth function by

$$\mathrm{SVD}^\alpha(x; F) \equiv \left(1 + E\left[\left(\frac{\Delta(S[x, X_1, \ldots, X_d])}{\sqrt{\det(\Sigma)}}\right)^{\alpha}\right]\right)^{-1}, \tag{5}$$

where $\Sigma$ is the covariance matrix of $F$. This version is affine invariant.
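The effect of the covariance normalization can be checked numerically in the plane. The sketch below is an illustration under stated assumptions, not the authors' code: it works in $d = 2$, replaces the expectation by an average over all pairs of sample points, uses the sample covariance matrix in place of the population covariance, and reads the normalizing constant as the square root of its determinant. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt, isclose

def triangle_area(p, q, r):
    """Volume of the 2-dimensional simplex with vertices p, q, r."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2

def cov_det(sample):
    """Determinant of the (divide-by-n) sample covariance matrix in the plane."""
    n = len(sample)
    mx = sum(p[0] for p in sample) / n
    my = sum(p[1] for p in sample) / n
    sxx = sum((p[0] - mx) ** 2 for p in sample) / n
    syy = sum((p[1] - my) ** 2 for p in sample) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / n
    return sxx * syy - sxy ** 2

def svd_depth(x, sample, alpha=1.0, normalize=True):
    """Sample analogue of (5); with normalize=False, of the unnormalized form (4)."""
    scale = sqrt(cov_det(sample)) if normalize else 1.0
    pairs = list(combinations(sample, 2))
    mean_h = sum((triangle_area(x, a, b) / scale) ** alpha for a, b in pairs) / len(pairs)
    return 1 / (1 + mean_h)

def affine(p, A, b):
    """Apply the map p -> A p + b for a 2x2 matrix A."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

sample = [(0, 0), (2, 1), (1, 3), (-1, 2), (3, -1)]
A, b, x = [[2, 1], [0, 3]], (5, -4), (1, 1)      # det(A) = 6
moved = [affine(p, A, b) for p in sample]
same = isclose(svd_depth(x, sample), svd_depth(affine(x, A, b), moved))
differs = not isclose(svd_depth(x, sample, normalize=False),
                      svd_depth(affine(x, A, b), moved, normalize=False))
```

Here `same` compares the normalized depth before and after the transformation, and `differs` does the same for the unnormalized form, whose kernel is scaled by $|\det(A)|^\alpha$.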

Remark 2.2. Oja (1983) introduced for C-symmetric distributions a family of location measures utilizing simplicial volume, as follows. For each $\alpha > 0$, a location measure $\mu_\alpha: \mathcal{F} \to \mathbb{R}^d$ is defined by

$$E[\Delta^\alpha(S[\mu_\alpha(F), X_1, \ldots, X_d])] = \inf_{\mu \in \mathbb{R}^d} E[\Delta^\alpha(S[\mu, X_1, \ldots, X_d])].$$

However, he did not develop it into a depth function, nor did he consider the affine invariant version (5).

Example 2.3. [$L^p$ depth ($p > 0$).] Another way to measure distance is via the $L^p$ norm $\|\cdot\|_p$. Taking $h(x; x_1) = \|x - x_1\|_p$, a corresponding Type B depth function is given by

$$L^pD(x; F) \equiv (1 + E\|x - X\|_p)^{-1}. \tag{6}$$

Note that $L^pD(x; F)$ generally does not possess the affine invariance property, however, since

$$E\|(Ax + b) - (AX + b)\|_p = E\|A(x - X)\|_p,$$

which is not equal to $E\|x - X\|_p$ for every nonsingular matrix $A$. On the other hand, taking $p = 2$, it is easy to see that $L^2D(x; F)$ is rigid-body invariant.
Moreover, a modification of the $L^2$ norm yields an affine invariant version. Following Rao (1988), for a positive definite $d \times d$ matrix $M$, define a norm $\|\cdot\|_M$ as

$$\|x\|_M \equiv \sqrt{x' M x}, \quad \forall x \in \mathbb{R}^d. \tag{7}$$

Then, for $p = 2$, the depth function defined in (6) may be modified to an affine invariant version,

$$\tilde{L}^2D(x; F) \equiv (1 + E\|x - X\|_{\Sigma^{-1}})^{-1}, \tag{8}$$

where $\Sigma$ is the covariance matrix of $F$.
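The invariance claims for the $L^p$ depth can be illustrated on a small planar sample. This is a hedged sketch only: `lp_depth` and `rigid_motion` are hypothetical names, the expectation in (6) is replaced by a sample mean, and the particular sample, rotation angle and shift are arbitrary choices made for the illustration.

```python
from math import cos, sin, pi, isclose

def lp_depth(x, sample, p=2.0):
    """Sample analogue of (6): 1 / (1 + mean L^p distance from x to the sample)."""
    mean_dist = sum((abs(x[0] - a) ** p + abs(x[1] - b) ** p) ** (1 / p)
                    for a, b in sample) / len(sample)
    return 1 / (1 + mean_dist)

def rigid_motion(pt, angle, shift):
    """Rotate pt by `angle` about the origin, then translate by `shift`."""
    c, s = cos(angle), sin(angle)
    return (c * pt[0] - s * pt[1] + shift[0], s * pt[0] + c * pt[1] + shift[1])

sample = [(0, 0), (2, 0), (0, 1), (-1, -1)]
x, angle, shift = (1, 0), pi / 4, (3, -2)
moved = [rigid_motion(q, angle, shift) for q in sample]
moved_x = rigid_motion(x, angle, shift)

l2_invariant = isclose(lp_depth(x, sample, 2), lp_depth(moved_x, moved, 2))
l1_invariant = isclose(lp_depth(x, sample, 1), lp_depth(moved_x, moved, 1))
scaled = [(2 * a, 2 * b) for a, b in sample]      # a non-orthogonal linear map
l2_scale_invariant = isclose(lp_depth(x, sample, 2), lp_depth((2, 0), scaled, 2))
```

On this sample the $L^2$ version is unchanged by the rotation and shift, the $L^1$ version is not, and neither survives a change of scale, in line with the remark that (6) is at best rigid-body invariant.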

Under some conditions on $h(x; x_1, \ldots, x_r)$, Type B depth functions necessarily satisfy P2 and P3, as shown in the following two results.

Theorem 2.3. Suppose $\theta$ is the point of symmetry of a distribution $F$ with respect to a given notion of symmetry. Then Type B depth functions $D(x; F)$ possess the “maximality at center” property P2 if:
(i) $h(x + b; x_1 + b, \ldots, x_r + b) = h(x; x_1, \ldots, x_r)$;
(ii) $h(-x; -x_1, \ldots, -x_r) = h(x; x_1, \ldots, x_r)$;
(iii) $h(x; x_1, \ldots, x_r)$ is convex in the argument $x$; and
(iv) for $x$, $b$ and $x_1, \ldots, x_r$ arbitrary vectors in $\mathbb{R}^d$ and $X_1, \ldots, X_r$ a random sample from $F$, the set

$$\arg\inf_{x \in \mathbb{R}^d} E\,h(x; X_1 - \theta, \ldots, X_r - \theta) \;\cap\; \arg\inf_{x \in \mathbb{R}^d} E\,h(x; \theta - X_1, \ldots, \theta - X_r)$$

is nonempty.

Remark 2.3. For any distribution C-symmetric about a point $\theta$ in $\mathbb{R}^d$, there is always a point $y \in \mathbb{R}^d$ satisfying condition (iv) above.

Theorem 2.4. If $h(x; x_1, \ldots, x_r)$ is convex in $x$, then the corresponding Type B depth function $D(x; F)$ decreases monotonically as $x$ moves outward along any ray starting at a deepest point of $F$.

Equipped with the above two results, we now take a further look at $\mathrm{SVD}^\alpha(x; F)$ and $L^pD(x; F)$.

Corollary 2.1. For $\alpha \ge 1$, $\mathrm{SVD}^\alpha(x; F)$ satisfies P3 and P4.

Since $\Delta^\alpha(S[x, x_1, \ldots, x_d])$ is convex and rigid-body invariant, according to Theorem 2.3 we obtain

Corollary 2.2. For C-symmetric distributions and $\alpha \ge 1$, $\mathrm{SVD}^\alpha(x; F)$ satisfies P2.

The affine invariance and Corollaries 2.1 and 2.2 thus yield:

Theorem 2.5. For C-symmetric distributions and $\alpha \ge 1$, $\mathrm{SVD}^\alpha(x; F)$ is a statistical depth function in the sense of Definition 2.1.

The next three results treat P2–P4 for $L^pD(x; F)$, $p \ge 1$, and $\tilde{L}^2D(x; F)$. Convexity of $h(x; x_1) = \|x - x_1\|_p$ in the argument $x$ follows in straightforward fashion from Minkowski’s inequality. Thus Theorem 2.4 yields P3 for $L^pD(x; F)$, while P4 is obvious. Thus we have

Corollary 2.3. For $p \ge 1$, $L^pD(x; F)$ satisfies P3 and P4.

Since $h(x; x_1)$ is location invariant and even, that is, $h(x + b; x_1 + b) = h(x; x_1)$ for any vector $b \in \mathbb{R}^d$ and $h(-x; -x_1) = h(x; x_1)$, by the convexity just established and Theorem 2.3 we obtain:

Corollary 2.4. For C-symmetric distributions and for $p \ge 1$, $L^pD(x; F)$ satisfies P2.

For $\tilde{L}^2D(x; F)$ we have:

Theorem 2.6. For any distribution $F$ A-symmetric about a unique point $\theta \in \mathbb{R}^d$, $\tilde{L}^2D(x; F)$ defined in (8) is a statistical depth function in the sense of Definition 2.1.

Remark 2.4. In the foregoing proof, condition (iv) of Theorem 2.3 was established for $\tilde{L}^2D(x; F)$ for all A-symmetric $F$. For the depth function $L^2D(x; F)$, it follows from results established in Zuo and Serfling (2000c) that this condition holds for all H-symmetric $F$.

2.3.3. Type C depth functions. Let $O(x; F)$ be a measure of the outlyingness of the point $x$ in $\mathbb{R}^d$ with respect to the center or the deepest point of the distribution $F$. Usually $O(x; F)$ is unbounded, but a corresponding bounded depth function is defined by

$$D(x; F) \equiv (1 + O(x; F))^{-1}. \tag{9}$$

We call these Type C depth functions.

Remark 2.5. Although Type B and Type C depth functions are clearly similar in form, it is convenient to treat them separately, as they arise from somewhat different conceptual points of view.

Example 2.4 (Projection depth). Define the outlyingness of a point $x$ to be the worst case outlyingness of $x$ with respect to the one-dimensional median in any one-dimensional projection, that is,

$$O(x; F) \equiv \sup_{\|u\| = 1} \frac{|u'x - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)}, \tag{10}$$

where $X$ has distribution $F$, Med denotes the univariate median, MAD denotes the univariate median absolute deviation defined for univariate $Y$ as $\mathrm{MAD}(Y) = \mathrm{Med}(|Y - \mathrm{Med}(Y)|)$, and $\|\cdot\|$ is the Euclidean norm. We call the corresponding Type C depth function projection depth and denote it by $\mathrm{PD}(x; F)$, $x \in \mathbb{R}^d$.

Remark 2.6. For one-dimensional datasets $\mathbf{X} = \{X_1, \ldots, X_n\}$,

$$O_n(x) \equiv |x - \mathrm{Med}_{1 \le i \le n}\{X_i\}| \,/\, \mathrm{MAD}_{1 \le i \le n}\{X_i\}$$

has long been used as a robust measure of outlyingness of $x \in \mathbb{R}$ with respect to the center (median) of the dataset. See Mosteller and Tukey [(1977), pages 205–208]. Here

$$\mathrm{Med}_{1 \le i \le n}\{X_i\} = \tfrac{1}{2}\left(X_{(\lfloor (n+1)/2 \rfloor)} + X_{(\lfloor (n+2)/2 \rfloor)}\right),$$

$$\mathrm{MAD}_{1 \le i \le n}\{X_i\} = \mathrm{Med}_{1 \le i \le n}\{|X_i - \mathrm{Med}_{1 \le j \le n}\{X_j\}|\},$$

and $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered $X_1, \ldots, X_n$. Donoho and Gasko (1992) generalized this to arbitrary dimension $d$, defining $O_n(x)$ to be the worst case outlyingness of $x \in \mathbb{R}^d$ in any one-dimensional projection of $x$ and the dataset $\mathbf{X}$. A sample version of the projection depth function $\mathrm{PD}(x; F)$ is thus given by

$$\mathrm{PD}_n(x) = (1 + O_n(x))^{-1}. \tag{11}$$

Liu (1992) suggested the use of (11) as a data depth function, but did not provide any treatment of it.
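A rough computational reading of (10) and (11) is sketched below. It is an approximation under explicit assumptions, not a definitive algorithm: the supremum over unit vectors is replaced by a maximum over a finite grid of `n_dirs` directions in the plane (so the returned depth can only overestimate the sample projection depth), the data are assumed to have nonzero MAD in every scanned direction, and the function names are hypothetical. `statistics.median` averages the two middle order statistics for even $n$, matching the median in Remark 2.6.

```python
from statistics import median
from math import cos, sin, pi

def outlyingness_1d(x, data):
    """O_n(x) = |x - Med| / MAD for a univariate dataset (MAD assumed nonzero)."""
    med = median(data)
    mad = median([abs(v - med) for v in data])
    return abs(x - med) / mad

def projection_depth_2d(x, sample, n_dirs=360):
    """Approximate PD_n(x) in (11) by scanning n_dirs directions in [0, pi)."""
    worst = 0.0
    for k in range(n_dirs):
        t = pi * k / n_dirs               # u and -u give the same ratio
        u = (cos(t), sin(t))
        projected = [u[0] * a + u[1] * b for a, b in sample]
        worst = max(worst, outlyingness_1d(u[0] * x[0] + u[1] * x[1], projected))
    return 1 / (1 + worst)

data = [1, 2, 3, 4, 100]                  # median 3, MAD 1; the outlier 100 has no effect
o_five = outlyingness_1d(5, data)

cloud = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, -1)]
pd_center = projection_depth_2d((0, 0), cloud)
pd_near = projection_depth_2d((0.5, 0.5), cloud)
pd_far = projection_depth_2d((5, 5), cloud)
```

The point cloud is centrally symmetric about the origin, so the origin has outlyingness 0 in every direction and depth 1, and the depth falls off as the point moves outward.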

Example 2.5 (Mahalanobis depth). Mahalanobis (1936) introduced a distance between two points $x$ and $y$ in $\mathbb{R}^d$, with respect to a positive definite $d \times d$ matrix $M$, as

$$d^2_M(x, y) = (x - y)' M^{-1} (x - y).$$

Based on this Mahalanobis distance, one can define a Mahalanobis depth as the corresponding Type C depth function,

$$\mathrm{MHD}(x; F) = (1 + d^2_{\Sigma(F)}(x, \mu(F)))^{-1}, \tag{12}$$

where $F$ is a given distribution and $\mu(F)$ and $\Sigma(F)$ are any corresponding location and covariance measures, respectively. The case that $\mu(F)$ and $\Sigma(F)$ are the mean and covariance matrix of $F$ was suggested by Liu (1992). For these choices, however, $\mathrm{MHD}(\cdot\,; F)$ is not “robust” [since $\mu(F)$ = mean is not robust, as noted by Liu and Singh (1993)], and it can fail to achieve maximum value at the center of A-symmetric distributions.
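For the mean and covariance choice mentioned above, (12) can be sketched for planar data as follows. This is an illustrative sketch with hypothetical function names: it uses the sample mean and the divide-by-$n$ sample covariance as $\mu$ and $\Sigma$, assumes the covariance matrix is nonsingular, and inverts the $2 \times 2$ matrix in closed form.

```python
from math import isclose

def mean_cov(sample):
    """Sample mean and (divide-by-n) covariance entries of planar data."""
    n = len(sample)
    mx = sum(p[0] for p in sample) / n
    my = sum(p[1] for p in sample) / n
    sxx = sum((p[0] - mx) ** 2 for p in sample) / n
    syy = sum((p[1] - my) ** 2 for p in sample) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in sample) / n
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_depth(x, sample):
    """Sample analogue of (12) with mu = mean and Sigma = covariance matrix."""
    (mx, my), (sxx, sxy, syy) = mean_cov(sample)
    det = sxx * syy - sxy ** 2            # assumed nonzero
    dx, dy = x[0] - mx, x[1] - my
    # (x - mu)' Sigma^{-1} (x - mu), using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return 1 / (1 + d2)

def affine(p, A, b):
    """Apply the map p -> A p + b for a 2x2 matrix A."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

sample = [(0, 0), (2, 1), (1, 3), (-1, 2), (3, -1)]   # mean (1, 1)
A, b = [[2, 1], [0, 3]], (5, -4)
moved = [affine(p, A, b) for p in sample]
depth_at_mean = mahalanobis_depth((1, 1), sample)
depth_off = mahalanobis_depth((2, 2), sample)
depth_off_moved = mahalanobis_depth(affine((2, 2), A, b), moved)
```

Because the mean and covariance are affine equivariant, the depth of a point is unchanged when the point and the data undergo the same nonsingular affine map; replacing `mean_cov` by robust location and scatter estimates would change the robustness of the resulting depth but not this structure.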
For Type C depth functions, the following analogues of Theorems 2.3 and 2.4 hold and can be proved similarly. It is convenient to write $O(x; X)$ for $O(x; F_X)$.

Theorem 2.7. Suppose $\theta$ in $\mathbb{R}^d$ is the point of symmetry of a distribution $F$ with respect to a given notion of symmetry. The Type C depth functions $D(x; F)$ possess the “maximality at center” property P2 if for arbitrary vectors $x$, $b$ in $\mathbb{R}^d$
(i) $O(x + b; X + b) = O(x; X)$;
(ii) $O(-x; -X) = O(x; X)$;
(iii) $O(x; X)$ is convex in the argument $x$; and
(iv) the set

$$\left\{ y \in \arg\inf_{x \in \mathbb{R}^d} O(x; X - \theta) \;\cap\; \arg\inf_{x \in \mathbb{R}^d} O(x; \theta - X) \right\}$$

is nonempty.

Theorem 2.8. If $O(x; F)$ is convex in the argument $x$, then the corresponding Type C depth function $D(x; F)$ decreases monotonically as $x$ moves outward along any ray starting at a deepest point of $F$.

The following two theorems establish that $\mathrm{PD}(x; F)$ and $\mathrm{MHD}(x; F)$ are proper statistical depth functions.

Theorem 2.9. The projection depth function $\mathrm{PD}(x; F)$ is a statistical depth function in the sense of Definition 2.1.

A location measure $\mu$ is affine equivariant if $\mu(AX + b) = A\mu(X) + b$ for any affine transformation $AX + b$ of $X$. A covariance measure $\Sigma$ is affine equivariant if $\Sigma(AX + b) = A\,\Sigma(X)\,A'$ for any affine transformation $AX + b$ of $X$.

Theorem 2.10. Let $F$ be symmetric. Then the Mahalanobis depth function $\mathrm{MHD}(x; F)$ is a statistical depth function in the sense of Definition 2.1 if $\mu$ and $\Sigma$ are affine equivariant and $\mu(F)$ agrees with the point of symmetry of $F$.

The proof is straightforward.

2.3.4. Type D depth functions. One can interpret the “tailedness” of a point with respect to a given distribution as an index related to its relative depth with respect to the center or deepest point of the distribution. Let $\mathcal{C}$ be a class of closed subsets of $\mathbb{R}^d$ and $P$ a probability measure on $\mathbb{R}^d$. A corresponding Type D depth function is defined by

$$D(x; P, \mathcal{C}) \equiv \inf_{C}\{P(C) : x \in C \in \mathcal{C}\}. \tag{13}$$

Thus the $\mathcal{C}$-depth of a point $x$ with respect to a probability measure $P$ on $\mathbb{R}^d$ is defined to be the minimum probability mass carried by a set $C$ in $\mathcal{C}$ that contains $x$. In essence, this form of depth function is equivalent, via $D = 1 - I$, to the “index function” $I(x; P, \mathcal{C})$ introduced by Small (1987) for measuring the “tailedness” of points $x$ in some space. Such functions have antecedents in game theoretical work of Hotelling (1929) and Chamberlin (1937).
We confine attention to classes $\mathcal{C}$ satisfying the following conditions:
C1. If $C \in \mathcal{C}$, then $\overline{C^c} \in \mathcal{C}$.
C2. For $C \in \mathcal{C}$ and $x \in C^\circ$, there exists $C_1 \in \mathcal{C}$ with $x \in \partial C_1$, $C_1 \subset C^\circ$,
where $\partial C$, $C^c$, $C^\circ$ and $\overline{C}$ denote, respectively, the boundary, complement, interior and closure of $C$.
The class $\mathcal{H}$ of all closed halfspaces on $\mathbb{R}^d$ satisfies C1 and C2 and thus the halfspace depth is a typical example of Type D depth function. As shown in Theorem 2.1, $\mathrm{HD}(x; P)$ is a statistical depth function. Useful further properties of $\mathrm{HD}(x; P)$ that in fact hold more generally are given in the following result.

Theorem 2.11. Let $\mathcal{C}$ be a class of closed Borel sets satisfying C1 and C2. Further, for a given probability measure $P$ on $\mathbb{R}^d$, assume that if $x \in C \in \mathcal{C}$ and $P(C) < \alpha$, then there is a $C_1 \in \mathcal{C}$ such that $x \in C_1^\circ$ and $P(C_1) < \alpha$. Then:
(i) $D(x; P, \mathcal{C})$ is upper semicontinuous;
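(ii) $D^\alpha \equiv \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\}$, $\alpha \in (0, 1]$, are compact and nested (i.e., $D^{\alpha_1} \subset D^{\alpha_2}$ if $\alpha_1 > \alpha_2$); and
(iii) $D^\alpha$ is convex if every $C \in \mathcal{C}$ is convex.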

Remark 2.7. If C2 is replaced by

C2′. $P(\partial C) = 0$, $\forall\, C \in \mathcal{C}$,

the above theorem remains true.
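For the class of closed halfspaces, the infimum in (13) can be computed exactly for an empirical measure in the plane. The sketch below is one simple way to do so and is not taken from the paper: `halfspace_depth_2d` is a hypothetical name, the dimension is fixed at $d = 2$, and the method relies on the observation that it suffices to consider halfplanes whose boundary line passes through $x$, with the count of sample points changing only at directions where some sample point lies on that line. The code therefore evaluates the count once inside each arc between consecutive critical directions.

```python
from math import atan2, cos, sin, pi

def halfspace_depth_2d(x, sample, eps=1e-9):
    """Smallest fraction of sample points in a closed halfplane containing x."""
    diffs = [(a - x[0], b - x[1]) for a, b in sample]
    # Normal directions for which some sample point lies on the boundary line.
    critical = sorted(
        (atan2(dy, dx) + s * pi / 2) % (2 * pi)
        for dx, dy in diffs if (dx, dy) != (0, 0)
        for s in (1, -1)
    )
    if not critical:                      # every sample point coincides with x
        return 1.0
    best = len(sample)
    for i, a in enumerate(critical):
        b = critical[i + 1] if i + 1 < len(critical) else critical[0] + 2 * pi
        if b - a <= eps:                  # duplicate critical angles: empty arc
            continue
        t = (a + b) / 2                   # a direction strictly inside the arc
        u = (cos(t), sin(t))
        count = sum(1 for dx, dy in diffs
                    if (dx, dy) == (0, 0) or u[0] * dx + u[1] * dy > 0)
        best = min(best, count)
    return best / len(sample)

square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
hd_center = halfspace_depth_2d((0, 0), square)
hd_vertex = halfspace_depth_2d((1, 1), square)
hd_outside = halfspace_depth_2d((3, 0), square)
```

On the four corners of a square the depth is 1/2 at the center, 1/4 at a vertex, and 0 outside the convex hull, so the regions $D^\alpha$ of Theorem 2.11 are nested as $\alpha$ decreases. Faster algorithms exist in the literature cited in Section 1; this version favors transparency over speed.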

3. Concluding remarks. Here we examine and compare a number of depth functions with respect to the criteria given by properties P1–P4.
We begin with four cases having central importance because the corresponding versions of multidimensional median generated by their points of maximal depth are among the most popular competitors for nonparametric and robust estimation of multidimensional location. These are the halfspace depth (Type D, Section 2.3.4), the simplicial depth (Type A, Section 2.3.1), the simplicial volume depth (Type B, Example 2.2), and the $L^2$ depth (Type B, Example 2.3), which generate, respectively, the so-called Tukey/Donoho halfspace median (H), the Liu simplicial depth median (S), the Oja median (O) and the spatial or $L^2$ median. [See Small (1990) for an overview of these and other multidimensional medians.] With respect to affine invariance P1, all but the $L^2$ version are fully satisfactory, the $L^2$ depth function being invariant only under rotational and rigid-body transformations. The “maximality at center” property P2 is satisfied by the halfspace depth function for H-symmetric distributions (see the proof of Theorem 2.1) and can be shown to be satisfied by the $L^2$ depth function for all H-symmetric distributions (see Remark 2.4) and the simplicial volume depth function for C-symmetric distributions (see Corollary 2.2). Also, P2 is satisfied by the simplicial depth function for continuous A-symmetric distributions but not necessarily for discrete H-symmetric distributions (see Remark 2.1). The “monotonicity relative to deepest point” P3 is satisfied arbitrarily by the halfspace, simplicial volume, and $L^2$ depth functions, and also by the simplicial depth function except in some discrete cases (see Theorem 2.1, Remark 2.1, and Corollaries 2.1 and 2.3). Finally, “vanishing at infinity” P4 is satisfied by all four of these depth functions (see Theorem 2.1 and Corollaries 2.1 and 2.3). Thus, from consideration of P1–P4, the halfspace and simplicial volume depth functions appear to be the most comprehensively attractive among these four competitors. If, however, we in addition consider breakdown points of the corresponding location estimators [for details, see Small (1990), Niinimaa, Oja and Tableman (1990), Donoho and Gasko (1992) and Chen (1995)], we find that the estimator based on the simplicial volume depth, unlike the others, has breakdown point 0, while that based on the halfspace depth has breakdown point 1/3 for typical data sets, leading us to prefer the halfspace depth function more exclusively.
Let us now consider the projection depth and the Mahalanobis depth. By
Theorems 2.9 and 2.10, these both satisfy properties P1–P4. Regarding robust-
ness, however, the multidimensional median corresponding to sample projec-
tion depth has large-sample breakdown point 1/2 [see Tyler (1994), page 1033,
and Zuo (1999)] as does the closely related Donoho-Stahel estimator [Stahel
(1981), Donoho (1982) and Donoho and Gasko (1992)], whereas the robustness
of the median generated by the Mahalanobis depth depends critically on the
choice of location and covariance measures in defining this depth. We antici-
pate that suitable choices exist which yield high breakdown point. Therefore,
we consider both of these depth functions to be competitive.
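To make the comparison concrete, the following Python sketch computes sample versions of both depths in the plane. It is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the supremum over unit vectors in the projection outlyingness is approximated on a finite grid of directions (so the returned projection depth is an upper bound on the exact value), and the Mahalanobis depth is taken in the usual form 1/(1 + squared Mahalanobis distance) with the sample mean and sample covariance, which is just one (non-robust) choice of location and covariance measures.

```python
import math
import statistics


def projection_depth_2d(x, data, n_dirs=180):
    """Approximate sample projection depth 1 / (1 + O_n(x)) in the plane."""
    worst = 0.0
    for k in range(n_dirs):
        t = math.pi * k / n_dirs  # u and -u give the same ratio, so [0, pi) suffices
        u = (math.cos(t), math.sin(t))
        proj = [u[0] * p[0] + u[1] * p[1] for p in data]
        med = statistics.median(proj)
        mad = statistics.median([abs(v - med) for v in proj])
        if mad == 0.0:
            continue  # simplification of this sketch: skip degenerate directions
        ratio = abs(u[0] * x[0] + u[1] * x[1] - med) / mad
        worst = max(worst, ratio)
    return 1.0 / (1.0 + worst)


def mahalanobis_depth_2d(x, data):
    """Sample Mahalanobis depth using the sample mean and covariance (assumed choice)."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy * sxy  # assumed nonzero (nonsingular covariance)
    dx, dy = x[0] - mx, x[1] - my
    # Quadratic form (x - mean)' S^{-1} (x - mean) for a 2 x 2 matrix S.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return 1.0 / (1.0 + d2)


# A small sample that is centrally symmetric about the origin.
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0),
        (2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
for point in [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]:
    print(point, projection_depth_2d(point, data), mahalanobis_depth_2d(point, data))
```

For this symmetric sample both depths are largest at the origin and decrease along the ray through (1, 1) and (5, 5), in line with P2 and P3.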
Another approach toward construction of depth functions consists of “peel-
ing” methods, such as convex hull peeling. This latter approach, however, not
only lacks a population analogue but also exhibits very unfavorable robust-
ness properties. See discussion of Donoho and Gasko (1992), Nolan (1992) and
Liu, Parelius and Singh (1999).
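For readers unfamiliar with peeling, a minimal Python sketch of convex hull peeling in the plane is given here. It is an illustration only: the function names are hypothetical, the hull is computed with the standard monotone chain method, and points are assumed to be in general position (in this simplified version a point lying on a hull edge without being a vertex is assigned to a later layer). The "depth" of a data point is the index of the layer in which it is removed, which also makes plain why the notion is defined only for samples.

```python
def _cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull (monotone chain), in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def peeling_layers(points):
    """Map each point (a tuple) to the index of the hull layer that removes it."""
    remaining = set(points)
    layer = {}
    level = 0
    while remaining:
        level += 1
        hull = convex_hull(list(remaining))
        for p in hull:
            layer[p] = level
        remaining -= set(hull)
    return layer


outer = [(0, 0), (4, 0), (4, 4), (0, 4)]
inner = [(1, 1), (3, 1), (3, 3), (1, 3)]
layers = peeling_layers(outer + inner + [(2, 2)])
print(layers)
```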
Likelihood-based depth functions have also been considered. See Fraiman
and Meloche (1996), Fraiman, Liu and Meloche (1997) and Liu, Parelius and
Singh (1999). These, however, fail to satisfy in general any of P1–P4, and
their effectiveness appears to be confined primarily to models with ellipsoidal
densities, or to situations where sensitivity to multimodality is paramount.
For further discussion, see Remark A.1 in Appendix A.
The zonoid depth function of Koshevoy and Mosler (1997) has some nice
properties but can fail to satisfy “maximality at center” P2 for A- or
H-symmetric distributions, because it always attains its maximum value at the
expectation $E(X)$ for any random variable $X$ in $\mathbb{R}^d$. Also, the
sample zonoid depth function is not robust, as a single corrupted data point
can move the “center point of zonoid data depth” to infinity.
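The non-robustness claim can be illustrated numerically. Since the zonoid depth is maximized at the expectation, the deepest point of the sample version is the sample mean; the Python sketch below (hypothetical function names, toy data) replaces a single observation by an increasingly remote point and prints the sample mean next to the coordinatewise median. The coordinatewise median is used purely as a contrasting robust summary and is an assumption of this sketch, not one of the depth-based medians discussed in the text.

```python
import statistics


def sample_mean(data):
    """Sample mean of planar data, i.e. the deepest point for sample zonoid depth."""
    n = len(data)
    return (sum(p[0] for p in data) / n, sum(p[1] for p in data) / n)


def coordinatewise_median(data):
    """Median taken separately in each coordinate (contrast only)."""
    return (statistics.median([p[0] for p in data]),
            statistics.median([p[1] for p in data]))


clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print("clean", sample_mean(clean), coordinatewise_median(clean))

# Replace one observation by a point that moves off to infinity.
for magnitude in (1e3, 1e6, 1e9):
    corrupted = clean[:-1] + [(magnitude, magnitude)]
    print(magnitude, sample_mean(corrupted), coordinatewise_median(corrupted))
```

The mean follows the corrupted point without bound, while the median stays within the range of the uncorrupted observations.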
In conclusion, the halfspace and projection depth functions appear to repre-
sent very favorable choices. Both are implementations of the “projection pur-
suit” method, which utilizes all of the one-dimensional views of a dataset as a
foundation for data analysis, thus producing the advantage of great power at
extraction of information, although at the expense of a substantial
computational burden. Also, competitively, the L2 and Mahalanobis depth
functions appear to have strong potential for development.

APPENDIX A: SUPPLEMENTARY NOTES

Remark A.1. As pointed out and pictorially illustrated in Baggerly and
Scott (1999), the near convexity of the simplicial depth contours limits their
interpretability for multimodal data, whereas the likelihood depth contours
follow the multimodality structure. In the usual sense of “center-outward
ordering,” and from the common standpoint of desiring connectedness of
depth-trimmed regions, the likelihood “depth” has less of a role as a depth
function than as simply what it is by definition: a density function, which
keeps the information on multimodality structure when present.

Remark A.2. As broadenings of central symmetry, angular and halfspace
symmetry are opposite in character and purpose to several notions of
nonparametric multivariate symmetry introduced by Beran and Millar (1997),
which in fact are narrowings; see their formula (17). Also, their use of
halfspaces is essentially for the purpose of indexing the empirical measure,
rather than as a fundamental element in defining symmetry.
As shown in Zuo and Serfling (2000c), halfspace symmetry of P about θ
reduces to angular symmetry about θ except when P is discrete with positive
mass at θ. These exceptions are of practical relevance, since underlying
distributions for actually observed phenomena are invariably discrete (and
asymmetric), and it is reasonable to permit an approximating symmetric
distribution to have mass at the center of symmetry.

Remark A.3. An important aspect of any depth function is whether its
sample version converges to the population counterpart. In particular, we
desire that almost surely [P]

$$\sup_{x} |D_n(x) - D(x; P)| \to 0, \quad n \to \infty. \tag{A.1}$$

Besides carrying intrinsic interest, (A.1) plays a supporting role for other
purposes. For example, it underlies the convergence of sample depth contours to
their population counterparts, as in He and Wang (1997) especially for
elliptical models and in Zuo and Serfling (2000b) for more general models. In
Liu and Singh (1993), it is basic to the convergence of a certain “quality
index”, while in Liu, Parelius and Singh (1999) it supports various practical
methods such as “DD-plots.”
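A small simulation can make (A.1) tangible. The Python sketch below takes the halfspace depth in dimension one as the example, where the closed halfspaces are closed half-lines and the depth of x reduces to min{P(X ≤ x), P(X ≥ x)}; for the Uniform(0, 1) distribution this equals min(x, 1 − x) on [0, 1] and 0 elsewhere. The function names, the choice of distribution, the seed and the evaluation grid are all assumptions of this sketch, and the maximum over a finite grid only approximates the supremum in (A.1) from below.

```python
import bisect
import random


def halfspace_depth_1d(x, sorted_sample):
    """Sample halfspace depth on the line: min(#{X_i <= x}, #{X_i >= x}) / n."""
    n = len(sorted_sample)
    at_most = bisect.bisect_right(sorted_sample, x)       # #{X_i <= x}
    at_least = n - bisect.bisect_left(sorted_sample, x)   # #{X_i >= x}
    return min(at_most, at_least) / n


def uniform_depth(x):
    """Population halfspace depth of x under Uniform(0, 1)."""
    if x < 0.0 or x > 1.0:
        return 0.0
    return min(x, 1.0 - x)


def sup_error(n, rng, grid_size=2001):
    """Grid approximation of sup_x |D_n(x) - D(x; P)| for a sample of size n."""
    sample = sorted(rng.random() for _ in range(n))
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(halfspace_depth_1d(x, sample) - uniform_depth(x)) for x in grid)


rng = random.Random(0)
errors = {n: sup_error(n, rng) for n in (50, 500, 5000)}
for n, err in errors.items():
    print(n, round(err, 4))
```

In this one-dimensional case the error is bounded by the Kolmogorov distance between the empirical and true distribution functions, so the printed values should shrink roughly like n^{-1/2}.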
Results on (A.1) are now available for several cases of depth function.
Donoho and Gasko (1992) proved it for the sample halfspace depth,

$$HD_n(x) = \inf\{\hat{P}_n(H) : H \text{ a closed halfspace},\ x \in H\}, \quad x \in \mathbb{R}^d,$$

where $\hat{P}_n$ denotes the usual empirical measure, and Liu (1990), Dümbgen
(1990), and Arcones and Giné (1993) for the sample simplicial depth

$$SD_n(x) = \binom{n}{d+1}^{-1} \sum_{1 \le i_1 < \cdots < i_{d+1} \le n} I\bigl(x \in S[X_{i_1}, \ldots, X_{i_{d+1}}]\bigr), \quad x \in \mathbb{R}^d.$$

For the sample majority and Mahalanobis depths, under suitable conditions
on F, (A.1) is established by Liu and Singh (1993). For sample versions of
the “projection” depth function and the “Type D” depth functions introduced
above, (A.1) is established in Appendix B of Zuo and Serfling (2000b).
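As an illustration of the sample simplicial depth in the plane (d = 2), the Python sketch below counts the closed triangles formed by triples of data points that contain x and divides by the number of triples. The function names are hypothetical, the data are assumed to be in general position (no three collinear points), and the brute-force enumeration costs on the order of n^3 operations, so this is a sketch for small samples rather than a practical algorithm.

```python
from itertools import combinations


def _cross(a, b, p):
    """Cross product of the vectors a->b and a->p."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def in_closed_triangle(p, a, b, c):
    """True if p lies in the closed triangle abc (boundary included)."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def simplicial_depth_2d(x, data):
    """Fraction of data triangles S[X_i, X_j, X_k] that contain x."""
    triples = list(combinations(data, 3))
    hits = sum(1 for a, b, c in triples if in_closed_triangle(x, a, b, c))
    return hits / len(triples)


square = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
print(simplicial_depth_2d((2.0, 1.0), square))    # inside two of the four triangles
print(simplicial_depth_2d((10.0, 10.0), square))  # outside every triangle
```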

APPENDIX B: PROOFS

Proof of Theorem 2.1. Clearly, $HD(x; P)$ is bounded and nonnegative.
We need only check P1–P4.
(a) Affine invariance. Straightforward.
(b) Maximality at center. Suppose that P is H-symmetric about a unique
point $\theta \in \mathbb{R}^d$. By the definition of H-symmetry, we have
$P(H_\theta) \ge 1/2$ for any closed halfspace $H_\theta$ with
$\theta \in \partial H_\theta$. It follows that $HD(\theta; P) \ge 1/2$. Now
suppose that there is a point $x_0 \in \mathbb{R}^d$, $x_0 \ne \theta$, such
that $HD(x_0; P) > 1/2$. Then $P(H) > 1/2$ for any closed halfspace $H$ with
$x_0 \in \partial H$, which implies that P is also H-symmetric about $x_0$,
contradicting the assumption that P is H-symmetric about a unique point
$\theta \in \mathbb{R}^d$. Therefore,
$HD(\theta; P) = \sup_{x \in \mathbb{R}^d} HD(x; P)$.
(c) Monotonicity relative to deepest point. Suppose θ is a deepest point with
respect to the underlying distribution. To compare $HD(x; P)$ and
$HD(\theta + \alpha(x - \theta); P)$, we need only consider the infimum in the
definition of HD over all closed halfspaces which do not contain θ. For any
$H_{\theta + \alpha(x - \theta)}$ [closed halfspace with
$\theta + \alpha(x - \theta) \in \partial H$], by the separating hyperplane
theorem there always exists a closed halfspace $H_x$ such that
$H_x \subset H_{\theta + \alpha(x - \theta)}$. It follows that
$HD(x; P) \le HD(\theta + \alpha(x - \theta); P)$, $\forall \alpha \in [0, 1]$.
(d) Vanishing at infinity. It is easy to see that
$P(\|X\| \ge \|x\|) \to 0$ as $\|x\| \to \infty$ and that for each $x$ there
exists a closed halfspace $H_x$ such that
$H_x \subset \{\|X\| \ge \|x\|\}$. Thus $HD(x; P) \to 0$ as
$\|x\| \to \infty$, completing the proof. □

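Proof of Theorem 2.2. (a) Let θ be the center of an H-symmetric
distribution P and $x$ an arbitrary point in $\mathbb{R}^d$. Then, by the
definition of H-symmetry, for any random sample $X_1, \ldots, X_d$ from P we
have

$$x \in H^P_{X_1, \ldots, X_d} \ \Rightarrow\ \theta \in H^P_{X_1, \ldots, X_d},$$

and thus $MJD(\theta; P) = \sup_{x \in \mathbb{R}^d} MJD(x; P)$.
(b) Let $\lambda \in [0, 1]$ and $x_0 \equiv \lambda\theta + (1 - \lambda)x$. Then

$$\begin{aligned}
MJD(x_0; P) - MJD(x; P)
  &= P\bigl(x_0 \in H^P_{X_1, \ldots, X_d}\bigr) - P\bigl(x \in H^P_{X_1, \ldots, X_d}\bigr) \\
  &= P\bigl(x_0 \in H^P_{X_1, \ldots, X_d} \text{ and } x \notin H^P_{X_1, \ldots, X_d}\bigr) \\
  &\ge 0. \qquad \square
\end{aligned}$$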

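Proof of Theorem 2.3. By (i) and (ii) we have

$$\begin{aligned}
E\,h(x; X_1 - \theta, \ldots, X_r - \theta) &= E\,h(\theta + x; X_1, \ldots, X_r), \\
E\,h(x; \theta - X_1, \ldots, \theta - X_r) &= E\,h(\theta - x; X_1, \ldots, X_r).
\end{aligned}$$

Let $y$ be a point in the set in (iv). It follows that

$$y \in \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta + x; X_1, \ldots, X_r) \ \cap\ \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta - x; X_1, \ldots, X_r).$$

The convexity of $h(x; x_1, \ldots, x_r)$ in $x$ now yields

$$h(\theta; X_1, \ldots, X_r) \le \tfrac{1}{2}\,h(\theta + y; X_1, \ldots, X_r) + \tfrac{1}{2}\,h(\theta - y; X_1, \ldots, X_r).$$

It follows that

$$\begin{aligned}
E\,h(\theta; X_1, \ldots, X_r)
  &\le \tfrac{1}{2}\,E\,h(\theta + y; X_1, \ldots, X_r) + \tfrac{1}{2}\,E\,h(\theta - y; X_1, \ldots, X_r) \\
  &= \inf_{x \in \mathbb{R}^d} E\,h(\theta + x; X_1, \ldots, X_r) \\
  &= \inf_{x \in \mathbb{R}^d} E\,h(x; X_1, \ldots, X_r).
\end{aligned}$$

Hence $D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$, completing the proof. □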

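Proof of Theorem 2.4. Let θ in $\mathbb{R}^d$ be a deepest point with respect
to the underlying distribution F, that is,
$D(\theta; F) = \sup_{x \in \mathbb{R}^d} D(x; F)$. Let $x \ne \theta$ be an
arbitrary point in $\mathbb{R}^d$, let $\lambda \in [0, 1]$ and set
$x_0 \equiv \theta + \lambda(x - \theta)$. Then $D(x; F) \le D(\theta; F)$.
The convexity of $h(x; x_1, \ldots, x_r)$ in $x$ yields

$$h(x_0; X_1, \ldots, X_r) \le \lambda\,h(x; X_1, \ldots, X_r) + (1 - \lambda)\,h(\theta; X_1, \ldots, X_r).$$

Thus

$$\begin{aligned}
E\,h(x_0; X_1, \ldots, X_r)
  &\le \max\{E\,h(x; X_1, \ldots, X_r),\ E\,h(\theta; X_1, \ldots, X_r)\} \\
  &= E\,h(x; X_1, \ldots, X_r),
\end{aligned}$$

and hence $D(x_0; F) \ge D(x; F)$, completing the proof. □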
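The last two steps of this proof can be spelled out as follows. This is a sketch that assumes the Type B form $D(x; F) = (1 + E\,h(x; X_1, \ldots, X_r))^{-1}$, the form that (8) takes for the $L^2$ depth; under it, a larger depth corresponds to a smaller expected value of $h$, so $D(\theta; F) \ge D(x; F)$ is the same as $E\,h(\theta; \cdot) \le E\,h(x; \cdot)$.

```latex
\begin{align*}
E\,h(x_0; X_1,\ldots,X_r)
  &\le \lambda\,E\,h(x; X_1,\ldots,X_r) + (1-\lambda)\,E\,h(\theta; X_1,\ldots,X_r)
     && \text{(convexity, then expectation)} \\
  &\le \max\{E\,h(x; X_1,\ldots,X_r),\ E\,h(\theta; X_1,\ldots,X_r)\}
     && \text{(a convex combination is at most the maximum)} \\
  &= E\,h(x; X_1,\ldots,X_r)
     && \text{(since } E\,h(\theta;\cdot) \le E\,h(x;\cdot)\text{)}, \\[4pt]
D(x_0; F) = \bigl(1 + E\,h(x_0; X_1,\ldots,X_r)\bigr)^{-1}
  &\ge \bigl(1 + E\,h(x; X_1,\ldots,X_r)\bigr)^{-1} = D(x; F).
\end{align*}
```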

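Proof of Corollary 2.1. (a) By Theorem 2.4, to show P3 we check convexity
of $\Delta^{\alpha}(S[x, x_1, \ldots, x_d])$ in the argument $x$ for
$\alpha \in [1, \infty)$. Let $x, y$ be two points in $\mathbb{R}^d$, take
$\lambda \in [0, 1]$, and put $x_0 \equiv \lambda x + (1 - \lambda)y$. Then

$$\begin{aligned}
\Delta(S[x_0, x_1, \ldots, x_d])
  &= \frac{1}{d!} \left| \det \begin{pmatrix}
       1 & 1 & \cdots & 1 \\
       x_{01} & x_{11} & \cdots & x_{d1} \\
       \vdots & \vdots & & \vdots \\
       x_{0d} & x_{1d} & \cdots & x_{dd}
     \end{pmatrix} \right| \\
  &= \frac{1}{d!} \left| \det \begin{pmatrix}
       \lambda + (1 - \lambda) & 1 & \cdots & 1 \\
       \lambda\tilde{x}_1 + (1 - \lambda)\tilde{y}_1 & x_{11} & \cdots & x_{d1} \\
       \vdots & \vdots & & \vdots \\
       \lambda\tilde{x}_d + (1 - \lambda)\tilde{y}_d & x_{1d} & \cdots & x_{dd}
     \end{pmatrix} \right| \\
  &\le \lambda\,\Delta(S[x, x_1, \ldots, x_d]) + (1 - \lambda)\,\Delta(S[y, x_1, \ldots, x_d]),
\end{aligned}$$

where $x = (\tilde{x}_1, \ldots, \tilde{x}_d)$,
$y = (\tilde{y}_1, \ldots, \tilde{y}_d)$ and $x_i = (x_{i1}, \ldots, x_{id})$
for $0 \le i \le d$. Now the convexity of the function $x^{\alpha}$ for
$0 < x < \infty$ and $\alpha \ge 1$ yields

$$\Delta^{\alpha}(S[x_0, x_1, \ldots, x_d]) \le \lambda\,\Delta^{\alpha}(S[x, x_1, \ldots, x_d]) + (1 - \lambda)\,\Delta^{\alpha}(S[y, x_1, \ldots, x_d]).$$

(b) It is obvious that $\Delta^{\alpha}(S[x, x_1, \ldots, x_d]) \to \infty$ as
$\|x\| \to \infty$. Thus $SVD^{\alpha}(x; F) \to 0$ as $\|x\| \to \infty$,
completing the proof. □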
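The convexity inequality in part (a) can be checked numerically in the plane, where the simplex is a triangle and its volume is the triangle area. The Python sketch below is a sanity check, not a proof: the function names, sampling ranges, seed and numerical tolerance are assumptions of this sketch. It draws random points x, y, x1, x2 and a random λ, and counts how often the inequality for the α-th power of the area fails beyond rounding error.

```python
import random


def triangle_area(x, x1, x2):
    """Area of the triangle with vertices x, x1, x2 (half the absolute determinant)."""
    det = (x1[0] - x[0]) * (x2[1] - x[1]) - (x1[1] - x[1]) * (x2[0] - x[0])
    return abs(det) / 2.0


def count_convexity_violations(alpha, trials, rng):
    """Count random cases where area(x0)^alpha exceeds the convex combination."""
    violations = 0
    for _ in range(trials):
        x, y, x1, x2 = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(4)]
        lam = rng.random()
        x0 = (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])
        lhs = triangle_area(x0, x1, x2) ** alpha
        rhs = (lam * triangle_area(x, x1, x2) ** alpha
               + (1 - lam) * triangle_area(y, x1, x2) ** alpha)
        if lhs > rhs + 1e-9 * (1.0 + rhs):  # small tolerance for rounding error
            violations += 1
    return violations


rng = random.Random(1)
for alpha in (1.0, 1.5, 2.0):
    print(alpha, count_convexity_violations(alpha, 10000, rng))
```

No violations are expected for α ≥ 1, because the determinant is an affine function of x, its absolute value is therefore convex, and raising a nonnegative convex function to a power α ≥ 1 preserves convexity.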

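Proof of Theorem 2.6. Since $L^2D(x; F)$ defined in (8) is affine invariant,
and P4 is evident, we check P2 and P3.
(a) We first show that $\|\cdot\|_M$ is convex for any positive definite
$d \times d$ matrix $M$. Since $M$ is positive definite, there is a
nonsingular matrix $S$ such that $M = S'S$. Let $x, y$ be two points in
$\mathbb{R}^d$ and $\lambda \in [0, 1]$. Then

$$\begin{aligned}
\|\lambda x + (1 - \lambda)y\|_M^2
  &= (\lambda x + (1 - \lambda)y)'\,M\,(\lambda x + (1 - \lambda)y) \\
  &= \lambda^2 x'Mx + 2\lambda(1 - \lambda)\,x'My + (1 - \lambda)^2 y'My \\
  &= \lambda^2 x'Mx + 2\lambda(1 - \lambda)\,(Sx)'(Sy) + (1 - \lambda)^2 y'My.
\end{aligned}$$

The Schwarz inequality implies that

$$\begin{aligned}
\|\lambda x + (1 - \lambda)y\|_M^2
  &\le \lambda^2 x'Mx + 2\lambda(1 - \lambda)\,\|Sx\|\,\|Sy\| + (1 - \lambda)^2 y'My \\
  &= \lambda^2 \|x\|_M^2 + 2\lambda(1 - \lambda)\,\|x\|_M \|y\|_M + (1 - \lambda)^2 \|y\|_M^2 \\
  &= \bigl(\lambda \|x\|_M + (1 - \lambda) \|y\|_M\bigr)^2.
\end{aligned}$$

It follows that

$$\|\lambda x + (1 - \lambda)y\|_M \le \lambda \|x\|_M + (1 - \lambda) \|y\|_M.$$

(b) Now we show that there is a point $y \in \mathbb{R}^d$ satisfying
condition (4) of Theorem 2.3. Equivalently, we need to show that

$$\theta \in \arg\inf_{x \in \mathbb{R}^d} E\,\|x - X\|_{\Sigma^{-1}}, \tag{B.1}$$

where $\Sigma$ is the covariance matrix of F.

We first show that

$$E\left[\frac{\theta - X}{\|X - \theta\|_{\Sigma^{-1}}}\right] = 0. \tag{$*$}$$

Since F is angularly symmetric about θ, it can be shown [see Zuo and Serfling
(2000c)] that $P(X \in H_\theta) = P(X \in -H_\theta)$ for any closed
halfspace $H_\theta$ with θ on the boundary, where $-H_\theta$ is the
reflection of $H_\theta$ about θ. Since $\Sigma^{-1}$ is positive definite,
there is a nonsingular matrix $R$ such that $\Sigma^{-1} = R'R$. Thus

$$P(RX \in RH_\theta) = P(RX \in -RH_\theta)$$

for any closed halfspace $H_\theta$ with θ on the boundary. By nonsingularity
and results established in Zuo and Serfling (2000c), we conclude that $RX$ is
angularly symmetric about $R\theta$. Hence

$$\frac{R(X - \theta)}{\|R(X - \theta)\|} \stackrel{d}{=} \frac{R(\theta - X)}{\|R(\theta - X)\|},$$

which is equivalent to

$$\frac{R(X - \theta)}{\|X - \theta\|_{\Sigma^{-1}}} \stackrel{d}{=} \frac{R(\theta - X)}{\|\theta - X\|_{\Sigma^{-1}}}.$$

This implies ($*$).
Now we show that (B.1) holds true. Consider the derivative of
$E\,\|\mu - X\|_{\Sigma^{-1}}$ with respect to $\mu \in \mathbb{R}^d$. By
vector differentiation, we have

$$\begin{aligned}
\frac{d\,E\,\|\mu - X\|_{\Sigma^{-1}}}{d\mu}
  &= \frac{d}{d\mu} \int_{\mathbb{R}^d} \|\mu - x\|_{\Sigma^{-1}}\,dF(x) \\
  &= \int_{\mathbb{R}^d} \frac{d\,\|\mu - x\|_{\Sigma^{-1}}}{d\mu}\,dF(x) \\
  &= \int_{\mathbb{R}^d} \frac{\Sigma^{-1}(\mu - x)}{\|\mu - x\|_{\Sigma^{-1}}}\,dF(x) \\
  &= \Sigma^{-1}\,E\left[\frac{\mu - X}{\|\mu - X\|_{\Sigma^{-1}}}\right].
\end{aligned}$$

Then by convexity and ($*$) we conclude that (B.1) holds.
The result now follows from Theorems 2.3 and 2.4. □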
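A numerical illustration of the conclusion, for the sample version of (8) in the plane, is sketched below in Python. Everything here is an assumption of the sketch rather than part of the proof: the function name is hypothetical, $\Sigma$ is replaced by the sample covariance matrix, and the sample is made centrally symmetric about the origin by appending the reflection of every simulated point (central symmetry being a special case of angular symmetry). For such a sample the expected distance is an even, convex function of x, so the depth should be largest at the origin.

```python
import math
import random


def l2_depth_2d(x, data):
    """Sample L2 depth 1 / (1 + mean ||x - X_i||), norm induced by S^{-1}."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    det = sxx * syy - sxy * sxy  # assumed nonzero (nonsingular covariance)
    total = 0.0
    for p in data:
        dx, dy = x[0] - p[0], x[1] - p[1]
        # Quadratic form (x - p)' S^{-1} (x - p); clipped at 0 against rounding.
        q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        total += math.sqrt(max(q, 0.0))
    return 1.0 / (1.0 + total / n)


rng = random.Random(2)
half = [(rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0)) for _ in range(100)]
sample = half + [(-a, -b) for a, b in half]  # symmetric about the origin

center_depth = l2_depth_2d((0.0, 0.0), sample)
others = [(0.5, 0.2), (-1.0, 2.0), (3.0, 3.0)]
print(center_depth, [l2_depth_2d(pt, sample) for pt in others])
```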

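Proof of Theorem 2.9. Since $PD(x; F)$ is nonnegative and bounded, we
need only check P1–P4.
(a) Affine invariance. Straightforward.
(b) Maximality at center. Suppose that F is H-symmetric about a unique
point $\theta \in \mathbb{R}^d$. Then [see Zuo and Serfling (2000c)] we have
$\mathrm{Med}(u'X) = u'\theta$ for any unit vector $u \in \mathbb{R}^d$ and it
follows that $PD(\theta; F) = \sup_{x \in \mathbb{R}^d} PD(x; F)$.
(c) Monotonicity relative to deepest point. We show that $O(x; X)$ is convex
in its first argument. Let θ and $x$ be two arbitrary points in
$\mathbb{R}^d$, $0 < \alpha < 1$, and put
$x_0 \equiv (1 - \alpha)\theta + \alpha x$. Then we have

$$\begin{aligned}
|u'x_0 - \mathrm{Med}(u'X)|
  &= |u'((1 - \alpha)\theta + \alpha x) - \mathrm{Med}(u'X)| \\
  &= |(1 - \alpha)(u'\theta - \mathrm{Med}(u'X)) + \alpha(u'x - \mathrm{Med}(u'X))| \\
  &\le (1 - \alpha)\,|u'\theta - \mathrm{Med}(u'X)| + \alpha\,|u'x - \mathrm{Med}(u'X)|.
\end{aligned}$$

It follows that

$$\begin{aligned}
O(x_0; X)
  &= \sup_{\|u\| = 1} \frac{|u'x_0 - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)} \\
  &\le \sup_{\|u\| = 1} \frac{(1 - \alpha)\,|u'\theta - \mathrm{Med}(u'X)| + \alpha\,|u'x - \mathrm{Med}(u'X)|}{\mathrm{MAD}(u'X)} \\
  &\le (1 - \alpha)\,O(\theta; F) + \alpha\,O(x; F).
\end{aligned}$$

“Monotonicity” now follows from Theorem 2.8.
(d) Vanishing at infinity. Straightforward. □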

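Proof of Theorem 2.11. (i) We first show that

$$\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} = \cap\{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}. \tag{$*$}$$

(a) If $x \in \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\}$ and
there exists a $C \in \mathcal{C}$ such that $P(C) > 1 - \alpha$,
$x \notin C$, then $x \in C^c$, $P(C^c) < \alpha$. By C1 and C2, there is a
$C_1 \in \mathcal{C}$ such that $x \in \partial C_1$, $C_1 \subset C^c$. It
follows that $P(C_1) < \alpha$ and hence $D(x; P, \mathcal{C}) < \alpha$,
which is a contradiction to the assumption that
$x \in \{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\}$. This implies

$$\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} \subset \cap\{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}.$$

(b) If $x \in \cap\{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}$, and there is
a $C \in \mathcal{C}$ such that $x \in C$, $P(C) < \alpha$, then by the
condition given, there exists a $C_1 \in \mathcal{C}$ such that
$x \in C_1^{\circ}$, $P(C_1) < \alpha$ and thus $x \notin C_1^c$,
$P(C_1^c) > 1 - \alpha$, which contradicts the assumption that
$x \in \cap\{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}$. This implies

$$\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} \supset \cap\{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}.$$

Now (a) and (b) yield ($*$), which implies that $D^{\alpha}$ is closed, and
thus $D(x; P, \mathcal{C})$ is upper semicontinuous.
(ii) The nestedness of $D^{\alpha}$ is trivial. The boundedness of
$D^{\alpha}$ follows from the fact that $D(x; P, \mathcal{C}) \to 0$ as
$\|x\| \to \infty$. The compactness of $D^{\alpha}$ now follows from its being
bounded and closed.
(iii) The convexity follows from ($*$), since the intersection of convex sets
is convex. □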
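The structure of the depth-trimmed regions can be seen directly in the simplest Type D case, the sample halfspace depth on the line. The Python sketch below is an illustration with hypothetical function names and toy data; levels are expressed as counts k (depth at least k/n) to avoid floating-point issues. A point x has depth at least k/n exactly when at least k observations lie on each side of it (ties included), so the region is the closed interval between the k-th smallest and the k-th largest observation, or empty once these cross.

```python
def halfspace_depth_count(x, sorted_sample):
    """n times the sample halfspace depth of x on the line."""
    at_most = sum(1 for v in sorted_sample if v <= x)
    at_least = sum(1 for v in sorted_sample if v >= x)
    return min(at_most, at_least)


def trimmed_interval(sorted_sample, k):
    """Region {x : HD_n(x) >= k/n} as a closed interval (lo, hi), or None if empty."""
    n = len(sorted_sample)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    lo, hi = sorted_sample[k - 1], sorted_sample[n - k]
    return (lo, hi) if lo <= hi else None


sample = sorted([2.3, -1.0, 0.4, 5.1, 3.3, 0.9, 1.7])
regions = [trimmed_interval(sample, k) for k in range(1, len(sample) + 1)]
for k, region in enumerate(regions, start=1):
    print(k, region)
```

Each nonempty region is a closed, bounded interval (hence compact and convex), and the regions shrink as k grows, matching the closedness, compactness, convexity and nestedness established above.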

Acknowledgments. The authors greatly appreciate the thoughtful and
constructive remarks of an Associate Editor and two referees, which led to
distinctive improvements in the paper.

REFERENCES
Arcones, M. A. and Giné, E. (1993). Limit theorems for U-processes. Ann. Probab. 21 1494–1542.
Baggerly, K. A. and Scott, D. W. (1999). Comment on “Multivariate analysis by data depth:
Descriptive statistics, graphics and inference,” by R. Y. Liu, J. M. Parelius and K. Singh.
Ann. Statist. 27 843–844.
Bartoszyński, R., Pearl, D. K. and Lawrence, J. (1997). A multidimensional goodness-of-fit test
based on interpoint distances. J. Amer. Statist. Assoc. 92 577–586.
Beran, R. J. and Millar, P. W. (1997). Multivariate symmetry models. In Festschrift for Lucien
Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgerson and
G. L. Yang, eds.) 13–42. Springer, Berlin.
Caplin, A. and Nalebuff, B. (1988). On 64%-majority rule. Econometrica 56 787–814.
Caplin, A. and Nalebuff, B. (1991a). Aggregation and social choice: A mean voter theorem.
Econometrica 59 1–23.
Caplin, A. and Nalebuff, B. (1991b). Aggregation and imperfect competition: On the existence
of equilibrium. Econometrica 59 25–59.
Carrizosa, E. (1996). A characterization of halfspace depth. J. Multivariate Anal. 58 21–26.
Chamberlin, E. (1937). The Theory of Monopolistic Competition. Harvard Univ. Press.
Chen, Z. (1995). Bounds for the breakdown point of the simplicial median. J. Multivariate Anal.
55 1–13.
Donoho, D. L. (1982). Breakdown properties of multivariate location estimators. Ph. D. qualifying
paper, Dept. Statistics, Harvard Univ.
Donoho, D. L. and Gasko, M. (1992). Breakdown properties of location estimates based on half-
space depth and projected outlyingness. Ann. Statist. 20 1803–1827.
Dümbgen, L. (1990). Limit theorems for the empirical simplicial depth. Statist. Probab. Lett. 14
119–128.
Eddy, W. F. (1985). Ordering of multivariate data. In Computer Science and Statistics: The In-
terface (L. Billard, ed.) 25–30. North-Holland, Amsterdam.
Fraiman, R. and Meloche, J. (1996). Multivariate L-estimation. Preprint.
Fraiman, R., Liu, R. Y. and Meloche, J. (1997). Multivariate density estimation by probing
depth. In L1 -Statistical Procedures and Related Topics (Y. Dodge, ed.) 415–430. IMS,
Hayward, CA.
He, X. and Wang, G. (1997). Convergence of depth contours for multivariate datasets. Ann.
Statist. 25 495–504.
Hotelling, H. (1929). Stability in competition. Econom. J. 39 41–57.
Koshevoy, G. and Mosler, K. (1997). Zonoid trimming for multivariate distributions. Ann.
Statist. 25 1998–2017.
Liu, R. Y. (1990). On a notion of data depth based on random simplices. Ann. Statist. 18 405–414.
Liu, R. Y. (1992). Data depth and multivariate rank tests. In L1 -Statistics and Related Methods
(Y. Dodge, ed.) 279–294. North-Holland, Amsterdam.
Liu, R. Y., Parelius, J. M. and Singh, K. (1999). Multivariate analysis by data depth: Descriptive
statistics, graphics and inference (with discussion). Ann. Statist. 27 783–858.
Liu, R. Y. and Singh, K. (1993). A quality index based on data depth and multivariate rank tests.
J. Amer. Statist. Assoc. 88 252–260.
Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Acad. Sci. India
12 49–55.
Massé, J. C. and Theodorescu, R. (1994). Halfplane trimming for bivariate distributions. J.
Multivariate Anal. 48 188–202.
Mizera, I. (1998). On depth and deep points: a calculus. Preprint.
Mosteller, C. F. and Tukey, J. W. (1977). Data Analysis and Regression. Addison-Wesley, Read-
ing, MA.
Niinimaa, A., Oja, H. and Tableman, M. (1990). On the finite sample breakdown point of the Oja
bivariate median and of the corresponding half-samples version. Statist. Probab. Lett.
10 325–328.
Nolan, D. (1992). Asymptotics for multivariate trimming. Stochastic Process. Appl. 42 157–169.
Oja, H. (1983). Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 327–
333.
Rao, C. R. (1988). Methodology based on the L1 norm in statistical inference. Sankhyā Ser. A 50
289–313.
Rousseeuw, P. J. and Hubert, M. (1999). Regression depth (with discussion). J. Amer. Statist.
Assoc. 94 388–433.
Rousseeuw, P. J. and Ruts, I. (1996). Bivariate location depth. J. Roy. Statist. Soc. Ser. C 45
516–526.
Rousseeuw, P. J. and Struyf, A. (1998). Computing location depth and regression depth in
higher dimensions. Statist. Comput. 8 193–203.
Ruts, I. and Rousseeuw, P. J. (1996). Computing depth contours of bivariate point clouds. Com-
put. Statist. Data Anal. 23 153–168.
Serfling, R. (1980). Approximation Theorems of Mathematical Statistics. Wiley, New York.
Singh, K. (1991). A notion of majority depth. Preprint.
Small, C. G. (1987). Measures of centrality for multivariate and directional distributions. Canad.
J. Statist. 15 31–39.
Small, C. G. (1990). A survey of multidimensional medians. Internat. Statist. Inst. Rev. 58 263–
277.
Stahel, W. A. (1981). Robust estimation: infinitesimal optimality and covariance matrix estima-
tors. Ph. D thesis, ETH, Zurich (in German).
Tukey, J. W. (1975). Mathematics and picturing data. In Proceedings of the International
Congress on Mathematics (R. D. James, ed.) 2 523–531 Canadian Math. Congress.
Tyler, D. E. (1994). Finite sample breakdown points of projection based multivariate location
and scatter statistics. Ann. Statist. 22 1024–1044.
Vardi, Y. and Zhang, C.-H. (1999). The multivariate L1 -median and associated data depth.
Preprint.
Yeh, A. B. and Singh, K. (1997). Balanced confidence regions based on Tukey’s depth and the
bootstrap. J. Roy. Statist. Soc. Ser. B 59 639–652.
Zuo, Y. (1999). Affine equivariant multivariate location estimates with best possible breakdown
points. Preprint.
Zuo, Y. and Serfling, R. (2000a). Nonparametric notions of multivariate “scatter measure” and
“more scattered” based on statistical depth functions. J. Multivariate Anal. To appear.
Zuo, Y. and Serfling, R. (2000b). Structural properties and convergence results for contours of
sample statistical depth functions. Ann. Statist. 28 483–499.
Zuo, Y. and Serfling, R. (2000c). On the performance of some robust nonparametric location
measures relative to a general notion of multivariate symmetry. J. Statist. Plann. In-
ference 84 55–79.

Department of Mathematics
Arizona State University
Tempe, Arizona 85287-1804
E-mail: [email protected]

Department of Mathematical Sciences
University of Texas at Dallas
Richardson, Texas 75083-0688
E-mail: serfl[email protected]

Received December 1998; revised December 1999.

afﬁne invariance, maximality at center, monotonicity relative to deepest point,

2. General notions of statistical depth. Here we consider general no-

2.1. Desirable properties and a general deﬁnition. We conﬁne attention

P4. Vanishing at inﬁnity. The depth of a point x should approach zero as x

Definition 2.1. Let the mapping D· ·  Rd × F → R1 be bounded, non-

A sample version of Dx P, denoted by Dn x ≡ Dx P , may be deﬁned

2.2. A further look at the halfspace and simplicial depth functions. We

Theorem 2.1. The halfspace depth function HDx P is a statistical depth

Remark 2.1. For continuous angularly symmetric distributions, it follows

Counterexample 1. Let d = 1 and PX = 0 = 1/5, PX = ±1 = 1/5,

Counterexample 2. Let d = 2 and PX = ±1 0 = PX = ±2 0 =

Counterexample 3. Let d = 2 and PX = θ = 0 0 = 19/40, PX =

Example 2.1. [Majority depth (Singh, 1991)] For given points x1      xd

Theorem 2.2. For H-symmetric distributions P, MJDx P satisﬁes P2

The majority depth fails to satisfy property P4, however. As a counterex-

2.3.2. Type B depth functions. Let hx x1      xr be an unbounded non-

Remark 2.2. Oja (1983) introduced for C-symmetric distributions a family

which is not equal to E x − X p for every nonsingular matrix A. On the other

(8) L2 Dx F ≡ 1 + E x−X −1 −1 

Under some conditions on hx x1      xr , Type B depth functions neces-

Theorem 2.3. Suppose θ is the point of symmetry of a distribution F with

arg inf Ehx X1 − θ     Xr − θ ∩ arg inf Ehx θ − X1      θ − Xr

Remark 2.3. For any distribution C-symmetric about a point θ in Rd , there

Theorem 2.4. If hx x1      xr is convex in x, then the corresponding

Corollary 2.1. For α ≥ 1, SVDα x F satisﬁes P3 and P4.

Since α S x x1      xd is convex and rigid-body invariant, according to

Corollary 2.2. For C-symmetric distributions and α ≥ 1, SVDα x F sat-

Theorem 2.5. For C-symmetric distributions and α ≥ 1, SVDα x F is a

Corollary 2.3. For p ≥ 1, Lp Dx F satisﬁes P3 and P4.

Since hx x1 is location invariant and even, that is, hx + b x1 + b =

Corollary 2.4. For C-symmetric distributions and for p ≥ 1, Lp Dx F

For L2 Dx F we have:

Theorem 2.6. For any distribution F A-symmetric about a unique point

2.3.3. Type C depth functions. Let Ox F be a measure of the outlying-

Example 2.4. Projection depth. Deﬁne the outlyingness of a point x to be

Remark 2.6. For one-dimensional datasets X = X1      Xn ,

On x ≡ x − Med1≤i≤n Xi / MAD1≤i≤n Xi 

has long been used as a robust measure of outlyingness of x ∈ R with respect

MAD1≤i≤n Xi  = Med1≤i≤n Xi − Med1≤j≤n Xj 

(11) PDn x = 1 + On x−1 

Example 2.5 (Mahalanobis depth). Mahalanobis (1936) introduced a dis-

d2M x y = x − y M−1 x − y

Based on this Mahalanobis distance, one can define a Mahalanobis depth as

Theorem 2.7. Suppose $\theta$ in $\mathbb{R}^d$ is the point of symmetry of a distribution $F$

$y \in \arg\inf_{x \in \mathbb{R}^d} O(x; X - \theta) \cap \arg\inf_{x \in \mathbb{R}^d} O(x; \theta - X)$

Theorem 2.8. If $O(x; F)$ is convex in the argument $x$, then the correspond-

Theorem 2.9. The projection depth function $PD(x; F)$ is a statistical depth

A location measure $\mu$ is affine equivariant if $\mu(AX + b) = A\mu(X) + b$ for

Theorem 2.10. Let F be symmetric. Then the Mahalanobis depth function

The proof is straightforward.

Thus the $\mathcal{C}$-depth of a point $x$ with respect to a probability measure $P$ on $\mathbb{R}^d$

Remark 2.7. If C2 is replaced by

the above theorem remains true.

3. Concluding remarks. Here we examine and compare a number of

estimation of multidimensional location. These are the halfspace depth (Type

APPENDIX A: SUPPLEMENTARY NOTES

Remark A.1. As pointed out and pictorially illustrated in Baggerly and

Remark A.2. As broadenings of central symmetry, angular and halfspace

Remark A.3. An important aspect of any depth function is whether its

sire that almost surely [P]

Proof of Theorem 2.1. Clearly, $HD(x; P)$ is bounded and nonnegative.

(d) Vanishing at infinity. It is easy to see that $P(\|X\| \ge \|x\|) \to 0$ as $\|x\| \to$

Proof of Theorem 2.2. (a) Let θ be the center of an H-symmetric dis-

Proof of Theorem 2.3. By (i) and (ii) we have

Let y be a point in the set in (iv). It follows that

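$y \in \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta + x; X_1, \dots, X_r) \cap \arg\inf_{x \in \mathbb{R}^d} E\,h(\theta - x; X_1, \dots, X_r)$

The convexity of $h(x; x_1, \dots, x_r)$ in $x$ now yields

$h(\theta; X_1, \dots, X_r) \le \tfrac{1}{2}\, h(\theta + y; X_1, \dots, X_r) + \tfrac{1}{2}\, h(\theta - y; X_1, \dots, X_r)$

$E\,h(\theta; X_1, \dots, X_r) \le \tfrac{1}{2}\, E\,h(\theta + y; X_1, \dots, X_r) + \tfrac{1}{2}\, E\,h(\theta - y; X_1, \dots, X_r)$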

$HD(x; P) = \inf_H \{P(H) : H \text{ a closed halfspace},\ x \in H\}, \quad x \in \mathbb{R}^d$
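For intuition, the definition can be evaluated directly in the univariate case with $P$ replaced by the empirical measure of a sample. There the closed halfspaces containing $x$ are half-lines, and the infimum is attained by $(-\infty, x]$ or $[x, \infty)$. The following Python sketch makes this assumption explicit; the function name `halfspace_depth_1d` is hypothetical.

```python
def halfspace_depth_1d(x, data):
    """Empirical halfspace depth of x on the real line.

    Assumes d = 1, so the depth is the smaller of the empirical masses
    of the two closed half-lines (-inf, x] and [x, +inf).
    """
    n = len(data)
    if n == 0:
        raise ValueError("data must be non-empty")
    left = sum(1 for v in data if v <= x)   # observations in (-inf, x]
    right = sum(1 for v in data if v >= x)  # observations in [x, +inf)
    return min(left, right) / n


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
for point in (0.0, 1.0, 3.0):
    print(point, halfspace_depth_1d(point, sample))
```

In this sketch the depth is largest at the sample median and drops to zero outside the range of the data, in line with the maximality and vanishing properties discussed in the text.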

Definition 2.1. Let the mapping $D(\cdot\,; \cdot) : \mathbb{R}^d \times \mathcal{F} \to \mathbb{R}^1$ be bounded, non-

$h(x_0; X_1, \dots, X_r) \le \lambda\, h(x; X_1, \dots, X_r) + (1 - \lambda)\, h(\theta; X_1, \dots, X_r)$

where $\Sigma$ is the covariance matrix of $F$.

$(1 - \alpha)\,|u'\theta - \mathrm{Med}(u'X)| + \alpha\,|u'x - \mathrm{Med}(u'X)|$

$\le (1 - \alpha)\, O(\theta; F) + \alpha\, O(x; F)$

(∗) $\{x \in \mathbb{R}^d : D(x; P, \mathcal{C}) \ge \alpha\} = \cap \{C : P(C) > 1 - \alpha,\ C \in \mathcal{C}\}$