

IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 3, NO. 1, FEBRUARY 1995

Fuzzy and Possibilistic Shell Clustering Algorithms and Their Application to Boundary Detection and Surface Approximation-Part I

Raghu Krishnapuram, Member, IEEE, Hichem Frigui, and Olfa Nasraoui

Abstract-Traditionally, prototype-based fuzzy clustering algorithms such as the Fuzzy C Means (FCM) algorithm have been used to find "compact" or "filled" clusters. Recently, there have been attempts to generalize such algorithms to the case of hollow or "shell-like" clusters, i.e., clusters that lie in subspaces of feature space. The shell clustering approach provides a powerful means to solve the hitherto unsolved problem of simultaneously fitting multiple curves/surfaces to unsegmented, scattered, and sparse data. In this paper, we present several fuzzy and possibilistic algorithms to detect linear and quadric shell clusters. We also introduce generalizations of these algorithms in which the prototypes represent sets of higher-order polynomial functions. The suggested algorithms provide a good trade-off between computational complexity and performance. Since the objective function used in these algorithms is the sum of squared distances, the clustering is sensitive to noise and outliers. We show that by using a possibilistic approach to clustering, one can make the proposed algorithms robust.

I. INTRODUCTION

CLUSTERING methods have been used extensively in pattern recognition and computer vision [27]. Objective function based clustering methods are one particular class of clustering methods in which a criterion function is iteratively minimized until a global or local minimum is reached. Objective function based clustering can be either hard (crisp) or fuzzy, depending on whether each feature vector belongs exclusively to one cluster or to all clusters to different degrees. In general, the performance of fuzzy algorithms is superior to that of the corresponding hard versions, and they have a lower tendency to get stuck in local minima [4].

In objective function based clustering algorithms, each cluster is usually represented by a prototype, and the sum of distances from the feature points to the prototypes is used as the objective function. This method has been traditionally used to detect "compact" or "filled" clusters in feature spaces, whose prototypes are typically represented by cluster centers and cluster covariance matrices. The Fuzzy C Means (FCM) algorithm [4] and its derivatives [12], [20], [23] may be used to find clusters that resemble filled hyperspheres or filled hyperellipsoids. Lately this approach has been extended to the case of hollow or shell-like clusters by using shells (manifolds) for prototypes and measuring the distances to the shells rather than to the cluster centers. Coray seems to have been the first to suggest the use of this idea to find circular clusters [10]. More recently, Dave's Fuzzy C Shells (FCS) algorithm [13] and the Adaptive Fuzzy C-Shells (AFCS) algorithm [16] have proven to be successful in detecting circular and elliptical shapes. These algorithms are computationally rather intensive, since one needs to solve coupled nonlinear equations to update the shell parameters in every iteration [5]. They also assume that the number of clusters is known. A computationally simpler Fuzzy C Spherical Shells algorithm for clustering hyperspherical shells and an unsupervised version to be used when the number of clusters is unknown have also been introduced [36]. Extensions to more general quadric shapes have also been proposed [16], [31], [32]. One problem with the proposed extensions is that they use a highly nonlinear algebraic distance, which results in unsatisfactory performance when the data are scattered [16], [32]. Finally, none of the above shell clustering algorithms can deal with situations in which the clusters include lines/planes and there is much noise. In this paper, we address these drawbacks in more detail and present new fuzzy and possibilistic algorithms to overcome them.

The algorithms proposed in this paper can be used for simultaneously fitting a given number of parameterized curves/surfaces to an unsegmented data set. They are particularly useful for boundary detection and surface approximation in computer vision, especially when the edges are jagged or when the range data is sparse and noisy. In Section X, we qualitatively compare the shell clustering approach with the more traditional generalized Hough transform approach for boundary detection. In Part II of this paper, we discuss problems associated with conventional boundary detection and range image segmentation methods in more detail and present unsupervised shell clustering algorithms that use a new cluster validity measure to overcome these problems.

In Section II, we briefly describe prototype-based fuzzy clustering. In Sections III-VII, we introduce several fuzzy shell clustering algorithms. Although the algorithms proposed in these sections are specifically designed to seek clusters that can be described by segments of second-degree curves (or by segments of shells of hyperquadrics), they can be generalized easily to deal with shells of more complex types. In Section VIII, we present one such generalization in which the prototypes correspond to sets of higher-order polynomial functions. In Section IX, we describe a possibilistic approach to clustering, which has the advantage that the partition and the prototype estimates are much less sensitive to noise when compared with the fuzzy approach.

Manuscript received February 3, 1993; revised April 25, 1994. This work was supported in part by the Alexander von Humboldt Foundation, Germany.
The authors are with the Department of Electrical and Computer Engineering, University of Missouri-Columbia, Columbia, MO 65211 USA.
IEEE Log Number 9406652.
1063-6706/95$04.00 © 1995 IEEE

II. PROTOTYPE-BASED FUZZY CLUSTERING

Let $X = \{x_j \mid j = 1, \ldots, N\}$ be a set of feature vectors in an $n$-dimensional feature space with coordinate-axis labels $[x_1, x_2, \ldots, x_n]$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T$. Let $B = (\beta_1, \ldots, \beta_C)$ represent a $C$-tuple of prototypes, each of which characterizes one of the $C$ clusters. Each $\beta_i$ consists of a set of parameters. In the following, we use $\beta_i$ to denote both cluster $i$ and its prototype. Let $u_{ij}$ represent the grade of membership of feature point $x_j$ in cluster $\beta_i$. The $C \times N$ matrix $U = [u_{ij}]$ is called a constrained fuzzy $C$-partition matrix if it satisfies the following conditions [4], [25]:

$$u_{ij} \in [0,1] \text{ for all } i, j, \quad 0 < \sum_{j=1}^{N} u_{ij} < N \text{ for all } i, \quad \text{and} \quad \sum_{i=1}^{C} u_{ij} = 1 \text{ for all } j. \quad (1)$$

The problem of fuzzily partitioning the feature vectors into $C$ clusters can be formulated as the minimization of an objective function $J(B, U; X)$ of the form

$$J(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^m \, d^2(x_j, \beta_i). \quad (2)$$

In the above equation, $m \in [1, \infty)$ is a weighting exponent called the fuzzifier, and $d^2(x_j, \beta_i)$ represents the distance from a feature point $x_j$ to the prototype $\beta_i$. Minimization of the objective function with respect to $U$ subject to the constraints in (1) gives us [4]

$$u_{ij} = \left[ \sum_{k=1}^{C} \left( \frac{d^2(x_j, \beta_i)}{d^2(x_j, \beta_k)} \right)^{1/(m-1)} \right]^{-1} \quad \text{if } I_j = \emptyset, \quad (3)$$

and otherwise $u_{ij} = 0$ for $i \notin I_j$, with the memberships $u_{ij}, i \in I_j$, chosen arbitrarily such that $\sum_{i \in I_j} u_{ij} = 1$, where $I_j = \{i \mid 1 \le i \le C, \; d^2(x_j, \beta_i) = 0\}$. Minimization of $J(B, U; X)$ with respect to $B$ varies according to the choice of the prototypes and the distance measure. For example, in the FCM algorithm, the clusters are usually assumed to be compact and spherical in shape, and each of the prototypes is described by the cluster center $c_i$. If the distance measure is Euclidean or an inner product induced norm metric, these centers may be updated in each iteration using [4]

$$c_i = \frac{1}{N_i} \sum_{j=1}^{N} (u_{ij})^m x_j \quad (4)$$

where

$$N_i = \sum_{j=1}^{N} (u_{ij})^m. \quad (5)$$

In the Gustafson-Kessel (G-K) algorithm, the prototypes consist of the cluster centers $c_i$ and the covariance matrices $C_i$ [23]. It uses the distance measure $d^2(x_j, c_i) = |C_i|^{1/n} (x_j - c_i)^T C_i^{-1} (x_j - c_i)$. The centers are updated as above, and the covariance matrices are updated by

$$C_i = \frac{\sum_{j=1}^{N} (u_{ij})^m (x_j - c_i)(x_j - c_i)^T}{\sum_{j=1}^{N} (u_{ij})^m}.$$

The G-K algorithm and the unsupervised fuzzy partition-optimum number of clusters algorithm due to Gath and Geva [20] assume that the clusters are compact ellipsoids and allow each cluster to have a different size and orientation. They can also be used to detect linear clusters in 2-D and planar clusters in 3-D, since these are extreme cases of ellipsoids [30].

The general form of prototype-based clustering algorithms is given below.

PROTOTYPE-BASED FUZZY CLUSTERING
Fix the number of clusters C; fix m, m ∈ [1, ∞);
Initialize the fuzzy C-partition U;
REPEAT
    Update the parameters of each cluster prototype;
    Update the partition matrix U by using (3);
UNTIL(||ΔU|| < ε);

The hard (crisp) versions of these algorithms are easily obtained by changing the updating rule for the memberships so that they are always binary. In other words, one uses

$$u_{ij} = \begin{cases} 1 & \text{if } d^2(x_j, \beta_i) < d^2(x_j, \beta_k) \text{ for all } k \ne i \\ 0 & \text{otherwise.} \end{cases} \quad (6)$$

Ties are broken arbitrarily. In practice, the hard versions do not perform as well as their fuzzy counterparts. Due to space limitation, we do not deal with the hard algorithms in this paper.

III. THE FUZZY C QUADRIC SHELLS (FCQS) ALGORITHM

This algorithm [31], [32] assumes that each cluster resembles a hyperquadric surface, and the prototypes $\beta_i$ consist of parameter vectors $p_i$ which define the equations of the hyperquadric surfaces. The general equation for such a hyperquadric surface is

$$p_i^T q = 0 \quad (7)$$

where

$$p_i^T = [p_{i1}, p_{i2}, \ldots, p_{is}], \quad q^T = [x_1^2, \ldots, x_n^2, x_1 x_2, \ldots, x_{n-1} x_n, x_1, x_2, \ldots, x_n, 1] \quad (8)$$

and $s = n + n(n-1)/2 + n + 1 = r + n + 1$, with $r = n(n+1)/2$. We may define the algebraic (or residual) distance from a point $x_j$ to a prototype $\beta_i$ as

$$d^2(x_j, p_i) = d^2_{Qij} = p_i^T q_j q_j^T p_i = p_i^T M_j p_i \quad (9)$$

where $M_j = q_j q_j^T$.

To obtain a fuzzy C-partition of the data, we may minimize the objective function in (2) with $d^2_{Qij}$ as the underlying distance measure. Since the objective function is homogeneous with respect to $p_i$, however, we need to constrain the problem to avoid the trivial solution. Some of the possibilities are listed in (10).

Constraints i) [41] and ii) [21] do not make the distance measure invariant to translation and rotation of the data. They do allow the solution to be linear or planar. Constraint iii) was used by Chen [9] and Krishnapuram et al. [36] for circles and by Dave and Bhaswan [16] for quadric curves. It precludes linear solutions and can lead to instabilities or poor performance when the data points are approximately linear (planar), as in the case of partial circles (spheres) with very large radii [42]. Moreover, in the case of noncircular (nonspherical) quadrics, this constraint makes the distance measure rotation-variant, which is undesirable. Constraint iv) [2], [6], [11] has the problem that no conic that passes through the origin satisfies it. The last constraint was imposed by Bookstein [7], [45] and has the advantage that the resulting distance measure is invariant to rigid transformations of the prototype. It does not allow, however, the solution to be linear or planar. Pratt showed that this is a great disadvantage and proposed a new quadratic constraint for the case of circles [42]. Taubin [46] generalized this idea for fitting an implicit polynomial curve to a data set. This constraint will be discussed in Section VI. Bookstein's constraint, i.e., constraint v) above, is in our experience the best compromise between computational complexity and performance. Agin [1] also discusses some of these issues.

If we define $a_i$ to be the vector of quadratic-term coefficients of $p_i$ and $b_i$ the vector of the remaining (linear and constant) coefficients, then constraint v) in (10) simplifies to $\|a_i\|^2 = 1$. It is easily verified [19], [31] that the solution is

$$a_i = \text{eigenvector of } (F_i - G_i^T H_i^{-1} G_i) \text{ associated with the smallest eigenvalue, and } b_i = -H_i^{-1} G_i a_i. \quad (11)$$

It is to be noted that $H_i^{-1}$ exists as long as there are at least $n+1$ noncollinear feature points in the data set. Thus, in the FCQS algorithm, the parameters are updated using (11), and the memberships are updated using (3), except that $d^2(x_j, \beta_i)$ is replaced by $d^2_{Qij}$.

Shell clustering algorithms are quite sensitive to how one initializes the C-partition. In our experience, a few (typically 10) iterations of the FCM algorithm followed by a few (typically 10) iterations of the G-K algorithm and a few (typically five) iterations of the Fuzzy C Spherical Shells algorithm [36] provide a good initialization for the FCQS algorithm. While using the FCM algorithm for initialization, a value of 3.0 for the fuzzifier $m$ seems to give good results. This is because the FCM is not really meant for shell clusters, and a higher value of $m$ gives a fuzzier partition, which is more desirable for initialization purposes. In all other algorithms, a value of 2.0 for $m$ works best in practice.

Fig. 1. (a) A data set consisting of two ellipses and a circle. (b) The prototypes found by the FCQS algorithm superimposed on the data set.

Fig. 1 shows a typical example of the results obtained by the FCQS algorithm with a synthetic data set containing about 200 points. Fig. 1(a) shows the 200 × 200 image of the original data set. A uniformly distributed noise with an interval of 3.0 was added to the $x$ and $y$ locations of the data points so that the points do not lie on ideal curves. Fig. 1(b) shows the resulting prototype curves superimposed on the original data set. The number of clusters was assumed to be known. The algorithm typically converges in about 20 iterations, and the CPU time on a Sun Sparc 1 workstation is less than 10 s.

Since the FCQS algorithm uses the algebraic distance given by (9), which is highly nonlinear in nature, the membership assignments are not very meaningful. Moreover, when there are curves (surfaces) of highly varying sizes, the algebraic distance is biased towards smaller curves (surfaces), and for a particular curve (surface) it is biased towards points inside the curve (surface) as opposed to points outside.
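The update (11) can be illustrated for the 2-D case. The partition of the weighted scatter matrix $\sum_j (u_{ij})^m q_j q_j^T$ into blocks $F_i$, $G_i^T$, $G_i$, $H_i$ (quadratic-term rows/columns versus linear-plus-constant rows/columns) is our reading of (11), the normalization $\|a_i\|^2 = 1$ is applied directly to the raw quadratic coefficients for simplicity, and all names are illustrative. With unit (crisp) weights and noiseless data on a circle, the fit is exact.

```python
# Sketch of the parameter update (11) for one cluster in 2-D, with
# q = [x^2, y^2, xy, x, y, 1]^T as in (8).
import math
import numpy as np

def fit_quadric(points, weights):
    Q = np.array([[x * x, y * y, x * y, x, y, 1.0] for x, y in points])
    S = (Q * np.asarray(weights)[:, None]).T @ Q     # sum_j u_ij^m q_j q_j^T
    F, Gt, G, H = S[:3, :3], S[:3, 3:], S[3:, :3], S[3:, 3:]
    # a_i = eigenvector of (F_i - G_i^T H_i^-1 G_i) for the smallest eigenvalue
    M = F - Gt @ np.linalg.inv(H) @ G
    vals, vecs = np.linalg.eigh(M)                   # eigenvalues in ascending order
    a = vecs[:, 0]
    b = -np.linalg.inv(H) @ G @ a                    # b_i = -H_i^-1 G_i a_i
    return np.concatenate([a, b])

# Noiseless points on the circle (x-1)^2 + (y-1)^2 = 4 are fitted exactly.
pts = [(1 + 2 * math.cos(2 * math.pi * k / 12),
        1 + 2 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
p = fit_quadric(pts, [1.0] * len(pts))
center = (-p[3] / (2 * p[0]), -p[4] / (2 * p[1]))    # recovered circle center
```

The recovered coefficient vector is proportional to $[1, 1, 0, -2, -2, -2]$, i.e., the circle itself; the center and radius follow from the coefficient ratios, which are independent of the eigenvector's scale and sign.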

Thus, the distance measure gives rather eccentric and highly curved fits if the data is scattered, as the prototypes try to enclose more points inside the curve. The distance is also sensitive to the placement of the feature point with respect to the curve. Consider, for example, two ellipses $E_1$ and $E_2$ as shown in Fig. 2. Ellipse $E_1$ is centered at (0, 0) and its major and minor semi-axes have the values 2 and 1. Ellipse $E_2$ is centered at (5, 0) and its major and minor semi-axes have the values 1 and 1/2. Consider points A = (3, 0), B = (1, 0), and C = (0, 2). Note that all three points are equidistant from $E_1$ in the Euclidean sense, and point A is equidistant from both $E_1$ and $E_2$. Keeping in mind constraint iv) in (10), we can write the expressions for the algebraic distances of a point x = (x, y) from $E_1$ and $E_2$ as $d^2(x, E_1) = (16/17)\{x^2/4 + y^2 - 1\}^2$ and $d^2(x, E_2) = (1/17)\{(x-5)^2 + 4y^2 - 1\}^2$. Thus, we see that $d^2(A, E_1) = 25/17$, $d^2(B, E_1) = 9/17$, $d^2(C, E_1) = 144/17$, and $d^2(A, E_2) = 9/17$. This clearly illustrates the bias of this distance measure as discussed above. Another problem with the nonlinear distance is that it makes the fit of each curve rather sensitive to the presence of other curves, which sometimes leads to unstable hyperbolic fits. An example is shown in the next section.

Fig. 2. An example to illustrate the sensitivity of the algebraic distance to the size of the curve as well as the location of the feature point with respect to the curve. Points A, B, and C are all geometrically equidistant from the bigger ellipse, and point A is equidistant from both ellipses. However, the algebraic distance does not reflect this.

Here we would like to note that although the distance used in the AFCS algorithm [16] for elliptical clusters is not biased towards points inside the curve, it is still sensitive to the size of the curve as well as the placement of the feature point with respect to the curve. For example, the expressions for the AFCS distance of a point x = (x, y) from $E_1$ and $E_2$ can be written as $d^2(x, E_1) = \{[x^2/4 + y^2]^{1/2} - 1\}^2$ and $d^2(x, E_2) = \{[(x-5)^2 + 4y^2]^{1/2} - 1\}^2$. Thus, we see that $d^2(A, E_1) = 1/4$, $d^2(B, E_1) = 1/4$, $d^2(C, E_1) = 1$, and $d^2(A, E_2) = 1$. In the next sections we introduce modifications to the basic FCQS algorithm to mitigate this problem, keeping in mind that one needs to keep the computational complexity as low as possible.

IV. MODIFICATIONS TO THE FUZZY C QUADRIC SHELLS ALGORITHM

One possible way to alleviate the problem due to the nongeometric nature of $d^2_{Qij}$ is to use the geometric (perpendicular) distance, denoted by $d^2_{Gij}$, between the point $x_j$ and the shell $\beta_i$. To compute $d_{Gij}$ we first rewrite (7) as $x^T A_i x + x^T b_i + c_i = 0$. Then the distance $d_{Gij}$ can be obtained by minimizing $\|x_j - z\|^2$ subject to

$$z^T A_i z + z^T b_i + c_i = 0 \quad (12)$$

where $z$ is a point on the quadric $\beta_i$. By using a Lagrange multiplier $\lambda$, the solution is found to be

$$z = \frac{1}{2}(I - \lambda A_i)^{-1}(\lambda b_i + 2 x_j). \quad (13)$$

Substituting (13) in (12) yields a quartic (fourth-degree) equation in $\lambda$ in the 2-D case (see Appendix A for details), which has at most four real roots $\lambda_k$, $k = 1, \ldots, 4$. The four roots can be computed using the standard closed-form solution. For higher dimensions, the equation is of sixth degree or higher, and iterative root-finding techniques need to be used. For each real root $\lambda_k$ so computed, we calculate the corresponding $z$ vector $z_k$ using (13). Then, we compute $d^2_{Gij}$ using

$$d^2_{Gij} = \min_k \|x_j - z_k\|^2. \quad (14)$$

Minimization of the objective function in (2) with respect to $p_i$ when $d^2_{Gij}$ is used as the underlying distance measure can be achieved only by using iterative techniques such as the Levenberg-Marquardt algorithm [39], [46]. To overcome this problem, we may assume that we can obtain approximately the same values for $p_i$ by using (11), which will be true if all the feature points lie reasonably close to the hyperquadric shells. This leads to a modified FCQS algorithm, in which the memberships are computed using $d^2_{Gij}$, but the parameters are updated using $d^2_{Qij}$. An alternative is to use the Levenberg-Marquardt algorithm after initializing it with the solution obtained by (11) in each iteration. This is implementationally and computationally more complex, however, and is recommended only for small data sets. We have observed that this can increase the CPU time by an order of magnitude, although the overall number of iterations required for the FCQS algorithm to converge is somewhat lower. Moreover, our simulations indicate that the performance of the modified FCQS algorithm is adequate for most computer vision applications. The initialization procedure recommended for the FCQS algorithm can also be used for the modified version with good results.

Fig. 3(a) shows a data set with three curves for which the original version of the FCQS algorithm fails. It can be seen that due to the presence of other curves, the fit for the circle becomes distorted, resulting in a hyperbola. Fig. 3(b) shows the result of the modified FCQS algorithm, illustrating the advantage of the more meaningful membership assignments in the modified version. It is to be noted that in the modified algorithm, although the memberships are based on geometric distances, the parameters are still estimated by minimizing the algebraic distance. This may give poor fits when the data is highly scattered. An example of this behavior is shown in Section VI. Another problem with the modified FCQS algorithm is that $d^2_{Gij}$ has a closed-form solution only in the 2-D case. In higher dimensions, solving for $d^2_{Gij}$ is not trivial. Henceforth we will simply use the acronym FCQS to denote the modified version.
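The bias figures quoted in the two-ellipse example of Fig. 2 can be reproduced directly from the distance expressions as given in the text. The following sketch evaluates them, using exact rational arithmetic for the algebraic distances; the function names are ours.

```python
# Algebraic and AFCS distances of points A = (3,0), B = (1,0), C = (0,2)
# from ellipses E1 (semi-axes 2, 1 at the origin) and E2 (semi-axes 1, 1/2 at (5,0)).
from fractions import Fraction as F

def d2_alg_E1(x, y):   # (16/17) * (x^2/4 + y^2 - 1)^2
    return F(16, 17) * (F(x) ** 2 / 4 + F(y) ** 2 - 1) ** 2

def d2_alg_E2(x, y):   # (1/17) * ((x-5)^2 + 4 y^2 - 1)^2
    return F(1, 17) * ((F(x) - 5) ** 2 + 4 * F(y) ** 2 - 1) ** 2

def d2_afcs_E1(x, y):  # ((x^2/4 + y^2)^(1/2) - 1)^2
    return ((x * x / 4 + y * y) ** 0.5 - 1) ** 2

def d2_afcs_E2(x, y):  # (((x-5)^2 + 4 y^2)^(1/2) - 1)^2
    return (((x - 5) ** 2 + 4 * y * y) ** 0.5 - 1) ** 2
```

Evaluating these reproduces the values in the text: the algebraic distances 25/17, 9/17, 144/17, and 9/17, and the AFCS distances 1/4, 1/4, 1, and 1, even though A, B, and C are equally far from $E_1$ geometrically.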

Fig. 3. An example illustrating the advantage of the modified FCQS algorithm over the FCQS algorithm on a data set containing a circle, a parabola, and an ellipse. Prototypes of clusters found by (a) the FCQS algorithm and (b) the modified FCQS algorithm. The modified algorithm avoids the hyperbolic fit.

Fig. 4. Examples illustrating the tendency of the FCQS algorithm to fit pathological prototypes to scattered linear data. (a) A "flat" hyperbolic fit for two parallel lines. (b) An extremely elongated elliptical fit for two parallel lines.

V. LINE DETECTION USING THE FCQS ALGORITHM

The FCQS algorithm can be used to find linear clusters, even though the constraint forces all prototypes to be of second degree. This is because in practice this algorithm fits a pair of coincident lines for a single line, a hyperbola for two intersecting lines, and a very "flat" hyperbola (see Fig. 4(a)), or an elongated ellipse (see Fig. 4(b)), or a pair of lines for two parallel lines. Hyperbolas and extremely elongated ellipses occur rarely in practice. When the data set contains many linear clusters, the FCQS algorithm characterizes them variously as hyperbolas, extremely elongated ellipses, etc. In this case, we can group all the points belonging to such pathological clusters into a data set and then run a line-finding algorithm such as the G-K algorithm (see [30]) on this data set with an appropriate initialization. The parameters of the lines can be determined from the centers and the covariance matrices of the clusters. The line detection algorithm summarized below may be used after the FCQS algorithm converges.

THE LINE DETECTION ALGORITHM
Set X, the set of all data points in linear clusters, to ∅;
Set number of lines C to 0;
FOR each cluster β_i DO
    IF β_i is a pair of coincident lines THEN
        Add all points assigned to cluster β_i to the data set X;
        C = C + 1;
        Initialize the new linear prototype as one of the two coincident lines;
    END IF;
    IF β_i is a nonflat hyperbola OR a pair of intersecting lines OR a pair of parallel lines THEN
        Add all points assigned to cluster β_i to the data set X;
        C = C + 2;
        Initialize the new linear prototypes as the asymptotes of the hyperbola or as the individual lines making up the pair of lines;
    END IF;
    IF β_i is an ellipse with a very large major axis to minor axis ratio THEN
        Add all points assigned to cluster β_i to the data set X;
        C = C + 2;
        Initialize the new linear prototypes as the two tangents to the ellipse at the two ends of the minor axis;
    END IF;
    IF β_i is a hyperbola with a very large conjugate axis to transverse axis ratio THEN
        Add all points assigned to cluster β_i to the data set X;
        C = C + 2;
        Initialize the new linear prototypes as the two tangents to the hyperbola at its two vertices;
    END IF;
END FOR;
Run the G-K algorithm on the data set X with C clusters using the initialization for the prototype of each cluster.

Appendix B summarizes the various conditions that one needs to check to determine the nature of the second-degree curve. The initialization procedures for the various cases in the line detection algorithm are described in Appendix C. Since the initialization is excellent, the G-K algorithm converges in a couple of iterations. The above algorithm successfully handles the pathological cases shown in Fig. 4.

VI. THE FUZZY C PLANO-QUADRIC SHELLS (FCPQS) ALGORITHM

When the exact distance is too complex to compute, one could use what is known as the "approximate distance" (first-order approximation of the exact distance) given by [24], [46]

$$d^2(x_j, \beta_i) = d^2_{Aij} = \frac{d^2_{Qij}}{|\nabla d_{Qij}|^2} = \frac{p_i^T M_j p_i}{p_i^T [D(q_j) D(q_j)^T] p_i} \quad (15)$$

where $\nabla d_{Qij}$ is the gradient of the functional $p^T q$ in (7) evaluated at $x_j$, and the matrix $D(q_j)$ is the Jacobian of $q$ in (8) evaluated at $x_j$ [46]. In other words, the approximate distance is simply the algebraic distance divided by the gradient magnitude. The objective function to be minimized in this case is the one in (2) with $d^2_{Aij}$ as the underlying distance measure, i.e.,

$$J(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^m \frac{p_i^T M_j p_i}{p_i^T [D(q_j) D(q_j)^T] p_i}. \quad (16)$$
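A minimal 2-D sketch of the approximate distance (15): the squared algebraic residual divided by the squared gradient magnitude of $p^T q$ at the point. The conic parameterization follows $q^T = [x^2, y^2, xy, x, y, 1]$ from (8); the function name is ours. For the unit circle and the point (2, 0), whose true squared distance to the curve is 1, the algebraic residual squared is 9 and (15) gives 9/16.

```python
# Approximate (first-order) distance from a point to a 2-D conic p^T q = 0.

def approx_dist2(p, x, y):
    q = [x * x, y * y, x * y, x, y, 1.0]
    alg = sum(pi * qi for pi, qi in zip(p, q)) ** 2   # p^T M_j p = (p^T q_j)^2
    # D(q_j)^T p is the gradient of p^T q with respect to (x, y):
    gx = 2 * p[0] * x + p[2] * y + p[3]
    gy = 2 * p[1] * y + p[2] * x + p[4]
    return alg / (gx * gx + gy * gy)

circle = [1.0, 1.0, 0.0, 0.0, 0.0, -1.0]   # x^2 + y^2 - 1 = 0
d = approx_dist2(circle, 2.0, 0.0)          # 9/16; exact squared distance is 1
```

For points on the curve the residual, and hence the approximate distance, is exactly zero, so the normalization matters only away from the shell.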

A suitable constraint needs to be chosen while minimizing (16) with respect to $p_i$. One may use the constraint proposed by Taubin [46], given by

$$p^T \left[ \sum_{j=1}^{N} D(q_j) D(q_j)^T \right] p = N. \quad (17)$$

This constraint has the advantage that it takes the data points into account and allows the degenerate case of lines and planes. It is meant for fitting a single curve to a given set of data points. Hence, we need to generalize it to the fuzzy case and to fit $C$ curves simultaneously. This is achieved by changing the constraint to [33]

$$p_i^T \left[ \sum_{j=1}^{N} (u_{ij})^m D(q_j) D(q_j)^T \right] p_i = N_i \quad \text{for } i = 1, \ldots, C, \quad (18)$$

or

$$p_i^T D_i p_i = N_i \quad \text{for } i = 1, \ldots, C \quad (19)$$

where $N_i$ is as in (5), and

$$D_i = \sum_{j=1}^{N} (u_{ij})^m [D(q_j) D(q_j)^T]. \quad (20)$$

Minimization of (16) with respect to $p_i$ subject to (19) yields a complicated equation which cannot be solved for $p_i$ explicitly. To avoid an iterative solution, we make the following assumptions:
i) All data points are reasonably close to some cluster, i.e., the $u_{ij}$ are close to being hard. This assumption is valid when the data is not very scattered. (It is reasonable for the noisy case if we use possibilistic memberships, as will be discussed in Section IX.)
ii) The magnitude of the gradient at all points $x_j$ that have a high membership in $\beta_i$ is approximately constant, i.e., $|\nabla d_{Qij}|^2 = p_i^T D(q_j) D(q_j)^T p_i \approx 1$.

We now discuss the second assumption for the special case of hyperspheres. Hyperspheres are described by

$$p_i^T q = [p_{i1}, p_{i2}, \ldots, p_{i(n+1)}, p_{i(n+2)}] \, [(x_1^2 + x_2^2 + \cdots + x_n^2), x_1, \ldots, x_n, 1]^T = 0. \quad (21)$$

Here $p_i$ represents a prototype parameter vector. Let $x_j = [x_{j1}, \ldots, x_{jn}]^T$ denote a feature point, and let $q_j = [(x_{j1}^2 + x_{j2}^2 + \cdots + x_{jn}^2), x_{j1}, \ldots, x_{jn}, 1]^T$. Then we have

$$|\nabla d_{Qij}|^2 = (2 p_{i1} x_{j1} + p_{i2})^2 + \cdots + (2 p_{i1} x_{jn} + p_{i(n+1)})^2 = 4 p_{i1} (p_{i1} x_{j1}^2 + \cdots + p_{i1} x_{jn}^2 + p_{i2} x_{j1} + \cdots + p_{i(n+1)} x_{jn}) + (p_{i2}^2 + \cdots + p_{i(n+1)}^2).$$

If a feature vector $x_j$ lies on the prototype of cluster $\beta_i$, then it must satisfy (21), and the above expression becomes

$$|\nabla d_{Qij}|^2 = p_i^T D(q_j) D(q_j)^T p_i = -4 p_{i1} p_{i(n+2)} + (p_{i2}^2 + \cdots + p_{i(n+1)}^2) = \text{constant} = K.$$

When the memberships are almost hard, substituting the above result in (18), we get $N_i K \approx N_i$, or $K \approx 1$. Thus, assumption ii) is valid for circles (spheres) when $q_j$ corresponds to a point on the curve (surface). It is obviously valid for the case of planes regardless of the location of $x_j$, since $p_{i1} = 0$. It can also be shown that the assumption is valid for cylinders and rectangular hyperboloids. For other quadric shapes, this assumption does not hold. This issue will be discussed further in Section VII.

If $p_i^T D(q_j) D(q_j)^T p_i \approx 1$, we may ignore the denominator in (16). The simplified objective function is given by

$$J_A(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^m \, p_i^T M_j p_i = \sum_{i=1}^{C} p_i^T E_i p_i$$

where $E_i = \sum_{j=1}^{N} (u_{ij})^m M_j$. The above objective function is essentially the same as the one used in the FCQS algorithm; however, the constraint used is different. Minimization of the simplified objective function subject to the constraint in (19) leads to

$$E_i p_i = \lambda_i D_i p_i. \quad (22)$$

It is easily verified that the Jacobian $D(q_j)$ consists of a block $A_1 = 2\,\mathrm{diag}(x_{j1}, \ldots, x_{jn})$ arising from the squared terms, a block $A_2$ containing the derivatives of the cross-product terms, a block $A_3 = I_n$ arising from the linear terms, and a final row of zeros arising from the constant term. For example, in the 2-D case, with $q^T = [x_1^2, x_2^2, x_1 x_2, x_1, x_2, 1]$,

$$D(q_j) = \begin{bmatrix} 2x_{j1} & 0 \\ 0 & 2x_{j2} \\ x_{j2} & x_{j1} \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}.$$

Since the last row of $D(q_j)$ is always equal to $[0\ 0 \cdots 0]$, $D_i$ is singular, and the above generalized eigenvector problem cannot be converted to a regular eigenvector problem; however, we may solve it using the QZ algorithm [46]. Care must be exercised while solving (22), because the matrices $D_i$ and $E_i$ are highly unbalanced. Several methods for balancing matrices are available in the literature.

Thus, (22) gives us the prototype update equation for the FCPQS algorithm. The membership update equation in the FCPQS algorithm is identical to (3), except that $d^2(x_j, \beta_i)$ is replaced by $d^2_{Aij}$. In the special case of hyperspherical prototypes, this algorithm may be called the Fuzzy C Plano-Spherical Shells algorithm.

The FCPQS algorithm requires solving a six-dimensional generalized eigenvector problem (in the 2-D case) as opposed to a three-dimensional regular eigenvector problem. In addition to its computational complexity, we noticed that the FCPQS algorithm in general requires more iterations to converge than the FCQS algorithm. When the data set contains only one noiseless linear (planar) cluster among other nonlinear (nonplanar) clusters, the FCPQS algorithm does detect and characterize the linear (planar) cluster correctly. On the other hand, if the linear (planar) cluster is scattered or noisy, or if multiple linear (planar) clusters are present, the lines (planes) may be "overfitted" by a quadric curve (or surface). (One can always obtain a lower error of fit by fitting an extremely elongated ellipse to a scattered line.) The FCPQS algorithm also sometimes combines a pair of linear clusters into a single quadric cluster, a result similar to that of the FCQS algorithm. In such cases, the line detection procedure described in the previous section needs to be applied. Thus, in the 2-D case, this algorithm does not seem to have any advantage over the FCQS algorithm.

The real advantage of the FCPQS algorithm over the FCQS algorithm is in higher dimensions. It may be recalled (from Section IV) that the exact distance to a quadric has no closed-form solution in higher dimensions. Thus, (the modified version of) the FCQS algorithm is quite impractical there. The performance of the FCPQS algorithm, however, is quite good, and its computational complexity is reasonably low. (See [46] for a discussion of the computational complexity of the QZ algorithm.) This is the primary reason we chose this algorithm to develop an unsupervised algorithm for surface fitting. This algorithm will be explained in Part II of this paper.

In the 3-D case, we noticed that the smallest-eigenvalue solution of (22) may sometimes correspond to an "overfitted" prototype or to a surface prototype that almost never occurs in real images. Examples are hyperboloids of two sheets, hyperbolic cylinders, and imaginary quadric surfaces. Therefore, we accept the smallest-eigenvalue solution of (22) only if it represents the parameter vector of an "acceptable" surface type; otherwise we keep checking the next eigenvectors (assuming that they are organized in ascending order of the eigenvalues) until we find the first one that represents an "acceptable" surface prototype. By an "acceptable" surface prototype, we mean the following: real ellipsoids, hyperboloids of one sheet, real quadric cones, elliptic paraboloids, real elliptic cylinders, parabolic cylinders, and (pairs of) planes. The different types of quadric surfaces and their identification conditions are listed in Appendix B.

In the 3-D case, many quadric surfaces such as cones, cylinders, and planes are not bounded surfaces and have an infinite extent. Therefore, a given cluster having the prototype of one of these unbounded surface types will attract many points lying on the (infinite) extension of the given cluster. To avoid this problem, one may use a heuristically modified distance which is a convex combination of the approximate distance to the shell $d^2_{Aij}$ and the Euclidean distance $d^2_{Eij}$. This modified distance $d^2_{Mij}$ can be formulated as

$$d^2_{Mij} = (1 - \epsilon)\, d^2_{Aij} + \epsilon\, d^2_{Eij}$$

where $\epsilon$ is chosen such that the second term becomes comparable to the first term only for large $d^2_{Eij}$. The value of $\epsilon$ used in our application was very small. The objective function can be easily reformulated using the modified distance. The Euclidean distance $d^2_{Eij}$ is measured from the statistical center (mean) of the cluster, and the center is updated in every iteration using (4). We recommend using the modified distance only as a method to refine the results obtained by the FCPQS algorithm. Thus, this distance may be used for a couple of iterations after the FCPQS algorithm converges.

Fig. 5. (a) Result of the Fuzzy C Spherical Shells algorithm on a data set containing a line and a circle. The prototypes are shown superimposed on the data set. The constraint used in this algorithm does not allow the degenerate case of lines. (b) Result of the Fuzzy C Plano-Spherical Shells algorithm.

Figs. 5(a) and 5(b) show the results of the Fuzzy C Spherical Shells algorithm [36] and the Fuzzy C Plano-Spherical Shells algorithm, respectively, on a data set consisting of a circle and a line. The prototypes obtained are superimposed on the original data. Unlike the Fuzzy C Spherical Shells algorithm, the Fuzzy C Plano-Spherical Shells algorithm with the constraint in (19) is able to characterize both the line and the circle correctly. Figs. 6(a) and 6(b) show the fits obtained by the FCQS and FCPQS algorithms for a scattered ellipse. (It is to be noted that the modification to the FCQS algorithm presented in Section IV has no effect if there is only one cluster in the data set, since all memberships are equal to 1.0.) It can be seen that the FCPQS fit is much better, even though the assumptions made in arriving at (22) are not valid for an ellipse. The sums of squared errors (measured using $d^2_{Gij}$) for the two cases are 301.2 and 254.7, respectively. In the 2-D case, we observed that the FCPQS algorithm takes 20% to 30% more iterations to converge compared to the FCQS algorithm. The CPU requirements per iteration are about 10% to 20% higher. In higher dimensions, the difference will be considerably higher.

VII. A WEIGHTING PROCEDURE TO IMPROVE FITS

As discussed in the previous section, the assumption that
other points belonging to other clusters if the points lie on the 1Vd~;jl’x constant for all xj having a high membership in
36 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 3, NO. 1, FEBRUARY 1995

Fig. 6. An example illustrating the advantage of the FCPQS algorithm over the FCQS algorithm. (a) Prototype found by the FCQS algorithm for a scattered ellipse. (b) Prototype found by the FCPQS algorithm for the same scattered ellipse.

Fig. 7. Effect of the reweight procedure on the FCQS algorithm. The prototype for two intersecting lines found by the FCQS algorithm (a) without reweight and (b) with reweight.

p_i is always true for linear (planar) prototypes, regardless of the location of the feature point x_j with respect to the prototype. Otherwise, this assumption is correct only for certain types of shells (such as circles and rectangular hyperbolas in 2-D, or spheres and cylinders in 3-D), and even then only if all the feature points lie close to one of the shells. Thus, this assumption is invalid for ellipses and parabolas in 2-D, and for ellipsoids and many other quadric shapes in 3-D. Therefore, the fit will be biased towards points where the gradient magnitude ||∇d_Qij|| is high. One solution is to weight the distance measure to improve the fit [42], [46]. We achieve this by minimizing the weighted objective function given by

    J(B, U; X) = Σ_{i=1}^{C} Σ_{j=1}^{N} (u_ij)^m w_ij d_Qij^2    (23)

subjected to the same constraint as before. Since the purpose of introducing the weights w_ij is to reduce the bias due to the omission of the denominator in (16), ideally these weights should be chosen as

    w_ij = 1 / ||∇d_Qij||^2.    (24)

With this choice of w_ij, however, the objective function in (23) becomes identical to the one in (16), which cannot be minimized easily. To simplify the problem, we may treat the w_ij as constants which are updated in every iteration according to (24) using the parameter values p_i from the previous iteration. Using this heuristic, the parameters can be obtained by solving

    (E_W)_i p_i = λ_i (D_W)_i p_i

where (E_W)_i and (D_W)_i are the weighted counterparts of the matrices in (22), obtained by multiplying the j-th term in each of the corresponding sums by w_ij.

Since this reweight procedure is heuristic, it is not guaranteed that the fit obtained after reweighting will always be better than the original fit. Therefore, in each iteration, we compute the parameter vector p_i both with and without the weights, and accept the parameter vector p_i resulting from the reweight procedure only when the error of fit decreases. The sum of geometric or approximate distances for each individual cluster may be used as a measure of the error of fit.

The reweight procedure can also be adopted for the FCQS algorithm with the same choice of weights with good results. Following the same steps that were used above, it can be shown that the solution with the reweight procedure for the FCQS algorithm (subject to Bookstein's constraint) is given by the following equations instead of (11):

    a_i = eigenvector of ((F_W)_i - (G_W)_i^T (H_W)_i^{-1} (G_W)_i) associated with the smallest eigenvalue, and
    b_i = -(H_W)_i^{-1} (G_W)_i a_i

where

    (F_W)_i = Σ_{j=1}^{N} (u_ij)^m w_ij R_j,   (G_W)_i = Σ_{j=1}^{N} (u_ij)^m w_ij S_j,   and   (H_W)_i = Σ_{j=1}^{N} (u_ij)^m w_ij T_j.

The FCQS algorithm with the reweight procedure produced the same fit for the scattered ellipse of Fig. 6 as the one produced by the FCPQS algorithm. Fig. 7(a) shows the results of the FCQS algorithm on a data set with two intersecting lines. In this figure, the fit is poor around the intersection, because the gradient magnitude is zero at the point of intersection. Fig. 7(b) shows the result of the FCQS algorithm with the reweight procedure. An obvious improvement in the fit can be seen.

In the case of the FCPQS algorithm, the use of the reweight procedure does not have much effect in 2-D; however, its effect can be significant in 3-D. Therefore, we recommend the use of the FCPQS algorithm with reweight in the 3-D case.
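The accept-only-if-better logic of the reweight procedure can be sketched for a single cluster. The following is a minimal illustration, not the paper's FCQS/FCPQS implementation: it fits a circle algebraically (a linear least-squares problem), reweights with w = 1/||∇d_Q||^2, and keeps the reweighted parameters only when the geometric error of fit decreases. All function names here are ours.

```python
import numpy as np

def fit_circle_algebraic(pts, w=None):
    """Least-squares fit of x^2 + y^2 + a*x + b*y + c = 0 (optionally weighted)."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    if w is not None:
        sw = np.sqrt(w)
        A, rhs = A * sw[:, None], rhs * sw
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    center = np.array([-a / 2.0, -b / 2.0])
    radius = np.sqrt(center @ center - c)
    return a, b, c, center, radius

def geometric_error(pts, center, radius):
    """Sum of squared geometric (radial) distances to the circle."""
    return np.sum((np.linalg.norm(pts - center, axis=1) - radius) ** 2)

def reweighted_fit(pts):
    """Unweighted fit, then one reweight pass; keep the better of the two."""
    a, b, c, center, radius = fit_circle_algebraic(pts)
    err0 = geometric_error(pts, center, radius)
    # Weights approximating (24): inverse squared gradient of the algebraic distance.
    grad = np.column_stack([2 * pts[:, 0] + a, 2 * pts[:, 1] + b])
    w = 1.0 / np.maximum(np.sum(grad**2, axis=1), 1e-12)
    _, _, _, center_w, radius_w = fit_circle_algebraic(pts, w)
    err1 = geometric_error(pts, center_w, radius_w)
    # Accept the reweighted parameters only if the error of fit decreases.
    return (center_w, radius_w) if err1 < err0 else (center, radius)
```

For a circle the gradient magnitude is nearly constant, so the two fits coincide; for elongated conics the weighted pass pulls the fit toward low-gradient regions, which is the bias correction described above.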
KRISHNAPURAM et al.: SHELL CLUSTERING ALGORITHMS-PART I 37

VIII. GENERALIZATION TO PROTOTYPES REPRESENTED BY SETS OF HIGHER-ORDER POLYNOMIAL FUNCTIONS

Consider the set of zeros of f(x) defined by

    f_1(x) = 0; f_2(x) = 0; ...; f_k(x) = 0.    (25)

Here x^T = [x_1, x_2, ..., x_n] is an n-dimensional coordinate vector, as before. It is to be noted that f(x) is a vector consisting of k functions, all of which have to be simultaneously satisfied. Thus, if n = 2 and k = 1, we have a planar curve; if n = 3 and k = 1, we have a 3-D surface; and if n = 3 and k = 2, we have a space curve. For example, a circle of radius 1 on the xy-plane in 3-D is defined by the equations x^2 + y^2 - 1 = 0 and z = 0. Let

    F(x) = [F_1(x), F_2(x), ..., F_h(x)]^T

be a vector of h polynomials. Each of the F_i(x) is a polynomial of degree p or less in each of the coordinate axis labels x_1, x_2, ..., x_n. For example, when n = 2, k = 1, and p = 2, F(x) can be chosen to be q^T = [x_1^2, x_2^2, ..., x_n^2, x_1 x_2, ..., x_{n-1} x_n, x_1, x_2, ..., x_n, 1] as in (8), in which case each element of F(x) is a monomial. For a suitable choice of F(x), we can write (25) as

    f(x) = P F(x) = 0

where P is a k × h matrix of parameters consisting of the coefficients of the functions f_i(x). Thus P represents the prototype parameters, where each prototype is a set of functions f(x). We can construct an objective function based on C such prototypes as

    J(Π, U; X) = Σ_{i=1}^{C} Σ_{j=1}^{N} (u_ij)^m ||P_i F(x_j)||^2    (26)

where Π = (P_1, ..., P_C) represents a C-tuple of prototypes. Here ||P_i F(x_j)||^2 represents the algebraic distance from x_j to prototype P_i. The above objective function can be written as

    J(Π, U; X) = Σ_{i=1}^{C} Σ_{j=1}^{N} (u_ij)^m Tr[P_i F(x_j) F^T(x_j) P_i^T]
               = Σ_{i=1}^{C} Σ_{j=1}^{N} (u_ij)^m Tr[P_i M_j P_i^T]
               = Σ_{i=1}^{C} Tr[P_i E_i P_i^T]

where

    M_j = F(x_j) F^T(x_j),  and  E_i = Σ_{j=1}^{N} (u_ij)^m M_j.

The above objective function can be minimized subject to the constraint

    Σ_{j=1}^{N} (u_ij)^m P_i [D(x_j) D(x_j)^T] P_i^T = I_k    (27)

where I_k is a k × k identity matrix, and

    D_i = Σ_{j=1}^{N} (u_ij)^m [D(x_j) D(x_j)^T].

D(x_j) is the Jacobian matrix D(x) evaluated at x_j, i.e., the h × n matrix whose (l, m) element is ∂F_l(x_j)/∂x_m. It can be shown [46] that the solution to the minimization of (26) subject to (27) is given by the eigenvectors corresponding to the least k eigenvalues of the generalized eigenvector problem given below:

    p_i E_i = λ_i p_i D_i.

Each of the eigenvector solutions p_i gives us one row of P_i. As in the case of the FCPQS algorithm, the use of the constraint in (27) gives us fits that correspond to the approximate distance, even though the objective function uses the algebraic distance, especially when reweighting is used.

It is to be noted that in this general case, each prototype is represented by a set of higher-order polynomials. In this sense, the above algorithm is also a generalization of the Fuzzy C Regression Models (FCRM) introduced by Hathaway and Bezdek recently [25]. The FCRM considers two-dimensional prototypes of the form y = f(x), where f(x) is a polynomial. In other words, y is explicitly considered as a dependent variable, and higher powers of y do not appear in this model. The FCRM uses the algebraic distance, and the implicit constraint that the coefficient of y is one.

IX. POSSIBILISTIC MEMBERSHIPS FOR ROBUST CLUSTERING

Fuzzy clustering algorithms do not always estimate the prototype parameters of the clusters accurately. The main source of this problem is the probabilistic constraint used in fuzzy clustering, which states that the memberships of a data point across all clusters must sum to one. [See (1).] This problem has been discussed in detail in [34]. Here we present two simple examples to illustrate its drawbacks as related to shell clustering. Fig. 8(a) shows a situation where there are two linear clusters. Fuzzy clustering would produce very different (asymmetric) memberships in cluster 1 for points A and B, even though they are equidistant from the prototype. Similarly, point A and point C may have equal membership values in cluster 1, even though point C is far less typical of cluster 1 than point A. The resulting fit for the left cluster would thus be skewed. Fig. 8(b) presents a situation with two intersecting circular shell clusters. In this case, intuitively, point A might be considered a "good" member of both clusters, whereas point B might be considered a "poor" member, and point C an outlier. Here again, the constraint in (1) would force points A, B, and C to have memberships of 0.5 in both clusters. The membership values cannot distinguish between a moderately typical member and an extremely atypical member, because the membership of a point in a class is a relative number. In other words, the memberships that result from the constraint in (1) denote degrees of sharing rather than degrees of typicality.
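The single-prototype core of the generalization in Section VIII can be sketched in a few lines. This is our own minimal illustration (C = 1, n = 2, k = 1, p = 2), not code from the paper: it builds E_i and D_i from the conic monomials and their Jacobian and solves the generalized eigenvector problem with SciPy's QZ-based solver. Since D_i is singular (the constant monomial has a zero gradient), only finite eigenvalues are considered.

```python
import numpy as np
from scipy.linalg import eig

def conic_monomials(x, y):
    """F(x) for n = 2, k = 1, p = 2: the six conic monomials."""
    return np.array([x * x, y * y, x * y, x, y, 1.0])

def conic_jacobian(x, y):
    """D(x): the h x n matrix of partials of each monomial w.r.t. x and y."""
    return np.array([[2 * x, 0.0],
                     [0.0, 2 * y],
                     [y, x],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 0.0]])

def fit_polynomial_prototype(pts, u=None, m=2.0):
    """Minimize sum u^m ||p . F(x_j)||^2 subject to p D_i p^T = 1 (single prototype)."""
    u = np.ones(len(pts)) if u is None else u
    E = np.zeros((6, 6))
    D = np.zeros((6, 6))
    for (x, y), uj in zip(pts, u):
        q = conic_monomials(x, y)
        J = conic_jacobian(x, y)
        E += uj**m * np.outer(q, q)
        D += uj**m * (J @ J.T)
    # Generalized problem E p = lambda D p; D is singular, so use the
    # QZ-based solver and keep the eigenvector of the smallest finite eigenvalue.
    vals, vecs = eig(E, D)
    vals = np.real(vals)
    finite = np.where(np.isfinite(vals))[0]
    best = finite[np.argmin(np.abs(vals[finite]))]
    return np.real(vecs[:, best])
```

For points on the circle (x - 1)^2 + (y + 1)^2 = 4, the recovered eigenvector is proportional to [1, 1, 0, -2, 2, -2], the coefficient vector of the circle in the monomial basis above.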

Fig. 8. Disadvantages of constrained fuzzy memberships. (a) A data set with two linear clusters. Membership values in cluster 1 for points A and B will be different, even though they are equidistant from the prototype. Point A and point C may have equal membership values in cluster 1, even though point C is far less typical of cluster 1 than point A. (b) A data set with two intersecting circular-shell clusters. The memberships of points A, B, and C are all 0.5 in the two clusters, even though point B is less typical of the clusters, and point C is an outlier.

Fig. 9. The effect of noise points on shell clustering. Prototypes for two noisy ellipses found by (a) the Hard C Quadric Shells algorithm and (b) the Fuzzy C Quadric Shells algorithm.

Therefore, noise points, which are often quite distant from the primary clusters, can drastically influence the estimates of the class prototypes, and hence the final partition.

One established technique for reducing the effect of noise on least-squares fits is to use weights that are inversely related to the distance of the point to the prototype [24], [37], [48]. Although one may interpret memberships as weights, the memberships generated by fuzzy clustering are not inversely related to the distance, since they are relative numbers. Other ways to deal with noise in clustering may be found in [29], [47]. Davé also discusses another effective way to deal with the noise problem (although not the relative membership problem) by introducing the concept of a noise cluster [14], [15].

If there is only one cluster in the data, there is no difference between crisp and fuzzy clustering. Otherwise, the effect of noise points on the final partition is more drastic in the crisp case, because the membership of noise points tends to be distributed among all classes in the fuzzy case. Since the sum of the memberships of a point is constrained to be equal to one, however, this difficulty still remains to a lesser degree in the fuzzy case. Fig. 9 shows the effect of noise on a data set containing two elliptic clusters. As can be seen, the result of the fuzzy case (Fig. 9(b)) is poor but better than that of the hard case (Fig. 9(a)).

One can cast the clustering problem into the framework of possibility theory [18], [49] by relaxing the constraint in (1) and reformulating the objective function in (2) as [34]

    J_m(B, U; X) = Σ_{i=1}^{C} Σ_{j=1}^{N} (u_ij)^m d^2(x_j, p_i) + Σ_{i=1}^{C} η_i Σ_{j=1}^{N} (1 - u_ij)^m    (28)

where the η_i are suitable positive numbers. The first term in (28) requires that the distances from the feature vectors to the prototypes be as low as possible, whereas the second term forces u_ij to be as large as possible, thus avoiding the trivial solution. It is easy to show [34] that U may be a global minimum of J_m(B, U; X) only if the memberships are updated by

    u_ij = 1 / (1 + (d^2(x_j, p_i) / η_i)^{1/(m-1)}).    (29)

Thus, the updated value of u_ij depends only on the distance of x_j from p_i, and not on the distance of x_j from all other prototypes, which is a desirable result. The prototypes are updated in the same manner as in the corresponding fuzzy algorithms.

It is best to initialize the possibilistic algorithms with the corresponding fuzzy algorithms. Since the value of η_i determines the distance at which the membership value of a point in a cluster becomes 0.5, it should relate to the overall size and shape of cluster p_i. When the nature of the clusters is known, values for η_i may also be fixed a priori. For example, in the case of shell clustering algorithms, the values for η_i may be set equal to the square of the expected thickness of the shells. Another possibility is to use the fuzzy intra-cluster distance to compute η_i, i.e.,

    η_i = K (Σ_{j=1}^{N} (u_ij)^m d^2(x_j, p_i)) / (Σ_{j=1}^{N} (u_ij)^m).    (30)

Typically K is chosen to be one. Several other ways to estimate the η_i are given in [34].

Although the fits themselves are not very sensitive to the exact values of η_i, some of the shell cluster validity measures (to be discussed in Part II of this paper) do depend on the choice of η_i. For this reason, the best approach is to compute approximate values for the η_i from the initial fuzzy partition using (30), and after the possibilistic algorithm converges, run a few more iterations of the algorithm with a fixed value of η_i. This will ensure that the validities of all the clusters are measured on a uniform scale. The fixed value of η_i is the expected thickness of the shell clusters. A good value for η_i in boundary detection applications is about two. The possibilistic shell clustering algorithm is summarized below.

THE POSSIBILISTIC SHELL CLUSTERING ALGORITHM
Fix the number of clusters C; fix m, m ∈ [1, ∞);
Initialize C-partition U using the corresponding fuzzy algorithm;
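The two equations that drive this algorithm, the possibilistic membership update (29) and the bandwidth estimate (30), can be sketched directly. This is a minimal illustration with our own function names, not the authors' code:

```python
import numpy as np

def possibilistic_memberships(d2, eta, m=2.0):
    """Update (29): u_ij depends only on d^2(x_j, p_i) and eta_i,
    not on the distances to the other prototypes."""
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

def estimate_eta(u, d2, m=2.0, K=1.0):
    """Estimate (30): K times the fuzzy intra-cluster distance."""
    return K * np.sum(u**m * d2) / np.sum(u**m)
```

Note that when d^2 equals eta_i the membership is exactly 0.5, which is the interpretation of eta_i given above.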

Fig. 10. Advantage of unconstrained memberships in the possibilistic approach. Prototypes for two noisy ellipses found by the PCQS algorithm.

Estimate η_i using (30);
REPEAT
    Update the prototypes using U;
    Compute U using (29);
UNTIL (||ΔU|| < ε_1);
{The remaining part of the algorithm is optional and is to be used only when validity measures need to be computed}
Fix the values of η_i to the expected thickness of the shells;
REPEAT
    Update the prototypes using U;
    Compute U using (29);
UNTIL (||ΔU|| < ε_2);

Possibilistic versions of the FCQS and FCPQS algorithms can be derived very easily by merely changing the membership updating equation from (3) to (29). We will henceforth refer to these possibilistic versions as the Possibilistic C Quadric Shells (PCQS) and the Possibilistic C Plano-Quadric Shells (PCPQS) algorithms, respectively. Fig. 10 shows the result of the PCQS algorithm on the noisy data set in Fig. 9, for which both hard and fuzzy clustering failed. This result shows that the possibilistic approach makes the clustering process robust.

X. CONCLUSIONS AND RECOMMENDATIONS

In this paper, we presented several fuzzy and possibilistic shell clustering algorithms. The FCQS algorithm uses a constraint on the second-degree terms and does not allow the degenerate case of linear prototypes. This problem can be overcome by using a procedure to extract linear clusters from certain types of pathological clusters. Although the constraint used in the FCPQS algorithm theoretically allows the detection of linear clusters, in practice it often overfits second-degree curves to linear segments. Also, this algorithm requires us to solve a six-dimensional generalized eigenvector problem (in the 2-D case), compared to the three-dimensional regular eigenvector problem in the FCQS algorithm. Therefore, in our experience, the (modified) FCQS algorithm and its possibilistic version are the best choices for most applications in 2-D. The reweight procedure presented in Section VII is particularly important in the 3-D case, and since the computation of the exact distance is too expensive, the FCPQS algorithm and its possibilistic version are the only viable algorithms in this case.

The existing fuzzy clustering methods use relative memberships, which cannot always distinguish between good members and poor members. On the other hand, if one takes the view that the membership of a point in a class has nothing to do with its membership in other classes, then one can achieve membership distributions that correspond more closely to the notion of typicality. The resulting possibilistic algorithms are naturally more immune to noise. One disadvantage of the possibilistic approach is that one needs to estimate the bandwidths η_i. In most practical applications of shell clustering, the expected thickness of the shell clusters is known, so this is not a major drawback. It is also to be remembered that the possibilistic algorithms need a good initialization. Thus, the fuzzy algorithms will always be useful.

Traditionally, the generalized Hough transform (GHT) [3], [26], [38] has been used to detect shapes when the boundaries/surfaces are noisy or sparse. One disadvantage of the GHT is that its computational complexity is O(N × N_p1 × N_p2 × ... × N_p(s-1)), where N is the total number of points in the image to be processed, N_pi is the number of quantization levels of the i-th parameter, and s is the total number of parameters. The memory requirement of the GHT is O(N_p1 × N_p2 × ... × N_ps). Since the accuracy of the parameter values is determined by the number of quantization levels, N_pi cannot be too small. (In contrast, the accuracy of the parameter values in shell clustering is limited only by computer precision.) Some researchers have used hierarchical resolutions to mitigate this problem [43]. In the case of a general second-degree curve in 2-D, we need five parameters to describe the curve. The speed of the GHT can be improved only if we make certain assumptions about the curve (i.e., if the curve is circular, elliptic, etc.), and if the gradient information is available [8], [16], [26], [38]. Also, in spite of recent advances [28], [44], if the edge points are somewhat scattered around the ideal curve (or surface), then peak detection is very difficult in multidimensional Hough space due to bin splitting. Moreover, the detection of small segments is virtually impossible, since small peaks in the GHT are lost in the bias. The GHT also suffers from a high probability of spurious peaks [22]. Most importantly, peaks in the GHT correspond to "majority fits" and not "best fits."

The computational complexity of all the algorithms presented in this paper is O(NCK), where N is the number of points, C is the number of clusters, and K is the number of iterations. If we have a good initialization procedure, the number of iterations K can be kept low. This compares very favorably with the complexity of the GHT. The memory requirement of these algorithms, which is O(NC), is very low compared to that of the GHT. A more thorough comparison with the GHT is possible only for specific types of curves. An excellent comparison of the shell clustering approach with the GHT for the case of circles and ellipses may be found in [16]. In [17], Davé and Fu also show how the GHT with a crude discretization of parameter space can provide

a good initialization for the shell clustering algorithms, thus significantly reducing the computational burden. This is a fuzzy generalization of the method suggested by O'Gorman and Clowes [40], who obtain a crude segmentation of the data set using the HT and then fit lines to obtain more accurate results. This approach is particularly viable when the parameter space is of low dimensionality.

Finally, we would like to note that, to our knowledge, no general proof of convergence has been presented for any of the shell clustering algorithms, although in practice these algorithms always seem to converge. This is an important topic that needs to be researched in the future.

APPENDIX A

In this appendix, we consider the computation of the exact distance from a point x_j to the curve p_i in the 2-D case. We first note that in (12), A_i, b_i and c_i are obtained by partitioning the parameter vector p_i: A_i is the symmetric matrix of second-degree coefficients (with off-diagonal elements of the form p_ir/2), b_i = [p_i(r+1), ..., p_i(r+n)]^T, and c_i = p_i(r+n+1), where r = n(n + 1)/2.

We first rotate the cluster prototype p_i and the point x_j so that the matrix A_i becomes diagonal. This does not change the distance. The angle of rotation α_i in the 2-D case is given by

    α_i = (1/2) tan^{-1} [p_i3 / (p_i1 - p_i2)].

Equations (12) and (13) can now be rewritten in the rotated coordinate system, where x'_j and z' denote the locations of points x_j and z after rotation. It is easily verified that

    x_j = R_i x'_j  and  z = R_i z'

where

    R_i = [  cos α_i   sin α_i ]
          [ -sin α_i   cos α_i ].

Since A'_i is a diagonal matrix, (I - λA'_i)^{-1} can be easily inverted, and from (A3) we obtain

    z'^T = [ (2x'_j1 + λp'_i4) / (2(1 - λp'_i1)),  (2x'_j2 + λp'_i5) / (2(1 - λp'_i2)) ].    (A4)

Substituting (A4) into (A2) yields the following quartic equation in λ:

    c_4 λ^4 + c_3 λ^3 + c_2 λ^2 + c_1 λ + c_0 = 0    (A5)

where the coefficients c_4, ..., c_0 are polynomials in p'_i1, p'_i2, p'_i4, p'_i5, p'_i6 and the components of x'_j, obtained directly from the substitution.

APPENDIX B
SUMMARY OF SECOND-DEGREE CURVE AND SURFACE TYPES

A. Two-Dimensional Case

The nature of the graph of the general quadratic equation in x1 and x2 given by

    p1 x1^2 + p2 x2^2 + p3 x1 x2 + p4 x1 + p5 x2 + p6 = 0

is described in Table I in terms of the values of the corresponding invariants.

B. Three-Dimensional Case

The nature of the graph of the general quadratic equation in x1, x2, and x3 given by

    p1 x1^2 + p2 x2^2 + p3 x3^2 + p4 x1 x2 + p5 x1 x3 + p6 x2 x3 + p7 x1 + p8 x2 + p9 x3 + p10 = 0

is described in Table II.
-Sinai COSQ; 1 is described in Table 11.

TABLE I
TWO-DIMENSIONAL QUADRATIC CURVE TYPES

TABLE II
THREE-DIMENSIONAL QUADRATIC SURFACE TYPES

    Quadric surface                Nonzero k's of the same sign?
    Real ellipsoid                 yes
    Imaginary ellipsoid            yes
    Hyperboloid of one sheet       no
    Hyperboloid of two sheets      no
    Real quadric cone              no
    Imaginary quadric cone         yes
    Elliptic paraboloid            yes
    Hyperbolic paraboloid          no
    Real elliptic cylinder         yes
    Imaginary elliptic cylinder    yes
    Hyperbolic cylinder            no
    Real intersecting planes       no
    Imaginary intersecting planes  yes
    Parabolic cylinder             -
    Real parallel planes           -
    Imaginary parallel planes      -
    Coincident planes              -

In Table II, ρ3 and ρ4 are the ranks of the 3 × 3 matrix of second-degree coefficients and of E, respectively, where

    E = [ p1     p4/2   p5/2   p7/2 ]
        [ p4/2   p2     p6/2   p8/2 ]
        [ p5/2   p6/2   p3     p9/2 ]
        [ p7/2   p8/2   p9/2   p10  ],

Δ is the determinant of E, and k1, k2 and k3 are the roots of

    | p1 - k   p4/2     p5/2   |
    | p4/2     p2 - k   p6/2   | = 0.
    | p5/2     p6/2     p3 - k |

APPENDIX C
CONVERSION OF PATHOLOGICAL PROTOTYPES TO LINES

By checking the conditions satisfied by the parameters of the prototype according to Table I, we may determine if the cluster is a pathological case (i.e., if it is a hyperbola, a pair of lines or an extremely elongated ellipse). If a cluster is a pathological case, it is converted to lines. This is done using the line detection algorithm in Section V. The initialization is carried out by determining the equations of the line(s) that might best describe the pathological clusters. Then each of the points belonging to the pathological clusters is crisply assigned to the line prototype to which it is closest. This provides a very good initial partition for the G-K algorithm. The procedures for identifying the line prototypes are described below.

A. Line Detection from a Hyperbola or a Pair of Intersecting Lines

Each hyperbola should be split into two lines using the following procedure, provided it is not a very "flat" hyperbola. The case of a very "flat" hyperbola will be discussed later. Before running the G-K algorithm, the linear prototypes are initialized to be the asymptotes of the hyperbola. Finding the equations of these asymptotes is quite simple if the matrix A_i is diagonalized as in Appendix A. After rotation, the equation of each hyperbola becomes

    p'_i1 x1^2 + p'_i2 x2^2 + p'_i4 x1 + p'_i5 x2 + p'_i6 = 0

where p'_i1, p'_i2, p'_i4, and p'_i5 are the rotated coefficients obtained in Appendix A. It is easy to show that the two asymptotes of the hyperbola defined by the above equation are given by

    x2 - c_i1 x1 + c_i0 = 0,  and  x2 + c_i1 x1 + c_i2 = 0

where

    c_i1 = sqrt(-p'_i1 / p'_i2),
    c_i0 = p'_i5 / (2 p'_i2) - c_i1 p'_i4 / (2 p'_i1),  and
    c_i2 = p'_i5 / (2 p'_i2) + c_i1 p'_i4 / (2 p'_i1).

After the prototypes have been computed, they are rotated back to their original space. Sometimes, the FCQS algorithm will fit a pair of intersecting lines instead of a hyperbola. When cluster p_i is a pair of intersecting lines, the above equations characterize the lines themselves instead of the asymptotes of the hyperbola, thus making the initialization even better.
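The asymptote computation used to initialize the line prototypes can be sketched as follows. The formulas are our own derivation from the rotated hyperbola equation (the asymptotes pass through the hyperbola's center with slopes ±sqrt(-p'_i1/p'_i2)); the constant names c1, c0, c2 follow the appendix's convention:

```python
import numpy as np

def hyperbola_asymptotes(p1, p2, p4, p5):
    """Asymptotes of p1 x1^2 + p2 x2^2 + p4 x1 + p5 x2 + p6 = 0 (p1 * p2 < 0),
    returned as (c1, c0, c2) for the lines
    x2 - c1*x1 + c0 = 0  and  x2 + c1*x1 + c2 = 0."""
    c1 = np.sqrt(-p1 / p2)                     # slope magnitude of the asymptotes
    xc, yc = -p4 / (2 * p1), -p5 / (2 * p2)    # center of the rotated hyperbola
    c0 = c1 * xc - yc                          # line 1 passes through the center
    c2 = -c1 * xc - yc                         # line 2 passes through the center
    return c1, c0, c2
```

For example, for (x - 2)^2 - (y - 3)^2 = 1, i.e., p1 = 1, p2 = -1, p4 = -4, p5 = 6, the routine returns the lines y = x + 1 and y = -x + 5, which are indeed the asymptotes through the center (2, 3).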

B. Line Detection from a Pair of Parallel Lines or a Very Elongated Ellipse or a Very Flat Hyperbola

If cluster p_i is a pair of parallel lines, after the prototype is rotated, the two lines can be either parallel to the x1 axis or to the x2 axis. If the lines are parallel to the x1 axis, then p'_i1 ≈ p'_i4 ≈ 0, and the equations of the two lines are given by

    x2 = c_i1  and  x2 = c_i2

where c_i1 and c_i2 are the two roots of p'_i2 x2^2 + p'_i5 x2 + p'_i6 = 0. On the other hand, if the lines are parallel to the x2 axis, then p'_i2 ≈ p'_i5 ≈ 0, and the equations of the two lines are given by

    x1 = c_i3  and  x1 = c_i4

where

    c_i3 = (-p'_i4 - sqrt(p'_i4^2 - 4 p'_i1 p'_i6)) / (2 p'_i1),  and
    c_i4 = (-p'_i4 + sqrt(p'_i4^2 - 4 p'_i1 p'_i6)) / (2 p'_i1),

i.e., the two roots of p'_i1 x1^2 + p'_i4 x1 + p'_i6 = 0.

When the two lines are not exactly parallel, but form a small angle between them, the FCQS algorithm will sometimes fit one very elongated ellipse or a very flat hyperbola instead of a pair of parallel lines. An ellipse can be categorized as very elongated if

    Major Axis Length / Minor Axis Length > CL    (C1)

where

    Major Axis Length = 2 sqrt(F / p'_i1),  Minor Axis Length = 2 sqrt(F / p'_i2),

with

    F = p'_i4^2 / (4 p'_i1) + p'_i5^2 / (4 p'_i2) - p'_i6

(assuming p'_i1 ≤ p'_i2, so that the major axis is along x1), and CL is chosen to be about 10. Similarly, we may also assume that a hyperbola can be classified as very flat if

    Conjugate Axis Length / Transverse Axis Length > CL.    (C2)

If the transverse axis is parallel to the x1 axis, i.e., if

    F / p'_i1 > 0    (C3)

then

    Transverse Axis Length = 2 sqrt(F / p'_i1)  and  Conjugate Axis Length = 2 sqrt(-F / p'_i2),

and in this case, Condition (C2) reduces to

    -p'_i1 / p'_i2 > CL^2.

On the other hand, if the inequality in (C3) is reversed, then the above expressions for the conjugate and transverse axis lengths are interchanged, and the negative sign inside the root appears in the expression for the transverse axis length. In this case, Condition (C2) reduces to

    -p'_i2 / p'_i1 > CL^2.

When one of the FCQS algorithms fits either a very elongated ellipse or a very flat hyperbola, the equations for the lines will be computed using the equations derived for parallel lines.

ACKNOWLEDGMENT

The authors are grateful to the anonymous reviewers for their valuable comments, which improved the presentation and contents of this paper considerably.

REFERENCES

[1] G. J. Agin, "Fitting ellipses and general second-order curves," Dept. of Comput. Sci., Carnegie Mellon Univ., Res. Rep., Jul. 1987.
[2] A. Albano, "Representation of digitized contours in terms of conic arcs and straight-line segments," Computer Graphics and Image Processing, vol. 3, pp. 23-33, 1974.
[3] D. H. Ballard, "Generalizing the Hough transform to detect arbitrary shapes," Pattern Recognition, vol. 13, no. 2, pp. 111-122, 1981.
[4] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
[5] J. C. Bezdek and R. H. Hathaway, "Numerical convergence and interpretation of the fuzzy C-shells clustering algorithm," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 787-793, Sept. 1992.
[6] R. H. Biggerstaff, "Three variations in dental arch form estimated by a quadratic equation," J. Dental Res., vol. 51, p. 1509, 1972.
[7] F. L. Bookstein, "Fitting conic sections to scattered data," Computer Graphics and Image Processing, vol. 9, pp. 56-71, 1979.
[8] D. Casasent and R. Krishnapuram, "Curved object location by Hough transformations and inversions," Pattern Recognition, vol. 20, no. 2, pp. 181-188, 1987.
[9] D. S. Chen, "A data-driven intermediate level feature extraction algorithm," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-11, no. 7, pp. 749-758, July 1989.
[10] C. Coray, "Clustering algorithms with prototype selection," in Proc. Hawaii Intern. Conf. Syst. Sci., Jan. 1981, pp. 945-955.
[11] D. B. Cooper and N. Yalabick, "On the computational cost of approximating and recognizing noise-perturbed straight lines and quadratic arcs in the plane," IEEE Trans. Comput., vol. 25, no. 10, pp. 1020-1032, Oct. 1976.
[12] R. N. Davé, "Use of the adaptive fuzzy clustering algorithm to detect lines in digital images," in Proc. SPIE Conf. Intell. Robots and Computer Vision, SPIE vol. 1192, no. 2, pp. 600-611, 1989.
[13] R. N. Davé, "Fuzzy shell-clustering and application to circle detection in digital images," Int. J. Gen. Syst., vol. 16, pp. 343-355, 1990.
[14] R. N. Davé, "Characterization and detection of noise in clustering," Pattern Recognition Lett., vol. 12, no. 11, pp. 657-664, 1991.
[15] R. N. Davé, "Robust fuzzy clustering algorithms," in Proc. Second IEEE Conf. Fuzzy Syst., San Francisco, Mar.-Apr. 1993, pp. 1281-1286.
[16] R. N. Davé and K. Bhaswan, "Adaptive fuzzy C-shells clustering and detection of ellipses," IEEE Trans. Neural Networks, vol. 3, no. 5, pp. 643-662, Sept. 1992.
[17] R. Davé and T. Fu, "Robust shape detection using fuzzy clustering: Practical applications," to appear in Fuzzy Sets and Systems, Special Issue on Pattern Recognition and Computer Vision, 1994.
[18] D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty. New York: Plenum, 1988.
[19] O. D. Faugeras and M. Hebert, "The representation, recognition, and positioning of 3D shapes from range data," in Techniques for 3D

Machine Perception, A. Rosenfeld, Ed. Amsterdam, The Netherlands: Elsevier, 1986, pp. 113-148.
[20] I. Gath and A. B. Geva, "Unsupervised optimal fuzzy clustering," IEEE Trans. Pattern Anal. Machine Intell., vol. 11, no. 7, pp. 773-781, Jul. 1989.
[21] R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations. New York: Wiley, 1977.
[22] W. E. L. Grimson and D. P. Huttenlocher, "On the sensitivity of the Hough transform for object recognition," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, no. 3, pp. 255-274, 1990.
[23] E. E. Gustafson and W. C. Kessel, "Fuzzy clustering with a fuzzy covariance matrix," in Proc. IEEE CDC, San Diego, CA, 1979, pp. 761-766.
[24] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, vol. I. Reading, MA: Addison-Wesley, 1992, Appendices.
[25] R. J. Hathaway and J. C. Bezdek, "Switching regression models and fuzzy clustering," IEEE Trans. Fuzzy Syst., vol. 1, no. 3, pp. 195-204, Aug. 1993.
[26] J. Illingworth and J. Kittler, "A survey of the Hough transform," Computer Vision, Graphics, and Image Processing, vol. 44, no. 1, pp. 87-116, Oct. 1988.
[27] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[28] J.-M. Jolion, P. Meer, and S. Bataouche, "Robust clustering with applications in computer vision," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 8, pp. 791-801, Aug. 1991.
[29] J.-M. Jolion and A. Rosenfeld, "Cluster detection in background noise," Pattern Recognition, vol. 22, no. 5, pp. 603-607, 1989.
[30] R. Krishnapuram and C.-P. Freg, "Fitting an unknown number of lines and planes to image data through compatible cluster merging," Pattern Recognition, vol. 25, no. 4, pp. 385-400, 1992.
[31] R. Krishnapuram, H. Frigui, and O. Nasraoui, "New fuzzy shell clustering algorithms for boundary detection and pattern recognition," in Proc.
[45] R. D. Sampson, "Fitting conic sections to 'very scattered' data: An iterative refinement of the Bookstein algorithm," Computer Graphics and Image Processing, vol. 18, pp. 97-108, 1982.
[46] G. Taubin, "Estimation of planar curves, surfaces, and nonplanar space curves defined by implicit equations with applications to edge and range image segmentation," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 11, pp. 1115-1138, Nov. 1991.
[47] I. Weiss, "Straight line fitting in a noisy image," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1988, pp. 647-652.
[48] P. Whaite and F. P. Ferrie, "From uncertainty to visual exploration," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 10, pp. 1038-1049, Oct. 1991.
[49] L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility," Fuzzy Sets and Systems, vol. 1, pp. 3-28, 1978.

Raghu Krishnapuram (S'83-M'84) received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, Bombay, in 1978. He obtained the M.S. degree in electrical engineering from Louisiana State University, Baton Rouge, in 1985 and the Ph.D. degree in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, in 1987.
Dr. Krishnapuram was with Bush India, Bombay, for a year, where he participated in developing electronic audio entertainment equipment. From 1979 to 1982, he was a deputy engineer at Bharat Electronics Ltd., Bangalore, India, manufacturers of defense equipment. He is currently an Associate
SPIE Conf. Robotics and Computer Vision, Boston, Nov. 1991, SPIE Professor in the Electrical and Computer Engineering Department at the
vol. 1607, pp. 458-465. University of Missouri, Columbia. In 1993, he visited the European Laboratory
-, ”Quadratic shell clustering algorithms and the detection of for Intelligent Techniques Engineering (ELITE), Aachen, Germany, as a
second degree curves,” Pattern Recognition Lett., vol. 14, no. 7, Jul. Humboldt Fellow. His current research interests are many aspects of computer
1993, pp. 545-552. vision and pattern recognition as well as applications of fuzzy set theory and
-, “A fuzzy clustering algorithm to detect planar and quadric neural networks to pattern recognition and computer vision.
shapes,’’ in Proc. N . Am. Fuzzy Inform. Process. Soc. Workshop, Puerto
Vallarta, Mexico, vol. I, Dec. 1992, pp. 59-68.
R. Krishnapuram and J. M. Keller, “A possibilistic approach to clus-
tering,” IEEE Transactions on Fuzzy Systems, vol. 1, no. 2, May 1993,
pp. 98-110.
-, “Fuzzy and possibilistic clustering methods for computer vi- Hichem Frigui received the B.S. degree in electncal
sion,” in Neural Fuzzy Syst., S. Mitra, M. Gupta, and W. Kraske, Eds., and computer engineering in 1990 and the M.S.
SPIE Institute Series, vol. IS-12, 1994, pp. 133-159. degree in electrical engineenng in 1992, both from
R. Krishnapuram, 0. Nasraoui and H. Frigui, “The fuzzy C spherical the University of Missoun, Columbia.
shells algorithms: A new approach,” IEEE Trans. on Neural Networks, From 1992 to 1994 he worked with IDEE, Tunis,
vol. 3, no. 5, Sept. 1992, pp. 663471. where he participated in the development of banking
D. G. Lowe, “Fitting parametrized three-dimensional models to images,” software applications. He is currently pursuing
IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 5, pp. 441-450, the Ph.D. degree in electrical engineering at the
May 1991. University of Missouri, Columbia.
V. Milenkovic, “Multiple resolution search techniques for the Hough His current research interests include pattem
transform in high dimensional parameter spaces,” in A. Rosenfeld, Ed., recognition, computer vision, fuzzy set theory, and
Techniquesfor 3 0 Machine Perception. Amsterdam, The Netherlands: artificial intelligence.
Elsevier, 1986, pp. 231-255.
J. J. Moore, “The Levenberg-Marquardt algorithm: Implementation and
theory,” in Numerical Analysis, G. A, Watson, Ed., Lecture Notes in
Mathematics. Berlin: Springer-Verlag. 1977, pp. 105-1 16.
F. O’Gorman and M. B. Clowes, “Finding picture edges through
collinearity of feature points,” IEEE Trans. Comput., vol. 25, 1976, pp. Olfa Nasraoui received the B.S. degree in electrical
133-142. and computer engineering, and the M.S. degree
K. Paton, “Conic sections in chromosome analysis,” Pattern Recogni- in electrical engineering, in 1990 and 1992,
tion, vol. 2, no. I, pp. 39-51, Jan. 1970. respectively, from the University of Missouri,
V. Pratt, “Direct least squares fitting of algebraic surfaces,” Computer Columbia.
Graphics, vol. 21, no. 4, pp. 145-152, 1987. She worked as a Software Engineer with IDEE,
J. Princen, J. Illingworth, and J. Kittler, “A hierarchical approach to line Tunis, from 1992 to 1994 and is currently pursuing
extraction based on the Hough transform,” Computer Vision, Graphics the Ph.D. degree in electical engineering at the
and Image Processing, vol. 52, 1990, pp. 57-77. University of Missouri, Columbia.
-, “Hypothesis testing: A framework for analyzing and optimizing Her current research interests include pattem
the Hough transform performance,” IEEE Trans. Partern Anal. Machine recognition, computer vision, neural networks, and
Intell., vol. 16, no. 4, pp. 329-341, Apr. 1994. applications of fuzzy set theory.
