
Robust Fuzzy Clustering Algorithms

Rajesh N. Dave
Department of Mechanical and Industrial Engineering
New Jersey Institute of Technology
Newark, New Jersey 07102

Abstract - A class of fuzzy clustering algorithms based on a recently introduced "noise cluster" concept is proposed. A "noise prototype" is defined such that it is equidistant from all the points in the data-set. This allows for detection of clusters amongst data with or without noise. It is shown that this concept is applicable to all the generalizations of fuzzy or hard k-means algorithms. Various applications are also considered. Application of this concept to a variety of regression problems is also considered. It is shown that the results of this approach are comparable to many robust regression techniques. The paper concludes with a summary and directions for future work.

Index Terms - Fuzzy clustering, robust clustering, image processing, noisy data, cluster analysis, pattern recognition, robust regression

I. INTRODUCTION

Fuzzy c-means (FCM) algorithms and their generalizations have found continued applications in a variety of areas including image processing [1-7]. The major problems with these algorithms, however, are poor performance when there is noise in the data, and the requirement that the number of clusters in the data is known a priori. Several techniques have been proposed to counter these problems, for example [7-9]. The problem of an unknown number of clusters has been addressed in many different ways [8,9], the usual practice being the use of cluster validity measures to determine the correct number of clusters [10]. The problem of noisy data has been a difficult one to solve. In theory, the FCM algorithms have a zero breakdown point, i.e., even a single outlier may completely throw off the prototypes. In fact, most clustering algorithms, whether they are based on the principle of least-squared error minimization or not, are not robust against noise. From the practical point of view, real applications involve analysis of data with some amount of noise. Therefore, in many cases, an FCM type algorithm may have little practical value unless special care is taken to handle the noisy data.

Amongst the different techniques proposed to handle noisy data [7-9,11], the recently introduced concept of "noise clustering" [7] appears to have the best ability to handle noisy data. This approach is applicable to all squared-error type algorithms, either hard or fuzzy. In this paper, we show that this approach is very general, and applies to all the generalizations of the FCM algorithm. It also applies to a variety of regression problems. In what follows, the background on noise clustering is presented first, followed by the application of this approach to a variety of fuzzy clustering algorithms, including examples. Applications to regression problems are also presented, followed by the summary and recommendations for further work.

II. NOISE CLUSTERING ALGORITHM

The main idea in the noise clustering algorithm is the concept of a "noise prototype". Although defining a separate cluster to dump noise points is not a new idea, the idea of defining the noise itself as a prototype is new. The concept of noise as a prototype requires definition. Following the notation in Dave [7], the noise prototype is defined below.

Noise prototype: The noise prototype is a universal entity such that it is always at the same distance from every point in the data-set. Let v_n be the noise prototype, and x_k be a point in feature space, v_n, x_k ∈ R^p. Then the noise prototype is such that the distance d_nk, the distance of point x_k from v_n, is

    d_nk = δ,  for all k.   (1)

Although the above definition does not tell us what the distance δ is, it does imply that all the points in the data-set are at the same distance from the noise cluster, and thus it defines the noise prototype.

Next, we re-formulate the conventional FCM algorithm using this concept. Let there be c good clusters in the data-set, so we add one cluster, i.e., the (c+1)th cluster, as the noise cluster. Hereafter, we denote n = c+1. Then the functional J_n including the noise cluster is defined as

    J_n(U, V) = Σ_{i=1}^{n} Σ_{k=1}^{N} (u_ik)^m (d_ik)^2,   (2)

where the distances are defined by

    (d_ik)^2 = (x_k - v_i)^T A_i (x_k - v_i),  for all k and i = 1 to c,   (3a)

and

    (d_ik)^2 = δ^2,  for i = n (= c+1).   (3b)
Here, x_k is the feature vector of point k (k = 1 through N, the number of points), and v_i is the cluster prototype of class i. The distances are measured through the norm induced by the symmetric, positive definite matrices A_i, m is the exponent, 1 < m < ∞, and u_ik is the membership of point k in class i. Assuming that the distance δ is specified, the minimization gives the following equations,

    u_ik = 1 / [ Σ_{j=1}^{c} (d_ik^2 / d_jk^2)^{1/(m-1)} + (d_ik^2 / δ^2)^{1/(m-1)} ],   (4)

and

    v_i = Σ_{k=1}^{N} (u_ik)^m x_k / Σ_{k=1}^{N} (u_ik)^m,  for i = 1 to c.   (5)

The memberships, including the noise membership, are now required to satisfy

    Σ_{i=1}^{n} u_ik = 1,  for all k,   (6)

instead of the following in the conventional FCM algorithm,

    Σ_{i=1}^{c} u_ik = 1.   (7)

Although the effect of this can best be realized through examples, we argue here that this is a significant difference, because in this case a noise point may be assigned a total membership in the good clusters that is close to zero, while in the original algorithm it must have a total membership equal to one.
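The update equations (4) and (5) are straightforward to code. The following NumPy sketch is illustrative only (it is not the implementation used for the experiments reported here); it assumes the Euclidean case A_i = I and a fixed, pre-specified noise distance δ (delta in the code):

```python
import numpy as np

def nc_distances(X, V, delta):
    """Squared distances per (3a)-(3b) for A_i = I: rows 0..c-1 hold the
    Euclidean distances to the c good prototypes, row c holds delta**2."""
    c = V.shape[0]
    d2 = np.empty((c + 1, X.shape[0]))
    for i in range(c):
        diff = X - V[i]                          # (N, p) differences x_k - v_i
        d2[i] = np.einsum('kp,kp->k', diff, diff)
    d2[c] = delta ** 2                           # noise prototype row, eq. (3b)
    return d2

def nc_memberships(d2, m=2.0):
    """Membership update, eq. (4): columns of the returned (c+1, N) array
    sum to one, with the last row being the noise membership."""
    inv = (1.0 / np.maximum(d2, 1e-12)) ** (1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def nc_prototypes(X, u, m=2.0):
    """Prototype update, eq. (5), for the c good clusters only."""
    w = u[:-1] ** m                              # (c, N) weights (u_ik)^m
    return (w @ X) / w.sum(axis=1, keepdims=True)
```

Iterating nc_memberships and nc_prototypes from an initial guess of V gives the noise-clustering counterpart of FCM; for an outlying point, the good-cluster rows of u sum to well below one, which reflects the change from constraint (7) to constraint (6).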
The noise distance δ is a critical parameter in the above algorithm, and would be different for different problems. Ideally, it should be based on the statistics of the data-set. The following recommendation is made in [7],

    δ^2 = λ [ (1 / (N c)) Σ_{i=1}^{c} Σ_{k=1}^{N} (d_ik)^2 ],   (8)

where λ is the value of the multiplier used to obtain δ from the average of the distances. Even λ may be based on some higher order statistics of the data.
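In code, recommendation (8) amounts to a one-liner; the sketch below is illustrative only, with lam standing for the multiplier λ and d2_good holding the current (c, N) array of squared distances to the good prototypes:

```python
import numpy as np

def noise_distance_sq(d2_good, lam=1.0):
    """Eq. (8): delta**2 = lam times the average of the N*c squared
    distances between the data points and the c good prototypes."""
    return lam * float(np.mean(d2_good))
```

One option is to recompute this at every iteration, before the membership update, so that δ tracks the current scale of the clusters.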

We present one example here to illustrate the ability of this technique to correctly locate the cluster prototypes in noisy data. An example of three compact clusters is shown in Fig. 1(a). These clusters are generated using a uniform random distribution of points centered around the prototypes listed in Table I, plus some randomly added noise. The prototypes detected using the conventional FCM algorithm are shown in Fig. 1(b), and the prototypes detected using the above technique are shown in Fig. 1(c). The numerical values of the detected prototypes are given in Table I. This example clearly shows that the proposed method is robust against noise. In the next section, we show that this method is applicable to a variety of fuzzy clustering algorithms, and show one example of real image data where lines are detected using this approach.

Fig. 1(a). Three clusters with noise.
Fig. 1(b). Output of FCM on data of Fig. 1(a).
Fig. 1(c). Output of Noise Clustering on data of Fig. 1(a).
TABLE I
NUMERICAL RESULTS FOR THE EXAMPLE IN FIG. 1

III. GENERALIZED NOISE CLUSTERING

Extending this concept to other generalizations of FCM algorithms is straightforward. In each case, the noise prototype and the noise distance are defined the same as before, i.e., by equations (1) and (8). One cluster is added to the expected number of clusters, and then the minimization is carried out. For any generalization, the functional is also the same as in equation (2), but the definition of the distance and the prototypes would be different for each different algorithm. We consider the algorithm in [12], called the GK algorithm, as an example.

In the GK algorithm, the cluster prototype is defined by the center v_i and the norm-inducing symmetric positive definite matrix A_i. The definition of the distance from the prototypes is as in (3), where the A_i are variables. Minimization of the functional with the constraint det(A_i) = ρ_i yields the following for A_i,

    A_i = [ρ_i det(S_fi)]^{1/p} (S_fi)^{-1},  1 ≤ i ≤ c,   (9)

where S_fi is the fuzzy scatter matrix of cluster i, given by

    S_fi = Σ_{k=1}^{N} (u_ik)^m (x_k - v_i)(x_k - v_i)^T.   (10)

The usual practice is to take ρ_i = 1 for all i. The solution for u_ik and v_i is the same as in (4) and (5). This algorithm has the ability to adapt the norm, and hence the shape, of the cluster. In most practical applications it detects the structure of the cluster rather than imposing a structure as the FCM algorithms would. With the noise cluster included in the functional, it is capable of performing well in the presence of noise.
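To make the extension concrete, the sketch below (an illustration, not the code used for these experiments) adds the GK steps (9) and (10) on top of the noise-clustering distance matrix, taking ρ_i = 1 (rho in the code) and reading the feature dimension p from the prototypes:

```python
import numpy as np

def ngk_distances(X, V, u, delta, m=2.0, rho=1.0):
    """One NGK distance update: fuzzy scatter matrices S_fi (eq. 10),
    norm-inducing matrices A_i (eq. 9), and the (c+1, N) squared-distance
    array whose last row is the constant noise distance delta**2."""
    c, p = V.shape
    d2 = np.empty((c + 1, X.shape[0]))
    w = u[:c] ** m                                   # fuzzy weights (u_ik)^m
    for i in range(c):
        diff = X - V[i]                              # (N, p) differences
        S = (w[i][:, None] * diff).T @ diff          # fuzzy scatter matrix, eq. (10)
        A = (rho * np.linalg.det(S)) ** (1.0 / p) * np.linalg.inv(S)   # eq. (9)
        d2[i] = np.einsum('kp,pq,kq->k', diff, A, diff)
    d2[c] = delta ** 2
    return d2
```

The memberships and centers are then updated exactly as in (4) and (5), so the noise handling carries over unchanged.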
Extension of any similar algorithm to a "noise clustering" algorithm would be handled in a similar fashion, and as can be seen, is straightforward. This is the simplicity of this concept. Although we have not presented the steps of the algorithm here to conserve space, it can be realized that coding the "noise clustering" algorithm involves little extra effort.

The GK algorithm [12] or the Adaptive Fuzzy Clustering (AFC) algorithm [13] can automatically detect linear structure in the data. In combination with the "noise clustering" algorithm, they make powerful candidates for line detection in edge maps of digital intensity images. An example of the image of the top of a cube is considered. The edge map is shown in Fig. 2(a), showing a substantial amount of noise. The results of line detection using the original GK algorithm are shown in Fig. 2(b), while the results with the "noise clustering" version, called NGK, are shown in Fig. 2(c). In these figures, the crossing lines represent the shape and the extent of the clusters found by the algorithm. In Fig. 2(c), solid circles represent the points identified as noisy points. As can be seen, the presence of noise has severely corrupted the performance of the GK algorithm, while the results of the NGK algorithm are excellent.

In the next section, an important class of fuzzy clustering algorithms based on the concept of Fuzzy c-Shells (FCS) clustering [4] is considered. These algorithms, coupled with the "noise clustering" concept, provide a powerful tool for detecting circles, ellipses and quadric curves in images.

IV. EXTENSION OF FCS TYPE ALGORITHMS

In FCS type algorithms, the cluster prototype is a hyper-spherical shell, hence the distances are measured from this curved hyper-surface. The distance can be defined as

    (D_ik)^2 = ( [(x_k - v_i)^T A_i (x_k - v_i)]^{1/2} - 1 )^2.   (11)

In the above, the matrix A_i is a symmetric positive definite matrix that takes care of the size and orientation of the shell. The noise clustering functional based on the above is the same as (2), and the solutions for v_i and A_i are the same as in Dave and Bhaswan [5].
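For completeness, a sketch of the shell distance (11) with the usual constant noise row appended is shown below; this is an illustration only, and the prototype updates for v_i and A_i themselves follow Dave and Bhaswan [5]:

```python
import numpy as np

def shell_distances(X, V, A, delta):
    """Squared shell distances, eq. (11), plus the constant noise row."""
    c = V.shape[0]
    d2 = np.empty((c + 1, X.shape[0]))
    for i in range(c):
        diff = X - V[i]
        q = np.einsum('kp,pq,kq->k', diff, A[i], diff)   # (x_k-v_i)^T A_i (x_k-v_i)
        d2[i] = (np.sqrt(np.maximum(q, 0.0)) - 1.0) ** 2
    d2[c] = delta ** 2
    return d2
```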
A specific example is considered to demonstrate its applicability to a motion analysis problem. Using a high-speed video imaging system, it is intended to measure the rotation of freely colliding spheres [14]. To calibrate the measurement procedure, the system is used to measure the known rotation speed of a motor shaft. The shaft is marked with five markers, and then, by finding the center of each marker in successive frames, the rotation rate is computed. The edge data of the markers and the shaft outline are shown in Fig. 3(a). The circles fitted using the noise FCS algorithm are shown in Fig. 3(b), while the final results of the calibration experiment are shown in Fig. 3(c). The overall error in the computation was less than 0.5%. It was found that this approach resulted in a highly accurate computation of the marker centers. Use of Hough transforms in this application required a very large amount of computation. To achieve the level of accuracy required in the center location, either a very large accumulator array has to be used, or one has to start with a coarse resolution and then iteratively use finer resolutions. Using the latter strategy requires many passes through the algorithm, since we are interested in finding a large number of circles.

Fig. 2(a). Edge map of an image of the top of a cube.
Fig. 2(b). Output of GK for the data of Fig. 2(a).
Fig. 2(c). Output of Noise-GK for the data of Fig. 2(a).
Fig. 3(a). Edge data of the shaft with markers.
Fig. 3(b). Circles detected using noise FCS algorithm.
Fig. 3(c). Plot of computed rotation and actual rotation versus time (ms).
V. ROBUST REGRESSION THROUGH NOISE CLUSTERING

Robust regression has been an active area of research for many years; see, for example, Rousseeuw and Leroy [15]. For simple curve fitting, the standard method of least-squared error minimization (LS fit) is not robust against noise. Therefore, a method such as least median of squares (LMS) is recommended, which works well even when up to half of the points are noisy. The concept of noise clustering can also be applied to the standard LS method to make it robust against noise. We formulate the problem as follows.

Given N pairs (x_k, y_k), k = 1 to N, find the "best fit" function f(a, x), where a is the vector of the parameters of the function f. For a line fit, a could be a vector with two parameters, i.e., slope and intercept. We minimize the following functional,

    J_NLS(a, u, w) = Σ_{k=1}^{N} [ (u_k)^m (f(a, x_k) - y_k)^2 + (w_k)^m (δ_k)^2 ],   (12)

where δ_k is the distance of the point from the noise class. All δ_k are fixed to a constant value. A point has a membership u_k in the good class, and w_k in the noise class. The following conditions on the memberships are also used,

    0 ≤ u_k ≤ 1,  0 ≤ w_k ≤ 1,  and  u_k + w_k = 1.   (13)

Minimization of (12) with respect to the parameters a is essentially the same as in the conventional LS fit, while using the Lagrange multiplier technique and a certain amount of re-arrangement, we obtain the following for the memberships,

    u_k = 1 / ( 1 + [ (f(a, x_k) - y_k)^2 / (δ_k)^2 ]^{1/(m-1)} ),   (14)

and

    w_k = 1 / ( 1 + [ (δ_k)^2 / (f(a, x_k) - y_k)^2 ]^{1/(m-1)} ).   (15)
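A minimal sketch of the resulting alternating scheme for a straight-line f(a, x) is given below; it is illustrative only, assuming a fixed noise distance δ (delta) and a weighted least-squares solve for the parameter step:

```python
import numpy as np

def noise_ls_line(x, y, delta, m=2.0, iters=50):
    """Noise LS (NLS) line fit: alternate the weighted LS solve for the
    parameters a with the membership updates (14) and (15)."""
    a = np.polyfit(x, y, 1)                      # ordinary LS start (slope, intercept)
    for _ in range(iters):
        r2 = (np.polyval(a, x) - y) ** 2         # squared residuals (f(a, x_k) - y_k)^2
        u = 1.0 / (1.0 + (r2 / delta ** 2) ** (1.0 / (m - 1.0)))   # eq. (14)
        w = 1.0 - u                              # eq. (15), since u_k + w_k = 1
        a = np.polyfit(x, y, 1, w=np.sqrt(u ** m))   # minimizes sum_k u_k^m r_k^2
    return a, u
```

Points with large residuals receive w_k close to one and are effectively excluded from the fit, which is how the NLS fit can ignore gross outliers.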
The above can be applied to the line fit problem. An example of telephone call data from [15] (page 25) is considered. Fig. 4 shows the data plotted as solid circles. The conventional LS fit does not pick up the desired trend in the data, while the noise LS (NLS) fit using the above formulation picks up the correct trend according to [15]. The results of LMS are similar.

Another example is that of locating the center of multivariate data, where one tries to fit an ellipsoidal shape to the data. Once again, an example from [15] (page 261) is used, as shown in Fig. 5. Part (a) shows the original data, while part (b) shows the correct ellipsoid, discounting the effect of the outliers, as identified by the noise clustering algorithm. The figure shows solid circles as outliers, and the two lines show the principal axes of the ellipsoid. Similar results are obtained using the MVE (minimum volume ellipsoid) estimator presented in [15].

Fig. 4. Noise LS fit to the telephone call data from [15].
Fig. 5(a). The star cluster data from [15].
Fig. 5(b). Best ellipsoid fit using noise clustering approach.
VI. CONCLUSIONS

The results presented show the potential of noise clustering algorithms in improving the performance of fuzzy clustering algorithms. Although not explicitly shown here, this concept is equally applicable to hard k-means type algorithms [7]. Through this approach, one can solve one of the major problems associated with the FCM and hard k-means type algorithms. The examples shown here are from real digital image analysis applications. It is shown that this approach can be successfully used to find lines and curves in digital images.

Application to regression analysis problems shows another new area for application of this approach. In terms of curve fitting, this approach represents an important new tool in robust data fitting. In multivariate data analysis, this approach competes well with techniques like MVE estimators. In terms of computational cost, it is expected that this approach has an advantage over other statistical methods. Another major advantage of this approach is its applicability to handling multiple lines or ellipsoids in the data-set.

The ideas presented here need further improvements in terms of developing better strategies to define the noise distance δ and the multiplier λ. In addition, the algorithm must select these quantities adaptively, so that δ is changed to a smaller value as the algorithm progresses. The main shortcoming of the FCM type algorithms is that there is no guarantee of global convergence. This disadvantage is also present in the proposed modification. It is expected that the use of techniques such as progressive clustering [9] can significantly improve the performance, and also overcome the other shortcoming, i.e., the need to know the number of clusters.

ACKNOWLEDGMENT

The author wishes to thank Kurra Bhaswan for coding the algorithms, and Jim Yu for providing the results of Fig. 3(c). Partial support for this work was received from USDOE contract DE-AC22-91PC90181, and NSF grant MSS-90068322.

REFERENCES

1. J. C. Bezdek and S. K. Pal, Fuzzy Models for Pattern Recognition, IEEE Press, New York, 1992.
2. R. N. Dave, "Boundary detection through fuzzy clustering," Invited Paper, IEEE International Conference on Fuzzy Systems, San Diego, California, March 8-12, pp. 127-134, 1992.
3. R. L. Cannon, J. V. Dave, J. C. Bezdek and M. M. Trivedi, "Segmentation of a thematic mapper image using the fuzzy c-means clustering algorithm," IEEE Trans. on Geosci. and Remote Sensing, vol. GE-24(3), pp. 400-408, 1986.
4. R. N. Dave, "Fuzzy shell-clustering and applications to circle detection in digital images," International J. of General Systems, vol. 16, pp. 343-355, 1990.
5. R. N. Dave and K. Bhaswan, "Adaptive fuzzy c-shells clustering and detection of ellipses," IEEE Trans. on Neural Networks, vol. 3(5), 1992.
6. R. Krishnapuram, O. Nasraoui and H. Frigui, "The fuzzy c-shells algorithm: A new approach," IEEE Trans. on Neural Networks, vol. 3(5), 1992.
7. R. N. Dave, "Characterization and detection of noise in clustering," Pattern Recognition Letters, vol. 12(11), pp. 657-664, 1991.
8. R. Krishnapuram and C.-P. Freg, "Fitting an unknown number of lines and planes to image data through compatible cluster merging," Pattern Recognition, 1992.
9. R. N. Dave and K. J. Patel, "Progressive fuzzy clustering algorithms for characteristic shape recognition," Proceedings of NAFIPS'90, pp. 121-124, 1990.
10. J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
11. J. Jolion and A. Rosenfeld, "Cluster detection in background noise," Pattern Recognition, vol. 22(5), pp. 603-607, 1989.
12. E. E. Gustafson and W. C. Kessel, "Fuzzy clustering with a fuzzy covariance matrix," in Proc. IEEE CDC, San Diego, Calif., pp. 761-766, 1979.
13. R. N. Dave, "Use of the adaptive fuzzy clustering algorithm to detect lines in digital images," Intelligent Robots and Computer Vision VIII, vol. 1192, pp. 600-611, 1989.
14. A. D. Rosato, R. N. Dave, I. S. Fischer and W. N. Carr, "Development of a non-intrusive particle tracing technique for granular chute flows," Quarterly Progress Report, US DOE contract DE-AC22-91PC90181, Pittsburgh Energy Technology Center, April 1992.
15. P. J. Rousseeuw and A. M. Leroy, Robust Regression and Outlier Detection, John Wiley, New York, 1987.
