A New Fuzzy Clustering Validity Index With a Median Factor for Centroid-Based Clustering
Abstract—Determining the number of clusters, which is usually approved by domain experts or evaluated by clustering validity indexes, is an important issue in clustering analysis. This study discusses the effectiveness of clustering validity indexes for centroid-based partitional clustering algorithms. Most general-purpose clustering validity indexes take the minimum/maximum distance between a pair of data objects, a pair of cluster centroids, or an object and a centroid as an important evaluation factor; however, they may present unstable results, especially when two centroids are allocated closely. To alleviate this problem, a new clustering validity index, termed the Wu-and-Li index (WLI), is proposed in this paper. Our proposed WLI partially allows, to some extent, the existence of closely allocated centroids in the clustering results by considering not only the minimum but also the median distance between a pair of centroids, therefore possessing better stability. The performances of WLI and some existing clustering validity indexes are evaluated and compared by running the fuzzy c-means algorithm for clustering various types of datasets, including artificial datasets, UCI datasets, and images. Experimental results have shown that WLI has more accurate and satisfactory performance than the other indexes.

Index Terms—Clustering analysis, clustering validity index (CVI), fuzzy c-means (FCM) clustering algorithm, partitional clustering algorithm.

I. INTRODUCTION

CLUSTERING is an unsupervised learning process that collects data objects with homogeneous features into the same cluster and discriminates clusters with heterogeneous features of objects. The task of clustering is to partition a given dataset into groups of homogeneous data that may reveal meaningful information for advanced data analysis, such as data mining, signal processing, and image analysis [1]–[4]. For clustering analysis, three issues are considered [5], i.e., clustering tendency assessment, cluster analysis, and cluster validation. The purpose of clustering tendency assessment is to explore the cluster substructure of datasets, and some formal and informal treatments [6], [7] have been discussed to address this issue. Cluster analysis aims to discover the corresponding clusters of the existing substructure. Various clustering algorithms have been developed; they can be categorized into centroid-based clustering, density-based clustering, distribution-based clustering, hierarchical clustering, and relational clustering, depending on the clustering models they use. More comprehensive overviews of clustering algorithms can be found in [8]–[11].

Cluster validation purports to validate the discovered substructure or clusters. Two kinds of approaches for cluster validation are employed, i.e., comparison indices (external evaluation) and validity indices (internal evaluation). Comparison indices are designed to compare pairs of candidate partitions with each other, or with a reference partition. The related applications of such indices can be found in [12] and [13]. For example, Anderson et al. [5] proposed a method to generalize comparison indices for crisp, fuzzy, probabilistic, or possibilistic candidate partitions and applied the generalization method to modify the Rand-index [12], resulting in a fuzzy Rand-index with a lower order of complexity. Moreover, Anderson et al. [14] developed a new method for comparing fuzzy, probabilistic, or possibilistic candidate partitions based on the earth mover's distance and the ordered weighted average. The other kind of validation approach, i.e., validity indices, focuses on evaluating each candidate partition separately. Determining a proper number (K) of clusters and the corresponding partition is an important issue in clustering analysis, which is usually approved by domain experts or evaluated by a clustering validity index (CVI). A CVI is an evaluation function that measures, relatively, the quality of K discovered clusters [15], [16] in each candidate partition. Therefore, the best partition can be decided as the one with the largest or smallest CVI value among all candidate partitions. Up to now, many CVIs have been proposed for various clustering algorithms. Most of them are developed for centroid-based clustering, as mentioned later, and the others are designed for other kinds of clustering. For example, Conn_Index [17] is designed for evaluating prototype-based clustering of datasets with a wide variety of cluster types. However, it is restricted to prototype-based clustering and not applicable to point-based clustering, in which data points are clustered directly. Liang et al. [18] proposed an index S for deciding the optimal K and the corresponding clusters from a set of consecutive partitions that are obtained by their proposed hierarchical clustering algorithm, A-FCM. Sledge et al. [19] developed a framework to convert general CVIs to a relational form for relational data. Besides applying CVIs in a postclustering manner, some approaches estimate the number of clusters during clustering. For example, Frigui and Krishnapuram [20] proposed a competitive agglomeration algorithm to progressively transform an initialized
partition with a larger number of smaller clusters into an optimal partition with fewer larger clusters. In [21], the clustering problem is viewed as a multiobjective optimization problem with two proposed objective functions, i.e., the mean value of the cluster validity index I and its average difference. The archived multiobjective simulated annealing algorithm is utilized to solve the multiobjective problem. A single solution is chosen from the archive of final solutions in a semisupervised manner that assumes the existence of some labeled points. Besides, a comparison study of validity indexes for swarm-intelligence-based clustering can be found in [22].

This study concentrates on the performance of CVIs for validating centroid-based clustering, especially fuzzy c-means (FCM), in a postclustering manner. A typical centroid-based clustering finds K centroids of clusters and then assigns each data object to the nearest cluster iteratively. Usually, an optimization problem of minimizing the overall distances among data objects and centroids is explored. K-means and FCM, as well as their variations [23]–[25], are typical centroid-based clustering methods that are widely used in practical applications because of their simplicity of implementation and integration with postprocessing. When using FCM, the number K of clusters has to be known in advance. Various CVIs have been proposed for analyzing the K clusters decided by FCM, such as the partition coefficient (PC) and partition entropy (PE) [23], Dunn index (Dunn) [26], Calinski–Harabasz index (CHI) [27], Davies–Bouldin index (DBI) [28], Xie and Beni index (XBI) [29], Fukuyama and Sugeno index (FSI) [30], SC index (SCI) [31], CS index (CSI) [32], partition coefficient and exponential separation (PCAES) index [33], Pakhira–Bandyopadhyay–Maulik index (PBMF) [34], and OS index (OSI) [35]. The visual cluster validity and correlation cluster validity indexes [36] are both developed from a cluster validity framework and are based on visual comparison and correlation, respectively. Several indexes mentioned previously and other indices are extended and evaluated with a symmetry property in [37].

Among the previously mentioned CVIs, some are designed for general-purpose clustering and evaluate clustering results with simple evaluation functions, whereas some consider specific geometric structures of data with complicated functions. Usually, the Euclidean distance between a pair of data objects, a pair of cluster centroids, or an object and a centroid is calculated to evaluate the homogeneity of objects within a cluster or the heterogeneity of clusters. With such distance calculations, the compactness and separation of clusters can be measured. The compactness measure evaluates the concentration of data objects that belong to the same cluster, whereas the separation measure evaluates the isolation among clusters. In addition to compactness and separation measures, an overlap measure is proposed in [35] to evaluate the degree of overlap of a specified number of fuzzy clusters. Some CVIs further consider the geometric structures of clusters, which are also based on distance calculations.

In most CVIs, minimizing the distances among data objects and centroids comes along with a result of a large K. It is easy to verify that, as was presented in [24], many CVIs approach their maximum/minimum bounds with a larger number K of clusters, because a larger K produces smaller clusters with fewer data objects, and the distances among data and centroids surely become smaller. Some CVIs avoid such cases by considering a smaller distance between two centroids as a bad clustering result and introducing a minimum term of centroid distances in their formulas. The design of such CVIs can successfully avoid the aforementioned problem caused by too-close centroids when K is large if the data objects of the underlying problem are intrinsically separable. General-purpose FCM clustering considers the case of allocating two centroids closely as a bad clustering result; hence, most CVIs avoid such cases by magnifying/diminishing the corresponding CVI values. However, clustering data objects into two geographically close groups may be acceptable in some practical applications. For example, clustering on image pixels is commonly used for image segmentation. Features of image pixels are usually colors, positions, or textures. With the aforementioned CVIs, two groups of image pixels having similar color features but representing different meanings may be clustered into one by FCM. The CVIs that avoid too-close centroid distances may not exactly evaluate the quality of such clustering problems. Section III will give an illustrative example.

Current CVIs may not properly evaluate clustering results in which closely allocated clusters are allowed and may provide misleading information to the data analysts. This study discusses several existing CVIs and designs a new CVI, termed the Wu-and-Li index (WLI), that mitigates the too-close-centroids problem. WLI considers the distance between each data object and its corresponding cluster centroid when evaluating the compactness of clusters. Additionally, for evaluating the separation of clusters, WLI considers not only the minimum but also the median distance between each pair of centroids. Therefore, WLI partially allows, to some extent, the existence of closely allocated centroids. Comparisons between several CVIs and the proposed WLI are also made by clustering various datasets, including artificial, UCI, and image datasets, in several experiments. Experimental results have shown that WLI has better performance in helping decide the number of clusters for FCM-based clustering.

The remainder of this paper is organized as follows. In Section II, FCM and several popular CVIs are reviewed. The motivations of this study are described in Section III, together with an illustrative example. The new CVI for clustering is presented in Section IV. Section V shows the experimental results. Finally, the conclusion is given in Section VI.

II. BACKGROUND AND RELATED WORK

In this section, FCM is briefly reviewed and the CVIs to be compared with the proposed one are introduced. The following terminologies are used for describing CVIs.
1) N is the number of data objects for clustering.
2) m is the fuzzifier, which determines the level of cluster fuzziness.
3) x_i is the ith, 1 ≤ i ≤ N, data object.
4) K is the number of clusters.
5) C_k is the kth, 1 ≤ k ≤ K, cluster.
6) |C_k| is the number of data objects in the kth cluster.
7) v_k is the centroid of the kth cluster.
8) \bar{v} is the centroid of all data objects, i.e., \bar{v} = \frac{1}{N}\sum_{i=1}^{N} x_i.
9) \|x - y\| is the distance between a pair of data objects, a pair of cluster centroids, or an object and a centroid x and y.
10) \mu_{ik} is the membership degree of x_i "belonging to" C_k.

A. Fuzzy C-Means Algorithm and Clustering Validity Indexes

The FCM algorithm, which was developed by Dunn [26] and improved by Bezdek [23], is a widely used clustering method. FCM performs an iterative process that assigns each data object to each cluster with a certain membership degree through the minimization of the following objective function:

\sum_{i=1}^{N}\sum_{k=1}^{K} \mu_{ik}^{m}\,\|x_i - v_k\|^2, \quad m \ge 1. \tag{1}

Initially, FCM starts with a given number K of clusters and randomly chooses K centroids. The above objective function is optimized by iteratively updating \mu_{ik} and v_k as

\mu_{ik} = \frac{1}{\sum_{j=1}^{K}\left(\frac{\|x_i - v_k\|}{\|x_i - v_j\|}\right)^{\frac{2}{m-1}}} \tag{2}

v_k = \frac{\sum_{i=1}^{N}\mu_{ik}^{m}\,x_i}{\sum_{i=1}^{N}\mu_{ik}^{m}}. \tag{3}

The iteration stops when \|U_{p+1} - U_p\| < \varepsilon, where U_p = [\mu_{ik}] is the matrix that is composed of all \mu_{ik}'s, p is the number of iterations, and \varepsilon is a threshold given by the user. Otherwise, new centroids are calculated according to (3) and the iteration goes on.
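For concreteness, the alternating updates (2) and (3) and the stopping rule above can be sketched in a few lines of NumPy. This is a minimal illustration of the procedure as just described, not the authors' code; the function name fcm and its default settings are our own.

    import numpy as np

    def fcm(X, K, m=2.0, eps=1e-5, max_iter=300, seed=None):
        # X: (N, d) data matrix; K: number of clusters; m: fuzzifier (m > 1 here).
        rng = np.random.default_rng(seed)
        N = X.shape[0]
        V = X[rng.choice(N, size=K, replace=False)]      # random initial centroids
        U = np.zeros((N, K))
        for _ in range(max_iter):
            # Squared distances ||x_i - v_k||^2, shape (N, K); a floor avoids 0/0.
            d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
            d2 = np.maximum(d2, 1e-12)
            # Eq. (2): mu_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1)), via squared distances.
            U_new = 1.0 / ((d2[:, :, None] / d2[:, None, :]) ** (1.0 / (m - 1.0))).sum(axis=2)
            # Stop when ||U_{p+1} - U_p|| < eps, as in the text.
            if np.linalg.norm(U_new - U) < eps:
                return U_new, V
            U = U_new
            # Eq. (3): centroids as mu^m-weighted means of the data.
            W = U ** m
            V = (W.T @ X) / W.sum(axis=0)[:, None]
        return U, V

A call such as U, V = fcm(X, K=4) returns the membership matrix U = [\mu_{ik}] and the centroids v_1, ..., v_K that the CVIs reviewed next operate on.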
To evaluate the quality of a clustering result, some CVIs have been proposed. The following are the CVIs discussed in this paper.
1) The PC and PE [23] are two commonly used indexes for evaluating the clustering validity. PC evaluates the averaged strength of "belongingness" of data as

PC(K) = \frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}\mu_{ik}^{2} \tag{4}

whereas PE is an entropy measurement that is calculated by taking the logarithmic form of PC, i.e.,

PE(K) = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K}\mu_{ik}\log_2(\mu_{ik}). \tag{5}

2) The Dunn [26] investigates the distances between two data objects that belong to the same cluster, as well as the distances between two clusters. The definition of the Dunn is as follows:

\mathrm{Dunn}(K) = \min_{1\le s\le K-1}\ \min_{s+1\le t\le K}\left\{\frac{dis(C_s, C_t)}{\max_{1\le k\le K} dia(C_k)}\right\} \tag{6}

where dis(C_s, C_t) = \min_{x_i\in C_s,\,x_j\in C_t}\|x_i - x_j\| and dia(C_k) = \max_{x_i, x_j\in C_k}\|x_i - x_j\|.
3) The CHI [27] explores the spatial relationships among data objects by reconciling the agglomerative and the divisive methods, i.e.,

CHI(K) = \frac{B_K}{K-1}\Big/\frac{W_K}{N-K} \tag{7}

where B_K = \sum_{k=1}^{K}|C_k|\,\|v_k - \bar{v}\|^2 and W_K = \sum_{k=1}^{K}\sum_{x_i\in C_k}\|x_i - v_k\|^2.
4) The DBI [28] indicates the similarity of clusters on their data density, which is evaluated as a decreasing function of distance, i.e.,

DBI(K) = \frac{1}{K}\sum_{k=1}^{K}\max_{j\ne k}\left\{\frac{S_j + S_k}{\|v_j - v_k\|^2}\right\} \tag{8}

where S_k = \frac{1}{|C_k|}\sum_{x_i\in C_k}\|x_i - v_k\|^2.
5) The XBI [29] evaluates overall compact and separate fuzzy c-partitions according to the dataset, the geometric distance measure, the distance between cluster centroids, and the fuzzy partition, i.e.,

XBI(K) = \frac{\sum_{k=1}^{K}\sum_{i=1}^{N}\mu_{ik}^{2}\,\|x_i - v_k\|^2}{N\cdot\min_{i\ne j}\{\|v_i - v_j\|^2\}}. \tag{9}

6) FSI [30] is calculated with the following equation:

FSI(K) = \sum_{k=1}^{K}\sum_{i=1}^{N}\mu_{ik}^{m}\,\|x_i - v_k\|^2 - \sum_{k=1}^{K}\sum_{i=1}^{N}\mu_{ik}^{m}\,\|v_k - \bar{v}\|^2. \tag{10}

7) The SCI [31] evaluates the compactness–separation ratio by combining two functions SC_1 and SC_2, i.e.,

SC(K) = SC_1(K) - SC_2(K) \tag{11}

where SC_1 considers the geometrical properties and membership degrees of data

SC_1(K) = \frac{\frac{1}{K}\sum_{k=1}^{K}\|v_k - \bar{v}\|^2}{\sum_{k=1}^{K}\left(\sum_{i=1}^{N}\mu_{ik}^{m}\,\|x_i - v_k\|^2 \Big/ \sum_{i=1}^{N}\mu_{ik}\right)} \tag{12}

and SC_2 considers the properties of membership degrees

SC_2(K) = \frac{\sum_{k=1}^{K-1}\sum_{j=k+1}^{K}\left(\sum_{i=1}^{N}(\min(\mu_{ik}, \mu_{ij}))^2\Big/ n_{kj}\right)}{\sum_{i=1}^{N}\max_{1\le k\le K}\mu_{ik}^{2} \Big/ \sum_{i=1}^{N}\max_{1\le k\le K}\mu_{ik}}. \tag{13}

Note that n_{kj} = \sum_{i=1}^{N}\min(\mu_{ik}, \mu_{ij}) in (13).
8) The CSI [32] deals with clustering with different densities and/or sizes. It evaluates the ratio of
compactness–separation of data objects and the centroids, i.e.,

CSI(K) = \frac{\sum_{k=1}^{K}\frac{1}{|C_k|}\sum_{x_j\in C_k}\max_{x_i\in C_k}\{\|x_j - x_i\|\}}{\sum_{j=1}^{K}\min_{i\ne j}\{\|v_i - v_j\|\}}. \tag{14}

9) The PCAES index [33] uses the factors from a normalized PC and an exponential separation measure for each cluster and then pools these two factors to create the PCAES validity index

PCAES(K) = \sum_{k=1}^{K}\sum_{i=1}^{N}\frac{\mu_{ik}^{2}}{\mu_M} - \sum_{k=1}^{K}\exp\left(\frac{-\min_{h\ne k}\{\|v_k - v_h\|^2\}}{\beta_T}\right) \tag{15}

where \mu_M = \min_{1\le k\le K}\{\sum_{i=1}^{N}\mu_{ik}^{2}\} and \beta_T = \frac{1}{K}\sum_{k=1}^{K}\|v_k - \bar{v}\|^2.
10) The PBMF [34] is the fuzzy variant of PBM [38], which emphasizes the compactness within clusters and a large separation between clusters

PBMF(K) = \frac{\max_{j\ne k}\{\|v_j - v_k\|\} \times \sum_{i=1}^{N}\mu_{i1}\,\|x_i - v_1\|}{K\sum_{k=1}^{K}\sum_{i=1}^{N}\mu_{ik}^{m}\,\|x_i - v_k\|}. \tag{16}

11) The OSI [35] employs a measure of multiple cluster overlap and a separation measure for each data point, both based on an aggregation operation of membership degrees

OSI(K) = \frac{1}{N}\sum_{i=1}^{N}\frac{\perp^{1}_{l=2,K}\left(\perp^{l}_{k=1,K}\mu_{ik}\right)}{\perp^{1}\big(\underbrace{\perp^{1}_{k=1,K}\mu_{ik}, \ldots, \perp^{1}_{k=1,K}\mu_{ik}}_{K-1\ \text{times}}\big)} \tag{17}

where \perp^{l}_{k=1,K}\mu_{ik} is defined as the l-order fuzzy-OR operator on \{\mu_{ik} \mid k = 1, 2, \ldots, K\} [39].

B. Remarks

Some characteristics of the above CVIs are discussed as follows.
1) A basic clustering criterion is to make sure that the data objects with homogeneous features are grouped into the same cluster, whereas the ones across different clusters are heterogeneous. For this purpose, the compactness and separation of clusters are commonly measured, as used in the above CVIs. The compactness measures the concentration of data objects within a cluster. Objects with homogeneous features should have small distances to each other. This can be evaluated by the intradistance between each pair of objects in the cluster, i.e., \|x_i - x_j\|, i \ne j, or between each cluster object and the corresponding cluster centroid, \|x_i - v_k\|, where x_i, x_j \in C_k. The smaller the above distances, the stronger the compactness of a cluster. On the other hand, the separation measures the degree of isolation among clusters. Objects with heterogeneous features should be separated as far as possible. This can be evaluated by the interdistance between each pair of centroids, i.e., \|v_k - v_h\|, k \ne h. Sometimes, the interdistance between two heterogeneous objects from two different clusters is calculated instead, i.e., \|x_i - x_j\|, x_i \in C_k, x_j \in C_h. The larger the above distances, the better the separation of clusters.
2) With changes in the centroid locations, the aforementioned distances present various values. Some CVIs only consider the crisp distances, such as Dunn, CHI, DBI, and CSI. In some clustering problems, the size and distribution of data objects may affect the clustering results. Therefore, some CVIs investigate the properties of the fuzzy membership degrees and the structure of clusters, such as PC, PE, XBI, FSI, SCI, PCAES, PBMF, and OSI.
3) General-purpose CVIs evaluate the overall clusters, whereas some CVIs emphasize the evaluation of unique characteristics of clusters. For example, PC and PE only consider the compactness of clusters but do not take the structure of the dataset into account. CVIs like DBI, XBI, FSI, and SCI consider the compactness of clusters and the structure of data of the overall clustering results but lack considerations of compactness–separation in each cluster. In some CVIs, such as PBMF, the maximum distance between a pair of centroids, i.e., MAX = \max_{i\ne j}\{\|v_i - v_j\|^2\}, is considered, and good clustering results are those whose centroids are separately allocated, which is not reasonable in image clustering. Conversely, in some CVIs, such as XBI, CSI, and PCAES, the minimum distance between a pair of centroids, MIN = \min_{i\ne j}\{\|v_i - v_j\|^2\}, is considered.
4) Some simple CVIs consider only the fuzzy membership degree \mu_{ik}, reflecting the belongingness of a data object x_i to a cluster centroid v_k, such as PC and PE. Advanced CVIs consider the distance between x_i and v_k or between v_h and v_k and calculate the average values of such distances (like CHI and FSI), the minimal distances (like XBI, CSI, and PCAES), or the maximal distances (like PBMF) and use them as a key component in the CVI formulas.
5) A CVI usually works with clustering methods independently as a postprocessing step. An optimal result of K clusters comes up with the maximum/minimum CVI value with respect to various settings of K. According to the structure of these CVIs, the optimal number of clusters can be decided when PC, Dunn, CHI, SCI, PCAES, and PBMF reach their maximum values, or, in reverse, when PE, DBI, XBI, FSI, CSI, and OSI reach their minimum values.
For convenience, this paper denotes a larger-the-better CVI as CVI+ and a smaller-the-better CVI as CVI−.
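To make the remarks above concrete, here is how three of the reviewed indexes reduce to a few lines once U and V are in hand. This is our own transcription of (4), (5), and (9) in NumPy, with helper names of our choosing, not code from the cited papers.

    import numpy as np

    def pc(U):
        # Eq. (4): averaged squared membership strength (larger is better).
        return (U ** 2).sum() / U.shape[0]

    def pe(U):
        # Eq. (5): partition entropy (smaller is better); 0*log2(0) is taken as 0.
        safe = np.where(U > 0, U, 1.0)
        return -(U * np.log2(safe)).sum() / U.shape[0]

    def xbi(X, U, V):
        # Eq. (9): fuzzy compactness over N times MIN = min_{i!=j} ||v_i - v_j||^2.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        compactness = ((U ** 2) * d2).sum()
        c2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        min_sep = c2[~np.eye(V.shape[0], dtype=bool)].min()
        return compactness / (X.shape[0] * min_sep)

The min_sep quantity in xbi is exactly the MIN factor noted in remark 3); the next section shows why it becomes fragile when two centroids nearly coincide.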
III. MOTIVATION

No matter what criteria are used by CVIs, the intradistances and interdistances are the primary evaluation factors. Many CVIs are designed based on the assumption of high compactness and separation and inform FCM to separate cluster centroids as far as possible. Sometimes, this assumption does not hold. Below is an example. Suppose four groups of 20 data objects, as shown in Fig. 1(a), are clustered. Among the clusters, Group-0 and Group-1 are closely allocated, whereas Group-2 and Group-3 are relatively far away from the other groups. Visually, two and five are improper numbers of clusters for these data objects. Three clusters also seem reasonable, but four clusters is the correct number. By performing FCM on these data objects with K = 2, 3, 4, and 5, respectively, centroids of the clusters are decided, as presented in Fig. 1(b)–(e). Three CVIs, XBI, CSI, and PBMF, are discussed, as listed in Table I.

Fig. 1. Separation and closeness of centroids. (a) Data distribution. (b) Clustering with K = 2. (c) Clustering with K = 3. (d) Clustering with K = 4. (e) Clustering with K = 5.

TABLE I
CLUSTERING RESULTS IN VARIOUS CVIS

K   MIN      MAX      XBI−       CSI−       PBMF+
2   1.4954   1.4954   0.1913     0.8977     (0.4705)
3   1.5128   3.6534   (0.0396)   (0.3685)   0.2443
4   0.1625   4.5385   0.1599     0.3744     0.1943
5   0.0154   4.5948   1.5068     0.4156     0.1571

(...): the best value of the CVI in the column.

Because Group-0 and Group-1 are closely allocated, their centroids are also closely allocated when K ≥ 4. The minimum and maximum distances between a pair of centroids, i.e., MIN and MAX, become smaller and larger, respectively, as K increases. In XBI and CSI, the term MIN is used in the denominator to avoid the case of clustering with too closely allocated centroids. In this example, the minimum and maximum distances between a pair of centroids are 0.0154 and 4.5948, respectively, when K = 5. XBI, CSI, and PBMF evaluate the clustering results and suggest two or three groups, which, however, is obviously not the best answer. This misleading information is given because the distance between the centroids of Group-0 and Group-1 is physically small and makes XBI and CSI produce large CVI values.
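The trend in Table I is easy to reproduce qualitatively. The short script below is our own construction, reusing the fcm sketch from Section II-A; the group layout only imitates the flavor of Fig. 1(a) and is not the paper's exact data. It prints MIN and MAX over the centroid pairs for K = 2, ..., 5; once K ≥ 4, MIN collapses toward 0 while MAX keeps growing.

    import numpy as np

    # Four synthetic groups of 20 points each, two of them deliberately close.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [0.6, 0.0],    # Group-0, Group-1: close pair
                        [5.0, 0.0], [0.0, 5.0]])   # Group-2, Group-3: well separated
    X = np.vstack([c + 0.2 * rng.standard_normal((20, 2)) for c in centers])

    for K in range(2, 6):
        U, V = fcm(X, K, seed=0)                   # fcm: the sketch from Section II-A
        gaps = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)
        pairs = gaps[~np.eye(K, dtype=bool)]
        print(K, round(pairs.min(), 4), round(pairs.max(), 4))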
Some real-world applications reveal the existence of close centroids. For example, in medical records analysis, some hereditary diseases have similar symptoms that are also observed for other diseases and can only be identified by some recessive symptoms. When clustering all medical records, recessive diseases may be grouped together with other diseases and cannot be identified [40]. Other examples are from image analysis for medical diagnosis [41] and for monitoring debris flow, coasts, or marine pollution [42], [43]. Image pixels presenting areas of marine pollution or debris flow may be hard to identify if they have very similar pixel values.

This study is motivated by the above observations. Below are the main concepts of this study on designing a new CVI.
1) The compactness and separation are two essential measures for CVIs and should be considered when designing a new CVI. Some CVIs consider the structure of clusters or the distribution of data; they are effective for datasets containing such specific structures. Experimental results show that such considerations are not always effective because such structures are not always detectable. An effective detecting function is usually costly and computationally complicated to incorporate with CVIs. In this study, we intend to design a simple but effective CVI. The simple compactness and separation measures will be considered in the proposed CVI.
2) Regarding the minimum distance between a pair of centroids that is usually considered in the separation measure, because a smaller MIN usually comes up with a large K, an effective CVI should consider the distance factor and prevent too fine-grained clustering results. However, in practical clustering analysis, centroids are allowed to be closely allocated, such as in clustering of color pixels for image segmentation and in function approximation where data objects are crowded. When clustering on such crowded data objects, FCM unavoidably allocates centroids very close to each other. Unfortunately, some MIN-related CVIs consider such closely allocated centroids, which may be acceptable in practical applications, as bad results. An effective CVI should allow, to some extent, two centroids to be allocated closely.

IV. PROPOSED INDEX

A. Index Structure

We propose a new CVI, termed the Wu-and-Li index (WLI), for evaluating fuzzy clustering results. The main characteristic of WLI is the introduction of the median distance between a pair of centroids. Similar to the existing CVIs, WLI evaluates the
image dataset. The cluster number depends on the granularity of interest of the analysis. Usually, level-wise pixel clustering may be performed iteratively; each iteration focuses on analyzing some specific image segments that are extracted from previous iterations. In this experiment, we are mainly interested in whether CVIs suggest reasonable cluster numbers. For deciding the cluster number, each image is first reviewed by domain experts, who identify objects that should be noted. Remarkable objects in an image are targets for image analysis, as well as useful for investigating the clustering quality. For example, roads, a sandbank, a shoal, and rooftops are remarkable objects in Img1–Img4. The shapes of aircraft are remarkable in Img4. Brain tissue and neoplasm are notable objects in Img5–Img6. Image analysts investigate clustering results to see if the shapes and edges of such objects are preserved after clustering. In such a way, a reasonable range of numbers of clusters is determined for each image.

Fig. 2. Distributions of the artificial datasets. (a) AD1 (800 objects). (b) AD2 (300 objects). (c) AD3 (300 objects). (d) AD4 (450 objects).

Fig. 3. Image datasets. (a) Img1. (b) Img2. (c) Img3. (d) Img4. (e) Img5. (f) Img6.

In the experiments, FCM performs clustering with various K values. Because FCM initializes centroids randomly, a dataset may not have identical clustering results in different rounds of FCM. To ensure the quality of the experiments, the following procedure is deployed. For each dataset, a round of FCM is performed with K = 1, . . . , k. Then, every CVI evaluates the clustering results. Among the k values of the same CVI, an extreme (minimal or maximal) CVI value is found. Increase C_K^* by 1 if the extreme CVI value is observed when K is the cluster number. Perform ten rounds of FCM and accumulate all C_K^*. A large C_K^* means that extreme CVI values concentrate on cases when K clusters are partitioned by FCM. The best cluster number K* decided by a CVI is the K with the largest C_K^*, i.e.,

K^* = \arg\max_{K=1,\ldots,k} C_K^*. \tag{36}

The K* clusters with the extreme CVI value are analyzed. Additionally, the instability of FCM is observed by the coefficient of variation (CV) of cluster centroids. Let V = \{v_1, v_2, \ldots, v_k\} and U = \{u_1, u_2, \ldots, u_k\} be two sets of k centroids, and let u_i be the nearest neighboring centroid of v_i, 1 ≤ i ≤ k. CV is defined as the ratio of the standard deviation to the mean of all d_i = \|v_i - u_i\|, 1 ≤ i ≤ k. Let #C = [a, b] be a reasonable range of cluster numbers for a dataset; a and b are integers and a ≤ b. For AD1–AD4, HABE, and WDBC, a = b, because their cluster numbers are known with certainty. For IRIS and IMG, a < b, and #C is a numerical range. Suppose we have R CVIs, I_1, \ldots, I_R, and D datasets, S_1, \ldots, S_D. The following indicators evaluate the experiment results.
1) Diff (problem difficulty): the number of CVIs that cannot effectively decide the cluster number of a dataset S_d, 1 ≤ d ≤ D, i.e.,

\mathrm{Diff}(S_d) = \sum_{r=1}^{R} e(I_r), \quad e(I_r) = \begin{cases} 1, & \text{if } K_r^* \notin [a, b] \\ 0, & \text{else.} \end{cases} \tag{37}

Note that K_r^* is the K* of I_r. This denotes the difficulty of clustering a dataset.
2) Eff (effectiveness ratio): the ratio of the number of datasets whose cluster numbers a CVI I_r, 1 ≤ r ≤ R, can effectively decide to the total number of datasets, i.e.,

\mathrm{Eff}(I_r) = \frac{1}{D}\sum_{d=1}^{D}\delta(S_d), \quad \delta(S_d) = \begin{cases} 1, & \text{if } K_r^* \in [a, b] \\ 0, & \text{else.} \end{cases} \tag{38}

A CVI with a high Eff ratio can be considered effective, and vice versa.
3) CV* (average CV): the average CV over all FCM rounds with a specific number K of clusters.
Most CVIs that are presented in Section II postevaluate a clustering result of K partitions, except CSI. CSI needs to refer to all results with all various K. An optimal CSI value is determined after all rounds of FCM on all K have been done. The remaining CVIs refer to the result of each round of FCM with a specific K and calculate the extreme values.

B. Deciding Cluster Numbers

TABLE III
CLUSTERING RESULTS: AD. (a) CV* VALUES OF ALL DATASETS. (b) COUNTS OF C*_K OF ALL CVIS AMONG TEN ROUNDS OF FCM. (c) CLUSTER NUMBERS DECIDED BY CVIS
In (b), an entry K(c) means that K* = K was observed in c of the ten rounds.

(a)
dataset  K=2    K=3     K=4    K=5    K=6    K=7    K=8    K=9    K=10   Average
AD1      0.000  0.074   0.000  0.000  0.021  0.144  0.000  0.011  0.050  0.033
AD2      0.000  0.000   0.007  0.259  0.088  0.238  0.273  0.209  0.140  0.135
AD3      0.000  0.000   0.000  1.129  0.002  0.099  1.398  0.165  0.136  0.325
AD4      0.392  12.443  4.004  0.513  0.374  0.598  0.317  0.330  0.427  2.155

(b)
AD1: PC+ 4(10); PE− 2(10); CHI+ 4(10); DBI− 4(10); XBI− 4(10); FSI− 4(10); SCI+ 4(10); CSI− 4(10); PCAES+ 10(10); PBMF+ 4(10); Dunn+ 4(10); OSI− 4(9),5(1); WLI− 4(10)
AD2: PC+ 2(1),3(9); PE− 2(1),3(9); CHI+ 2(10); DBI− 2(1),3(9); XBI− 2(1),3(9); FSI− 5(1),6(5),7(2),8(2); SCI+ 4(2),5(1),6(1),7(2),8(2),9(2); CSI− 2(1),3(9); PCAES+ 5(1),7(1),8(3),10(5); PBMF+ 2(1),3(9); Dunn+ 3(9),5(1); OSI− 2(10); WLI− 2(1),3(9)
AD3: PC+ 3(10); PE− 3(10); CHI+ 3(10); DBI− 3(10); XBI− 3(10); FSI− 7(3),8(2),9(4),10(1); SCI+ 9(1),10(9); CSI− 3(10); PCAES+ 8(4),10(6); PBMF+ 2(9),3(1); Dunn+ 3(10); OSI− 3(10); WLI− 3(10)
AD4: PC+ 2(7),3(2),4(1); PE− 2(10); CHI+ 6(8),7(1),9(1); DBI− 3(8),4(2); XBI− 3(8),4(2); FSI− 8(1),9(6),10(3); SCI+ 8(1),9(5),10(4); CSI− 4(9),5(1); PCAES+ 9(1),10(9); PBMF+ 2(7),3(2),4(1); Dunn+ 3(1),4(8),7(1); OSI− 3(1),4(9); WLI− 3(7),8(1),10(2)

(c)
dataset  #C  PC+  PE−  CHI+  DBI−  XBI−  FSI−  SCI+  CSI−  PCAES+  PBMF+  Dunn+  OSI−  WLI−  Diff
AD1      4   4    2*   4     4     4     4     4     4     10*     4      4      4     4     2
AD2      3   3    3    2*    3     3     6*    8*    3     10*     3      3      2*    3     5
AD3      3   3    3    3     3     3     9*    10*   3     10*     2*     3      3     3     4
AD4      2   2    2    6*    3*    3*    9*    9*    4*    10*     2      4*     4*    3*    10

*: wrong or out of the acceptable range of clusters.

1) AD Datasets: Table III presents statistics on the clustering results of AD. First, the CV* values are discussed. The average CV* values of AD1, AD2, AD3, and AD4 [see Table III(a)] indicate that FCM can produce stable results for AD1 and AD2 but not for AD3 and AD4 [see Table III(b)]. When K is close to #C, CV* is usually (not always) small. A large CV* value indicates that the centroids vary and the clustering results are unstable. It may also indicate that there is more than one way of partitioning a dataset. Diff in Table III(c) confirms this observation.

Fig. 4 provides visual evidence and explains the above phenomena. Because data objects in AD1 and AD2 are normally distributed around a center point with similar variances along the X and Y dimensions (referring to Fig. 2), it is more likely for FCM to allocate centroids near the true center points, as shown in Fig. 4(a)–(h). Data objects in AD3 are also generated in a normal distribution, but the variances along the X and Y dimensions are different, as shown in Fig. 4(i)–(k). Hence, the cluster structures in AD3 are more complicated than those in AD1 and AD2; some data objects are far away from the true centroids and may be misclustered. For AD1–AD3, PC, PE, CHI, DBI, XBI, CSI, Dunn, OSI, and WLI perform stably and find their maximal C_K^* at the correct K*. In AD4, the correct cluster number is two. Very high CV* values are observed in AD4 with various K. According to the data distribution of AD4, it is hard to assign ideal centroids, and FCM finds various ways to assign centroids, even for the same K. For AD4, only PC, PE, and PBMF report the correct cluster number (two clusters), and the other CVIs prefer larger cluster numbers. Fig. 4(l) presents the clustering result of FCM when K = 2. It can be found that FCM allocates two very close centroids and vertically partitions the data objects into two clusters. Some CVIs suggest K* = 2 correctly; however, the clustering result is incorrect. Some CVIs prefer the case of three clusters [see Fig. 4(m)]. Although this is not the correct cluster number, it is reasonable to partition the data objects along the line into two groups, e.g., the right-hand and left-hand groups. This is one of the limitations of FCM and will be discussed in Section VI.

2) UCI Datasets: The average CV* values associated with all UCI datasets [see Table IV(a)] are relatively small, showing that FCM can produce stable results [see Table IV(b)]. Table IV(c) shows that PC, PE, DBI, XBI, CSI, PBMF, Dunn, OSI, and WLI perform stably and report correct cluster numbers for all three UCI datasets. CHI reports correctly for HABE and IRIS but incorrectly for WDBC. FSI, SCI, and PCAES do not provide any correct information and seem to prefer a larger number of clusters. Because the UCI datasets are high-dimensional, their data objects cannot be displayed easily in a 2-D space. For the length of this paper, we only discuss the data distribution of IRIS in all its pairwise features. Fig. 5 presents the distribution of data objects by their classification labels, i.e., the ground truth. Fig. 6 presents the clustering results with various K.
Fig. 4. Clustering results on AD1 (#C = 4), AD2 (#C = 3), AD3 (#C = 3), AD4 (#C =2). (a) AD1, K = 2. (b) AD1, K = 3. (c) AD1, K = 4. (d) AD1,
K = 10. (e) AD2, K = 2. (f) AD2, K = 3. (g) AD2, K = 4. (h) AD2, K = 10. (i) AD3, K = 2. (j) AD3, K = 3. (k) AD3, K = 7. (l) AD4, K = 2. (m) AD4,
K = 3. (n) AD4, K = 4.
TABLE IV
CLUSTERING RESULTS: UCI. (a) CV* VALUES OF ALL DATASETS. (b) COUNTS OF C*_K OF ALL CVIS AMONG TEN ROUNDS OF FCM. (c) CLUSTER NUMBERS DECIDED BY CVIS
In (b), an entry K(c) means that K* = K was observed in c of the ten rounds.

(a)
dataset  K=2    K=3    K=4    K=5    K=6    K=7    K=8    K=9    K=10   Average
HABE     0.000  0.001  0.001  0.003  0.010  0.074  0.013  0.019  0.018  0.015
WDBC     0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
IRIS     0.000  0.017  0.003  0.070  0.064  0.076  0.069  0.101  0.042  0.049

(b)
HABE: PC+ 2(10); PE− 2(10); CHI+ 2(10); DBI− 2(10); XBI− 2(10); FSI− 9(6),10(4); SCI+ 8(2),10(8); CSI− 2(10); PCAES+ 8(1),9(5),10(4); PBMF+ 2(10); Dunn+ 2(10); OSI− 4(10); WLI− 2(10)
WDBC: PC+ 2(10); PE− 2(10); CHI+ 10(10); DBI− 2(10); XBI− 2(10); FSI− 10(10); SCI+ 10(10); CSI− 2(10); PCAES+ 10(10); PBMF+ 2(9),3(1); Dunn+ 2(10); OSI− 2(9),9(1); WLI− 2(10)
IRIS: PC+ 2(9),3(1); PE− 2(9),3(1); CHI+ 3(10); DBI− 2(9),3(1); XBI− 2(9),3(1); FSI− 5(1),6(5),7(2),8(2); SCI+ 4(2),5(1),6(1),7(2),8(2),9(2); CSI− 2(9),3(1); PCAES+ 5(1),7(1),8(3),10(5); PBMF+ 2(9),3(1); Dunn+ 2(9),5(1); OSI− 2(9),3(1); WLI− 2(9),3(1)

(c)
dataset  #C   PC+  PE−  CHI+  DBI−  XBI−  FSI−  SCI+  CSI−  PCAES+  PBMF+  Dunn+  OSI−  WLI−  Diff
HABE     2    2    2    2     2     2     9*    10*   2     9*      2      2      4*    2     4
WDBC     2    2    2    10*   2     2     10*   10*   2     10*     2      2      2     2     4
IRIS     2–3  2    2    3     2     2     6*    9*    2     10*     2      2      2     2     3
Visually, the data objects of IRIS can be partitioned into two groups if the classification labels are ignored. A clustering result with two partitions seems to be reasonable. However, the correct cluster number is 3. Referring to Fig. 5(a), (c), and (e), there exist data objects with different classification labels that are not distinguishable from each other by their features. Because FCM evaluates clusters by distances among data objects, clustering such cases with two partitions can be considered a reasonable result. Studies like [24], [31], [33], [35], and [49] consider 2–3 partitions as acceptable and reasonable.
Fig. 5. Data distribution of IRIS in pair-wise features by classification labels. (a) dim-1&2. (b) dim-1&3. (c) dim-1&4. (d) dim-2&3. (e) dim-2&4. (f) dim-3&4.
Fig. 6. Data distribution of clustered IRIS (#C = 2) with K = 2, 3, 6 in pair-wise features. (a) K = 2, dim-1&2. (b) K = 2, dim-1&3. (c) K =2, dim-1&4.
(d) K = 2, dim-2&3. (e) K = 2, dim-2&4. (f) K = 2, dim-3&4. (g) K = 3, dim-1&2. (h) K = 3, dim-1&3. (i) K = 3, dim-1&4. (j) K = 3, dim-2&3. (k) K
= 3, dim-2&4. (l) K = 3, dim-3&4. (m) K = 6, dim-1&2. (n) K = 6, dim-1&3. (o) K = 6, dim-1&4. (p) K = 6, dim-2&3. (q) K = 6, dim-2&4. (r) K = 6,
dim-3&4.
Table IV(a) shows that IRIS's CV* values are higher than those of HABE and WDBC, showing the difficulty of cluster partitioning. Notice the locations of the centroids when K = 2: as shown in Fig. 6(a)–(f), data objects can be partitioned with centroids reasonably separated. When K is large, some centroids are closely allocated [see Fig. 6(g)–(r)]. Even when the cluster number K is correctly assigned as 3, the distance between the centroids of cluster 1 and cluster 2 is small [see Fig. 6(g)–(l)]. Such a clustering result with K = 3 is not preferred by XBI and WLI.

3) Image Datasets: Table V presents statistics on the clustering results of all images. In Table V(a), all CV* values of all datasets are almost 0, which indicates that FCM can produce stable results for various settings of K. A difference between IMG and the other two types of datasets is the compactness of the data objects. A 256×256 image file contains 65536 data objects compactly distributed in a 3-D RGB or 1-D grayscale feature space. There is low possibility of finding many variants of cluster partitioning. In addition, clustering with an extreme cluster number, e.g., K = 2 or K = 10, does not produce helpful information for image analysis. Most interestingly, the CVIs choose various C_K^* and K*. Reasonable cluster numbers are decided by investigating whether remarkable objects can be found from the clustered pixels. For the length of the paper, only interesting results for some K are presented and discussed. Figs. 7 and 9 present the clustered pixels of Img1–Img6. Fig. 8 presents the pixel and centroid distributions of Img1–Img4 in the RGB space, whereas Fig. 10 presents those of Img5–Img6 in the grayscale space.
TABLE V
CLUSTERING RESULTS: IMG. (a) CV* VALUES OF ALL DATASETS. (b) COUNTS OF C*_K OF ALL CVIS AMONG TEN ROUNDS OF FCM. (c) CLUSTER NUMBERS DECIDED BY CVIS
In (b), an entry K(c) means that K* = K was observed in c of the ten rounds.

(a)
dataset  K=2  K=3  K=4  K=5  K=6  K=7  K=8  Average

(b)
Img1: PC+ 2(10); PE− 2(10); CHI+ 5(10); DBI− 3(10); XBI− 3(10); FSI− 7(10); SCI+ 6(10); CSI− 3(10); PCAES+ 8(10); PBMF+ 2(10); Dunn+ 7(10); OSI− 3(10); WLI− 4(10)
Img2: PC+ 3(10); PE− 3(10); CHI+ 8(10); DBI− 3(10); XBI− 3(10); FSI− 6(10); SCI+ 8(10); CSI− 3(10); PCAES+ 8(10); PBMF+ 2(3),3(7); Dunn+ 8(10); OSI− 3(10); WLI− 4(10)
Img3: PC+ 3(10); PE− 2(10); CHI+ 4(1),5(3),8(6); DBI− 3(10); XBI− 3(10); FSI− 7(3),8(7); SCI+ 7(3),8(7); CSI− 3(10); PCAES+ 7(9),8(1); PBMF+ 3(6),4(4); Dunn+ 7(3),8(7); OSI− 3(10); WLI− 4(10)
Img4: PC+ 2(10); PE− 2(10); CHI+ 7(10); DBI− 2(10); XBI− 2(10); FSI− 7(10); SCI+ 8(10); CSI− 3(10); PCAES+ 8(10); PBMF+ 2(10); Dunn+ 7(10); OSI− 2(10); WLI− 4(10)
Img5: PC+ 3(10); PE− 3(10); CHI+ 2(10); DBI− 3(10); XBI− 3(10); FSI− 8(10); SCI+ 8(10); CSI− 5(10); PCAES+ 8(10); PBMF+ 2(7),3(3); Dunn+ 8(10); OSI− 3(10); WLI− 4(10)
Img6: PC+ 3(10); PE− 3(10); CHI+ 2(10); DBI− 2(10); XBI− 3(10); FSI− 8(10); SCI+ 8(10); CSI− 3(10); PCAES+ 8(10); PBMF+ 2(10); Dunn+ 8(10); OSI− 3(10); WLI− 4(10)

(c)
dataset  #C   PC+  PE−  CHI+  DBI−  XBI−  FSI−  SCI+  CSI−  PCAES+  PBMF+  Dunn+  OSI−  WLI−  Diff
Img1     3–4  2*   2*   5*    3     3     7*    6*    3     8*      2*     7*     3     4     8
Img2     4–5  3*   3*   8*    3*    3*    6*    8*    3*    8*      3*     8*     3*    4     12
Img3     4–5  3*   2*   8*    3*    3*    8*    8*    3*    7*      3*     8*     3*    4     12
Img4     3–4  2*   2*   7*    2*    2*    7*    8*    5*    8*      2*     7*     2*    4     12
Img5     3–5  3    3    2*    2*    3     8*    8*    3     8*      2*     8*     3     4     7
Img6     3–5  3    3    2*    2*    3     8*    8*    3     8*      2*     8*     3     4     7
1) The remarkable objects in Img1 include the highway from northwest to southeast, a sandbank in the right part of the image, and a shoal area in the upper side of the ocean. Segmentations with K = 3 and K = 4 [see Fig. 7(b) and (c)] are preferable because the shape of the highway and the outlines of the sandbank and shoal can be recognized. DBI, XBI, CSI, OSI, and WLI present reasonable cluster numbers. However, K = 4 is more preferable because the highway and ocean are clustered together when K = 3. From this aspect, only WLI is correct. A similar investigation applies to Img2. Reasonable cluster numbers are K = 4 and K = 5 for recognizing the shoal areas. The most important objects in Img3 are the shoal area and the small island in the middle of the image. Clusters with K = 4 and K = 5 are preferable, and the island and shoal are recognizable, as shown in Fig. 7(o) and (p). These objects are also recognizable when K = 6 [see Fig. 7(q)], but there are many noisy spots. Only CSI and WLI return correct information about the cluster numbers. In Img4, aircraft are the most remarkable objects in the clustering results. Clusters with K = 3 or 4 are preferable because the shapes of the aircraft can be easily recognized.
2) The locations of the centroids in Img1–Img4 are presented in Fig. 8. Among all these figures, the centroids of clusters with reasonable cluster numbers are close. For example, Fig. 8(b), (d), (f), and (h) show the existence of close centroids in clusters. This may explain why WLI can have better performance.
3) A similar investigation applies to the CT images. From Table V(c), 3–5 clusters are reasonable for the CT images. PC, PE, XBI, CSI, OSI, and WLI produce reasonable results. Soft brain tissue and neoplasm are notable objects in Img5–Img6, as shown in Fig. 9. Although some of the image analysts assume that three clusters are acceptable, more details are shown when four or five clusters are produced, especially the soft tissue part in the bottom-center part of the images.
4) Summary: Table VI summarizes the results of this experiment in terms of the Eff ratio. The overall Eff shows that some CVIs prefer smaller numbers of clusters and some prefer larger ones. For example, PC and PE prefer smaller, and FSI, SCI, and PCAES prefer larger numbers of clusters. Some CVIs are effective for the AD datasets but not for the UCI or IMG datasets, and vice versa. Nevertheless, a correct cluster number suggested by a CVI does not imply a correct and reasonable clustering result. It needs to be investigated how the data objects distribute and are clustered. For the AD datasets, PC is the best CVI with Eff = 1.00. For the UCI datasets, PC, PE, DBI, XBI, CSI, PBMF, Dunn, and WLI have Eff = 1.00. For the IMG datasets, only our proposed CVI, WLI, has Eff = 1.00. As for the overall effectiveness, the proposed WLI has Eff = 0.92, which is the best one among all CVIs.

C. Clustering Validity Index Variation Analysis

This section presents the CVI values of all clustering results with various K. Figs. 11–13 present the CVI values corresponding to the clustering of the AD, UCI, and IMG datasets, respectively. The variation of CVI values with various settings of K is discussed. An ideal CVI may have only one extreme value among the various K, and its other CVI values grow monotonically. That is, an ideal CVI should be sensitive to K and have a drastic change of CVI values when K = #C (the true cluster number). The CVI values with K other than #C should not change drastically as K increases/decreases. However, several local extreme values are found in most CVIs, as shown in Figs. 11–13. The variance of CVI values is affected by several factors. For example, the relationships among data objects and centroids may remain unchanged because of specific geometrical characteristics of clusters, no matter how FCM works. In such cases, CVI values may remain almost unchanged with various K.
Fig. 7. Clustering results on Img1–Img4 (each color denotes a cluster). (a) Img1, K = 2. (b) Img1, K = 3. (c) Img1, K = 4. (d) Img1, K = 5. (e) Img1, K = 6. (f) Img1, K = 7. (g) Img2, K = 2. (h) Img2, K = 3. (i) Img2, K = 4. (j) Img2, K = 5. (k) Img2, K = 6. (l) Img2, K = 7. (m) Img3, K = 2. (n) Img3, K = 3. (o) Img3, K = 4. (p) Img3, K = 5. (q) Img3, K = 6. (r) Img3, K = 7. (s) Img4, K = 2. (t) Img4, K = 3. (u) Img4, K = 4. (v) Img4, K = 5. (w) Img4, K = 6. (x) Img4, K = 7.
TABLE VI
EFF. OF ALL CVIS
Dataset PC+ PE− CHI+ DBI− XBI− FSI− SCI+ CSI− PCAES+ PBMF+ Dunn+ OSI− WLI−
AD 1.00 0.75 0.50 0.75 0.75 0.25 0.25 0.75 0.00 0.75 0.75 0.50 0.75
UCI 1.00 1.00 0.67 1.00 1.00 0.00 0.00 1.00 0.00 1.00 1.00 0.67 1.00
Img 0.33 0.33 0.00 0.33 0.50 0.00 0.00 0.50 0.00 0.00 0.00 0.50 1.00
Average 0.78 0.69 0.39 0.69 0.75 0.08 0.08 0.58 0.00 0.58 0.58 0.56 0.92
Another reason is the structure of a CVI. Compactness and separation of clusters are the main elements in many CVI structures; however, they have various organizations and representations. Such structural deficiencies may give rise to unstable CVI values. In these figures, several CVIs behave monotonically, and the extreme values appear when K = 2 or 10, such as PC, PE, CHI, FSI, SCI, PCAES, and PBMF. A drastic variation of CVI values is observed in CHI, DBI, XBI, PCAES, and Dunn.
Fig. 8. Pixel distributions of Img1–Img4 in the RGB space. (a) Img1, K =3. (b) Img1, K =4. (c) Img2, K =3. (d) Img2, K =4. (e) Img3, K =3. (f) Img3, K =4.
(g) Img4, K =2. (h) Img4, K =4.
Fig. 9. Clustering results on Img5–Img6. (a) Img5, K =2. (b) Img5, K =3. (c) Img5, K =4. (d) Img5, K =5. (e) Img5, K =8. (f) Img6, K =2. (g) Img6, K =3.
(h) Img6, K =4. (i) Img6, K =6. (j) Img6, K =8.
Fig. 10. Pixel distributions of the Img datasets (in 8-bit grayscale). (a) Img5, K =3. (b) Img5, K =4. (c) Img6, K =2. (d) Img6, K =4.
In Fig. 13, the variation of CVI values is smoother than that in Figs. 11 and 12. Because the data of image pixels are crowded, each pixel could be selected as a centroid, and the distances among pixels do not change drastically. This can be confirmed by the CV* values shown in Table V. CVIs that include the term MIN in the denominator usually have drastic variation in CVI values, such as XBI and PCAES. By investigating the insides of the clusters, it is found that FCM allocates two centroids at two very close, even overlapping, positions, resulting in the term MIN approaching 0 and, thus, huge CVI values. WLI is not always stable for all K. However, the term MED employed in WLI mitigates, to some extent, the drastic variation as K changes.
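The contrast between the two separation terms is easy to check numerically. The helper below is our own; it only computes MIN = min_{i≠j} ||v_i − v_j||^2 and MED = median_{i≠j} ||v_i − v_j||^2, since the full WLI formula combining these terms is defined in Section IV.

    import numpy as np

    def min_med_separation(V):
        # Pairwise squared centroid distances with the diagonal removed.
        g2 = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        off = g2[~np.eye(V.shape[0], dtype=bool)]
        # MIN -> 0 as soon as two centroids nearly coincide;
        # MED stays near the typical centroid gap and changes smoothly with K.
        return off.min(), np.median(off)

Any index with MIN alone in a denominator inherits the blow-up as MIN approaches 0, whereas a median-based term moves only marginally, which is the stabilizing effect of the median factor discussed above.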
Fig. 11. Comparisons on the CVI values: AD. (a) PC+ . (b) PE− . (c) CHI+ . (d) DBI− . (e) XBI− . (f) FSI− . (g) SCI+ . (h) CSI− . (i) PCAES+ . (j) PBMF+ .
(k) Dunn+ . (l) OSI− . (m) WLI− .
Fig. 12. Comparisons on the CVI values: UCI. (a) PC+. (b) PE−. (c) CHI+. (d) DBI−. (e) XBI−. (f) FSI−. (g) SCI+. (h) CSI−. (i) PCAES+. (j) PBMF+. (k) Dunn+. (l) OSI−. (m) WLI−.

Fig. 13. Comparisons on the CVI values: IMG. (a) PC+. (b) PE−. (c) CHI+. (d) DBI−. (e) XBI−. (f) FSI−. (g) SCI+. (h) CSI−. (i) PCAES+. (j) PBMF+. (k) Dunn+. (l) OSI−. (m) WLI−.

VI. CONCLUDING REMARKS

This paper presents a new CVI, called the Wu-and-Li index (WLI), for centroid-based partitional clustering. WLI considers the overall compactness–separation ratio of all clusters, as well as that of each cluster. Besides, WLI eliminates the instability that happens in other CVIs when two centroids are nearly allocated by introducing a new median factor. The experimental results show that WLI has better performance and stability than other CVIs. Some remarks about the characteristics of the CVIs are given as follows.
1) Simple CVIs like PC and PE have satisfactory results for simple problems (such as AD1–AD3) but may not be effective for complicated problems (such as Img1–Img6). This may be due to the simple compactness of clusters used by these CVIs. On the contrary, complicated CVIs, such as SCI, PCAES, and PBMF, that consider the geographic structure of clusters or other characteristics of clusters may not be effective for clustering if such specific structures are not correctly detected or do not exist.
2) The proposed CVI (WLI) is effective for the UCI and image datasets. WLI does not provide correct information about the cluster number for AD4. However, AD4 is not suitable for centroid-based clustering methods like FCM. It is more likely to have three clusters: one for Cluster 1 (green) and two for Cluster 2 (red). Although PC and PE report the correct cluster numbers, FCM simply splits all data along the X dimension into two clusters. The clustering result is not reasonable because of the limitation of FCM. If three clusters are considered reasonable, WLI has the better overall performance. Some methods other than FCM-like algorithms may cluster AD4 correctly; however, discussing the variety of clustering algorithms is not in the scope of this paper.
3) The image datasets tested in the experiments are in RGB or grayscale. Img1–Img4 are described by three features (RGB); Img5 and Img6 are described by only one feature (grayscale). Other features, like the texture or position of pixels, are possible features to be considered for clustering. For the length of the paper, this study only tests images with RGB and grayscale features. Moreover, deciding the clustering quality of an image is arbitrary; it depends on the application purposes. Clustering is an aid to image analysis. We do not claim that our proposed WLI is the most suitable CVI for image clustering. Instead, our proposed WLI may be helpful for image clustering when the centroids of clusters are close.
4) Sophisticated factors of clusters that consider specific data distributions, such as symmetry and overlapping of data objects and centroids, may further improve the effectiveness of a CVI. However, such characteristics are not always applicable in all clustering problems. Additionally, evaluating such complicated factors may introduce additional calculation complexity. One of the advantages of WLI is its simple structure. The main idea of WLI considers the compactness and separation of clusters in terms of \min_{i\ne j}\{\|v_i - v_j\|^2\} and \mathrm{median}_{i\ne j}\{\|v_i - v_j\|^2\}. Such terms can be obtained easily in most clustering problems.
5) The clustering method and CVIs that are discussed in this paper are centroid-based clustering methods that partition data objects in the feature space. In this manner, each cluster only has one centroid. FCM is one such typical clustering method. For data objects that distribute overlappingly in their feature space, such a clustering method may not separate two clusters effectively. Figs. 4(m)–(p) and 5 are such cases, where most CVIs cannot evaluate the clusters well. However, this is the nature of FCM-like clustering methods. Some density-based clustering methods [50] are able to discover arbitrarily shaped clusters and are, therefore, suitable for clustering such kinds of datasets. Our proposed CVI may not be used for such clusters directly. However, it is possible to use WLI for evaluating such clustering results with minor modifications to the definitions of centroids and distances. In FCM, a centroid may be allocated at any position in the feature space. Some clustering algorithms restrict a centroid to be allocated only at the position of an existing data object, such as [51], and may improve the clustering quality for some applications. WLI and all CVIs that are discussed in this paper can also work with such algorithms. Comparisons of the effectiveness of WLI and other CVIs when they work with various centroid-based clustering algorithms for clustering different types of applications may be a research direction. These will be considered in future work.
REFERENCES

[1] J.-Y. Jiang, R.-J. Liou, and S.-J. Lee, "A fuzzy self-constructing feature clustering algorithm for text classification," IEEE Trans. Knowl. Data Eng., vol. 23, no. 3, pp. 335–349, Mar. 2011.
[2] A. Ducournau, A. Bretto, S. Rital, and B. Laget, "A reductive approach to hypergraph clustering: An application to image segmentation," Pattern Recog., vol. 45, no. 7, pp. 2788–2803, 2012.
[3] T. F. Ng, T. D. Pham, and X. Jia, "Feature interaction in subspace clustering using the Choquet integral," Pattern Recog., vol. 45, no. 7, pp. 2645–2660, 2012.
[4] J. Feyereisl and U. Aickelin, "Privileged information for data clustering," Inf. Sci., vol. 194, pp. 4–23, 2012.
[5] D. T. Anderson, J. C. Bezdek, M. Popescu, and J. M. Keller, "Comparing fuzzy, probabilistic, and possibilistic partitions," IEEE Trans. Fuzzy Syst., vol. 18, no. 5, pp. 906–918, Oct. 2010.
[6] A. K. Jain and R. C. Dubes, Algorithms for Clustering Data. Upper Saddle River, NJ, USA: Prentice-Hall, 1988.
[7] B. Everitt, Graphical Techniques for Multivariate Data. London, U.K.: Heinemann, 1978.
[8] R. Xu and D. Wunsch II, "Survey of clustering algorithms," IEEE Trans. Neural Netw., vol. 16, no. 3, pp. 645–678, May 2005.
[9] E. Hruschka, R. Campello, A. Freitas, and A. de Carvalho, "A survey of evolutionary algorithms for clustering," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 39, no. 2, pp. 133–155, Mar. 2009.
[10] K.-L. Wu, M.-S. Yang, and J.-N. Hsieh, "Robust cluster validity indexes," Pattern Recog., vol. 42, no. 11, pp. 2541–2550, Nov. 2009.
[11] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques. San Francisco, CA, USA: Morgan Kaufmann, 2011.
[12] W. Rand, "Objective criteria for the evaluation of clustering methods," J. Amer. Statist. Assoc., vol. 66, no. 336, pp. 846–850, 1971.
[13] L. Hubert and P. Arabie, "Comparing partitions," J. Classification, vol. 2, no. 1, pp. 193–218, 1985.
[14] D. T. Anderson, A. Zare, and S. Price, "Comparing fuzzy, probabilistic, and possibilistic partitions using the earth mover's distance," IEEE Trans. Fuzzy Syst., vol. 21, no. 4, pp. 766–775, Aug. 2013.
[15] A. Gordon, "Cluster validation," in Data Science, Classification, and Related Methods, C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H. Bock, and Y. Bada, Eds. New York, NY, USA: Springer-Verlag, 1998, pp. 22–39.
[16] R. Xu, J. Xu, and D. Wunsch, "A comparison study of validity indices on swarm-intelligence-based clustering," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 4, pp. 1243–1256, Aug. 2012.
[17] K. Tasdemir and E. Merenyi, "A validity index for prototype-based clustering of data sets with complex cluster structures," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 4, pp. 1039–1053, Aug. 2011.
[18] Z. Liang, P. Zhang, and J. Zhao, "Optimization of the number of clusters in fuzzy clustering," in Proc. Int. Conf. Comput. Design Appl., 2010, pp. V3-580–V3-584.
[19] I. J. Sledge, J. C. Bezdek, T. C. Havens, and J. M. Keller, "Relational generalizations of cluster validity indices," IEEE Trans. Fuzzy Syst., vol. 18, no. 4, pp. 771–786, Aug. 2010.
[20] H. Frigui and R. Krishnapuram, "Clustering by competitive agglomeration," Pattern Recog., vol. 30, no. 7, pp. 1109–1119, Jun. 1997.
[21] S. Bandyopadhyay, "Multiobjective simulated annealing for fuzzy clustering with stability and validity," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 41, no. 5, pp. 682–691, Sep. 2011.
[22] R. Xu, J. Xu, and D. C. Wunsch, "A comparison study of validity indices on swarm-intelligence-based clustering," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 42, no. 4, pp. 1243–1256, Aug. 2012.
[23] J. C. Bezdek, "Cluster validity with fuzzy sets," J. Cybern., vol. 3, no. 3, pp. 58–73, 1973.
[24] N. Pal and J. Bezdek, "On cluster validity for the fuzzy c-means model," IEEE Trans. Fuzzy Syst., vol. 3, no. 3, pp. 370–379, Aug. 1995.
[25] M.-C. Chiang, C.-W. Tsai, and C.-S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Inf. Sci., vol. 181, no. 4, pp. 716–731, 2011.
[26] J. C. Dunn, "A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters," J. Cybern., vol. 3, no. 3, pp. 32–57, 1973.
[27] T. Caliński and J. Harabasz, "A dendrite method for cluster analysis," Commun. Statist., vol. 3, no. 1, pp. 1–27, 1974.
[28] D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Trans. Pattern Anal. Mach. Intell., vol. 1, no. 2, pp. 224–227, Feb. 1979.
[29] X. L. Xie and G. Beni, "A validity measure for fuzzy clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 8, pp. 841–847, Aug. 1991.
[30] Y. Fukuyama and M. Sugeno, "A new method of choosing the number of clusters for fuzzy c-means method," in Proc. 5th Fuzzy Syst. Symp., 1989, pp. 247–250.
[31] N. Zahid, M. Limouri, and A. Essaid, "A new cluster-validity for fuzzy clustering," Pattern Recog., vol. 32, no. 7, pp. 1089–1097, 1999.
[32] C.-H. Chou, M.-C. Su, and E. Lai, "A new cluster validity measure and its application to image compression," Pattern Anal. Appl., vol. 7, no. 2, pp. 205–220, Jul. 2004.
[33] K.-L. Wu and M.-S. Yang, "A cluster validity index for fuzzy clustering," Pattern Recog. Lett., vol. 26, no. 9, pp. 1275–1291, Jul. 2005.
[34] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "A study of some fuzzy cluster validity indices, genetic clustering and application to pixel classification," Fuzzy Sets Syst., vol. 155, no. 2, pp. 191–214, Oct. 2005.
[35] H. Le Capitaine and C. Frelicot, "A cluster-validity index combining an overlap measure and a separation measure based on fuzzy-aggregation operators," IEEE Trans. Fuzzy Syst., vol. 19, no. 3, pp. 580–588, Jun. 2011.
[36] M. Popescu, J. C. Bezdek, T. C. Havens, and J. M. Keller, "A cluster validity framework based on induced partition dissimilarity," IEEE Trans. Cybern., vol. 43, no. 1, pp. 308–320, Feb. 2013.
[37] S. Saha and S. Bandyopadhyay, "Performance evaluation of some symmetry-based cluster validity indexes," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 39, no. 4, pp. 420–425, Jul. 2009.
[38] M. K. Pakhira, S. Bandyopadhyay, and U. Maulik, "Validity index for crisp and fuzzy clusters," Pattern Recog., vol. 37, no. 3, pp. 487–501, 2004.
[39] L. Mascarilla, M. Berthier, and C. Frelicot, "A k-order fuzzy OR operator for pattern classification with k-order ambiguity rejection," Fuzzy Sets Syst., vol. 159, no. 15, pp. 2011–2029, 2008.
[40] J. Sun, F. Wang, J. Hu, and S. Edabollahi, "Supervised patient similarity measure of heterogeneous patient records," ACM SIGKDD Explor. Newslett., vol. 14, no. 1, pp. 16–24, Dec. 2012.
[41] P. Markelj, D. Tomaževič, B. Likar, and F. Pernuš, "A review of 3D/2D registration methods for image-guided interventions," Med. Image Anal., vol. 16, no. 3, pp. 642–661, 2012.
[42] M. Lim, D. N. Petley, N. J. Rosser, R. J. Allison, A. J. Long, and D. Pybus, "Combined digital photogrammetry and time-of-flight laser scanning for monitoring cliff evolution," Photogrammetr. Rec., vol. 20, no. 110, pp. 109–129, 2005.
[43] L.-C. Wu, L. Z.-H. Chuang, D.-J. Doong, and C. C. Kao, "Ocean remotely sensed image analysis using two-dimensional continuous wavelet transforms," Int. J. Remote Sens., vol. 32, no. 23, pp. 8779–8798, 2011.
[44] M. Ahmad and D. Sundararajan, "A fast algorithm for two dimensional median filtering," IEEE Trans. Circuits Syst., vol. CAS-34, no. 11, pp. 1364–1374, Nov. 1987.
[45] H. Hwang and R. Haddad, "Adaptive median filters: New algorithms and results," IEEE Trans. Image Process., vol. 4, no. 4, pp. 499–502, Apr. 1995.
[46] S. Marshall, "New direct design method for weighted order statistic filters," IEE Proc. Vis., Image Signal Process., vol. 151, no. 1, pp. 1–8, Feb. 2004.
[47] S. Perreault and P. Hebert, "Median filtering in constant time," IEEE Trans. Image Process., vol. 16, no. 9, pp. 2389–2394, Sep. 2007.
[48] A. Asuncion and D. Newman. (2007). UCI Machine Learning Repository [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html
[49] D.-W. Kim, K. H. Lee, and D. Lee, "On cluster validity index for estimation of the optimal number of fuzzy clusters," Pattern Recog., vol. 37, no. 10, pp. 2009–2025, 2004.
[50] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Trans. Knowl. Data Eng., vol. 21, no. 11, pp. 1515–1531, Nov. 2009.
[51] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. New York, NY, USA: Wiley, 1990.

Chih-Hung Wu (M'00) was born in 1967. He received the B.S. degree in engineering science from National Cheng-Kung University, Tainan City, Taiwan, in 1990 and the M.S. and Ph.D. degrees in electronic engineering from National Sun Yat-sen University, Kaohsiung, Taiwan, in 1992 and 1996, respectively.
He is currently a Professor with the Department of Electrical Engineering, National University of Kaohsiung, Kaohsiung, Taiwan. His research interests include artificial intelligence, soft computing, robotics, and cloud-computing. He is the Director of Intelligent Computation and Appli-