
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 35, NO. 8, AUGUST 2023

Scalable Fuzzy Clustering With Anchor Graph


Chaodie Liu, Feiping Nie, Rong Wang, and Xuelong Li, Fellow, IEEE

Abstract—Fuzzy clustering algorithms have been widely used to reveal the possible hidden structure of data. However, as the amount of data keeps increasing, large-scale data has brought genuine challenges for fuzzy clustering. Most fuzzy clustering algorithms suffer from long running times since a large number of distance calculations is involved in updating the solution at each iteration. To address this problem, we introduce the popular anchor graph technique into fuzzy clustering and propose a scalable fuzzy clustering algorithm referred to as Scalable Fuzzy Clustering with Anchor Graph (SFCAG). The main characteristic of SFCAG is that it addresses the scalability issue plaguing fuzzy clustering from two perspectives: anchor graph construction and membership matrix learning. Specifically, we select a small number of anchors and construct a sparse anchor graph, which helps reduce the computational complexity. We then formulate a parameter-free trace ratio model to learn the membership matrix of anchors, which speeds up the clustering procedure. In addition, the proposed method enjoys linear time complexity in the data size. Extensive experiments performed on both synthetic and real-world datasets demonstrate the superiority (in both effectiveness and scalability) of the proposed method over representative large-scale clustering methods.

Index Terms—Fuzzy clustering, scalable clustering, anchor graph, trace ratio

1 INTRODUCTION

CLUSTERING is an extensively used data processing tool in machine learning and has been widely applied in areas such as image processing [1], [2], document grouping [3], [4] and social networks [5], [6]. It attempts to partition data points into groups such that similar data points belong to the same cluster while dissimilar data points belong to different clusters. As a preprocessing step to improve data understanding, many clustering algorithms have been proposed in the past decades. Among them, the K-Means (KM) clustering algorithm [7] and its variants [8], [9], [10] have been utilized in many substantive fields due to their simplicity and efficiency. Nevertheless, KM-type clustering methods fail when the distribution of the data is not spherical. To adapt to data with arbitrary shapes, graph-based clustering algorithms [11], [12], [13] have been proposed based on spectral graph theory. Graph-based clustering methods can obtain better clustering results since they seek an optimal partitioning of the data.

The methods mentioned above are all hard clustering methods, where each data point belongs to exactly one cluster. However, in real applications there is often no clear boundary between clusters. It is therefore natural to introduce fuzzy set theory [14] into clustering analysis, which gives rise to fuzzy clustering. Fuzzy clustering is a relatively new trend in data clustering; it softly divides data points into different clusters by designing a membership matrix whose values range from 0 to 1 to indicate the belongingness of data points to the different clusters. The most well-known fuzzy clustering algorithm is Fuzzy C-Means (FCM) [15], first proposed by Bezdek. By virtue of its data uncertainty modeling and easy implementation, many FCM variants have been extensively studied [16], [17], [18], [19], [20] to improve clustering performance.

The generalized objective function of fuzzy clustering can be defined through the fuzzy memberships and the distances between data points and cluster centers, so as to minimize the loss of each data point to its nearest cluster center. Different distance measures [21], [22], [23] have been explored to handle data with complicated distributions. However, with the rapid development of technology, data volumes have become larger and larger, which brings great challenges for fuzzy clustering algorithms. Most fuzzy clustering algorithms suffer from long running times since a large number of distance calculations is involved in updating the solution at each iteration.

To address this problem, many efforts have been devoted to accelerating fuzzy clustering algorithms from different perspectives. On the one hand, some researchers resort to a data reduction strategy [24], [25], [26], [27], [28]. For example, Parker et al. [26] use a statistical method to estimate the sub-sample size and introduce two new accelerated fuzzy clustering algorithms. Nevertheless, these methods easily lose original data information and degrade clustering performance. On the other hand, some researchers improve the initialization [29], [30], [31], [32]. A high-quality initialization can provide fast convergence to the final result.

Chaodie Liu, Feiping Nie, and Xuelong Li are with the School of Computer Science and School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China. E-mail: [email protected], [email protected], [email protected]. Rong Wang is with the School of Cybersecurity and School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an 710072, China. E-mail: [email protected].

Manuscript received 24 February 2022; revised 24 June 2022; accepted 6 August 2022. Date of publication 22 August 2022; date of current version 21 June 2023. This work was supported in part by the National Key Research and Development Program of China under Grant 2018AAA0101902, in part by the Natural Science Basic Research Program of Shaanxi under Grant 2021JM-071, in part by the National Natural Science Foundation of China under Grants 61936014 and 61772427, and in part by the Fundamental Research Funds for the Central Universities under Grant G2019KY0501. (Corresponding author: Feiping Nie.) Recommended for acceptance by M. Zhang. Digital Object Identifier no. 10.1109/TKDE.2022.3200685

Cebeci et al. [32] propose a novel data-dependent initialization algorithm that improves computational efficiency by using the frequency polygon data of the feature with the highest peak count in a dataset. In addition, Shen et al. [33] work in a different direction and design a hyperplane division method to split the entire dataset into several disjoint subsets so as to make the clustering process efficient. Zhou et al. [34] equip fuzzy clustering with the triangle inequality to accelerate the convergence speed. Although these methods have made important progress in reducing the computational burden, they are not sufficiently scalable for large-scale data.

Recently, the anchor graph technique has attracted extensive attention for dealing with large-scale problems. The basic idea of the anchor graph is to replace data points with a relatively small number of anchors that explore the intrinsic structure of the data distribution, and then build a data-to-anchor graph by similarities, such that subsequent operations can be conducted on this compact similarity graph. The number of selected anchors is much smaller than the number of data points, so the computational cost is greatly reduced. The anchor graph has been successfully applied in many fields, such as semi-supervised learning [35], [36], spectral clustering [37], [38], feature selection [39], and fuzzy clustering [40].

There are two vital steps in the anchor graph. The first is anchor generation. It is important to generate sufficiently dense representative anchors to reveal the intrinsic structure of the data. Generally, cluster centers are selected as anchors, e.g., by KM, since they have a more powerful ability to capture the intrinsic structure of the data. The second step is anchor graph construction, in which a similarity matrix needs to be designed to effectively measure the similarity relationships between data points and anchors. In this paper, a parameter-free yet effective neighbor assignment strategy is adopted to construct the anchor graph and improve the clustering performance.

To reduce the time cost while preserving good clustering performance, a novel scalable fuzzy clustering algorithm is proposed to deal with large-scale data, referred to as Scalable Fuzzy Clustering with Anchor Graph (SFCAG). The proposed SFCAG attempts to address the scalability issue plaguing fuzzy clustering from two perspectives: anchor graph construction and membership matrix learning.

The main contributions are summarized as follows:

- A scalable fuzzy clustering model (SFCAG) coupling anchor graph construction and membership matrix learning is proposed, which leads to low computational complexity and scalability for large-scale data.
- A novel trace ratio model for fuzzy clustering is formulated, which overcomes the problem of parameter selection and improves the clustering performance. Moreover, an efficient iterative optimization algorithm is designed to obtain the optimal solution.
- The membership matrix of the data points can be represented as linear combinations of the anchors' memberships, so the proposed method has linear computational complexity in the data size.
- Experimental results conducted on synthetic and real-world datasets demonstrate that the proposed method achieves an elegant balance between reducing the computational cost and achieving satisfactory clustering performance.

2 RELATED WORK

2.1 Notations

We first give the main mathematical notations used in this paper. Throughout the paper, scalars are written as normal lowercase letters. Uppercase boldface and lowercase boldface are used for matrices and vectors, respectively. For a matrix $A = [a_{ij}] \in \mathbb{R}^{n \times m}$, $a_{ij}$ is the element in the $i$th row and $j$th column of $A$, and $a_i$ and $a^j$ denote its $i$th row and $j$th column, respectively. $A^T$ denotes the transpose of $A$. $\mathrm{Tr}(A)$ denotes the trace of $A$. $\|A\|_F = \sqrt{\sum_{i=1}^{n}\sum_{j=1}^{m} a_{ij}^2} = \sqrt{\mathrm{Tr}(A^T A)}$ is the Frobenius norm of $A$. $A \ge 0$ and $a \ge 0$ mean that all elements of $A$ and $a$ are nonnegative. $\mathbf{1}$ denotes a column vector whose elements are all 1.

For ease of reading and understanding, the main mathematical notations used in this paper are listed in Table 1.

TABLE 1
Main Mathematical Notations Used in This Paper

Notation | Description
n | Number of data points
d | Dimension of data points
c | Number of clusters
k | Number of the nearest neighbors
m | Number of anchors
B | Data-to-anchor similarity matrix
U | Membership matrix of anchors
F | Membership matrix of data points
$X = \{x_1, x_2, \ldots, x_n\}^T \in \mathbb{R}^{n \times d}$ | Data matrix
$A = \{a_1, a_2, \ldots, a_m\}^T \in \mathbb{R}^{m \times d}$ | Anchor set
$V = \{v_1, v_2, \ldots, v_c\} \in \mathbb{R}^{d \times c}$ | Cluster center matrix

2.2 Generalization of Fuzzy Clustering

Let $X = \{x_1, x_2, \ldots, x_n\}^T \in \mathbb{R}^{n \times d}$ be a dataset. The goal of fuzzy clustering is to partition $X$ into $c$ clusters with membership matrix $F \in \mathbb{R}^{n \times c}$ and cluster center matrix $V = \{v_1, v_2, \ldots, v_c\} \in \mathbb{R}^{d \times c}$. Generalized fuzzy clustering can be formulated as minimizing the following objective function

$$\min_{F\mathbf{1}=\mathbf{1},\,F\ge 0,\,v_j}\ \sum_{i=1}^{n}\sum_{j=1}^{c} f_{ij}^{\,b}\,\|x_i - v_j\| + \alpha\,\mathcal{R}(F), \qquad (1)$$

where $f_{ij}$ is the probability indicating the membership grade of $x_i$ belonging to cluster $c_j$, and $b$ sets an appropriate level of cluster fuzziness. $\mathcal{R}(F)$ is a regularization term on $F$ to overcome overfitting or enhance sparsity, and $\alpha$ is the regularization parameter.

$\|\cdot\|$ represents a certain vector norm. For instance, FCM uses the squared Frobenius norm. Robust and Sparse Fuzzy C-Means (RSFCM) [17] adopts the $\ell_{21}$-norm and the capped $\ell_1$-norm in place of the squared Frobenius norm to enhance robustness and sparsity. Therefore, previous fuzzy clustering algorithms such as FCM, RSFCM, Weighted Laplacian Fuzzy Clustering (WLFC) [41] and Membership Scaling Fuzzy C-Means (MSFCM) [34] can all be included in this generalized framework.
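To make the generalized objective in Eq. (1) concrete, the following minimal NumPy sketch evaluates its FCM special case (squared Euclidean distance, no regularization term, i.e., $\alpha = 0$). The function name and toy data are illustrative, not from the paper; cluster centers are stored row-wise for convenience.

```python
import numpy as np

def fcm_objective(X, F, V, b=2.0):
    """FCM special case of Eq. (1): sum_{i,j} f_ij^b * ||x_i - v_j||_2^2."""
    # Squared Euclidean distance from every point to every center, shape (n, c).
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    return (F ** b * d2).sum()

# Toy example: four 2-D points, two clusters, rows of F sum to 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.1]])
V = np.array([[0.05, 0.0], [0.95, 1.05]])
F = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
print(fcm_objective(X, F, V))  # small value: memberships match the geometry
```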

2.3 Fast Fuzzy Clustering Based on Anchor Graph

The anchor graph model is a recently proposed graph-based learning model for large-scale problems, which constructs a similarity matrix to measure the relationship between data points and anchors by capturing the intrinsic structure of the data distribution. Clustering is then performed on this compact similarity matrix.

Inspired by the anchor graph, Nie et al. [40] propose a Fast Fuzzy Clustering algorithm based on Anchor Graph (FFCAG) to reduce the time consumption, and give intuitive interpretations of their model. Specifically, FFCAG solves the following optimization problem

$$\max_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T (B^T B - \gamma_1 I - \gamma_2 B^T \mathbf{1}\mathbf{1}^T B)\, U\big), \qquad (2)$$

where $B$ is the data-to-anchor similarity matrix and $U$ is the membership matrix of anchors. $\gamma_1$ and $\gamma_2$ are regularization parameters. By utilizing the anchor graph model, FFCAG is able to improve the computational efficiency. However, it requires more parameters to be tuned for good performance, which is impractical since the value of each parameter may be an arbitrary number.

Therefore, in this paper we propose a parameter-free fuzzy clustering algorithm named Scalable Fuzzy Clustering with Anchor Graph, which is simple and scalable, having linear time complexity in the data size.

3 METHODOLOGY

In this section, we elaborate the proposed scalable fuzzy clustering with anchor graph (SFCAG) for large-scale data in detail, which attempts to address the scalability issue plaguing fuzzy clustering from two perspectives: anchor graph construction and membership matrix learning.

3.1 Anchor Graph Construction

In our work, we address the scalability issue through a small number of representative anchors, which can adequately cover the intrinsic manifold structure of the data points. Therefore, cluster centers are first selected as anchors by the Balanced K-means based Hierarchical K-means (BKHK) method proposed in [38] to effectively capture the manifold structure of $X$. We use $A = \{a_1, a_2, \ldots, a_m\}^T \in \mathbb{R}^{m \times d}$ to represent the anchor set, where $a_i \in \mathbb{R}^d$ is the $i$th anchor point.

In anchor graph based clustering, the construction of the anchor graph plays an essential role in the clustering performance. Hence, a parameter-free yet effective neighbor assignment strategy is adopted to construct the anchor graph between $X$ and $A$. For the $i$th data point $x_i$, all the anchors $\{a_1, a_2, \ldots, a_m\}$ can be connected to $x_i$ as neighbors with similarity $b_{ij}$. Generally, a smaller distance $\|x_i - a_j\|_2^2$ should be assigned a larger similarity $b_{ij}$. Therefore, the similarity between $x_i$ and $a_j$ can be obtained by solving the following problem

$$\min_{b_i^T\mathbf{1}=1,\,b_i\ge 0}\ \sum_{i=1}^{n}\sum_{j=1}^{m} \big(\|x_i - a_j\|_2^2\, b_{ij} + \gamma b_{ij}^2\big), \qquad (3)$$

where $\gamma$ is the regularization parameter. The second term in Eq. (3) is a regularization to avoid the trivial solution in which only the nearest anchor is the neighbor of $x_i$, with similarity 1.

In practice, we prefer to learn a sparse $b_i$, i.e., only the $k$ nearest neighbors of $x_i$ have a chance to connect to $x_i$, to achieve better performance and alleviate the computational burden. Then, according to the method proposed in [42], $\gamma$ can be set as $\gamma = \frac{k}{2}\|x_i - a_{k+1}\|_2^2 - \frac{1}{2}\sum_{h=1}^{k}\|x_i - a_h\|_2^2$ (with the anchors indexed in increasing distance from $x_i$). Thus, the solution to problem (3) is

$$b_{ij} = \begin{cases} \dfrac{\|x_i - a_{k+1}\|_2^2 - \|x_i - a_j\|_2^2}{\,k\|x_i - a_{k+1}\|_2^2 - \sum_{h=1}^{k}\|x_i - a_h\|_2^2\,}, & j \le k, \\[4pt] 0, & \text{otherwise.} \end{cases} \qquad (4)$$

It can easily be observed that the constructed anchor graph $B \in \mathbb{R}^{n \times m}$ is sparse. It has far fewer spurious connections between dissimilar points and tends to exhibit high quality. It is also worth noting that the construction method is extremely efficient, having computational complexity $O(ndm)$, linear in the data size, which helps reduce the computational burden and obtain better clustering performance.
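The construction in Eqs. (3)-(4) is straightforward to implement. The Python/NumPy sketch below builds the sparse data-to-anchor matrix B from the closed-form solution in Eq. (4); note that plain k-means centers stand in here for the BKHK anchor selection of [38], and the dense distance computation is written for clarity rather than speed.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_anchor_graph(X, m=64, k=5, seed=0):
    """Sparse data-to-anchor similarity matrix B (n x m) via Eq. (4)."""
    # K-means centers as anchors (a stand-in for BKHK [38]).
    anchors = KMeans(n_clusters=m, n_init=10, random_state=seed).fit(X).cluster_centers_
    # Squared Euclidean distances to every anchor, shape (n, m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)           # nearest anchors first, per point
    B = np.zeros_like(d2)
    for i in range(X.shape[0]):
        nn = order[i, :k]                    # indices of the k nearest anchors
        d_k1 = d2[i, order[i, k]]            # distance to the (k+1)-th anchor
        denom = k * d_k1 - d2[i, nn].sum()   # denominator of Eq. (4)
        B[i, nn] = (d_k1 - d2[i, nn]) / max(denom, 1e-12)
    return B, anchors
```

Each row of B has exactly k nonzeros and sums to 1 by construction, matching the constraints of problem (3).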

3.2 Membership Matrix Learning

The computational cost of fuzzy clustering mainly comes from learning the full-size membership matrix, since the number of data points is huge in large-scale problems and learning a full-size membership matrix is inefficient. According to [43], [44], once the membership matrix associated with the anchors is available, the membership matrix of the data points can be obtained by a simple linear combination. Therefore, we propose to construct a scalable fuzzy clustering model by coupling the anchor graph and the membership matrix of anchors.

Define $U \in \mathbb{R}^{m \times c}$ as the membership matrix of anchors. For an anchor, to achieve a clear clustering division, the membership values belonging to different clusters should be clearly distinguished. Therefore, we obtain the membership matrix of anchors by solving the following problem

$$\max_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T B^T B\, U\big). \qquad (5)$$

However, problem (5) has a trivial solution in which all the anchors are partitioned into one cluster. To properly fine-tune the membership values of anchors, we introduce a membership regularization term into the objective function, defined as follows

$$\min_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T \mathbf{1}\mathbf{1}^T U\big). \qquad (6)$$

To better explain how Eq. (6) resolves the trivial solution, the regularization term on $U$, $\mathrm{Tr}(U^T \mathbf{1}\mathbf{1}^T U)$, can be written as

$$\mathrm{Tr}\big(U^T \mathbf{1}\mathbf{1}^T U\big) = \sum_{j=1}^{c}\Big(\sum_{i=1}^{m} |u_{ij}|\Big)^2. \qquad (7)$$

Eq. (7) can be considered as a combination of the $\ell_1$-norm of the elements in the same column and the $\ell_2$-norm over the columns' $\ell_1$-norms. Since the $\ell_1$-norm tends to produce sparse solutions, the construction of Eq. (7) essentially introduces a competition among the different clusters for anchors.

Using Eq. (6) as a regularization term, we have the overall optimization problem written as

$$\max_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \frac{\mathrm{Tr}(U^T B^T B\, U)}{\mathrm{Tr}(U^T \mathbf{1}\mathbf{1}^T U)}. \qquad (8)$$

This economical membership matrix learning model indeed mitigates the computational burden of the full-size model over all data points. After obtaining $U$, the membership matrix of the data points can be expressed as $F = BU$ according to the similarities between data points and anchors.

3.3 Optimization

To solve the challenging formulation in Eq. (8), we first transform it into an equivalent form that is easier to solve. Then an iterative optimization algorithm is developed to solve the equivalent model.

Suppose that problem (8) attains its maximum value $\lambda^*$ at $U = U^*$, that is,

$$\frac{\mathrm{Tr}(U^{*T} B^T B\, U^*)}{\mathrm{Tr}(U^{*T} \mathbf{1}\mathbf{1}^T U^*)} = \lambda^*, \qquad (9)$$

and

$$\frac{\mathrm{Tr}(U^T B^T B\, U)}{\mathrm{Tr}(U^T \mathbf{1}\mathbf{1}^T U)} \le \lambda^*, \quad \forall\, U. \qquad (10)$$

It can be derived from Eqs. (9) and (10) that

$$\max_{U}\ \mathrm{Tr}\big(U^T (B^T B - \lambda^* \mathbf{1}\mathbf{1}^T)\, U\big) \le 0. \qquad (11)$$

According to Eq. (9), we have $\mathrm{Tr}(U^{*T}(B^T B - \lambda^* \mathbf{1}\mathbf{1}^T)U^*) = 0$. Then, we obtain

$$\max_{U}\ \mathrm{Tr}\big(U^T (B^T B - \lambda^* \mathbf{1}\mathbf{1}^T)\, U\big) = 0. \qquad (12)$$

Define the function

$$f(\lambda) = \max_{U}\ \mathrm{Tr}\big(U^T (B^T B - \lambda \mathbf{1}\mathbf{1}^T)\, U\big). \qquad (13)$$

Obviously, $f(\lambda^*) = 0$. Therefore, we can obtain the optimal $\lambda^*$ by finding the root of the equation $f(\lambda) = 0$. In the next two parts, we develop an iterative algorithm to efficiently find the optimal solution $\lambda^*$. The detailed procedure of the proposed SFCAG is described in Algorithm 1.

Algorithm 1. The Algorithm to Solve Problem (8)
Input: Similarity matrix $B \in \mathbb{R}^{n \times m}$.
Output: Membership matrix of anchors $U \in \mathbb{R}^{m \times c}$.
1: while not converged do
2:   Calculate $\lambda = \frac{\mathrm{Tr}(U^T B^T B U)}{\mathrm{Tr}(U^T \mathbf{1}\mathbf{1}^T U)}$.
3:   Calculate $U$ according to Algorithm 2.
4: end while
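A compact sketch of the trace-ratio iteration in Algorithm 1 is given below (an illustration under our own naming conventions, not the authors' code). The solver for the U-subproblem (19) is passed in as a black box; the ALM routine sketched after Algorithm 2 can serve that role.

```python
import numpy as np

def solve_sfcag(B, c, solve_U, n_iter=30, tol=1e-8, seed=0):
    """Algorithm 1: alternate the lambda update (Eq. (18)) with the
    U-subproblem (19), maximizing the trace ratio in problem (8)."""
    m = B.shape[1]
    G = B.T @ B                                 # B^T B, computed once (m x m)
    ones = np.ones((m, m))                      # 1 1^T
    rng = np.random.default_rng(seed)
    U = rng.random((m, c))
    U /= U.sum(axis=1, keepdims=True)           # rows on the probability simplex
    lam = np.trace(U.T @ G @ U) / np.trace(U.T @ ones @ U)
    for _ in range(n_iter):
        U = solve_U(G - lam * ones, U)          # maximize Tr(U^T (B^T B - lam 11^T) U)
        new_lam = np.trace(U.T @ G @ U) / np.trace(U.T @ ones @ U)
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return U, lam
```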
3.3.1 λ-Subproblem

Given $U_t$, $\lambda_t$ can be calculated by

$$\lambda_t = \frac{\mathrm{Tr}(U_t^T B^T B\, U_t)}{\mathrm{Tr}(U_t^T \mathbf{1}\mathbf{1}^T U_t)}. \qquad (14)$$

Define $f(\lambda_t)$ as

$$f(\lambda_t) = \mathrm{Tr}\big(U_{t+1}^T (B^T B - \lambda_t \mathbf{1}\mathbf{1}^T)\, U_{t+1}\big), \qquad (15)$$

where $U_{t+1}$ can be efficiently calculated according to Algorithm 2, described in the next part.

Then, the slope of $f(\lambda)$ at the point $\lambda_t$ is

$$f'(\lambda_t) = -\mathrm{Tr}\big(U_{t+1}^T \mathbf{1}\mathbf{1}^T U_{t+1}\big) \le 0. \qquad (16)$$

We use a linear function $g(\lambda)$ to approximate $f(\lambda)$, such that

$$g(\lambda) = f'(\lambda_t)(\lambda - \lambda_t) + f(\lambda_t) = \mathrm{Tr}\big(U_{t+1}^T (B^T B - \lambda \mathbf{1}\mathbf{1}^T)\, U_{t+1}\big). \qquad (17)$$

Letting $g(\lambda_{t+1}) = 0$, we have

$$\lambda_{t+1} = \frac{\mathrm{Tr}(U_{t+1}^T B^T B\, U_{t+1})}{\mathrm{Tr}(U_{t+1}^T \mathbf{1}\mathbf{1}^T U_{t+1})}. \qquad (18)$$

Since $g(\lambda)$ is an approximation of $f(\lambda)$, we can update $\lambda_t$ by $\lambda_{t+1}$. Thus, we can find the root of $f(\lambda) = 0$ and the optimal solution of problem (8).

3.3.2 U-Subproblem

When $\lambda$ is obtained, the membership matrix of anchors $U$ can be calculated from

$$\max_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T (B^T B - \lambda \mathbf{1}\mathbf{1}^T)\, U\big). \qquad (19)$$

Defining $M = \lambda \mathbf{1}\mathbf{1}^T - B^T B$, problem (19) can be written as

$$\min_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T M U\big). \qquad (20)$$

Problem (20) is a quadratic programming problem, which can be efficiently solved by the Augmented Lagrangian Multiplier (ALM) method [45], [46]. Following ALM, a slack variable $P$ is introduced and problem (20) is equivalently transformed into

$$\min_{U\mathbf{1}=\mathbf{1},\,U\ge 0,\,U=P}\ \mathrm{Tr}\big(U^T M P\big). \qquad (21)$$

The optimal solution of problem (21) can be obtained by minimizing the following augmented Lagrangian function

$$\min_{U\mathbf{1}=\mathbf{1},\,U\ge 0,\,P}\ \mathrm{Tr}\big(U^T M P\big) + \frac{\mu}{2}\Big\|U - P + \frac{1}{\mu}S\Big\|_F^2, \qquad (22)$$

where $\mu$ is the penalty parameter and $S$ is the Lagrange multiplier matrix. There are two variables in problem (22); we iteratively optimize one variable while fixing the other, which leads to the following two subproblems.

Updating $P$ with $U$ fixed. Problem (22) becomes

$$\min_{P}\ \mathrm{Tr}\big(U^T M P\big) + \frac{\mu}{2}\Big\|U - P + \frac{1}{\mu}S\Big\|_F^2. \qquad (23)$$
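Before stating the closed-form update, it may help to spell out the stationarity condition that yields it (a step left implicit above):

$$\frac{\partial}{\partial P}\Big[\mathrm{Tr}(U^T M P) + \frac{\mu}{2}\Big\|U - P + \frac{1}{\mu}S\Big\|_F^2\Big] = M^T U - \mu\Big(U - P + \frac{1}{\mu}S\Big) = 0,$$

which rearranges directly to the update in Eq. (24) below.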

Taking the derivative of Eq. (23) w.r.t. $P$ and setting it to zero, we obtain

$$P = \frac{1}{\mu}\big(\mu U + S - M^T U\big). \qquad (24)$$

Updating $U$ with $P$ fixed. Problem (22) becomes

$$\min_{U\mathbf{1}=\mathbf{1},\,U\ge 0}\ \mathrm{Tr}\big(U^T M P\big) + \frac{\mu}{2}\Big\|U - P + \frac{1}{\mu}S\Big\|_F^2. \qquad (25)$$

Letting $Z = MP$, problem (25) is equivalent to

$$\min_{u_i^T\mathbf{1}=1,\,u_i\ge 0}\ \sum_{i=1}^{m}\sum_{j=1}^{c} u_{ij} z_{ij} + \frac{\mu}{2}\sum_{i=1}^{m}\sum_{j=1}^{c}\Big(u_{ij} - p_{ij} + \frac{1}{\mu}s_{ij}\Big)^2. \qquad (26)$$

The rows in problem (26) are independent of each other; hence, we divide problem (26) into $m$ subproblems as follows

$$\min_{u_i^T\mathbf{1}=1,\,u_i\ge 0}\ \frac{2}{\mu}\sum_{j=1}^{c} u_{ij} z_{ij} + \sum_{j=1}^{c}\Big(u_{ij} - p_{ij} + \frac{1}{\mu}s_{ij}\Big)^2, \qquad (27)$$

which can be transformed into the more compact form

$$\min_{u_i^T\mathbf{1}=1,\,u_i\ge 0}\ \Big\|u_i - \Big(p_i - \frac{1}{\mu}s_i - \frac{1}{\mu}z_i\Big)\Big\|_2^2. \qquad (28)$$

We use the method of [47] to solve problem (28) and update each membership vector, which admits a closed-form solution. The detailed algorithm for solving problem (19) is listed in Algorithm 2.

Algorithm 2. The Algorithm to Solve Problem (19)
Input: Similarity matrix $B \in \mathbb{R}^{n \times m}$, $\mu > 0$, $1 < \rho < 2$, $S$.
Output: Membership matrix of anchors $U \in \mathbb{R}^{m \times c}$.
1: while not converged do
2:   Update $P$ by Eq. (24).
3:   Update $U$ by Eq. (28).
4:   Update $S$ by $S = S + \mu(U - P)$.
5:   Update $\mu$ by $\mu = \rho\mu$.
6: end while
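The following sketch mirrors Algorithm 2 in Python. The closed-form solver of [47] for problem (28) is replaced by the standard sorting-based Euclidean projection onto the probability simplex, which computes the same projection; the parameter values (mu, rho) and the cap on mu are our illustrative choices, not the paper's.

```python
import numpy as np

def project_simplex_rows(Y):
    """Euclidean projection of each row of Y onto {u : u >= 0, sum(u) = 1}."""
    n, c = Y.shape
    Ys = -np.sort(-Y, axis=1)                    # rows sorted in descending order
    css = np.cumsum(Ys, axis=1) - 1.0
    ind = np.arange(1, c + 1)
    rho = (Ys - css / ind > 0).sum(axis=1)       # size of the support, always >= 1
    theta = css[np.arange(n), rho - 1] / rho
    return np.maximum(Y - theta[:, None], 0.0)

def alm_solve_U(A, U, mu=1.0, rho=1.5, n_iter=100):
    """Algorithm 2: maximize Tr(U^T A U) s.t. U1 = 1, U >= 0 via ALM,
    where A = B^T B - lambda * 1 1^T, so M = -A in problem (20)."""
    M = -A
    S = np.zeros_like(U)
    for _ in range(n_iter):
        P = U + (S - M.T @ U) / mu                       # Eq. (24)
        U = project_simplex_rows(P - (S + M @ P) / mu)   # Eq. (28), with Z = M P
        S = S + mu * (U - P)                             # step 4
        mu = min(rho * mu, 1e8)                          # step 5, capped as a safeguard
    return U
```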
3.4 Convergence Analysis

We prove the convergence of the proposed method.

Theorem 1. Algorithm 1 monotonically increases the objective function value of problem (8) in each iteration until convergence.

Proof.

$$\lambda_t = \frac{\mathrm{Tr}(U_t^T B^T B\, U_t)}{\mathrm{Tr}(U_t^T \mathbf{1}\mathbf{1}^T U_t)} \le \max_{U}\ \frac{\mathrm{Tr}(U^T B^T B\, U)}{\mathrm{Tr}(U^T \mathbf{1}\mathbf{1}^T U)}. \qquad (29)$$

From Eq. (16), we know that $f(\lambda)$ is a monotonically decreasing function, and $f(\lambda_t) \ge 0$. According to Eq. (15), we have

$$\frac{\mathrm{Tr}(U_{t+1}^T B^T B\, U_{t+1})}{\mathrm{Tr}(U_{t+1}^T \mathbf{1}\mathbf{1}^T U_{t+1})} \ge \lambda_t. \qquad (30)$$

That is, $\lambda_{t+1} \ge \lambda_t$: $\lambda$ is monotonically increasing in each iteration. Therefore, Algorithm 1 monotonically increases the objective function value of problem (8) in each iteration until convergence. □

3.5 Computational Complexity

The computational complexity of SFCAG comprises three stages: 1) anchor generation by the BKHK method of [38], which needs $O(nd\log(m)\,t_1)$, where $t_1$ is the number of BKHK iterations; 2) anchor-based similarity graph construction, which costs $O(ndm + nm\log(m))$; 3) membership matrix learning, where we iteratively update $\lambda$ and $U$, which needs $O(mnc)$ and $O(m^2 c t_2)$, respectively, so the third stage costs $O(mnc + m^2 c t_2)$, where $t_2$ is the number of ALM iterations.

Therefore, the overall computational complexity of SFCAG is $O(nd\log(m)\,t_1 + ndm + nm\log(m) + mnc + m^2 c t_2)$. Considering that $m \ll n$, $c \le m$, and $t_1$ and $t_2$ are usually quite small, the overall computational complexity of SFCAG reduces to $O(ndm)$, which is linear in the data size.

4 EXPERIMENTS

In this section, we present experimental results on synthetic and real-world datasets to investigate the effectiveness of the proposed method.

4.1 Datasets

To evaluate the performance of the proposed SFCAG, we conduct experiments on eight widely used datasets: Cars, Ecoli, Waveform, Connect-4, SensIT, MNIST2, Covertype and EMNIST. Among them, the Cars, Ecoli, Waveform and Connect-4 datasets are downloaded from the UCI machine learning repository (http://archive.ics.uci.edu/ml/datasets.php); the SensIT and Covertype datasets are downloaded from the LibSVM datasets page (https://www.csie.ntu.edu.tw/~cjlin/libsvm/); MNIST2 [48] and EMNIST [49] are handwritten digit image datasets. To verify the scalability of the proposed method, we divide these datasets into three groups according to data size: small datasets (Cars, Ecoli, and Waveform), medium datasets (Connect-4, SensIT and MNIST2) and large datasets (Covertype and EMNIST). A brief description of these datasets is listed in Table 2.

4.2 Comparison Methods

We compare the proposed SFCAG with KM [7]; fuzzy clustering methods, i.e., FCM [15], RSFCM [17] and MSFCM [34]; anchor graph based large-scale clustering methods, i.e., LSC [37], FSC [38] and FFCAG [40]; and deep clustering methods, i.e., PLRSC [50] and DFKM [20].

- KM is the simplest and most widely used hard clustering method, which assigns each sample to a definite cluster.
- FCM is a pioneering work in fuzzy clustering, which designs a membership matrix with values between 0 and 1 to indicate which cluster each data point belongs to.
- RSFCM is a robust and sparse fuzzy c-means method that introduces a robust function to handle outliers

and a penalty function to make the memberships of the data points sparse.
- MSFCM is a membership scaling fuzzy c-means algorithm, which draws the triangle inequality into fuzzy clustering to accelerate convergence.
- LSC is an anchor-based scalable spectral clustering approach, which represents data points as sparse linear combinations of anchors such that the spectral embedding can be computed efficiently.
- FSC is a fast spectral clustering method for large-scale data, which can efficiently perform spectral analysis on the anchor-based similarity graph.
- FFCAG is a fast fuzzy clustering algorithm based on the anchor graph, which integrates anchor graph construction and membership learning into a unified framework.
- PLRSC is a projective low-rank subspace clustering method based on a non-iterative deep encoder that quickly calculates the low-rank coding subspaces for large-scale clustering problems.
- DFKM is a deep fuzzy k-means method with an adaptive loss function and entropy regularization that simultaneously performs deep feature extraction and fuzzy clustering.

TABLE 2
Datasets Description

Size | Datasets | # of Instances | # of Features | # of Clusters
Small | Cars | 392 | 8 | 3
Small | Ecoli | 336 | 343 | 8
Small | Waveform | 2749 | 21 | 3
Medium | Connect-4 | 67557 | 126 | 3
Medium | SensIT | 98528 | 100 | 3
Medium | MNIST2 | 12000 | 256 | 2
Large | Covertype | 581012 | 54 | 7
Large | EMNIST | 630000 | 86 | 10

4.3 Evaluation Metrics and Parameter Settings

To quantify the clustering performance of the proposed method, four widely used evaluation metrics are adopted: clustering Accuracy (ACC), Normalized Mutual Information (NMI), Adjusted Rand Index (ARI) and F-score. A larger value indicates better clustering performance. More detailed information about these metrics can be found in [51].

In our experiments, the number of clusters is set according to the ground-truth labels. We group FCM, RSFCM and MSFCM as "fuzzy clustering methods". For FCM and MSFCM, there is one parameter, the fuzziness weighting exponent $b$, which we set to 1.2. For RSFCM, there are two parameters: the regularization parameter $\lambda$ and the threshold value $\epsilon$. The optimal value of the regularization parameter is searched in the range $[0.1, 10]$ with step 0.5, and the threshold value is set to 1 according to the original paper. Since LSC, FSC, FFCAG and the proposed method are all based on the anchor graph, we group them as "anchor graph based methods". These methods share two parameters, the number of nearest neighbors $k$ and the number of anchors $m$. The number of nearest neighbors $k$ is set to 15, and the number of anchors is searched in the range $[2^6, 2^7, 2^8, 2^9, 2^{10}, 2^{11}]$ for better performance. For FFCAG, $\gamma_1$ is searched in $[0.1, 1]$ with step 0.1, and the optimal value of $\gamma_2$ is set by grid search over $[10^{-6}, 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}]$. Since PLRSC and DFKM are based on deep encoders, we group them as "deep clustering methods". There are six tuning hyperparameters in PLRSC and DFKM; we search for the optimal parameters following the patterns recorded in the corresponding papers. To offset the influence of randomness, all methods are run 20 times and the means are reported.

All the experiments are run on a Windows 10 machine with an Intel(R) Core(TM) i5-8265 1.6 GHz CPU and 8 GB main memory. The code of the deep clustering methods is implemented under PyTorch 1.11.0 with Python 3.9, and the code of the other methods is implemented under MATLAB R2018b.

4.4 Experimental Results

In this section, we evaluate the clustering performance of the proposed algorithm on synthetic and real-world datasets.

4.4.1 Experiments on a Synthetic Dataset

The synthetic dataset we use is a two-moon dataset. It has 200 data points, divided into two clusters of equal size. The qualitative clustering results are visualized in Fig. 1. Moreover, we provide a quantitative evaluation of the clustering results of all methods, reported in Table 3.

From Fig. 1, we can clearly see the results obtained by the proposed method at each stage. In Fig. 1a, the two clusters are presented with cyan and blue points. The red points in Fig. 1b are the selected representative anchors, which effectively capture the intrinsic distribution of the data points. The constructed anchor graph is exhibited in Fig. 1c; several pairs of data points and anchors from different clusters are connected. Figs. 1d and 1e portray the membership values of the anchors and the data points learned by the proposed method, respectively. The clustering result is depicted in Fig. 1f. Although several pairs of data points and anchors from different clusters are connected, the proposed method successfully partitions the data points into two clusters. Moreover, the clustering performance of our method is also superior to that of the other methods, as reported in Table 3, which demonstrates its effectiveness.

4.4.2 Experiments on Real-World Datasets

The experimental results in terms of ACC, NMI, ARI and F-score on the small, medium and large datasets are shown in Tables 4, 5, and 6, respectively. The best clustering results are highlighted in boldface, and the second best results are underlined.

Fig. 1. Visualization of the clustering results of the proposed SFCAG on the two-moon dataset.

TABLE 3
Clustering Performance Comparisons of Different Methods on the Synthetic Dataset

Methods KM FCM RSFCM MSFCM PLRSC DFKM LSC FSC FFCAG SFCAG
ACC 0.8300 0.8200 0.8500 0.8200 0.9400 0.8900 0.9480 0.9775 1 1
NMI 0.3425 0.3199 0.3904 0.3199 0.7024 0.5006 0.7716 0.8871 1 1
ARI 0.4328 0.4066 0.4874 0.4066 0.7733 0.6064 0.8147 0.9148 1 1
F-score 0.7150 0.7018 0.7425 0.7018 0.8860 0.7967 0.9085 0.9554 1 1
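As a hypothetical end-to-end illustration of the pipeline on two-moon data, the sketches from Sections 3.1 and 3.3 can be chained as below; the helper names and parameter values are ours, not the paper's, and the simple permutation-based ACC works only for two clusters.

```python
import numpy as np
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.05, random_state=0)  # two balanced clusters

B, anchors = build_anchor_graph(X, m=32, k=5)        # anchor graph (Section 3.1)
U, lam = solve_sfcag(B, c=2, solve_U=alm_solve_U)    # Algorithms 1 and 2
F = B @ U                                            # data memberships, F = B U
labels = F.argmax(axis=1)                            # hard assignment for evaluation
acc = max((labels == y).mean(), (labels != y).mean())  # 2-cluster label permutation
print(f"objective {lam:.4f}, ACC {acc:.4f}")
```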

From these tables, we have the following observations:

1) The proposed SFCAG is superior to the baselines for any data size in most cases. The better clustering performance benefits from the fact that the proposed SFCAG integrates anchor graph construction and membership matrix learning into a unified framework, which not only helps reduce the computational cost but also yields better performance. Since these datasets come from different scenarios, the clustering results clearly show that the proposed SFCAG is an effective and promising clustering algorithm with good scalability across domains.

2) SFCAG obtains the best performance on most datasets. The Covertype dataset is extremely unbalanced, and the clustering results of most comparison methods contain empty clusters, which harms clustering quality. Nevertheless, FFCAG introduces a balanced regularization to constrain the size of each cluster and obtains the best clustering performance there. Even so, the proposed method still achieves the second best clustering result.

3) Among the anchor graph based methods, FSC and LSC do not perform as well as the proposed model, since they need post-processing to obtain the clustering assignment, which may cause information loss and poor clustering performance. The clustering results of FFCAG are often second only to the proposed SFCAG. The reason may be that both FFCAG and the proposed method adopt the anchor graph technique to balance clustering performance and computational cost. Whereas FFCAG has two parameters that need to be manually tuned, the proposed SFCAG is parameter-free. It avoids not only the laborious parameter tuning process but also the influence of the parameters on the robustness of the algorithm. Therefore, the proposed method can achieve better performance.

4) The anchor graph based clustering methods almost always obtain better performance than the others. This phenomenon illustrates that the anchor graph can capture the essential structure of complicated data well, which tends to enhance the clustering performance. It also shows that introducing the anchor graph technique into fuzzy clustering to improve its scalability is effective and feasible.

5) Deep clustering methods perform poorly on small datasets, which may be caused by over-fitting. For medium and large datasets, DFKM performs better than PLRSC, since DFKM conducts deep feature extraction and clustering simultaneously, while PLRSC first learns a low-rank representation with a deep encoder and then adopts LSC to segment the representation; each independent stage inevitably produces information loss and thus unreliable clustering assignments.

6) KM and FCM always have poor clustering performance since they depend on the initialization of the cluster centers or the fuzzy membership matrix and can only handle convex clusters, while MSFCM performs better because it introduces the triangle inequality to scale the membership degrees of selected samples and enhance clustering quality. RSFCM adopts the $\ell_1$-norm instead of the squared Frobenius norm as the loss function to enhance robustness to outliers.

TABLE 4
Clustering Performance Comparison on Three Small Datasets
(Method groups as in the paper — fuzzy clustering: KM, FCM, RSFCM, MSFCM; deep clustering: PLRSC, DFKM; anchor graph based clustering: LSC, FSC, FFCAG, SFCAG.)

Dataset   Metric    KM      FCM     RSFCM   MSFCM   PLRSC   DFKM    LSC     FSC     FFCAG   SFCAG
Cars      ACC       0.4474  0.4566  0.4898  0.4577  0.5572  0.5791  0.5218  0.4872  0.6888  0.6888
Cars      NMI       0.1947  0.1862  0.2099  0.1899  0.2413  0.2233  0.2039  0.2011  0.2773  0.2923
Cars      ARI       0.0722  0.0821  0.1351  0.0829  0.2052  0.1778  0.1685  0.1515  0.3314  0.3384
Cars      F-score   0.4485  0.4494  0.4729  0.5273  0.5264  0.5504  0.4913  0.5704  0.6643  0.6643
Ecoli     ACC       0.5399  0.6364  0.6637  0.6256  0.5061  0.6399  0.5352  0.6369  0.7530  0.7619
Ecoli     NMI       0.4889  0.3041  0.5043  0.2624  0.3525  0.4481  0.4876  0.5133  0.5311  0.5622
Ecoli     ARI       0.4013  0.4289  0.5456  0.4003  0.3390  0.5347  0.3629  0.4972  0.6799  0.6986
Ecoli     F-score   0.5215  0.6107  0.6707  0.6073  0.5062  0.5678  0.4723  0.6513  0.7796  0.7907
Waveform  ACC       0.5114  0.5022  0.6449  0.5089  0.5693  0.6777  0.5572  0.5496  0.7541  0.7633
Waveform  NMI       0.3624  0.3296  0.3175  0.3411  0.3815  0.4074  0.3812  0.3718  0.4239  0.4538
Waveform  ARI       0.2540  0.2423  0.3790  0.2476  0.2754  0.3634  0.2721  0.2572  0.4174  0.4452
Waveform  F-score   0.5041  0.4960  0.6241  0.4987  0.5185  0.5883  0.5149  0.5042  0.6267  0.6308

TABLE 5
Clustering Performance Comparison on Three Medium Datasets
(Method groups as in Table 4.)

Dataset    Metric    KM      FCM     RSFCM   MSFCM   PLRSC   DFKM    LSC     FSC     FFCAG   SFCAG
Connect-4  ACC       0.3762  0.4095  0.3848  0.4032  0.6389  0.6442  0.3683  0.4912  0.4802  0.5412
Connect-4  NMI       0.0016  0.0006  0.0004  0.0005  0.0007  0.0001  0.0003  0.0009  0.0012  0.0018
Connect-4  ARI       0.0004  0.0002  0.0006  0.0005  0.0023  0.0017  0.0001  0.0016  0.0035  0.0041
Connect-4  F-score   0.4079  0.4059  0.4161  0.4166  0.6505  0.6567  0.4077  0.5040  0.5022  0.5424
SensIT     ACC       0.6835  0.6915  0.7064  0.7105  0.7018  0.7013  0.6742  0.6644  0.6913  0.7352
SensIT     NMI       0.3021  0.3088  0.3815  0.3139  0.3193  0.2927  0.3078  0.3031  0.3249  0.3421
SensIT     ARI       0.3634  0.3766  0.3831  0.3879  0.3647  0.3404  0.3365  0.3088  0.3591  0.3952
SensIT     F-score   0.6006  0.6096  0.5940  0.6108  0.5979  0.5534  0.5843  0.5579  0.5935  0.6168
MNIST2     ACC       0.9438  0.9417  0.9552  0.9379  0.9575  0.9696  0.9662  0.9482  0.9621  0.9833
MNIST2     NMI       0.7090  0.7025  0.7442  0.6900  0.7193  0.8110  0.7892  0.7186  0.7874  0.8807
MNIST2     ARI       0.7879  0.7803  0.8278  0.7671  0.7647  0.8820  0.8946  0.8035  0.8451  0.9344
MNIST2     F-score   0.8944  0.8907  0.9141  0.8842  0.9479  0.9592  0.9348  0.8994  0.9271  0.9672

TABLE 6
Clustering Performance Comparison on Two Large Datasets
(Method groups as in Table 4.)

Dataset    Metric    KM      FCM     RSFCM   MSFCM   PLRSC   DFKM    LSC     FSC     FFCAG   SFCAG
Covertype  ACC       0.2506  0.2414  0.2302  0.2500  0.3739  0.4362  0.2239  0.2709  0.4638  0.4530
Covertype  NMI       0.0617  0.0652  0.0618  0.0638  0.0364  0.0501  0.0639  0.0700  0.0446  0.0483
Covertype  ARI       0.0045  0.0045  0.0098  0.0002  0.0436  0.0014  0.0181  0.0139  0.0202  0.0106
Covertype  F-score   0.2429  0.2318  0.2231  0.2389  0.4115  0.4116  0.2264  0.3940  0.4381  0.4151
EMNIST     ACC       0.4720  0.3218  0.4635  0.4029  0.5404  0.4052  0.5785  0.5853  0.6652  0.6989
EMNIST     NMI       0.4023  0.1974  0.4003  0.2718  0.5541  0.2491  0.5578  0.5864  0.6078  0.6316
EMNIST     ARI       0.2954  0.1817  0.2894  0.2289  0.4211  0.2853  0.4519  0.4567  0.5346  0.5510
EMNIST     F-score   0.3671  0.2907  0.3616  0.3335  0.4871  0.3093  0.5098  0.5214  0.5875  0.6028

4.5 Parameter Sensitivity Analysis

In this subsection, we report a detailed analysis of the number of anchors $m$, varied over $[2^5, 2^6, 2^7, 2^8, 2^9, 2^{10}]$, in terms of ACC and running time on the Waveform, SensIT and Covertype datasets. The experimental results are presented in Fig. 2. Since FCM, RSFCM and MSFCM have no parameter to tune, their ACC and running time remain constant, indicated by the green, yellow and magenta dashed lines, respectively.

Fig. 2. Clustering ACC and running time versus the number of selected anchors on Waveform, SensIT and Covertype datasets.

Fig. 3. Running time comparison of each method on eight real datasets.

From Fig. 2, we observe that SFCAG achieves the best ACC as the number of anchors increases. When the number of anchors is small, the performance of all anchor graph based methods is poor, since the selected anchors cannot effectively represent the intrinsic structure of the data points. When the number of anchors reaches $2^7$, SFCAG obtains better or comparable ACC compared with the other methods. In addition, as a general trend, the running time of the graph-based methods grows as the number of anchors increases. Even so, the running time of the proposed method on the large dataset is much lower than that of the other fuzzy clustering methods. From these findings, we conclude that the proposed SFCAG achieves an elegant balance between computational burden and clustering performance.

4.6 Computational Efficiency

We now evaluate the computational efficiency of the proposed SFCAG in terms of running time and scalability.

4.6.1 Running Time

The running time of the different methods on all datasets is illustrated in Fig. 3. From this figure, we draw the following conclusions:

- The running time of SFCAG is more advantageous than that of the compared methods on medium and large datasets. Specifically, the time cost of SFCAG is much less than that of the fuzzy clustering methods (FCM, RSFCM, and MSFCM), while it is comparable to that of the anchor graph based methods (LSC, FSC, and FFCAG). Although the running time of the proposed method is larger on small datasets, it obtains better clustering performance there.
- For small datasets, the fuzzy clustering methods often run more efficiently, as shown in Figs. 3a, 3b, and 3c. The reason may be that the amount of distance computation is small, since few data points are involved in updating the solution per iteration, which leads to high efficiency.
- For medium and large datasets, the anchor graph based methods are much faster than the fuzzy clustering methods, as shown in Figs. 3g and 3h. The reason is that the number of selected anchors is far smaller than the number of data points, which greatly reduces the computational cost and accelerates the subsequent clustering procedure. We can see that SFCAG achieves the best performance

in most cases; the running time of our SFCAG is inferior only to FFCAG on the Connect-4 dataset.

Additionally, the computational complexities of all comparison methods are reported in Table 7 to reveal the efficiency of the proposed method and support the experimental results shown in Fig. 3. As can be seen from Fig. 3 and Table 7, the running time is roughly consistent with the computational complexity.

TABLE 7
Computational Complexity of All Comparison Methods

Method     | FCM, RSFCM, MSFCM | LSC      | FSC     | FFCAG   | SFCAG
Complexity | O(ndct)           | O(nm^2)  | O(ndm)  | O(ndm)  | O(ndm)

n: number of data points; m: number of anchors; d: number of dimensions; c: number of clusters; t: number of iterations.

4.6.2 Scalability

To further verify the efficiency of the proposed algorithm, we conduct a scalability experiment on two datasets (SensIT and Covertype). Each dataset is divided into 10 balanced subsets. We then evaluate the running time of SFCAG while increasing the number of subsets and perform a polynomial fit. The results are displayed in Fig. 4, from which we conclude that the test results on each dataset follow a linear polynomial distribution and the time complexity of our algorithm is around $O(n)$, as discussed in Section 3.5.

Fig. 4. Scalability evaluation on SensIT and Covertype datasets.
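A hypothetical version of this scalability check, with the pipeline sketches above standing in for the paper's MATLAB implementation and synthetic stand-in data, could look as follows; the linear fit via numpy.polyfit mirrors the polynomial fit described here.

```python
import time
import numpy as np
from sklearn.datasets import make_moons

X_full, _ = make_moons(n_samples=50000, noise=0.05, random_state=0)  # stand-in data
sizes, times = [], []
for frac in range(1, 11):                       # 10 growing balanced subsets
    Xs = X_full[: len(X_full) * frac // 10]
    t0 = time.perf_counter()
    B, _ = build_anchor_graph(Xs, m=64, k=5)
    U, _ = solve_sfcag(B, c=2, solve_U=alm_solve_U)
    times.append(time.perf_counter() - t0)
    sizes.append(len(Xs))
slope, intercept = np.polyfit(sizes, times, deg=1)  # near-linear growth expected
print(f"running time ~ {slope:.2e} * n + {intercept:.2e}")
```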
4.7 Convergence Analysis

We theoretically proved in Section 3.4 that the objective function value is monotonically increasing. Here, the convergence curves of our model on the eight datasets are shown in Fig. 5. In each sub-figure, the x-axis is the number of iterations and the y-axis is the objective function value. It can be observed that the proposed method converges quickly on all datasets; specifically, it converges within five iterations, which indicates the efficiency gained by introducing the anchor-based graph into fuzzy clustering.

Fig. 5. The convergence curves of our method on real datasets. In general, the objective function on each dataset converges in fewer than 5 iterations.

5 CONCLUSION

In this paper, we propose a scalable fuzzy clustering algorithm in which the anchor graph and the membership matrix are learned in succession to address the scalability issue plaguing fuzzy clustering. We adopt a parameter-free yet effective neighbor assignment strategy to construct an anchor graph capturing the similarities between data points and anchors. A maximized trace ratio model is designed by coupling the anchor graph and the membership matrix to learn the membership matrix of anchors. Moreover, an iterative optimization algorithm is designed to solve the proposed model. As a result, SFCAG scales linearly with the data size and achieves an elegant balance between computational burden and clustering performance. Experimental results on synthetic and real datasets demonstrate the superiority and scalability of the proposed SFCAG model.

Several questions remain to be investigated in the future:

- How to generate anchors that effectively capture the manifold structure of the data points is important. Therefore, more elegant methods for selecting optimal representative anchors need to be designed in the future.
- Designing a better anchor graph to more accurately represent the similarities between data points and anchors is also a good direction for future work.

REFERENCES

[1] J. Gu, L. Jiao, S. Yang, and F. Liu, "Fuzzy double c-means clustering based on sparse self-representation," IEEE Trans. Fuzzy Syst., vol. 26, no. 2, pp. 612-626, Apr. 2017.
[2] T. Lei, X. Jia, Y. Zhang, L. He, H. Meng, and A. K. Nandi, "Significantly fast and robust fuzzy C-means clustering algorithm based on morphological reconstruction and membership filtering," IEEE Trans. Fuzzy Syst., vol. 26, no. 5, pp. 3027-3041, Oct. 2018.
[3] I.-J. Chiang, C. C.-H. Liu, Y.-H. Tsai, and A. Kumar, "Discovering latent semantics in web documents using fuzzy clustering," IEEE Trans. Fuzzy Syst., vol. 23, no. 6, pp. 2122-2134, Dec. 2015.
[4] J.-P. Mei, Y. Wang, L. Chen, and C. Miao, "Large scale document categorization with fuzzy clustering," IEEE Trans. Fuzzy Syst., vol. 25, no. 5, pp. 1239-1251, Oct. 2017.
[5] L. Hu and K. C. Chan, "Fuzzy clustering in a complex network based on content relevance and link structures," IEEE Trans. Fuzzy Syst., vol. 24, no. 2, pp. 456-470, Apr. 2015.
[6] A. Pister, P. Buono, J.-D. Fekete, C. Plaisant, and P. Valdivia, "Integrating prior knowledge in mixed-initiative social network clustering," IEEE Trans. Vis. Comput. Graphics, vol. 27, no. 2, pp. 1775-1785, Feb. 2021.
[7] J. MacQueen et al., "Some methods for classification and analysis of multivariate observations," in Proc. 5th Berkeley Symp. Math. Statist. Probability, 1967, pp. 281-297.
[8] J. Ye, Z. Zhao, and M. Wu, "Discriminative K-means for clustering," in Proc. Adv. Neural Inf. Process. Syst., 2007, pp. 1649-1656.
[9] C. Boutsidis, A. Zouzias, M. W. Mahoney, and P. Drineas, "Randomized dimensionality reduction for K-means clustering," IEEE Trans. Inf. Theory, vol. 61, no. 2, pp. 1045-1062, Feb. 2015.
[10] X. Shen, W. Liu, I. Tsang, F. Shen, and Q.-S. Sun, "Compressed K-means for large-scale clustering," in Proc. 31st AAAI Conf. Artif. Intell., 2017, pp. 2527-2533.
[11] U. Von Luxburg, "A tutorial on spectral clustering," Statist. Comput., vol. 17, no. 4, pp. 395-416, 2007.
[12] Y. Pang, J. Xie, F. Nie, and X. Li, "Spectral clustering by joint spectral embedding and spectral rotation," IEEE Trans. Cybern., vol. 50, no. 1, pp. 247-258, Jan. 2020.
[13] Z. Wang, Z. Li, R. Wang, F. Nie, and X. Li, "Large graph clustering with simultaneous spectral embedding and discretization," IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 12, pp. 4426-4440, Dec. 2020.
[14] L. Zadeh, "Fuzzy sets," Inf. Control, vol. 8, no. 3, pp. 338-353, 1965.
[15] J. C. Bezdek, R. Ehrlich, and W. Full, "FCM: The fuzzy c-means clustering algorithm," Comput. Geosci., vol. 10, no. 2/3, pp. 191-203, 1984.
[16] S. Krinidis and V. Chatzis, "A robust fuzzy local information c-means clustering algorithm," IEEE Trans. Image Process., vol. 19, no. 5, pp. 1328-1337, May 2010.
[17] J. Xu, J. Han, K. Xiong, and F. Nie, "Robust and sparse fuzzy k-means clustering," in Proc. 25th Int. Joint Conf. Artif. Intell., 2016, pp. 2224-2230.
[18] Z. Bian, H. Ishibuchi, and S. Wang, "Joint learning of spectral clustering structure and fuzzy similarity matrix of data," IEEE Trans. Fuzzy Syst., vol. 27, no. 1, pp. 31-44, Jan. 2019.
[19] L. Guo, L. Chen, X. Lu, and C. P. Chen, "Membership affinity lasso for fuzzy clustering," IEEE Trans. Fuzzy Syst., vol. 28, no. 2, pp. 294-307, Feb. 2020.
[20] R. Zhang, X. Li, H. Zhang, and F. Nie, "Deep fuzzy k-means with adaptive loss and entropy regularization," IEEE Trans. Fuzzy Syst., vol. 28, no. 11, pp. 2814-2824, Nov. 2020.
[21] L. Chen, C. P. Chen, and M. Lu, "A multiple-kernel fuzzy c-means algorithm for image segmentation," IEEE Trans. Syst., Man, Cybern., vol. 41, no. 5, pp. 1263-1274, Oct. 2011.
[22] H. Wei, L. Chen, and L. Guo, "KL divergence-based fuzzy cluster ensemble for image segmentation," Entropy, vol. 20, no. 4, 2018, Art. no. 273.
[23] M.-S. Yang and Y. Nataliani, "A feature-reduction fuzzy clustering algorithm based on feature-weighted entropy," IEEE Trans. Fuzzy Syst., vol. 26, no. 2, pp. 817-835, Apr. 2018.
[24] S. Eschrich, J. Ke, L. O. Hall, and D. B. Goldgof, "Fast accurate fuzzy clustering through data reduction," IEEE Trans. Fuzzy Syst., vol. 11, no. 2, pp. 262-270, Apr. 2003.
[25] M. B. Al-Zoubi, A. Hudaib, and B. Al-Shboul, "A fast fuzzy clustering algorithm," in Proc. 6th WSEAS Int. Conf. Artif. Intell. Knowl. Eng. Data Bases, 2007, pp. 28-32.
[26] J. K. Parker and L. O. Hall, "Accelerating fuzzy-c means using an estimated subsample size," IEEE Trans. Fuzzy Syst., vol. 22, no. 5, pp. 1229-1244, Oct. 2014.
[27] R. J. Hathaway and Y. Hu, "Density-weighted fuzzy c-means clustering," IEEE Trans. Fuzzy Syst., vol. 17, no. 1, pp. 243-252, Feb. 2009.
[28] I. A. Atiyah, A. Mohammadpour, and S. M. Taheri, "KC-means: A fast fuzzy clustering," Adv. Fuzzy Syst., vol. 2018, pp. 1-8, 2018.
[29] K. Zou, Z. Wang, and M. Hu, "A new initialization method for fuzzy c-means algorithm," Fuzzy Optim. Decis. Mak., vol. 7, no. 4, pp. 409-416, 2008.
[30] Q. Yang, D. Zhang, and F. Tian, "An initialization method for fuzzy c-means algorithm using subtractive clustering," in Proc. 3rd Int. Conf. Intell. Netw. Intell. Syst., 2010, pp. 393-396.
[31] Z. Cebeci, "Initialization of membership degree matrix for fast convergence of fuzzy c-means clustering," in Proc. Int. Conf. Artif. Intell. Data Process., 2018, pp. 1-5.
[32] Z. Cebeci and C. Cebeci, "A fast algorithm to initialize cluster centroids in fuzzy clustering applications," Information, vol. 11, no. 9, 2020, Art. no. 446.
[33] Y. Shen, W. Pedrycz, Y. Chen, X. Wang, and A. Gacek, "Hyperplane division in fuzzy c-means: Clustering big data," IEEE Trans. Fuzzy Syst., vol. 28, no. 11, pp. 3032-3046, Nov. 2020.
[34] S. Zhou, D. Li, Z. Zhang, and R. Ping, "A new membership scaling fuzzy C-means clustering algorithm," IEEE Trans. Fuzzy Syst., vol. 29, no. 9, pp. 2810-2818, Sep. 2020.
[35] M. Wang, W. Fu, S. Hao, D. Tao, and X. Wu, "Scalable semi-supervised learning by efficient anchor graph regularization," IEEE Trans. Knowl. Data Eng., vol. 28, no. 7, pp. 1864-1877, Jul. 2016.
[36] H. Hu, K. Wang, C. Lv, J. Wu, and Z. Yang, "Semi-supervised metric learning-based anchor graph hashing for large-scale image retrieval," IEEE Trans. Image Process., vol. 28, no. 2, pp. 739-754, Feb. 2019.
[37] D. Cai and X. Chen, "Large scale spectral clustering via landmark-based sparse representation," IEEE Trans. Cybern., vol. 45, no. 8, pp. 1669-1680, Aug. 2014.
[38] W. Zhu, F. Nie, and X. Li, "Fast spectral clustering with efficient large graph construction," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 2017, pp. 2492-2496.
[39] H. Hu, R. Wang, F. Nie, X. Yang, and W. Yu, "Fast unsupervised feature selection with anchor graph and l_{2,1}-norm regularization," Multimedia Tools Appl., vol. 77, no. 17, pp. 22099-22113, 2018.
[40] F. Nie, C. Liu, R. Wang, Z. Wang, and X. Li, "Fast fuzzy clustering based on anchor graph," IEEE Trans. Fuzzy Syst., vol. 30, no. 7, pp. 2375-2387, Jul. 2022.
[41] A. Guillon, M.-J. Lesot, and C. Marsala, "Laplacian regularization for fuzzy subspace clustering," in Proc. IEEE Int. Conf. Fuzzy Syst., 2017, pp. 1-6.
[42] F. Nie, X. Wang, and H. Huang, "Clustering and projected clustering with adaptive neighbors," in Proc. 20th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2014, pp. 977-986.
[43] X. Zhu and J. Lafferty, "Harmonic mixtures: Combining mixture models and graph-based methods for inductive and scalable semi-supervised learning," in Proc. 22nd Int. Conf. Mach. Learn., 2005, pp. 1052-1059.
[44] W. Liu, J. He, and S.-F. Chang, "Large graph construction for scalable semi-supervised learning," in Proc. 27th Int. Conf. Mach. Learn., 2010, pp. 679-686.
[45] M. R. Hestenes, "Multiplier and gradient methods," J. Optim. Theory Appl., vol. 4, no. 5, pp. 303-320, 1969.
[46] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. New York, NY, USA: Academic Press, 2014.
[47] J. Huang, F. Nie, and H. Huang, "A new simplex sparse learning model to measure data similarity for clustering," in Proc. 24th Int. Joint Conf. Artif. Intell., 2015, pp. 3569-3575.

[48] L. Deng, "The MNIST database of handwritten digit images for machine learning research [best of the web]," IEEE Signal Process. Mag., vol. 29, no. 6, pp. 141-142, Nov. 2012.
[49] G. Cohen, S. Afshar, J. Tapson, and A. Van Schaik, "EMNIST: Extending MNIST to handwritten letters," in Proc. Int. Joint Conf. Neural Netw., 2017, pp. 2921-2926.
[50] J. Li and H. Liu, "Projective low-rank subspace clustering via learning deep encoder," in Proc. 26th Int. Joint Conf. Artif. Intell., 2017, pp. 2145-2151.
[51] K. Zhan, F. Nie, J. Wang, and Y. Yang, "Multiview consensus graph clustering," IEEE Trans. Image Process., vol. 28, no. 3, pp. 1261-1270, Mar. 2019.

Chaodie Liu is currently working toward the PhD degree in the School of Computer Science and School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, Shaanxi, China. Her research interests include machine learning and pattern recognition.

Feiping Nie received the PhD degree in computer science from Tsinghua University, China, in 2009, and is currently a full professor at Northwestern Polytechnical University, China. His research interests are machine learning and its applications, such as pattern recognition, data mining, computer vision, image processing, and information retrieval. He has published more than 100 papers in the following journals and conferences: IEEE Transactions on Pattern Analysis and Machine Intelligence, International Journal of Computer Vision, IEEE Transactions on Image Processing, IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Knowledge and Data Engineering, ICML, NIPS, KDD, IJCAI, AAAI, ICCV, CVPR, and ACM Multimedia. His papers have been cited more than 20000 times, and his H-index is 78. He is currently serving as an associate editor or PC member for several prestigious journals and conferences in the related fields.

Rong Wang received the BS degree in information engineering, the MS degree in signal and information processing, and the PhD degree in computer science from the Xi'an Research Institute of Hi-Tech, Xi'an, China, in 2004, 2007, and 2013, respectively. From 2007 to 2013, he also studied toward the PhD degree in the Department of Automation, Tsinghua University, Beijing, China. He is currently an associate professor at the School of Cybersecurity and School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, China. His research interests focus on machine learning and its applications.

Xuelong Li (Fellow, IEEE) is a full professor with the School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi'an, China.