
ISSN 2347 - 3983
Volume 9, No. 8, August 2021

Neha Mehra, International Journal of Emerging Trends in Engineering Research, 9(8), August 2021, 1073 – 1078
Available Online at http://www.warse.org/IJETER/static/pdf/file/ijeter07982021.pdf
https://doi.org/10.30534/ijeter/2021/07982021

A Novel Kernelized Fuzzy Clustering Algorithm for Data Classification

Neha Mehra
Department of Computer Engineering
Shri Govindram Seksaria Institute of Technology and Science, Indore (M.P.), India
[email protected]

ABSTRACT

Data are expanding day by day, and clustering plays a main role in handling the data and discovering knowledge from it. Most clustering approaches deal only with linearly separable problems. To deal with nonlinearly separable problems, we introduce the concept of a kernel function into fuzzy clustering. In the kernelized fuzzy clustering approach, the kernel function defines a non-linear transformation that projects the data out of the original space into a space where the data can be more separable. The proposed approach uses kernel methods to project data from the original space to a high dimensional feature space where the data can be separated linearly. Tests on real world datasets show that our proposed kernel based clustering method gives better accuracy than the fuzzy clustering method.

Key words: Fuzzy clustering, Fuzzy C-Means, Kernel methods

1. INTRODUCTION

Clustering is the most widely used unsupervised technique in data mining [1]. It assigns similar data points to clusters based on some similarity measure. The main aim is to assign data points such that there is high inter-cluster distance and low intra-cluster distance. Clustering is broadly divided into two parts, i.e., hierarchical and partitioning clustering. Hierarchical clustering finds the clusters by partitioning the data in either a top-down or bottom-up fashion in a recursive manner, whereas partitioning clustering creates partitions of the data using some optimizing criterion [2]. Partitioning clustering is further divided into crisp and fuzzy clustering. In crisp clustering, each data point in the sample space is assigned to exactly one cluster [3]. To overcome this limitation, the concept of fuzzy clustering was introduced.

In fuzzy clustering, a data sample can belong to more than one cluster with different degrees of membership, and the membership is spread among all clusters. Fuzzy C-Means has become the most widely used algorithm in fuzzy clustering, wherein each data point can have membership in more than one cluster [4]. Fuzzy C-Means minimizes an objective function subject to some constraints. In most cases, Fuzzy C-Means deals only with linear relations among data samples. For handling non-linear data samples, the concept of a kernel function is introduced [5].

2. LITERATURE REVIEW

Many researchers have worked on the non-linear mapping of data samples. The efficiency of the Fuzzy C-Means algorithm is limited to spherical clusters. Huang [6] addressed this problem by mapping nonlinear data into an appropriate feature space using kernel tricks, extending the Fuzzy C-Means method with multiple kernel learning so that the data can be linearly mapped. Girolami [7] explored the notion of data clustering in a kernel defined feature space; he also discussed the choice of the kernel defining the nonlinear mapping and how to choose the kernel parameter. Tzortzis [8] introduced the global Kernel K-Means algorithm, which minimizes the clustering error, computed as the sum of the squared Euclidean distances between each data point and its cluster center. He discussed a Kernel K-Means method that identifies nonlinearly separable clusters by minimizing the clustering error in the feature space.

Image segmentation also plays an important role in identifying patterns and in image analysis. Reddy [9] discussed a kernel based FCM method used for the segmentation of low contrast images and medical images. Baili [10] proposed Fuzzy C-Means with multiple kernels, which allows a soft linear partitioning of the feature space; the method partitions the data into appropriate clusters and assigns a weight to each kernel in each cluster. Liu [11] developed a kernel based fuzzy attribute C-Means clustering algorithm that computes distances in the fuzzy attribute C-Means clustering algorithm using an induced kernel distance.


Liu [12] introduced an attribute multi-kernel weighted Fuzzy C-Means method for projecting data into a feature space and extracting features so that efficiency can be improved. Nia [13] discussed the limitation of segmentation methods that get stuck in local minima; their behavior becomes random because of outliers in the data. He proposed a meta-heuristic algorithm combining differential evolution with an RBF kernel based algorithm to segment images with noisy data.

The proposed kernel based Fuzzy C-Means algorithm deals with the limitation of not separating data samples linearly by mapping the data into a high dimensional feature space, where the patterns among data samples can be linear. Section 3 briefly discusses the preliminaries needed to develop our algorithm. Our proposed method is described in detail in section 4. The experimental results are discussed and compared with other clustering methods in section 5, followed by concluding remarks in section 6.

3. PRELIMINARIES

The most widely used fuzzy clustering algorithm is Fuzzy C-Means, where each data point belongs to a cluster with some degree of membership. Fuzzy C-Means was originally developed by James Bezdek in 1981 [14]. The technique takes a dataset X = {x1, x2, …, xN} as input and produces the membership matrix U, which denotes the degree of belongingness of each data sample to the clusters.

The objective function J to be minimized is:

J(U, V) = Σ_{j=1..c} Σ_{i=1..N} u_ij^m ‖x_i − v_j‖²   (1)

where each data sample x_i is an object, N is the total number of training samples, c is the number of clusters, each x_i has a set of membership values u_ij associated with it, and the set of cluster centers is V = {v1, v2, …, vc}. The fuzzy matrix U = (u_ij)_{c×N} holds the fuzzy memberships of each training sample x_i to each cluster v_j. The membership degrees of each sample over all fuzzy partitions must sum to one. The value of the fuzzification parameter m affects the result of clustering algorithms and is also used to reduce sensitivity to noise. Higher values lead to softer clusters, where all elements tend to belong to all clusters, and lower values lead to harder clusters. Bezdek showed that the fuzzification parameter should be set between 1.5 and 2.5 for better clustering results.

The cluster membership u_ij of data point x_i in cluster v_j can be calculated as

u_ij = 1 / Σ_{k=1..c} ( ‖x_i − v_j‖ / ‖x_i − v_k‖ )^{2/(m−1)}, ∀ i, j   (2)

The cluster centers can be computed as

v′_j = Σ_{i=1..N} u_ij^m x_i / Σ_{i=1..N} u_ij^m, ∀ j   (3)

To minimize the objective function given in equation (1), the basic FCM procedure is explained in Algorithm 1.

Algorithm 1: Fuzzy C-Means

Input: X, V, c, m
Output: U, V′
Step 1: Initialize cluster centers at random, V = {v1, v2, …, vc}
Step 2: Compute cluster memberships using equation (2)
Step 3: Compute cluster centers using equation (3)
Step 4: If ‖V′ − V‖ < ϵ then stop, otherwise continue with step 2.

Equations (2) and (3) are updated until the predefined convergence threshold is met, and then the algorithm terminates.

Fuzzy C-Means can handle only linear relations; to overcome this limitation, the concept of the kernel method is introduced [15]. The kernel method maps a data space that is highly non-linear to a high dimensional feature space, using a mathematical mapping function (ϕ : X → F).

The kernel space can have any number of dimensions. The main aim of going to higher dimensions is that, although the data samples are highly non-linear and not linearly separable in the original space, it becomes possible to apply a linear classifier in the feature space, so that the data samples can be linearly separated.

A central concept behind kernel based algorithms is the kernel trick. The kernel trick can be applied to any algorithm that depends only on the dot product between two vectors; its aim is to convert a nonlinear problem in the low dimensional input space into a linear problem in the feature space [16].

A kernel is a function K that for all x, k from the original input space satisfies

K(x, k) = ⟨ϕ(x), ϕ(k)⟩   (4)
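As a concrete illustration, Algorithm 1 with updates (2) and (3) can be sketched in NumPy as follows. This is a minimal sketch, not the paper's implementation: the optional V0 argument, the random seed, and the small epsilon guard against division by zero are our own additions.

```python
import numpy as np

def fcm(X, c, m=1.75, eps=1e-3, max_iter=100, V0=None, seed=0):
    """Fuzzy C-Means: Algorithm 1 with membership (2) and center (3) updates."""
    rng = np.random.default_rng(seed)
    # Step 1: initial centers, either supplied or c random data points
    if V0 is not None:
        V = V0.astype(float).copy()
    else:
        V = X[rng.choice(len(X), size=c, replace=False)].copy()
    for _ in range(max_iter):
        # Step 2: membership update, equation (2)
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)  # N x c
        d = np.fmax(d, 1e-12)            # guard against division by zero
        ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
        U = 1.0 / ratio.sum(axis=2)      # each row sums to one
        # Step 3: center update, equation (3)
        Um = U ** m
        V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Step 4: stop once centers move less than the threshold
        if np.linalg.norm(V_new - V) < eps:
            return U, V_new
        V = V_new
    return U, V
```

On well-separated data the hard assignment argmax over each membership row recovers the expected grouping.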


where ⟨ϕ(x), ϕ(k)⟩ denotes the inner product and ϕ is a nonlinear mapping function from the input space x to a rather high dimensional feature space F:

ϕ : x → ϕ(x) ∈ F   (5)

Given the beneficial properties of the kernel based method, namely greater robustness to outliers and better performance on non-linear data, we expand equation (1) using a kernel in our proposed method.

4. KERNEL BASED FUZZY CLUSTERING

Consider some non-linear mapping function ϕ : x → ϕ(x) ∈ F, where the feature vectors have d dimensions. The kernel function can be of many types: polynomial, radial basis function (RBF), etc.

The RBF kernel K(x_i, v_j) = exp(−‖x_i − v_j‖² / σ²) is the most popular kernel function, where σ is the kernel parameter. The proposed method uses the RBF kernel.

The kernel based FCM (KFCM) objective, using the mapping ϕ, can be generally defined as the constrained minimization of:

J(U, V) = Σ_{j=1..c} Σ_{i=1..N} u_ij^m ‖ϕ(x_i) − ϕ(v_j)‖²   (6)

where ‖ϕ(x_i) − ϕ(v_j)‖ is the Euclidean distance between ϕ(x_i) and ϕ(v_j), the images of x_i and v_j in the kernel space, N is the total number of training samples, c is the number of clusters, and u_ij is the fuzzy membership of training sample x_i in cluster v_j.

‖ϕ(x_i) − ϕ(v_j)‖²
= (ϕ(x_i) − ϕ(v_j))ᵀ (ϕ(x_i) − ϕ(v_j))
= ϕ(x_i)ᵀ ϕ(x_i) − ϕ(v_j)ᵀ ϕ(x_i) − ϕ(x_i)ᵀ ϕ(v_j) + ϕ(v_j)ᵀ ϕ(v_j)
= K(x_i, x_i) + K(v_j, v_j) − 2 K(x_i, v_j)   (7)

In the case of the RBF kernel, K(x_i, x_i) = 1 and K(v_j, v_j) = 1, so substituting (7) into (6) gives

J(U, V) = 2 Σ_{j=1..c} Σ_{i=1..N} u_ij^m (1 − K(x_i, v_j))   (8)

To minimize this objective function, the membership matrix and the cluster centers need to be computed.

The membership matrix can be calculated as

u_ij = (1 − K(x_i, v_j))^{−1/(m−1)} / Σ_{k=1..c} (1 − K(x_i, v_k))^{−1/(m−1)}   (9)

The cluster centers can be computed as

v_j = Σ_{i=1..N} u_ij^m K(x_i, v_j) x_i / Σ_{i=1..N} u_ij^m K(x_i, v_j)   (10)

The choice of the kernel parameter is the most critical task. In our proposed work, the kernel parameter is selected as:

σ = ( Σ_{i=1..N} (d_i − d̄)² / N )^{1/2}   (11)

where d_i = ‖x_i − x̄‖ and d̄ is the average of all distances d_i.

To minimize the objective function given in equation (8), the proposed kernel based fuzzy clustering algorithm is explained in Algorithm 2.

Algorithm 2: Kernel based Fuzzy Clustering

Input: X, V, c, m, σ
Output: U, V′
Step 1: Randomly initialize cluster centers V = {v1, v2, …, vc}
Step 2: Compute cluster memberships using equation (9)
Step 3: Compute cluster centers using equation (10)
Step 4: If ‖V′ − V‖ < ϵ then stop, otherwise continue with step 2.

Equations (9) and (10) are updated until the predefined convergence threshold is met, and then the algorithm terminates.

5. EXPERIMENTS AND RESULTS

In this section, experimental studies of the proposed approach are conducted on five real world datasets. First, c objects chosen at random from each dataset serve as the initial cluster centers and initialize the membership matrix U. We set the fuzzification parameter m to 1.75 and the convergence threshold to 10⁻³. In the experiments, we compare the performance of FCM and KFCM to show the


clustering results and the robustness of KFCM over FCM on various datasets.

5.1 Datasets

All the datasets are taken from the UCI Machine Learning Repository. Datasets with different numbers of instances and attributes were chosen for comparison.
1) Seed: The dataset consists of 210 instances, each with 7 attributes. All data points are divided into 3 classes.
2) Wine: The dataset consists of 178 instances, each with 13 attributes. All data points are divided into 3 classes.
3) Breast Cancer Wisconsin: The dataset consists of 699 instances, each with 10 attributes. All data points are divided into 2 classes. The dataset contains 16 missing values; for our experimentation, we ignored the instances with missing values and considered 683 instances.
4) Balance Scale: The dataset consists of 625 instances, each with 4 attributes. All data points are divided into 3 classes.
5) Ecoli: The dataset consists of 336 instances, each with 7 attributes. All data points are divided into 8 classes.

Table 1 shows each dataset's name, number of instances, number of features, and the number of classes into which its data points are divided.

Table 1. Description of Datasets

Datasets                  No. of Instances   No. of Features   No. of Classes
Seed                      210                7                 3
Wine                      178                13                3
Breast Cancer Wisconsin   683                10                2
Balance Scale             625                4                 3
Ecoli                     336                7                 8

5.2 Performance Parameters

We evaluate the performance of the clustering methods using the four parameters described below.

5.2.1 F-Measure

The F-measure was originally proposed by Rijsbergen [17] in the context of information retrieval. It is often used as a standard balance between precision and recall for measuring the goodness or accuracy of clustering methods. An F-measure value near one denotes better clustering results.

Recall(p, c) = N_pc / N_p   (12)

Precision(p, c) = N_pc / N_c   (13)

F-Measure = 2 × Recall × Precision / (Recall + Precision)   (14)

where N_p is the number of elements of class p, N_c is the number of elements of cluster c, and N_pc is the number of elements of class p in cluster c.

5.2.2 Normalized Mutual Information (NMI)

NMI is a good measure for determining the quality of clustering [18].

NMI = Σ_c Σ_p n_cp log( (n · n_cp) / (n_c · n_p) ) / sqrt( (Σ_c n_c log(n_c / n)) · (Σ_p n_p log(n_p / n)) )   (15)

where n is the total number of data points, n_c and n_p are the numbers of data points in the c-th cluster and the p-th class, respectively, and n_cp is the number of common data points in class p and cluster c.

5.2.3 Adjusted Rand Index (ARI)

The Rand index measures the similarity between two data clusterings. An extended form of the Rand index is the adjusted Rand index, which has a maximum value of 1 and takes the value 0 for random clusterings [19].

ARI = [ Σ_ij C(n_ij, 2) − (Σ_i C(n_i·, 2) · Σ_j C(n_·j, 2)) / C(n, 2) ] / [ ½ (Σ_i C(n_i·, 2) + Σ_j C(n_·j, 2)) − (Σ_i C(n_i·, 2) · Σ_j C(n_·j, 2)) / C(n, 2) ]   (16)

where C(·, 2) denotes the binomial coefficient, n_ij is the number of points shared by cluster i and class j, and n_i· and n_·j are the row and column sums of the contingency table.

5.2.4 Objective Function

The main aim of the clustering approach is to minimize the objective function. The proposed kernel based fuzzy clustering approach is therefore also evaluated on the basis of the objective function.

Table 2. F-Measure for FCM and KFCM on various Datasets

Datasets                  FCM       KFCM
Seed                      0.43517   0.60976
Wine                      0.34721   0.64912
Breast Cancer Wisconsin   0.47674   0.78793
Balance Scale             0.21968   0.63089
Ecoli                     0.37596   0.68654

Table 3. NMI for FCM and KFCM on various Datasets

Datasets                  FCM       KFCM
Seed                      0.69483   0.70976
Wine                      0.46952   0.87422
Breast Cancer Wisconsin   0.73001   0.89603
Balance Scale             0.24539   0.79285
Ecoli                     0.55712   0.8274

Table 4. ARI for FCM and KFCM on various Datasets

Datasets                  FCM       KFCM
Seed                      0.3381    0.6234
Wine                      0.48052   0.92468
Breast Cancer Wisconsin   0.95608   0.97281
Balance Scale             0.3024    0.6752
Ecoli                     0.28571   0.71905

Table 5. Objective Function for FCM and KFCM on various Datasets

Datasets                  FCM       KFCM
Seed                      1182.73   838.9414
Wine                      2788.39   777.5576
Breast Cancer Wisconsin   30147.6   832.4609
Balance Scale             1250.44   891.3369
Ecoli                     959.84    853.4041

The above results show that on all datasets the F-Measure increases under our proposed algorithm, with the maximum improvement on the Balance Scale dataset; an improvement of roughly 10% to 20% is observed in F-Measure. The NMI measure also shows the improvement of our proposed algorithm, again with the maximum improvement on the Balance Scale dataset. The Ecoli dataset shows the maximum improvement in the ARI measure, where an improvement of roughly 10% to 20% is likewise observed. The main aim of the fuzzy clustering approach is to minimize the objective function, and the proposed algorithm achieves the greatest reduction of the objective function on the Wine and Breast Cancer Wisconsin datasets. This discussion shows that the proposed kernel based clustering algorithm compares favorably with the Fuzzy C-Means algorithm.

6. CONCLUSION

Clustering is one of the most widely used techniques in research, and the main contribution of this work is the incorporation of kernel methods. In this paper, a kernel based fuzzy clustering algorithm is proposed to handle non-linear data, addressing a limitation of the Fuzzy C-Means algorithm. It uses the RBF kernel as the kernel function for fuzzy clustering. The effectiveness of the proposed method is compared with Fuzzy C-Means, which shows that the proposed approach produces better clustering results; the proposed algorithm shows a 10% to 20% improvement in all the measures we have discussed. The work can be extended in future to process big data.

REFERENCES

[1] Garima, Hina Gulati, P. K. Singh, "Clustering Techniques in Data Mining: A Comparison", 2nd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, pp. 410-415, 2015.
[2] Malika Bendechache, Nhien-An Le-Khac and M-Tahar Kechadi, "Efficient Large Scale Clustering based on Data Partitioning", IEEE International Conference on Data Science and Advanced Analytics, Montreal, QC, Canada, pp. 612-621, 2016.
[3] Raju G, Binu Thomas, Sonam Tobgay and Th. Shanta Kumar, "Fuzzy Clustering Methods in Data Mining: A Comparative Case Analysis", International Conference on Advanced Computer Theory and Engineering, Phuket, Thailand, pp. 489-493, 2008.
[4] Dae-Won Kim, Kwang H. Lee, Doheon Lee, "On cluster validity index for estimation of the optimal number of fuzzy clusters", Pattern Recognition Society, published by Elsevier, pp. 2009-2025, April 2004.
[5] Xiao-Hong Wu, Jian-Jiang Zhou, "Kernel-based Fuzzy K-nearest-neighbor Algorithm", International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies


and Internet Commerce (CIMCA-IAWTIC'06), Vienna, Austria, Vol. 2, pp. 159-162, November 2005.
[6] Hsin-Chien Huang, Yung-Yu Chuang, and Chu-Song Chen, "Multiple Kernel Fuzzy Clustering", IEEE Transactions on Fuzzy Systems, Vol. 20, No. 1, pp. 120-134, February 2012.
[7] Mark Girolami, "Mercer Kernel-Based Clustering in Feature Space", IEEE Transactions on Neural Networks, Vol. 12, No. 3, pp. 780-784, May 2002.
[8] Grigorios F. Tzortzis and Aristidis C. Likas, "The Global Kernel K-Means Algorithm for Clustering in Feature Space", IEEE Transactions on Neural Networks, Vol. 20, No. 7, pp. 1181-1194, July 2009.
[9] G. R. Reddy, K. Ramudu, S. Zaheeruddin, and R. R. Rao, "Image segmentation using kernel fuzzy c-means clustering on level set method on noisy images", in Proceedings of the International Conference on Communications and Signal Processing (ICCSP '11), Calicut, India, pp. 522-526, February 2011.
[10] Naouel Baili and Hichem Frigui, "Fuzzy Clustering with Multiple Kernels in Feature Space", WCCI IEEE World Congress on Computational Intelligence, Brisbane, Australia, pp. 1-8, 10-15 June 2012.
[11] J. Liu and M. Xu, "Kernelized fuzzy attribute C-means clustering algorithm", Fuzzy Sets and Systems, Vol. 159, pp. 2428-2445, 2008.
[12] Zhulin Liu, C. L. Philip Chen, Long Chen, and Jin Zhou, "Multi-Attribute Based Fuzzy C-means in Approximated Feature Space", International Conference on Fuzzy Theory and Its Applications, Pingtung, Taiwan, pp. 1-6, 2017.
[13] Alireza Farrokhi Nia and Mousa Shamsi, "Brain MR Image Segmentation Using Differential Evolution Guided by RBF-Kernel based FCM", 10th Iranian Conference on Machine Vision and Image Processing (MVIP), Isfahan, Iran, pp. 221-227, 2017.
[14] James C. Bezdek, "Pattern Recognition with Fuzzy Objective Function Algorithms", Plenum Press, New York, 1981.
[15] Tomer Hertz, Aharon Bar-Hillel and Daphna Weinshall, "Learning a Kernel Function for Classification with Small Training Samples", in Proceedings of the Twenty-Third International Conference on Machine Learning (ICML 2006), Pittsburgh, Pennsylvania, USA, pp. 401-408, 25-29 June 2006.
[16] Thomas M. Cover, "Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition", IEEE Transactions on Electronic Computers, Vol. EC-14, No. 3, pp. 326-334, June 1965.
[17] C. J. Van Rijsbergen, Information Retrieval, Butterworth, 1979.
[18] Neha Bharill, Aruna Tiwari and Aayushi Malviya, "Fuzzy Based Scalable Clustering Algorithms for Handling Big Data Using Apache Spark", IEEE Transactions on Big Data, Vol. 2, No. 4, pp. 339-352, December 2016.
[19] Liang Bai, Jiye Liang and Yike Guo, "An ensemble clusterer of multiple fuzzy k-means clusterings to recognize arbitrarily shaped clusters", IEEE Transactions on Fuzzy Systems, Early Access, pp. 1-1, May 2018.
[20] Himanshi Agrawal, Neha Mehra, Vandan Tewari, "Implementation of Protein Sequence Classification for Globin Family using Ensemble Learning", International Journal of Emerging Trends in Engineering Research, Volume 9, No. 4, April 2021.