A Hybrid Metaheuristic and Kernel Intuitionistic Fuzzy C-Means

Article history:
Received 25 October 2016
Received in revised form 2 January 2018
Accepted 21 February 2018
Available online 9 March 2018

Keywords:
Cluster analysis
Metaheuristics
Particle swarm optimization
Genetic algorithm
Artificial bee colony algorithm
Intuitionistic fuzzy set
Kernel function
Fuzzy c-means

Abstract

Cluster analysis is a very useful data mining approach. Although many clustering algorithms have been proposed, it is very difficult to find a clustering method which is suitable for all types of datasets. This study proposes an evolutionary-based clustering algorithm which combines a metaheuristic with a kernel intuitionistic fuzzy c-means (KIFCM) algorithm. The KIFCM algorithm improves the fuzzy c-means (FCM) algorithm by employing an intuitionistic fuzzy set and a kernel function. According to previous studies, the KIFCM algorithm is promising; however, it still has a weakness due to its high sensitivity to initial centroids. This study overcomes this problem by using a metaheuristic algorithm to improve the KIFCM result: the metaheuristic can provide better initial centroids for the KIFCM algorithm. This study applies three metaheuristics: particle swarm optimization (PSO), genetic algorithm (GA), and artificial bee colony (ABC). Though the hybrid approach is not new, this is the first paper to combine metaheuristics with KIFCM. The proposed PSO-KIFCM, GA-KIFCM, and ABC-KIFCM algorithms are evaluated using six benchmark datasets. The results are compared with other clustering algorithms, namely the K-means, FCM, kernel fuzzy c-means (KFCM), and KIFCM algorithms. The results show that the proposed algorithms achieve better accuracy. Furthermore, the proposed algorithms are applied to a case study on customer segmentation, taken from franchise stores selling women's clothing in Taiwan. For this case study, the proposed algorithms also exhibit better cluster construction than the other tested algorithms.

© 2018 Elsevier B.V. All rights reserved.
stable. The reason is that the KIFCM algorithm was developed from the fuzzy c-means algorithm, which is highly sensitive to the initial membership values, and these are commonly initialized randomly. Therefore, this drawback of FCM remains in the KIFCM algorithm. This study aims to overcome it by applying a metaheuristic, which can provide better initial centroids for the KIFCM algorithm and thereby improve the KIFCM result. Since this study employs three metaheuristics, PSO, GA, and ABC, there are three proposed algorithms: the PSO-KIFCM, GA-KIFCM, and ABC-KIFCM algorithms.

After evaluating the three algorithms' performances using benchmark datasets, a case study is presented. This study applies the proposed algorithms to a customer segmentation problem. In marketing, designing an accurate marketing strategy for all customers is very difficult, since every individual has unique customer behavior. Customer segmentation improves understanding of customer behavior. This study presents an application of customer segmentation for franchise stores selling women's clothing in Taiwan. This company has many stores all over Taiwan. In order to increase its profit, the company must identify potential customers before designing its marketing strategies. This study therefore applies the proposed algorithms to solve this problem.

The remainder of this study is organized as follows. Section 2 presents a survey of related literature. The proposed methods are presented in Section 3. Section 4 discusses the method validation, while Section 5 presents the case study in customer segmentation. Finally, concluding remarks are made in Section 6.

2. Literature review

This section presents the related background necessary for this study, including cluster analysis using the kernel intuitionistic fuzzy c-means algorithm, and metaheuristic algorithms.

The FCM algorithm proceeds as follows:

Step 1: Let T = 0. Set the cluster number c, the fuzziness exponent m, and the tolerance ε, and generate the initial membership matrix U^0 randomly.

Step 2: Calculate the cluster centers v_1^{T+1}, ..., v_c^{T+1} using Eq. (4):

$$v_j^{T+1} = \frac{\sum_{i=1}^{n} (u_{ij}^{T})^{m} x_i}{\sum_{i=1}^{n} (u_{ij}^{T})^{m}}, \quad \forall j = 1, 2, \ldots, c. \tag{4}$$

Step 3: Update the membership matrix U^{T+1} using Eq. (5):

$$u_{ij}^{T+1} = \frac{1}{\sum_{k=1}^{c} \left( d(x_i, v_j^{T+1})^2 \big/ d(x_i, v_k^{T+1})^2 \right)^{1/(m-1)}} \tag{5}$$

Step 4: Update the objective function J^{T+1} using Eq. (6):

$$J^{T+1} = \sum_{i=1}^{n} \sum_{j=1}^{c} (u_{ij}^{T+1})^{m} \, d(x_i, v_j^{T+1})^2 \tag{6}$$

Step 5: If |J^{T+1} − J^{T}| ≤ ε, then stop. Otherwise, increase T by one and return to Step 2.

In an intuitionistic fuzzy set, an object x ∈ Ã has a membership degree μ_A(x) and a non-membership degree ν_A(x), where 0 ≤ μ_A(x) + ν_A(x) ≤ 1. If μ_A(x) = 1 − ν_A(x), ∀x ∈ Ã, then à reduces to an ordinary fuzzy set. In an intuitionistic fuzzy set (IFS), there is a hesitation degree π_A(x), defined in Eq. (7):

$$\pi_A(x) = 1 - \mu_A(x) - \nu_A(x) \tag{7}$$

Therefore, in the IFCM algorithm, the membership degree is calculated by Eq. (8):

$$u_{ij}^{*} = u_{ij} + \pi_{ij}. \tag{8}$$
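To make the updates in Eqs. (4)–(6) and the intuitionistic correction of Eq. (8) concrete, the following minimal NumPy sketch runs the FCM loop and then adjusts the final memberships by the hesitation degree. The Yager-type complement used to obtain ν (and hence π) is an assumption in the spirit of Chaira [11], and applying the correction only once after convergence is a simplification for brevity; IFCM applies it inside every iteration.

```python
import numpy as np

def fcm_intuitionistic(X, c, m=2.0, eps=1e-5, max_iter=100, alpha=2.0, seed=None):
    """FCM loop (Eqs. 4-6) plus a final intuitionistic correction (Eq. 8).
    alpha parameterizes an assumed Yager-type fuzzy complement."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)             # Step 1: random memberships
    J_prev = np.inf
    for _ in range(max_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]  # Eq. (4): cluster centers
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)                   # guard against zero distance
        ratio = (d2[:, :, None] / d2[:, None, :]) ** (1.0 / (m - 1.0))
        U = 1.0 / ratio.sum(axis=2)               # Eq. (5): memberships
        J = (U ** m * d2).sum()                   # Eq. (6): objective
        if abs(J - J_prev) <= eps:                # Step 5: convergence test
            break
        J_prev = J
    # Eq. (8): u* = u + pi, with pi = 1 - u - v and an assumed Yager
    # complement v = (1 - u**alpha)**(1/alpha).
    v_non = (1.0 - U ** alpha) ** (1.0 / alpha)
    pi = 1.0 - U - v_non
    return V, U + pi
```

Seeding the memberships or centroids from a metaheuristic, as proposed in this study, replaces only the random initialization in Step 1; the rest of the loop is unchanged.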
Due to its complexity, solving an optimization problem using an exact method is very difficult. For this reason, metaheuristic methods have become popular means of solving such problems. Metaheuristic algorithms such as particle swarm optimization (PSO), genetic algorithm (GA), artificial bee colony (ABC), differential evolution (DE), ant colony optimization (ACO), and gradient evolution (GE), to name a few, have been employed to solve many optimization problems [16–22]. In order to improve the KIFCM algorithm, this study employs the PSO, GA, and ABC algorithms. These algorithms are chosen since they have shown good performance in solving many problems [23]. The GA algorithm is quite an old metaheuristic algorithm. It explores the search space using three operators: selection, crossover, and mutation. The PSO algorithm was inspired by the flocking or schooling behavior of birds or fish. In exploring the search space, particles move according to their individual best and social best positions. The ABC algorithm, on the other hand, mimics the behavior of a bee colony consisting of scout, worker, and onlooker bees, each with a different purpose. Scout bees are the pioneers, and search for potential food locations. The searching process is then continued by worker bees, who measure the amount of food in each source. Finally, onlooker bees explore the chosen food sources.

GA algorithm proposed by Maulik and Bandyopadhyay [25], and the ABC algorithm for clustering introduced by Krishnamoorthi and Natarajan [26]. Figs. 2–4 illustrate the PSO-KIFCM, GA-KIFCM, and ABC-KIFCM algorithms, respectively.

In this study, the distance between two data points is calculated using Euclidean distance, since it is the simplest and most widely used similarity measurement. Another reason is that the base of this study is the KIFCM algorithm, and in FCM it is more appropriate to use Euclidean distance. For other types of data, other similarity measurements could be applied. However, in order to minimize computational time, this study only applies Euclidean distance, since the KIFCM algorithm itself requires significant computation. A minimal sketch of the hybrid scheme follows.
the chosen food sources. 4. Experiment results
evidence, the investigator should identify it correctly. The dataset cluster. Thus, the final cluster label for each data point is defined as
comprises of six types of glasses. They are building windows the cluster which has the highest membership value.
float processed, building windows non-float processed, vehicle
windows float processed, vehicle windows non-float processed, 4.1. Analysis of parameter settings
containers, tableware, and headlamps. In this study, the inves-
tigator should be able to distinguish the glass type based on the Besides comparisons with other algorithms, this study also con-
composition of the glasses which are the refractive index, sodium, ducted experiments to evaluate the effect of parameter settings
magnesium, aluminum, silicon, potassium, calcium, barium, and on the results. This is because metaheuristic algorithms require
iron [31]. predefined parameters, which control the scope of the exploration
• Wisconsin-Breast Cancer (Diagnostics) dataset (WBC) consists and exploitation performed by the algorithm. In some cases, these
of the measurements for breast cancer cases. This dataset was parameters can significantly influence the results. Therefore, this
obtained from the University of Wisconsin Hospitals, Madison study conducted experiments to analyze the effect of parameter
from Dr. Wolberg. There are two types of breast cancer cases, settings on the results obtained by the PSO-KIFCM, GA-KIFCM and
benign and malignant. In this dataset, the type of cancer is iden- ABC-KIFCM algorithms. Table 2 lists the tested parameter settings
tified based on clump thickness, uniformity of cell size and shape, in this study. These values are taken from previous studies [33–36].
marginal adhesion, single epithelial cell size, bare nuclei, bland 30 independent runs are conducted for each combination of param-
chromatin, normal nucleoli, and mitoses [32]. eter settings.
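Scoring a fuzzy clustering against ground-truth labels involves two steps the text summarizes: defuzzify by taking each point's highest-membership cluster, then map the anonymous cluster indices onto classes. A minimal sketch follows; the Hungarian assignment used for the mapping is a common choice, not something the paper specifies, and `y_true` is assumed to be integer-coded.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(U, y_true):
    """Defuzzify memberships, then map clusters to classes so that the
    matched accuracy is maximized (Hungarian assignment -- an assumed choice)."""
    y_pred = U.argmax(axis=1)                  # highest membership wins
    k = int(max(y_pred.max(), y_true.max())) + 1
    cost = np.zeros((k, k), dtype=int)
    for p, t in zip(y_pred, y_true):
        cost[p, t] += 1                        # co-occurrence counts
    rows, cols = linear_sum_assignment(-cost)  # negate to maximize matches
    return cost[rows, cols].sum() / len(y_true)
```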
4.1. Analysis of parameter settings

Besides comparisons with other algorithms, this study also conducted experiments to evaluate the effect of parameter settings on the results. This is because metaheuristic algorithms require predefined parameters, which control the scope of the exploration and exploitation performed by the algorithm. In some cases, these parameters can significantly influence the results. Therefore, this study conducted experiments to analyze the effect of parameter settings on the results obtained by the PSO-KIFCM, GA-KIFCM and ABC-KIFCM algorithms. Table 2 lists the parameter settings tested in this study. These values are taken from previous studies [33–36]. Thirty independent runs are conducted for each combination of parameter settings.

The experiment results reveal that for the PSO-KIFCM algorithm, learning rate 2, which controls social exploration, has a significant influence on the results for most datasets. Learning rate 1, however, appears to have a significant effect on only some datasets. For the GA-KIFCM algorithm, the crossover rate has a greater effect on the results than the mutation rate does; according to these results, a higher crossover rate will yield a better result.
Fig. 3. GA-KIFCM.

Fig. 4. ABC-KIFCM.

Table 2. Parameter settings for all clustering algorithms.
Method   Factors   Level 1   Level 2   Level 3

Table 3. Best parameter settings.
Dataset   Iris   Wine   Tae   Flame   Glass   Wbc
On the other hand, for the ABC-KIFCM algorithm, both the limit of scouts and the search limit have significant effects on the results. These results imply that the metaheuristic-based KIFCM algorithm requires wider exploration in order to upgrade the local search conducted by the KIFCM algorithm.

Finally, the best parameter settings for each algorithm are summarized in Table 3. According to these results, this study recommends setting both learning rates 1 and 2 of PSO-KIFCM to 1.495. For GA-KIFCM, a high crossover rate and a low mutation rate are suggested, while for ABC-KIFCM, the limit of scouts should be around 20% of the number of food sources, and the search limit should be around 10% to 20% of the number of food sources. Population sizes of 80–100 are recommended; however, for datasets with high noise, larger populations should be tried. These recommendations are collected in the configuration sketch below.
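For reference, the recommendations above can be gathered into a single configuration sketch; the key names are illustrative, not the paper's notation.

```python
# Recommended settings distilled from Section 4.1 (key names are illustrative).
RECOMMENDED = {
    "PSO-KIFCM": {"learning_rate_1": 1.495, "learning_rate_2": 1.495},
    "GA-KIFCM": {"crossover_rate": "high", "mutation_rate": "low"},
    "ABC-KIFCM": {
        "scout_limit": "about 20% of the number of food sources",
        "search_limit": "10-20% of the number of food sources",
    },
    "population_size": "80-100 (try larger for noisy datasets)",
}
```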
Table 4. Accuracy obtained by each algorithm.

Table 5. Statistic testing results.
4.2. Comparison with other algorithms

The comparisons with other algorithms are conducted using the best parameter settings, as listed in Table 3. Each algorithm is executed 30 times. Table 4 summarizes the results.

According to the results in Tables 4 and 5, the proposed metaheuristic-based KIFCM algorithms perform significantly better than the other algorithms. Table 4 also shows a comparison with a previous paper which used similar datasets: Kuo et al. [37] compared 13 clustering algorithms, and Table 4 lists the minimum and maximum accuracy rates reported in that paper. The results obtained by the proposed algorithms are close to the best results obtained by those previous algorithms, and for the Tae and Flame datasets the proposed algorithms obtained better results. The global search procedure in these algorithms is able to improve on the KIFCM results. Where the KIFCM algorithm is only able to search within a small area, and is prone to becoming trapped in local optima, the metaheuristic algorithms can dramatically move the centroids to a different area in order to obtain a better solution. The standard deviations of the results obtained by the metaheuristic-based KIFCM algorithms are also smaller than that of the KIFCM algorithm. This shows that the clustering results obtained by the proposed algorithms are more stable. In other words, the cluster centroids obtained by the proposed algorithms have converged to the optimal locations, and the algorithms therefore achieve higher accuracies and more stable results. Of the PSO-KIFCM, GA-KIFCM and ABC-KIFCM algorithms, the GA-KIFCM algorithm achieves better results for four out of six datasets, although the differences are not significant. This is because the mutation in GA makes it better able to avoid becoming trapped in local optima than the PSO and ABC algorithms. These results demonstrate that the KIFCM algorithm needs an improvement to enhance its searching radius in order to find better centroids. This requirement can be successfully met by metaheuristics. In this study, GA with its mutation operator achieves better results than the PSO and ABC algorithms.

In terms of computational time, as shown in Table 6, the PSO-KIFCM algorithm works faster than GA-KIFCM and ABC-KIFCM, because the PSO algorithm is simpler than GA and ABC. The computational time of PSO-KIFCM is about three times that of the KIFCM algorithm. However, although they require more time, the time spent by the metaheuristic algorithms is still acceptable. For instance, the longest time for PSO-KIFCM is around 18 s, while GA-KIFCM spends around 8 min on the biggest dataset. ABC-KIFCM needs more time, since it performs more computation with three different kinds of bees: scout, employed, and onlooker bees. According to these results, GA-KIFCM shows better performance within a relatively short time.

5. Case study

Clustering algorithms can be used in many applications, including marketing. In this study, the proposed algorithms are employed to analyze the customer data collected by a franchise store in Taiwan. This company sells women's clothing, and has many stores all over Taiwan. In order to increase the company's profit, it must design an accurate marketing strategy. However, in order to do this, analysis of customer data is very important. To this end, customer segmentation can be applied to help the company better understand its customers.

The collected customer data includes purchasing data, in addition to the products bought and the transaction times. In this study, this raw data is preprocessed using the recency, frequency and monetary (RFM) method [38]. There are 1786 customers included in the dataset. Before applying the proposed algorithms, a preliminary study is conducted to determine the number of clusters. This step is conducted by dividing the dataset into several clusters and calculating the DB index for each candidate clustering. The DB index is a ratio of intra-cluster and inter-cluster distances [39]. Let a dataset be divided into k clusters. The DB value is calculated using Eq. (15):

$$DB(k) = \frac{1}{k} \sum_{i=1}^{k} D_{i,j}, \tag{15}$$

where D_{i,j} is defined in Eq. (16):

$$D_{i,j} = \max_{i \neq j} \frac{\bar{d}_i + \bar{d}_j}{d_{i,j}}. \tag{16}$$

Here \bar{d}_i is the average distance between each point in cluster i and the centroid of cluster i, while d_{i,j} is the distance between centroids i and j. A direct transcription of these two equations is sketched below.
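The sketch assumes `X`, integer `labels`, and the centroid matrix `V` as inputs; lower DB values indicate more compact, better-separated clusterings.

```python
import numpy as np

def db_index(X, labels, V):
    """Davies-Bouldin index (Eqs. 15-16): mean over clusters of the worst
    ratio of summed intra-cluster spreads to centroid separation."""
    k = V.shape[0]
    # d_bar[i]: average distance from points in cluster i to centroid i
    d_bar = np.array([
        np.linalg.norm(X[labels == i] - V[i], axis=1).mean() for i in range(k)
    ])
    # d[i, j]: distance between centroids i and j
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)
    D = np.zeros(k)
    for i in range(k):
        ratios = [(d_bar[i] + d_bar[j]) / d[i, j] for j in range(k) if j != i]
        D[i] = max(ratios)                     # Eq. (16)
    return D.mean()                            # Eq. (15)
```

In the preliminary study described above, this function would be evaluated for each candidate k, and the k with the lowest DB value retained.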
Table 6. Computational time.
(Columns: Iris, Wine, Tae, Flame, Glass, Wbc.)

Fig. 6. Clustering result obtained by the FCM algorithm.

Table 7. Case study result.

Table 8. Statistic test for case study (p-value).
required. In addition, new metaheuristics, like the imperialist competitive algorithm [33], can be applied to clustering.

References

[1] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic Publishers, 1981.
[2] M.-S. Yang, A survey of fuzzy clustering, Math. Comput. Modell. 18 (1993) 1–16.
[3] X. Liu, L. Wang, Computing the maximum similarity bi-clusters of gene expression data, Bioinformatics 23 (2007) 50–56.
[4] T. Kohonen, Self-Organizing Maps, Springer Science & Business Media, Berlin, Germany, 2001.
[5] K. Honda, H. Ichihashi, Linear fuzzy clustering techniques with missing values and their application to local principal component analysis, IEEE Trans. Fuzzy Syst. 12 (2004) 183–193.
[6] L. An, X. Gao, X. Li, D. Tao, C. Deng, J. Li, Robust reversible watermarking via clustering and enhanced pixel-wise masking, IEEE Trans. Image Process. 21 (2012) 3598–3611.
[7] L. An, X. Gao, Y. Yuan, D. Tao, Robust lossless data hiding using clustering and statistical quantity histogram, Neurocomputing 77 (2012) 1–11.
[8] J. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, California, USA, 1967, pp. 281–297.
[9] P.-N. Tan, M. Steinbach, V. Kumar, Introduction to Data Mining, Pearson Education, Inc., USA, 2006.
[10] D.-Q. Zhang, S.-C. Chen, Clustering incomplete data using kernel-based fuzzy c-means algorithm, Neural Process. Lett. 18 (2003) 155–162.
[11] T. Chaira, A novel intuitionistic fuzzy C means clustering algorithm and its application to medical images, Appl. Soft Comput. 11 (2011) 1711–1717.
[12] K.T. Atanassov, Intuitionistic fuzzy sets, Fuzzy Sets Syst. 20 (1986) 87–96.
[13] K.-P. Lin, A novel evolutionary kernel intuitionistic fuzzy c-means clustering algorithm, IEEE Trans. Fuzzy Syst. 22 (2014) 1074–1087.
[14] W. Pedrycz, P. Rai, Collaborative clustering with the use of Fuzzy C-Means and its quantification, Fuzzy Sets Syst. 159 (2008) 2399–2427.
[15] J. Fan, M. Han, J. Wang, Single point iterative weighted fuzzy C-means clustering algorithm for remote sensing image segmentation, Pattern Recogn. 42 (2009) 2527–2540.
[16] J. Kennedy, R. Eberhart, Particle swarm optimization, in: Proceedings of the IEEE International Conference on Neural Networks, IEEE, 1995, pp. 1942–1948.
[17] C.A. Murthy, N. Chowdhury, In search of optimal clusters using genetic algorithms, Pattern Recognit. Lett. 17 (1996) 825–832.
[18] R. Storn, K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Global Optim. 11 (1997) 341–359.
[19] D. Karaboga, B. Basturk, A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm, J. Global Optim. 39 (2007) 459–471.
[20] M. Dorigo, C. Blum, Ant colony optimization theory: a survey, Theoret. Comput. Sci. 344 (2005) 243–278.
[21] R.J. Kuo, F.E. Zulvia, The gradient evolution algorithm: a new metaheuristic, Inform. Sci. 316 (2015) 246–265.
[22] S.J. Nanda, G. Panda, A survey on nature inspired metaheuristic algorithms for partitional clustering, Swarm Evol. Comput. 16 (2014) 1–18.
[23] X.S. Yang, Engineering Optimization: An Introduction with Metaheuristic Applications, John Wiley & Sons, Inc., New Jersey, 2010.
[24] C. Dong, G. Wang, Z. Chen, Z. Yu, A method of self-adaptive inertia weight for PSO, in: 2008 International Conference on Computer Science and Software Engineering, IEEE, 2008, pp. 1195–1198.
[25] U. Maulik, S. Bandyopadhyay, Genetic algorithm-based clustering technique, Pattern Recogn. 33 (2000) 1455–1465.
[26] M. Krishnamoorthi, A. Natarajan, A comparative analysis of enhanced Artificial Bee Colony algorithms for data clustering, in: 2013 International Conference on Computer Communication and Informatics (ICCCI), IEEE, 2013, pp. 1–6.
[27] L. Fu, E. Medico, FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data, BMC Bioinf. 8 (2007) 3.
[28] R.A. Fisher, The use of multiple measurements in taxonomic problems, Ann. Eug. 7 (1936) 179–188.
[29] M. Forina, PARVUS – An Extendible Package for Data Exploration, Classification and Correlation.
[30] W.-Y. Loh, Y.-S. Shih, Split selection methods for classification trees, Stat. Sin. (1997) 815–840.
[31] I.W. Evett, J.S. Ernest, Rule Induction in Forensic Science, Central Research Establishment, Home Office Forensic Science Service, Aldermaston, Reading, Berkshire RG7 4PN, 1987.
[32] W.H. Wolberg, O.L. Mangasarian, Multisurface method of pattern separation for medical diagnosis applied to breast cytology, Proc. Natl. Acad. Sci. 87 (1990) 9193–9196.
[33] R.C. Eberhart, Y. Shi, Particle swarm optimization: developments, applications and resources, in: Proceedings of the 2001 Congress on Evolutionary Computation, Seoul, Korea, vol. 1, 2001, pp. 81–86.
[34] R. Kuo, C.-F. Wang, Z.-Y. Chen, Integration of growing self-organizing map and continuous genetic algorithm for grading lithium-ion battery cells, Appl. Soft Comput. 12 (2012) 2012–2022.
[35] E. Michielssen, S. Ranjithan, R. Mittra, Optimal multilayer filter design using real coded genetic algorithms, IEE Proc. J (Optoelectronics) 139 (1992) 413–420.
[36] B. Akay, D. Karaboga, A modified artificial bee colony algorithm for real-parameter optimization, Inform. Sci. 192 (2012) 120–142.
[37] R.J. Kuo, C.H. Mei, F.E. Zulvia, C.Y. Tsai, An application of a metaheuristic algorithm-based clustering ensemble method to APP customer segmentation, Neurocomputing 205 (2016) 116–129.
[38] J.R. Bult, T. Wansbeek, Optimal selection for direct mail, Market. Sci. 14 (1995) 378–394.
[39] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1 (1979) 224–227.