Computers & Industrial Engineering
ELSEVIER
Sequential clustering and classification using deep learning technique and
multi-objective sine-cosine algorithm
R.J. Kuo, Muhammad Rakhmat Setiawan, Thi Phuong Quyen Nguyen
Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
Faculty of Project Management, The University of Danang, University of Science and Technology, Danang, Vietnam
ARTICLE INFO ABSTRACT
This study introduced a novel data analytics-based sequential clustering and classification (SCC) approach. The
proposed approach, named deep MOSCA-SCC, integrates a multi-objective sine-cosine algorithm (MOSCA), a deep
clustering technique, and classification algorithms to exploit the data structure before implementing the prediction
model. Herein, the autoencoder combined with the K-means algorithm was used for the deep clustering to
reveal the data pattern. Regarding classification, support vector machine, back-propagation neural network, and
decision tree induction were implemented to explore the factors correlated with the revealed pattern. To
evaluate the performance of the proposed method, a comparison was conducted between the proposed deep
MOSCA-SCC and other benchmark algorithms, including the NSGAII-SCC and the regular MOSCA-SCC. The deep
1. Introduction
According to McKinsey (Analytics, 2016), data scientists are in high
demand in many organizations. These organizations need to increase
the number of data scientists to help them analyze their data and get
actionable insight from it. Big data and business analytics
solutions are expected to generate 274.3 billion USD in revenue for the
global market in 2022. However, McKinsey & Company conducted a
survey and found a drawback that challenges the IDC hypothesis. They
discovered that almost half of executives across regions and industries
globally reported more difficulty finding analytical expertise than any
other type of position. As data grow more complex, a lot of data has
many attributes without labels. There is still a lack of novel research to
help data scientists with preliminary data analysis. Thus, analyzing,
integrating, and transforming data at the initial stage is essential for an
organization.
Clustering and classification are two key approaches for data analytics
that have been explored in various fields for many years (T2"
et al., 2016). The objective of data clustering is to divide the data instances
in a given dataset into several distinct groups based on some similarity
calculations. In contrast, the objective of data classification is to
categorize a data object without a label into a predefined group of data
with similar attributes. In the classification procedure, a labeled dataset
is used to train the model to learn the relevant features of the
labeled objects. Thus, data classification is considered a supervised
learning approach, while data clustering is unsupervised learning. Both
clustering and classification have been used in a variety of real-world
areas such as pattern recognition, biology, text mining, and image
processing.
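The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy data, the function names (`kmeans`, `fit_nearest_centroid`, `predict`), the nearest-centroid classifier, and the deterministic centroid initialization are all assumptions chosen for brevity. The clustering routine sees only unlabeled points, while the classifier is trained on points with known labels.

```python
import math


def kmeans(points, k, iters=20):
    """Unsupervised: group unlabeled points around k centroids (Lloyd's algorithm)."""
    # Assumption for this sketch: start from k evenly spaced points of the input list.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
    return centroids, labels


def fit_nearest_centroid(points, labels):
    """Supervised: learn one centroid per known class label."""
    groups = {}
    for p, y in zip(points, labels):
        groups.setdefault(y, []).append(p)
    return {y: tuple(sum(c) / len(m) for c in zip(*m)) for y, m in groups.items()}


def predict(model, point):
    """Assign an unlabeled point to the class with the closest learned centroid."""
    return min(model, key=lambda y: math.dist(point, model[y]))


# Hypothetical toy data: two well-separated groups in two dimensions.
data = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]

# Clustering: no labels are given; the groups are discovered from similarity alone.
centroids, cluster_ids = kmeans(data, k=2)

# Classification: labels are given for training, then a new object is categorized.
known_labels = ["low", "low", "low", "high", "high", "high"]
model = fit_nearest_centroid(data, known_labels)
print(cluster_ids, predict(model, (1.1, 0.9)))
```

Here the cluster identifiers are arbitrary integers with no meaning beyond group membership, whereas the classifier returns one of the predefined labels it was trained on.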
Combining clustering and classification has been a new research
direction in data mining for several years. Yang and Quyen proposed an
innovative framework of sequential clustering and classification (SCC)
that can explore the hidden structure of data with a clustering technique and
then find the features correlated with the revealed patterns with a
classification technique (Yang & Quyen, 2018). The elitist non-dominated
sorting genetic algorithm (NSGAII) was employed to seek out an
optimum solution for the SCC framework. In particular, the SCC framework
used two different datasets that are correlated. The first dataset
contains "target interest" features used for clustering, while the second
dataset, used for classification, comprises factors correlated with the "target
interest". The NSGAII-SCC demonstrated outstanding performance in terms
of solution quality for both clustering and classification. The SCC
Received 17 April 2022; Received in revised form 2 September 2022; Accepted 21 September 2022
Available online 29 September 2022
0360-8352/© 2022 Elsevier Ltd. All rights reserved.