6 Paper3

Data clustering method

Uploaded by

Sunny Nguyen
Contents lists available at ScienceDirect

Computers & Industrial Engineering

Sequential clustering and classification using deep learning technique and multi-objective sine-cosine algorithm

R.J. Kuo (a,*), Muhammad Rakhmat Setiawan (a), Thi Phuong Quyen Nguyen (b)

(a) Department of Industrial Management, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
(b) Faculty of Project Management, The University of Danang, University of Science and Technology, Danang, Vietnam

ARTICLE INFO

Keywords: Multi-objective sine-cosine algorithm (MOSCA)

ABSTRACT

This study introduces a novel data analytics-based sequential clustering and classification (SCC) approach. The proposed approach, named deep MOSCA-SCC, integrates a multi-objective sine-cosine algorithm (MOSCA), a deep clustering technique, and classification algorithms to exploit the data structure before implementing the prediction model. Herein, an autoencoder combined with the K-means algorithm is used for deep clustering to reveal the data pattern. For classification, support vector machine, back-propagation neural network, and decision tree are implemented to explore the factors correlated with the revealed pattern. To evaluate the performance of the proposed method, a comparison was conducted between the proposed deep MOSCA-SCC and other benchmark algorithms, including the NSGAII-SCC and the regular MOSCA-SCC. The deep …

1. Introduction

According to McKinsey (Analytics, 2016), data scientists are in high demand in many organizations. These organizations need more data scientists to help them analyze their data and draw actionable insight from it. Big data and business analytics solutions are expected to generate 274.3 billion USD in revenue for the global market in 2022. However, a survey conducted by McKinsey & Company found a drawback that challenges the IDC hypothesis: almost half of the executives across regions and industries globally reported more difficulty finding analytical expertise than filling any other type of position.
As data grow more complex, much of the data has many attributes without labels. There is still a lack of novel research to help data scientists with preliminary data analysis. Thus, analyzing, integrating, and transforming data at the initial stage is essential for an organization.

Clustering and classification are two key approaches for data analytics that have been explored in various fields for many years (T… et al., 2016). The objective of data clustering is to partition the data instances in a given dataset into several distinct groups based on some similarity calculation. In contrast, the objective of data classification is to assign a data object without a label to a predefined group of data with similar attributes. In the classification procedure, a labeled dataset is used to train a model that learns the relevant features of the labeled objects. Thus, data classification is considered a supervised learning approach, while data clustering is unsupervised learning. Both clustering and classification have been used in a variety of real-world areas such as pattern recognition, biology, text mining, and image processing.

Combining clustering and classification has been a new research direction in data mining for several years. Yang and Quyen proposed an innovative framework of sequential clustering and classification (SCC) that explores the hidden structure of data with a clustering technique and then finds the features correlated with the revealed patterns with a classification technique (Yang & Quyen, 2018). The elitist non-dominated sorting genetic algorithm (NSGAII) was employed to seek an optimum solution for the SCC framework. Notably, the SCC framework works on two different but correlated datasets. The first dataset contains the "target interest" features used for clustering, while the second dataset, used for classification, comprises the factors correlated with the "target interest".
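The two-dataset idea behind SCC can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain K-means for the clustering step and swaps in a simple nearest-centroid classifier for the SVM, neural network, and decision tree named in the paper, and it leaves out the multi-objective search entirely. The toy data and all function names (`sq_dist`, `mean_point`, `kmeans`, `fit_nearest_centroid`, `predict`) are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(rows):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def kmeans(points, k, iterations=20):
    """Lloyd's K-means; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean_point(members)
    return labels


def fit_nearest_centroid(features, labels):
    """Stand-in classifier (assumption): one mean feature vector per class."""
    return {lab: mean_point([f for f, l in zip(features, labels) if l == lab])
            for lab in set(labels)}


def predict(model, feature):
    """Return the class whose stored mean is closest to the feature vector."""
    return min(model, key=lambda lab: sq_dist(feature, model[lab]))


# Hypothetical toy data: row i of both datasets describes the same object.
target_interest = [(1.0, 1.1), (0.9, 1.0), (5.0, 5.2),
                   (5.1, 4.9), (1.2, 0.9), (4.8, 5.0)]
correlated_factors = [(10.0,), (11.0,), (30.0,), (29.0,), (9.0,), (31.0,)]

# Step 1 (unsupervised): reveal groups in the "target interest" features.
cluster_labels = kmeans(target_interest, k=2)
# Step 2 (supervised): learn to predict those groups from the correlated factors.
model = fit_nearest_centroid(correlated_factors, cluster_labels)
new_label = predict(model, (28.0,))
```

The cluster labels found in the first dataset become the class labels for the second, so a new object can be assigned to a revealed pattern using only its correlated factors.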
The NSGAII-SCC showed outstanding performance in terms of solution quality for both clustering and classification. The SCC …

* Corresponding author.
E-mail addresses: … (R.J. Kuo), … (M. Rakhmat Setiawan), … (T.P.Q. Nguyen).

Received 17 April 2022; Received in revised form 2 September 2022; Accepted 21 September 2022
Available online 29 September 2022
0360-8352/© 2022 Elsevier Ltd. All rights reserved.
