
Customer Segmentation in Banking Dataset Using Machine Learning

The document discusses using machine learning techniques for customer segmentation in the banking sector. It provides an overview of various clustering algorithms like K-means clustering and agglomerative clustering that are commonly used for customer segmentation. The goal is to identify customer segments in order to better understand customers and offer personalized products and services.


International Journal of Scientific Research in Engineering and Management (IJSREM)

Volume: 06 Issue: 05 | May - 2022 Impact Factor: 7.185 ISSN: 2582-3930

Customer Segmentation in Banking Dataset using Machine Learning


Prof. Leena Aakone¹, Aniket Parate², Pranay Dandekar³, Dasharath Pache⁴
Department of Computer Science and Engineering, Wainganga College of Engineering & Management, Nagpur.

ABSTRACT

Machine learning techniques analyze and extract useful information from data sets in order to solve problems in different
areas. For the banking sector, knowing the characteristics of customers entails a business advantage since more
personalized products and services can be offered. The goal of this study is to identify and characterize data mining and
machine learning techniques used for bank customer segmentation, their support tools, together with evaluation metrics
and datasets. We performed a systematic literature mapping of 87 primary studies published between 2005 and 2019. We
found that decision trees and linear predictors were the most used data mining and machine learning paradigms in bank
customer segmentation. Of the 41 studies that reported support tools, Weka and Matlab were the two most commonly
cited. Regarding the evaluation metrics and datasets, accuracy was the most frequently used metric, whereas the UCI
Machine Learning Repository of the University of California, Irvine was the most used dataset source. In summary, several data mining and
machine learning techniques have been applied to the problem of customer segmentation, with clear tendencies regarding
the techniques, tools, metrics and datasets.

Keywords: Customer Segmentation, Banking, Data, Clustering Methods.

I. INTRODUCTION

Customer segmentation is important because it makes it possible to tailor marketing programmes to each customer segment; to support business decisions; to identify the products associated with each segment and manage the demand and supply of those products; to identify and target the potential customer base; and to predict customer defection, which gives direction in finding solutions. The thrust of this paper is to identify customer segments using a data mining approach, namely the partitioning algorithm known as K-means clustering, applied to a company's database. Customer segmentation is one of the applications of data mining: it groups customers with similar patterns into the same cluster, making it easier for the business to handle a large customer base.

This segmentation can directly or indirectly influence the marketing strategy, as it opens many new paths to explore: finding the segment for which a product will be suitable, customizing the marketing plans for each segment, providing discounts for a specific segment, and deciphering relationships between customers and objects that were previously unknown to the company.

Customer segmentation allows companies to visualize what their customers are actually buying, which prompts them to serve their customers better and results in customer satisfaction. It also allows companies to find out who their target customers are and to improve their marketing tactics to generate more revenue from them.

II. LITERATURE REVIEW

Over the years, competition amongst businesses has increased, and the large amount of historical data now available has resulted in the widespread use of data mining techniques to extract meaningful and strategic information from an organization's database. Clustering techniques consider data tuples as objects. They partition the data objects into groups or clusters.

The key to meaningful segmentation is to define customer variables and attributes that are relevant to the particular business. Customers are becoming more discerning and sophisticated in how they navigate their shopping choices, and online retailers are discovering that one-size-fits-all marketing approaches are no longer so effective.

Xiaojun Chen, Yixiang Fang, Min Yang, Feiping Nie, Zhou Zhao and Joshua Zhexue Huang proposed a partitional clustering algorithm named "PurTreeClust" for faster clustering of customers' transaction records, noting that customer segmentation, that is, the clustering of customers, is one of the key elements in achieving successful modern marketing and customer relationship management [1].

Segments are not necessarily predictive in nature; they are generally descriptive and serve as a type of classification that can be used to aid in understanding the future behaviour and needs of customers.

© 2022, IJSREM | www.ijsrem.com | Page 1



Ion SMEUREANU graduated from the Faculty of Planning and Economic Cybernetics in 1980, as leader. He has held a PhD in "Economic Cybernetics" since 1992 and has had a remarkable teaching career since 1984, when he joined the staff of the Bucharest Academy of Economic Studies. Currently, he is a full Professor of Economic Informatics within the Department of Economic Informatics and the dean of the Faculty of Cybernetics, Statistics and Economic Informatics of the Bucharest University of Economic Studies. He is the author of more than 16 books and an impressive number of articles on economic modeling and computer applications. He has also been project director or member in many important research projects. He was awarded the Nicolae Georgescu-Roegen diploma, the award for entire research activity offered by the Romanian Statistics Society, the General Romanian Economist Association Excellence Diploma and many others.

Gheorghe RUXANDA holds a PhD in Economic Cybernetics and is Editor-in-chief of the ISI Thomson Reuters journal "Economic Computation and Economic Cybernetics Studies and Research" and Director of the Doctoral School of Economic Cybernetics and Statistics. He is a full Professor and PhD Adviser within the Department of Economic Informatics and Cybernetics, The Bucharest Academy of Economic Studies. He graduated from the Faculty of Economic Cybernetics, Statistics and Informatics, Academy of Economic Studies, Bucharest (1975), where he also earned his Doctor's Degree (1994). He has made numerous research visits: Columbia University, School of Business, New York, USA (1999); Southern Methodist University (SMU), Faculty of Computer Science and Engineering, Dallas, Texas, USA (1999); Ecole Normale Superieure, Paris, France (2000); Reading University, England (2002); North Carolina University, Chapel Hill, USA (2002). He is full professor of Multidimensional Data Analysis (Doctoral School), Data Mining and Multidimensional Data Analysis (Master Studies), Modeling and Neural Calculation (Master Studies), Econometrics and Data Analysis (Undergraduate Studies).

Laura Maria BADEA is a PhD candidate in Economic Cybernetics at the Bucharest Academy of Economic Studies. She has an MA in Corporate Finance (2010) and graduated from the Faculty of Finance, Insurance, Banking and Stock Exchange of the Bucharest Academy of Economic Studies (2008). Fields of scientific interest: machine learning and other modeling techniques used for classification matters in economic and financial domains, with a focus on artificial neural networks. Scientific research activity: one published article in an ISI Thomson Reuters journal.

1. K-means Clustering: It is the simplest clustering algorithm based on the partitioning principle. The algorithm is sensitive to the initial positions of the centroids. The number of centroids, K, is chosen by the elbow method (discussed in a later section). Once the K centroids are initialized, each data point is assigned to its closest centroid in terms of Euclidean distance, forming the clusters. After the clusters are formed, the barycentres (centroids) are recalculated as the means of the clusters, and this process is repeated until the centroid positions no longer change.

2. Agglomerative Clustering: Agglomerative clustering is based on forming a hierarchy represented by a dendrogram (discussed in a later section). The dendrogram acts as a memory for the algorithm, recording how the clusters were formed. The clustering starts with N clusters for N data points and then merges the closest data points in each step, so that the current step contains one cluster fewer than the previous one.
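The K-means and agglomerative procedures described in this section can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names (`k_means`, `agglomerative`), the choice of passing the initial centroids in explicitly, and the use of single linkage (merging the two clusters whose closest members are nearest) are all assumptions made for the example.

```python
import math


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Component-wise mean (barycentre) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, initial_centroids, max_iter=100):
    """K-means from caller-supplied initial centroids (hypothetical helper)."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sq_dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            mean_point(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so stop iterating
            break
        centroids = updated
    return centroids, clusters


def agglomerative(points, n_clusters):
    """Bottom-up clustering with single linkage (an assumed linkage rule)."""
    clusters = [[p] for p in points]  # start with N singleton clusters
    merges = []  # merge history, the information a dendrogram would display
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(sq_dist(p, q) for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], math.sqrt(d)))
        clusters[i] = clusters[i] + clusters[j]  # one cluster fewer per step
        del clusters[j]
    return clusters, merges
```

Passing the starting centroids in explicitly keeps the sensitivity to initialization visible, since different starting positions can yield different partitions. The `merges` list plays the role the text assigns to the dendrogram: it records which clusters were joined and at what distance.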


3. Mean shift Clustering: This is a non-parametric, iterative clustering algorithm that treats the data points in the feature space as samples from an empirical probability density function. Each data point is allowed to converge to a region of local maximum density: a window is fixed around the point, the mean of the points inside the window is computed, and the window is shifted to that mean. These steps are repeated until all the data points converge, and the points that converge to the same maximum form a cluster.
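The window-and-shift loop of mean shift can be illustrated with the short sketch below. It assumes a flat (uniform) kernel, so the window is simply every data point within a fixed radius; the names `mean_shift`, `group_modes` and the `bandwidth` parameter are hypothetical and chosen for the example, and grouping converged positions that lie within half a bandwidth of each other is one simple convention among several.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift: returns the converged position for each point."""
    modes = []
    for start in points:
        centre = start
        for _ in range(max_iter):
            # Window: all data points within `bandwidth` of the current centre.
            window = [p for p in points if math.dist(p, centre) <= bandwidth]
            if not window:  # defensive guard against rounding at the window edge
                break
            # Shift the window to the mean of the points it contains.
            new_centre = tuple(sum(c) / len(window) for c in zip(*window))
            shift = math.dist(new_centre, centre)
            centre = new_centre
            if shift < tol:  # the window has stopped moving
                break
        modes.append(centre)
    return modes


def group_modes(modes, bandwidth):
    """Give the same label to points whose converged positions nearly coincide."""
    labels, representatives = [], []
    for m in modes:
        for idx, rep in enumerate(representatives):
            if math.dist(m, rep) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            representatives.append(m)
            labels.append(len(representatives) - 1)
    return labels, representatives
```

Unlike K-means, this sketch takes no cluster count: the number of clusters falls out of how many distinct density peaks the windows settle on, which in turn depends on the chosen `bandwidth`.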

III. FLOWCHART

IV. CONCLUSION

As our dataset was unlabelled, in this paper we have opted for internal clustering validation rather than external clustering validation, which depends on external data such as labels. Internal cluster validation can be used to choose the clustering algorithm that best suits the dataset and correctly assigns each data point to its appropriate cluster. From the above visualization it can be observed that:


Cluster 1 denotes customers with a high annual income and a high yearly spend. Cluster 2 represents customers with a high annual income and a low annual spend. Cluster 3 represents customers with a low annual income and a low annual spend. Cluster 5 denotes customers with a low annual income but a high yearly spend. Clusters 4 and 6 denote customers with a medium income and a medium spending score.

V. REFERENCES

1. Blanchard, Tommy. Bhatnagar, Pranshu. Behera, Debasish. (2019). Data Science for Marketing Analytics: Achieve your marketing goals with the data analytics power of Python. S.l.: Packt Publishing Limited.

2. Griva, A., Bardaki, C., Pramatari, K., Papakiriakopoulos, D. (2018). Retail business analytics: Customer visit segmentation using market basket data. Expert Systems with Applications, 100, 1-16.

3. Hong, T., Kim, E. (2011). Segmenting customers in online stores based on factors that affect the customer's intention to purchase. Expert Systems with Applications, 39 (2), 2127-2131.

4. Hwang, Y. H. (2019). Hands-On Data Science for Marketing: Improve your marketing strategies with machine learning using Python and R. S.l.: Packt Publishing Limited.

5. Puwanenthiren Premkanth. "Market Segmentation and Its Impact on Customer Satisfaction with Special Reference to the Commercial Bank of Ceylon PLC." Global Journal of Management and Business Research. Publisher: Global Journals Inc. (USA). 2012. Print ISSN: 0975-5853. Volume 12 Issue 1.

6. Sulekha Goyat. "The basis of market segmentation: a critical review of the literature." European Journal of Business and Management. www.iiste.org. 2011. ISSN 2222-1905 (Paper) ISSN 2222-2839 (Online). Vol 3, No.9, 2011.

7. Jerry W Thomas. 2007. Accessed at: www.decisionanalyst.com on July 12, 2015.

8. T. Nelson Gnanaraj, Dr. K. Ramesh Kumar, N. Monica. AnuManufactured cluster analysis using a new algorithm from structured and unstructured data. International Journal of Advances in Computer Science and Technology. 2007. Volume 3, No.2.

9. McKinsey Global Institute. Big data: The next frontier for innovation, competition and productivity. 2011. Accessed at: www.mckinsey.com/mgi on July 14, 2015.

10.
