Customer Segmentation in Banking Dataset Using Machine Learning
ABSTRACT
Machine learning techniques analyze and extract useful information from data sets in order to solve problems in different
areas. For the banking sector, knowing the characteristics of customers entails a business advantage since more
personalized products and services can be offered. The goal of this study is to identify and characterize data mining and
machine learning techniques used for bank customer segmentation, their support tools, together with evaluation metrics
and datasets. We performed a systematic literature mapping of 87 primary studies published between 2005 and 2019. We
found that decision trees and linear predictors were the most used data mining and machine learning paradigms in bank
customer segmentation. Of the 41 studies that reported support tools, Weka and Matlab were the two most commonly
cited. Regarding evaluation metrics and datasets, accuracy was the most frequently used metric, whereas the UCI
Machine Learning Repository from the University of California was the most used source of datasets. In summary, several data mining and
machine learning techniques have been applied to the problem of customer segmentation, with clear tendencies regarding
the techniques, tools, metrics and datasets.
I. INTRODUCTION
Customer segmentation is important because it lets a business tailor its marketing programs to each customer segment; supports business decisions; identifies the products associated with each segment so that demand and supply for those products can be managed; identifies and targets the potential customer base; and predicts customer defection, pointing toward possible solutions. The thrust of this paper is to identify customer segments using a data mining approach, namely the partitioning algorithm known as K-means clustering, applied to a company's database. Customer segmentation is one of the applications of data mining; it groups customers with similar patterns into the same cluster, making it easier for the business to handle a large customer base.

This segmentation can directly or indirectly influence marketing strategy, since it opens many new avenues: determining which segment a product suits, customizing marketing plans for each segment, offering discounts to a specific segment, and deciphering customer-product relationships that were previously unknown to the company.

Customer segmentation allows companies to see what customers are actually buying, which prompts them to serve customers better and so improves customer satisfaction. It also allows companies to find out who their target customers are and to improve their marketing tactics to generate more revenue from them.

II. LITERATURE REVIEW
Over the years, competition among businesses has increased, and the large volume of historical data now available has led to the widespread use of data mining techniques to extract meaningful, strategic information from an organization's database. Clustering techniques treat data tuples as objects and partition those objects into groups, or clusters.

The key to meaningful segmentation is to define customer variables and attributes that are relevant to the particular business. Customers are becoming more discerning and sophisticated in how they navigate their shopping choices, and online retailers are discovering that one-size-fits-all marketing approaches are no longer effective.

Xiaojun Chen, Yixiang Fang, Min Yang, Feiping Nie, Zhou Zhao and Joshua Zhexue Huang proposed a partitional clustering algorithm named “PurTreeClust” for faster clustering of customer transaction records, noting that customer segmentation, or the clustering of customers, is one of the key elements of successful modern marketing and customer relationship management [1].

Segments are generally descriptive rather than necessarily predictive: they serve as a type of classification that aids in understanding the future behaviors and needs of customers.
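
The K-means partitioning described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function names, the toy customer tuples (annual income in thousands, spending score), and the deterministic farthest-first choice of initial centroids are all assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centroids(points, k):
    """Assumed initialization: start from the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Partition points into k clusters by alternating assignment and update steps."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the partition is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical customers: (annual income in thousands, spending score).
customers = [
    (15, 80), (16, 77), (18, 82),   # low income, high spend
    (85, 15), (88, 12), (90, 18),   # high income, low spend
    (50, 50), (52, 48), (55, 52),   # medium income, medium spend
]
labels, centroids = kmeans(customers, k=3)
print(labels)
```

Because K-means relies on Euclidean distance, attributes measured on very different scales would normally be standardized first. In practice a library implementation would likely be preferred over a hand-written loop like this one.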
III. FLOWCHART
IV. CONCLUSION
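
The clusters in this section are characterized by the income and spend levels of their members. As a minimal sketch of how such descriptions could be derived from cluster centroids, the snippet below maps each centroid to a low, medium, or high label. The cutoff values, the function names, and the example centroids are hypothetical and are not results from this study.

```python
def level(value, low_cutoff, high_cutoff):
    """Map a numeric value to 'low', 'medium', or 'high' using two cutoffs."""
    if value < low_cutoff:
        return "low"
    if value > high_cutoff:
        return "high"
    return "medium"


def describe(centroid, income_cutoffs=(40, 70), spend_cutoffs=(40, 60)):
    """Describe a (annual income, spend) centroid; the cutoffs are assumed values."""
    income, spend = centroid
    return (
        f"{level(income, *income_cutoffs)} income, "
        f"{level(spend, *spend_cutoffs)} spend"
    )


# Made-up centroids for illustration only: (annual income, spend).
example_centroids = [(88.0, 82.0), (87.0, 17.0), (26.0, 20.0), (55.0, 50.0), (25.0, 79.0)]
for number, centroid in enumerate(example_centroids, start=1):
    print(f"Example centroid {number}: {describe(centroid)}")
```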
Cluster 1 denotes customers with high annual income and high yearly spend. Cluster 2 represents customers with high annual income and low annual spend. Cluster 3 represents customers with low annual income and low annual spend. Cluster 5 denotes customers with low annual income but high yearly spend. Clusters 4 and 6 denote customers with medium income and a medium spending score.

REFERENCES
3. Hong, T., Kim, E. (2011). Segmenting customers in online stores based on factors that affect the customer's intention to purchase. Expert Systems with Applications, 39 (2), 2127-2131.
4. Hwang, Y. H. (2019). Hands-On Data Science for Marketing: Improve your marketing strategies with machine learning… using Python and R. S.l.: Packt Publishing Limited.
Puwanenthiren Premkanth, "Market Segmentation and Its Impact on Customer Satisfaction with Special Reference to Commercial Bank of Ceylon PLC." Global Journal of Management and Business Research. Publisher: Global Journals Inc. (USA). 2012. Print ISSN: 0975-5853. Volume 12 Issue 1.
1999. Vol. 31, No. 3.
[13] Vishal R. Patel and Rupa G. Mehta. Impact of Outlier Removal.