(2018) Data Analysis of Consumer Complaints in Banking
Abstract—This paper focuses on exploring and analyzing Consumer Finance Complaints data to find how many similar complaints relate to the same bank, service, or product. The data sets cover complaints about Credit Reporting, Mortgage, Debt Collection, Consumer Loan, and Banking Accounting. Using data mining techniques, cluster analysis and predictive modeling are applied to obtain valuable information about complaints in certain regions of the country. Banks that receive customer complaints can analyse the complaint data to determine where the most complaints are being filed, which products and services produce the most complaints, and other useful data. Our model will assist banks in identifying the location and types of errors for resolution, leading to increased customer satisfaction to drive revenue and profitability.

Keywords—Consumer, Complaint, analysis, clustering, predictive.

I. INTRODUCTION

In today's business environment, receiving a complaint from a consumer happens almost every day. A consumer's complaint presents a bank or reporting agency with an opportunity to identify and rectify specific problems with its current product or service. Service complaints management is a critical part of business management. A good complaint-management strategy will result in the best customer relationship outcome with minimal human-resource investment. We therefore hope to find a correlation between complaints, companies, and consumers to refine company applications to better accommodate consumer needs.

Increasingly, companies are recognizing the value of a customer complaint: it is feedback on the customer's experience and an opportunity to resolve a problem not only for that particular customer but perhaps also for a much larger number of customers. This leads to large amounts of data that have to be analyzed, and specific functions are used to aggregate the analysis results.

Clustering is regarded as a crucial unsupervised learning problem that searches for similar structures in an unlabeled data set. These similar structures are subsets of the data, usually referred to as clusters. The data within each cluster are comparable (or close) to the other members of that cluster and dissimilar to (or farther from) members that belong to other clusters. The goal of these mining techniques is to detect the intrinsic grouping of a data set. In hierarchical clustering, a treelike cluster structure (dendrogram) is created through recursive partitioning (divisive methods) or combining (agglomerative methods) of existing clusters, whereas k-means clustering divides the data into k clusters, each represented by a centroid, which helps when we know in advance which data points are probable and relevant to the output. We hope to find a correlation between complaints, companies, and consumers to refine company applications to better accommodate consumer needs using a hybrid approach of hierarchical and k-means clustering.

II. LITERATURE REVIEW

A number of studies have been conducted on services to customers and customer awareness. We review some of them here.

Kamakodi (2007) concluded that the present generation is influenced by the computing features offered by banks, so banks study the factors influencing customer preferences. Residence relocation, salary fluctuation, and unavailability of banking services are reason enough to change banks.

Uppal and Kaur (2007) examined consumers' awareness of the web domains used by banks and proposed measures to make these applications more successful. They concluded that the limitation of today's web applications lies in spreading awareness of the varied features offered.

Mishra and Jain (2007) took up dimensions of consumer satisfaction in national and private banks. The study describes satisfaction as the foremost asset of an organization, providing an unmatched competitive edge that helps achieve customer loyalty. They also discussed how a high level of customer satisfaction leads to loyalty. The study observed ten factors and five areas of satisfaction for both national and private sector banks.

Jain and Jain (2006) demonstrated that the banking sector, both private and public, has undergone radical and revolutionary change due to the liberalization act of 1991. Retail banking is the consumer preferred choice which
is reflected in the responses received from 200 customers of HDFC Bank, ICICI Bank, and some other banks in the city of Varanasi, Uttar Pradesh. The study examined the schemes offered by the banks, quantified satisfaction with different types of services and expectations about these schemes, and assessed the extent of segmentation among the services offered.

Singh (2006) discusses CRM approaches in various banks. He emphasizes how management targets customers in order to gain insight and offers value-added services and products. The web has provided a smooth user experience, giving customers access to various features and thereby achieving customer satisfaction. Management has to strive to ensure end-to-end delivery and customer satisfaction, which is essential to banks for maintaining the high regard and loyalty of their customers.

Singh (2004) examined the reality of banks in terms of providing customer support and found that customers are influenced by a bank's location and by the smallest banking details, including interest rates, as well as by the attitudes and customer support of the personnel.

III. METHODOLOGY AND PROCEDURE

A. Hierarchical Clustering

Probably the most widely applied method in economics is agglomerative hierarchical cluster analysis. It is based on a proximity matrix that contains the similarity evaluation for all pairs of objects. This means that various similarity or dissimilarity measures for different types of variables (quantitative, qualitative, and binary) can be used. Moreover, different approaches for evaluating cluster similarity (single linkage, complete linkage, average linkage, Ward's method, etc.) can also be applied.

Given:
    A set X of objects {x1, x2, ..., xn}
    A distance function dist(c1, c2)

for i = 1 to n
    ci = {xi}
end for
C = {c1, ..., cn}
l = n + 1
while C.size > 1 do
    (cmin1, cmin2) = minimum dist(ci, cj) for all ci, cj in C
    remove cmin1 and cmin2 from C
    add {cmin1, cmin2} to C
    l = l + 1
end while

B. K-means Clustering

1. Initialize the cluster centroids $\mu_1, \dots, \mu_k$ randomly.
2. Repeat until convergence: {
    For every i, set
    $$c^{(i)} := \arg\min_j \lVert x^{(i)} - \mu_j \rVert^2$$
    For each j, set
    $$\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\}\, x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}}$$
}

C. Multi-linear Regression

As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables. The independent variables can be continuous or categorical. This is relevant for understanding the correlation among our variables and against the single response.

D. Outlier Analysis

In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or to other items in a data set.
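As an illustration of outlier analysis, the following is a minimal sketch using the interquartile-range (Tukey fence) rule. The text does not state which detection method was used, so the choice of rule, the function name `iqr_outliers`, the multiplier `k = 1.5`, and the sample counts are all assumptions made for this example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartiles computed with the inclusive method (data treated as the full population).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical monthly complaint counts for one product (illustrative values only).
monthly_counts = [12, 15, 14, 13, 16, 15, 14, 90]
print(iqr_outliers(monthly_counts))
```

A month flagged in this way would be a candidate for closer inspection, not proof of an error. The threshold `k` controls how aggressive the rule is.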
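The agglomerative procedure listed under A. Hierarchical Clustering can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the objects are one-dimensional numbers, `dist` is single linkage (one of the linkage options the section names), and the function names and sample values are hypothetical.

```python
from itertools import combinations


def single_linkage(c1, c2):
    """dist(c1, c2): smallest absolute gap between any two members (single linkage)."""
    return min(abs(a - b) for a in c1 for b in c2)


def agglomerate(xs, dist=single_linkage):
    """Merge the two closest clusters until one remains; return the merge history."""
    clusters = [(x,) for x in xs]            # ci = {xi}
    merges = []
    while len(clusters) > 1:                 # while C.size > 1
        # Pair of clusters with the minimum distance over all pairs in C.
        c_min1, c_min2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        clusters.remove(c_min1)
        clusters.remove(c_min2)
        clusters.append(c_min1 + c_min2)     # add {cmin1, cmin2} to C
        merges.append((c_min1, c_min2))
    return merges


# Hypothetical complaint counts for five companies (illustrative values only).
counts = [1.0, 2.0, 6.0, 7.0, 20.0]
for left, right in agglomerate(counts):
    print(left, "+", right)
```

The merge history is the information a dendrogram displays. The exhaustive pair search is quadratic in the number of clusters at every step, so this form only suits small inputs.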
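The "repeat until convergence" loop of k-means alternates an assignment step and a centroid update step. The sketch below is one way to write it in plain Python. It assumes that convergence means the assignments stop changing, that the caller supplies the initial centroids, and that an empty cluster keeps its previous centroid. The names `kmeans` and `sq_dist` and the sample points are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, init_centroids, max_iter=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(init_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:             # converged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(len(centroids)):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:                      # assumption: empty cluster keeps old centroid
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical two-feature records (illustrative values only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
rng = random.Random(0)                       # fixed seed for a repeatable random start
labels, centroids = kmeans(points, rng.sample(points, 2))
print(labels, centroids)
```

Because the result depends on the starting centroids, a common practice is to run the procedure from several random starts and keep the solution with the lowest total squared distance.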
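For the multiple linear regression described under C. Multi-linear Regression, the standard form of the model is given here for reference. The notation is an assumption, since the text does not define symbols: $y$ is the continuous response, $x_1, \dots, x_p$ are the independent variables (categorical ones entered as indicator variables), $\beta_j$ are the coefficients, and $\varepsilon$ is the error term.

```latex
% Assumed notation: y response, x_1..x_p predictors, \varepsilon error term.
\[
  y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
\]
% Ordinary least squares estimate in matrix form (X includes a column of ones).
\[
  \hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
\]
```

Each coefficient $\beta_j$ is read as the expected change in $y$ for a one-unit change in $x_j$ with the other variables held fixed.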
CONCLUSION