Application of Data Mining in Term Deposit Marketing - IAENG
mentioned that the banking industry lacks scientific marketing management and banks generally adopt some traditional marketing methods, including relationship marketing (using employees’ personal relationships to find deposit clients), self-interest marketing (obtaining deposits by satisfying clients’ individual interests, such as gifts), passive marketing (attracting customers to increase deposits by offering warm and thoughtful counter service) and simple service marketing (attracting deposits by meeting the low-level requirements of customers, such as providing door-to-door services) [10]. They came up with the idea that carrying out market segmentation of deposit marketing and selecting the marketing target is the scientific way of marketing management. However, problems such as obsolete data, inadequate maps, and a lack of data and specific methods are encountered in the practical application of deposit market segmentation.

This study will adopt data mining techniques to predict customers’ term deposit subscription behaviors and understand customers’ features to improve the effectiveness and accuracy of bank marketing. In order to achieve this objective, the following questions will be addressed.
I. How to predict whether a bank client will subscribe to a term deposit or not?
II. Which determinants would indicate a client is ready to subscribe to a term deposit through direct marketing?
III. How to segment the term deposit market?
IV. Are there any common features of clients who have subscribed to a term deposit?

III. METHODOLOGY

In this research, classification models and clustering models will be built through SPSS Modeler. IBM SPSS Modeler includes a number of machine learning algorithms and modeling techniques for solving different types of problems.

Classification algorithms are used to establish a predictive model by learning and discovering the relationship between a set of feature variables and a target variable. Classification algorithms typically contain two phases [11]. In the first phase, models are constructed from the training instances. In the second phase, unlabelled testing instances are predicted and assigned a class through the model established in the training phase. Several indicators are typically used to evaluate the performance of a binary classifier. For example, accuracy is the proportion of instances that are predicted correctly. Moreover, AUC is the area under the ROC (Receiver Operating Characteristic) curve, which can be interpreted as the probability that the classifier ranks a randomly chosen positive instance above a randomly chosen negative one [12]. Furthermore, the Gini coefficient is related to AUC by Gini = 2 × AUC − 1. A Gini coefficient above 60% corresponds to a good classification model.

Clustering algorithms are applied to customer segmentation. Instances can be divided into natural groups through clustering techniques, which are an unsupervised learning scheme [13]. Instances with strong resemblance will be in the same cluster. There are different types of clustering algorithms, including partitioning approaches, hierarchical methods, density-based methods, grid-based methods, model-based methods, etc. The quality of clustering algorithms can be evaluated by the average silhouette coefficient of all instances [14]. A higher silhouette coefficient indicates that the instances are better matched to their own clusters.

IV. DATA UNDERSTANDING

A secondary dataset related to direct marketing campaigns on term deposit accounts of a Portuguese banking institution is obtained from the Internet [15]. The dataset contains 41188 observations and 21 variables. The detailed attribute information is shown in the table below.

TABLE I
ATTRIBUTE INFORMATION

Name             Data type     Description

Bank Client Data
age              numeric       age
job              categorical   type of job
marital          categorical   marital status
education        categorical   education background
default          categorical   has credit in default?
housing          categorical   has housing loan?
loan             categorical   has personal loan?

Contact/Campaign Data
contact          categorical   contact communication type
month            categorical   last contact month of year
day_of_week      categorical   last contact day of the week
duration         numeric       last contact duration, in seconds
campaign         numeric       number of contacts performed during this campaign and for this client
pdays            numeric       number of days that passed by after the client was last contacted from a previous campaign
previous         numeric       number of contacts performed before this campaign and for this client
poutcome         categorical   outcome of the previous marketing campaign

Social and Economic Context Attributes
emp.var.rate     numeric       employment variation rate - quarterly indicator
cons.price.idx   numeric       consumer price index - monthly indicator
cons.conf.idx    numeric       consumer confidence index - monthly indicator
euribor3m        numeric       euribor 3 month rate - daily indicator
nr.employed      numeric       number of employees - quarterly indicator

Output Variable
y                binary        has the client subscribed to a term deposit?

V. MODELING

Classification and clustering models are established on the processed data.

A. Classification

Classification algorithms are used to establish a predictive model of whether a client will subscribe to a term deposit or not. The Auto Classifier node of SPSS Modeler automatically creates and compares multiple classification models. Among them, the C5.0 model shows the best performance, with the highest accuracy.
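The evaluation indicators named in the methodology (accuracy, AUC and the Gini coefficient) can be illustrated with a short Python sketch. It is not the SPSS Modeler implementation: the function names and the toy labels and scores are hypothetical, and AUC is computed through its pairwise ranking interpretation rather than by integrating a ROC curve.

```python
def accuracy(y_true, y_pred):
    """Proportion of instances whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive score and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini(auc_value):
    """Gini coefficient derived from AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc_value - 1.0


# Hypothetical toy data (not taken from the bank marketing dataset):
# 1 = client subscribed, 0 = client did not subscribe.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]  # assumed model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 cut-off

acc = accuracy(y_true, y_pred)
auc_value = auc(y_true, scores)
gini_value = gini(auc_value)
print(f"accuracy={acc:.3f}  AUC={auc_value:.3f}  Gini={gini_value:.3f}")
```

On this toy data, 14 of the 15 positive/negative pairs are ranked correctly, so AUC is 14/15 and the Gini coefficient is about 0.867, which would pass the 60% threshold mentioned in the methodology.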
Therefore, a boosted C5.0 model is built to further improve the performance of the C5.0 model. Figure 1 shows that boosting improves the accuracy of the model to 97%. In addition, both the AUC and the Gini coefficient indicate that the boosted C5.0 classifier generates more accurate classification results and is a better predictive model.

much longer than regular customers. Meanwhile, a larger number of contacts are performed for new customers during the marketing campaign, with an average of 2.16 contacts. Furthermore, the usual communication type for new customers is telephone, while the commonly used communication type for regular customers is cellular.

REFERENCES
[1] Khir, K., Gupta, L., & Shanmugam, B. (2008) ‘Islamic banking: A
practical perspective’.
[2] Islam, M.A. and Ghosh, P. (2014) ‘A comparative analysis of deposit
products in banking industry: an opportunity for eastern bank Ltd.’,
Journal of Investment and Management, 3(1), January, pp.7-20.
[3] Moro, S., Cortez, P. & Laureano, R. (2013) A data mining approach for
bank telemarketing using the rminer package and r tool [Online].
Available from:
https://www.researchgate.net/publication/256464440_A_data_mining
_approach_for_bank_telemarketing_using_the_rminer_package_and_
r_tool?enrichId=rgreq-ef9c19b19ab77f6e64e62c02ff6bdc5c-XXX&e
nrichSource=Y292ZXJQYWdlOzI1NjQ2NDQ0MDtBUzoxMTkyMz
AyMzgzMDIyMTFAMTQwNTQzODExMjgxNQ%3D%3D&el=1_x
_2&_esc=publicationCoverPdf (Accessed: 4 September 2017).
[4] Ling, X. and Li, C. (1998) ‘Data Mining for Direct Marketing:
Problems and Solutions’. Proceedings of the 4th KDD conference,
AAAI Press, pp.73–79.
[5] Ou, C., Liu, C., Huang, J. & Zhong, N. (2003) ‘On Data Mining for
Direct Marketing’. Proceedings of the 9th RSFDGrC conference, 2639,
pp.491–498.
[6] Wu, Q.H. (2008) ‘Some Issues with Applying Association Rules in
Commercial Bank’, Journal of System Simulation, 20 (8), April,
pp.2206-2209.
[7] Moro, S., Cortez, P. & Rita, P. (2014) A Data-Driven Approach to
Predict the Success of Bank Telemarketing [Online]. Available from:
https://pdfs.semanticscholar.org/4a27/709545cfa225d8983fb4df8061f
b205b9116.pdf (Accessed: 14 September 2017).
[8] Nachev, A. (2015) Application of data mining techniques for direct
marketing [Online]. Available from:
http://www.foibg.com/ibs_isc/ibs-30/ibs-30-p09.pdf (Accessed: 14
September 2017).
[9] Predue, R.T. (1974) ‘SOME TYPICAL USES OF CENSUS DATA IN
BANK MARKETING RESEARCH’, Review of Public Data Use,
2(2), pp.31-36.
[10] Wang, B.Z., Song, J.L., & Fang, C. (2002) ‘Opinions on Deposit
Marketing of Commercial Banks’, Financial Theory and Practice, 2002
(9), August, pp.32-33.
[11] Aggarwal, C.C. (2015) ‘Data Classification Algorithms and
Applications’, CRC Press, EBSCOhost [Online]. Available from:
http://10.7.1.204:81/read.php?resid=99673992 (Accessed: 25
November 2017).
[12] Fawcett, T. (2006) ‘An introduction to ROC analysis’, Pattern
recognition letters, 27(8), pp. 861-874.
[13] Witten, I., Frank, E. & Hall, M.A. (2011) Data Mining – Practical
Machine Learning Tools and Techniques. Burlington: Elsevier.
[14] Chen, X. and Li, Z. (2013) ‘Effectiveness Analysis of The Application
of Clustering in Student Grouping’, International Conference on
Education Technology and Information System, Atlantis Press,
pp.988-991.
[15] UCI Machine Learning Repository (2014). Bank Marketing Data Set
[Online]. Available from:
http://archive.ics.uci.edu/ml/datasets/Bank+Marketing (Accessed: 4
September 2017).