A Survey On Customer Churn Prediction Using Machine Learning and Data Mining Techniques in E-Commerce
Authorized licensed use limited to: Unitec Library. Downloaded on September 17,2022 at 05:57:52 UTC from IEEE Xplore. Restrictions apply.
Previous studies have proposed a wide range of methods for churn prediction and analysis. The techniques used extensively in recent analysis and prediction studies are discussed in this section. They fall broadly into two categories: i) hybrid techniques, which integrate soft computing with machine learning, and ii) machine learning-based methods. Fig. 1 shows the classification of the various techniques used for churn prediction.
In the enhanced prediction system developed by Jafari-Marandi [21], each customer's individuality is discovered by observing the hidden patterns in the training set, and the knowledge obtained is applied to the test set to improve prediction performance. The churn prediction system is designed around an artificial neural network that uses the advantages of the error-driven learning and self-organized approaches (ChP-SOEDNN).

III. REVIEW ON SOFT COMPUTING WITH MACHINE LEARNING-BASED CHURN PREDICTION

The Chr-PmRF approach [22], which consists of RF, mRMR and PSO techniques, provides better results than other existing classification, feature reduction and sampling techniques. Based on AUC, random forest (RF) and KNN were used for primary examination and optimal selection from each subset. Each optimal subset, to which Fisher's ratio, PCA and mRMR were applied individually, was further analyzed using the KNN and RF classifiers. The data set is subsampled by a PSO-based undersampling method. Verbraken [23] investigated the predictive power of various Bayesian Network algorithms, ranging from the Naive Bayes classifier to General Bayesian Network classifiers. A feature selection method based on the Markov blanket concept, which corresponds to Bayesian Networks, is also tested in that work. The feature-selection-based dynamic transfer ensemble (FSDTE) model developed by Xiao [24] applies transfer learning theory to use customer data from both the related source domain and the target domain.

In the particle classification optimization-based BP network for telecommunication customer churn prediction (PBCCP) algorithm proposed by Yu [25], particle fitness calculation (PFC) and particle classification optimization (PCO) are executed iteratively. To develop a high-performance churn prediction system with superior ability in churn identification, the classification capability of AdaBoost is combined with the search capability of genetic programming (GP). The features selected frequently in the expressions of the various GP programs are noted and examined, and the learning-based AdaBoost is used to evaluate them. The resulting churn prediction system (ChP-GPAB), developed by integrating GP-AdaBoost with a particle swarm optimization-based undersampling method, discovers the hidden reasons behind customer churning behavior and provides better learning of churners. Yu [25] focuses on a discussion of a number of research works on accurate churn measurement and on various approaches to retention. Manivannan proposed a Grey Wolf Optimization (GWO) approach [27] that reaches an improved accuracy of 89.26% when matched against actual churn and converges in minimal time compared with ACO and PSO approaches, which take longer to converge for appreciable churn prediction. CUPGO retains valuable customers, with 34.81% customer retention, which is achieved by processing a dataset gathered over two consecutive years. Vijaya & Sivasankar [28] proposed a technique for churn prediction in the telecom sector based on particle swarm optimization (PSO) with three variants: PSO integrated with feature selection, PSO integrated with simulated annealing, and PSO integrated with both. Feature selection is used as a pre-processing mechanism.

To perform churn prediction and analysis on larger datasets, another technique was proposed [29]. This method is based on a metaheuristic and uses the firefly algorithm as a classifier. The comparison block is the most intensive component of the firefly algorithm; it is used to identify the firefly whose light intensity is brighter than that of the others.

V. ISSUES FROM EXISTING WORK

For ML-based models to function properly, detailed emphasis is required on the key features of existing models. Each model requires an appropriate feature definition, which depends on the problem. Traditional models failed to identify behavioral patterns because of the unstructured format and enormous volume of the data. To overcome this failure, data mining and its application to customer churn have come under research focus, and much research has been carried out in this field to find optimal solutions. The literature survey shows that the family of deep learning algorithms has also been applied to the analysis of customer churn, in research covering the algorithms and their applications. Deep learning, being an emerging field in churn prediction analysis, paves the way for further research under machine learning.

TABLE 1. INFERENCE OF EXISTING MACHINE LEARNING MODEL BASED ON CLASSIFICATION

S. No | Methods | Merits | Demerits
1 | Extended support vector machine (ESVM) | Proficient in handling massive data. | For models constructed for large-scale, nonlinear and high-dimensional data, accuracy and generalization capability might not be assured.
2 | Logistic regression and random forests | Can be used to solve both classification and regression problems. | Does not perform well when the feature space is too large. Restricted in investigating people's buying lists in physical stores.
3 | RotBoost in amalgamation with mRMR features (CP-MRB) | The feature space is reduced efficiently through mRMR, leading to better learning capabilities. | Restricted to application domains in which datasets do not possess high dimensionality as well as imbalanced distribution.
4 | Rough K-means + SVM | Good accuracy and lower misclassification error in contrast to a single classifier model. | Outcomes are not satisfactory.
5 | Rough set theory (RST) | Predicts at-risk customers who might churn. | An open research problem may exist owing to the sort of classification technique used to approach churn prediction.
6 | Fuzzy classifier | Fuzzy classifiers are widely used; they capture the maximum number of churners owing to their high TP rate. | Domain knowledge augmentation may be prone to erroneous results.

TABLE 2. INFERENCE OF EXISTING MACHINE LEARNING MODEL BASED ON CLUSTERING

S. No | Methods | Merits | Demerits
1 | LRFM clustering | Able to obtain important characteristic attributes for customer relationship management. | Very low frequency irrespective of good customer performance contribution.
2 | K-local maximum margin feature extraction algorithm (KLMM) | Efficiency is enhanced in addition to churn prediction accuracy in telecom. | Necessitates greater computational resource overhead as well as extended computing time.
... normal firefly algorithm. | ... through | ... integrating game theory.
6 | Hybrid methodology | It can gain a considerably higher accuracy. | Better precision may cause a low recall measure and vice versa.

Deep learning is an advanced branch of machine learning that uses a hierarchical learning process to obtain high-level features of data and is used to handle enormous amounts of unstructured data. By eliminating the low-level features, the proposed Stacked Auto-Encoder (SAE), incorporated in multi-layer feature selection, obtains only high-level features from the churn data.
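The hierarchical feature learning attributed to the Stacked Auto-Encoder can be illustrated with a minimal sketch. It assumes NumPy, sigmoid layers, greedy layer-wise training by plain gradient descent, and synthetic data standing in for churn records. The function names, layer sizes and learning rate are illustrative choices only and are not taken from the surveyed work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, epochs=300, lr=0.5, seed=0):
    """Train one sigmoid autoencoder layer by full-batch gradient descent.

    Returns the encoder weights W, the encoder bias b and the loss per epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))   # encoder weights
    b = np.zeros(n_hidden)
    V = rng.normal(0.0, 0.1, size=(n_hidden, d))   # decoder weights
    c = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W + b)      # hidden code
        R = sigmoid(H @ V + c)      # reconstruction of the input
        err = R - X
        losses.append(0.5 * float(np.sum(err ** 2)) / n)
        # Backpropagate the squared reconstruction error, averaged over samples.
        dZ2 = err * R * (1.0 - R) / n
        dZ1 = (dZ2 @ V.T) * H * (1.0 - H)
        V -= lr * (H.T @ dZ2)
        c -= lr * dZ2.sum(axis=0)
        W -= lr * (X.T @ dZ1)
        b -= lr * dZ1.sum(axis=0)
    return W, b, losses

def train_stacked_autoencoder(X, layer_sizes, **kwargs):
    """Greedy layer-wise training: each layer encodes the previous layer's code."""
    layers, all_losses, code = [], [], X
    for size in layer_sizes:
        W, b, losses = train_autoencoder(code, size, **kwargs)
        layers.append((W, b))
        all_losses.append(losses)
        code = sigmoid(code @ W + b)
    return layers, all_losses

def encode(X, layers):
    """Map raw inputs to the top-level (most abstract) feature code."""
    code = X
    for W, b in layers:
        code = sigmoid(code @ W + b)
    return code

def make_toy_data(n=200, d=8, seed=1):
    """Synthetic stand-in for customer records: 8 features driven by 2 latent factors."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 2))
    mixing = rng.normal(size=(2, d))
    return sigmoid(latent @ mixing + rng.normal(0.0, 0.1, size=(n, d)))

X = make_toy_data()
layers, losses = train_stacked_autoencoder(X, layer_sizes=[5, 3])
features = encode(X, layers)   # 3 high-level features per customer
print(features.shape)
```

In a churn pipeline, the compact `features` array would typically be passed to a downstream classifier in place of the raw attributes. That final step is omitted here.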
[Figure: Accuracy comparison]
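Accuracy comparisons of this kind, and the trade-off between precision and recall noted among the demerits of the surveyed methods, both rest on confusion-matrix counts. The following minimal sketch, in plain Python with hypothetical labels for ten customers (none of the numbers come from the surveyed studies), shows how accuracy alone can hide poor churner detection on imbalanced data.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes, where label 1 means 'churned' and 0 means 'stayed'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

def churn_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for the churn class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical ground truth: 10 customers, 2 of whom churned.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

always_stay = [0] * 10                        # never predicts churn
cautious = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]     # flags one customer, correctly
eager = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]        # flags six customers

for name, y_pred in [("always_stay", always_stay),
                     ("cautious", cautious),
                     ("eager", eager)]:
    print(name, churn_metrics(y_true, y_pred))
```

The `always_stay` predictor reaches 80% accuracy while finding no churners. The `cautious` predictor has perfect precision but only 50% recall, and the `eager` predictor has full recall at much lower precision, which is the precision/recall trade-off in miniature.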
... machine learning-based (MBD) field for providing better test results for real data. The difficulties faced by methods that use machine learning in MBD analysis are as follows:

1) Large-Scale. To attain better precision and efficiency, methods that use machine learning for MBD analysis require an enormous amount of application and real-time data.
2) Generalization Problem. Scalability, or generalization ability, plays a vital role in evaluating the performance of a trained deep learning or machine learning model, which should be appropriate for diverse data subspaces. Acquiring the entire data is impractical in massive-scale MBD, even when it belongs to a specific field.
3) Multimodal Learning. It is a great challenge to acquire useful patterns and hidden knowledge through multimodal learning, which applies deep and machine learning to diverse, multimodal input data.

Besides the generalization problem, scalability receives more attention in recognition and classification problems as the number of inferred classes increases. To meet the requirements of churn prediction analysis, it is necessary to enhance the scalability of the methods while keeping feasibility and accuracy high. In future work, deep learning approaches should be investigated for better churn prediction. For trend analysis and prediction, artificial intelligence techniques such as soft computing and autoencoders can be used in future work to discover changes in the behavior patterns of churning customers.

REFERENCES

[1] Berger, P., & Kompan, M. (2019). User modeling for churn prediction in E-commerce. IEEE Intelligent Systems, 34(2), 44-52.
[2] Ren, A. H., & Zhao, W. W. (2013, December). Electronic Commerce Based on Self-Organizing Data Mining Customer Churn Prediction Model. In 2013 International Conference on Advances in Social Science, Humanities, and Management (ASSHM-13). Atlantis Press.
[3] Akter, S., & Wamba, S. F. (2016). Big data analytics in E-commerce: a systematic review and agenda for future research. Electronic Markets, 26(2), 173-194.
[4] Vanneschi, L., Horn, D. M., Castelli, M., & Popovič, A. (2018). An artificial intelligence system for predicting customer default in e-commerce. Expert Systems with Applications, 104, 1-21.
[5] Yu, X., Guo, S., Guo, J., & Huang, X. (2011). An extended support vector machine forecasting framework for customer churn in e-commerce. Expert Systems with Applications, 38(3), 1425-1430.
[6] Wu, H. L., Zhang, W. W., & Zhang, Y. Y. (2010, August). An empirical study of customer churn in e-commerce based on data mining. In 2010 International Conference on Management and Service Science (pp. 1-4). IEEE.
[7] Yu, X., Guo, S., Guo, J., & Huang, X. (2011). An extended support vector machine forecasting framework for customer churn in e-commerce. Expert Systems with Applications, 38(3), 1425-1430.
[8] Li, D. C., Dai, W. L., & Tseng, W. T. (2011). A two-stage clustering method to analyze customer characteristics to build discriminative customer management: A case of textile manufacturing business. Expert Systems with Applications, 38(6), 7186-7191.
[9] Miguéis, V. L., Van den Poel, D., Camanho, A. S., & e Cunha, J. F. (2012). Predicting partial customer churn using Markov for discrimination for modeling first purchase sequences. Advances in Data Analysis and Classification, 6(4), 337-353.
[10] Idris, A., Khan, A., & Lee, Y. S. (2013). Intelligent churn prediction in telecom: employing mRMR feature selection and RotBoost based ensemble classification. Applied Intelligence, 39(3), 659-672.
[11] Keramati, A., Jafari-Marandi, R., Aliannejadi, M., Ahmadian, I., Mozaffari, M., & Abbasi, U. (2014). Improved churn prediction in telecommunication industry using data mining techniques. Applied Soft Computing, 24, 994-1012.
[12] Zhao, L., Gao, Q., Dong, X., Dong, A., & Dong, X. (2017). K-local maximum margin feature extraction algorithm for churn prediction in telecom. Cluster Computing, 20(2), 1401-1409.
[13] Rajamohamed, R., & Manokaran, J. (2018). Improved credit card churn prediction based on rough clustering and supervised learning techniques. Cluster Computing, 21(1), 65-77.
[14] Ahmed, A. A., & Maheswari, D. (2019). An enhanced ensemble classifier for telecom churn prediction using cost based uplift modelling. International Journal of Information Technology, 11(2), 381-391.
[15] Amin, A., Anwar, S., Adnan, A., Nawaz, M., Alawfi, K., Hussain, A., & Huang, K. (2017). Customer churn prediction in the telecommunication sector using a rough set approach. Neurocomputing, 237, 242-254.
[16] Azeem, M., Usman, M., & Fong, A. C. M. (2017). A churn prediction model for prepaid customers in telecom using fuzzy classifiers. Telecommunication Systems, 66(4), 603-614.
[17] Vijaya, J., & Sivasankar, E. (2018). Computing efficient features using rough set theory combined with ensemble classification techniques to improve the customer churn prediction in telecommunication sector. Computing, 100(8), 839-860.
[18] Sivasankar, E., & Vijaya, J. (2019). Hybrid PPFCM-ANN model: an efficient system for customer churn prediction through probabilistic possibilistic fuzzy clustering and artificial neural network. Neural Computing and Applications, 31(11), 7181-7200.
[19] De Caigny, A., Coussement, K., & De Bock, K. W. (2018). A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees. European Journal of Operational Research, 269(2), 760-772.
[20] Ullah, I., Raza, B., Malik, A. K., Imran, M., Islam, S. U., & Kim, S. W. (2019). A churn prediction model using random forest: analysis of machine learning techniques for churn prediction and factor identification in telecom sector. IEEE Access, 7, 60134-60149.
[21] Jafari-Marandi, R., Denton, J., Idris, A., Smith, B. K., & Keramati, A. (2020). Optimum profit-driven churn decision making: innovative artificial neural networks in telecom industry. Neural Computing and Applications, 1-34.
[22] Idris, A., Rizwan, M., & Khan, A. (2012). Churn prediction in telecom using Random Forest and PSO based data balancing in combination with various feature selection strategies. Computers & Electrical Engineering, 38(6), 1808-1819.
[23] Verbraken, T., Verbeke, W., & Baesens, B. (2014). Profit optimizing customer churn prediction with Bayesian network classifiers. Intelligent Data Analysis, 18(1), 3-24.
[24] Xiao, J., Xiao, Y., Huang, A., Liu, D., & Wang, S. (2015). Feature-selection-based dynamic transfer ensemble model for customer churn prediction. Knowledge and Information Systems, 43(1), 29-51.
[25] Yu, R., An, X., Jin, B., Shi, J., Move, O. A., & Liu, Y. (2018). Particle classification optimization-based BP network for telecommunication customer churn prediction. Neural Computing and Applications, 29(3), 707-720.
[26] Idris, A., Iftikhar, A., & ur Rehman, Z. (2019). Intelligent churn prediction for telecom using GP-AdaBoost learning and PSO undersampling. Cluster Computing, 22(3), 7241-7255.
[27] Manivannan, R., Saminathan, R., & Saravanan, S. (2019). An improved analytical approach for customer churn prediction using Grey Wolf Optimization approach based on stochastic customer profiling over a retail shopping analysis: CUPGO. Evolutionary Intelligence, 1-10.
[28] Vijaya, J., & Sivasankar, E. (2019). An efficient system for customer churn prediction through particle swarm optimization based feature selection model with simulated annealing. Cluster Computing, 22(5), 10757-10768.
[29] Ahmed, A. A., & Maheswari, D. (2017). Churn prediction on huge telecom data using hybrid firefly based classification. Egyptian Informatics Journal, 18(3), 215-220.
[30] Mishra, A., & Reddy, U. S. (2017, December). A novel approach for churn prediction using deep learning. In 2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC) (pp. 1-4). IEEE.
[31] Li, R., Wang, P., & Chen, Z. (2016). A feature extraction method based on stacked auto-encoder for telecom churn prediction. In Theory, methodology, tools and applications for modeling and simulation of complex systems (pp. 568-576). Springer, Singapore.
[32] De Caigny, A., Coussement, K., De Bock, K. W., & Lessmann, S. (2019). Incorporating textual information in customer churn prediction models based on a convolutional neural network. International Journal of Forecasting.
[33] Wangperawong, A., Brun, C., Laudy, O., & Pavasuthipaisit, R. (2016). Churn analysis using deep convolutional neural networks and autoencoders. arXiv preprint arXiv:1604.05377.
[34] Cao, S., Liu, W., Chen, Y., & Zhu, X. (2019, October). Deep Learning Based Customer Churn Analysis. In 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP) (pp. 1-6). IEEE.