2020 IEEE 5th International Conference on Cloud Computing and Big Data Analytics

Research on a Customer Churn Combination Prediction Model Based on Decision Tree and Neural Network

Xin Hu
Basic Department, Air Force Early Warning Academy, Wuhan, Hubei, China
e-mail: [email protected]

Yanfei Yang
Basic Department, Air Force Early Warning Academy, Wuhan, Hubei, China
e-mail: [email protected]

Lanhua Chen
Basic Department, Air Force Early Warning Academy, Wuhan, Hubei, China
e-mail: [email protected]

Siru Zhu
Basic Department, Air Force Early Warning Academy, Wuhan, Hubei, China
e-mail: [email protected]

Abstract: Customer churn is a prominent issue facing companies. Preventing customer churn and retaining existing customers has become an important issue for business operations and development. Most current customer churn predictions use a single prediction model, which makes it difficult to predict customer churn accurately. Based on the prediction results and confidence of a decision tree model and a neural network model, this paper designs a combined prediction model of customer churn and conducts empirical research on the effectiveness of the model. The prediction results show that, compared with a single customer churn prediction model, the combined prediction model has higher accuracy and a better prediction effect, and can more intuitively display the basic characteristics of churn customers.

Keywords: customer churn; decision tree; prediction model; neural network

I. INTRODUCTION
Since China's entry into the WTO, many industries have opened to the outside world, which has led to fiercer market competition for enterprises. This intensified competition has made the problem of customer churn faced by enterprises more and more serious. Generally, customer churn refers to the existing customers of an enterprise no longer purchasing its products or using its services. Different industries have different definitions of customer churn [1]. Generally speaking, it can be divided into two categories: active customer churn and passive customer churn. Excessive customer churn can have a significant impact on a company's performance. How to retain existing customers while developing new customers has become a subject that the relevant staff of a company have to study [2].
Data mining technology is the most commonly used method to predict customer churn. Data mining uses decision trees, neural networks, classification and regression trees and other techniques to build a model for predicting customer churn. Through the model, information useful for predicting customer loss can be extracted from a large amount of customer data, so that enterprises can formulate relevant customer work plans based on this information [3]. At present, domestic and foreign customer churn prediction algorithms include predictions based on traditional statistics and predictions based on combined classifiers [4]. Based on machine learning methods and statistical theory, Y. Hang et al. [5] used customers' demographic statistics to correlate indicators to predict churn customers. Based on the transaction time of retail customers, Miguéis et al. [6] established a predictive model based on logistic regression. Yin Ting et al. [7] combined the prior information method of Bayesian classification with the information entropy gain method of decision tree classification and applied it to the analysis of customer churn in the telecommunications industry. Du Gang and Huang Zhenyu [8] used an improved decision tree model to predict customer purchase behavior, and compared the results before and after optimization to verify the effectiveness and efficiency of the improved algorithm. Cui Yongzhe [9] used the C4.5 decision tree algorithm to establish a churn early warning model for telecommunications customers.
However, neither a traditional parametric model nor a single artificial intelligence-based method can achieve relatively high-precision prediction, so establishing a combined prediction model to improve prediction accuracy is an inevitable trend in solving the problem of customer churn [10]. On this basis, this paper designs a combined prediction model based on two models, the C5.0 decision tree and the neural network. The confidence of the two prediction models is used as the weight, and the weighted score is used as the customer churn probability. The combined prediction model is used to predict customer churn in a supermarket. By comparing the prediction accuracy of the three models, the validity of the combined prediction model is verified.

978-1-7281-6024-5/20/$31.00 ©2020 IEEE 129

Authorized licensed use limited to: UNIVERSIDADE FEDERAL DO CEARA. Downloaded on April 03,2023 at 17:25:19 UTC from IEEE Xplore. Restrictions apply.
II. CUSTOMER CHURN PREDICTION MODEL

A. Decision Tree Customer Churn Prediction Model
Based on information gain theory, the decision tree is one of the most widely used data classification algorithms at present. The decision tree structure contains several nodes and branches, where nodes represent tests on a certain attribute and branches represent the results of the tests. Common decision tree algorithms include ID3 and C5.0, which are mainly used for predictive analysis of events. The decision tree prediction process is performed in two steps: the first is to build and evolve a decision tree using the training set; the second is to test the attribute values at each node, classify the input data, and use the attribute values of this class to complete the estimation of the prediction object [11].
This article adopts the C5.0 decision tree method, which is widely used in industry for its tolerance of dirty data and strong interpretability, to build a customer churn model. The C5.0 model uses the field with the largest information gain to split the sample. Each sample subset obtained from the first split is then usually split again according to another attribute. This is repeated until the sample subsets can no longer be split. Finally, splits that do not contribute significantly to the model are removed or pruned.
The model uses the boosting method to improve the accuracy of the C5.0 algorithm. The entire set of models is used to classify samples, and the separate predictions are integrated into one overall prediction through a weighted voting process. When the stream containing the node is executed, two new fields are added: the predicted value of each record and its confidence.

B. Neural Network Customer Churn Prediction Model
As a data analysis approach that simulates human brain thinking, neural networks are based on massively parallel data processing and calculation, and are used to describe cognitive, decision-making and other intelligent control behaviors. The model structure of a typical neural network includes an input layer, a hidden layer, and an output layer, which are connected by several neurons. The BP neural network is the most widely used neural network algorithm, and its output expression [12] is:

$$H = f_i\left(\sum \omega_{ij} x_i - \theta_j\right)$$

where $\omega_{ij}$ is the connection weight coefficient, $f_i$ is the excitation function, $\theta_j$ is the threshold of the neuron, and $x_i$ is the input of the neuron. The BP neural network is trained using supervised (teacher-guided) learning and can implement any complex non-linear mapping function. The training process is based on the principle of minimum output error, and the connection weight coefficients and thresholds are modified layer by layer.
This paper selects the customer-related characteristic attributes as the input of the neural network, and the output is whether the customer is lost. To facilitate the effective evaluation of the model later, 2/3 of the data is randomly selected as the training set and 1/3 as the test set. By default, every time a neural network node is executed, a brand new network is created. When the "Continue to train the existing model" option is selected, however, training continues with the network successfully generated by the previous node. When a stream containing the generated neural network model is executed, a new field is added to the stream for each output field in the original training data. This new field contains the network prediction for the corresponding output field. For symbolic output fields, another new field is added that contains the confidence of the prediction.

C. Combined Customer Churn Prediction Model
The combined prediction model takes the confidence of the decision tree and neural network prediction models as the weights and their prediction results as the variables, recalculates the probability of customer churn by weighting, and gives the prediction result of customer churn based on this probability. The formula for calculating the probability of customer churn is as follows:

$$P = \frac{1}{2}\left(aX_1 + bX_2\right) \qquad (1)$$

where $a$ and $b$ are the confidence of each sample record in the decision tree and neural network customer churn prediction models respectively, and $X_1$ and $X_2$ are the prediction results of customer churn in the decision tree and neural network models respectively. The value of $X_1$ and $X_2$ is either 0 or 1: if the customer is predicted to be lost, the value is 1; if the customer is predicted not to be lost, the value is 0.
According to formula (1), if both models predict not churn, that is, $X_1$ and $X_2$ are both 0, then the customer churn probability $P$ of the combined prediction is also 0, that is, the customer is not churn. If both models predict churn, that is, $X_1$ and $X_2$ are both 1, and the accuracy of the predictions of both models is above 90%, then the customer churn probability $P$ of the combined prediction will be greater than 0.5, that is, the customer is predicted to churn. If one model predicts churn and the other predicts not churn, then the churn probability $P$ of the combined prediction will be less than 0.5, because $P$ is then half the confidence of the model that predicts churn. Further, if the confidence of that churn prediction is greater than 0.8 (equivalently, $P > 0.4$), indicating that the model predicts customer churn with a greater probability, then the combined model gives the probability of customer churn. If the confidence of the churn prediction is less than 0.8, the possibility of customer churn is considered small, and the combined model predicts that the customer is not churn.
It can be seen that the biggest advantage of the combined customer churn prediction model is that it integrates the results of the two models to clearly distinguish between churn customers and non-churn customers, and for customers in between, it gives the probability of churn. This can help marketers to identify more accurately the churn status of customers with different values, and to select customers with different churn probabilities for marketing purposes.

III. EMPIRICAL ANALYSIS OF THE COMBINED PREDICTION MODEL

A. Data Collection and Selection of Attributes
The data comes from the customer information of a supermarket from June 2018 to April 2019. After missing value processing, 2681 member customers were finally used for customer churn prediction analysis.
The member database has accumulated a lot of information about members, such as gender, occupation, age, average monthly income and consumption amount. We call these attributes. Some of these attributes are closely related to customer loyalty, while others are not. The first step in customer churn prediction is to choose the most reasonable customer characteristic attributes. The membership characteristic attributes selected in this article include membership card level, age, trend value of the number of purchases, sum of non-shopping points, unit shopping points and average shopping interval. Among them, the membership card level refers to the level of the store's membership card owned by the member customer; the levels are ordinary member, silver member, gold member, and platinum member in ascending order. The trend value of the number of purchases indicates the speed at which a customer's purchases increase or decrease within a certain period of time. The sum of non-shopping points refers to points earned by members through activities such as sharing products. Unit shopping points represent the ratio of the total points accumulated by a customer to the total amount of purchases over a period of time. The average shopping interval is the average time between purchases of a member customer during a certain period of time.

B. Data Pre-processing and Churn Judgment
Before the analysis, all indicators are first made dimensionless. For continuous numeric variables, all data is mapped to the interval [0, 1]. The next step is to summarize the discrete variables and divide customer age into six age groups. Finally, some new derived variables are generated, including unit shopping points, average shopping interval and trend value of the number of purchases. The trend value of the number of purchases indicates the speed and direction of the increase or decrease of the number of purchases by the customer within 11 months, and is expressed by the slope of a linear regression. A slope greater than 0 indicates that the number of customer purchases is increasing, and the larger the value, the faster the increase. Conversely, a slope less than 0 means that the number of customer purchases is decreasing, and the smaller the value, the faster the decrease. The characteristic attribute values of a member customer are shown in Table I. In Table I, an ordinary member customer with card number 1800193 is 34 years old, has 0 non-shopping points, an average shopping interval of 41.6 days, and 0.504 unit shopping points (points / yuan). The number of purchases by this customer is increasing, with a trend value of 0.112.

TABLE I. CHARACTERISTIC ATTRIBUTES OF A MEMBER CUSTOMER

| Member card number | Membership card level | Age | Non-shopping points | Average shopping interval | Unit shopping points | Trend value of purchases |
|---|---|---|---|---|---|---|
| 1800193 | Ordinary member | 34 | 0 | 41.6 | 0.504 | 0.112 |

In order to predict customer churn based on these attributes, this article compares the latest shopping interval of a member customer with the maximum shopping interval as the criterion for judging whether the customer is churn. The latest shopping interval refers to the length of time between the member customer's last purchase and the current time, and the maximum shopping interval refers to the maximum interval between two purchases made by the member customer within a certain period of time. When the customer's latest shopping interval is greater than the maximum shopping interval, the customer is considered to be churn; otherwise the customer is considered to be not churn. According to this churn judgment standard, out of 2681 member customers, there were 669 churn customers and 2012 non-churn customers.

C. Application of Combined Prediction Model
After the data is prepared and pre-processed, it can be input into the model. First, customer churn prediction is performed using the decision tree prediction model and the neural network prediction model separately. This article uses the C5.0 algorithm in SPSS Modeler 14.1 to build the decision tree customer churn prediction model, selects "Expert Mode" in the "Model" option, and sets "Construction Severity" to 80%. In order to improve the accuracy of predicting churn customers, this article sets the misclassification cost of predicting an actual churn customer as not churn to 10. After the model is run, the prediction result and the confidence level are obtained. When using the neural network customer churn prediction model, the membership characteristic attributes are used as the input fields, with "whether churn" as the target, and the proportion of samples selected for modeling is 67%. After the model is run, the prediction result and the confidence level are obtained. Tables II and III show the confusion matrices of the two models respectively.

TABLE II. PREDICTION RESULTS OF THE DECISION TREE MODEL

| | Predicted churn customers | Predicted non-churn customers |
|---|---|---|
| Actual churn customers | 562 | 107 |
| Actual non-churn customers | 68 | 1944 |

TABLE III. PREDICTION RESULTS OF THE NEURAL NETWORK MODEL

| | Predicted churn customers | Predicted non-churn customers |
|---|---|---|
| Actual churn customers | 618 | 51 |
| Actual non-churn customers | 45 | 1967 |
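The confusion matrices in Tables II and III can be turned into accuracy figures directly. The following Python sketch is illustrative only: the class name `ConfusionMatrix` and its field names are hypothetical and not part of the paper or of SPSS Modeler, and the churn recall is an extra quantity the paper does not report. Under those assumptions, the computed accuracies should agree with the 93.47% and 96.42% listed for the two single models in Table VI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """2x2 confusion matrix with churn as the positive class (hypothetical helper)."""
    true_churn: int       # actual churn, predicted churn
    missed_churn: int     # actual churn, predicted non-churn
    false_alarm: int      # actual non-churn, predicted churn
    true_non_churn: int   # actual non-churn, predicted non-churn

    @property
    def total(self) -> int:
        return (self.true_churn + self.missed_churn
                + self.false_alarm + self.true_non_churn)

    def accuracy(self) -> float:
        # Ratio of correctly predicted customers to all customers.
        return (self.true_churn + self.true_non_churn) / self.total

    def churn_recall(self) -> float:
        # Share of actual churn customers that the model catches.
        return self.true_churn / (self.true_churn + self.missed_churn)


decision_tree = ConfusionMatrix(562, 107, 68, 1944)   # counts from Table II
neural_network = ConfusionMatrix(618, 51, 45, 1967)   # counts from Table III

for name, cm in [("decision tree", decision_tree),
                 ("neural network", neural_network)]:
    print(f"{name}: n={cm.total}, accuracy={cm.accuracy():.2%}, "
          f"churn recall={cm.churn_recall():.2%}")
```

Both matrices sum to the 2681 member customers used in the study, which is a quick consistency check on the transcribed counts.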

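As a concrete reading of formula (1) from Section II.C together with the 0.4 and 0.5 cut-offs used in this section, the following Python sketch may help. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names `churn_probability` and `combined_decision` are hypothetical, and the three input records are the rows of Table IV.

```python
from typing import Union


def churn_probability(a: float, x1: int, b: float, x2: int) -> float:
    """Formula (1): P = (a*X1 + b*X2) / 2.

    a, b   -- confidence of the decision tree / neural network prediction
    x1, x2 -- their predictions, 1 for churn and 0 for not churn
    """
    if x1 not in (0, 1) or x2 not in (0, 1):
        raise ValueError("predictions must be 0 (not churn) or 1 (churn)")
    return (a * x1 + b * x2) / 2


def combined_decision(p: float, low: float = 0.4,
                      high: float = 0.5) -> Union[str, float]:
    """Return 'not churn' for P <= low, 'churn' for P >= high,
    and the probability itself for customers in between."""
    if p <= low:
        return "not churn"
    if p >= high:
        return "churn"
    return round(p, 4)


# Rows of Table IV: (member card number, X1, a, X2, b)
records = [
    (1800200, 0, 0.912, 0, 0.878),
    (1800371, 1, 0.721, 0, 0.543),
    (1800433, 1, 0.823, 0, 0.521),
]

for card, x1, a, x2, b in records:
    p = churn_probability(a, x1, b, x2)
    print(card, f"{p:.4f}", combined_decision(p))
```

When only one model predicts churn, P is half of that model's confidence, which is why a confidence of 0.8 corresponds to the 0.4 cut-off.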
Then the prediction results and confidence of the decision tree and neural network customer churn models are collected, and the prediction results are converted: not churn is recorded as 0 and churn is recorded as 1. Substituting these values into formula (1) for the weighted calculation gives a customer churn probability P between 0 and 0.99. When P does not exceed 0.4, the combined prediction model predicts that the customer is not churn. When P is not less than 0.5, the combined prediction model predicts that the customer is churn. When P is between 0.4 and 0.5, the combined prediction model gives the probability value of the customer churn. Table IV shows the churn prediction results for some member customers.

TABLE IV. PREDICTION RESULTS OF CHURN OF SOME MEMBER CUSTOMERS

| Member card number | Decision tree prediction results | Decision tree confidence | Neural network prediction results | Neural network confidence | Customer churn probability | Whether churn |
|---|---|---|---|---|---|---|
| 1800200 | 0 | 0.912 | 0 | 0.878 | 0.0000 | not churn |
| 1800371 | 1 | 0.721 | 0 | 0.543 | 0.3605 | not churn |
| 1800433 | 1 | 0.823 | 0 | 0.521 | 0.4115 | 0.4115 |

D. Model Evaluation
The evaluation index for the quality of the customer churn prediction model is the accuracy rate, that is, the ratio of the number of correctly predicted customers to the total number of customers. The combined churn model is used to predict member customers, and the results are shown in Table V. Of the 2681 customers, 21 have a churn probability between 0.4 and 0.5. Excluding these 21 customers, there are 2660 customers left. Among these 2660 customer churn predictions, there are 2630 correct predictions and 30 incorrect predictions, so the accuracy of the model prediction is 2630/2660 = 98.87%.

TABLE V. PREDICTION RESULTS OF THE COMBINED PREDICTION MODEL

| | Predicted churn customers | Predicted non-churn customers | Customers with a churn probability of (0.4, 0.5) |
|---|---|---|---|
| Actual churn customers | 640 | 17 | 12 |
| Actual non-churn customers | 13 | 1990 | 9 |

TABLE VI. COMPARISON OF PREDICTION ACCURACY OF THE THREE MODELS

| Prediction model | Prediction accuracy |
|---|---|
| Decision tree customer churn prediction model | 93.47% |
| Neural network customer churn prediction model | 96.42% |
| Combined customer churn prediction model | 98.87% |

In order to further analyze the prediction accuracy of the different model algorithms, the prediction results of the decision tree churn model, the neural network churn model and the combined churn model are compared, as shown in Table VI. The results show that the accuracy of customer churn prediction using the combined model is higher than that of the decision tree and neural network customer churn prediction models. This is mainly because the combined model makes full use of the advantages of each single prediction model and thereby improves the accuracy of the prediction. Enterprises can formulate corresponding countermeasures to avoid customer churn based on the prediction results.

IV. CONCLUSION
To address the problem that a single model can hardly achieve high-precision customer churn prediction, this paper uses the prediction results and confidence of the decision tree prediction model and the neural network prediction model to build a combined customer churn prediction model. The empirical results show that the combined prediction model not only has the good interpretability of a decision tree model but also the higher prediction accuracy of a neural network model. It can therefore better make up for the shortcomings of a single prediction model and obtain more stable and accurate prediction results.

ACKNOWLEDGMENT
Grateful acknowledgement is made to Ms. Shi Ziyan who provided me with the data, and Mr. Fang Chensheng who gave me considerable help by means of suggestions, comments and criticism. Without their help, the completion of the article would be impossible. In addition, I deeply appreciate the contribution to this thesis made in various ways by my friends and colleagues.

REFERENCES
[1] Chen Mingliang. Discussion on the framework of the basic theoretical system of customer relationship management[J]. Journal of Management Engineering, 2006, 20(4): 36-41.
[2] Li Yang. Neural network-based customer churn data mining prediction model[J]. Journal of Computer Applications, 2006, 33(S1): 48-51.
[3] Zeng Yaohui. Application of Data Mining in Customer Relationship Management of Communication Industry[J]. Telecommunication Engineering Technology and Standardization, 2006, 19(7): 63-66.
[4] Yu Xiaobin, Cao Jie, Kong Zaiwu. A review on customer churn[J]. Computer Integrated Manufacturing System, 2012(10).
[5] Zhang X, Zhu J, Xu S, et al. Predicting Customer Churn through Interpersonal Influence[J]. Knowledge-Based Systems, 2012, 28(6): 97-104.
[6] Miguéis V L, Van Den Poel D, Camanho A S, et al. Modeling Partial Customer Churn: On the Value of First Product-category Purchase Sequences[J]. Expert Systems with Applications, 2012, 39(12): 11250-11256.
[7] Yi Ting, Ma Jun, Qin Xizhong, et al. Application of Bayes Decision Tree in Prediction of Customer Churn[J]. Computer Engineering and Applications, 2014(7).
[8] Du Gang, Huang Zhenyu. Prediction of Customer Buying Behavior in Big Data Environment[J]. Management Modernization, 2015(1).
[9] Cui Yongzhe. Application of data mining technology in early warning of customer churn[J]. Journal of Yanbian University: Natural Science Edition, 2008, 34(2).
[10] Chen Chen. Analysis of Telecom Customer Churn Forecast Based on Combination Forecast[D]. Hunan University, 2011.
[11] Yu Lu. A Combined Prediction Model of Telecommunications Customer Churn[J]. Journal of Huaqiao University (Natural Science Edition), 2016, 37(5): 637-640.
[12] Li Aiqun, Qiao Yan, Wang Ruchuan, et al. Analysis of Telecom Customer Churn Based on Distributed Hybrid Data Mining[J]. Computer Technology and Development, 2010, 20(10): 43-46.
