
International Journal of Computer Applications (0975-8887)
Volume 19, No. 8, April 2011

A Neuro-Fuzzy Classifier for Customer Churn Prediction

Hossein Abbasimehr
K. N. Toosi University of Tech, Tehran, Iran

Mostafa Setak
K. N. Toosi University of Tech, Tehran, Iran

M. J. Tarokh
K. N. Toosi University of Tech, Tehran, Iran

ABSTRACT
Churn prediction is a useful tool for identifying customers at
risk of churning. By accurately predicting churners and
non-churners, a company can use its limited marketing resources
efficiently to target likely churners in a retention marketing
campaign. Accuracy is not the only important aspect in evaluating
a churn prediction model: such models should be both accurate and
comprehensible. Therefore, the Adaptive Neuro Fuzzy Inference
System (ANFIS), a neuro-fuzzy classifier, is applied to churn
prediction modeling and benchmarked against traditional rule-based
classifiers such as C4.5 and RIPPER. In this paper, we build two
ANFIS models: ANFIS-Subtractive (a fuzzy inference system (FIS)
based on subtractive clustering) and ANFIS-FCM (an FIS based on
fuzzy C-means (FCM) clustering). The results show that both
ANFIS-Subtractive and ANFIS-FCM have acceptable performance in
terms of accuracy, specificity, and sensitivity. In addition,
ANFIS-Subtractive and ANFIS-FCM induce far fewer rules than C4.5
and RIPPER, and are therefore the most comprehensible techniques
tested in the experiments. These results indicate that ANFIS
offers acceptable accuracy and comprehensibility, and that it is
an appropriate choice for churn prediction applications.

General Terms
Data Mining & Churn

Keywords
Churn Prediction, Data mining, ANFIS, Fuzzy C-means,
Subtractive clustering.

1. INTRODUCTION
In recent years, due to saturated markets and a competitive
business environment, customer churn has become a focal concern
of most firms in all industries. Neslin et al. [1] defined customer
churn as the tendency of customers to stop doing business with a
company in a given time period. Churn prediction is a useful
tool for identifying customers at risk of churning. Technically
speaking, the purpose of churn prediction is to classify customers
into two types: customers who churn (leave the company) and
customers who continue doing business with the company [2]. By
accurately predicting churners and non-churners, a company can
use its limited marketing resources efficiently to target likely
churners in a retention marketing campaign.
Gaining a new customer costs 12 times more than retaining an
existing one [3]; therefore, a small improvement in the accuracy
of churn prediction can result in a large profit for a company [4].
Data mining techniques have been used widely in the churn
prediction context, including support vector machines (SVM) [5, 6,
7], decision trees [8], artificial neural networks (ANN) [9, 10],
and logistic regression [11, 12]. Accuracy is not the only
important aspect in evaluating a churn prediction model: such
models should be both comprehensible and accurate. A
comprehensible model reveals knowledge about the churn drivers
of customers. Such knowledge can be extracted in the form of
if-then rules, which allows a more effective retention strategy
to be developed. In this study we apply the Adaptive Neuro Fuzzy
Inference System (ANFIS) as a neuro-fuzzy classifier for customer
churn prediction. Neuro-fuzzy systems have been deployed
successfully in many applications, and they yield a rule set
derived from a fuzzy perspective inherent in the data. The main
objective of this study is to compare ANFIS as a neuro-fuzzy
classifier with two state-of-the-art crisp classifiers, C4.5 and
RIPPER. Furthermore, we introduce the generation of a fuzzy
inference system using fuzzy C-means clustering.
The remainder of this paper is organized as follows. The methods
used are described in Section 2. Section 3 describes the data
preprocessing, the evaluation metrics, and model building. The
results of the experiments are analyzed in Section 4. Conclusions
are given in Section 5.

2. METHODS
2.1 Fuzzy c-means (FCM) clustering algorithm
Fuzzy c-means (FCM) is a data clustering method in which each
data point belongs to a cluster to a degree specified by a
membership grade. This method was originally introduced by
Jim Bezdek in 1981 [13].

Suppose a collection of n data points $\{x_1, x_2, \ldots, x_n\}$
in a p-dimensional space.

The unknowns in FCM clustering are:

1- A fuzzy c-partition of the data, which is a c x n membership
matrix U = [u(ik)] with c rows and n columns. Row i gives the
membership of all n input data in cluster i (k = 1 to n); the kth
column of U gives the membership of vector k in all c clusters
(i = 1 to c). Each entry of U lies in [0, 1]; each row sum is
greater than zero; and each column sum equals 1.
2- A set of c cluster centers, arranged as the c columns of a
p x c matrix V. These cluster centers are points in the input
space of p-tuples.

Pairs (U, V) of coupled estimates are found by alternating
optimization through the first-order necessary conditions for U
and V.

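FCM performs the clustering by minimizing the following objective
function:

$$J_m = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} \, \lVert x_i - c_j \rVert^{2}, \qquad 1 < m < \infty \tag{1}$$

where m is any real number greater than 1, $u_{ij}$ is the degree
of membership of $x_i$ in cluster j, $x_i$ is the ith
p-dimensional data point, $c_j$ is the p-dimensional center of
cluster j, and $\lVert \cdot \rVert$ is any norm expressing the
similarity between a measured data point and a center. (In this
notation the first index of u runs over data points, whereas the
rows of the matrix U run over clusters.) Fuzzy partitioning is
carried out through an iterative optimization of the objective
function shown above, with the memberships $u_{ij}$ and the
cluster centers $c_j$ updated by:

$$u_{ij} = \frac{1}{\sum_{l=1}^{c} \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_l \rVert} \right)^{\frac{2}{m-1}}}, \qquad c_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} \, x_i}{\sum_{i=1}^{n} u_{ij}^{m}} \tag{2}$$

This iteration stops when
$\max_{ij} \lvert u_{ij}^{(k+1)} - u_{ij}^{(k)} \rvert < \varepsilon$,
where $\varepsilon$ is a termination criterion between 0 and 1
and k is the iteration step. This procedure converges to a local
minimum or a saddle point of $J_m$.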
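As an illustration, the alternating updates of Eq. (2) can be sketched in plain Python. This is a minimal sketch and not the implementation used in the experiments; the function name fcm, the random initialization, the Euclidean norm, and the defaults m = 2 and eps = 1e-5 are assumptions made for the example.

```python
import math
import random


def fcm(points, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy c-means by alternating the center and membership updates.

    points: list of equal-length tuples; c: number of clusters.
    Returns (u, centers) with u[i][j] = membership of points[i] in cluster j.
    """
    rng = random.Random(seed)
    n, p = len(points), len(points[0])
    # Random initial memberships, normalised so each point's row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Center update: weighted mean of the data with weights u_ij^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(p)))

        # Membership update from the distances to every center
        # (distances are floored to avoid division by zero).
        new_u = []
        for x in points:
            dist = [max(math.dist(x, v), 1e-12) for v in centers]
            new_u.append([
                1.0 / sum((dist[j] / dist[l]) ** (2.0 / (m - 1.0))
                          for l in range(c))
                for j in range(c)])

        # Stop when the largest membership change falls below eps.
        change = max(abs(new_u[i][j] - u[i][j])
                     for i in range(n) for j in range(c))
        u = new_u
        if change < eps:
            break
    return u, centers


# Hypothetical toy data: two well-separated groups of three points.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
memberships, cluster_centers = fcm(data, c=2)
```

Each row of the returned membership list sums to 1, which corresponds to the column-sum constraint on U stated above (the sketch stores one row per data point rather than one row per cluster).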

2.2 Subtractive clustering

Subtractive clustering, proposed by Chiu [14], is an automated,
data-driven method for constructing initial fuzzy models. It is
an extension of the mountain clustering method introduced by
Yager and Filev [15]. This method avoids the rule-base explosion
problem. It is a fast, one-pass algorithm for estimating the
number of clusters and the cluster centers in a set of data. The
main steps of subtractive clustering are as follows.
Consider a collection of m data points
$\{x_1, x_2, \ldots, x_m\}$ in an N-dimensional space. The
algorithm assumes each data point is a potential cluster center
and calculates a measure of potential for each of them according
to Eq. (3):

$$P_i = \sum_{j=1}^{m} \exp\left(-\alpha \lVert x_i - x_j \rVert^{2}\right) \tag{3}$$

where $\alpha = 4 / r_a^{2}$ and $r_a > 0$ defines the
neighborhood radius for each cluster center. After calculating
the potential for each vector, the one with the highest potential
is selected as the first cluster center. Let $x_1^{*}$ be the
center of the first group and $P_1^{*}$ its potential. Then the
potential for each $x_i$ is reduced according to Eq. (4):

$$P_i = P_i - P_1^{*} \exp\left(-\beta \lVert x_i - x_1^{*} \rVert^{2}\right) \tag{4}$$

Here $\beta = 4 / r_b^{2}$, and $r_b > 0$ represents the radius
of the neighborhood within which considerable potential reduction
will happen. The radius $r_b$ is chosen larger than $r_a$ to
avoid obtaining closely spaced cluster centers.
After clustering, the cluster information is used to determine
the initial number of rules and the antecedent membership
functions that are used for identifying the Fuzzy Inference
System (FIS).

2.3 Adaptive Neuro Fuzzy Inference System (ANFIS)

Fuzzy logic (FL) and fuzzy inference systems (FIS), first
proposed by Zadeh [16], provide a solution for making decisions
based on vague, ambiguous, imprecise or missing data. FL
represents models or knowledge using IF-THEN rules of the form
"if X and Y then Z". A fuzzy inference system mainly consists of
fuzzy rules, membership functions, and fuzzification and
de-fuzzification operations. By applying fuzzy inference,
ordinary crisp input data produces ordinary crisp output, which
is easy to understand and interpret. A more general description
of fuzzy problems and uncertainty is provided in [17].
There are two types of fuzzy inference systems that can be
implemented: the Mamdani type and the Sugeno type [18, 19].
Because the Sugeno system is more compact and computationally
efficient than a Mamdani system, it lends itself to the use of
adaptive techniques for constructing fuzzy models.
A fuzzy rule in a Sugeno fuzzy model has the form "if x is A and
y is B then z = f(x, y)", where A and B are input fuzzy sets in
the antecedent and z = f(x, y) is usually a zero- or first-order
polynomial function in the consequent.
The fuzzy reasoning procedure for the first-order Sugeno fuzzy
model is shown in Figure 1(a).
For a FIS to be mature and well established, so that it can work
appropriately in prediction mode, its initial structure and
parameters (linear and nonlinear) need to be tuned or adapted
through a learning process using a sufficient set of input-output
data patterns. One of the most commonly used learning systems for
adapting the linear and nonlinear parameters of an FIS,
particularly the first-order Sugeno fuzzy model, is ANFIS. ANFIS
is a class of adaptive networks that are functionally equivalent
to fuzzy inference systems [20].
ANFIS architecture:
Assume a fuzzy inference system with two inputs x and y and one
output z, using the first-order Sugeno fuzzy model. A rule set
with two fuzzy if-then rules is as follows:
If x is A1 and y is B1, then f1 = p1 x + q1 y + r1.
If x is A2 and y is B2, then f2 = p2 x + q2 y + r2.
where (p1, q1, r1) and (p2, q2, r2) are the parameters of the
output functions.
As shown in Figure 1(b), the reasoning mechanism can be
implemented as a feed-forward neural network with supervised
learning capability, which is known as the ANFIS architecture.
The ANFIS has the following layers, as illustrated in
Figure 1(b).
Layer 0: It consists of the plain input variable set.
Layer 1: The node function of every node i in this layer takes
the form [20]:

$$O_i^{1} = \mu_{A_i}(x) \tag{5}$$

where x is the input to node i and $\mu_{A_i}$ is the membership
function (which can be triangular, trapezoidal, Gaussian, or of
another shape) of the linguistic label $A_i$ associated with this
node. In other words, $O_i^{1}$ is the membership function of
$A_i$, and it specifies the degree to which the given x satisfies
the quantifier $A_i$.
In this study, the Gaussian-shaped MFs defined below are used:

$$\mu_{A_i}(x) = \exp\left(-\frac{(x - c_i)^{2}}{2\sigma_i^{2}}\right) \tag{6}$$

where $\{c_i, \sigma_i\}$ are the parameters of the MFs governing
the Gaussian functions. The parameters in this layer are usually
referred to as premise parameters.

Layer 2: Every node in this layer multiplies the incoming signals
from layer 1 and sends the product out as follows:

$$O_i^{2} = w_i = \mu_{A_i}(x) \, \mu_{B_i}(y), \quad i = 1, 2 \tag{7}$$

where the output $w_i$ of this layer represents the firing
strength of a rule.
Layer 3: Every node i in this layer determines the ratio of the
ith rule's firing strength to the sum of all rules' firing
strengths:

$$O_i^{3} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{8}$$

where the output of this layer represents the normalized firing
strengths.
Layer 4: Every node i in this layer is an adaptive node with a
node function of the form

$$O_i^{4} = \bar{w}_i f_i = \bar{w}_i \, (p_i x + q_i y + r_i) \tag{9}$$

where $\bar{w}_i$ is the output of layer 3 and
$\{p_i, q_i, r_i\}$ is the parameter set. Parameters in this
layer are referred to as the consequent parameters.
Layer 5: This layer consists of a single node that computes the
overall output as the summation of all incoming signals from
layer 4:

$$\text{Overall output} = O^{5} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{10}$$
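The five layers can be traced with a short Python sketch of the forward pass for a two-input, first-order Sugeno model. It is only an illustration: the function names, the data layout of the parameters, and the specific Gaussian form (center c and width sigma) are assumptions of this example rather than details taken from the experiments.

```python
import math


def gauss_mf(x, c, sigma):
    """Gaussian membership function with center c and width sigma (assumed form)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise, consequent):
    """Forward pass through layers 1-5 of a two-input first-order Sugeno ANFIS.

    premise: one ((c_A, sigma_A), (c_B, sigma_B)) pair per rule.
    consequent: one (p, q, r) triple per rule.
    """
    # Layers 1-2: membership grades and their product (firing strength w_i).
    w = [gauss_mf(x, *a) * gauss_mf(y, *b) for a, b in premise]
    # Layer 3: normalised firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: each rule's first-order output weighted by its normalised strength.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: overall output is the sum of the layer-4 signals.
    return sum(rule_out)


# Hypothetical two-rule model, evaluated halfway between the two rule centers.
premise = [((-1.0, 1.0), (-1.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (0.0, 0.0, 2.0)]
z = anfis_forward(0.0, 0.0, premise, consequent)
```

At the point (0, 0) both rules fire equally, so the output is the plain average of the two rule outputs f1 = 0 and f2 = 2.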

Both the premise and the consequent parameters of the ANFIS
should be tuned, using a learning algorithm, to optimally model
the relationship between input space and output space. Basically,
ANFIS takes the initial fuzzy model and tunes it by means of a
hybrid technique combining gradient descent back-propagation and
least-squares optimization algorithms. At each epoch, an error
measure, usually defined as the sum of the squared differences
between actual and desired output, is reduced. Training stops
when either the predefined number of epochs or the target error
rate is reached. There are two passes in the hybrid learning
procedure for ANFIS. In the forward pass, functional signals go
forward up to layer 4 and the consequent parameters are
identified by the least-squares estimate. In the backward pass,
the error rates propagate backward and the premise parameters are
updated by gradient descent.
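The forward pass of the hybrid procedure works because, with the premise parameters held fixed, the overall output is linear in the consequent parameters, so they can be estimated by ordinary least squares. The sketch below illustrates that step only (the gradient-descent backward pass is omitted). All function names are hypothetical, the Gaussian membership form is assumed, and the small ridge term added to the normal equations is an assumption made for numerical stability rather than part of the method described above.

```python
import math


def gauss_mf(x, c, sigma):
    """Gaussian membership function (assumed form)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def normalized_strengths(x, y, premise):
    """Layers 1-3: normalised firing strength of every rule."""
    w = [gauss_mf(x, *a) * gauss_mf(y, *b) for a, b in premise]
    total = sum(w)
    return [wi / total for wi in w]


def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_consequents(samples, targets, premise, ridge=1e-8):
    """Least-squares estimate of all (p_i, q_i, r_i) with the premise fixed."""
    rows = []
    for x, y in samples:
        row = []
        for wb in normalized_strengths(x, y, premise):
            row += [wb * x, wb * y, wb]  # coefficients of p_i, q_i, r_i
        rows.append(row)
    k = len(rows[0])
    # Normal equations (A^T A + ridge * I) z = A^T t.
    ata = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    z = solve(ata, atb)
    return [tuple(z[3 * i:3 * i + 3]) for i in range(len(premise))]


def predict(x, y, premise, consequent):
    """Layers 4-5: weighted sum of the first-order rule outputs."""
    w_bar = normalized_strengths(x, y, premise)
    return sum(wb * (p * x + q * y + r)
               for wb, (p, q, r) in zip(w_bar, consequent))
```

In a full training loop this estimate would be recomputed at every epoch, after the backward pass has adjusted the premise parameters.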

Figure 1: (a) The Sugeno fuzzy model reasoning; (b) equivalent ANFIS structure [20]

3. EMPIRICAL ANALYSIS
3.1 Dataset
All algorithms used in this paper are applied to a publicly
available dataset downloaded from the UCI Repository of
Machine Learning Databases at the University of California,
Irvine (see footnote 1). The data set contains 20 variables'
worth of information about 5000 customers, along with an
indication of whether or not each customer churned (left the
company). The proportion of churners in the dataset is 14.3%. For
a full description of the dataset, see [21]. We first split the
data set into a 67% training set and a 33% test set. Churners
were then oversampled in the training set to give the predictive
model a better ability to discern discriminating patterns, so
that the proportion of churners to non-churners in the training
set is 50%/50%. The test set was not oversampled, in order to
keep it realistic; its churn rate remained 14.3%. All models
constructed in this work are evaluated on this test set.
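The order of operations matters here: the test set is held out first and only the training portion is oversampled. A minimal Python sketch of that procedure follows. The function name, the plain random (non-stratified) split, and the label coding 1 = churner, 0 = non-churner are assumptions of this example; the text does not specify how the split was drawn.

```python
import random


def split_and_oversample(rows, labels, test_fraction=0.33, seed=0):
    """Hold out a test set, then duplicate churners in the training set only.

    Assumes the training part contains at least one churner.
    Returns (train, test) as lists of (row, label) pairs.
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    n_test = round(test_fraction * len(idx))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    churners = [i for i in train_idx if labels[i] == 1]
    others = [i for i in train_idx if labels[i] == 0]
    # Random over-sampling: draw churners with replacement until the
    # training set holds as many churners as non-churners.
    extra = [rng.choice(churners) for _ in range(len(others) - len(churners))]
    balanced = others + churners + extra
    rng.shuffle(balanced)

    train = [(rows[i], labels[i]) for i in balanced]
    test = [(rows[i], labels[i]) for i in test_idx]
    return train, test


# Hypothetical data: 1000 customer ids, of which 143 are churners.
customer_ids = list(range(1000))
churn_flags = [1 if i < 143 else 0 for i in customer_ids]
train_set, test_set = split_and_oversample(customer_ids, churn_flags)
```

Because duplicates are drawn only from the training indices, no test customer can leak into the training set through oversampling.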

3.2 Data preprocessing

Data preprocessing is an essential phase in data mining. Low
quality data will lead to low quality mining results. Data
preprocessing techniques, when applied before mining, can
substantially improve the overall quality of the patterns mined
and/or reduce the time required for the actual mining. There are
a number of data preprocessing techniques, such as data cleaning,
data transformation, data integration, and data reduction [22].
In this paper, we performed feature subset selection to remove
irrelevant attributes from the dataset. Furthermore, we used
sampling techniques to balance the positive and negative classes.

1. http://www.ics.uci.edu/~mlearn/MLRepository.html

3.3 Feature selection

We used the PART (partial decision tree) algorithm, a novel data
mining technique, for feature subset selection. This algorithm
combines the divide-and-conquer strategy for decision tree
learning with the separate-and-conquer strategy for rule
learning. A detailed description of the PART algorithm is given
in [23]. Berger et al. [24] introduced feature selection using
the PART algorithm. They showed that classifiers achieve
comparable performance in their classification task when applied
to the feature subset selected by PART. In this paper, we
obtained a reduced subset of features by applying the PART
algorithm to the dataset. First, a set of decision rules is built
by applying PART to the training set. Each rule contains a number
of features. We then extract all features contained in the rules.
Finally, the reduced feature set is derived. These features are
shown in Table 1.

Table 1: Top nine features selected by PART

Feature            Type                      Description
InterPlan          Dichotomous Categorical   International Plan Subscriber (0=no, 1=yes)
VmailP             Dichotomous Categorical   VoiceMail Plan Subscriber (0=no, 1=yes)
TotalDayMins       Continuous                Daytime usage
TotalEveMins       Continuous                Evening usage
TotalEveCharge     Continuous                Charge for evening usage
TotalNightCharge   Continuous                Charge for night time usage
TotalInterMins     Continuous                International usage
TotalInterCalls    Continuous                Number of international calls
NumberofCalltoCS   Continuous                Number of calls to customer service

3.4 Handling class imbalance

Customer churn is often a rare event in service industries, but
one of great significance and great value [25]. This means that
real customer churn datasets have an extremely skewed class
distribution. For example, in the dataset used in this study the
class distribution of churners versus non-churners is 14.3:85.7.
This makes it difficult for classification techniques to learn
which customers are about to churn. There are several data mining
problems related to rarity, along with some methods to address
them [25]. The basic sampling methods include under-sampling and
over-sampling. Under-sampling eliminates majority-class examples
while over-sampling, in its simplest form, duplicates
minority-class examples. Both of these sampling techniques
decrease the overall level of class imbalance, thereby making the
rare class less rare [26].
We applied over-sampling to balance churner and non-churner
instances in the training set, so that the two classes are
equally represented.

3.5 Evaluation Criteria

If TP, FP, TN, and FN are the True Positives, False Positives,
True Negatives and False Negatives in the confusion matrix,
then accuracy is defined as (TP + TN)/(TP + FP + TN + FN).
The sensitivity is TP/(TP + FN): the proportion of positive
cases which are predicted to be positive.
The specificity is TN/(TN + FP): the proportion of negative
cases which are predicted to be negative [22].
In this study, we used accuracy, sensitivity, and specificity to
quantify the predictive performance of the models. Furthermore,
we used the number of generated rules (#rules) to measure the
comprehensibility of the constructed models.
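The evaluation criteria of Section 3.5 can be computed directly from actual and predicted labels. The following Python sketch shows one way to do so; the function names and the convention that churners are coded as 1 (the positive class) are assumptions of this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, FP, TN, FN) for two equal-length label sequences."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(actual, predicted):
    """Accuracy, sensitivity and specificity as defined in Section 3.5.

    Assumes both classes occur in `actual`; otherwise a ratio is undefined.
    """
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels: 3 churners and 7 non-churners.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = evaluate(y_true, y_pred)
```

In this toy case TP = 2, FN = 1, FP = 1 and TN = 6, so accuracy is 8/10, sensitivity 2/3 and specificity 6/7.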

3.6 Model building

In this paper, we applied the subtractive clustering technique
using the genfis2 function. Given separate sets of input and
output data, genfis2 uses a subtractive clustering method to
generate a fuzzy inference system (FIS). When there is only one
output, genfis2 may be used to generate an initial FIS for ANFIS
training by first applying subtractive clustering to the data.
The genfis2 function uses the subclust function to estimate the
antecedent membership functions and a set of rules. It returns an
FIS structure that contains a set of fuzzy rules covering the
feature space. The parameters of subtractive clustering were set
as follows: the range of influence is 0.5, the squash factor is
1.25, the accept ratio is 0.5, and the reject ratio is 0.15. The
number of training epochs is 100. We call the FIS generated by
subtractive clustering and trained by ANFIS the
ANFIS-Subtractive model.
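To show what the four parameters control, the following Python sketch implements subtractive clustering along the lines of Eqs. (3) and (4) with the settings listed above as defaults. It is not a reimplementation of genfis2 or subclust: the function name is hypothetical, the data are assumed to be scaled to the unit hypercube beforehand, the squash factor is assumed to scale the reduction radius as r_b = squash x r_a, and the accept/reject test is the one described by Chiu [14].

```python
import math


def subtractive_clustering(points, radius=0.5, squash=1.25,
                           accept=0.5, reject=0.15):
    """Select cluster centers among `points` (tuples scaled to [0, 1])."""
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (squash * radius) ** 2

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Initial potential of every data point (Eq. (3)).
    potential = [sum(math.exp(-alpha * sqdist(x, z)) for z in points)
                 for x in points]
    first_peak = max(potential)
    centers = []
    while True:
        k = max(range(len(points)), key=lambda i: potential[i])
        peak = potential[k]
        ratio = peak / first_peak
        if ratio <= reject:
            break  # remaining potential is too low: stop
        candidate = points[k]
        if ratio > accept or not centers:
            take = True
        else:
            # Grey zone: accept only if far enough from existing centers.
            d_min = min(math.sqrt(sqdist(candidate, c)) for c in centers)
            take = d_min / radius + ratio >= 1.0
        if take:
            centers.append(candidate)
            # Subtract the influence of the new center (Eq. (4)).
            potential = [p - peak * math.exp(-beta * sqdist(x, candidate))
                         for p, x in zip(potential, points)]
        else:
            potential[k] = 0.0  # discard this candidate and try the next
    return centers


# Hypothetical data: two tight groups in the unit square.
sample = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.13), (0.11, 0.11),
          (0.90, 0.90), (0.88, 0.90), (0.90, 0.87), (0.89, 0.89)]
found = subtractive_clustering(sample)
```

Each selected center would then seed one fuzzy rule, which is why the radius and the accept and reject ratios indirectly determine the size of the rule base.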
We also applied the FCM clustering technique using the genfis3
function. genfis3 generates an FIS using fuzzy c-means (FCM)
clustering by extracting a set of rules that models the data
behavior. Like genfis2, this function requires separate sets of
input and output data as input arguments. When there is only one
output, genfis3 can be used to generate an initial FIS for ANFIS
training. The rule extraction method first uses the fcm function
to determine the number of rules and the membership functions for
the antecedents and consequents [27].
We set the number of clusters for FCM to 6 and the number of
training epochs to 100. We call the FIS generated by FCM
clustering and trained by ANFIS the ANFIS-FCM model.
The C4.5 decision tree, RIPPER, and logistic regression were run
with default parameters in the WEKA (Waikato Environment for
Knowledge Analysis) data mining software [23].

4. RESULTS AND ANALYSES

4.1 Predictive power
As can be seen from Table 2, the highest accuracy is obtained
using the RIPPER rule learner (Accuracy = 95%). However, the C4.5
decision tree, ANFIS-Subtractive, and ANFIS-FCM follow closely,
and except for logistic regression, all results lie in the
interval between 91% and 95%. Because accuracy implicitly assumes
a relatively balanced class distribution among the observations
and equal misclassification costs, it alone is not an adequate
performance measure for evaluating the experimental results [28].
A real churn dataset has a skewed distribution; therefore the
assumption of equal misclassification costs cannot be sustained.
Typically, for a customer relationship manager, the most
important issue is the correct detection of future churners.
Since the costs of misclassifying a churner are clearly higher
than the costs of misclassifying a non-churner, we should assume
unequal misclassification costs. As a result, a high sensitivity
is preferred to a high specificity from a company's point of
view. Of course this does not mean that specificity can be
completely ignored. Indeed, a reasonable tradeoff has to be made
between specificity and sensitivity. A churn prediction model
that predicts all customers as churners might perform well in
including all churning customers in a retention campaign, but
this leads to extremely high retention marketing costs.
The highest sensitivity in our experiments is obtained with C4.5
(Sensitivity = 87%). RIPPER, ANFIS-Subtractive, ANFIS-FCM and
logistic regression do not perform significantly worse.
The highest specificity in our experiments is reached with
RIPPER (Specificity = 97.5%). The C4.5, ANFIS-Subtractive, and
ANFIS-FCM models do not differ significantly in terms of
specificity, and except for logistic regression, the remaining
results lie in the interval between 92% and 95.6%.
In sum, the ANFIS-Subtractive and ANFIS-FCM models have
reasonable performance in terms of accuracy, specificity, and
sensitivity.

4.2 Comprehensibility
Accuracy, sensitivity, and specificity are not the only important
aspects in evaluating a churn prediction model [28]. A churn
prediction model should be both comprehensible and accurate. A
comprehensible model reveals knowledge about the churn drivers
of customers. Such knowledge can be extracted in the form of
if-then rules, which allows a more effective retention strategy
to be developed. Therefore, comprehensibility of the
classification model is an important requirement in churn
prediction modeling.
Among the five algorithms used in this paper, logistic regression
does not support a rule-based representation. On the other hand,
RIPPER, C4.5, ANFIS-Subtractive, and ANFIS-FCM induce
comprehensible rules from a dataset. As the results show,
ANFIS-Subtractive and ANFIS-FCM induce far fewer rules than C4.5
and RIPPER. Hence ANFIS-Subtractive and ANFIS-FCM, which result
in a comparable number of rules, are the most comprehensible
techniques tested in the experiments. The if-then rules generated
by ANFIS-Subtractive are shown in Figure 2. These results
indicate that ANFIS has acceptable performance in terms of
accuracy and comprehensibility, and that it is an appropriate
choice for churn prediction applications.

Table 2: Performance of algorithms

Technique             Accuracy   Specificity   Sensitivity   #rules
C4.5                  94%        95.6%         87%
RIPPER                95%        97.5%         85.7%
Logistic regression   77.3%      76.6%         82%           ----
ANFIS-Subtractive     92%        93%           84%
ANFIS-FCM             91%        92%           84%
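The reason accuracy alone is not an adequate measure on a skewed test set can be seen from the identity accuracy = sensitivity x churn rate + specificity x (1 - churn rate): with a 14.3% churn rate, specificity carries about six times the weight of sensitivity. The short Python sketch below applies this identity to the figures in Table 2, assuming the 14.3% test-set churn rate stated in Section 3.1; the function and variable names are illustrative only.

```python
def accuracy_from_rates(sensitivity, specificity, churn_rate):
    """Accuracy implied by the two class-wise rates and the class prior."""
    return sensitivity * churn_rate + specificity * (1.0 - churn_rate)


CHURN_RATE = 0.143  # test-set churn rate reported in Section 3.1

# (sensitivity, specificity, reported accuracy) as listed in Table 2.
table2 = {
    "C4.5": (0.87, 0.956, 0.94),
    "RIPPER": (0.857, 0.975, 0.95),
    "Logistic regression": (0.82, 0.766, 0.773),
    "ANFIS-Subtractive": (0.84, 0.93, 0.92),
    "ANFIS-FCM": (0.84, 0.92, 0.91),
}

for name, (sens, spec, reported) in table2.items():
    implied = accuracy_from_rates(sens, spec, CHURN_RATE)
    print(f"{name}: implied {implied:.3f}, reported {reported:.3f}")

# A model that labels every customer a non-churner has sensitivity 0 and
# specificity 1, yet its accuracy equals the share of non-churners.
trivial = accuracy_from_rates(0.0, 1.0, CHURN_RATE)
```

Under these assumptions the implied accuracies agree with the reported ones to within about one percentage point, and the trivial all-non-churner model would still reach 85.7% accuracy while detecting no churners at all.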


Figure 2: The if-then rules generated from ANFIS-Subtractive

5. CONCLUSIONS
Accuracy and comprehensibility are two important requirements in
churn prediction modeling. This paper presents an application of
ANFIS in the churn prediction context. In particular, we compared
ANFIS as a neuro-fuzzy classifier with two state-of-the-art crisp
classifiers, C4.5 and the RIPPER rule learner. The results showed
that both the ANFIS-Subtractive and ANFIS-FCM models have
acceptable performance in terms of accuracy, specificity, and
sensitivity. In addition, ANFIS-Subtractive and ANFIS-FCM induce
far fewer rules than C4.5 and RIPPER. Hence ANFIS-Subtractive and
ANFIS-FCM, which result in a comparable number of rules, are the
most comprehensible techniques tested in the experiments. These
results indicate that ANFIS shows acceptable performance in terms
of accuracy and comprehensibility, and that it is an appropriate
choice for churn prediction applications.

6. ACKNOWLEDGMENTS
We thank the Iran Telecommunication Research Center for
financial support.

7. REFERENCES
[1] Neslin, S.A., Gupta, S., Kamakura, W., Lu, J., Mason, C.,
2006. Defection detection: Measuring and understanding
the predictive accuracy of customer churn models. Journal
of Marketing Research, 43(2), 204-211.
[2] Coussement, K., Benoit, D.F., Van den Poel, D., 2010.
Improved marketing decision making in a customer churn
prediction context using generalized additive models,
Expert Systems with Applications 37, 2132-2143.
[3] Torkzadeh, G., Chang, J. C.-J., & Hansen, G. W., 2006.
Identifying issues in customer relationship management at
Merck-Medco. Decision Support Systems, 42(2).
[4] Van den Poel, D., & Larivière, B., 2004. Customer
attrition analysis for financial services using proportional
hazard models. European Journal of Operational Research,
157(1), 196-217.
[5] Coussement, K., Van den Poel, D., 2008a. Churn prediction
in subscription services: An application of support vector
machines while comparing two parameter-selection
techniques. Expert Systems with Applications 34, 313-327.
[6] Xie, Y., Li, X., Ngai, E., Ying, W., 2009. Customer churn
prediction using improved balanced random forests, Expert
Systems with Applications 36, 5445-5449.
[7] Yu, X., et al., 2010. An extended support vector machine
forecasting framework for customer churn in e-commerce.
Expert Systems with Applications,
doi:10.1016/j.eswa.2010.07.049.
[8] Huang, B., Buckley, B., Kechadi, T., 2010. Multi-objective
feature selection by using NSGA-II for customer churn
prediction in telecommunications, Expert Systems
with Applications 37, 3638-3646.
[9] Tsai, C., Lu, Y., 2009. Customer churn prediction by hybrid
neural networks, Expert Systems with Applications 36,
12547-12553.
[10] Pendharkar, P., 2009. Genetic algorithm based neural
network approaches for predicting churn in cellular
wireless network services, Expert Systems with
Applications 36, 6714-6720.

[11] Lemmens, A., and Croux, C., 2006. Bagging and boosting
classification trees to predict churn, Journal of Marketing
Research, vol. 43, no. 2, pp. 276-286.
[12] Coussement, K., Van den Poel, D., 2008b. Integrating the
voice of customers through call center emails into a
decision support system for churn prediction. Information
& Management, 45, 164-174.
[13] Bezdek, J.C., 1981. Pattern Recognition with Fuzzy
Objective Function Algorithms, Plenum Press, New York.
[14] Chiu, S., 1994. "Fuzzy Model Identification Based on
Cluster Estimation," Journal of Intelligent & Fuzzy
Systems, Vol. 2, No. 3, Sept.
[15] Yager, R., Filev, D., 1994. Generation of fuzzy rules by
mountain clustering, J. Intell. Fuzzy Syst. 2(3), 209-219.
[16] Zadeh, L.A., 1965. Fuzzy sets, Information and Control 8,
338-353.
[17] Zadeh, L.A., 2005. Toward a generalized theory of
uncertainty (GTU) - an outline, Information Sciences 172,
1-40.
[18] Mamdani, E.H., Assilian, S., 1975. An experiment in
linguistic synthesis with a fuzzy logic controller. Int J
Man-Machine Studies, 7(1), 1-13.
[19] Sugeno, M., 1985. Industrial applications of fuzzy control.
Elsevier Science Pub. Co.
[20] Jang, J.-S. R., 1993. "ANFIS: Adaptive-Network-based
Fuzzy Inference Systems," IEEE Transactions on Systems,
Man, and Cybernetics, Vol. 23, No. 3, pp. 665-685.
[21] Larose, D., 2005. Discovering knowledge in data: An
introduction to data mining. New Jersey, USA: Wiley.
[22] Han, J., & Kamber, M., 2006. Data Mining: Concepts and
Techniques. Morgan Kaufmann.
[23] Witten, I. H. & Frank, E., 2005. Data mining: Practical
machine learning tools and techniques. San Francisco:
Morgan Kaufmann. 0-12-088407-0.
[24] Berger, H., Merkl, D., Dittenbach, M., 2006. Exploiting
Partial Decision Trees for Feature Subset Selection in
e-Mail Categorization, In Proceedings of the ACM
Symposium on Applied Computing (SAC).
[25] Burez, J., Van den Poel, D., 2009. Handling class
imbalance in customer churn prediction, Expert Systems
with Applications 36, 4626-4636.
[26] Weiss, G. M., 2004. Mining with rarity: A unifying
framework. SIGKDD Explorations, 6(1), 7-19.
[27] Fuzzy logic toolbox user's guide for use with MATLAB,
2010.
[28] Verbeke, W., et al., 2011. Building comprehensible
customer churn prediction models with advanced rule
induction techniques. Expert Systems with Applications,
38, 2354-2364.

8 pages
About Paraguay
No ratings yet
About Paraguay
4 pages
The Hamiltonicity of Crossed Cubes in The Presence of Faults
No ratings yet
The Hamiltonicity of Crossed Cubes in The Presence of Faults
7 pages
Using The Universal PE Unpacker
No ratings yet
Using The Universal PE Unpacker
11 pages
Pinterest
No ratings yet
Pinterest
6 pages
KF Quick Reference Guide Method Parameters
No ratings yet
KF Quick Reference Guide Method Parameters
2 pages
Welding Classification
No ratings yet
Welding Classification
30 pages
8BVI0055HWDS.000-1 en
No ratings yet
8BVI0055HWDS.000-1 en
10 pages
References: Intell. AI Games, Vol. 7, No. 3, Pp. 255-265, Sep 2015
No ratings yet
References: Intell. AI Games, Vol. 7, No. 3, Pp. 255-265, Sep 2015
4 pages
Twee - Everyday Life Vocabulary
No ratings yet
Twee - Everyday Life Vocabulary
5 pages
Behavioral Attributes and Financial Churn Prediction: Regulararticle Open Access
No ratings yet
Behavioral Attributes and Financial Churn Prediction: Regulararticle Open Access
18 pages
ReSci - Retention Marketing & Predictive Analytics
No ratings yet
ReSci - Retention Marketing & Predictive Analytics
27 pages
A Neural Network Based Approach For Predicting
No ratings yet
A Neural Network Based Approach For Predicting
6 pages
Midterm Exam: TEST I MULTIPLE CHOICE. Select The Best Answer by Writing The Letter of Your Choice.
100% (1)
Midterm Exam: TEST I MULTIPLE CHOICE. Select The Best Answer by Writing The Letter of Your Choice.
3 pages
Customer Churn Prediction by Hybrid Neural Networks
No ratings yet
Customer Churn Prediction by Hybrid Neural Networks
7 pages
CNS Unit 3
No ratings yet
CNS Unit 3
94 pages
Churn Analysis Report
No ratings yet
Churn Analysis Report
28 pages
A Proposed Churn Prediction Model: Essam Shaaban, Yehia Helmy, Ayman Khedr, Mona Nasr
No ratings yet
A Proposed Churn Prediction Model: Essam Shaaban, Yehia Helmy, Ayman Khedr, Mona Nasr
5 pages
Mini-Project - Churn Analysis .
No ratings yet
Mini-Project - Churn Analysis .
15 pages
Customer Churn Prediction Review
100% (1)
Customer Churn Prediction Review
7 pages
Aqautec Ocean Parts Manual
No ratings yet
Aqautec Ocean Parts Manual
4 pages
Churn Time Series
No ratings yet
Churn Time Series
12 pages
Predictive Modeling
No ratings yet
Predictive Modeling
7 pages
Royal Park Property Development Limited
No ratings yet
Royal Park Property Development Limited
7 pages
56 Customer Churn Analysis and Prediction Using Data Mining Models in Banking Industry
No ratings yet
56 Customer Churn Analysis and Prediction Using Data Mining Models in Banking Industry
6 pages
12622-Article Text-22383-1-10-20220510
No ratings yet
12622-Article Text-22383-1-10-20220510
5 pages
Slidesgo Unlocking Retention Mastering Churn Prediction For Business Success 202410030646572TEu
No ratings yet
Slidesgo Unlocking Retention Mastering Churn Prediction For Business Success 202410030646572TEu
8 pages
Customer Churn Prediction
No ratings yet
Customer Churn Prediction
8 pages
Research On A Customer Churn Combination Prediction Model Based On Decision Tree and Neural Network
No ratings yet
Research On A Customer Churn Combination Prediction Model Based On Decision Tree and Neural Network
4 pages
2017 CustomerChurn
No ratings yet
2017 CustomerChurn
6 pages
INNOVATION - PDF Phrase 2
No ratings yet
INNOVATION - PDF Phrase 2
9 pages
Churn Modeling
100% (1)
Churn Modeling
11 pages
CHURNFORGE Research Paper Kajal
No ratings yet
CHURNFORGE Research Paper Kajal
6 pages
A Survey and Implementation of Machine Learning Algorithms For Customer Churn Prediction
No ratings yet
A Survey and Implementation of Machine Learning Algorithms For Customer Churn Prediction
7 pages
Rahman 2020
No ratings yet
Rahman 2020
6 pages
Abstract On CPP Project Sample
No ratings yet
Abstract On CPP Project Sample
19 pages
Age Prediction and Performance Comparison by Adaptive Network Based Fuzzy Inference System Using Subtractive Clustering
No ratings yet
Age Prediction and Performance Comparison by Adaptive Network Based Fuzzy Inference System Using Subtractive Clustering
5 pages
Customer Churn Prediction
No ratings yet
Customer Churn Prediction
5 pages
Final Report Srini
No ratings yet
Final Report Srini
24 pages
Applying Data Mining To Telecom Churn Ma
No ratings yet
Applying Data Mining To Telecom Churn Ma
10 pages
Customer Attrition Problem: © Springer Nature Switzerland AG 2020 K. Tarnowska Et Al.,, Studies in Big Data 55, 113
No ratings yet
Customer Attrition Problem: © Springer Nature Switzerland AG 2020 K. Tarnowska Et Al.,, Studies in Big Data 55, 113
10 pages
Dr. Richard Felder and Dr. Rebecca Brent Part 3
No ratings yet
Dr. Richard Felder and Dr. Rebecca Brent Part 3
3 pages
Customer Churn Prediction
No ratings yet
Customer Churn Prediction
6 pages
A Comparison Study Between Various Fuzzy Clustering Algorithms
No ratings yet
A Comparison Study Between Various Fuzzy Clustering Algorithms
9 pages
GRP 10 Report
No ratings yet
GRP 10 Report
16 pages
Seminar Synopsisreport
No ratings yet
Seminar Synopsisreport
6 pages
Presentation 2
No ratings yet
Presentation 2
19 pages
Review1 1
No ratings yet
Review1 1
16 pages
Efficacy of Customer Churn Prediction System
No ratings yet
Efficacy of Customer Churn Prediction System
8 pages
Literature Survey On Customer Churn Prediction
No ratings yet
Literature Survey On Customer Churn Prediction
4 pages
BIL Report 2
No ratings yet
BIL Report 2
11 pages
BIL Report 1
No ratings yet
BIL Report 1
11 pages
OBL Final
No ratings yet
OBL Final
16 pages
Comparative Study of Customer Churn Prediction Based On Data Ensemble Approach
No ratings yet
Comparative Study of Customer Churn Prediction Based On Data Ensemble Approach
10 pages
Ref 1
No ratings yet
Ref 1
10 pages
3 Customer Churn Prediction Using Composite Deep Learning Technique
No ratings yet
3 Customer Churn Prediction Using Composite Deep Learning Technique
17 pages
Fuzzy Clustering-Based Switching Non-Negative Matrix Factorization and Its Application To Environmental Data Analysis
No ratings yet
Fuzzy Clustering-Based Switching Non-Negative Matrix Factorization and Its Application To Environmental Data Analysis
6 pages
Reseacch
No ratings yet
Reseacch
29 pages
Experimental Analysis of Fuzzy Clustering Techniques - 2023 - Procedia Computer
No ratings yet
Experimental Analysis of Fuzzy Clustering Techniques - 2023 - Procedia Computer
10 pages
Predicting Customer Churn A Systematic Literature Review
No ratings yet
Predicting Customer Churn A Systematic Literature Review
22 pages
Décortication Article 1
No ratings yet
Décortication Article 1
4 pages
Assignment Csit
No ratings yet
Assignment Csit
5 pages
Customer Churn Prediction Employing Ensemble Learning
No ratings yet
Customer Churn Prediction Employing Ensemble Learning
5 pages
Test Bank For Understanding Economics A Contemporary Perspective, 9th Edition Mark Lovewell
100% (1)
Test Bank For Understanding Economics A Contemporary Perspective, 9th Edition Mark Lovewell
10 pages
131 574 1 PB
No ratings yet
131 574 1 PB
12 pages
Abhishek Singh 15 ICICN Research Paper Feb 2025
No ratings yet
Abhishek Singh 15 ICICN Research Paper Feb 2025
6 pages
Journal of Forecasting - 2021 - Pekel Ozmen - A Novel Deep Learning Model Based On Convolutional Neural Networks For
No ratings yet
Journal of Forecasting - 2021 - Pekel Ozmen - A Novel Deep Learning Model Based On Convolutional Neural Networks For
12 pages
20pd02 Aakar
No ratings yet
20pd02 Aakar
16 pages
Intelligent Machine Learning System For Predicting Customer Churn
No ratings yet
Intelligent Machine Learning System For Predicting Customer Churn
6 pages
Wa0001.
No ratings yet
Wa0001.
11 pages
PowerCo Problem
No ratings yet
PowerCo Problem
2 pages
Customer Churn Prediction Detailed Presentation
No ratings yet
Customer Churn Prediction Detailed Presentation
11 pages
Random Sample Consensus: Robust Estimation in Computer Vision
From Everand
Random Sample Consensus: Robust Estimation in Computer Vision
Fouad Sabry
No ratings yet
Group Method of Data Handling: Fundamentals and Applications for Predictive Modeling and Data Analysis
From Everand
Group Method of Data Handling: Fundamentals and Applications for Predictive Modeling and Data Analysis
Fouad Sabry
No ratings yet
Kernel Methods: Fundamentals and Applications
From Everand
Kernel Methods: Fundamentals and Applications
Fouad Sabry
No ratings yet

International Journal of Computer Applications (0975 8887)

Volume 19 No.8, April 2011

A Neuro-Fuzzy Classifier for Customer Churn Prediction

K. N. Toosi University of Tech

K. N. Toosi University of Tech

K. N. Toosi University of Tech

Logistic regression [11, 12]. Accuracy is not the only important

Suppose a collection of n data points {

The unknowns in FCM clustering are:

Where m is any real number greater than 1,

This iteration will stop when

where ε is a termination criterion between 0 and 1, while k is

After clustering, the clusters information is used for

2.3 Adaptive Neuro Fuzzy Inference System

2.2 Subtractive clustering

Fuzzy reasoning procedure for the first order Sugeno Fuzzy

defines the neighborhood

represent radius of neighborhood

for which considerable potential reduction will happen.

Assume a fuzzy inference system with two inputs x, y and one

Layer 2. Every node in this layer multiplies incoming signals

Both premise and consequent parameters of the ANFIS should

Figure 1: (a) the Sugeno fuzzy model reasoning (b)

3.2 Data preprocessing

3.3 Feature selection

Charge for evening

3.4 Handling class imbalance

3.5 Evaluation Criteria

3.6 Model building

4. RESULTS AND ANALYSES

A real churn dataset has a skewed distribution; therefore the

Among the five algorithms used in this paper, logistic regression

The highest sensitivity in our experiments is obtained with C4.5

Table 2: Performance of algorithms

Figure 2: The if-then rules generated from ANFIS-Subtractive results

[3] Torkzadeh, G., Chang, J. C.-J., & Hansen, G. W., 2006.

[20] Jang, J.-S. R., 1993. "ANFIS: Adaptive-Network-based

[12] Coussement, K., Van den Poel, D., 2008b. Integrating the

[21] Larose, D., 2005. Discovering knowledge in data: An

[13] Bezdek, J.C., 1981. Pattern Recognition with Fuzzy
