Customer Churn Prediction for Telecom Services
Utku Yabas, Hakki Candan Cankaya, Turker Ince
Abstract-Customer churn is a major concern for telecom service providers because of its associated costs. This short paper briefly explains our ongoing work on customer churn prediction for telecom services. We are working on data mining methods to accurately predict customers who will leave and turn to another provider for the same or a similar service. The sample dataset we use for our experiments was compiled by Orange Telecom from real data and posted for the 2009 Knowledge Discovery and Data Mining Competition. IBM scored the highest on this dataset, with an approach requiring a significant amount of computational resources. We aim to find alternative methods that can match or improve on the recorded highest score with more efficient use of resources. The dataset has a very large number of features and examples, as well as incomplete values. As the first step, we employ some methods to preprocess the dataset for its imperfections. Then, we compare and contrast various ensemble and single classifiers. We conclude the paper with future directions for the study.

Keywords-churn prediction; machine learning; data mining; pattern recognition

I. INTRODUCTION

Rapid improvements and dynamics in the technology marketplace make customer retention a competitive effort. Especially in the saturated telecommunications market, there are incumbent service providers and newcomers offering deals and packages to consumers who would like to churn to their services. On the defending end, strategies and counteroffers have to be made for potential churners, as it is more expensive to earn a customer back once s/he churns. According to the SAS Institute report [1], the annual rate of customer churn in the telecommunications industry is currently about 30%, with an upward trend in correlation with the growth of the market.

In this study, we concentrate on the evaluation and analysis of the performance of different machine learning methods for accurate churn prediction. In 2009, the French telecom company Orange sponsored a competition in knowledge discovery and data mining (KDD) and posted three problems [2]. One of the problems was churn analysis and prediction. They provided a sample real dataset to be used in the competition. In our study, we use this dataset for our experiments. Although the competition is over, improvements can still be made. Evaluations for the ranking consider the area under the Receiver Operating Characteristic (ROC) curve. We are working on ensemble methods to improve the solution to the churn prediction problem. The first place in the KDD 2009 competition on the churn problem is still held by IBM [2]. We have chosen the same dataset to work with because it includes all the features and challenges of churn prediction. The dataset does not reveal any information about customers. We can also see other scores on the competition's website, which helps us evaluate how we are doing.

II. METHODS USED AND POTENTIAL CONTRIBUTIONS

To address the churn prediction problem, we have been using machine learning algorithms and data mining tools. One of the popular tools in the field is Weka [3]. Weka is open source software for data mining, developed by the University of Waikato in New Zealand. We also use additional libraries for the methods that are not implemented in Weka. We built and implemented these additional methods, which were not included in the tool set.

We have encountered some challenges so far in the study. Our first challenge was the size of the dataset. It includes 100,000 examples with 15,000 variables. The dataset was equally divided into training and test sets. The test set's class labels are not posted. The other concern was preprocessing of the variables. Variables are polluted by a high number of missing values and outliers. Some variables are not normalized and have different dynamic ranges. Some categorical variables have a very large vocabulary, and some numerical variables have only a few distinct values. The amount of required computational resources was high. Because of the size of the dataset and its huge number of variables and examples, model building takes a long time. The other major challenge in our study was the unbalanced numbers of positive and negative samples per class in churn data sets. The positive-to-negative sample ratio is less than 10%.

After the preprocessing, we focused mostly on ensemble methods. The algorithm that scores highest in our experiments is Random Forests [5]. A Random Forest is built from many decision trees. At each node of single decision trees, m