Customer Churn Prediction in The Telecom Sector
Abstract- Customer churn, the phenomenon of customers terminating their subscription or services with a telecom provider, poses a significant challenge in the telecom industry. Predicting customer churn is crucial for companies to retain customers and minimize revenue loss. This paper presents a comprehensive approach to churn prediction using various machine learning (ML) and deep learning (DL) techniques, including Logistic Regression, Support Vector Machine (SVM), Random Forest, K Nearest Neighbours (KNN), Gradient Boosted Tree, and Multi-Layer Perceptron (MLP) Classifier. However, existing methods often suffer from limitations such as suboptimal accuracy and an inability to capture complex patterns in the data. To address these drawbacks, this paper proposes a novel model called Hybrid Churn Prediction (HCP). The HCP model combines the strengths of different algorithms by integrating the predictions of multiple base models, resulting in improved accuracy and robustness in churn prediction. In this study, the performance of each individual model is evaluated and compared with the proposed HCP model using two distinct datasets from the telecom industry. The experimental results demonstrate that the HCP model consistently outperforms existing methods, achieving higher accuracy in predicting customer churn.
Keywords- Churn Prediction, Data mining, Customer Retention, Gradient Boost, Logistic Regression, SVM, MLP, KNN, Random forest, HCP
I. INTRODUCTION
The telecommunications sector is fiercely competitive, with multiple service providers competing for the attention of customers. Retaining existing customers is crucial for telecom companies, as acquiring new customers can be expensive[1]. Customer churn, defined as the rate at which customers cease using a company's services, poses a notable challenge within the telecommunications industry. It not only results in revenue loss[2] but also affects customer satisfaction and loyalty.
To mitigate customer churn, telecom companies can benefit from predictive analytics techniques, specifically customer churn prediction models. These models leverage machine learning algorithms[3] to analyze historical data and discern patterns and factors influencing customer churn. By accurately predicting which customers are likely to churn, telecom companies can implement proactive measures to retain those customers, including providing personalized promotions, improving customer service, or addressing specific pain points.
In recent years, customer churn prediction has gained significant attention in the telecom sector due to advancements in data collection, storage, and processing technologies. This has enabled telecom companies to leverage large-scale datasets and apply sophisticated machine learning algorithms to develop effective customer churn prediction models.
Traditionally, telecom companies have relied on manual analysis and basic statistical methods to identify potential churners. However, these approaches often lack accuracy and fail to capture the underlying patterns and complexities present in the data. In response to these limitations, machine learning (ML) and deep learning (DL) techniques have emerged as powerful tools for forecasting churn.
This paper aims to explore the effectiveness of various machine learning[4][5] and deep learning methods in predicting customer churn in the telecom industry. Specifically, this study investigates the performance of algorithms such as Logistic Regression, Random Forest, K Nearest Neighbours (KNN), Support Vector Machine (SVM), Gradient Boosted Tree, and Multi-Layer Perceptron (MLP) Classifier. Additionally, this study proposes a novel approach called Hybrid Churn Prediction (HCP), which combines the strengths of multiple algorithms to enhance predictive accuracy. By analyzing the shortcomings of existing methods and comparing them with our proposed HCP model, we seek to demonstrate the potential of hybrid approaches in improving churn prediction accuracy. Ultimately, the findings of this study aim to provide valuable insights for telecom companies seeking to mitigate customer churn and enhance customer retention strategies.
This paper is organized as follows: Section II presents the literature survey, Section III presents the churn prediction methodology, Section IV presents the implementation details, and Section V presents the results and discussions.
II. LITERATURE REVIEW
According to A. Gaur and R. Dubey[6], customer churn in the telecom industry is a significant problem that affects revenue. Customers leaving a service or subscription causes these businesses to spend more money. Companies have found that acquiring new clients is approximately six times more expensive than keeping the ones they already have[7]. In order to retain customers, businesses use predictive analysis of consumer behaviour to help them add new features and products as well as to address service-related problems. Their study offers a summary of the most recent research in the area of customer churn prediction. Their goal is to offer a clear roadmap that will make it easier to design unique churn prediction systems in the future.
According to [8], customer churn is defined as the act of a customer discontinuing use of an organization's goods or services. Rising charges, poor assistance, limited knowledge of the service plan, high subscription rates, and service quality issues, among other factors, are causes of the customer dissatisfaction that leads to churn. To retain existing customers and lower the churn rate before customers leave, businesses need to forecast customer behaviour accurately. The article presents a comprehensive review of numerous churn prediction models used across various industries. These models employ a range of machine-learning approaches, deep-learning algorithms, metaheuristic optimization strategies, feature extraction-based methods, and hybrid approaches. Additionally, the article surveys commonly employed machine-learning techniques on a cloud computing platform to discern patterns of customer turnover. A more accurate churn prediction model makes it easier to identify customers who are about to churn, to focus on reducing the overall churn percentage, to form retention strategies, and to increase the company's revenue.
According to [9], customer churn is a severe and increasingly frequent issue in the telecommunications sector. The cost of retaining current customers is substantially lower than the cost of acquiring new ones[10]; according to the literature, the cost of acquiring new customers is reported to be five times higher. One of the most significant challenges in predicting customer attrition is dealing with unbalanced data, and various solutions have been explored. Their study employs machine learning methods such as Decision Tree, Support Vector Machine, Multi-Layer Perceptron, Random Forest, and Gradient Boosting. Techniques including Synthetic Minority Over-sampling Technique (SMOTE), random over-sampling, and random under-sampling were used to balance the data. In terms of the Area Under Curve (AUC) index of the receiver operating characteristic (ROC) curve, both over-sampling and under-sampling approaches yielded comparable and appropriate results. However, under-sampling exhibited superior specificity and over-sampling demonstrated higher sensitivity. Additionally, compared to other algorithms, the random forest and gradient boosting techniques performed better.
A. EXISTING SYSTEM
The existing system typically involves traditional machine learning techniques and algorithms applied to customer churn prediction in the telecom sector. These techniques may include logistic regression, gradient boosting, random forests, decision trees, support vector machines (SVM) and others. Data preprocessing steps such as data cleaning, feature engineering, and encoding categorical variables are performed before training the models. The models are then trained on historical data containing customer attributes, usage patterns, and churn labels.
Drawbacks:
Imbalanced Data: Customer churn datasets in the telecom sector are often imbalanced [11], with a significantly higher number of non-churn instances compared to churn instances. Traditional machine learning algorithms may exhibit biases towards the majority class, leading to poor predictive performance for the minority class.
Limited Interpretability: While some traditional machine learning models like logistic regression offer interpretable results, others such as random forests or gradient boosting are considered black-box models. These models provide limited insight into the underlying factors driving churn, making it challenging for stakeholders to interpret and trust the model predictions.
Fig. 1. Taxonomy of Churn Prediction
The taxonomy diagram shown in Fig. 1 provides a structured framework for understanding the diverse range of churn prediction methods, spanning from machine learning to deep learning techniques. By categorizing these methods into distinct branches, it offers researchers and practitioners insights into the different methodologies available for predicting customer churn in various domains.
Though several methods exist, there is a need for a better churn prediction method. This paper proposes a new Hybrid Churn Prediction method and discusses the existing machine learning and deep learning approaches.
III. CHURN PREDICTION METHODOLOGY
This paper uses various machine learning and deep learning techniques to predict customer churn, building classification models such as Random Forest, SVM [12], Logistic Regression [13], Gradient Boosted Tree [14], KNN, and MLP. Each machine learning algorithm has its own drawbacks, so this paper proposes a new algorithm, “Hybrid Churn Prediction (HCP)”, which combines the strengths of Logistic Regression, Random Forest, Gradient Boost and SVM. This paper also compares the performance of all these models on two different datasets.
Dataset1 Details: The data set includes information about:
customerID: Unique identifier for each customer.
gender: Gender of the customer.
SeniorCitizen: Whether the customer is a senior citizen.
Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)
Partner: Whether the customer has a partner.
Dependents: Whether the customer has dependents.
Tenure: Number of months the customer has remained with the company.
PhoneService: Whether the customer has phone service.
MultipleLines: Whether the customer has multiple lines.
InternetService: Type of internet service subscribed to by the customer.
OnlineSecurity: Whether the customer opted for online security.
OnlineBackup: Whether the customer opted for online backup.
DeviceProtection: Whether the customer has device protection.
TechSupport: Whether the customer opted for tech support.
StreamingTV: Whether the customer opted for streaming TV.
StreamingMovies: Whether the customer opted for streaming movies.
Contract: Type of contract the customer has.
PaperlessBilling: Whether the customer has paperless billing.
PaymentMethod: Payment method used by the customer.
MonthlyCharges: Monthly charges paid by the customer.
TotalCharges: Total charges paid by the customer.
Churn: Whether the customer churned or not.
Dataset2 Details: The data set includes information about:
customer_id: Unique identifier for each customer.
telecom_partner: Partner company of the telecom service used by the customer.
gender: Gender of the customer (Categorical: 'M' for male, 'F' for female).
age: Age of the customer.
state: State where the customer resides.
city: City where the customer resides.
pincode: Postal code of the customer's location.
date_of_registration: Date when the customer registered with the telecom service.
num_dependents: Number of dependents of the customer.
estimated_salary: Estimated salary of the customer.
calls_made: Number of calls made by the customer.
sms_sent: Number of SMS messages sent by the customer.
data_used: Amount of data used by the customer.
churn: Binary variable indicating whether the customer churned (1) or not (0).
A. PROBLEM STATEMENT
The phenomenon of customers quitting or terminating their subscription or services with a telecom provider is known as "customer churn" in the telecom industry. This phenomenon can lead to large revenue loss and customer dissatisfaction. The problem statement for customer churn prediction in the telecom industry can be formulated as follows:
The challenge is to create a reliable and understandable churn prediction algorithm that can use past data to estimate potential churners, offer insights into the primary causes of churn, and direct proactive retention tactics. To enable prompt responses, the models must be scalable, able to handle large-scale datasets, and capable of early prediction. With the ultimate goal of lowering customer turnover and raising customer retention rates in the telecom industry, the performance of the models should be assessed using relevant measures to determine how well they anticipate attrition and direct retention efforts.
B. HYBRID CHURN PREDICTION ARCHITECTURE
Fig. 2. Architecture of Hybrid Churn Prediction
The detailed architecture of Hybrid Churn Prediction is shown in Fig. 2 as a step-by-step process.
The algorithm used for HCP is given below:
1. Data Preprocessing: It includes
• Removing Duplicate Values
• Renaming Column Names
• Converting to numeric datatype
• Checking Null Values
2. Model Training with Algorithm: Random Forest, Logistic Regression, K Nearest Neighbours and Support Vector Machine algorithms have been used to train the model.
3. Model Testing: Various metrics have been used to test the model, such as:
• Precision = True Positives (TP) / (True Positives (TP) + False Positives (FP))
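The four preprocessing operations listed in step 1 can be sketched with pandas as shown below. This is only an illustrative sketch under stated assumptions: the helper name `preprocess`, the lower-case renaming convention, the choice of `TotalCharges` as the column to convert, and the policy of dropping rows with null values are all assumptions, since the paper does not specify them. The sample rows are hypothetical and only reuse column names from Dataset1.

```python
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative version of the four preprocessing steps (assumed details)."""
    # 1. Remove duplicate rows.
    df = df.drop_duplicates().copy()
    # 2. Rename columns to a consistent lower-case style (assumed convention).
    df = df.rename(columns=str.lower)
    # 3. Convert a text column to a numeric dtype; unparseable entries become NaN.
    df["totalcharges"] = pd.to_numeric(df["totalcharges"], errors="coerce")
    # 4. Check for null values, report them, and drop the affected rows
    #    (dropping is one possible policy, not necessarily the authors').
    null_counts = df.isnull().sum()
    print(null_counts[null_counts > 0])
    return df.dropna().reset_index(drop=True)


# Hypothetical sample: one duplicated row and one blank TotalCharges entry.
raw = pd.DataFrame({
    "customerID": ["A1", "A1", "B2", "C3"],
    "Tenure": [1, 1, 24, 0],
    "TotalCharges": ["29.85", "29.85", "1889.5", " "],
    "Churn": ["No", "No", "Yes", "No"],
})
clean = preprocess(raw)
print(clean)
```

In this sample the duplicated row and the row with a blank charge are removed, leaving two usable records.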
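The precision definition given for model testing, together with the recall and F1 score reported per class later in the paper, can be computed as in the sketch below. Recall and F1 use their standard definitions, which are stated here as an addition because the surviving text defines only precision. The function names and the confusion counts are hypothetical.

```python
def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN); returns 0.0 when there are no actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_from(p: float, r: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical confusion counts for the churn class.
p = precision(tp=30, fp=10)   # 0.75
r = recall(tp=30, fn=20)      # 0.6
print(round(f1_from(p, r), 3))

# Consistency check on reported figures: precision 68% and recall 48%
# (HCP, Dataset1, class 1) give an F1 score of about 56%.
print(round(f1_from(0.68, 0.48), 2))  # 0.56
```

The second call reproduces the reported F1 score from the reported precision and recall, which is a quick way to sanity-check any row of the results.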
• Define the fit method to train the hybrid classifier:
    def fit(self, X_train, y_train):
• Iterate over each base classifier and train it using the provided training data:
        for classifier_name, classifier_param in self.classifier_param.items():
            classifier = create_classifier(classifier_name, classifier_param)
            classifier.fit(X_train, y_train)
IV. IMPLEMENTATION DETAILS
All the experiments were performed using an Intel(R) Core(TM) i5-9300H CPU @ 2.40GHz and 8 GB of RAM running Windows. The work was done in Google Colab using Python. The data contains 7043 observations and 21 variables.
The data was analyzed to find the four variables that have the highest correlation with the output column ‘churn’.
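The correlation-based selection of the top four variables mentioned above might look like the following sketch. The paper does not state which correlation measure was used or how categorical columns were handled; this example assumes Pearson correlation (the pandas default) on numeric columns only and ranks by absolute value. The helper name `top_correlated` and the sample values are hypothetical.

```python
import pandas as pd


def top_correlated(df: pd.DataFrame, target: str = "churn", k: int = 4) -> pd.Series:
    """Return the k numeric features with the largest absolute correlation to target."""
    numeric = df.select_dtypes("number")
    corr_with_target = numeric.corr()[target].drop(target)
    return corr_with_target.abs().sort_values(ascending=False).head(k)


# Hypothetical numeric sample; column names are illustrative.
sample = pd.DataFrame({
    "tenure":         [60, 48, 50, 36, 2, 5, 1, 8],
    "monthlycharges": [20, 30, 25, 40, 90, 80, 95, 70],
    "seniorcitizen":  [0, 0, 1, 0, 1, 0, 1, 1],
    "calls":          [5, 3, 4, 6, 5, 4, 3, 6],
    "support_calls":  [1, 0, 1, 0, 3, 4, 2, 5],
    "noise":          [1, 2, 1, 2, 1, 2, 1, 2],
    "churn":          [0, 0, 0, 0, 1, 1, 1, 1],
})
top4 = top_correlated(sample)
print(top4)
```

Ranking by absolute value keeps strongly negative predictors, such as a long tenure in this sample, alongside the positive ones.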
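The fit procedure described above trains each base classifier in turn. How HCP then combines the base predictions is not specified in the surviving text, so the sketch below assumes simple majority voting over Logistic Regression, Random Forest, Gradient Boosting and SVM. The class name `HybridClassifier`, the tie-breaking rule, the hyperparameters and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


class HybridClassifier:
    """Hypothetical HCP-style ensemble: fit every base model, then majority-vote."""

    def __init__(self, classifiers):
        self.classifiers = classifiers  # dict: name -> unfitted estimator

    def fit(self, X_train, y_train):
        # Train each base classifier on the same training data.
        for classifier in self.classifiers.values():
            classifier.fit(X_train, y_train)
        return self

    def predict(self, X):
        # Stack the 0/1 predictions: shape (n_models, n_samples).
        votes = np.array([clf.predict(X) for clf in self.classifiers.values()])
        # Predict churn (1) only when more than half of the models vote 1;
        # with four models a 2-2 tie therefore resolves to 0 (assumed rule).
        return (votes.mean(axis=0) > 0.5).astype(int)


# Synthetic binary data standing in for a churn dataset.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

hcp = HybridClassifier({
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=50, random_state=0),
    "gb": GradientBoostingClassifier(random_state=0),
    "svm": SVC(),
}).fit(X_train, y_train)

preds = hcp.predict(X_test)
accuracy = float((preds == y_test).mean())
print(f"hybrid accuracy on synthetic data: {accuracy:.3f}")
```

Weighted or probability-based (soft) voting would be equally plausible readings of "integrating the predictions of multiple base models"; hard voting is used here only because it is the simplest to state.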
94% for class 0 and 36% for class 1. The f1 score was estimated to be 87% for class 0 and 47% for class 1.
For Dataset2: The test classification accuracy of the SVM model was 86%. The model’s precision was estimated to be 86% for class 0 and 0% for class 1. The recall was 100% for class 0 and 0% for class 1. The f1 score was estimated to be 92% for class 0 and 0% for class 1.
C. Random Forest (RF)
For Dataset1: The test classification accuracy of the Random Forest model was 76.6%. The model’s precision was estimated to be 81% for class 0 and 61% for class 1. The recall was 89% for class 0 and 45% for class 1. The f1 score was estimated to be 85% for class 0 and 52% for class 1.
For Dataset2: The test classification accuracy of the Random Forest model was 86%. The model’s precision was estimated to be 86% for class 0 and 0% for class 1. The recall was 100% for class 0 and 0% for class 1. The f1 score was estimated to be 92% for class 0 and 0% for class 1.
D. Gradient Boost
For Dataset1: The test classification accuracy of the Gradient Boost model was 78.2%. The model’s precision was estimated to be 80% for class 0 and 69% for class 1. The recall was 92% for class 0 and 43% for class 1. The f1 score was estimated to be 86% for class 0 and 53% for class 1.
For Dataset2: The test classification accuracy of the Gradient Boost model was 79%. The model’s precision was estimated to be 86% for class 0 and 11% for class 1. The recall was 91% for class 0 and 7% for class 1. The f1 score was estimated to be 88% for class 0 and 9% for class 1.
E. K Nearest Neighbours (KNN)
For Dataset1: The test classification accuracy of the KNN model was 77.3%. The model’s precision was estimated to be 83% for class 0 and 58% for class 1. The recall was 88% for class 0 and 47% for class 1. The f1 score was estimated to be 85% for class 0 and 52% for class 1.
For Dataset2: The test classification accuracy of the KNN model was 85%. The model’s precision was estimated to be 87% for class 0 and 33% for class 1. The recall was 98% for class 0 and 7% for class 1. The f1 score was estimated to be 92% for class 0 and 12% for class 1.
F. Multi-Layer Perceptron (MLP)
For Dataset1: The test classification accuracy of the MLP model was 79.4%. The model’s precision was estimated to be 83% for class 0 and 65% for class 1. The recall was 91% for class 0 and 46% for class 1. The f1 score was estimated to be 87% for class 0 and 54% for class 1.
For Dataset2: The test classification accuracy of the MLP model was 82%. The model’s precision was estimated to be 88% for class 0 and 30% for class 1. The recall was 92% for class 0 and 21% for class 1. The f1 score was estimated to be 90% for class 0 and 25% for class 1.
G. Hybrid Churn Prediction (HCP)
For Dataset1: The test classification accuracy of the HCP Classifier model was 79.6%. The model’s precision was estimated to be 83% for class 0 and 68% for class 1. The recall was 91% for class 0 and 48% for class 1. The f1 score was estimated to be 87% for class 0 and 56% for class 1.
For Dataset2: The test classification accuracy of the HCP Classifier model was 86%. The model’s precision was estimated to be 86% for class 0 and 0% for class 1. The recall was 100% for class 0 and 0% for class 1. The f1 score was estimated to be 92% for class 0 and 0% for class 1.
V. RESULTS AND DISCUSSIONS
In this paper, customers are categorized on the basis of their age groups and requirements, as shown in the figure below:
Fig 4. Different age groups and requirements
From Fig. 4, the following summary can be derived for each customer category and its requirements:
Young customers in the age group of 0-20 tend to have lower data usage, but they make a relatively higher number of calls compared to other age groups. They also send fewer SMS messages on average.
Middle-aged customers between 21 and 40 years old exhibit higher data usage compared to younger customers. They also send more SMS messages on average, while the number of calls made is slightly lower compared to the younger age group.
Older customers aged between 41 and 60 show similar data usage patterns to middle-aged customers but tend to make slightly fewer calls on average. They also send a comparable number of SMS messages.
Senior customers aged 61 and above exhibit relatively consistent data usage compared to middle-aged and older customers. They also maintain a moderate level of SMS usage and make a moderate number of calls.
Understanding these patterns can help telecom companies tailor their services and offerings to better meet the needs and preferences of different customer segments based on their age.
Telecom companies need to understand what causes customer churn in order to keep their current clients from defecting. Telecom data can reveal this type of information. This work includes the training of six distinct machine learning models and the building of a new ‘Hybrid Churn Prediction (HCP)’ model.
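The age-group summary discussed in this section can be approximated with a simple binning and aggregation step, sketched below with pandas. The bin edges follow the four age ranges named in the text (0-20, 21-40, 41-60, 61 and above) and the column names follow the Dataset2 description; the sample values are hypothetical and do not reproduce the figures behind Fig. 4.

```python
import pandas as pd

# Hypothetical usage records using Dataset2 column names.
usage = pd.DataFrame({
    "age":        [18, 20, 25, 40, 41, 60, 61, 75],
    "calls_made": [50, 54, 40, 44, 38, 40, 42, 40],
    "sms_sent":   [10, 12, 25, 27, 24, 26, 20, 22],
    "data_used":  [1, 3, 5, 7, 6, 6, 4, 6],
})

# Right-closed bins: (0, 20], (20, 40], (40, 60], (60, inf); include_lowest keeps age 0.
bins = [0, 20, 40, 60, float("inf")]
labels = ["0-20", "21-40", "41-60", "61+"]
usage["age_group"] = pd.cut(usage["age"], bins=bins, labels=labels, include_lowest=True)

# Average calls, SMS and data usage per age group.
summary = usage.groupby("age_group", observed=False)[
    ["calls_made", "sms_sent", "data_used"]
].mean()
print(summary)
```

A table of this shape is what a grouped bar chart per age group would typically be drawn from.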
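When reading the Dataset2 results in Section IV, it may help to compare them against a trivial baseline. The sketch below computes the metrics of a classifier that always predicts class 0 on an imbalanced label set. The 86/14 class split is an assumption chosen for illustration, since the paper does not report the class distribution of Dataset2, and the function name `always_zero_report` is hypothetical.

```python
def always_zero_report(y_true):
    """Metrics for a baseline that predicts class 0 (non-churn) for every customer."""
    n = len(y_true)
    n_zero = sum(1 for y in y_true if y == 0)
    # Every prediction is 0, so class-0 precision equals the share of true 0s.
    precision_0 = n_zero / n
    # Every true 0 is predicted 0, so class-0 recall is 1 when any 0 exists.
    recall_0 = 1.0 if n_zero else 0.0
    denom = precision_0 + recall_0
    f1_0 = 2 * precision_0 * recall_0 / denom if denom else 0.0
    return {
        "accuracy": n_zero / n,
        "precision_0": precision_0,
        "recall_0": recall_0,
        "f1_0": f1_0,
        "recall_1": 0.0,  # no churner is ever flagged
    }


# Hypothetical label set with an assumed 86% / 14% split.
labels = [0] * 86 + [1] * 14
report = always_zero_report(labels)
print({name: round(value, 2) for name, value in report.items()})
```

Under this assumed split the baseline reaches 86% accuracy with 100% recall and a 92% F1 score for class 0 and 0% recall for class 1. Results of that shape are therefore consistent with a model that rarely or never predicts churn, which is why accuracy alone can be misleading on imbalanced data and the class 1 metrics deserve separate attention.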