0% found this document useful (0 votes)
80 views6 pages

Customer Churn Prediction in The Telecom Sector

Uploaded by

snehadhake1
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
80 views6 pages

Customer Churn Prediction in The Telecom Sector

Uploaded by

snehadhake1
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 6

2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

CUSTOMER CHURN PREDICTION IN THE


TELECOM SECTOR
2024 3rd International Conference on Artificial Intelligence For Internet of Things (AIIoT) | 979-8-3503-7212-0/24/$31.00 ©2024 IEEE | DOI: 10.1109/AIIoT58432.2024.10574660

Pallav Aggarwal Vaidehi Vijayakumar


SCOPE SCOPE
Vellore Institute of Technology, Vellore Institute of Technology,
Chennai, India Chennai, India
[email protected] [email protected]

Abstract — Customer churn, the phenomenon of customers In recent years, customer churn prediction has gained
terminating their subscription or services with a telecom significant attention in the telecom sector due to
provider, poses a significant challenge in the telecom industry. advancements in data collection, storage, and processing
Predicting customer churn is crucial for companies to retain technologies. This has enabled telecom companies to
customers and minimize revenue loss. This paper presents a leverage large-scale datasets and apply sophisticated
comprehensive approach to churn prediction using various machine learning algorithms to develop effective customer
machine learning(ML) and deep learning(DL) techniques, churn prediction models.
including Logistic Regression, Support Vector Machine
(SVM), Random Forest, K Nearest Neighbours (KNN), Traditionally, telecom companies have relied on manual
Gradient Boosted Tree, and Multi-Layer Perceptron (MLP) analysis and basic statistical methods to identify potential
Classifier. However, existing methods often suffer from churners. However, these approaches often lack accuracy
limitations such as suboptimal accuracy and inability to and fail to capture the underlying patterns and complexities
capture complex patterns in the data. To address these
present in the data. In response to these limitations, machine
drawbacks, this paper proposes a novel model called Hybrid
Churn Prediction (HCP). The HCP model combines the learning (ML) and deep learning (DL) techniques have
strengths of different algorithms by integrating the predictions surfaced as potent instruments for forecasting churn..
of multiple base models, resulting in improved accuracy and
robustness in churn prediction. In this study, evaluation of This paper aims to explore the effectiveness of various
performance of each individual model and comparison with machine learning[4][5] and deep learning methods in
the proposed HCP model is done using two distinct datasets predicting customer churn in telecom industry. Specifically,
from the telecom industry. The experimental results this study investigate the performance of algorithms such as
demonstrate that HCP model consistently outperforms existing Logistic Regression, ,Random Forest, K Nearest Neighbours
methods, achieving higher accuracy in predicting customer (KNN), Support Vector Machine (SVM), Gradient Boosted
churn. Tree, and Multi-Layer Perceptron (MLP) Classifier.
Additionally, this study proposes a novel approach called
Keywords- Churn Prediction, Data mining, Customer Hybrid Churn Prediction (HCP), which combines the
Retention, Gradient Boost, Logistic Regression, SVM, MLP, strengths of multiple algorithms to enhance predictive
KNN, Random forest, HCP
accuracy. By analyzing the shortcomings of existing methods
and comparing them with our proposed HCP model, we seek
I. INTRODUCTION to demonstrate the potential of hybrid approaches in
improving churn prediction accuracy. Ultimately, the
The telecommunications sector is fiercely competitive, findings of this study aim to provide valuable insights for
with multiple service providers competing for the attention telecom companies seeking to mitigate customer churn and
of customers.. Retaining existing customers is crucial for enhance customer retention strategies.
telecom companies as acquiring new customers can be
expensive[1]. Customer churn, defined as the frequency at This paper is organized as follows: Section 2 presents
which customers cease utilizing a company's services, poses literature survey, Section 3 presents the churn prediction
a notable challenge within the telecommunications industry.. methodology, Section 4 presents the implementation details
It not only results in revenue loss[2] but also affects and section 5 presents the results and discussions.
customer satisfaction and loyalty.
To mitigate customer churn, telecom companies can II. LITERATURE REVIEW
benefit from predictive analytics techniques, specifically According to A. Gaur and R. Dubey[6], customer churn
customer churn prediction models. These models leverage in the telecom industry is a significant problem that affects
machine learning algorithms[3] to analyze historical data and their revenue. Customers leaving a service or subscription
discern patterns and factors influencing customer churn. By causes these businesses to spend more money. Companies
accurately predicting which customers are likely to churn, have discovered that getting new clients is approximately six
telecom companies can implement proactive measures to times more expensive than keeping the ones they already
retain those customers, including providing personalized have[7]. In order to retain customers, businesses use
promotions, improving customer service, or addressing predictive analysis of consumer behaviour to assist them add
specific pain points. new features and products as well as to address service-

979-8-3503-7212-0/24/$31.00 ©2024 IEEE


Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

related problems. Their study offers summary of the most algorithms may exhibit biases towards the majority class,
recent research in area of Customer Churn prediction. Their leading to poor predictive performance for the minority class.
goal is to offer a clear roadmap that will make it easier to
design unique churn prediction systems in the future Limited Interpretability: While some traditional machine
learning models like logistic regression offer interpretable
According to [8], customer churn is defined as the act of results, others such as random forests or gradient boosting
a customer discontinuing use of an organization's goods or are considered black-box models. These models provide
services. The rising charges, poor assistance, low level of limited insight into the underlying factors driving churn,
knowledge of the service plan, high subscription rate, and making it challenging for stakeholders to interpret and trust
service quality, among other factors, are the causes of this the model predictions.
unhappiness. To keep existing customers and lower the
customer churn rate in advance of events, businesses should
be able to forecast customer behaviour precisely. The article
delves into a comprehensive review analysis of numerous
churn prediction models utilized across various industries.
These models employ a range of machine-learning
approaches, deep-learning algorithms, metaheuristic
optimization strategies, feature extraction-based methods,
and hybrid approaches. Additionally, the article surveys
commonly employed machine-learning techniques on a
cloud computing platform to discern patterns of customer
turnover.. The churn prediction model, with greater accuracy
findings, makes it easier to identify businesses that are about
to churn and to focus on reducing the total churn percentage,
forming retention strategies, and increasing the company's
revenue.
According to [9], in the telecommunications sector,
customer churn is a severe issue that happens more
frequently. The cost of retaining current consumers is
substantially lower than the cost of acquiring new ones[10],
and according to the literature, the cost of acquiring new
customers must be five times higher. One of the most
Fig. 1. Taxonomy of Churn Prediction
significant challenges in predicting customer attrition is
dealing with unbalanced data, and various solutions have
been explored. Their study employs machine learning The taxonomy diagram shown in Fig.1, provides a
methods such as Decision Tree, Support Vector Machine, structured framework for understanding the diverse range of
Multi-Layer Perceptron, Random Forest, and Gradient churn prediction methods, spanning from machine learning
Boosting. Techniques including Synthetic Minority Over- to deep learning techniques. By categorizing these methods
sampling Technique (SMOTE), random over-sampling, and into distinct branches, it offers researchers and practitioners
random under-sampling were utilized to balance the data. In insights into the different methodologies available for
terms of the Area Under Curve (AUC) index of the receiver predicting customer churn in various domains.
operating characteristic (ROC) curve, both over-sampling
and under-sampling approaches yielded comparable and Though there are several methods, there is a need for better
appropriate results. However, under-sampling exhibited churn prediction method. This paper proposes a new Hybrid
superior specificity and over-sampling demonstrated higher Churn Prediction method and discusses about the existing
sensitivity. Additionally, compared to other algorithms, the machine learning and deep learning approaches.
random forest and gradient boosting techniques performed
better. III. CHURN PREDICTION METHODOLOGY
A. EXISTING SYSTEM This paper uses various machine learning and deep learning
The existing system typically involves traditional machine techniques to predict customer churn in order to build
learning techniques and algorithms applied to customer classification models such as Random Forest, SVM [12],
churn prediction in the telecom sector. These techniques may Logistic Regression [13], and Gradient boosted tree [14],
include logistic regression, gradient boosting, random KNN, MLP. Each machine learning algorithm has its own
forests, decision trees, support vector machines (SVM) and drawbacks,so this paper proposes a new algorithm “Hybrid
others. Data preprocessing steps such as data cleaning, Churn Prediction(HCP)” which combines the strengths of
feature engineering, and encoding categorical variables are Logistic Regression, Random Forest, Gradient Boost and
performed before training the models. The models are then SVM. This paper also compares the performance of all these
trained on historical data containing customer attributes, models on two different datasets.
usage patterns, and churn labels. Dataset1 Details: The data set includes information about:
Drawbacks: Imbalanced Data: Customer churn datasets in customerID: Unique identifier for each customer.
the telecom sector are often imbalanced [11], with a
significantly higher number of non-churn instances gender: Gender of the customer.
compared to churn instances. Traditional machine learning SeniorCitizen: If a customer is a senior citizen or not.

Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

A. PROBLEM STATEMENT
Partner: If customer has a partner. The phenomena of customers quitting or terminating
their subscription or services with a telecom provider is
Dependents: If customer has dependents. known as "customer churn" in telecom industry. This
Tenure: Number of months a customer has remained with the phenomenon can lead to huge revenue loss and customer
company. unhappiness. The following can be used to build the issue
statement for the telecom industry's customer churn
PhoneService: If customer has phone service. prediction:
MultipleLines: If customer has multiple lines. The challenge is to create reliable and understandable
InternetService: Type of internet service subscribed by churn prediction algorithm that can use past data to estimate
customer. potential churners, offer insights into the primary causes of
churn, and direct proactive retention tactics. To enable
OnlineSecurity: Whether customer opt for online security or prompt responses, the models must be scalable, able to
not. handle large-scale datasets, and capable of early prediction.
OnlineBackup: Whether customer opt for online backup. With the ultimate goal of lowering customer turnover and
raising customer retention rates in the telecom industry, the
DeviceProtection:Whether customer has device protection. performance of the models should be assessed using relevant
TechSupport: Whether customer opt for tech support. measures to determine how well they anticipate attrition and
direct retention efforts.
StreamingTV: Whether customer opt for streaming TV.
B. HYBRID CHURN PREDICTION ARCHITECTURE
StreamingMovies: Whether customer opt for streaming .
movies.
Contract: Type of contract the customer has.
PaperlessBilling: Whether customer has paperless billing.
PaymentMethod: Payment method used by the customer.
MonthlyCharges: Monthly charges paid by customer.
TotalCharges: Total charges paid by customer.
Churn: Whether the customer churned or not.
Dataset2 Details: The data set includes information about:
customer_id: Unique identifier for each customer.
telecom_partner: Partner company of the telecom service
used by the customer.
gender: Gender of the customer (Categorical: 'M' for male,
'F' for female).
age: Age of the customer. Fig. 2. Architecture of Hybrid Churn Prediction

state: State where the customer resides. The detailed architecture of Hybrid churn prediction is shown
city: City where the customer resides. in Fig.2 with step by step process.

pincode: Postal code of the customer's location. Algorithm Used For HCP is given below:
date_of_registration: Date when the customer registered with 1. Data Preprocessing: It includes
the telecom service.
• Removing Duplicate Values
num_dependents: Number of dependents of the customer.
• Renaming Coulmn Names
estimated_salary: Estimated salary of the customer.
• Converting to numeric datatype
calls_made: Number of calls made by the customer.
• Checking Null Values.
sms_sent: Number of SMS messages sent by the customer.
2. Model Training with Algorithm: Random forests,
data_used: Amount of data used by customer. Logistic Regression, K nearest neighbor and Support
vector machine algorithms have been used to train
churn: Binary variable indicating whether customer churned
the model.
(1) or not (0).
3. Model Testing: Various metrics have been used to
test model such as:
• Precision = True Positives(TP) / (True Positives +
False Positives(FP))

Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

• Recall = True Positives(TP) / (True Positives + False def combine_predictions(predictions):


Negatives(FN))
• F1 Score = 2 * (Precision * Recall) / (Precision + • Implement a voting scheme to combine predictions
Recall) from multiple classifiers.
# Example: Use majority voting
final_predictions = []
C. PSEUDOCODE FOR ‘HYBRID CHURN PREDICTION(HCP)’
for i in range(len(predictions[0])):
Steps: votes = [pred[i] for pred in predictions]
• Define the HybridClassifier class: majority_vote = max(set(votes), key=votes.count)
class HybridClassifier: final_predictions.append(majority_vote)
return final_predictions
• Initialize the class with parameters for each base classifier.
def __init__(self, classifier_params): • Instantiate the HybridClassifier class with parameters
self.classifier_params = classifier_params for each base classifier and train it using training data.

• Define the fit method to train the hybrid classifier: IV. IMPLEMENTATION DETAILS
def fit(self, X_train, y_train): All the experiments were performed using an Intel(R)
Core(TM) i5-9300H CPU @ 2.40GHz and 8 GB of RAM
• Iterate over each base classifier and train them using running Windows. The work has been done in google colab
the provided training data. using python. The data given contains 7043 observations and
for classifier_name, classifier_param in 21 variables.
self.classifier_param.items(): The data is analyzed then it found out the top 4 variables
classifier = create_classifier(classifier_name, which has highest correlation with output column ‘churn’.
classifier_param)
classifier.fir(X_train,y_train)

• Define the predict method to make predictions using


the hybrid classifier:
def predict(self, X_test):

• Initialize an empty list to store predictions from


each base classifier.
Fig. 3. Correlated Variable
predictions = []
As shown in Fig. 3, most correlated variables with ‘churn’
• Iterate over each base classifier and make column are Monthly charges, Senior Citizen, Total Charges,
predictions using the test data. tenture.
for classifier_name, classifier_param in A. Logistic Regression(LR)
self.classifier_param.items():
classifier = create_classifier(classifier_name, For Dataset1: The test classification accuracy of the
classifier_param) Logistic Regression model was 78.7%. The model’s
precision was estimated to be 82% for 0 value and 63% for 1
classifier_predicitons= classifier.predict(X_test)
value. The recall to be 91% for 0 value and 44% for 1 value.
predictions.append(classifier_predictions)
The f1 score [15] was estimated to be 86% for 0 value and
52% for 1 value.
• Combine the predictions using a voting scheme (e.g.,
majority voting) to determine the final prediction. For Dataset2: The test classification accuracy of the
final_predictions = combine_predictions(predictions) Logistic Regression model was 85%. The model’s precision
return final_predictions was estimated to be 86% for 0 value and 0% for 1 value. The
recall to be 99% for 0 value and 0% for 1 value. The f1 score
was estimated to be 92% for 0 value and 0% for 1 value.
• Define utility functions to create individual classifiers
and combine their predictions as needed. B. Support Vector Machine(SVM)
def create_classifier(classifier_name, classifier_param):
SVM identifies the hyperplane that best separates the classes
In binary classification. The equation of the hyperplane in
• Create and return an instance of the specified classifier SVM is given by:
with the provided parameters.
if classifier_name == 'LogisticRegression':
return LogisticRegression(**classifier_param)
elif classifier_name == 'RandomForest':
For Dataset1: The test classification accuracy of SVM
return RandomForestClassifier(**classifier_param)
model was 78.8%. The model’s precision was estimated to
elif classifier_name == 'SVM':
be 81% for 0 value and 68% for 1 value. The recall to be
return SVC(**classifier_param)
# Add more classifiers as needed

Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

94% for 0 value and 36% for 1 value. The f1 score was G. Hybrid Churn Prediction(HCP)
estimated to be 87% for 0 value and 47% for 1 value.
For Dataset1: The test classification accuracy of the HCP
For Dataset2: The test classification accuracy of the SVM Classifier model was 79.6%. The model’s precision was
model was 86%. The model’s precision was estimated to be estimated to be 83% for 0 value and 68% for 1 value. The
86% for 0 value and 0% for 1 value. The recall to be 100% recall to be 91% for 0 value and 48% for 1 value. The f1
for 0 value and 0% for 1 value. The f1 score was estimated to score was estimated to be 87% for 0 value and 56% for 1
be 92% for 0 value and 0% for 1 value. value.

C. Random Forest(RF) For Dataset2: The test classification accuracy of the HCP
Classifier model was 86%. The model’s precision was
For Dataset1: The test classification accuracy of the estimated to be 86% for 0 value and 0% for 1 value. The
Random Forest model was 76.6%. The model’s precision recall to be 100% for 0 value and 0% for 1 value. The f1
was estimated to be 81% for 0 value and 61% for 1 value. score was estimated to be 92% for 0 value and 0% for 1
The recall to be 89% for 0 value and 45% for 1 value. The f1 value.
score was estimated to be 85% for 0 value and 52% for 1
value. V. RESULTS AND DISCUSSIONS
For Dataset2: The test classification accuracy of the In this paper customers are categorized on the basis of their
Random Forest model was 86%. The model’s precision was age groups and requirements which is shown in the figure
estimated to be 86% for 0 value and 0% for 1 value. The below:
recall to be 100% for 0 value and 0% for 1 value. The f1
score was estimated to be 92% for 0 value and 0% for 1
value.
D. Gradient Boost
For Dataset1: The test classification accuracy of the
Gradient Boost model was 78.2%. The model’s precision
was estimated to be 80% for 0 value and 69% for 1 value.
The recall to be 92% for 0 value and 43% for 1 value. The f1
score was estimated to be 86% for 0 value and 53% for 1
value. Fig 4. Different age groups and requirements

For Dataset2: The test classification accuracy of the According to Fig. 4, derivation of summary for each
Gradient Boost model was 79%. The model’s precision was customer category and their requirements can be made:
estimated to be 86% for 0 value and 11% for 1 value. The
recall to be 91% for 0 value and 7% for 1 value. The f1 score Young customers in the age group of 0-20 tend to have lower
was estimated to be 88% for 0 value and 9% for 1 value. data usage, but they make a relatively higher number of calls
compared to other age groups. They also send fewer SMS
E. K Nearest Neighbours(KNN) messages on average.
For Dataset1: The test classification accuracy of the KNN Middle-aged customers between 21 and 40 years old exhibit
model was 77.3%. The model’s precision was estimated to higher data usage compared to younger customers. They also
be 83% for 0 value and 58% for 1 value. The recall to be send more SMS messages on average, while the number of
88% for 0 value and 47% for 1 value. The f1 score was calls made is slightly lower compared to the younger age
estimated to be 85% for 0 value and 52% for 1 value. group.
For Dataset2: The test classification accuracy of the KNN Older customers aged between 41 and 60 show similar data
model was 85%. The model’s precision was estimated to be usage patterns to middle-aged customers but tend to make
87% for 0 value and 33% for 1 value. The recall to be 98% slightly fewer calls on average. They also send a comparable
for 0 value and 7% for 1 value. The f1 score was estimated to number of SMS messages.
be 92% for 0 value and 12% for 1 value.
Senior customers aged 61 and above exhibit relatively
F. Multi-Layer Perceptron(MLP) consistent data usage compared to middle-aged and older
customers. They also maintain a moderate level of SMS
For Dataset1: The test classification accuracy of the MLP
usage and make a moderate number of calls.
model was 79.4%. The model’s precision was estimated to
be 83% for 0 value and 65% for 1 value. The recall to be Understanding these patterns can help telecom companies
91% for 0 value and 46% for 1 value. The f1 score was tailor their services and offerings to better meet the needs
estimated to be 87% for 0 value and 54% for 1 value. and preferences of different customer segments based on
their age.
For Dataset2: The test classification accuracy of the MLP
model was 82%. The model’s precision was estimated to be Telecom companies need to understand what causes
88% for 0 value and 30% for 1 value. The recall to be 92% customer churn to keep their current client from defecting.
for 0 value and 21% for 1 value. The f1 score was estimated Telecom data can reveal this type of information. This work
to be 90% for 0 value and 25% for 1 value. includes the training of six distinct machine learning models
and building a new ‘Hybrid Churn Prediction(HCP)’ model.

Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.
2024 3rd IEEE International Conference on Artificial Intelligence for Internet of Things (AIIoT 2024)

The models in question are SVM, Logistic Regression, REFERENCES


Random Forest, KNN, MLP, Gradient boosted tree and [1] M. V. Naik and S. S. Reddy, "An innovative optimized model to
Hybrid Churn Prediction. The best of these models is anticipate clients about immigration in telecom industry," 2017 3rd
Hybrid Churn Prediction. The least effective is Random International Conference on Applied and Theoretical Computing and
Communication Technology (iCATccT), Tumkur, India, 2017, pp.
Forest on dataset 1 and Gradient Boosting on dataset 2 232-236, doi: 10.1109/ICATCCT.2017.8389139.
according to the accuracy values of these seven models as [2] Hashmi, Nabgha, Naveed Anwer Butt, and Muddesar Iqbal.
shown in the table. "Customer churn prediction in telecommunication a decade review
and classification." International Journal of Computer Science Issues
(IJCSI) 10.5 2013 : 271.
Table 1. Accuracy Values Comparison [3] M. Singh, S. Singh, N. Seen, S. Kaushal and H. Kumar, "Comparison
of learning techniques for prediction of customer churn in
telecommunication," 2018 28th International Telecommunication
Networks and Applications Conference (ITNAC), Sydney, NSW,
Australia, 2018, pp. 1-5, doi: 10.1109/ATNAC.2018.8615326.
[4] J. Kunnen, M. Duchateau, Z. Van Veldhoven and J. Vanthienen,
"Benchmarking Stacking Against Other Heterogeneous Ensembles in
Telecom Churn Prediction," 2020 IEEE Symposium Series on
Computational Intelligence (SSCI), Canberra, ACT, Australia, 2020,
pp. 1234-1240, doi: 10.1109/SSCI47803.2020.9308188.
[5] M. B. A. Joolfoo, R. A. Jugumauth and K. M. B. A. Joolfoo, "A
Systematic Review of Algorithms applied for Telecom Churn
Prediction," 2020 3rd International Conference on Emerging Trends
in Electrical, Electronic and Communications Engineering
(ELECOM), Balaclava, Mauritius, 2020, pp. 136-140, doi:
10.1109/ELECOM49001.2020.9296999.
[6] A. Gaur and R. Dubey, "Predicting Customer Churn Prediction In
Telecom Sector Using Various Machine Learning Techniques," 2018
International Conference on Advanced Computation and
According the table 1, the build model HCP Classifier is Telecommunication (ICACAT), Bhopal, India, 2018, pp. 1-5, doi:
10.1109/ICACAT.2018.8933783.
best as compared to other models because it has highest
[7] Almana, Amal M., Mehmet Sabih Aksoy, and Rasheed Alzahrani. "A
accuracy for both datasets. survey on data mining techniques in customer churn analysis for
telecom industry." International Journal of Engineering Research and
VI CONCLUSION Applications 4.5 2014 pp: 165-171.
This paper presented the datasets of existing churn [8] I. Ullah, B. Raza, A. K. Malik, M. Imran, S. U. Islam and S. W. Kim,
"A Churn Prediction Model Using Random Forest: Analysis of
prediction methods and proposed the Hybrid Churn Machine Learning Techniques for Churn Prediction and Factor
Prediction Method. The proposed method is implemented in Identification in Telecom Sector," in IEEE Access, vol. 7, pp. 60134-
python on google colab. The performance of proposed HCP 60149, 2019, doi: 10.1109/ACCESS.2019.2914999.
method is compared with existing machine and deep [9] S. Shumaly, P. Neysaryan and Y. Guo, "Handling Class Imbalance in
learning algorithms and from experimentation it is found Customer Churn Prediction in Telecom Sector Using Sampling
Techniques, Bagging and Boosting Trees," 2020 10th International
that the proposed method gives the best accuracy for both Conference on Computer and Knowledge Engineering (ICCKE),
datasets. By aggregating predictions from multiple Mashhad, Iran, 2020, pp. 082-087, doi:
classifiers, including those designed to handle imbalanced 10.1109/ICCKE50421.2020.9303698.
datasets effectively(SVM and Random Forest), the HCP [10] M. Ali, A. U. Rehman and S. Hafeez, "Prediction of Churning
Behavior of Customers in Telecom Sector Using Supervised Learning
classifier can provide more balanced predictions and Techniques," 2018 IEEE 3rd International Conference on Computing,
mitigate biases towards the majority class. The HCP Communication and Security (ICCCS), Kathmandu, Nepal, 2018, pp.
classifier enhances interpretability by incorporating 143-147, doi: 10.1109/CCCS.2018.8586836.
transparent models, such as logistic regression, alongside [11] A. Hanif and N. Azhar, "Resolving Class Imbalance and Feature
more complex models like random forests or gradient Selection in Customer Churn Dataset," 2017 International Conference
on Frontiers of Information Technology (FIT), Islamabad, Pakistan,
boosting. 2017, pp. 82-86, doi: 10.1109/FIT.2017.00022.
[12] Babatunde, Ronke, et al. "Classification of customer churn prediction
VII FUTURE WORK model for telecommunication industry using analysis of variance."
Machine Learning approaches are generally trying to IAES International Journal of Artificial Intelligence 12.3 2023: 1323-
0.
understand the customer behavior and give important
recommendations for keeping the customers with the [13] Madan, Mamta, Meenu Dave, and Vani Kapoor Nijhawan. "A Review
on: Data mining for telecom customer churn management."
telecom companies but what is really needed is there are International Journal of Advanced Research in Computer Science and
overlapping customer requirement. Young customers Software Engineering 5.9 2015.
(suppose a student) may have a startup and old customers [14] Kavitha, V., et al. "Churn prediction of customer in telecom industry
may have international connections with their relatives so using machine learning algorithms." International Journal of
Engineering Research & Technology (2278-0181) 9.05 2020: 181-
they need to have video calls which causes data usage. If 184.
this type of overlapping customers are there then obviously [15] Amin, Adnan, et al. "Customer churn prediction in telecommunication
the churn prediction model can be improved based upon industry using data certainty." Journal of Business Research 94 2019:
their requirements. pp - 290-301.

Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 22,2024 at 16:45:35 UTC from IEEE Xplore. Restrictions apply.

You might also like