Enhancing Customer Retention Strategies: Predicting Churn Rate in Telecom Sectors Using Machine Learning Ensemble Techniques
Authorized licensed use limited to: K K Wagh Inst of Engg Education and Research. Downloaded on August 23,2024 at 13:53:17 UTC from IEEE Xplore. Restrictions apply.
979-8-3503-5306-8/24/$31.00 ©2024 IEEE 377
crucial, businesses invest strategically for efficient growth.

Churn events arise when the customer experience falls below a critical threshold. Existing methods like satisfaction surveys prove inadequate in capturing impulsive customer decisions. Proactive measures are essential to identify nuanced dissatisfaction factors.

The challenge of customer churn requires advanced data analysis and predictive methodologies. Logistic regression, outperforming other methods, provides valuable insights into churn factors. Yet, there is a gap in exploring hybrid models for a more nuanced understanding, suggesting future research opportunities [15].

III. METHODOLOGY

The research employs supervised learning algorithms to predict customer churn, utilizing eight algorithms: KNN, Random Forest, XGBoost, AdaBoost, Decision Tree, Gradient Boosting, Extra-Trees, and LightGBM.
• The KNN (K-Nearest Neighbors) algorithm is a machine learning method that makes classifications or predictions by examining the majority class of a sample's nearest neighbours in the feature space. The algorithm is simple and versatile [6].
• Random Forest, an ensemble algorithm, prevents overfitting and efficiently handles highly dimensional data, demonstrating lower classification errors [7].
• XGBoost, recognized for speed and accuracy, outperforms other gradient boosting implementations, excelling in classification and regression tasks [8].
• AdaBoost improves the performance of other algorithms by focusing on misclassified data, demonstrating effectiveness in applications like fingerprint classification [9].
• Decision Tree classifiers, popular for representing classifiers, find applications in diverse fields due to their interpretability and ease of understanding [10].
• Gradient Boosting, a technique for regression and classification, constructs a predictive model through an ensemble of weak prediction models [11].
• Extra-Trees, an extension of Random Forest, enhances model robustness by introducing high randomization during tree-building. It considers multiple random splits for each feature, guarding against overfitting and improving generalization capabilities compared to traditional Random Forests [12].
• LightGBM, an efficient gradient boosting framework, excels in tasks like classification and regression, optimized for distributed and efficient training [13].

Fig. 1. Process Flow Diagram

A. Dataset
The telecom customer dataset, sourced from Kaggle and originally uploaded by user Blastchar five years ago, provides valuable insights into a California-based telecom company in the USA. The dataset encompasses a comprehensive data dictionary. With a total of 21 columns, including 19 features and 1 target column (Churn), there are 7043 rows, each representing a distinct customer. The dataset is a balanced combination of categorical and numerical features, making it versatile for in-depth analysis. The accompanying documentation highlights the dataset's importance in predicting customer behaviour and developing targeted customer retention programs. This dataset is a reliable resource for understanding the dynamics of telecommunications customer relationships and provides valuable information for both research analysis and predictive modelling.

B. Loading Dataset
The telecom customer prediction dataset was loaded into Python by specifying the path of the file stored on the local machine.

C. Importing Required Libraries
The next step involved importing the necessary libraries to simplify tasks and streamline the code. Prewritten code in Python libraries helps solve common problems and makes data preprocessing steps more efficient. The imported libraries, as shown in Figure 2, facilitate access to the code that solves the problem.
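As a concrete illustration of the loading and preprocessing steps described above, the sketch below works on an inlined miniature stand-in for the Kaggle CSV (the column subset and values are invented for illustration, not taken from the paper); in practice the same calls would run on the full file via pd.read_csv with its local path:

```python
import io
import pandas as pd

# Miniature stand-in for the Telco churn CSV; in practice this would be
# pd.read_csv("<path-to-downloaded-file>.csv") on the local machine.
csv_text = """customerID,tenure,Contract,MonthlyCharges,TotalCharges,Churn
0001-A,1,Month-to-month,29.85,29.85,No
0002-B,34,One year,56.95,1889.5,Yes
0003-C,2,Month-to-month,53.85, ,No
"""

df = pd.read_csv(io.StringIO(csv_text))

# TotalCharges arrives as text because of blank entries; coerce to numeric
# (blanks become NaN and can be imputed or dropped later).
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")

# Encode the target column (Churn) as 0/1 for the classifiers.
df["Churn"] = (df["Churn"] == "Yes").astype(int)

print(df.shape)  # (3, 6)
```

The same pattern scales directly to the full 7043-row, 21-column dataset.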
The literature review highlights the significance of
customer churn prediction in the telecom sector. Existing
studies explore various machine learning methods,
emphasizing the superiority of boosted versions and
ensemble techniques, particularly Random Forest.
Proposed models leverage Random Forest, achieving an
88.63% accuracy in classifying instances and identifying
specific churn factors for targeted strategies. However,
research gaps suggest further exploration of clustering
techniques for a nuanced understanding of customer
dynamics and enhanced predictive strategies [14].
• F1-Score:
A complete metric for evaluating binary classification that strikes a balance between recall and precision is the F1-Score. It gives an overview of the prediction performance of a model, especially for datasets that are both balanced and imbalanced. The F1-Score is computed using the following formula:
F1 = 2 × (Precision × Recall) / (Precision + Recall)

• Precision:
Precision measures the model's ability to correctly predict a given class of cases, focusing on the percentage of true positives. It measures the accuracy of the model's positive predictions, and the precision formula is:
Precision = TP / (TP + FP)

Fig. 4. Precision Formula

• Recall:
Recall calculates the percentage of accurately predicted positive cases out of all potential positive predictions. It is sometimes referred to as sensitivity or true positive rate. Out of all true positive samples, it determines the proportion of accurately recognized positive samples, and the recall formula is:
Recall = TP / (TP + FN)

Fig. 7. Bar Graph (Churn-Tenure)

• Longer Contracts Linked to Higher Churn: The scatter plot displays the monthly charges distribution across different customer tenures (0-12, 12-24, 24-48, and over 48 months). Longer tenures exhibit a broader range and higher median monthly charges; for instance, the median for tenures over 48 months is $80, compared to $60 for 0-12 months. Potential reasons include higher usage levels in longer-tenured customers or being on older, costlier rate plans. Outliers indicate customers with significantly higher or lower charges, possibly due to excessive data use or promotional rates. The plot suggests a correlation between tenure and monthly charges, highlighting factors influencing customer billing variations.
Fig. 8. Scatter Plot (Charges-Tenure)
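The three metric definitions above reduce to a few lines of arithmetic. The confusion-matrix counts in the sketch below are invented for illustration and are not results from the paper:

```python
def precision(tp, fp):
    # Share of predicted positives that are truly positive.
    return tp / (tp + fp)

def recall(tp, fn):
    # Share of actual positives the model recovered (sensitivity).
    return tp / (tp + fn)

def f1_score(p, r):
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

# Illustrative confusion-matrix counts (not from the paper).
tp, fp, fn = 80, 20, 40
p = precision(tp, fp)   # 80/100 = 0.8
r = recall(tp, fn)      # 80/120 ≈ 0.667
print(round(f1_score(p, r), 3))  # 0.727
```

Because F1 is a harmonic mean, it is pulled toward the weaker of the two components, which is why it is favoured on imbalanced data such as churn labels.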
experiences. Considerations include data quality, cost-benefit analysis, and interpretability. AdaBoost leads with a precision of 0.74, recall of 0.70, F1-score of 0.71, and accuracy of 0.79, excelling in identifying at-risk customers. Gradient Boosting follows with balanced performance (precision 0.73, recall 0.68, F1-score 0.69, accuracy 0.77). Random Forest, LightGBM, and XGBoost offer competitive results across metrics. Decision Tree and Extra Trees prioritize interpretability, while KNN exhibits lower precision (0.66) and recall (0.63). AdaBoost is recommended for precision-focused applications, while other models offer balanced trade-offs between precision, recall, and accuracy, aligned with diverse business objectives. Regular monitoring ensures continued effectiveness in predicting customer churn.

telecommunications applications and provides valuable information for strategic decision-making.

Fig. 14. ROC-AUC Curve
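The model comparison behind these figures can be reproduced in outline with scikit-learn. The sketch below trains the six scikit-learn estimators from the study on a synthetic stand-in for the churn data (XGBoost and LightGBM live in separate packages and are omitted here); the accuracy and ROC-AUC values it prints come from the synthetic data, not from the paper:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 19 features and a ~73/27 class split, loosely
# mirroring the Telco dataset's shape and churn imbalance.
X, y = make_classification(n_samples=2000, n_features=19, n_informative=8,
                           weights=[0.73, 0.27], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=42)

models = {
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Extra Trees": ExtraTreesClassifier(random_state=42),
    "AdaBoost": AdaBoostClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    # ROC-AUC uses the positive-class probabilities, as in Fig. 14.
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:18s} accuracy={acc:.2f} roc_auc={auc:.2f}")
```

Stratifying the split preserves the churn ratio in both partitions, which keeps the reported metrics comparable across models.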
[3] Alzubi, J., Nayyar, A., & Kumar, A. (2018). Machine Learning from
Theory to Algorithms: An Overview. Journal of Physics: Conference
Series, 1142, 012012. https://doi.org/10.1088/1742-6596/1142/1/012012.
[4] Bin, L., Peiji, S., & Juan, L. (2007). Customer Churn Prediction Based
on the Decision Tree in Personal Handyphone System Service. 2007
International Conference on Service Systems and Service Management.
https://doi.org/10.1109/icsssm.2007.4280145.
[5] Özdemir, O., Batar, M., & Işık, A. H. (2020). Churn Analysis with
Machine Learning Classification Algorithms in Python. Artificial
Intelligence and Applied Mathematics in Engineering Problems,
844–852. https://doi.org/10.1007/978-3-030-36178-5_73.
[6] Taunk, K., De, S., Verma, S., & Swetapadma, A. (2019). A Brief
Review of Nearest Neighbor Algorithm for Learning and Classification.
2019 International Conference on Intelligent Computing and Control
Systems (ICCS). https://doi.org/10.1109/iccs45141.2019.9065747.
[7] Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5-32.
https://doi.org/10.1023/a:1010933404324.
[8] Peng, Z., Huang, Q., & Han, Y. (2019). Model Research on Forecast of
Second-Hand House Price in Chengdu Based on XGboost Algorithm. 2019
IEEE 11th International Conference on Advanced Infocomm Technology
(ICAIT). https://doi.org/10.1109/icait.2019.8935894.
[9] Zhang, Y., Ni, M., Zhang, C., Liang, S., Fang, S., Li, R., & Tan, Z.
(2019). Research and Application of AdaBoost Algorithm Based on SVM.
2019 IEEE 8th Joint International Information Technology and Artificial
Intelligence Conference (ITAIC).
https://doi.org/10.1109/itaic.2019.8785556.
[10] B. Charbuty and A. Abdulazeez, "Classification Based on Decision Tree Algorithm for Machine Learning", JASTT, vol. 2, no. 01, pp. 20-28, Mar. 2021. https://doi.org/10.38094/jastt20165.
[11] O. González-Recio, J.A. Jiménez-Montero, R. Alenda, The gradient
boosting algorithm and random boosting for genome-assisted evaluation in
large data sets, Journal of Dairy Science, Volume 96, Issue 1, 2013, Pages
614-624, ISSN 0022-0302, https://doi.org/10.3168/jds.2012-5630.
[12] Geurts, P., Ernst, D. & Wehenkel, L. Extremely randomized trees.
Mach Learn 63, 3–42 (2006). https://doi.org/10.1007/s10994-006-6226-1.
[13] Y. Deng, D. Li, L. Yang, J. Tang and J. Zhao, "Analysis and
prediction of bank user churn based on the ensemble learning algorithm,"
2021 IEEE International Conference on Power Electronics, Computer
Applications (ICPECA), Shenyang, China, 2021, pp. 288-291, doi:
10.1109/ICPECA51329.2021.9362520.
[14] I. Ullah, B. Raza, A. K. Malik, M. Imran, S. U. Islam and S. W. Kim,
"A Churn Prediction Model Using Random Forest: Analysis of Machine
Learning Techniques for Churn Prediction and Factor Identification in
Telecom Sector," in IEEE Access, vol. 7, pp. 60134-60149, 2019, doi:
10.1109/ACCESS.2019.2914999.
[15] Proceedings of the 3rd International Conference on Vision, Image and Signal Processing (ICVISP 2019), August 2019, Article No. 34, pp. 1-7. https://doi.org/10.1145/3387168.3387219.