Customer Churn Prediction On E-Commerce Using Machine Learning
https://doi.org/10.22214/ijraset.2023.50479
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue IV Apr 2023- Available at www.ijraset.com
Abstract: Customer churn prediction is pivotal for e-commerce businesses that want to build successful marketing plans and
customer retention strategies. To handle the longitudinal timeframes and multiple data variables of B2C e-commerce
consumers' buying habits, the authors of this study present a churn prediction model that integrates k-means customer
segmentation with support vector machine (SVM) prediction. The approach divides customers into three groups and
identifies the main customer groupings. To predict customer churn, the study compares the effectiveness of
logistic regression and SVM prediction. The findings show that customer segmentation greatly improves predictive
performance on every indicator, underscoring the value of k-means clustering segmentation. SVM prediction is also shown to
be more accurate than logistic regression prediction. The conclusions of this study have important implications
for customer relationship management.
Keywords: Churn prediction, Machine learning techniques, Boosting algorithm.
I. INTRODUCTION
Customers are a valuable asset for any business as they play a vital role in enhancing market competitiveness and performance. In
today's fiercely competitive market, customers have a plethora of products and service providers to choose from. Research shows
that the cost of acquiring a new customer is often higher than retaining an existing one. By maintaining a strong and long-lasting
relationship with customers, a business can derive more profits from its existing customers. A mere 5% increase in customer
retention can lead to a 25-95% increase in the net present value of the business. Similarly, reducing the customer churn rate by 5%
can result in a 25-85% increase in the average profit margin of the enterprise. Therefore, it has become crucial for businesses to
leverage their existing customer resources and prevent customer loss to maintain their market advantage. One effective approach to
achieving this is through customer churn prediction techniques that can help identify customers who are at risk of leaving, enabling
the business to take proactive measures to retain them. This is particularly important in the highly competitive telecommunications
market, where companies must analyse customer behaviour to identify churn risks and take appropriate steps to retain customers.
This involves examining customers' calling behaviour, their interactions with the operator, package subscriptions, account
information, calling details, and demographic characteristics. In e-commerce, the significance of churn prediction and analysis lies
in its ability to help companies anticipate and identify clients who may be at risk of leaving, allowing them to take necessary
measures to reduce or prevent customer churn and minimize potential losses.
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 1774
B. Software Requirements
1) Operating System: Windows 8/10/11.
2) Programming Language: Python, JSON.
3) Development IDE: Visual Studio Code, version 1.75.
4) Other Software: Google Colab, Jupyter Notebook.
2) Step 2: Having imported the required libraries, we can now proceed to reading the data.
3) Step 3: This is the loading phase, in which the data file is uploaded.
5) Step 5: Having read the data, we find that it contains 5630 observations, with missing values in some of the features. We
remove the irrelevant CustomerID column before proceeding further. Moving on to outlier handling, we then check whether
any of the feature columns contain outliers.
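The CustomerID drop and outlier check in Step 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the rows are made up, every column name other than CustomerID is hypothetical, and the 1.5 x IQR rule is an assumed outlier criterion that the paper does not specify.

```python
from statistics import quantiles

# Hypothetical rows standing in for the real dataset; only CustomerID is a
# column name taken from the text, the other columns are made up.
rows = [
    {"CustomerID": 50001, "Tenure": 4, "OrderCount": 1},
    {"CustomerID": 50002, "Tenure": 9, "OrderCount": 2},
    {"CustomerID": 50003, "Tenure": 0, "OrderCount": 1},
    {"CustomerID": 50004, "Tenure": 13, "OrderCount": 3},
    {"CustomerID": 50005, "Tenure": 61, "OrderCount": 2},
    {"CustomerID": 50006, "Tenure": 7, "OrderCount": 16},
]

# Drop the identifier column: it carries no predictive information.
features = [{k: v for k, v in row.items() if k != "CustomerID"} for row in rows]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Apply the rule column by column.
outliers = {
    column: iqr_outliers([row[column] for row in features])
    for column in features[0]
}
print(outliers)  # {'Tenure': [61], 'OrderCount': [16]}
```

Whether flagged values should be capped, removed, or kept depends on the feature; the sketch only identifies them.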
6) Step 6: We will now create visualizations for each variable in the dataset and their corresponding churn value. This will help us
understand the relationship between each variable and churn. After visualizing the data, we will pre-process it by handling
missing values, encoding categorical variables, and scaling the numerical features. Once the data is pre-processed, we will split
it into training and testing sets and then train our models. We will train four base learners - Decision Trees, Random Forests,
Support Vector Machines, and KNN classifiers. The outputs of these models will be fed into the Stacking Classifier's meta-
classifier using logistic regression. Finally, we will evaluate the performance of our models using various metrics such as
accuracy, precision, and recall.
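The stacking arrangement described in Step 6 can be sketched with scikit-learn as follows. This is an illustrative sketch rather than the authors' implementation: the data are synthetic (generated with make_classification as a stand-in for the churn dataset), and the hyperparameters, the 75/25 split, and the scaling pipelines for the SVM and KNN learners are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (an assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Four base learners; SVM and KNN are distance-based, so they get scaled inputs.
base_learners = [
    ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
]

# Logistic regression acts as the meta-classifier on the base learners'
# cross-validated outputs.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
}
print(metrics)
```

On imbalanced churn data, precision and recall on the churn class are usually more informative than accuracy alone, which is why all three are reported.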
7) Step 7: To gain further insight, we can calculate the percentage of churn contributed by each category of each variable. This
information can be presented in the form of pie charts, showing the average customer churn for each category.
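The per-category churn figures described in Step 7 can be computed as in the sketch below. The records, the column name PreferredPaymentMode, and its categories are hypothetical and serve only to illustrate the calculation; the "share_of_churn_pct" values are the ones a pie chart would display.

```python
from collections import defaultdict

# Hypothetical records; the column name and categories are illustrative only.
records = [
    {"PreferredPaymentMode": "Credit Card", "Churn": 1},
    {"PreferredPaymentMode": "Credit Card", "Churn": 0},
    {"PreferredPaymentMode": "Credit Card", "Churn": 0},
    {"PreferredPaymentMode": "Credit Card", "Churn": 0},
    {"PreferredPaymentMode": "Debit Card", "Churn": 1},
    {"PreferredPaymentMode": "Debit Card", "Churn": 1},
    {"PreferredPaymentMode": "Debit Card", "Churn": 0},
    {"PreferredPaymentMode": "Debit Card", "Churn": 0},
    {"PreferredPaymentMode": "UPI", "Churn": 1},
    {"PreferredPaymentMode": "UPI", "Churn": 0},
]


def churn_by_category(rows, column, target="Churn"):
    """Per category: churn rate within it and its share of all churners (in %)."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        churned[row[column]] += row[target]
    all_churned = sum(churned.values())
    return {
        category: {
            "churn_rate_pct": 100 * churned[category] / totals[category],
            "share_of_churn_pct": (
                100 * churned[category] / all_churned if all_churned else 0.0
            ),
        }
        for category in totals
    }


summary = churn_by_category(records, "PreferredPaymentMode")
for category, stats in summary.items():
    print(category, stats)
```

The two figures answer different questions: the churn rate shows how risky a category is, while the share of churn shows how much of the total loss it accounts for.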
IX. CONCLUSION
The ability to predict customer churn is crucial for e-commerce companies to remain competitive. Employing machine learning
techniques in customer relationship management can aid companies in forecasting potential customer loss and devising effective
marketing and retention strategies. This study aimed to evaluate the predictive ability of SVM and LR models using customer
behaviour data from a B2C e-commerce enterprise. The k-means algorithm was employed for clustering subdivision to classify
customers into three categories, and predictions were made for each category. The performance of the models was evaluated using
accuracy, recall, precision, and AUC metrics.
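For reference, the four evaluation metrics named above can be computed from labels and model scores as in the following sketch. The labels, scores, and the 0.5 decision threshold are made-up illustrative values, not results from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = churned)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def auc(y_true, scores):
    """AUC as the chance that a random churner outscores a random non-churner.

    Ties count as half a win. Requires at least one example of each class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

results = classification_metrics(y_true, y_pred)
results["auc"] = auc(y_true, scores)
print(results)
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75, 'auc': 0.875}
```

Unlike the other three metrics, AUC is computed from the raw scores and does not depend on the choice of threshold.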
The study had two primary objectives. Firstly, to assess the efficacy of customer segmentation and the predictive power of the
model before and after segmentation based on customer shopping behaviour. The results indicated a substantial improvement in
prediction accuracy after implementing k-means clustering segmentation. Secondly, to compare the performance of traditional
statistical LR model prediction with machine learning-based SVM model prediction. The SVM model outperformed the LR model
in terms of accuracy.
In conclusion, the research findings offer valuable insights for B2C e-commerce companies' customer relationship management
efforts.