E-Commerce Customer Churn Prevention Using Machine Learning-Based
Measurement: Sensors
journal homepage: www.sciencedirect.com/journal/measurement-sensors
A R T I C L E I N F O

Keywords:
E-commerce customer churn
Hybrid algorithm
Personalized retention
Support vector machine

A B S T R A C T

Businesses in the E-Commerce sector, especially those in the business-to-consumer segment, are engaged in fierce competition for survival, trying to gain access to their rivals’ client bases while keeping current customers from defecting. The cost of acquiring new customers is rising as more competitors join the market with significant upfront expenditures and cutting-edge penetration strategies, making client retention essential for these organizations. The best course of action in this circumstance is to detect prospective churning customers and prevent churn with temporary retention measures. It is also essential to understand why a customer decided to leave in order to apply customized win-back strategies. Each customer’s information, including searches made, purchases made, frequency of purchases, reviews left, feedback given, and other data, is kept on file by the e-commerce company. Data mining and machine learning can help in examining this enormous quantity of data, analysing customer behaviour, and spotting potential attrition. The support vector machine is a popular supervised learning method in machine learning applications. Predictive analysis uses the hybrid classification approach to address regression and classification problems. This paper presents a process for forecasting E-Commerce customer attrition based on support vector machines, along with a hybrid recommendation strategy for targeted retention initiatives. Future customer churn may be prevented by suggesting reasonable offers or services. The empirical findings demonstrate a considerable increase in the coverage ratio, hit ratio, lift degree, precision rate, and other metrics using the integrated forecasting model. To effectively identify separate groups of lost customers and create a customer churn retention strategy, the various lost customer types are categorized using the RFM principle.
* Corresponding author.
E-mail addresses: [email protected] (S. J), [email protected] (Ch. Gangadhar), [email protected] (R.K. Arora), [email protected]
(P.N. Renjith), [email protected] (J. Bamini), [email protected] (Y. Chincholkar).
https://doi.org/10.1016/j.measen.2023.100728
Received 16 December 2022; Received in revised form 14 February 2023; Accepted 2 March 2023
Available online 8 March 2023
2665-9174/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
S. J et al. Measurement: Sensors 27 (2023) 100728
E-Commerce situation can be in one of four states: new, active, inactive, or churned [4]. The industry’s success depends entirely on its capacity to keep customers engaged for an extended period. Because of the industry’s fierce rivalry, acquiring a new customer is relatively expensive. For a new customer, a firm’s financial breakeven (return on investment) is usually reached only after the customer completes a few transactions over time [5]. Active customers are the backbone of every E-Commerce business, and businesses should pay special attention to those who may become inactive and churn later. Statistical techniques and machine learning strategies [6] are examples of methodologies that can be used to anticipate prospective customer churn. Customer data collected by B2C E-Commerce companies is a great source of information because it allows different customers’ purchasing patterns to be analysed. Churn prediction estimates the likelihood that a customer will churn. It reduces the cost of acquiring new customers while also assisting in customer retention. It takes more marketing time and money to win a new customer than to keep an existing one [7]. Customers who are hesitant to make a purchase or are prepared to switch shopping sites due to financial concerns can be persuaded and retained. They can expect standards and variety in product offerings. Customers leaving for essential and unavoidable reasons are free to do so. Though we invest in involuntary churners, the result is solid. Target marketing assists in reaching and connecting with customers [8]. In a contract scenario, a customer relationship indicates that the firm and the client work together [9]. Both parties’ rights and obligations are clearly stated in the contract. After signing an agreement with the firm, the client must fulfil the obligations set out in the contract in order to enjoy the corresponding rights [10]. Extending discounts, adapting products to customers’ preferences, and sending out trigger emails are all ways to discourage voluntary churners. Focusing solely on voluntary churners will reduce the cost of providing advantages to new and existing churned consumers [11]. Predicting customer attrition with unstructured data produces poor outcomes and creates a problem of class imbalance. To prevent these issues, there must be a significant difference in the percentage of churn and non-churn in historical data. Customer data should be pre-processed, and feature selection should be used to identify the essential properties. This saves time and money when it comes to forecasting. Different classification and predictive models can be used to predict customer attrition. Efficient algorithms based on soft voting are evaluated on accuracy, which identifies the ensemble model as the best to follow in future works [12].

1.1. Related works

Researchers have taken a variety of approaches to predict churn. One of the first studies in this area was a conceptual model to investigate the effect of consumer loyalty and relationship quality on the customer retention process [13]. The generation of automated retention strategies, as described by Bolton et al. and Hadden et al., is a kind of recommendation service. There are various studies in the field of recommender systems, including one that analysed the abilities of three data mining models (neural networks, regression, and regression trees) in predicting customer churn [14]. A combination of the SMC model and a naïve Bayesian method has been used to forecast the performance of e-commerce website customers; the new model, the naïve Bayesian method, outperforms the SMC model. The customer activity level of an e-commerce site has also been studied using the non-contractual relationship and the SMC model. According to the data, the higher a customer’s level of activity, the lower the possibility of customer churn. Wu Hong uses the SMC model to estimate future customer value and incorporates it into customer attribute characteristics [15]. Scholars have developed unique churn prediction models and studies in a variety of industries, such as telecom, where a lot of work has been done. Credit cards, retail banking, mobile internet gambling, social gaming, and email marketing are some of the other areas studied. In previous work on B2C E-Commerce customer churn detection, customer churn prediction was obtained using a logistic regression approach [16].

The customer profile is one of the most obvious input criteria for recommendation generation, and previous recommender systems relied heavily on it, as well as on the user’s behavioural features. Further studies began to look into things like the trust factor, historical transaction patterns, social inputs from people in comparable demographic groups and/or with similar past transaction qualities, contextual information, and so on [17]. When social inputs are taken into account in making recommendations, the method is called collaborative. This method is usually used in conjunction with a demography-based method [18]. The similarity of customer profiles, which is computed via computational methodologies, is a significant parameter to consider in such models. Customer retention and loyalty programme memberships were explored as a result of service experiences. Neslin and colleagues investigated, in a descriptive study, the methodological components that determine the accuracy of customer churn prediction models. Jamal et al. created a model to examine the relationship between influencing elements such as failure recovery, customer attrition, and customer service experience [19]. The study of customer behaviour in non-contractual relationships focused mostly on buying behaviour. The SMC model for predicting consumer transactions was proposed by Schmittlein, Morrison and Colombo in 1987 [20]. The model used mathematical calculations to determine the customer’s activity and then used that information to predict the customer’s behaviour in the future [21].

2. Methodology

Supervised learning is a type of machine learning that uses labelled training datasets. These datasets are used to train algorithms such as the support vector machine, allowing them to reliably classify data and predict outcomes. Using labelled inputs and outputs, the model can be tested for accuracy and can learn over time. Artificial intelligence and machine learning, which have pervaded the BI sector and are transforming the way businesses think about their data, are at the forefront of this automation. Machine learning, like any other technical frontier, may be a risky topic for enterprises. To deal with customer churn in B2C E-Commerce, the technique proposed in this study adopts a two-pronged approach. The first component employs support vector machine technology to anticipate churn occurrences. The second component looks at how to use a hybrid strategy that blends content-based, collaborative, demographic, and knowledge-based methodologies to produce customised retention approaches.

Business intelligence is a system that collects, analyses, and presents massive amounts of data and information. When used correctly, business intelligence solutions and products save time and energy while providing crucial information about the business and its clients. Business intelligence software can parse the bulk of the raw data, understand it, and present the information that matters. Organizations can use business intelligence to make better-informed decisions in sales, marketing, and other areas of the business. E-commerce enterprises can employ business intelligence to make data-driven decisions. It is possible to make decisions about future firm development by taking a long-term view of the available market, competition, and other facts. Business intelligence can be used by e-commerce organizations to identify specific sales patterns based on their customers’ preferences, online shopping experiences, habits, buying behaviour, and responses to promotions and trends, all of which affect sales, and so help them maximize revenue. Any organization’s goal is to decrease waste while increasing revenue. Business intelligence can help e-commerce organizations identify quality concerns, including customer attrition, lost sales productivity because of call centre dissatisfaction, poor strategy because of inadequately analysed market research, and a range of other difficulties that demand quick, efficient answers. Therefore, data may be processed through a variety of modules to identify waste-reduction opportunities and
set up plans to address these concerns. Business intelligence helps e-commerce organizations refine a diverse inventory and optimize supply quantities by evaluating substantial historical data, for example customer profiles and purchase habits. It also decreases the chance of out-of-stock situations by analysing safety stock data, sales, and inventory data to generate accurate estimates. Business intelligence can predict overstock situations before they become a severe problem by merging sales, replenishment, and forecasting data.

2.1. The concept of customer churn

Customer churn refers to a situation in which a customer’s contribution to the company’s earnings is decreasing. Customer churn is divided into two groups, churn in contractual relationships and churn in non-contractual relationships, based on whether the company and the customer have entered into a contract.

Consider first customer churn in a contractual relationship. In a contractual setting, the customer relationship means that the firm and the customer cooperate to reduce both parties’ losses in the transaction process and bind themselves through the contract. Both parties’ rights and obligations are clearly stated in the contract; after signing it, the customer must fulfil the relevant obligations in order to enjoy the corresponding rights. In a contractual relationship, transactions between the customer and the firm are constrained by the contract, the customer faces a higher switching cost, and the customer’s transaction behaviour is reasonably predictable. Telecommunications and insurance companies are examples of contract-based businesses.

Customer churn is also an issue for businesses with non-contractual relationships. In non-contractual situations, the relationship between the firm and the customer begins with the customer’s initial transaction. Customers and businesses do not need to sign a contract; customers can easily enter and exit the business, and the trading and loss behaviour of customers is unpredictable. Customers may keep buying for a long period after their first purchase or never trade again. The customer turnover rate is higher in non-contractual relationships because the firm is less bound to the customer and the cost of customer switching is low. In non-contractual relationships, measuring customer turnover is more challenging since the reasons for lost customers are more ambiguous, and there is no sharp division between lost and non-lost customers. E-commerce businesses must establish a benchmark for customer turnover based on the characteristics of their products in order to detect the trend of lost customers in a timely manner and adopt effective retention strategies. Fig. 1 provides an overview of customer churn management.

Customers who shop online are characteristic of customer churn in non-contractual relationships. When the length of time between transactions is considered, the loss of an online shopping customer is split into two categories: intermittent loss and permanent loss.

(1) Intermittent loss: The term “intermittent loss” refers to customers who did not purchase the company’s goods or services over a set length of time, with the main feature being a reduction in trade frequency. Customers in this category may not purchase the company’s products within the specified time limit, but this does not mean that the customer is completely lost: after the time limit has expired, he or she may still purchase the company’s product or service.
(2) Permanent loss: The term “permanent loss” refers to a customer’s decision not to purchase the company’s products or services in the future. E-commerce businesses will not delete a customer’s account even if the customer has been inactive. Because the customer can still log in with the registered account after a long period, the business cannot tell whether the customer is permanently gone. Permanent loss refers to the complete loss of a customer, which can occur for a variety of reasons, including a change in the customer’s purchasing habits or a change in the growth stage after which the product is no longer required.

2.2. Detection of customer churn

Statistical models and machine learning algorithms are examples of predictive analytics techniques that can be used to populate risk scores for each customer, allowing for proactive churn detection. Usage data, customer reviews, customer demography, net promoter score, purchasing patterns, and other input characteristics are all used in this process. All of this information is available to a B2C E-Commerce firm from the date on which a customer registers with it. The next step is therefore to finalise a framework for processing this data. One possibility is to use statistical models based on regression techniques such as logistic regression to anticipate prospective churners. Another is to use machine learning technologies such as the support vector machine. The Support Vector Machine (SVM) is a supervised learning technique that may be used to solve both regression and classification problems in machine learning. SVM is based on the idea of decision planes that separate a set of entities with distinct class memberships. SVM is trained on a set of data in which every element is labelled as belonging to one of two classes. SVM builds a model that can assign each new member to one of the two classes based on the data obtained. To do this, an algorithmically derived optimal hyperplane in the feature space is constructed, serving as the decision boundary for the classification problem. In this setting SVM can be considered a non-probabilistic binary linear classifier, as shown in Fig. 2.

For a given data collection $(X_1, Y_1), (X_2, Y_2), (X_3, Y_3), \ldots, (X_m, Y_m)$, $Y_i = -1$ for inputs $X_i$ in class 0 and $Y_i = 1$ for inputs $X_i$ in class 1.

The decision boundary can be expressed mathematically using a vector that represents a line in two dimensions together with the two-dimensional input vectors, where b represents the bias as a constant. In this way, the negative support vector n can be defined as the input vector from class 0, and the positive support vector p can be defined as the input vector from class 1.
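The decision boundary and the two support vectors described above can be written compactly. The following is a standard textbook formulation of the linear SVM, offered as an assumption rather than as the authors’ exact notation: the weight vector $w$ is a symbol introduced here for illustration, while $b$, $n$, $p$, $X_i$ and $Y_i$ follow the text.

```latex
\begin{align}
  w \cdot x + b &= 0   && \text{(decision boundary)} \\
  w \cdot n + b &= -1  && \text{(negative support vector } n \text{, class 0)} \\
  w \cdot p + b &= +1  && \text{(positive support vector } p \text{, class 1)} \\
  Y_i \,(w \cdot X_i + b) &\ge 1, && i = 1, \dots, m \quad \text{(all training points on the correct side)}
\end{align}
```

Under this scaling the margin between the two classes is $2 / \lVert w \rVert$, so finding the maximum-margin hyperplane amounts to minimising $\tfrac{1}{2}\lVert w \rVert^2$ subject to the last constraint.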
Support Vector Machines use a technique known as the kernel trick to achieve nonlinear classification. The input features are mapped into a high-dimensional feature space, and the kernel trick is used to perform the classification task there. The method is similar to that used in the linear case, except that every dot product is replaced by a nonlinear kernel function. This allows the algorithm to fit the decision boundary (also known as the maximum-margin hyperplane) in the transformed, higher-dimensional feature space. The hyperplane of the transformed feature space may become nonlinear when mapped back to the original input space (Fig. 2). As a general rule, any mathematical function that satisfies Mercer’s condition can be used as a kernel function.

In practice, the most often used kernel functions are based on Euclidean inner products or Euclidean distance. The type of class boundaries and the structure of the data are essential elements that influence kernel selection. SVM is regarded as one of the most reliable classification techniques in various difficult real-time circumstances. Handwritten character recognition, image/face detection, hypertext categorization, and fraud detection are a few examples. In these cases, the main benefit of SVM is that it avoids the overfitting problem by using an appropriate training phase to generate the classification model.

In the case of B2C E-Commerce churn detection, a binary classification of the entire customer base using a support vector machine is used to identify prospective churners (Fig. 3). Each customer is represented in an n-dimensional vector space, with each customer attribute represented by one dimension of the feature space. Historic churn data, customer characteristics, and transaction traits are used to train the SVM model, as shown in Fig. 3. A periodic churn test is also run for each member of the customer database to detect probable attrition scenarios.

2.3. Retention action

In the B2C E-Commerce industry, a hybrid method is the most dependable way of generating a list of the best tailored retention alternatives. To achieve the best recommendation result, it is advised to combine collaborative, content-based, knowledge-based, and demographic techniques, as shown in Fig. 4.

Fig. 4. Retention Actions based Hybrid Recommendation Strategy.

Hybrid classification algorithm:
An overview of the proposed algorithm:

Step 1. Determination of the training data set.
Step 2. Determination of the test data set.
Step 3. Determination of the calculated accuracy.
Step 4. Selection of the optimal values of cost and gamma.
Step 5. Implementation of the SVM training step for every data point.
Step 6. Application of the SVM to classify the testing data points.
Step 7. Returning of the accuracy value.

The customer’s personal information and previous transactional qualities serve as the input data for creating content-based recommendations. The customer’s demographic data and the knowledge accumulated about them over time serve as the inputs for knowledge-based and demographic-based recommendations. Collaborative suggestions are created using the usage habits of similar users. To make similarity-based recommendations, similarity functions are used to calculate the perceived value of the recommendations to the target customer. Recommendations generated in this way often have a high level of personalization since they are unique for each customer and are influenced by their prior activity patterns and individual profiles.

Let I be the predicted set of potential attrition candidates, with customer i belonging to I, and let J be the set of loyal customers with more than 90% loyalty, i.e., a churn probability of less than 10%, with customer j belonging to J. In contrast to conventional recommender systems, which always construct the neighbourhood with customers who are exactly like the client in question, this approach to neighbourhood creation uses customers who are different from the client. The following are the main factors or assumptions to take into account while creating a customised retention strategy for a likely churn scenario: future customer churn may be prevented by proposing appropriate offers and services; the value of an offer or service may vary from customer to customer; and customers with similar tastes will behave similarly.
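The neighbourhood idea above, scoring retention offers for a predicted churner from the behaviour of loyal customers with a churn probability below 10%, can be sketched as follows. This is a minimal illustration of the collaborative part only, not the authors’ implementation: the cosine similarity measure, the field names (`profile`, `churn_prob`, `accepted_offers`), the offer labels, and the function `recommend_offers` are all assumptions introduced here, and the content-based, knowledge-based, and demographic components are omitted.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_offers(target_profile, customers, top_n=2, loyal_threshold=0.10):
    """Rank retention offers for one predicted churner.

    The neighbourhood is built only from loyal customers, i.e. those whose
    churn probability is below `loyal_threshold` (10% in the text). Each
    offer a loyal customer accepted is weighted by that customer's
    similarity to the target (assumed field names, hypothetical data).
    """
    scores = {}
    for customer in customers:
        if customer["churn_prob"] >= loyal_threshold:
            continue  # not loyal enough to be part of the neighbourhood
        weight = cosine_similarity(target_profile, customer["profile"])
        for offer in customer["accepted_offers"]:
            scores[offer] = scores.get(offer, 0.0) + weight
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [offer for offer, _ in ranked[:top_n]]


# Hypothetical customer base with two loyal customers and one likely churner.
example_customers = [
    {"profile": [1.0, 0.0], "churn_prob": 0.05, "accepted_offers": ["free_shipping"]},
    {"profile": [0.0, 1.0], "churn_prob": 0.02, "accepted_offers": ["discount_10"]},
    {"profile": [1.0, 1.0], "churn_prob": 0.40, "accepted_offers": ["gift_card"]},
]
example_target = [1.0, 0.2]
example_ranking = recommend_offers(example_target, example_customers)
```

Weighting by similarity reflects the stated assumption that customers with similar tastes behave similarly; a production system would also need to estimate the perceived value of each offer per customer.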
Fig. 6. Analyzing the data from the training sample.
Fig. 7. The connection between pricing and consumer attrition.
accuracy rate of 82.64% and an error rate of 17.36% seen in Fig. 12.
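Accuracy and error rates such as those quoted above are normally derived from a two-by-two categorization table. The sketch below uses the cell names f11, f12, f21 and f22 of the paper’s customer churn categorization table (Table 1), with rows as actual classes and columns as predicted classes. The formulas for hit ratio, coverage ratio and lift are the commonly used definitions and are an assumption here, since the paper’s own formulas are not part of this text; the counts in the example are hypothetical.

```python
def churn_metrics(f11, f12, f21, f22):
    """Evaluation metrics from a 2x2 churn categorization table.

    f11: actual non-lost, predicted non-lost
    f12: actual non-lost, predicted lost
    f21: actual lost,     predicted non-lost
    f22: actual lost,     predicted lost
    """
    total = f11 + f12 + f21 + f22
    if total == 0:
        raise ValueError("the categorization table is empty")
    predicted_lost = f12 + f22
    actual_lost = f21 + f22
    accuracy = (f11 + f22) / total
    # Hit ratio: share of predicted churners who really churned (precision).
    hit_ratio = f22 / predicted_lost if predicted_lost else 0.0
    # Coverage ratio: share of real churners the model caught (recall).
    coverage_ratio = f22 / actual_lost if actual_lost else 0.0
    # Lift: hit ratio relative to the overall churn rate.
    base_rate = actual_lost / total
    lift = hit_ratio / base_rate if base_rate else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "hit_ratio": hit_ratio,
        "coverage_ratio": coverage_ratio,
        "lift": lift,
    }


# Hypothetical counts, not taken from the paper.
example_metrics = churn_metrics(f11=70, f12=10, f21=5, f22=15)
```

By construction the accuracy and error rate sum to one, which matches the pair of figures reported above.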
For this research, we use the PEN hypothesis, in which P stands for the recency of the most recent purchase, E stands for the frequency of exchanges, and N stands for the monetary purchase amount, which reflects the customer’s contribution to the company’s value. After the weights of the three factors were adjusted, the customer’s value increased. Because of the time axis’s influence on the customer’s most recent purchase date, the two components E and N are each split into two states. N2 denotes a purchase amount higher than the average item price, whereas N1 denotes one lower than the average item price. Likewise, E2 denotes exchanges that happen more often than the average exchange rate, whereas E1 denotes exchanges that happen less often. The PEN hypothesis is used to estimate the value of lost customers in this investigation, as shown in Table 2.
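The two-state split of frequency (E1/E2) and purchase amount (N1/N2) described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors’ procedure: the function names `pen_segments` and `segment_shares` are hypothetical, the averages are taken over the supplied customers, values exactly equal to the average are assigned to the lower state, and the sample data is invented. The PEN labels appear to correspond to the RFM principle named in the abstract.

```python
from collections import Counter
from statistics import mean


def pen_segments(customers):
    """Label each customer E1N1, E1N2, E2N1 or E2N2.

    `customers` maps a customer id to (purchase_frequency, purchase_amount).
    E2 / N2 mean strictly above the average; E1 / N1 mean at or below it
    (the tie rule is an assumption).
    """
    avg_frequency = mean(freq for freq, _ in customers.values())
    avg_amount = mean(amount for _, amount in customers.values())
    labels = {}
    for customer_id, (freq, amount) in customers.items():
        e_state = "E2" if freq > avg_frequency else "E1"
        n_state = "N2" if amount > avg_amount else "N1"
        labels[customer_id] = e_state + n_state
    return labels


def segment_shares(labels):
    """Percentage of customers falling in each segment."""
    counts = Counter(labels.values())
    total = sum(counts.values())
    return {segment: 100.0 * count / total for segment, count in counts.items()}


# Hypothetical lost customers: (purchase frequency, purchase amount).
example_lost_customers = {
    "a": (1, 10.0),
    "b": (1, 90.0),
    "c": (9, 10.0),
    "d": (9, 90.0),
}
example_labels = pen_segments(example_lost_customers)
example_shares = segment_shares(example_labels)
```

Applied to the set of lost customers, the shares correspond to the Rate% column of a table such as Table 2.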
Fig. 9. The connection between trading frequency and client attrition.
Fig. 10. The relationship between customer churn and product’s score.

More than half of all lost customers are E1N1 customers, making them the most prevalent type of lost customer. Even if such customers might be dismissed due to their lower value in a conventional management strategy, online businesses should not disregard them. Businesses engaged in e-commerce need to address the problem of decreasing the loss of low-value customers. Next are customers of type E1N2. Although this kind of customer trades less often than the average, they spend more than the average, with a volume that is around 25.5% higher. Their trading frequency is approximately the same as that of E1N1 customers. 9.8% of customers in the E2N1 category have been lost. As the purchase value stays the same, increasing customer purchase frequency significantly lowers turnover when compared to E1N1 customers. Increasing the amount a customer purchases becomes harder after a while, when the customer’s income is essentially steady. The best strategy is thus to increase customer frequency. E2N2 customers have the lowest rate of loss, regardless of whether they purchase more often or in larger quantities than the norm. Increasing the quantity of things sold and the trading frequency of online shoppers would
Table 1
The customer churn categorization table.
Actual \ Prediction     Non-lost customer     Lost customer
Non-lost customer       f11                   f12
Lost customer           f21                   f22

Fig. 13. The BP neural network model’s lifting curve.

Table 2
Classification of the value of lost customers.
Customer category     Quantity     Rate (%)

3. Conclusion

Multiple marketing studies have shown that retaining existing customers is the best option for sustaining a B2C E-Commerce business, as penetrating the competition’s client base is nearly five times more expensive than retention. Furthermore, a new customer can be considered valuable only after a longer period and a greater number of transactions. According to a study conducted by Bain & Co, a 5% increase in customer retention can bring about a 25% increase in profit. Another figure shows that active customers generate more business than new customers in B2C E-Commerce. The likelihood of business from active customers is 60%. These figures underline the significance of retaining customers and reducing churn. This study provides a framework that uses machine learning methods, specifically support vector machines, to detect probable customer attrition. The huge amount of information accessible to E-Commerce organizations can be mined to find hidden behaviour patterns, and any deviation from expected patterns can be seen as potential customer churn. Personalized retention techniques are also developed using hybrid recommendation strategies. The planned next steps are the evaluation of recommendation efficiency and automatic feedback to improve the SVM model. In addition, ensemble methods are to be compared with the SVM model to see whether they can improve the churn detection process.

Funding statement

Data availability

Data will be made available on request.
improved value model and XG-boost algorithm, Manag. Sci. Eng. 12 (3) (2018) 51–56.
[8] S. Renjith, An integrated framework to recommend personalized retention actions to control B2C E-commerce customer churn, 2015. Volume-27 Number-3, pp 152-159 arXiv preprint arXiv:1511.06975.
[9] E.E. Grandón, P.E. Ramírez-Correa, J.S.L. Orrego, Modelo de Aplicaciones e-Business en Grandes Empresas: una Validación Empírica, Interciencia 44 (4) (2019) 210–217.
[10] X.Q. Wu, L. Zhang, S.L. Tian, L. Wu, Scenario based e-commerce recommendation algorithm based on customer interest in Internet of things environment, Electron. Commer. Res. (2019) 1–17.
[11] Ponnan Suresh, J Robert Theivadas, V.S. HemaKumar, Daniel Einarson, Driver monitoring and passenger interaction system using wearable device in intelligent vehicle, Comput. Electr. Eng. 103 (2022), 108323, https://doi.org/10.1016/j.compeleceng.2022.108323. ISSN 0045-7906.
[12] Z. Pei, R. Yan, Cooperative behavior and information sharing in the e-commerce age, Ind. Market. Manag. 76 (2019) 12–22.
[13] V. Urbancokova, M. Kompan, Z. Trebulova, M. Bielikova, Behavior-based customer demography prediction in e-commerce, J. Electron. Commer. Res. 21 (2) (2020) 96–112.
[14] T.T. Leonid, R. Jayaparvathy, Classification of elephant sounds using parallel convolutional neural network, Intelligent Automation and Soft Computing 32 (3) (2022) 1415–1426.
[15] X. Yu, S. Guo, J. Guo, X. Huang, An extended support vector machine forecasting framework for customer churn in e-commerce, Expert Syst. Appl. 38 (3) (2011) 1425–1430.
[16] N. Glady, B. Baesens, C. Croux, Modeling churn using customer lifetime value, Eur. J. Oper. Res. 197 (1) (2009) 402–411.
[17] K.A. Amuda, A.B. Adeyemo, Customers Churn Prediction in Financial Institution Using Artificial Neural Network, 2019 arXiv preprint arXiv:1912.11346.
[18] T. Ferreira, I. Pedrosa, J. Bernardino, Business intelligence for e-commerce: survey and research directions, in: World Conference on Information Systems and Technologies, Springer, Cham, 2017, April, pp. 215–225.
[19] J. Yadav, B. Mallick, Web mining: characteristics and application in ecommerce, Intl. J. of IJECSE 1 (4) (2011).
[20] A. Bologa, R. Bologa, Business intelligence using software agents, Database Syst. J. 2 (4) (2011) 31–42.
[21] Dilip Singh Sisodia, Somdutta Vishwakarma, Abinash Pujahari, Evaluation of machine learning models for employee churn prediction, in: International Conference on Inventive Computing and Informatics, 2017.