Research Article
Research on Segmenting E-Commerce Customer through
an Improved K-Medoids Clustering Algorithm
Zengyuan Wu,1 Lingmin Jin,1 Jiali Zhao,1 Lizheng Jing,1 and Liang Chen2
1 College of Economics and Management, China Jiliang University, No. 258, Xueyuan Street, Hangzhou, Zhejiang 310018, China
2 College of Optical and Electronic Technology, China Jiliang University, No. 258, Xueyuan Street, Hangzhou, Zhejiang 310018, China
Received 2 March 2022; Revised 11 April 2022; Accepted 11 May 2022; Published 18 June 2022
Copyright © 2022 Zengyuan Wu et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
In view of the shortcomings of traditional clustering algorithms in feature selection and clustering effect, an improved Recency, Frequency, and Monetary (RFM) model is introduced, and an improved K-medoids algorithm is proposed. The above model and algorithm are employed to segment e-commerce customers. First, the traditional RFM model is improved by adding two features of customer consumption behavior. Second, in order to overcome the defect of setting the K value artificially in the traditional K-medoids algorithm, the Calinski–Harabasz (CH) index is introduced to determine the optimal number of clusters. Meanwhile, the K-medoids algorithm is optimized by changing the selection of centroids to avoid the influence of noise and isolated points. Finally, empirical research is done using a dataset from an e-commerce platform. The results show that our improved K-medoids algorithm can improve the efficiency and accuracy of e-commerce customer segmentation.
features, we introduce customer consumption behavior data into the traditional RFM model, including data on products added to the shopping cart (C) and to favorites (V). Second, in terms of algorithm improvement, we address the problem of artificially setting the K value in the K-medoids algorithm and introduce the CH index as the clustering quality evaluation criterion to determine the best K value. Meanwhile, to address the problem that the K-medoids algorithm is sensitive to the initial clustering centers, we draw on the K-means++ algorithm to improve the selection of clustering centers. The experimental results show that the improved K-medoids algorithm can effectively alleviate the sensitivity of the algorithm to noise and to the selection of initial clustering centers. The algorithm also takes operational performance into account, so as to improve the efficiency and accuracy of e-commerce customer segmentation.

The rest of this paper is organized as follows. In Section 2, the existing literature on customer segmentation is reviewed and the research gaps are identified. In Section 3, the improved K-medoids algorithm is described in detail. In Section 4, empirical research is done using an e-commerce dataset and the empirical results are analyzed. In Section 5, the contributions, shortcomings, and future research are discussed. Finally, the conclusions are drawn in Section 6.

2. Literature Review

Existing literature on customer segmentation is divided into two fields. The first is about selecting different segmentation features. The second is about selecting and improving the clustering algorithms. In terms of the selection of segmentation features, the existing literature can be divided into three types from different perspectives [12], including the demographic perspective, the customer life cycle perspective, and the customer behavior perspective. Firstly, scholars [13] who conducted research from the perspective of demography mainly collected data using questionnaire surveys. They divide customers into different groups according to their age, gender, family income, marital status, education, etc. Secondly, literature studying this issue from the perspective of the customer life cycle [14] divides the customer life cycle into several stages according to the number of new customers, retained customers, and lost customers. In different stages, companies should take different measures for each group. The customer loyalty classification method [15, 16] is the most popular segmentation method in the existing segmentation literature. Third, with the continuous development of data mining technology, indicator selection methods based on customer behavior are becoming a hot topic. In this literature, multidimensional features are used to reflect the consumption behaviors and habits of different customer groups [17, 18]. As a classic customer value model, the RFM model has been successfully applied to customer segmentation [19, 20]. Owing to the features of different industries, some scholars have improved and extended the RFM model [21–24]. However, the consumer behavior preferences of different customer groups cannot be well identified. Yoseph et al. [25] studied consumer behavior (e.g., clicking on product links, browsing products, and adding to cart) and purchasing power, and added these features to the RFM model so that consumer categories could be accurately identified and differentiated.

The K-means algorithm and the K-medoids algorithm are the most commonly used clustering algorithms. K-means has been widely applied in the fields of data mining and pattern recognition because of its advantages such as simple operation and fast speed. However, the traditional K-means algorithm is susceptible to noise and isolated points, which leads to poor clustering results [26]. The K-medoids algorithm is another classical division-based clustering method [27]. Compared with K-means, this algorithm optimizes the selection method of the cluster center, overcomes the defect of being sensitive to isolated points, and has higher clustering accuracy. However, the K-medoids algorithm still has the problem of being vulnerable to the initial clustering centers. To address this problem, many scholars have proposed a series of improved algorithms for K-medoids.

For the problem of selecting initial clustering centers, two improvement ideas are mainly proposed in the existing literature. First, based on the K-medoids algorithm, existing studies optimize the selection of initial clustering centers using the distance or correlation between samples [28, 29]. This improved method is based on the following principle. Since the cluster centers are usually the more important sample points in a cluster, the denser the sample points are and the stronger their correlation with other sample points, the more likely they are to become the best cluster centers. Ho-Kieu et al. [28] proposed an improved initial center selection method by introducing a probability density function. The experimental results showed that the improved algorithm had obvious advantages compared with the original K-medoids algorithm. The above improved methods optimize the selection of initial clustering centers in K-medoids, reduce the number of iterations, and improve the clustering efficiency. However, these selection methods only consider the distance or correlation between samples, which easily makes the clustering results fall into a local optimum. They cannot achieve accurate clustering results for datasets with a large disparity in the number of samples between clusters.

Second, some scholars introduce Swarm Intelligence [30, 31] and combine it with K-medoids to improve the global search capability and efficiency of the improved algorithms. Arthur and Vassilvitskii [32] algorithmically fused the swarm algorithm with K-medoids. The experimental results showed that the improved algorithm effectively reduced the influence of noise on the clustering results and improved the clustering accuracy. This type of improved algorithm effectively avoids the problem of clustering results falling into a local optimum. However, it is worth noting that the integration with Swarm Intelligence leads to an increase in algorithm complexity and a reduction in operational efficiency. The huge transaction volume and mass data in e-commerce platforms require high clustering efficiency. It is necessary for platform managers to segment customers in a timely manner in order to manage e-commerce customers well. Therefore, in this paper we try to solve the sensitivity to the initial clustering centers that exists in the K-medoids
algorithm while ensuring the operational efficiency of the algorithm.

In summary, in the existing e-commerce customer segmentation literature, there are still two gaps that have not been solved well. First, from the perspective of selecting segmentation features, the existing literature focuses on using the historical order data of customers. But the consumption behavior data of customers is ignored, so the behavioral preferences and consumption habits of customers in different customer groups cannot be comprehensively reflected. Second, from the perspective of clustering algorithms, although the improved K-medoids algorithms in the existing literature alleviate the sensitivity of the algorithm to the initial clustering centers and improve the clustering performance, there are still limitations in two aspects. First, the clustering results may fall into a local optimum. Second, the algorithm may run less efficiently.

Therefore, we attempt to solve the above problems. First, when selecting segmentation features, we construct a new model by incorporating customers' online consumption behavior, in which Recency, Frequency, Monetary, Add to Cart, and Add Favorites are included. For clarity, this model is called the RFMCV model. Second, considering the defect of artificially setting the K value in the K-medoids algorithm, we introduce the CH index to determine the best K value. Third, drawing on the idea of the K-means++ algorithm [33] for selecting initial clustering centers, the K-medoids algorithm is improved. Finally, the algorithm proposed in this paper is validated on two standard test datasets.

3. Improved K-Medoids Algorithm

In this paper, we improve the K-medoids algorithm in two aspects. First, the CH evaluation index is introduced in order to determine the optimal number of clusters in the K-medoids algorithm. Second, the idea of the K-means++ algorithm is introduced when selecting initial clustering centers.

3.1. Description of the K-Medoids Algorithm. Both the K-means and K-medoids algorithms are classical division-based clustering methods, which generally use the Euclidean distance as a measure of similarity between two data points. The smaller the distance, the greater the similarity. However, the K-medoids algorithm is optimized for the selection of centroids to avoid the influence of noise and isolated points [34]. The algorithm is implemented in the following steps. First, input the dataset and the number of clusters. Second, initialize the clustering centers and assign samples. Randomly select the initial clustering centers, calculate the Euclidean distance between each remaining data point and the clustering centers, find the shortest distance, and assign all samples to the clusters corresponding to the nearest clustering center. Third, update the cluster centers. Randomly select a noncentroid point and replace a clustering center with it according to the principle of squared error function value reduction. Finally, iterative calculation is performed until the clustering centers no longer change or the maximum number of iterations is reached. Then, the cycle ends and the final clustering result is obtained.
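To make the above procedure concrete, the following is a minimal NumPy sketch of this classical K-medoids loop (random initialization followed by swap-based medoid updates). It is an illustration written for this description, not the authors' original code; the function and variable names are our own, and the total within-cluster distance is used as the cost in place of the squared error function mentioned above.

```python
import numpy as np

def kmedoids(X, k, medoids=None, max_iter=100, rng=None):
    """Classical K-medoids: (random) medoid init, then greedy swap updates."""
    rng = np.random.default_rng(rng)
    n = len(X)
    if medoids is None:
        medoids = rng.choice(n, size=k, replace=False)  # random initial centers
    medoids = np.asarray(medoids)

    for _ in range(max_iter):
        # Assign each sample to its nearest medoid (Euclidean distance).
        dist = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        cost = dist[np.arange(n), labels].sum()

        # Try replacing each medoid with a random non-medoid point; keep the
        # swap only if it reduces the total within-cluster distance.
        new_medoids = medoids.copy()
        for j in range(k):
            candidate = int(rng.integers(n))
            if candidate in new_medoids:
                continue
            trial = new_medoids.copy()
            trial[j] = candidate
            d = np.linalg.norm(X[:, None, :] - X[trial][None, :, :], axis=2)
            if d.min(axis=1).sum() < cost:
                new_medoids, cost = trial, d.min(axis=1).sum()

        if np.array_equal(new_medoids, medoids):
            break  # centers stable: stop iterating
        medoids = new_medoids

    # Final assignment with the converged medoids.
    dist = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
    return medoids, dist.argmin(axis=1)
```

The medoids argument is included so that the improved initialization of Section 3.2 can be plugged into the same loop.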
3.2. Implementation Procedure of the Improved K-Medoids Algorithm. The implementation procedure of the improved K-medoids algorithm is summarized in Algorithm 1.

Algorithm 1: Improved K-medoids algorithm.
Input: dataset X = {x1, x2, . . ., xn}, where n is the number of data points.
Step 1: Randomly select one sample from the dataset as the initial clustering center C1.
Step 2: First, calculate the shortest distance D(x) between each sample and the existing clustering centers. Second, calculate the probability P(x) that a sample is selected as the next clustering center: P(x) = D(x)^2 / Σ_{x′∈X} D(x′)^2. Third, generate a random number Ri in the interval (0, 1) and subtract the probabilities P(x) from Ri one by one. Finally, when the running difference becomes less than or equal to 0 for the first time, the corresponding sample is the next clustering center.
Step 3: Repeat Step 2 until K clustering centers are selected.
Step 4: Assign samples. Calculate the Euclidean distance between each remaining data point and the cluster centers Ci, then find the shortest distance. Assign all samples to the clusters corresponding to the nearest cluster center Ci.
Step 5: Update the cluster centers. Randomly select a noncentral point Crandom and replace Ci with Crandom to update the cluster center of each cluster according to the principle of squared error function value reduction.
Step 6: Repeat Step 4 and Step 5 until the cluster centers no longer change or the maximum number of iterations is reached; the cycle then ends and the final clustering result is obtained.
Output: clustering result C = {c1, c2, . . ., ck}.
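To illustrate Steps 1–3, the snippet below is a small sketch (our own, not the authors' code) of the D(x)²-weighted roulette seeding that Algorithm 1 borrows from K-means++; it returns medoid indices that can be passed as the medoids argument of the kmedoids() sketch in Section 3.1.

```python
import numpy as np

def seed_medoids(X, k, rng=None):
    """Select k initial medoids by D(x)^2-weighted roulette selection
    (Steps 1-3 of Algorithm 1). Returns indices into X."""
    rng = np.random.default_rng(rng)
    n = len(X)
    centers = [int(rng.integers(n))]  # Step 1: one random initial center

    while len(centers) < k:  # Step 3: repeat until k centers are chosen
        # Step 2: D(x) = distance from each sample to its nearest center.
        d = np.linalg.norm(X[:, None, :] - X[centers][None, :, :], axis=2)
        D = d.min(axis=1)
        P = D ** 2 / (D ** 2).sum()  # selection probabilities

        # Roulette selection: subtract each P(x) from a random number in
        # (0, 1); the sample at which the running difference first drops
        # to 0 or below becomes the next clustering center.
        r = rng.random()
        for i, p in enumerate(P):
            r -= p
            if r <= 0:
                centers.append(i)
                break

    return np.array(centers)
```

Calling kmedoids(X, k, medoids=seed_medoids(X, k)) then reproduces the overall structure of Algorithm 1: seeded initialization followed by assignment and swap-based updates. Compared with purely random initialization, this seeding spreads the initial centers apart, which is what alleviates the sensitivity to the initial clustering centers.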
3.3. Determine the Optimal Number of Clusters k. We introduce the CH clustering quality evaluation index [32] and set the number of clusters to the value corresponding to the highest CH value. The CH value is the ratio of intercluster sample separation to intracluster sample tightness, and a larger CH represents tighter clusters and more dispersed classes (i.e., a better clustering result). When the clusters are internally dense and well separated from each other, the optimal number of clusters can be clearly read off the line graph of CH values, and the index has the advantage of fast calculation speed.

The calculation formula of the CH value is

S(k) = (BGSS / WGSS) × ((m − k) / (k − 1)),  (1)

where m is the total number of samples and k is the number of clusters; WGSS and BGSS are defined as follows.

Within-Groups Sum of Squared Error (WGSS) is the sum of squared errors within clusters. It is used to measure the tightness of samples within clusters. The smaller the WGSS, the tighter the clusters and the better the clustering effect. Its calculation formula is

WGSS = (1/2) [(m_1 − 1) d_1^2 + · · · + (m_k − 1) d_k^2],  (2)

where d_k^2 is the average distance of samples within the k-th cluster and m_k is the number of samples in the k-th cluster.

Between-Groups Sum of Squared Error (BGSS) is the sum of squared errors between clusters, which is used to measure the separation of samples between clusters. The larger the BGSS, the more dispersed the clusters and the better the clustering effect. Its calculation formula is

BGSS = (1/2) [(k − 1) d^2 + Σ_{j=1}^{k} (m_j − 1)(d^2 − d_j^2)],  (3)

where d^2 is the average distance between all samples, d_j^2 is the average distance of samples within the j-th cluster, m_j is the number of samples in the j-th cluster, and k is the number of clusters.
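Equations (1)–(3) translate almost directly into code. The sketch below is our reading of the reconstructed formulas (in particular, d² is interpreted as the mean squared pairwise distance); it evaluates S(k) for a candidate clustering and can be used to pick the k with the highest CH value, reusing the kmedoids() sketch above.

```python
import numpy as np
from scipy.spatial.distance import pdist

def ch_index(X, labels):
    """CH value S(k) from equations (1)-(3), using mean squared
    pairwise distances for d^2 and d_j^2."""
    m, k = len(X), len(np.unique(labels))
    d2_all = pdist(X, "sqeuclidean").mean()  # d^2 over all samples

    wgss, bgss = 0.0, (k - 1) * d2_all / 2.0
    for j in np.unique(labels):
        C = X[labels == j]
        mj = len(C)
        d2_j = pdist(C, "sqeuclidean").mean() if mj > 1 else 0.0
        wgss += (mj - 1) * d2_j / 2.0              # equation (2)
        bgss += (mj - 1) * (d2_all - d2_j) / 2.0   # equation (3)

    return (bgss / wgss) * ((m - k) / (k - 1))     # equation (1)

# Choose k as the arg-max of S(k) over a candidate range, e.g. 2..19:
# best_k = max(range(2, 20), key=lambda k: ch_index(X, kmedoids(X, k)[1]))
```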
3.4. Comparison and Validation. In order to verify the effectiveness of the improved K-medoids algorithm proposed in this paper, two comparison experiments are conducted. First, we compare the performance of the clustering algorithms. Second, we compare the clustering quality evaluation indicators.

3.4.1. Comparison of Algorithm Performance. In order to verify the effectiveness of the algorithm, two standard test datasets were selected for the experiments: breast cancer [35] and iris plants [36] from the UCI database. The UCI database, built by the University of California, Irvine, is the most popular dataset repository in the field of machine learning. Furthermore, the K-medoids, K-means++, and spectral clustering (SC) methods were selected for comparison with the improved K-medoids algorithm proposed in this paper. Both the clustering accuracy (ACC) and the running time of the 4 algorithms on the two datasets were compared. The results are shown in Table 1.

Table 1: The performance of 4 algorithms working on different datasets.

                          Breast cancer          Iris plants
Clustering algorithm      ACC      Time (ms)     ACC      Time (ms)
K-medoids                 0.858    33.1          0.663    26.5
K-means++                 0.854    208.2         0.833    265.0
Spectral clustering       0.667    103.8         0.900    118.1
Improved K-medoids        0.868    22.7          0.840    13.9

As can be seen from Table 1, the improved K-medoids algorithm has an accuracy of 86.8% on the breast cancer dataset, outperforming the K-medoids, K-means++, and spectral clustering methods in terms of clustering accuracy. Meanwhile, the running time of the improved K-medoids algorithm, 22.7 ms, is shorter than that of the other 3 algorithms. On the iris plants dataset, the improved K-medoids algorithm has the highest accuracy of 84% and the shortest running time of 13.9 ms. Therefore, among the four algorithms, the improved K-medoids algorithm has the best performance in terms of accuracy and clustering efficiency. Based on the above analysis, the improved K-medoids algorithm proposed in this paper outperforms the other three clustering methods on both datasets.
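A sketch of how such a comparison could be reproduced is given below: it loads the two UCI datasets through scikit-learn, times each clusterer, and scores accuracy by optimally matching cluster labels to class labels with the Hungarian algorithm. The timing harness and the label-matching convention are our assumptions; the paper does not specify its experimental setup.

```python
import time
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.cluster import KMeans, SpectralClustering

def accuracy(y_true, y_pred):
    """Clustering ACC: best one-to-one mapping of clusters to classes."""
    k = max(y_true.max(), y_pred.max()) + 1
    cm = np.zeros((k, k), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    rows, cols = linear_sum_assignment(-cm)  # maximize matched counts
    return cm[rows, cols].sum() / len(y_true)

for name, loader in [("breast cancer", load_breast_cancer), ("iris", load_iris)]:
    X, y = loader(return_X_y=True)
    k = len(np.unique(y))
    algos = {
        "K-medoids": lambda X: kmedoids(X, k, rng=0)[1],
        "K-means++": lambda X: KMeans(n_clusters=k, init="k-means++",
                                      n_init=10).fit_predict(X),
        "Spectral clustering": lambda X: SpectralClustering(
                                      n_clusters=k).fit_predict(X),
        "Improved K-medoids": lambda X: kmedoids(
                                      X, k, medoids=seed_medoids(X, k, rng=0),
                                      rng=0)[1],
    }
    for algo, fit in algos.items():
        t0 = time.perf_counter()
        labels = fit(X)
        ms = (time.perf_counter() - t0) * 1000
        print(f"{name:13s} {algo:19s} ACC={accuracy(y, labels):.3f} "
              f"time={ms:.1f} ms")
```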
3.4.2. Comparison of Clustering Quality Evaluation Indicators. In order to determine the best K value, the CH index is introduced in this paper. In order to verify the applicability of the CH index for customer segmentation in the e-commerce industry, we apply it to the e-commerce dataset in practice. Furthermore, the result is compared with the inflection point method. The experimental result of the CH value is shown in Figure 1. The experimental result of the inflection point method is shown in Figure 2.

As can be seen from Figure 1, the line chart of the CH value first rises and then falls, and the highest CH value is obtained when the number of clusters is 4. Therefore, using the CH index, it can be clearly concluded that the optimal number of clusters for this e-commerce platform dataset is 4.

The principle of the inflection point method is to take the optimal number of clusters at the inflection point of the line graph, because continuing to increase the K value after the inflection point does not increase the classification accuracy much, but does increase the number of clusters. In Figure 2, the horizontal axis is the number of clusters, and the vertical axis is the sum of squares due to error (SSE). As can be seen in Figure 2, when the K value changes from 4 to 19, the line graph changes smoothly (i.e., there is no obvious inflection point from which to accurately determine the optimal number of clusters).

The above analysis shows that the CH index is better than the inflection point method in the segmentation of e-commerce customers.
Figure 1: Line chart of CH value (horizontal axis: number of clusters, 2–19; vertical axis: CH value).
Figure 2: Line chart of inflection point method (horizontal axis: number of clusters, 2–19; vertical axis: SSE).
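For reference, the two diagnostics in Figures 1 and 2 could be generated with a sketch like the following, which sweeps k from 2 to 19 and plots the CH value (via the ch_index sketch above) next to the SSE used by the inflection point method; the plotting details are our own and are not taken from the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

ks = range(2, 20)
ch_values, sse_values = [], []
for k in ks:
    medoids, labels = kmedoids(X, k, medoids=seed_medoids(X, k, rng=0), rng=0)
    ch_values.append(ch_index(X, labels))
    # SSE: sum of squared distances of samples to their cluster medoid.
    d = np.linalg.norm(X - X[medoids][labels], axis=1)
    sse_values.append((d ** 2).sum())

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(ks, ch_values, marker="o")   # Figure 1: pick k at the peak
ax1.set(xlabel="Number of clusters", ylabel="CH")
ax2.plot(ks, sse_values, marker="o")  # Figure 2: look for an elbow
ax2.set(xlabel="Number of clusters", ylabel="SSE")
plt.show()
```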
4. Empirical Analysis

4.1. Selecting Features for Customer Segmentation. The RFM model was first proposed by Hughes [10] and is generally used as an analysis tool to identify an organization's best customers. The RFM model is based on 3 factors: Recency (R), Frequency (F), and Monetary value (M). Recency (R) usually represents how recently a customer has made a purchase. The more recently a customer has made a purchase, the more likely he is to continue the relationship. Frequency (F) usually represents how often a customer makes a purchase within the observation period. A larger F-value means that the customer consumes more frequently and thus has higher customer value. Monetary (M) usually represents how much money a customer spends on purchases within the observation period. The larger the M-value, the higher the customer value. Since its introduction, the RFM model has been widely used in customer segmentation [29].

The traditional RFM model has been widely used for customer segmentation in various industries. However, there are still several problems. The RFM model cannot reflect the customer's activity on the e-commerce platform or the differences in consumption and behavior between different customer groups. With the development of big data technology, the dimensions of customer data extracted from e-commerce platforms are increasing, and these data reflect customers' value characteristics, consumption habits, and behavioral preferences in a more detailed and comprehensive way. Therefore, based on the traditional RFM model, we integrated customers' online behavioral indicators and proposed the RFMCV model for e-commerce customer segmentation, in which the C and V indicators reflect customers' activity and online consumption habits. Add to cart (C) represents the frequency with which a consumer has added a product to their shopping cart. Add favorites (V) represents the frequency with which a consumer has added a product to their favorites. Both of these behaviors represent the consumer's preference for a product. The higher the frequency, the more likely the consumer is to buy the product. The introduction of these two indicators into the RFM model can effectively improve the effectiveness of the RFM model for e-commerce customer segmentation [25].

4.2. Data Description. The customer consumption data in this paper is from the Kaggle database [37]. It contains 100,000 orders from multiple marketplaces in Brazil from 2016 to 2018. Many features are contained in this dataset, ranging from order status, price, payment, and freight performance to
customer location, product attributes, and reviews written by customers. The order and online behavior data of 37,376 customers were then extracted from this dataset. The consumption time is from November 18, 2017, to December 18, 2017. In order to segment e-commerce customers, we select 5 fields. The fields involved in this dataset and their descriptions are shown in Table 2.

4.3. Data Preprocessing

4.3.1. Data Cleaning. The behavioral data of these e-commerce customers over a month amounts to about 100,000 records, and data cleaning is needed. Firstly, data with missing and abnormal values are processed, such as data with zero expense, data with the purchase date as an idle value, and data with obviously wrong expenses. Secondly, duplicate data are processed. The user's purchase behavior is recorded to the hour. A small number of users repeatedly purchase or add favorites within an hour, so this kind of data is deduplicated. Finally, the consistency of the data is dealt with. The indicator R involves time features. The date and hour in the time data exist in one field, so the field is split into two fields. In addition, we convert the Timestamp field into year, month, and day form to facilitate the calculation of time intervals.
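As an illustration only (the paper does not publish its preprocessing script), the cleaning steps described above might look like this in pandas; the file name and the column names customer_unique_id, product_id, price, timestamp, and behavior_type are hypothetical.

```python
import pandas as pd

df = pd.read_csv("olist_behavior.csv")  # hypothetical export of the Kaggle data

# 1. Missing and abnormal values: drop rows with no expense or purchase
#    date, and rows with an obviously wrong (non-positive) expense.
df = df.dropna(subset=["price", "timestamp"])
df = df[df["price"] > 0]

# 2. Duplicates: behavior is recorded to the hour, so collapse repeated
#    purchases/favorites by the same user on the same product in one hour.
df["timestamp"] = pd.to_datetime(df["timestamp"])
df["hour_bucket"] = df["timestamp"].dt.floor("h")
df = df.drop_duplicates(
    subset=["customer_unique_id", "product_id", "behavior_type", "hour_bucket"])

# 3. Consistency: split the combined date-hour field into two fields and
#    keep a year-month-day date for computing time intervals.
df["date"] = df["timestamp"].dt.date
df["hour"] = df["timestamp"].dt.hour
```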
4.3.2. Indicator Extraction and Normalization. The individual indicators in the RFMCV model are explained in detail as follows:

R: recency: the time interval between the customer's last purchase in the observation period and 31 December 2017.
F: frequency of customer purchases in the observation period.
M: monetary: the amount spent by the customer in the observation period.
C: frequency with which the customer added products to the cart in the observation period.
V: frequency with which the customer added products to favorites in the observation period.

According to the RFMCV model proposed in this paper, 37,376 samples are collected, and some of them are shown in Table 3.
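Under the same hypothetical schema as the cleaning sketch above, the five indicators could be extracted per customer roughly as follows; the reference date of 31 December 2017 comes from the definition of R above, while everything else (column names, behavior-type codes) is assumed.

```python
import pandas as pd

REF_DATE = pd.Timestamp("2017-12-31")
buys = df[df["behavior_type"] == "purchase"]

rfmcv = pd.DataFrame({
    # R: days between the last purchase and the reference date.
    "R": (REF_DATE - buys.groupby("customer_unique_id")["timestamp"].max()).dt.days,
    # F: number of purchases in the observation period.
    "F": buys.groupby("customer_unique_id").size(),
    # M: total amount spent in the observation period.
    "M": buys.groupby("customer_unique_id")["price"].sum(),
    # C / V: add-to-cart and add-to-favorites frequencies.
    "C": df[df["behavior_type"] == "cart"].groupby("customer_unique_id").size(),
    "V": df[df["behavior_type"] == "favorite"].groupby("customer_unique_id").size(),
}).fillna(0)  # customers with no cart/favorite events get frequency 0
```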
In order to avoid the disparity caused by the different units of each indicator, the dataset needs to be normalized after indicator extraction and prior to experimental analysis. The Z-score normalization method is employed in this paper, which normalizes the data using the mean and standard deviation of the original data. The processed data follow the standard normal distribution (i.e., the mean value is 0 and the standard deviation is 1). The transformation function is

X* = (X − μ) / σ,  (4)

where μ is the mean of all samples and σ is the standard deviation of all samples.

After the normalization process, all data were converted to dimensionless data. Partial data is shown in Table 4.
Table 4: Partial data of the RFMCV model after normalization.

Customer_unique_id    R          F          M          C          V
5                     −0.000902  2.068466   −0.097700  −0.745080  −0.397498
18                    −0.623415  −0.018191  1.465430   1.340590   −0.390554
22                    1.247733   3.807347   0.597041   −0.390554  −0.390554
...                   ...        ...        ...        ...        ...
906311                −0.625219  −0.365967  −1.139736  −0.397498  −0.390554
906338                0.935574   1.025137   0.324119   0.167400   −0.390554
906355                0.311257   −0.677361  −0.097700  0.428109   −0.390554
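Equation (4) amounts to a one-liner per column; here is a short sketch applied to the rfmcv frame built above (scikit-learn's StandardScaler would be equivalent):

```python
# Z-score each indicator: X* = (X - mu) / sigma, per equation (4).
rfmcv_z = (rfmcv - rfmcv.mean()) / rfmcv.std(ddof=0)  # population std

# Sanity check: each column now has mean ~0 and standard deviation ~1.
assert rfmcv_z.mean().abs().max() < 1e-9

X = rfmcv_z.to_numpy()  # feature matrix fed to the improved K-medoids
```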
Figure 3: Distribution chart of four groups (horizontal axis: Type A–Type D customers; vertical axis: cluster center value; series: R, F, M, C, V).
4.4. Analysis of Empirical Results. According to the experimental results in Section 3.4.2, the optimal number of clusters k is 4. Based on the RFMCV model, the improved K-medoids algorithm is run. The results show that all customers are divided into 4 groups, named Type A, Type B, Type C, and Type D. The distribution of each indicator of the RFMCV model for the four customer types is shown in Figure 3.

Comparing the customer indicators of the 4 groups in Figure 3, some findings can be drawn.

The value of Type B customers is the highest. This group includes 13,415 customers, accounting for 35.89% of total e-commerce customers. The R-value of the Type B customers is smaller; their last purchase on this platform is more recent. The F-value is the highest, suggesting that their purchase frequency is high and that they are active customers on this e-commerce platform. The M-value is the biggest; they spend the most on this platform. The C-value is the biggest; they add to cart most frequently. However, the V-value is small, which shows that these customers tend to add to cart rather than add favorites when they find interesting products. This group has the highest current value and value-added potential and should be classified as the high-value customer group of this e-commerce platform. For this group, platform owners should put significant effort and resources into maintaining and developing good relationships with them. Effective measures should be taken to tap their consumption potential.

The second most valuable customer group is Type A, which includes 7,463 customers, accounting for 19.97% of total customers. The R-value of the Type A customers is smaller than that of Type B and Type D, and they made a purchase most recently. Both the F-value and M-value of Type A are the second biggest among the 4 groups. They are more active customers and spend more on this e-commerce platform. Different from Type B, the C-value of these customers is low, but the V-value is the highest among the four groups. It shows that these customers are used to adding favorites when they find interesting products. According to the above analysis, customers of Type A can be classified as the second most valuable group. These customers have great potential for value mining. The platform owners should hold promotional activities in order to stimulate their consumption potential.

The third customer group is Type D, which includes 14,340 customers, accounting for 38.37% of total e-commerce customers. These customers have the biggest R-value, indicating that they have not purchased goods from this platform for a long time. The F-value, M-value, C-value, and V-value are all small, indicating that this group of customers is inactive on this e-commerce platform. They do not frequently add favorites or add to cart on the platform. They can be classified as a low-value customer group. However, the number of customers in this group is big, and their consumption frequency is medium. It is necessary for platform owners to enhance the value of this group by pushing personalized products.

The fourth customer group is Type C, including 2,158 customers, accounting for 5.77% of total e-commerce customers. The R-value of this customer group is low, and the F-value is the smallest, indicating that this group has recently spent money on the platform, but the overall consumption frequency is low. The M-value, C-value, and V-value are the smallest; they are also inactive customers. Unlike the customers of Type D, they completed their last purchase very recently, so they are likely to be new customers. Special attention needs to be paid to them. It is important to understand their needs and develop good relationships with them.
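Tying the pieces together, the segmentation reported here could be reproduced in outline as follows. This is our sketch, not the authors' code: cluster numbering, and hence the A–D naming, depends on initialization and is assigned by inspecting the centers against Figure 3.

```python
import pandas as pd

k = 4  # optimal number of clusters from the CH index (Figure 1)
medoids, labels = kmedoids(X, k, medoids=seed_medoids(X, k, rng=0), rng=0)

# Inspect the cluster centers on the five indicators, as in Figure 3:
# high F/M/C -> Type B, high V -> Type A, high R -> Type D, low F -> Type C.
centers = pd.DataFrame(X[medoids], columns=["R", "F", "M", "C", "V"])
sizes = pd.Series(labels).value_counts().sort_index()
print(centers.assign(size=sizes.values, share=(sizes / len(X)).round(4).values))
```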
[3] W.-Y. Chiang, "Establishing high value markets for data-driven customer relationship management systems," Kybernetes, vol. 48, no. 3, pp. 650–662, 2019.
[4] E. Umuhoza, D. Ntirushwamaboko, J. Awuah, and B. Birir, "Using unsupervised machine learning techniques for behavioral-based credit card users segmentation in Africa," SAIEE Africa Research Journal, vol. 111, no. 3, pp. 95–101, 2020.
[5] Y. Deng and Q. Gao, "A study on E-commerce customer segmentation management based on improved K-means algorithm," Information Systems and e-Business Management, vol. 18, no. 4, pp. 497–510, 2018.
[6] H. Güçdemir and H. Selim, "Corrigendum to "Integrating simulation modelling and multi criteria decision making for customer focused scheduling in job shops" [Simulation Modelling Practice and Theory 88 (2018) 17-31]," Simulation Modelling Practice and Theory, vol. 100, Article ID 101990, 2020.
[7] G. Sun, X. F. Xie, J. Y. B. Zeng et al., "Using improved RFM model to classify consumer in big data environment," International Journal of Embedded Systems, vol. 14, no. 1, pp. 54–64, 2020.
[8] Q. S. Wang, X. Yang, P. J. Song, and C. L. Sia, "Consumer segmentation analysis of multichannel and multistage consumption: a latent class MNL approach," Journal of Electronic Commerce Research, vol. 15, no. 4, pp. 339–358, 2014.
[9] R. Punhani, V. P. S. Arora, A. Sai Sabitha, and V. K. Shukla, "Segmenting E-commerce customer through data mining techniques," Journal of Physics: Conference Series, vol. 1714, no. 1, Article ID 012026, 2021.
[10] A. M. Hughes, Strategic Database Marketing, Probus Publishing Company, New York, NY, USA, 1994.
[11] C. Hennig and T. F. Liao, "How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification," Journal of the Royal Statistical Society: Series C (Applied Statistics), vol. 62, no. 3, pp. 309–369, 2013.
[12] L. B. Romdhane, N. Fadhel, and B. Ayeb, "An efficient approach for building customer profiles from business data," Expert Systems with Applications, vol. 37, no. 2, pp. 1573–1585, 2010.
[13] P. B. Chou, E. Grossman, D. Gunopulos, and P. Kamesam, "Identifying prospective customers," in Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '00), pp. 447–456, Boston, MA, USA, August 2000.
[14] W. Lan, "The impact of perception difference on channel conflict: a customer relationship life cycle view," Journal of Service Science and Management, vol. 8, no. 5, pp. 655–661, 2015.
[15] W. Buckinx, G. Verstraeten, and D. Van den Poel, "Predicting customer loyalty using the internal transactional database," Expert Systems with Applications, vol. 32, no. 1, pp. 125–134, 2007.
[16] C. Martin, P. Adrian, and B. David, Relationship Marketing, Butterworth-Heinemann Ltd, London, UK, 1998.
[17] S. Peker, A. Kocyigit, and P. E. Eren, "LRFMP model for customer segmentation in the grocery retail industry: a case study," Marketing Intelligence & Planning, vol. 35, no. 4, pp. 544–559, 2017.
[18] Q. Zhang, A. R. Abdullah, C. W. Chong, and M. H. Ali, "E-commerce information system management based on data mining and neural network algorithms," Computational Intelligence and Neuroscience, vol. 2022, Article ID 1499801, 11 pages, 2022.
[19] P. A. Sarvari, A. Ustundag, and H. Takci, "Performance evaluation of different customer segmentation approaches based on RFM and demographics analysis," Kybernetes, vol. 45, no. 7, pp. 1129–1157, 2016.
[20] M. Song, X. Zhao, H. E, and Z. Ou, "Statistics-based CRM approach via time series segmenting RFM on large scale data," Knowledge-Based Systems, vol. 132, pp. 21–29, 2017.
[21] W.-Y. Chiang, "To mine association rules of customer values via a data mining procedure with improved model: an empirical case study," Expert Systems with Applications, vol. 38, no. 3, pp. 1716–1722, 2011.
[22] B. Zhao, W. Li, Q. Guo, and R. Song, "E-commerce picture text recognition information system based on deep learning," Computational Intelligence and Neuroscience, vol. 2022, Article ID 9474245, 11 pages, 2022.
[23] H. Li, X. Yang, Y. Xia, L. Zheng, G. Yang, and P. Lv, "K-LRFMD: method of customer value segmentation in shared transportation filed based on improved K-means algorithm," Journal of Physics: Conference Series, vol. 1060, no. 1, Article ID 012012, 2018.
[24] Z. Wu, C. Zhou, F. Xu, and W. Lou, "A CS-AdaBoost-BP model for product quality inspection," Annals of Operations Research, vol. 308, no. 1-2, pp. 685–701, 2020.
[25] F. Yoseph, N. H. Ahamed Hassain Malim, M. Heikkilä, A. Brezulianu, O. Geman, and N. A. Paskhal Rostam, "The impact of big data market segmentation using data mining and clustering techniques," Journal of Intelligent & Fuzzy Systems, vol. 38, no. 5, pp. 6159–6173, 2020.
[26] J. Deng, J. Guo, and Y. Wang, "A novel K-medoids clustering recommendation algorithm based on probability distribution for collaborative filtering," Knowledge-Based Systems, vol. 175, no. 1, pp. 96–106, 2019.
[27] H.-S. Park and C.-H. Jun, "A simple and fast algorithm for K-medoids clustering," Expert Systems with Applications, vol. 36, no. 2, pp. 3336–3341, 2009.
[28] D. Ho-Kieu, T. Vo-Van, and T. Nguyen-Trang, "Clustering for probability density functions by new k-medoids method," Scientific Programming, vol. 2018, Article ID 2764016, 7 pages, 2018.
[29] R. Liu, H. Wang, and X. Yu, "Shared-nearest-neighbor-based clustering by fast search and find of density peaks," Information Sciences, vol. 450, no. 1, pp. 200–226, 2018.
[30] G. Surya Narayana and D. Vasumathi, "An attributes similarity-based K-medoids clustering technique in data mining," Arabian Journal for Science and Engineering, vol. 43, no. 8, pp. 3979–3992, 2018.
[31] Z. Pooranian, M. Shojafar, J. H. Abawajy, and A. Abraham, "An efficient meta-heuristic algorithm for grid computing," Journal of Combinatorial Optimization, vol. 30, no. 3, pp. 413–434, 2015.
[32] D. Arthur and S. Vassilvitskii, "K-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, New Orleans, LA, USA, January 2007.
[33] M. J. Brusco, D. Steinley, and J. Stevens, "K-medoids inverse regression," Communications in Statistics - Theory and Methods, vol. 48, no. 20, pp. 4999–5011, 2019.
[34] T. Y. Kim, S. Kim, J. A. Kim et al., "Automatic identification of Java method naming patterns using cascade K-medoids," KSII Transactions on Internet and Information Systems, vol. 12, no. 2, pp. 873–891, 2018.