Yeh 2009
Abstract

The objective of this paper is to introduce a comprehensive methodology to discover the knowledge for selecting targets for direct marketing from a database. This study expands the RFM model by including two parameters, time since first purchase and churn probability. Using the Bernoulli sequence in probability theory, we derive formulas that estimate the probability that a customer will buy at the next time, and the expected value of the total number of times that the customer will buy in the future. This study also proposes a methodology to estimate the unknown parameters in the formulas. This methodology leads to more efficient and accurate selection procedures than the existing ones. In the empirical part we examine a case study, a blood transfusion service, to show that our methodology has greater predictive accuracy than traditional RFM approaches.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: Knowledge discovery; RFM model; Marketing; Bernoulli sequence
1. Introduction

Since the early 1980s, the concept of customer relationship management (CRM) has gained importance in marketing. Acquiring and retaining the most profitable customers are serious concerns of a company that wants to perform more targeted marketing campaigns (Bult & Wansbeek, 1995; Hughes, 1994; Hwang, Jung, & Suh, 2004; Kahan, 1998; Malthouse & Blattberg, 2004; Schmittlein, Morrison, & Colombo, 1987; Stone, 1995). For effective customer relationship management, it is important to gather information on customer value. The most powerful and simplest model to implement CRM may be the RFM model: Recency, Frequency, and Monetary value. RFM is a behavior-based model, meaning it is used to analyze the behavior a customer is engaging in, and to make predictions based on this behavior (Colombo & Jiang, 1999; Hughes, 1996).

RFM has a corollary: customers who have purchased or visited more recently, more frequently, or created higher monetary values are much more likely to respond to your marketing efforts, compared with other customers who are less recent, less frequent, and create less monetary value.

A classic RFM implementation ranks each customer on the recency, frequency, and monetary value parameters against all the other customers, and creates an RFM "score" for each customer. Customers with high scores are usually the most profitable, the most likely to repeat a behavior (visit or purchase, for example), and the most highly responsive to promotions. The opposite is true for customers with low RFM scores. Moreover, some researchers have pointed out that a customer with high recency, frequency, and monetary value who stops visiting is a customer who is finding alternatives to the old site.

The advantage of the RFM model is that its concept is extraordinarily intuitive, its implementation is very simple, and its computation is rather easy. Marketing personnel can analyze customers without the assistance of professional computer information systems, so it has already been used for a long time in business circles. But the RFM model has some shortcomings (Colombo & Jiang, 1999; Hughes, 1996; Miglautsch, 2000):

(1) The RFM model is not a precise quantitative predictive model; using customers' historical trading data, it appraises customers' value based on naive methods and subjective judgment, such as five-partition score systems.
(2) Each RFM parameter has different importance in different industries. For example, parameter R may have very good evaluation ability in some industries, while parameters F and M may have better evaluation ability in other industries. The RFM model is unable to integrate these three parameters into a single precise quantitative evaluation index.
(3) While RFM analysis is popular among marketing personnel, ad-hoc rules are often employed to judge whether customers are still active or not. Because customers do not declare explicitly when they have found alternative sites to the old one, a company infers that a customer has left its business if he/she did not make any purchase, for example, for over three months.

* Corresponding author. Tel.: +886 3 5186511.
E-mail addresses: [email protected] (I.-C. Yeh), [email protected] (K.-J. Yang), [email protected] (T.-M. Ting).
1 Tel.: +886 3 5186388.
2 Tel.: +886 3 5186388/3412500x1551.
0957-4174/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2008.07.018
I.-C. Yeh et al. / Expert Systems with Applications 36 (2009) 5866–5871 5867
Nomenclature

$b$: difference between the period when a customer remains active and the period from the first response and the most recent response
$B_k$: binary variable for the kth sorted customer, $B_k = 1$ means this customer took response, $B_k = 0$ means this customer did not take response
$E(X_L)$: expected value of times that a customer will respond in the next L marketing campaigns
$E_i(X_L \mid L = 1)$: expected value of times that the customer will respond at the next marketing campaign
$F$: number of responses in the period when the customer is active
$G$: period when a customer remains active
$g$: ratio of the response probability of the current campaign and the average of response probabilities of previous campaigns
$M$: monetary value, total monetary amount that a customer purchases up to now
$m$: number of data for moving average
$n$: times of recent consecutive marketing campaigns that a customer had no response
$P$: response probability, the probability that a specific customer respond to a marketing campaign
$P_{B,i}$: actual value of response probability for the customer in the next marketing campaign
$Q$: churn probability, the probability that a customer discontinue his/her use of a service for ever after a marketing campaign
$R$: recency, time since the most recent response for a customer
$S$: number of marketing campaigns in the period when the customer is active
$s$: period per marketing campaign
$T$: time since the first response for a customer
$X_L$: number of times that a customer will respond in the next L marketing campaigns
$Y$: status variable; $Y = 0$ denotes a customer was still active under the condition that this customer had no response for n consecutive marketing campaigns. $Y = 1$ denotes a customer had been inactive under the condition that this customer had no response for n consecutive marketing campaigns
To address the above-mentioned shortcomings, and to enable the model to be adjusted based on the data in the marketing databases of different industries, this study derives an augmented RFM model, called the RFMTC model (Recency, Frequency, Monetary value, Time since first purchase, and Churn probability), using the Bernoulli sequence in probability theory. The model can automatically build the formula that predicts the probability that a customer will buy at the next time, and the expected value of the total number of times that the customer will buy in the future. The main contribution of this study is to develop predictive formulas based on probability theory instead of appraising customers with naive methods, such as the RFM score method.

In this paper, Section 2 derives the model to estimate the probability that a customer will buy at the next time, and the expected value of the total number of times that the customer will buy in the future. Section 3 proposes the method to estimate the parameters in the model. Section 4 examines the reliability of the above-mentioned theory using a real case study. Section 5 gives the conclusion.

2. RFMTC marketing model

In order to obtain the mathematically theoretical solution of the RFMTC marketing model, we set the following hypotheses:

Hypothesis 1. The probability that a specific customer responds to a marketing campaign is the constant $P$, called the response probability.

Hypothesis 2. A customer having a response to a marketing campaign means that he/she is still active, i.e., the probability that this customer is still active is one.

Hypothesis 3. A customer having no response to a marketing campaign means that the probability that this customer is still active is $(1-Q)^n$, where $Q$ is the churn probability, and $n$ is the number of recent consecutive marketing campaigns to which the customer had no response. Churn probability is defined as the probability that a customer discontinues his/her use of a service forever after a marketing campaign.

Theorem 1. Under the condition that a customer had no response for n consecutive marketing campaigns, the expectation of times $X_L$ that this customer will respond in the next L marketing campaigns is

$$E(X_L) = (1-Q)^n P \left[ \sum_{k=0}^{L-1} k (1-Q)^k Q + L (1-Q)^L \right] \tag{1}$$

Proof. Let

$Y = 0$ denote that a customer was still active under the condition that this customer had no response for n consecutive marketing campaigns, and
$Y = 1$ denote that a customer had been inactive under the condition that this customer had no response for n consecutive marketing campaigns.

Then

$$\begin{aligned}
E(X_L) &= E(X_L \mid Y=0)\,\Pr(Y=0) + E(X_L \mid Y=1)\,\Pr(Y=1) \\
&= E(X_L \mid Y=0)\,\Pr(Y=0) + E(X_L \mid Y=1)\,(1 - \Pr(Y=0)) \\
&= E(X_L \mid Y=0)\,(1-Q)^n + 0 \cdot \left[1 - (1-Q)^n\right] \\
&= E(X_L \mid Y=0)\,(1-Q)^n
\end{aligned} \tag{2}$$

In the next L marketing campaigns, a customer may become inactive at the ith marketing campaign, i = 1, 2, 3, ..., L, or may be still active. Let

$C_i$ denote the event that a customer became inactive at the ith marketing campaign, i = 1, 2, 3, ..., L, and
$C_{L+1}$ denote the event that a customer was still active after the next L consecutive marketing campaigns.

When a customer became inactive at the kth marketing campaign, he/she would face the first $k - 1$ marketing campaigns. Since the response probability of each marketing campaign for a customer is $P$, assuming the process can be described as a Bernoulli
$$\Pr(C_{L+1}) = (1-Q)^L \tag{5}$$

Table 1 shows the probabilities of event $C_k$ and the expectations $E(X_L \mid Y = 0, C_k)$, where k = 1, 2, 3, ..., L, L + 1.

Table 1
Probabilities of $C_k$ and expectations $E(X_L \mid Y = 0, C_k)$

Event        Pr(C_k)              E(X_L | Y = 0, C_k)
C_1          Q                    0P
C_2          (1 - Q)^1 Q          1P
C_3          (1 - Q)^2 Q          2P
C_4          (1 - Q)^3 Q          3P
:            :                    :
C_{L-1}      (1 - Q)^{L-2} Q      (L - 2)P
C_L          (1 - Q)^{L-1} Q      (L - 1)P
C_{L+1}      (1 - Q)^L            LP

Therefore

$$\begin{aligned}
E(X_L \mid Y=0) &= \sum_{k=1}^{L+1} \Pr(C_k)\, E(X_L \mid Y=0, C_k) \\
&= 0 \cdot Q + P(1-Q)Q + 2P(1-Q)^2 Q + \cdots + (L-1)P(1-Q)^{L-1} Q + LP(1-Q)^L \\
&= \left[ \sum_{k=0}^{L-1} kP(1-Q)^k Q \right] + LP(1-Q)^L
\end{aligned} \tag{6}$$

Proof. The result comes from (1) with L = 1. □

Consequence 1.2. Under the same condition as Theorem 1, the expectation of times $X_L$ that this customer will respond in all the following marketing campaigns is

$$E(X_L \mid L = \infty) = \frac{P(1-Q)}{Q}(1-Q)^n = \frac{P}{Q}(1-Q)^{n+1} \tag{9}$$

Proof. By (6) and letting $L \to \infty$, we have

$$\sum_{k=1}^{\infty} (1-Q)^k = \frac{1-Q}{1-(1-Q)} = \frac{1-Q}{Q} \tag{12}$$

Substituting (12) into (10), we obtain

$$\begin{aligned}
E(X_L \mid Y=0) &= -PQ(1-Q)\,\frac{d}{dQ}\left(\frac{1-Q}{Q}\right) \\
&= -PQ(1-Q)\,\frac{d}{dQ}\left(\frac{1}{Q} - 1\right) \\
&= -PQ(1-Q)\left(-\frac{1}{Q^2}\right) = \frac{P(1-Q)}{Q}
\end{aligned} \tag{13}$$

Substituting (13) into (1), we get

Hypothesis 1. The probability $P$ for a customer responding to a marketing campaign could be estimated by

$$P = g\,\frac{F}{S} \tag{15}$$

where

$S$ is the number of marketing campaigns in the period when the customer is active,
$F$ is the number of responses in the period when the customer is active, and
$g$ is the ratio of the response probability of the current campaign and the average of response probabilities of previous campaigns.

Hypothesis 2. The number of marketing campaigns $S$ in the period when the customer is active could be estimated by

$$S = \frac{G}{s} \tag{16}$$

where
$s$ is the period per marketing campaign, and
$G$ is the period when a customer remains active. The exact value of $G$ is unknown; however, it must be greater than $T - R$, where $T$ is the time since first response, and $R$ is the time since the most recent response. Hence assume $G$ can be estimated by

$$G = (T - R) + b \tag{17}$$

Substituting (17) into (16), we have

$$S = (T - R + b)/s \tag{18}$$

Assuming the period per marketing campaign is one day (or one week, one month) and all time variables in this model are measured in the same unit, we may let $s = 1$ in order to simplify the formulas. Therefore, by (18), we have

$$S = (T - R + b)/1 = T - R + b \tag{19}$$

Hypothesis 3. Assume that $n$ is the number of recent consecutive marketing campaigns to which this customer had no response, and that it could be estimated by

$$n = \frac{R}{s} \tag{20}$$

where $R$ is the time since the most recent response for a customer. If we assume that $s = 1$, then

$$n = R \tag{21}$$

Theorem 2. Under the same condition as Theorem 1, the expectation of times $X_L$ that this customer will respond at the next marketing campaign is

$$E(X_L \mid L = 1) = (1-Q)^{R+1}\, g\, \frac{F}{T - R + b} \tag{22}$$

Proof. Substituting (15), (19), and (21) into (8), the result is obtained.

In (22), $R$, $F$ and $T$ are known, but $Q$, $g$, $b$ are unknown and must be estimated.

3.2. Method of parameter estimation

In order to estimate parameters $Q$, $g$, $b$, we present the following model:

$$\text{Find } Q,\ g,\ b \tag{23a}$$

$$\text{Min } \sum_i \left[ E_i(X_L \mid L = 1) - P_{B,i} \right]^2 \tag{23b}$$

where

$P_{B,i}$ is the actual value of response probability for the customer in the next marketing campaign, and
$E_i(X_L \mid L = 1)$ is the expected value of times $X_L$ that the customer will respond at the next marketing campaign estimated by Eq. (22). Because the maximum value of it is one, it can be regarded as the expected value of response probability.

Supposing that the actual value of response probability can be provided by the marketing database, the optimal estimated parameter values can be found by Eq. (23b). However, in the marketing database a customer will either respond or not respond in the marketing campaign, thus $P_{B,i}$ is unknown. We propose the novel "Sorting Moving Average Method" (SMA) to estimate it:

2. Sort customer data by the value $E_i(X_L \mid L = 1)$ from large to small.
3. Use the moving average to estimate the actual value of response probability by

$$P_{B,i} = \frac{\sum_k B_k}{2m + 1}, \quad k = i - m,\ i - m + 1,\ \ldots,\ i - 1,\ i,\ i + 1,\ \ldots,\ i + m - 1,\ i + m \tag{24}$$

where

$B_k$ is the binary variable for the kth sorted customer, $B_k = 1$ means this customer took response, $B_k = 0$ means this customer did not take response, and
$m$ is the number of data for moving average. In (24), we center at the ith sorted customer and use the 2m samples around this customer to estimate $P_{B,i}$.

After the above steps, if the assumed values of $Q$, $g$, $b$ make (23b) reach its minimum, then they are the optimal estimates of the parameter values.

4. Case studies: blood transfusion service

To demonstrate the RFMTC marketing model, this study adopted the donor database of the Blood Transfusion Service Center in Hsin-Chu City in Taiwan. The center sends its blood transfusion service bus to one university in Hsin-Chu City to gather donated blood about every three months. To build an RFMTC model, we selected 748 donors at random from the donor database. Each of these 748 donor records included R (Recency: months since last donation), F (Frequency: total number of donations), M (Monetary: total blood donated in c.c.), T (Time: months since first donation), and a binary variable representing whether he/she donated blood in March 2007 (1 stands for donating blood; 0 stands for not donating blood). Table 2 shows the descriptive statistics of the data. We selected 500 records at random as the training set, and the remaining 248 as the testing set.

Because formula (23b) is a three-variable optimization problem, it is easy to solve by numerical optimization methods. To improve the efficiency of the solving process, it is necessary to first estimate a rational range for the three parameters $Q$, $g$, $b$:

A greater $Q$ means that the customer churn probability is high. $0 \le Q \le 1$. According to general marketing experience, when the period per marketing campaign is one month, it may be between 0.01 and 0.2.

A greater $g$ stands for the prevalence of the current marketing campaign compared to a common one. If it is greater than 1, this campaign is superior to an average one. Its value domain is assumed to be in the range between 0.25 and 4.0.

$b$ stands for the difference between the period when a customer remains active and the period from the first response to the most recent response. A reasonable guess is that $b$ is about the average period per response; hence, its value domain is assumed to be in the range as follows

$$0.25 \left( \frac{T - R}{F - 1} \right)_{Avg} < b < 4 \left( \frac{T - R}{F - 1} \right)_{Avg} \tag{25}$$

Note that when we estimate the value domain of parameter $b$ by (25), only the data whose frequency $F \ge 2$ are adopted. In this study
hence

$$1.96 < b < 31.4 \tag{27}$$

Using a numerical optimization method, based on the above-mentioned value ranges, the best-fitting parameters are shown in Table 3. No parameter reached its limits, which suggests that the limits of these parameters are reasonable.

Table 2
Descriptive statistics of the data

Variable                 Minimum   Maximum   Mean      Standard deviation
Recency (months)         0.03      74.4      9.74      8.07
Frequency (times)        1         50        5.51      5.84
Monetary (c.c. blood)    250       12500     1378.68   1459.83
Time (months)            2.27      98.3      34.42     24.32

Table 3
Estimated parameter values

Parameter   Estimated value domain   Estimated value
Q           0.01–0.2                 0.111
g           0.25–4                   3.72
b           1.96–31.4                9.52

To compare the performance of the RFMTC and RFM models, a lift chart was adopted. In the lift chart, the horizontal axis represents the ranking of the data according to the predictive model, and the vertical axis shows the cumulative number of responses. There are two curves, the model curve and the baseline curve, in the lift chart.

The RFM model is implemented based on the following scoring systems:

R (months) score: 0–2 months = 5, 3 months = 4, 4–10 months = 3, 11–15 months = 2, 16 months and up = 1;
F (times) score: 1 time = 1, 2 times = 2, 3–4 times = 3, 5–7 times = 4, 8 times and up = 5;
M (c.c. blood) score: 250 c.c. = 1, 500 c.c. = 2, 750–1000 c.c. = 3, 1250–1750 c.c. = 4, 2000 c.c. and up = 5;
RFM score = R score + F score + M score.
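The two ranking scores compared in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `rfm_score` and `rfmtc_score` are hypothetical, the treatment of each scoring range as "up to and including its upper edge" is an assumption (the scoring list does not say how fractional months are binned), and the default parameter values are simply the estimates reported in Table 3 plugged into Eq. (22).

```python
from bisect import bisect_left

# Upper bin edges transcribed from the RFM scoring list. Treating each range
# as "up to and including" its upper edge is an assumption of this sketch.
R_EDGES, R_SCORES = [2, 3, 10, 15], [5, 4, 3, 2, 1]
F_EDGES, F_SCORES = [1, 2, 4, 7], [1, 2, 3, 4, 5]
M_EDGES, M_SCORES = [250, 500, 1000, 1750], [1, 2, 3, 4, 5]


def _bin_score(value, edges, scores):
    # bisect_left gives the index of the first edge that is >= value,
    # which is also the index of the bin the value falls into.
    return scores[bisect_left(edges, value)]


def rfm_score(r_months, f_times, m_cc):
    """Baseline score: R score + F score + M score."""
    return (_bin_score(r_months, R_EDGES, R_SCORES)
            + _bin_score(f_times, F_EDGES, F_SCORES)
            + _bin_score(m_cc, M_EDGES, M_SCORES))


def rfmtc_score(r, f, t, q=0.111, g=3.72, b=9.52):
    """Eq. (22): (1 - Q)^(R + 1) * g * F / (T - R + b).

    The defaults are the Table 3 estimates; other data sets would need
    their own estimates of Q, g and b.
    """
    return (1.0 - q) ** (r + 1) * g * f / (t - r + b)


if __name__ == "__main__":
    # Hypothetical donor: last donation 4 months ago, 4 donations,
    # 1000 c.c. in total, first donation 20 months ago.
    print("RFM score:", rfm_score(4, 4, 1000))
    print("RFMTC score:", round(rfmtc_score(4, 4, 20), 4))
```

Ranking donors by either score, from high to low, gives the ordering used on the horizontal axis of the lift chart.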
The lift charts obtained with the RFMTC model and with the RFM model are shown in Fig. 1 for the training set and Fig. 2 for the testing set. The RFMTC model curve is clearly above the RFM model curve in both the training set and the testing set, which shows that the RFMTC model is better than the RFM model at ranking targets from the population of the direct marketing campaign.
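The cumulative-response curves plotted in a lift chart can be computed along the following lines. This is a minimal sketch under stated assumptions: `lift_curve` and `baseline_curve` are hypothetical helper names, the toy scores and responses are invented for illustration, and the baseline is assumed to be the straight line obtained when responses are spread evenly over the ranking, since the text does not define the baseline curve explicitly.

```python
from itertools import accumulate


def lift_curve(scores, responses):
    """Cumulative number of responses when customers are ranked by score,
    highest score first. `responses` holds 0/1 flags aligned with `scores`."""
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)
    return list(accumulate(resp for _, resp in ranked))


def baseline_curve(responses):
    """Expected cumulative responses under a random ranking (assumed baseline)."""
    n, total = len(responses), sum(responses)
    return [total * (k + 1) / n for k in range(n)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [0.9, 0.1, 0.6, 0.3, 0.8]
    responses = [1, 0, 1, 0, 0]

    model = lift_curve(scores, responses)
    baseline = baseline_curve(responses)
    for rank, (m, b) in enumerate(zip(model, baseline), start=1):
        print(f"rank {rank}: model={m}, baseline={b:.1f}")
```

A model curve that stays above the baseline, and above the curve of a competing model, means that more of the actual responders appear early in that model's ranking.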
Fig. 1. Lift chart of RFMTC model and RFM model for training set. (Horizontal axis: ranking of the data according to model; vertical axis: number of cumulative response; curves: baseline, RFMTC, RFM.)

5. Conclusions

The most important issue for direct marketers is how to sample targets from a population for a direct marketing campaign. Although some selection methods are described in the literature, few works discuss the analytical and statistical aspects. The RFM model is a growing area of marketing practice, yet the academic journals contain very little research on this topic based on sound mathematical theory.

The objective of this paper is to introduce a comprehensive methodology for selecting targets for direct marketing from a database. This study expanded the RFM model to the RFMTC model by including two parameters, time since first purchase and churn probability. Using the Bernoulli sequence in probability theory, we derive formulas that can predict the probability that a customer will buy at the next time, and the expected value of the total number of times that the customer will buy in the future.

This study also proposed the methodology to estimate the un-
(3) The RFM model must adjust the weights of the R, F, and M parameters according to different industries, but this kind of adjustment lacks a systematic method. The RFMTC model can build the optimum predictive model automatically based on the data of the marketing databases of different industries, and avoids the problem of adjusting the weights of parameters in a trial-and-error approach.
(4) The RFM model needs to segment the customers into different groups in order to confirm the response rate of each group. The RFMTC model does not need to segment the customers into different groups, and uses a single customer group to confirm the response rate of each customer, so the necessary quantity of trials on customers can be greatly reduced.

References

Bult, J. R., & Wansbeek, T. (1995). Optimal selection for direct mail. Marketing Science, 14(4), 378–381.
Colombo, R., & Jiang, W. (1999). A stochastic RFM model. Journal of Interactive Marketing, 13(3), 2–12.
Hughes, A. M. (1994). Strategic database marketing. IL: Probus Publishing Company.
Hughes, A. M. (1996). Boosting response with RFM. American Demographics, 5, 4–9.
Hwang, H., Jung, T., & Suh, E. (2004). An LTV model and customer segmentation based on customer value: A case study on the wireless telecommunication industry. Expert Systems with Applications, 26(2), 181–188.
Kahan, R. (1998). Using database marketing techniques to enhance your one-to-one marketing initiatives. Journal of Consumer Marketing, 15(5), 491–493.
Malthouse, E. C., & Blattberg, R. C. (2004). Can we predict customer lifetime value? Journal of Interactive Marketing, 19(1), 2–16.
Miglautsch, J. (2000). Thoughts on RFM scoring. Journal of Database Marketing, 8(1), 67–72.
Schmittlein, D. C., Morrison, D. G., & Colombo, R. (1987). Counting your customers: Who are they and what will they do next? Management Science, 33(1), 1–24.
Stone, B. (1995). Successful direct marketing methods (Vol. 3). IL: NTC Business Books.