
Decision Support Systems 54 (2012) 443–451

Contents lists available at SciVerse ScienceDirect

Decision Support Systems


journal homepage: www.elsevier.com/locate/dss

Direct marketing decision support through predictive customer response modeling


David L. Olson a, Bongsug (Kevin) Chae b,⁎

a Department of Management, University of Nebraska, Lincoln, NE 68588-0491, United States
b Department of Management, Kansas State University, Manhattan, KS 66506, United States

⁎ Corresponding author. E-mail addresses: [email protected] (D.L. Olson), [email protected] (B.(K.) Chae).

Article history:
Received 8 June 2011
Received in revised form 12 May 2012
Accepted 19 June 2012
Available online 3 July 2012

Keywords:
Customer response predictive model
Knowledge-based marketing
RFM
Neural networks
Decision tree models
Logistic regression

Abstract

Decision support techniques and models for marketing decisions are critical to retail success. Among different marketing domains, customer segmentation or profiling is recognized as an important area in research and industry practice. Various data mining techniques can be useful for efficient customer segmentation and targeted marketing. One such technique is the RFM method; recency, frequency, and monetary measures provide a simple means to categorize retail customers. We identify two sets of data involving catalog sales and donor contributions. Variants of RFM-based predictive models are constructed and compared to classical data mining techniques of logistic regression, decision trees, and neural networks, and the spectrum of tradeoffs is analyzed. RFM methods are simpler, but less accurate. The effects of balancing cells and of the value function are examined, and classical data mining algorithms (decision tree, logistic regression, neural networks) are also applied to the data. Both balancing expected cell densities and compressing the RFM variables into a value function were found to provide models similar in accuracy to the basic RFM model, with slight improvement obtained by increasing the cutoff rate for classification. Classical data mining algorithms were found to yield better prediction, as expected, in terms of both prediction accuracy and cumulative gains. Relative tradeoffs among these data mining algorithms in the context of customer segmentation are presented. Finally, we discuss practical implications based on the empirical results.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

The role of decision support techniques and models for marketing decisions has been important since the inception of decision support systems (DSSs) [25]. Diverse techniques and models (e.g., optimization, knowledge-based systems, simulation) have emerged over the last five decades. Many marketing domains, including pricing, new product development, and advertising, have benefited from these techniques and models [16]. Among these marketing domains, customer segmentation or profiling is recognized as an important area [18,19,26,43]. There are at least two reasons for this. First, the marketing paradigm is becoming customer-centric [41], making targeted marketing and service appropriate. Second, unsolicited marketing is costly and ineffective (e.g., low response rate) [15,30]. Along with these reasons, there are increasing efforts to collect and analyze customer data for better marketing decisions [9,26,30]. The advancement of online shopping technologies and database systems has accelerated this trend.

Data mining has been a valuable tool in this regard. Various data mining techniques, including statistical analysis and machine learning algorithms, can be useful for efficient customer segmentation and targeted marketing [4,26,38]. One such technique is RFM, standing for recency, frequency, and monetary. RFM analysis has been used for marketing decisions for a long time and is recognized as a useful data mining technique for customer segmentation and response models [3,30]. A survey [43] also shows that RFM is among the most popular segmentation and predictive modeling techniques used by marketers. RFM relies on three customer behavioral variables (how long since the last purchase by the customer, how often the customer purchases, how much the customer has bought) to find valuable customers or donors and develop future direct marketing campaigns. Having a reliable and accurate customer response model is critical for marketing success, since an increase or decrease in accuracy of 1% could have a significant impact on profits [1]. While there could be many other customer-related factors [e.g., 42], previous studies have shown that RFM alone can offer a powerful way of predicting future customer purchases [1,3,17].

Our research builds customer response models using RFM variables and compares them in terms of customer gains and prediction accuracy. The paper aims to increase understanding of how to find knowledge hidden in customer and transactional databases using data mining techniques. This area is called knowledge-based marketing [26]. The next section briefly reviews various data mining techniques for building customer response or predictive models. Section 3 describes the methodology. All the response models are built upon the three RFM variables, while different data mining techniques are used. Then, we present a research design, including two direct marketing data sets with over 100,000 observations, a process of predictive model building, and methods to measure the performance of models. Section 4 includes analysis and results. There could be different methods to increase the prediction performance of an RFM-based predictive model, and sophisticated data mining techniques (decision tree, logistic regression, and neural networks) appear to outperform more traditional RFM. These findings are further discussed in Section 5, comparing results with previous studies of customer response models and in the broader context of knowledge-based marketing. We also discuss practical implications from the findings and offer conclusions.

The contribution of this study is to demonstrate how RFM model variants can work, and to support the general conclusion consistently reported by others that RFM models are inferior to traditional data mining models. This study shows that RFM variables are very useful inputs for designing various customer response models with different strengths and weaknesses, and that the models relying on classical data mining (or predictive modeling) techniques can significantly improve prediction capability in direct marketing decisions. These predictive models using RFM variables are simpler and easier to use in practice than those with a complex set of variables. Thus, besides the descriptive modeling techniques popular in practice [43], marketers should adopt these advanced predictive models in their direct marketing decisions.


2. Customer response models using data mining techniques

2.1. Marketing DSS and customer response models

The use of DSS in marketing goes back to the 1960s and 1970s [22,44] and has been applied in various areas, including marketing strategy, pricing, new product development, and product analysis and management [16]. There has been an increase of DSS use in customer-side marketing activities, such as customer segmentation (or profiling), direct marketing, database marketing, and targeted advertising. This reflects advances in database management and complex model building [11,16,35]. More convenient methods are available for the acquisition and storage of large amounts of customer and transactional data. In addition, knowledge-based systems or intelligent systems using data mining techniques (e.g., neural networks) [37] have emerged in the marketing domain.

This trend is broadly termed knowledge-based marketing. Knowledge-based marketing is both data-driven and model-driven: that is, the use of sophisticated data mining tools and methods for knowledge discovery from customer and transactional databases [26]. Overall, this leads to more efficient and effective communication with potential buyers and an increase in profits. An important approach to knowledge-based marketing is to understand customers and their behavioral patterns. This requires such transactional characteristics as recency of purchases, frequency of purchases, size of purchases, identifying customer groups, and predicting purchases [35]. The RFM model and other data mining-based customer response models have proven useful to marketers.

2.2. Data mining techniques for customer response models

2.2.1. RFM
R represents the period since the last purchase. F is the number of purchases made by a customer during a certain period. M is the total purchase amount by a customer over that period. It is common practice for each of R, F, and M to have five groups or levels, and thus there are 125 (= 5 * 5 * 5) customer segmentation groups. Each customer is segmented into one cell or group. This model allows marketers to differentiate their customers in terms of three factors and to target the customer groups that are likely to purchase products or services. This technique is known as the benchmark model in the area of database marketing [3].

Since its introduction in a major marketing journal [5], RFM has received a great deal of interest from both academic and industry communities [3,17]. Many studies [1,13,17] have recognized these three variables as important to predict future responses by customers to potential direct marketing efforts. Certain limitations in the original RFM model have been recognized in the literature [31,45]. Some previous studies have extended the original RFM model either by considering additional variables (e.g., socio-demographics) [1] or by combining it with other response techniques [6,7]. Because of the high correlation between F and M, Yang [45] offered a version of the RFM model collapsing the data to a single variable "Value" = M / R. To overcome the problem of data skewed in RFM cells, Olson et al. [31] proposed an approach to balance observations in each of the 125 RFM cells.

Other variables that may be important include customer income, customer lifestyle, customer age, product variation, and so on [14]. That would make traditional data mining tools such as logistic regression more attractive. However, RFM is the basis for a continuing stream of techniques to improve customer segmentation marketing [12]. RFM has been found to work relatively well if the expected response rate is high [24]. Other approaches to improve RFM results have included Bayesian networks [1,8] and association rules [46].

2.2.2. Classical data mining tools
Common data mining practice in classification is to gather a great number of variables and apply different standard algorithms. Given the set of predefined classes and a number of attributes, these classification methods can provide a model to predict the class of other unclassified data. Mathematical techniques that are often used to construct classification methods are binary decision trees, neural networks, and logistic regression. By using binary decision trees, a tree induction model with a "Yes–No" format can be built to split data into different classes according to its attributes. Such a model is very easy to apply to new cases, although the algorithms often produce an excessive number of rules. Neural networks often fit nonlinear relationships very well, but are difficult to apply to new data. Logistic regression models are easy to apply to new data, although the problem of a cutoff between classes can be an issue [32].

Relative performance of data mining algorithms has long been understood to depend upon the specific data. Since data mining software is widespread, common practice in classification is to try the three basic algorithms (decision trees, neural networks, logistic regression) and use the one that works best for the given data set. Studies have compared these algorithms with RFM. Levin and Zahavi [20] compared RFM with decision trees (specifically CHAID), pointing out that decision trees are more automatic (RFM requires extensive data manipulation), but involve modeling issues such as controlling tree size and determining the best split for branches and leaves. Kim and Street [19] proposed a neural network model and applied feature selection mechanisms to reduce input variables, enabling focus upon the most important variables. Baesens et al. [1] also applied neural networks to customer response models (adding customer profile indicators to RFM), obtaining better prediction accuracy. That is a consistent finding — data mining algorithms can be expected to predict customer response better than RFM. However, RFM remains interesting because it relies upon three fundamentally basic inputs that are readily available.

3. Methodology

3.1. Problem description and data set

This research design includes two studies (Study 1 and Study 2 hereafter) using two datasets obtained from the Direct Marketing Educational Foundation. Study 1 uses a dataset including 101,532 individual purchases from 1982 to 1992 in catalog sales. Study 2 is based on the data of 1,099,009 individual donors' contributions to a non-profit organization collected between 1991 and 2006. The purchase orders (or donations) included the ordering (or donation) date and the ordering amount. The last months of the data (Aug–Dec) were used as the target period: Aug–Dec 1992 for Study 1 and Aug–Dec 2006 for Study 2. The average response rates in Studies 1 and 2 are 0.096 and 0.062, respectively.

Data preparation and manipulation are an important stage of knowledge discovery and learning in knowledge-based marketing [35]. Fig. 1 describes our approach. The raw data contained customer behavior represented by account, order (or donation) date, order (or donation) dollars, and many other variables. We followed the general coding scheme to compute R, F, and M [17]. Various data preparation techniques (e.g., filtering, transforming) were used during this process. The order date of the last purchase (or the date of the last donation) was used to compute R (R1, R2, R3, R4, R5). The data set contained order (or donation) history and order dollars (or donation amounts) per customer (or donor), which were used for F (F1, F2, F3, F4, F5) and M (M1, M2, M3, M4, M5). We also included one response variable (Yes or No) to the direct marketing promotion or campaign.

Fig. 1. Research design building predictive models using RFM variables.
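As an illustration of this coding scheme, the sketch below computes R, F, and M per customer from a raw transactions table. It is a minimal pandas sketch under assumed column names (`account`, `order_date`, `dollars`) and an illustrative cutoff date; it is not the authors' actual data preparation, which was done with standard filtering and transformation tools.

```python
import pandas as pd

# Sketch of the R/F/M coding step described above, assuming a transactions
# table `orders` with hypothetical columns: account, order_date, dollars.
# The cutoff separating the observation period from the target period is
# illustrative (Study 1 used Aug-Dec 1992 as the target period).
cutoff = pd.Timestamp("1992-08-01")
history = orders[orders["order_date"] < cutoff]

rfm = history.groupby("account").agg(
    last_order=("order_date", "max"),        # date of last purchase
    F=("order_date", "count"),               # frequency: number of orders
    M=("dollars", "sum"),                    # monetary: total order dollars
)
rfm["R"] = (cutoff - rfm["last_order"]).dt.days   # recency in days

# Binary response variable: did the account order again in the target period?
responders = set(orders.loc[orders["order_date"] >= cutoff, "account"])
rfm["ordered"] = rfm.index.isin(responders).astype(int)
```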
works wherein the training patterns are presented to the input layer
and the output layer has one neuron for each possible category.
3.2. Predictive models

3.2.1. RFM 3.3. Performance evaluation measures


RFM analysis typically divides the data into 125 cells, designated
by the 5 groups. The most attractive group would be 555, or Group There are different methods to assess customer response model
5 for each of the 3 variables [17]. performances. We use prediction accuracy and cumulative gains to
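A minimal sketch of this 5 x 5 x 5 segmentation and of the basic RFM scoring used later in Section 4.1 follows, continuing from the frame above. Quintile limits and the 50/50 train/test split are illustrative; the paper sets explicit bin limits, reported in Table 2.

```python
import pandas as pd

def five_levels(values, labels):
    # Equal-count quintiles; rank(method="first") breaks ties so qcut never fails.
    return pd.qcut(values.rank(method="first"), 5, labels=labels).astype(int)

rfm["R_grp"] = five_levels(rfm["R"], [5, 4, 3, 2, 1])   # most recent purchases -> group 5
rfm["F_grp"] = five_levels(rfm["F"], [1, 2, 3, 4, 5])
rfm["M_grp"] = five_levels(rfm["M"], [1, 2, 3, 4, 5])
rfm["cell"] = rfm["R_grp"] * 100 + rfm["F_grp"] * 10 + rfm["M_grp"]   # e.g. 555

# Basic RFM model: promote to cells whose training response rate exceeds a cutoff.
train = rfm.sample(frac=0.5, random_state=1)
test = rfm.drop(train.index)
cell_rate = train.groupby("cell")["ordered"].mean()
target_cells = cell_rate[cell_rate > 0.1].index       # 0.1 cutoff, as in Section 4.1
predicted = test["cell"].isin(target_cells).astype(int)
```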
3.2.2. RFM with balanced cells
Dividing customers or donors into 125 cells tends to result in skewness: the data are not evenly distributed among those cells. This skewness has been recognized as one of the problems with RFM [13,27,31]. Our approach to this issue was to use more equal density (size-coding) to obtain data entries for all RFM cells. We accomplished this by adjusting cell limits to obtain more equal counts for cells in the training set.
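The sketch below renders the sequential, size-coded limits used for Table 6 in pandas, continuing from the frames above (our rendering, not the authors' actual steps): R is cut into five roughly equal groups, F within each R group, and M within each (R, F) group.

```python
# Sequentially balanced limits, computed on the training set only.
bal = train.copy()
bal["R_bal"] = five_levels(bal["R"], [5, 4, 3, 2, 1])
# F takes only a few integer values, so these "equal" groups split ties
# arbitrarily, mirroring the difficulty noted in Section 4.
bal["F_bal"] = bal.groupby("R_bal")["F"].transform(lambda f: five_levels(f, [1, 2, 3, 4, 5]))
bal["M_bal"] = bal.groupby(["R_bal", "F_bal"])["M"].transform(
    lambda m: five_levels(m, [1, 2, 3, 4, 5])
)
cell_counts = bal.groupby(["R_bal", "F_bal", "M_bal"]).size()   # roughly equal counts
```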
3.2.3. RFM with Yang's value function
Previous studies [19] have pointed out a strong correlation between F and M as a limitation of RFM. The value function [45] compresses the RFM data into one variable, V = M / R.
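A short sketch of the value-function variant and of the 5% groups used later for Table 8, again continuing from the training frame above; group boundaries are illustrative.

```python
import pandas as pd

# Yang's value function V = M / R, used as a one-dimensional score.
train["V"] = train["M"] / train["R"]
test["V"] = test["M"] / test["R"]

# Twenty 5% groups of the training set, sorted on V (as in Table 8).
train["V_cell"] = pd.qcut(train["V"].rank(method="first"), 20,
                          labels=range(1, 21)).astype(int)
v_table = train.groupby("V_cell").agg(
    min_V=("V", "min"), hits=("ordered", "sum"),
    N=("ordered", "size"), success=("ordered", "mean"),
)
```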
3.2.4. Logistic regression (LR)
The purpose of logistic regression is to classify cases into the most likely category. Logistic regression provides a set of β parameters for the intercept (or intercepts, in the case of ordinal data with more than two categories) and the independent variables, which can be applied to a logistic function to estimate the probability of belonging to a specified output class [32]. Logistic regression is among the most popular data mining techniques in marketing DSS and response modeling [24].
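In the two-class case used here, with R, F, and M as predictors, the fitted model has the standard form (the Study 1 estimates of these β values appear later in Table 9):

\[
P(\text{response} \mid R, F, M) = \frac{1}{1 + \exp\!\big(-(\beta_0 + \beta_R R + \beta_F F + \beta_M M)\big)},
\]

with a case classified as a responder when this estimated probability exceeds a chosen cutoff.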
3.2.5. Decision tree (DT)
Decision trees in the context of data mining refer to a tree structure of rules. They have been applied by many in the analysis of direct marketing data [39,40]. The data mining decision tree process involves collecting those variables that the analyst thinks might bear on the decision at issue, and analyzing these variables for their ability to predict the outcome. Decision trees are useful to gain further insight into customer behavior, as well as to lead to ways to profitably act on results. One of a number of algorithms automatically determines which variables are most important, based on their ability to sort the data into the correct output category. The method has a relative advantage over neural networks and genetic algorithms in that a reusable set of rules is provided, thus explaining model conclusions.

3.2.6. Neural networks (NN)
Neural networks are the third classical data mining tool found in most commercial data mining software products, and have been applied to direct marketing applications [4,8,19,36]. NN are known for their ability to train quickly on sparse data sets. NN separate data into a specified number of output categories. NN are three-layer networks wherein the training patterns are presented to the input layer and the output layer has one neuron for each possible category.
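The paper built these models in SPSS (and J48 for the Study 1 tree). A rough scikit-learn equivalent for fitting all three classifiers on the three RFM predictors is sketched below; the library, class choices, and parameter settings are ours, not the authors'.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Features are the three RFM variables; the target is the binary response.
X_train, y_train = train[["R", "F", "M"]], train["ordered"]
X_test, y_test = test[["R", "F", "M"]], test["ordered"]

models = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(min_samples_leaf=50),            # leaf size limits rule count
    "NN": MLPClassifier(hidden_layer_sizes=(10,), max_iter=500),  # one hidden layer (MLP)
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "test accuracy:", round(model.score(X_test, y_test), 3))
```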

3.3. Performance evaluation measures

There are different methods to assess customer response model performance. We use prediction accuracy and cumulative gains to discuss the performance of different predictive customer response models. Gains show the percentage of responders in each decile. Marketers can figure out how many responders (or what proportion of responders) can be expected in a specific decile. For example, we can say that, given the same mailing size (e.g., 40% of the total customers), a model capturing 70% of the responders is better than a model capturing only 60% of the responders [47]. Through cumulative gain values we can evaluate the performance of different data mining techniques [21]. Another way is to use the prediction accuracy rate of each technique. The data sets employed in this research have the information about who responded to the direct marketing campaign. Using R, F, and M as the three predictive variables, each data mining technique develops a binary customer response model based on the training data set, and the model is then applied to the test data set. This generates the prediction accuracy rate — the percentage of customers classified correctly [21]. The model building process is shown in Fig. 1.
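A sketch of the two measures, continuing from the fitted models above; the helper function below is our own, not part of the paper.

```python
import pandas as pd

def cumulative_gains(y_true, score, percents=(10, 20, 30, 40, 50)):
    """Percentage of all responders captured in the top x% of cases ranked by score."""
    ranked = pd.DataFrame({"y": y_true, "score": score}).sort_values("score", ascending=False)
    total = ranked["y"].sum()
    return {p: 100.0 * ranked.head(int(len(ranked) * p / 100))["y"].sum() / total
            for p in percents}

proba = models["LR"].predict_proba(X_test)[:, 1]        # estimated response probability
accuracy = ((proba >= 0.5) == y_test).mean()            # prediction accuracy at a 0.5 cutoff
gains = cumulative_gains(y_test.to_numpy(), proba)      # e.g. {10: ..., 20: ..., ...}
```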

4. Analysis and results

The analysis process consisted of model building using each data mining technique and model assessment. For Study 1, customer response models were developed using RFM, RFM with balanced cells, RFM with Yang's value function, logistic regression (LR), decision tree (DT), and neural networks (NN). Model assessment is presented with gains and predictive accuracy.

4.1. Study 1

An initial correlation analysis was conducted, showing that there was some correlation among these variables, as shown in Table 1.

Table 1
Variable correlations.
          R          F         M         Ordered
R         1
F         −0.192⁎⁎   1
M         −0.136⁎⁎   0.631⁎⁎   1
Ordered   −0.235⁎⁎   0.241⁎⁎   0.150⁎⁎   1
⁎⁎ Correlation is significant at the 0.01 level (2-tailed).

All three variables were significant at the 0.01 level. The relationship between R and customer response is negative, as expected. In contrast, F and M are positively associated with customer response. R and F are stronger predictors for customer response.

RFM was initially applied, dividing the scales for each of the three components into five groups based upon the scales for R, F, and M. This was accomplished by entering bin limits in SPSS. Table 2 shows the boundaries. Group 5 was assigned as the most attractive group, which for R was the minimum, and for F and M the maximum.

Table 2
RFM boundaries.
Factor  Min  Max    Group 1  Group 2    Group 3   Group 4  Group 5
R       12   3810   1944+    1291–1943  688–1290  306–687  12–305
Count               16,297   16,323     16,290    16,351   16,271
F       1    39     1        2          3         4–5      6+
Count               43,715   18,274     8206      6693     4644
M       0    4640   0–20     21–38      39–65     66–122   123+
Count               16,623   16,984     15,361    16,497   16,067

Note the skewness of the data for F, which is often encountered. Here the smaller values dominate that metric. Table 3 displays the counts obtained for these 125 cells.

Table 3
Count by RFM cell – training set.
RF  R            F      M1     M2     M3     M4     M5
55  R 12–305     F 6+   0      0      16     151    1761
54               F 4–5  2      18     118    577    1157
53               F 3    9      94     363    756    671
52               F 2    142    616    1012   1135   559
51               F 1    2425   1978   1386   938    387
45  R 306–687    F 6+   0      1      11     101    1018
44               F 4–5  0      16     87     510    927
43               F 3    6      88     316    699    636
42               F 2    150    707    1046   1140   616
41               F 1    2755   2339   1699   1067   416
35  R 688–1290   F 6+   0      1      5      70     799
34               F 4–5  1      16     122    420    832
33               F 3    9      88     319    706    589
32               F 2    163    697    1002   1128   645
31               F 1    2951   2567   1645   1078   437
25  R 1291–1943  F 6+   0      0      9      56     459
24               F 4–5  0      22     72     372    688
23               F 3    9      95     290    678    501
22               F 2    211    749    1096   1128   561
21               F 1    3377   2704   1660   1108   478
15  R 1944+      F 6+   0      0      3      22     170
14               F 4–5  1      11     74     243    409
13               F 3    9      122    261    511    380
12               F 2    268    878    1108   995    522
11               F 1    4145   3177   1641   908    449
Totals                  16,623 16,984 15,361 16,497 16,067

The proportion of responses (future order placed) for the data is given in Table 4.

In the training set, 10 of 125 possible cells were empty, even with over 100,000 data points. The cutoff for profitability would depend upon the cost of promotion compared to average revenue and rate of profit. For example, if the cost of promotion were $50, average revenue per order $2000, and average profit rate $0.25 per dollar of revenue, the profitability cutoff would be 0.1. In Table 4, those cells with return ratios greater than 0.1 are shown in bold. Those cells with ratios at 0.1 or higher but with support (number of observations) below 50 are indicated in italics. They are of interest because their high ratio may be spurious. The implication is fairly self-evident — seek to apply promotion to those cases in bold without italics. The idea of dominance can also be applied. The combinations of predicted success for different training cell proportions are given in Table 5.
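Stated as a formula, the cutoff just described is the break-even response rate under those assumed figures:

\[
\text{cutoff} = \frac{\text{promotion cost}}{\text{revenue per order} \times \text{profit rate}} = \frac{50}{2000 \times 0.25} = 0.1 ,
\]

so a cell is worth promoting when its expected response rate exceeds this value.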
The RFM model from the Excel spreadsheet yields the predictive model performance shown in Appendix A in the line "Basic on 0.1" (because the cutoff used was a proportion of 0.1), along with results from the other models. This model was correct (13,961 + 1337 =) 15,298 times out of 20,000, for a correct classification rate of 0.765. The error was highly skewed, dominated by the model predicting 4113 observations to respond that turned out not to respond. An alternative model would be degenerate — simply predict all observations to be 0. This would have yielded better performance, with 18,074 correct responses out of 20,000, for a correct classification rate of 0.904. This value could be considered a par predictive performance. This data is included in Appendix A, where we report results of all further models in terms of correct classification.

Increasing the test cutoff rate leads to improved models. We used increasing cutoffs of 0.2, 0.3, 0.4, and 0.5, yielding the results indicated in Appendix A. Only the model with a cutoff rate of 0.5 resulted in a better classification rate than the degenerate model. In practice, the best cutoff rate would be determined by financial impact analysis, reflecting the costs of both types of errors. Here we simply use overall classification accuracy, as we have no dollar values to use.

The correlation between F and M (0.631 in Table 1) can be seen in Table 3, looking at the R = 5 categories. In the M = 1 column of Table 3, entries are 0 for every F5 category, usually increasing through the M = 2 to M = 5 columns. When F = 5, the heaviest density tends to be in the column where M = 5. This skewness is often recognized as one of the problems with RFM [13,27,31]. Our approach to this issue was through more equal density (size-coding) to obtain data entries for all RFM cells. We accomplished this by setting cell limits by count within the training set for each variable. We cannot obtain the desired counts for each of the 125 combined cells because we are dealing with three scales, but we can come closer, as in Table 6. Difficulties arose primarily due to F having integer values. Table 6 limits were generated sequentially, starting by dividing R into 5 roughly equal groups. Within each group, F was then sorted into groups based on integer values, and then, within those 25 groups, M was divided into roughly equally sized groups.

The unevenness of cell densities is due to the uneven numbers in the few integers available for the F category. The proportion of positive responses in the training set is given in Table 7.

If M = 5, this model predicts above average response. There is a dominance relationship imposed, so that cells 542 and better, 532 and better, 522 and better, 512 and better, 452 and better, 442 and better, and 433 and better predict above average response. Cells 422, 414, and 353 have above average training response, but cells with superior R or F ratings have below average response, so these three cells were dropped from the above average response model. The prediction accuracy ((13,897 + 734) / 20,000) for this model was 0.732 (see the "Balance on 0.1" row in Appendix A). In this case, balancing cells did not provide added accuracy over the basic RFM model with unbalanced cells. Using the cutoff rate of 0.5, the model is equivalent to predicting the combination of R = 5, F = 4 or 5, and M = 4 or 5 as responding and all others as not. This model had a correct classification rate of 0.894, which was inferior to the degenerate case. For this set of data, balancing cells accomplished better statistical properties per cell, but was not a better predictor.

Since F is highly correlated with M (0.631 in Table 1), the analysis can be simplified to one dimension. Dividing the training set into groups of 5%, sorted on V, generates Table 8.

Table 8
V values by cell.
Cell       Min V    UL      Hits   N      Success
1          0.0000   4077    91     4076   0.0223
2          0.0063   8154    69     4077   0.0169
3          0.0097   12,231  116    4077   0.0285
4          0.0133   16,308  109    4077   0.0267
5          0.0171   20,385  120    4077   0.0294
6          0.0214   24,462  119    4077   0.0292
7          0.0263   28,539  151    4077   0.0370
8          0.0320   32,616  174    4077   0.0427
9          0.0388   36,693  168    4077   0.0412
10         0.0472   40,770  205    4077   0.0503
11         0.0568   44,847  258    4077   0.0633
12         0.0684   48,924  256    4077   0.0628
13         0.0829   53,001  325    4077   0.0797
14         0.1022   57,078  360    4077   0.0883
15         0.1269   61,155  408    4077   0.1001
16         0.1621   65,232  542    4077   0.1329
17         0.2145   69,309  663    4077   0.1626
18         0.2955   73,386  827    4077   0.2028
19         0.4434   77,463  1134   4077   0.2781
20         0.7885   81,540  1686   4070   0.4143
Total/avg           81,532  7781          0.0954

Table 4
Response ratios by cell.
RF  R            F      M1     M2     M3     M4     M5
55  R 12–306     F 6+   –      –      0.687  0.563  0.558
54               F 4–5  0      0.500  0.415  0.426  0.384
53               F 3    0.111  0.426  0.342  0.381  0.368
52               F 2    0.296  0.289  0.281  0.283  0.256
51               F 1    0.173  0.196  0.201  0.158  0.152
45  R 307–687    F 6+   –      0      0.273  0.238  0.193
44               F 4–5  –      0.125  0.092  0.112  0.123
43               F 3    0      0.091  0.082  0.089  0.101
42               F 2    0.060  0.075  0.069  0.081  0.078
41               F 1    0.047  0.049  0.052  0.053  0.041
35  R 688–1286   F 6+   –      1.000  0      0.100  0.125
34               F 4–5  0      0.063  0.107  0.107  0.103
33               F 3    0.111  0.023  0.066  0.059  0.075
32               F 2    0.049  0.047  0.061  0.063  0.060
31               F 1    0.030  0.031  0.029  0.026  0.021
25  R 1287–1943  F 6+   –      –      0.111  0.054  0.078
24               F 4–5  –      0.091  0.028  0.065  0.060
23               F 3    0      0.053  0.048  0.049  0.064
22               F 2    0.043  0.020  0.039  0.041  0.039
21               F 1    0.018  0.021  0.018  0.020  0.019
15  R 1944+      F 6+   –      –      0.000  0.045  0.041
14               F 4–5  0      0.091  0.024  0.025  0.039
13               F 3    0.111  0.041  0.050  0.033  0.053
12               F 2    0.019  0.046  0.036  0.031  0.044
11               F 1    0.021  0.015  0.016  0.020  0.016

Table 5
Basic RFM models by cutoff.
Cutoff  R      F               M
0.1     R = 5  Any             Any
        R = 4  F = 5           M = 3, 4, or 5
        R = 3  F = 4           M = 4 or 5
               F = 3           M = 5
               F = 4 or 5      M = 3, 4, or 5
0.2     R = 5  F = 2, 3, 4, or 5  Any
        R = 4  F = 5           M = 3, 4, or 5
0.3     R = 5  F = 3, 4, or 5  M = 2, 3, 4, or 5
0.4     R = 5  F = 4 or 5      M = 2, 3, 4, or 5
0.5     R = 5  F = 5           M = 3, 4, or 5

Table 6
Balanced group cell densities — training set.
RF  M1    M2    M3    M4    M5
55  186   185   149   223   187
54  185   186   185   185   186
53  187   185   188   186   187
52  184   184   185   184   185
51  186   187   186   187   186
45  268   265   270   289   246
44  269   269   268   274   264
43  272   267   280   251   296
42  263   263   265   245   283
41  268   261   261   259   277
35  331   330   349   316   330
34  324   325   322   325   324
33  332   331   329   332   335
32  330   330   330   331   330
31  323   324   323   326   324
25  733   730   735   737   733
24  735   736   735   737   734
23  747   746   751   749   748
22  705   704   707   704   707
21  731   733   730   735   732
15  1742  1746  1739  1740  1744
14  1718  1715  1713  1713  1716
13  1561  1809  1689  1675  1684
12  1768  1775  1771  1779  1762
11  1830  1831  1832  1824  1839

Table 7
Training set proportion of responses by cell.
RF  M1     M2     M3     M4     M5
55  0.129  0.178  0.101  0.673  0.818
54  0.059  0.118  0.189  0.541  0.629
53  0.064  0.130  0.287  0.392  0.647
52  0.076  0.103  0.200  0.424  0.605
51  0.054  0.102  0.274  0.406  0.527
45  0.037  0.109  0.141  0.211  0.378
44  0.041  0.108  0.116  0.281  0.417
43  0.033  0.052  0.125  0.072  0.483
42  0.049  0.118  0.098  0.073  0.544
41  0.045  0.038  0.092  0.116  0.531
35  0.045  0.067  0.138  0.060  0.458
34  0.052  0.043  0.059  0.080  0.448
33  0.042  0.048  0.058  0.093  0.433
32  0.027  0.045  0.058  0.097  0.379
31  0.050  0.040  0.062  0.080  0.414
25  0.037  0.051  0.056  0.084  0.254
24  0.024  0.046  0.052  0.076  0.309
23  0.051  0.047  0.055  0.080  0.273
22  0.027  0.040  0.055  0.068  0.246
21  0.027  0.038  0.048  0.076  0.242
15  0.017  0.021  0.025  0.051  0.146
14  0.016  0.017  0.033  0.054  0.167
13  0.010  0.019  0.034  0.052  0.156
12  0.018  0.021  0.036  0.043  0.137
11  0.016  0.022  0.014  0.044  0.154

Lift is the marginal difference between a segment's proportion of response to a promotion and the average rate of response. Target customers are identified as the small subset of people with marginally higher probability of purchasing. Lift itself does not consider profitability; in practice, this needs to be considered. For our purposes, we demonstrate without dollar values (which are not available), noting that the relative cost of marketing and expected profitability per segment will determine the optimal number of segments to market. Fig. 2 shows lift by value ratio.
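Written out, with r_s the response rate of segment s and r̄ the overall average response rate (a generic rendering of the description above, not the authors' formula):

\[
\text{lift}(s) = r_s - \bar{r},
\]

and the cumulative lift line plots the running average response rate over the segments included so far, taken in decreasing order of response rate.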
In Fig. 2, the most responsive segment has an expected return of slightly over 20%. The lift line is the cumulative average response as segments are added (in order of response rate).

Fig. 2. Lift by value ratio cell.

Using the value ratio as a predictive classifier, the training data was used to identify cells with better responses. Model fit is shown in Appendix A in the row "Value function". This model has a correct classification rate of 0.721. This is inferior to a degenerate model that would simply classify all cases as no response, indicating that the value function was non-productive in this case. While it is easier to manipulate than the RFM model, in this case the fit was inferior to the basic RFM model.

Data mining is rich in classification models [2,23]. Three classical data mining classification models were applied to the data: logistic regression, decision trees, and neural networks. We next applied these three basic data mining algorithms using SPSS.

A logistic regression model was run on the RFM variables. The model results are shown in Table 9. The beta values of R and F are found to be significant.

Table 9
Regression betas for logistic regression.
Variable   Beta      Significance
Constant   −1.5462   0.05
R          −0.0015   <0.05
F          0.2077    <0.05
M          −0.0002

Note that M was not significant. This is explainable by the high correlation between M and F, and the dominance of R in obtaining a better fit. This model did very well on the test data, with a correct classification rate of 0.984.

The neural network model used a popular architecture called the multilayer perceptron (MLP) [10]. This model built one hidden layer. The correct classification rate of the neural network was 0.911, as shown in Appendix A.

Another performance measure used in this study is gains, which is a useful tool for evaluating the value of predictive models in direct marketing [21].

We use gains to compare the performance of the RFM score model and the classical data mining-based predictive models. Table 11 displays cumulative gains for different deciles for the RFM score model, decision tree, logistic regression, and neural networks. This gain-value information is well aligned with the prediction accuracy. The predictive response models based on decision tree, logistic regression, and neural network significantly outperformed the RFM score model. For example, if only 10% of the total customers are selected for direct marketing promotion, the RFM-based predictive model can include only 32.1% of actual respondents in that sampled customer group, while those of logistic regression and neural network are 38.7% and 42.9% respectively. With a selection of only 20% of customers, the decision tree-based predictive model can include almost 95% of actual buyers.

Table 11
Gains (Study 1).
            10%    20%    30%    40%    50%
RFM score   32.12  49.83  62.24  72.26  81.05
LR          38.79  61.67  70.67  79.01  84.85
DT          89.62  95.67  96.94  98.21  99.48
NN          42.95  60.28  70.21  79.12  84.80

For Study 1, we used J48, one of the most popular decision tree algorithms. The J48 decision tree algorithm using 10-fold cross-validation [10] was applied to the dataset. The resultant decision tree is shown in Table 10.

Table 10
J48 decision tree.
R           M         Yes    Total   P(yes)  P(no)  Conclusion  Error
0–36                  1      1       1.000          Yes
37–152                41     619     0.066   0.934  No          41
153                   605    606     0.998   0.002  Yes         1
154–257               53     1072    0.049   0.951  No          53
258–260               449    500     0.898   0.102  Yes         51
261–516               0      2227    0.000   1.000  No
517–519               119    144     0.826   0.174  Yes         25
520–624               0      1219    0.000   1.000  No
625                   206    227     0.907   0.093  Yes         21
626–883               0      2047    0.000   1.000  No
884                   51     68      0.750   0.250  Yes         17
885–989               0      1116    0.000   1.000  No
990                   135    160     0.844   0.156  Yes         25
991–1248              0      1773    0.000   1.000  No
1249                  31     37      0.838   0.162  Yes         6
1250–1354             0      985     0.000   1.000  No
1355                  85     108     0.787   0.213  Yes         23
1356–1612             0      1290    0.000   1.000  No
1613–1614             17     28      0.607   0.393  Yes         11
1615–1720             0      786     0.000   1.000  No
1721                  36     36      1.000   0.000  Yes
1722–2084             14     1679    0.008   0.992  No          14
2085–2086             18     18      1.000   0.000  Yes
2087–2343             0      831     0.000   1.000  No
2344–2345             7      7       1.000   0.000  Yes
2346–2448             0      404     0.000   1.000  No
2449–2451   M > 44    21     24      0.875   0.125  Yes         3
            M <= 44   8      12      0.667   0.333  No          8
2452–2707             0      665     0.000   1.000  No
2708–2710             3      5       0.600   0.400  Yes         2
2711+                 26     1306    0.020   0.980  No          26
Total                 1926   20,000  0.096   0.904              327

4.2. Study 2

An initial correlation analysis was conducted, showing that there was significant correlation among these variables, as shown in Table 12. F and M appear to have a strong correlation [45]. R and F appear to be strong predictors for customer response [1]. Table 13 shows RFM limits for this dataset and cell counts.

Table 12
Variable correlations.
          R          F         M         Response
R         1
F         −0.237⁎⁎   1
M         −0.125⁎⁎   0.340⁎⁎   1
Response  −0.266⁎⁎   0.236⁎⁎   0.090⁎⁎   1
⁎⁎ Correlation is significant at the 0.01 level (2-tailed).

Table 13
RFM boundaries.
Factor  Min  Max      Group 1  Group 2    Group 3   Group 4  Group 5
R       1    4950     2811+    1932–2811  935–1932  257–935  1–257
Count                 220,229  219,411    220,212   219,503  219,654
F       1    1027     1        2          3         4        5+
Count                 599,637  190,995    95,721    57,499   155,157
M       0    100,000  0–9      10–24      25–39     40–89    90+
Count                 248,639  343,811    77,465    209,837  219,257

We built an RFM model by following the same procedures described in Study 1. An RFM model using a cutoff rate of 0.1 was built on half of the dataset, and tested on the other half. This yielded a model with a correct classification rate of 0.662, as reported in Appendix A. This was far worse than any of the other models tested. Difficulties arose in balancing cells due to F taking only a few integer values (1, 2, 3, 4, 5+) and being highly skewed, leaving a majority of the data assigned to F group 1.

Fig. 3 displays the lift chart for the V models. The lift chart shows that the 5% of cases with the most likely response is much more likely to respond than the least responsive 50%. The proportion of responses in the test set for the 5% highest training set V scores had a response ratio of 0.311, compared to less than 0.010 for the worst 50%. We applied different V levels (0.05 and up; 0.10 and up; 0.15 and up; 0.20 and up; 0.25 and up; and 0.30 and up). These six models had very consistent results, as shown in Appendix A, just slightly inferior to the degenerate model. When datasets are as highly skewed as this one, with roughly only 5% responding, the degenerate model becomes very hard to beat.

Fig. 3. Lift chart for study 2.

All three predictive data mining models (DT, LR, NN) were built as in Study 1. The result is that those three models performed equally in terms of accuracy (0.938), as shown in Appendix A. We also performed the gain analysis reported in Table 14. The predictive models using decision tree, logistic regression, and neural networks outperformed the RFM score model. The performance gap is more significant when a small sample size (e.g., 20%) is chosen for donor solicitation.

Table 14
Gains (Study 2).
            10%    20%    30%    40%    50%
RFM score   40.38  62.39  84.66  95.63  97.90
LR          43.24  66.22  86.10  95.75  99.75
DT          44.68  70.75  87.41  96.63  97.96
NN          43.64  67.58  86.12  95.75  99.77

5. Discussion and conclusion

Marketing professionals have found RFM to be quite useful [17,18,35,43], primarily because the data is usually at hand and the technique is relatively easy to use. However, previous research suggests that it is easy to obtain a stronger predictive customer response model with other data mining algorithms [e.g., 1,19,20,24].

RFM has consistently been reported to be less accurate than other forms of data mining models, but that is to be expected, as the original RFM model segments customers/donors into 125 cells and is prescriptive rather than predictive.

That expected result was confirmed in this research. RFM helped nicely structure millions of records in each dataset into 125 groups of customers using only three variables. The model offers a well-organized description of people based on their past behaviors, which helps marketers effectively identify valuable customers or donors and develop a marketing strategy. However, this descriptive approach is less accurate in predicting future behavior than more complex data mining models.

There have been proposed improvements to RFM. Among the models seeking to improve RFM, our study showed that increasing the cutoff limit will lead to improvement in prediction accuracy. However, RFM models at any cutoff limit have trouble competing with degenerate models. Degenerate models have high predictive accuracy for highly skewed datasets, but provide no benefit, as they simply conclude it is not worth promoting to any customer profile.

Balancing cell sizes by adjusting the limits for the three RFM variables is sound statistically, but did not lead to improved accuracy in our tests. In both Study 1 and Study 2, the basic RFM model significantly underperformed other predictive models, except the V function model in Study 1. These results indicate that balancing cells might help improve fit, but involves significant data manipulation for very little predictive improvement in the data sets we examined.

Using the V ratio is an improvement to RFM that is useful in theory, but in our tests the results are mixed. In Study 1, the technique did not provide better predictive accuracy. In Study 2, it did yield an improved classification rate, but underperformed the degenerate model. Thus, this technique deserves further inquiry. Overall, the results above indicate that some suggested alternatives to traditional RFM have limitations in prediction.

The primary conclusion of our study, as was expected, is that classical data mining algorithms outperformed RFM models in terms of both prediction accuracy and cumulative gains. This is primarily because decision tree, logistic regression, and neural networks are often considered the benchmark "predictive" modeling techniques [4,10,29,42]. Predictive modeling, or analytics, is in high demand in many industries [9], including direct marketing [4]. This implies that marketers can make more effective marketing decisions by embracing advanced predictive modeling techniques, besides popular descriptive models. It often is the case that decision tree, logistic regression, and neural networks vary in their ability to fit specific sets of data [34]. Furthermore, there are many parameters that can be used with neural network models and decision trees. All three of these model types have the advantage of being able to consider external variables in addition to R, F, and M. Here, we applied them to these three variables without adding other explanatory variables. All three model types did better than the degenerate case, or any of the other variants we applied.

The best overall predictive fit was obtained using the decision tree model. This model also outperformed the other predictive models in cumulative gains in both studies. Decision trees tend to have advantages on low-dimensionality datasets [34] like those used in this research. This characteristic of decision trees may explain this result. Thus, we do not contend that the decision tree will always be best. However, a major relative value of decision trees is that they provide an easily understandable model. For example, Table 10 presents the decision tree rule set obtained in Study 1, which amounts to enumerating ranges of R that had high densities of response. There was only one range where M was used (R = 2449 to R = 2451). And looking at Table 10, the fit would have been improved if the decision tree had not differentiated there and had called all of these cases Yes. (There would have been 4 fewer errors out of 20,000, yielding essentially the same fit with the same correct classification rate of 0.984.) There is the downside for decision trees that they often overfit the data (as they did in Table 10), and they can yield an excessive number of rules for users to apply. Table 15 presents a comparison of methods based on inferences from our two studies.

While our study uses prediction accuracy along with cumulative gains for model comparison, in practice the type of error can be considered in terms of relative costs, thus enabling influence on profit. For example, our study shows that increasing the cutoff level between predicting response or not can improve correct classification. However, a more precise means to assess this would be to apply the traditional cost function reflecting the cost of the two types of error. This is a consideration in evaluating other predictive models as well. Thus, specific models should be used in light of these relative costs.
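A generic way to write such a cost function (our formulation, not the authors'): with c_FP the cost of promoting a non-responder and c_FN the opportunity cost of missing a responder,

\[
\text{Expected cost} = c_{FP}\, N_{FP} + c_{FN}\, N_{FN},
\]

so under this simple formulation the probability cutoff that minimizes expected cost is c_FP / (c_FP + c_FN), rather than a fixed value such as 0.5.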
The good performance of those data mining methods (particularly the decision tree), in terms of prediction accuracy and cumulative gains, indicates that the three variables (R, F, and M) alone can be useful for building a reliable customer response model. This echoes the importance of RFM variables in understanding customer purchase behavior and developing response models for marketing decisions [17,18,33,35]. Previous research [e.g., 1] also shows that inclusion of non-RFM attributes (e.g., income) is likely to slightly improve model performance.

However, a sophisticated model with too many variables is not very effective for marketing practitioners [19], and reducing variables is important for practical use of predictive models [28]. Marketers should be aware of this tradeoff between a simple model (with fewer variables) and a sophisticated model (with a large number of variables) and develop a well-balanced model using their market and product knowledge.

To repeat the contributions of this paper given in the Introduction, we have demonstrated how RFM models and variants can be implemented. RFM models have the relative advantage that they are simple in concept, and thus understandable to users. However, they can easily be improved upon in terms of predictive accuracy (or profitability, given situational data) by using classical data mining models. Of these traditional data mining models, decision trees are especially attractive in that they have easily understood output. These advanced predictive models are highly beneficial in the practice of direct marketing, since they can use only three behavioral input variables and generate results significantly better than the traditional RFM model and other variants.

Table 15
Comparison of methods.

Degenerate — Relative advantages: tends to have high accuracy when the outcome is highly skewed. Relative disadvantages: mindless; simply says no; provides no marginal value. Inference: if the cost of missing good responses is low, don't do anything.

Basic RFM — Relative advantages: widely used; data readily available; software obtainable. Relative disadvantages: predictive accuracy consistently weak. Inference: can do better using conventional data mining (RFM implicitly a special case).

RFM with balanced data — Relative advantages: better statistical practice. Relative disadvantages: may not actually improve accuracy. Inference: not worth the trouble.

Value function — Relative advantages: easy to apply (uses 2 of the 3 RFM variables, so data readily available); focuses on uncorrelated variables. Relative disadvantages: not necessarily more accurate. Inference: value function is superior to RFM.

Logistic regression — Relative advantages: can get better fit; can include many variables; model statistically interpretable. Relative disadvantages: logistic output harder to interpret than OLS for managers. Inference: decision trees easier to interpret.

Neural network — Relative advantages: can get better fit; can include many variables. Relative disadvantages: output not conducive to interpretation; can't apply the model outside of the software used to build it. Inference: decision trees easier to interpret.

Decision trees — Relative advantages: can get better fit; can include many variables; output easily understandable by managers. Relative disadvantages: model may involve an excessive number of rules. Inference: best option, if the number of rules obtained can be controlled (through a minimum required response parameter).

Appendix A. Comparative model results — Study 1

Model                Actual no response,  Actual response,    Correct  Overall correct
                     model response       model no response            classification
Degenerate           0                    1926                18,074   0.904
Basic RFM on 0.1     4113                 589                 15,298   0.765
Basic RFM on 0.2     1673                 999                 17,328   0.866
Basic RFM on 0.3     739                  1321                17,940   0.897
Basic RFM on 0.4     482                  1460                18,058   0.903
Basic RFM on 0.5     211                  1643                18,146   0.907
Balance using 0.5    1749                 379                 17,872   0.894
Value function       623                  4951                14,426   0.721
Logistic regression  1772                 91                  18,137   0.907
Neural network       119                  1661                18,220   0.911
Decision tree        185                  142                 19,673   0.984

Comparative model results — Study 2

Model                 Actual no response,  Actual response,    Correct  Overall correct
                      model response       model no response            classification
Degenerate            0                    34,598              515,123  0.9371
Basic RFM             4174                 181,357             364,190  0.6625
Value function > 5    6212                 30,418              513,091  0.9334
Value function > 10   3344                 31,830              514,547  0.9360
Value function > 15   2296                 32,475              514,950  0.9367
Value function > 20   1712                 32,867              515,142  0.9371
Value function > 25   1400                 33,136              515,185  0.9372
Value function > 30   1153                 33,330              515,238  0.9373
Logistic regression   821                  32,985              515,915  0.9385
Neural network        876                  32,888              515,957  0.9386
Decision tree         393                  33,373              515,955  0.9386

References

[1] B. Baesens, S. Viaene, D. den Poel, J. Vanthienen, Bayesian neural network learning for repeat purchase modelling in direct marketing, European Journal of Operational Research 138 (2002) 191–211.
[2] N. Belacel, H. Raval, A. Punnen, Learning multicriteria fuzzy classification method PROAFTN from data, Computers and Operations Research 34 (7) (2007) 1885–1898.
[3] R. Blattberg, B. Kim, S. Neslin, Database Marketing: Analyzing and Managing Customers, Chapter 2: RFM Analysis, Springer, New York, 2008.
[4] I. Bose, X. Chen, Quantitative models for direct marketing: a review from systems perspective, European Journal of Operational Research 195 (2009) 1–16.
[5] J. Bult, T. Wansbeek, Optimal selection for direct mail, Marketing Science 14 (4) (1995) 378–394.
[6] C. Cheng, Y. Chen, Classifying the segmentation of customer value via RFM model and RS theory, Expert Systems with Applications 36 (3) (2009) 4176–4184.
[7] W. Chiang, To mine association rules of customer values via a data mining procedure with improved model: an empirical case study, Expert Systems with Applications 38 (2011) 1716–1722.
[8] G. Cui, M. Wong, H. Lui, Machine learning for direct marketing response models: Bayesian networks with evolutionary programming, Management Science 52 (4) (2006) 597–612.
[9] T. Davenport, J.G. Harris, R. Morison, Analytics at Work: Smarter Decisions, Better Results, Harvard Business Press, Boston, MA, 2010.
[10] D. Delen, A comparative analysis of machine learning techniques for student retention management, Decision Support Systems 49 (2010) 498–506.

[11] E. Eisenstein, L. Lodish, Marketing decision support and intelligence systems: precisely worthwhile or vaguely worthless? in: B. Weitz, R. Wensley (Eds.), Handbook of Marketing, Sage Publications, London, 2002.
[12] R. Elsner, M. Krafft, A. Huchzemeier, Optimizing Rhenania's mail-order business through dynamic multilevel modeling, Interfaces 33 (1) (2003) 50–66.
[13] P. Fader, B. Hardie, K. Lee, RFM and CLV: using iso-value curves for customer base analysis, Journal of Marketing Research 42 (4) (2005) 415–430.
[14] M. Fitzpatrick, Statistical analysis for direct marketers — in plain English, Direct Marketing 64 (4) (2001) 54–56.
[15] R. Gopal, Ad mediation: new horizons in effective email advertising, Communications of the ACM 19 (1) (2001) 17–30.
[16] M. Hart, Systems for supporting marketing decisions, in: F. Burstein, C. Holsapple (Eds.), Handbook on Decision Support Systems 2, Springer, 2008, pp. 395–418.
[17] A. Hughes, Strategic Database Marketing, third ed., McGraw-Hill, New York, 2006.
[18] J. Jonker, N. Piersma, R. Potharst, A decision support system for direct mailing decisions, Decision Support Systems 42 (2006) 915–925.
[19] Y. Kim, W. Street, An intelligent system for customer targeting: a data mining approach, Decision Support Systems 37 (2004) 215–228.
[20] N. Levin, J. Zahavi, Predictive modeling using segmentation, Journal of Interactive Marketing 15 (2) (2001) 2–22.
[21] G. Linoff, M. Berry, Data Mining Techniques, Wiley, Indianapolis, 2011.
[22] J. Little, Decision support systems for marketing managers, Journal of Marketing 43 (3) (1979) 9–26.
[23] N. Mastrogiannis, B. Boutsinas, I. Giannikos, A method for improving the accuracy of data mining classification algorithms, Computers and Operations Research 36 (10) (2009) 2829–2839.
[24] J. McCarthy, M. Hastak, Segmentation approaches in data-mining: a comparison of RFM, CHAID, and logistic regression, Journal of Business Research 60 (6) (2007) 656–662.
[25] R. McDaniel, Management strategies for complex adaptive systems, Performance Improvement Quarterly 20 (2) (2007) 21–42.
[26] B. McKelvey, Avoiding complexity catastrophe in coevolutionary pockets: strategies for rugged landscapes, Organization Science 10 (3) (1999) 294–321.
[27] J. Miglautsch, Application of RFM principles: what to do with 1-1-1 customers? Journal of Database Marketing 9 (4) (2002) 319–324.
[28] P. Naik, M. Hagerty, C. Tsai, A new dimension reduction approach for data-rich marketing environments, Journal of Marketing Research 37 (1) (2000) 88–101.
[29] E.W.T. Ngai, L. Xiu, D.C.K. Chau, Application of data mining techniques in customer relationship management: a literature review and classification, Expert Systems with Applications 36 (2) (2009) 2592–2602.
[30] C. O'Reilly III, J. Harreld, M. Tushman, Organizational ambidexterity: IBM and emerging business opportunities, California Management Review 51 (4) (2009) 75–99.
[31] D. Olson, Q. Cao, C. Gu, D. Lee, Comparison of customer response models, Service Business 3 (2) (2009) 117–130.
[32] D.L. Olson, D. Delen, Advanced Data Mining Techniques, Springer, Heidelberg, 2008.
[33] P. Rossi, R. McCulloch, G. Allenby, The value of purchase history data in target marketing, Marketing Science 15 (4) (1996) 321–340.
[34] G. Seni, J. Elder, Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions, Morgan & Claypool, 2010.
[35] M. Shaw, C. Subramaniam, G. Tan, M. Welge, Knowledge management and data mining for marketing, Decision Support Systems 31 (2001) 127–137.
[36] K.A. Smith, J. Gupta, Neural networks in business: techniques and applications for the operations researcher, Computers and Operations Research 27 (11/12) (2000) 1023–1044.
[37] E. Turban, J. Aronson, T. Liang, Decision Support Systems and Intelligent Systems, 7th ed., Prentice Hall, Upper Saddle River, NJ, 2004.
[38] E. Turban, R. Sharda, D. Delen, D. King, Business Intelligence: A Managerial Approach, 2nd ed., Prentice Hall, New York, 2011.
[39] B. Van den Berg, T. Breur, Merits of interactive decision tree building: part 1, Journal of Targeting, Measurement and Analysis for Marketing 15 (3) (2007) 137–145.
[40] B. Van den Berg, T. Breur, Merits of interactive decision tree building: part 2: how to do it, Journal of Targeting, Measurement and Analysis for Marketing 15 (4) (2007) 201–209.
[41] S. Vargo, R. Lusch, Evolving to a new dominant logic for marketing, Journal of Marketing 68 (2004) 1–17.
[42] G. Verhaert, D. Poel, Empathy as added value in predicting donation behavior, Journal of Business Research 64 (2011) 1288–1295.
[43] P. Verhoef, P. Spring, J. Hoekstra, P. Leeflang, The commercial use of segmentation and predictive modeling techniques for database marketing in the Netherlands, Decision Support Systems 34 (2002) 471–481.
[44] B. Wierenga, The past, the present and the future of marketing decision models, in: B. Wierenga (Ed.), Handbook of Marketing Decision Models, Springer, 2008, pp. 3–20.
[45] A. Yang, How to develop new approaches to RFM segmentation, Journal of Targeting, Measurement and Analysis for Marketing 13 (1) (2004) 50–60.
[46] H. Yun, D. Ha, B. Hwang, K. Ryu, Mining association rules on significant rare data using relative support, Journal of Systems and Software 67 (2003) 181–191.
[47] J. Zahavi, N. Levin, Applying neural computing to target marketing, Journal of Direct Marketing 11 (1) (1997) 5–22.

David L. Olson is the James & H.K. Stuart Professor in MIS and Chancellor's Professor at the University of Nebraska. He has published research in over 100 refereed journal articles, primarily on the topics of multiple objective decision-making and information technology. He has authored 17 books, is associate editor of Service Business and co-editor in chief of International Journal of Services Sciences. He has made over 100 presentations at international and national conferences on research topics. He is a member of the Decision Sciences Institute, the Institute for Operations Research and Management Sciences, and the Multiple Criteria Decision Making Society. He was a Lowry Mays endowed Professor at Texas A&M University from 1999 to 2001. He received the Raymond E. Miles Distinguished Scholar award for 2002, and was a James C. and Rhonda Seacrest Fellow from 2005 to 2006. He was named Best Enterprise Information Systems Educator by IFIP in 2006. He is a Fellow of the Decision Sciences Institute.

Bongsug (Kevin) Chae is Associate Professor in Information & Operations Management at Kansas State University. He has published papers in such areas as business analytics, supply chain management, service innovation, and knowledge discovery. He has made presentations at universities and global companies in several countries, primarily on the topics of business analytics and intelligence, supply chain management, and service innovation. He is a recipient of the Ralph Reitz Teaching Award and a nominee for several other teaching awards at Kansas State University.
