Advanced Customer Analytics
Journal of Management Information Systems / 2018, Vol. 35, No. 2, pp. 540–574.
Copyright © Taylor & Francis Group, LLC
ISSN 0742–1222 (print) / ISSN 1557–928X (online)
DOI: https://doi.org/10.1080/07421222.2018.1451957
Knowledge and Data Engineering, and others. His projects on cyber security, health
analytics, and social media have been funded by the National Science Foundation.
He received the IBM Faculty Award, AWS Research Grant, and Microsoft Research
Azure Award for his work on big data. He serves as senior editor or associate editor
for several journals. His work has been featured in several media outlets.
ABSTRACT: As more firms adopt big data analytics to better understand their customers
and differentiate their offerings from competitors, it becomes increasingly difficult to
generate strategic value from isolated and unfocused ad hoc initiatives. To attain
sustainable competitive advantage from big data, firms must achieve agility in combining
rich data across the organization to deploy analytics that sense and respond to
customers in a dynamic environment. A key challenge in achieving this agility lies in
the identification, collection, and integration of data across functional silos both within
and outside the organization. Because it is infeasible to systematically integrate all
available data, managers need guidance in finding which data can provide valuable
and actionable insights about customers. Leveraging relationship marketing theory, we
develop a framework for identifying and evaluating various sources of big data in
order to create a value-justified data infrastructure that enables focused and agile
deployment of advanced customer analytics. Such analytics move beyond siloed
transactional customer analytics approaches of the past and incorporate a variety of
rich, relationship-oriented constructs to provide actionable and valuable insights. We
develop a customized kernel-based learning method to take advantage of these rich
constructs and instantiate the framework in a novel prototype system that accurately
predicts a variety of customer behaviors in a challenging environment, demonstrating
the framework’s ability to drive significant value.
KEY WORDS AND PHRASES: big data, customer acquisition, customer analytics, customer
expansion, data integration, data management, design science, IT strategic
value, relationship marketing, customer retention.
system at the intersection of big data analytics, marketing, and IT strategy. Our
framework enables the development of advanced customer analytics systems that
harness the volume, variety, and velocity of available customer data while assessing
the value of various data sources as well as providing much needed guidance for
data management and integration decisions and investments.
Based on design principles from relationship marketing theory (RMT), our framework
consists of (1) a rich and generalizable set of relationship-oriented constructs
that provide insight into customer behaviors; (2) a principled, flexible, versatile
predictive model to extract value from a wide variety of structured and unstructured
relationship-oriented data; and (3) an approach for estimating the contribution of
various constructs for prioritizing data management and integration efforts.
Collectively, our framework is focused on building a data infrastructure for agile
deployment of customer analytics that leverage and improve customer relationships
to drive competitive advantage across a broad range of customer analytics use cases
where reliance on siloed data has impeded insight and business value.
We develop a novel kernel-based machine learning method to serve as the
predictive model in our framework. This method combines a radial basis function
kernel with novel hybrid tree and weighted cross entropy string kernels in a
composite kernel support vector machine (SVM). This innovative method allows
for the principled embedding of theory and domain knowledge into the constituent
kernels; is flexible in incorporating a wide variety of data in tabular, graphical, and
text format; and provides versatile ensemble-like performance across a wide portfolio
of customer analytics tasks.
We instantiate the framework in a novel advanced customer analytics prototype
system developed for a major U.S. e-commerce and catalog-based retailer of educational
materials. We evaluate the system and underlying framework using 664,737 actual
customers sampled from the firm. The prototype system implements a portfolio of
customer analytics applications, accurately predicting customer churn, conversion on
specific promotional offers, and lifetime value, all within 30 days of first purchase.
Deployment of this portfolio results in significant value for the firm. We also assess the
value of various potential data constructs relative to data management and integration
costs, offering guidance and justification for investment in data infrastructure that
provides agility for deploying further analytics. Because of our work, our corporate
partner made significant investments in expansion and integration of data sources found
to be most valuable by our prototype, in order to support analytics initiatives.
Our research makes several academic and managerial contributions. Our primary
contribution is the synergistic ecosystem of closely related design science artifacts
that enable the creation of a value-justified infrastructure for the rapid and agile
deployment of a portfolio of advanced customer analytics. Our proposed framework
supports development of advanced customer analytics capabilities, directly addressing
the challenges of determining which data to invest in and integrate. The novel
composite kernel SVM we develop supplies a method tailor-made for extracting
insight and value from a rich variety of structured and unstructured relationship-
oriented data. The instantiation of our prototype system demonstrates that our
544 KITCHENS, DOBOLYI, LI, AND ABBASI
Literature Review
IT Strategy and Support of Big Data Customer Analytics
In order to create sustainable strategic value in a competitive landscape characterized
by rapid change, firms must build dynamic capabilities for adapting, integrating, and
reconfiguring resources to match their environment [31]. These dynamic capabilities
provide strategic agility, which allows firms to quickly recognize and capitalize on
opportunities, thereby creating competitive advantage. Sambamurthy et al. [54]
demonstrate that IT competencies, when effectively integrated with overall business
strategy and capabilities, serve as a platform for agility. However, IT investments to
improve agility must be carefully planned and managed, as haphazard or misguided
investment can actually impede agility [38].
A burgeoning opportunity for IT to support competitive action exists in the field of
big data analytics [12]. In the current environment and for the foreseeable future,
analytics represents a primary arena for innovation and competition. The proliferation
of abilities to harness big data to improve decisions, processes, and products has
made such capabilities a basic requirement for survival, with those best equipped to
extract value from data achieving significant competitive advantage [18]. To derive
strategic value from analytics, it is important that firms innovate by: (1) moving
from general-purpose to specialized analytics uniquely optimized to address specific
business issues; and (2) eliminating organizational silos to coordinate data sharing
and analytics across functional boundaries [50].
Achieving these two objectives is prohibitively expensive and time-consuming
through ad hoc efforts. Instead, IT must strategically partner with other business
functions and become a proactive advocate and architect for analytics [57].
Specifically, IT departments should provide data governance and infrastructure to
support agile integration of data from multiple sources, organizing “around data as
if it were a valuable organizational asset” [50, p. 15] to foster innovation and sustained
competitive advantage from big data analytics [66]. By creating an infrastructure that
incorporates data across functional silos, “resulting client services are superior, less
susceptible to commoditization, and generate higher revenue” [39, p. 217 (emphasis
added)]. If achieved, this structure can provide a foundation for establishing a portfolio
of specialized yet coordinated analytics initiatives that deliver strategic value.
A key challenge relates to the collection and integration of valuable data across
silos within and outside the organization. Data management has long been considered
a cornerstone of the IT function [23], and “big data’s rise has further amplified
the importance of IT in this role” through challenges and opportunities of exponentially
increasing data volume, variety, and velocity [2, p. 2]. Together these aspects
bring into focus the need for data infrastructure investment as well as the potential
value of resulting analytics [5, 13, 22]. However, unfocused data management and
integration can be extremely costly, and benefits are not always sufficient to offset
these costs [24]. Research suggests that “no single integration strategy is optimal in
all cases” [11, p. 89], and an intermediate level of integration is often more
beneficial than complete integration [11, 24, 40]. In order to support big data
analytics through an integrated data infrastructure, organizations must find effective
strategies to assess the value of available data.
Given the wide variety of available sources of relevant data, customer analytics
initiatives stand to gain significantly from such value-driven investments in IT infrastructure
for integration. By building on one of the firm’s most important resources (its
customers) and one of its least imitable (its data), customer analytics represent an
important strategic initiative with the potential to create significant and sustainable
competitive advantage. Customer analytics are enabled by a firm’s customer agility—
the ability to sense opportunities for innovation and respond to those opportunities
with competitive action—and operational agility—the ability to rapidly redesign
processes to exploit marketplace conditions [54]. These dynamic capabilities should
be supported by the synergistic combination of interfunctional business coordination
and IT infrastructure for integrating the right data across the organization [53].
As firms move toward analytics specialization, there are opportunities to create a
diverse portfolio of customer analytics initiatives spanning acquisition, retention,
and expansion in order to optimize customer lifetime value and equity [26]. In order
to be most effective, this portfolio of specialized initiatives should draw from various
aspects of the organization to incorporate the data most valuable for accomplishing
each individual objective. For this to become feasible, a common framework is
needed for designing customer analytics applications that incorporates business and
IT strategy. With this, firms could build a value-justified infrastructure for supporting
a portfolio of advanced customer analytics capabilities that combine to create
significant strategic value and sustainable competitive advantage.
40], there is a strong need for relationship-oriented analytics that can produce results
in weeks rather than months or years.
Research Gaps
From our analysis of relevant prior literature, we identify three important
research gaps. First, there is a lack of research providing guidance in evaluating
the impact of data management capabilities at a granular level that could inform
prioritization decisions for focused data management and partial integration,
which has been called for in the literature [11, 24]. While some studies have
focused on the strategic value of data management capabilities at the organizational
level or benefits versus related costs of collecting and integrating certain
data for specific purposes [43], none provide a general framework for valuing
data for analytics initiatives in a broader sense. While there have been calls for
this type of research, little has been undertaken. This ability to value various data
sources is imperative for developing an integrated big data infrastructure for the
agile development of analytics.
Second, there is a lack of studies that have taken a comprehensive and holistic
relationship view, as opposed to a siloed transactional view, in predicting customer
behaviors. As discussed, prior studies have largely focused on RFM [20, 55]
or practically whatever data are available [36, 46]. There has even been criticism
of this approach, heretofore unaddressed, pointing out that a vast majority of
customers at many firms have made only a single purchase and are consequently
largely indistinguishable from an RFM perspective, rendering the approach useless
[41]. As we will discuss later, many studies have evaluated relationship-
oriented constructs associated with customer behaviors using explanatory models
[7, 17], yet few have been utilized for prediction, and few employ a multifaceted
perspective, whether in a predictive or explanatory context. A more holistic and
integrated view of customer relationships would provide more accurate and
broadly applicable analytics.
Third, there is a paucity of literature on how to formally operationalize the
relationship-oriented constructs and data valuation appraisal alluded to in the first
two gaps for the purposes of performing predictive customer analytics. From a
design science perspective, instantiations offer essential prescriptive guidelines and
proofs-of-concept regarding how IT artifacts might be developed to solve an important
class of problems [28]. Such instantiations move beyond conceptual claims and
offer practical, research-grounded operationalizations that not only advance the
literature but also inform practice.
In this study, we address these gaps. We present a framework specifically focused
on providing a structure for determining the value of individual data sources, both
alone and in combination, in order to help IT managers decide where to invest
resources and how to justify data management and integration efforts in support of
advanced customer analytics. The framework includes the identification of a rich set
surveys, which makes implementing the resultant models for individual-level predictions
difficult. However, the concepts introduced in this literature are informative in building
predictive analytics for nuanced problems requiring a holistic and integrated view of
customer relationships from a variety of sources. Figure 1 depicts the proposed framework,
including:
1. A rich set of relationship-oriented constructs to guide the identification and
acquisition of valuable data for advanced customer analytics
2. A principled, flexible, versatile predictive model for extracting value from
data constructs
3. Value-based evaluation metrics for action- and outcome-oriented cost/benefit
analysis
4. Construct evaluation leading to value-justified integration and data management
investments in IT infrastructure to create a foundation of integrated data
for analytics
5. A portfolio of advanced customer analytics capabilities supported by IT-
enabled agility for providing strategic value through enhanced customer
lifetime value and customer equity
challenge with engagement data is integrating it with other customer data for
analysis. Our framework provides a structure for identifying and valuing such
information, supporting integration efforts.
Satisfaction (and Related Perceptions): As discussed previously, customer satisfaction
is a key, foundational construct in RMT [7]. Many studies have shown that
increasing satisfaction leads to better customer retention and higher customer lifetime
value [7, 44]. Beyond satisfaction, other intrinsically related perceptual constructs
studied include trust, commitment, loyalty, and other perceptions of firm and
environmental attributes [61]. The key challenge with satisfaction and related perceptions
is that they can be difficult to measure for individual customers. Most
studies examine these perceptions on a sample basis through surveys, with the intent
of explaining customer relationships or measuring entire market share rather than
making individual-level predictions [17, 44].
Choice: A key observation available from a customer’s first as well as subsequent
purchases is the choice of product(s) purchased. Particularly when a firm sells goods
or services that are horizontally differentiated over differing customer preferences,
understanding the types of products a customer prefers can be useful in several ways.
From an RMT perspective, some have argued that the categories and types of products
purchased by a customer can be an indicator of their level of trust and interest in a
particular company [42]. The amount of variety in products purchased may also
provide a signal, with cross-buying increasing switching costs as consumers become
more aware of the firm’s offerings and quality [52]. Product choice can also be
informative regarding what products the consumer would be interested in purchasing
in the future, and product assortment is an important factor in retaining customers.
Channel: The channel(s) through which customers are acquired and continue to
interact with a firm provides information about customers and sets the stage for ongoing
relationships. RMT suggests that various channel interactions may impact the loyalty or
connection customers feel to the firm [8]. For instance, a customer calling and speaking
to a representative may develop a stronger relationship with the firm as a result of this
communication. Studies have also suggested that customers acquired through digital
channels tend to be more loyal and active due to self-selection effects and greater
opportunities to form connections with the company [29]. In addition, each channel is
likely to attract a different type of customer [32]: for example, online channels may
attract those who are more technology-savvy and hedonic.
Messaging: From an RMT perspective, the communications the customer receives
from the firm can also have a profound impact on future behavior [17]. Messaging can
often be focused specifically on selling: for instance, by using promotional offers to
entice a purchase. However, relationship-building messaging specifically focused on
enhancing the customer’s view of the firm is effective at retention, whereas promotional
messaging, while effective in the short term, can have varying effects over time [21].
The mode and quantity of communications received may also impact customer
behavior, with both too little and too much communication being detrimental [59].
Firm and Environment Characteristics: Characteristics of the firm and the market
environment in which it operates set the stage for the customer’s relationship with the
firm. Brand equity, payment equity/perceived fairness of the firm’s pricing policy, and
firm ethics and citizenship have all been shown to play a significant role in customer
perceptions and relationships with the firm [61]. Outside the firm, customer relationships
may be influenced by characteristics of the environment, such as dynamism,
munificence, complexity, competition and characteristics of competing firms, and
market share [8, 62]. These represent important features for incorporation into analytics
applications and, just as importantly, should inform the entire analytics process.
Particularly, changes in characteristics or perceptions of the firm and environment
should be monitored in order to update analytics solutions for continued relevance.
customer histories (often five or more years) and are less accurate and effective outside
high-purchase-volume environments [6]. Many firms provide goods or services that are
purchased infrequently by nature or attract a large number of single-purchase customers,
and no existing method addresses this challenging case. We propose that advanced
customer analytics based on rich, relationship-oriented customer data will provide an
effective solution in this setting where other methods fall short.
To create our prototype system, we partnered with a company that we will refer to as
Course Shop International (CSI), a large e-commerce and catalog-based seller of educa-
tional materials for lifelong learners. CSI is highly representative of the high single-
purchase, low-frequency environment of interest. A large majority of CSI’s customers
make only a single initial purchase, and of those who return, many months may pass
between purchases. The goal of our prototype system is to make predictions about
customers’ future behaviors just 30 days after their initial purchase, when these predictions
are most valuable. During this period, we observe aspects related to each construct in our
framework. An illustration of the problem setting for our prototype system is provided in
Figure 2. To demonstrate how firms may use our framework to develop a data infrastructure
providing agility in supporting a portfolio of analytics initiatives, our prototype
system consists of three distinct customer analytics applications:
● A churn prediction application focused on evaluating which customers the
firm should invest in through continued marketing efforts (retention)
● A conversion prediction application for identifying customers likely to
respond to individual email promotions to reduce messaging fatigue and
prevent attrition (retention/expansion)
These three applications allow us to evaluate each aspect of our framework and the
strategic value it creates. Here we focus on the churn prediction example.
Results of the combined portfolio are discussed in the evaluation section, while
details of the CLV and conversion prediction tasks are provided in online Appendix B.
Leveraging the proposed advanced customer analytics framework described previously,
we implemented a novel churn prediction system for CSI. Figure 3 shows
the system diagram, encompassing five stages: Data Lake, Feature Generation,
Data Preparation, Offline Modeling/Evaluation, and Online Implementation. These
five stages are closely aligned with facets of the proposed framework. For instance,
the Data Lake, Feature Generation, and Data Preparation components of the
system relate to the Investigate, Identify, and Acquire Potential Data section of the
framework. Offline Modeling/Evaluation is associated with Predictive Modeling and
Value-Based Evaluation Metrics. Lastly, the Online Implementation component
relates to the framework’s deployment-oriented Online Data portion.
Data
Data Lake: For use in constructing the prototype system, we obtained a variety of data
for a sample of customers making initial purchases between January 1, 2012, and March
1, 2014. As previously alluded to, incorporating rich relationship-oriented constructs
requires consideration of an array of structured and unstructured data sources: a non-
trivial task [13]. The various data sources incorporated in the system include structured
data from databases that support online transaction processing (OLTP) and CRM, as
well as unstructured data in the form of text log files, call center transcripts, and so on,
which are collectively referred to as a “data lake.” The Data Lake was generated by
obtaining these various raw data from CSI for a sample of 664,737 customers. As
depicted in Figure 4, the data lake included over 188 million raw data points.
Feature Generation: Once the data lake was constructed, the Feature Generation
component of the system was used to operationalize relationship-oriented constructs
identified in the framework. In order to simulate a realistic environment for implementation
of the prototype system and avoid data leakage that could inflate accuracy,
we utilized a chronological rolling-window approach for creating training and test
sets. Only the first 30 days of customer history after initial purchase were used to
construct input features, and a period of 365 days was used to observe outcomes. In total, we
generated 1,003 features pertaining to the various construct categories. Transaction
and Demographics represent readily available baseline constructs. Transaction features
included order timestamps, amounts, prices, discounts, payment and shipping
methods, as well as purchased product information such as course names and topics.
[Figure 3. System diagram with five stages: Data Lake, Feature Generation, Data Preparation, Offline Modeling/Evaluation, and Online Implementation. A dashed line marks a stage not included in our study.]
Demographic features included age, gender, income, net worth, education, and
household information. With respect to relationship-oriented constructs, channel
variables were also relatively straightforward to operationalize. These features
describe the channels through which a customer was acquired and/or made an initial
purchase, including acquisition e-mails, call center upsells, paid search, partners,
prospect mailings, radio, social media, and television. However, other relationship-
oriented constructs such as choice, messaging, engagement, and satisfaction necessitated
the use of more involved logic and algorithms applied to multiple data
sources (as indicated in Figure 3). We discuss these construct categories in the
remainder of the section.
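
The 30-day feature window and 365-day outcome window described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the event labels, the assumption that the outcome window begins immediately after the feature window, and the rule that a customer with no purchase in the outcome window counts as churned are all hypothetical.

```python
from datetime import date, timedelta

FEATURE_DAYS = 30    # feature window length stated in the text
OUTCOME_DAYS = 365   # outcome window length stated in the text


def split_events(first_purchase, events):
    """Split (event_date, label) pairs into feature and outcome windows.

    Assumption: the outcome window starts right after the feature window.
    """
    feature_end = first_purchase + timedelta(days=FEATURE_DAYS)
    outcome_end = feature_end + timedelta(days=OUTCOME_DAYS)
    features = [e for e in events if first_purchase <= e[0] < feature_end]
    outcomes = [e for e in events if feature_end <= e[0] < outcome_end]
    return features, outcomes


# Hypothetical event history for one customer.
events = [
    (date(2012, 1, 1), "purchase"),     # initial purchase
    (date(2012, 1, 20), "email_open"),  # inside the 30-day feature window
    (date(2012, 6, 15), "purchase"),    # inside the outcome window
    (date(2013, 3, 1), "purchase"),     # after the outcome window; ignored
]
features, outcomes = split_events(date(2012, 1, 1), events)

# Assumed label rule: no purchase in the outcome window means churn.
churned = not any(label == "purchase" for _, label in outcomes)
```

Keeping the two windows disjoint is what prevents outcome information from leaking into the input features.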
In a novel operationalization, our choice features focus on the interplay between
what products (and categories of products) a customer has purchased, relative to
what the company is offering and promoting. Figure 5 illustrates how the choice
variables were operationalized using a novel product taxonomy construction to
develop a tree of product categories, subcategories, products, and promotions related
to products purchased by the customer. Whereas prior studies have included only the
specific category of purchase as a variable, our taxonomic representation facilitates
richer contextualization of customer choice in the relationship—enabling enhanced
discriminatory potential. In the example, the purchased course “The Addictive
Brain” is used to generate features such as the number of choices in the same
category, subcategory, and promotions bundles received by the customer.
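
As a rough illustration of taxonomy-derived choice features, the sketch below counts alternatives that share a category or subcategory with a purchased course. Only the course title "The Addictive Brain" comes from the text; the catalog, its category placement, the feature names, and the function are hypothetical and stand in for the actual product taxonomy shown in Figure 5.

```python
# Hypothetical catalog: product -> (category, subcategory).
# Only "The Addictive Brain" is named in the text; its placement is invented.
CATALOG = {
    "The Addictive Brain": ("Science", "Neuroscience"),
    "Course B": ("Science", "Neuroscience"),
    "Course C": ("Science", "Physics"),
    "Course D": ("History", "Ancient History"),
}


def choice_features(purchased, promoted):
    """Count catalog and promoted alternatives on the purchase's taxonomy branch."""
    category, subcategory = CATALOG[purchased]
    alternatives = [p for p in CATALOG if p != purchased]
    promoted_alts = [p for p in promoted if p != purchased]
    return {
        "n_same_category": sum(CATALOG[p][0] == category for p in alternatives),
        "n_same_subcategory": sum(
            CATALOG[p] == (category, subcategory) for p in alternatives
        ),
        "n_promoted_same_category": sum(
            CATALOG[p][0] == category for p in promoted_alts
        ),
    }


choice = choice_features("The Addictive Brain", promoted=["Course C", "Course D"])
```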
Engagement variables that are related to how a customer interacts with a company
provide essential intermediate cues regarding the status of the relationship [19]. However,
inclusion in prior studies has typically been limited to service usage in industries such as
telecommunications. In the context of CSI, e-mail and clickstreams were prominent
avenues for customer engagement. Using the message interaction logs in our data lake,
we developed variables related to various engagement actions, including opening, forwarding,
and clicking e-mails, as well as viewing and engaging with the landing pages to
which they lead. Utilizing contact preference logs, we also developed engagement variables
related to customers’ current contact preferences, as well as changes in preferences.
Satisfaction features focus primarily on online product reviews provided by
customers. CSI currently has no method for linking review authors to their customer
database, so in order to incorporate these data, a satisfaction approximation method
was employed. Review satisfaction measures (e.g., ratings, votes) and percentage change in
these measures were aggregated at the product level for products purchased by each
customer during the first 30 days after initial purchase. These contemporary review
characteristics for products purchased by the customer were used as proxy indicators
of satisfaction. As later demonstrated, this approach for overcoming the satisfaction
integration issue provided significant benefit in our system. We also included
measures of satisfaction and other perceptions from surveys and call center logs.
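
The product-level approximation described above can be sketched as follows. This is an assumption-laden illustration: the review data, product identifiers, and function name are hypothetical, only star ratings are aggregated, and the percentage-change measures mentioned in the text are omitted.

```python
from statistics import mean

# Hypothetical product-level reviews: product -> list of star ratings.
REVIEWS = {
    "course_a": [5, 4, 5],
    "course_b": [2, 3],
}


def satisfaction_proxy(purchased_products):
    """Average the product-level mean ratings over a customer's purchases.

    Reviews are never linked to the customer; the customer simply inherits
    the aggregate review characteristics of the products purchased.
    """
    product_means = [mean(REVIEWS[p]) for p in purchased_products if p in REVIEWS]
    return mean(product_means) if product_means else None


proxy = satisfaction_proxy(["course_a", "course_b"])
no_reviews = satisfaction_proxy(["course_without_reviews"])
```

Returning `None` when no purchased product has reviews leaves the handling of missing values to the later data preparation step.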
A firm’s messaging to customers can have a profound impact on how customers
perceive the relationship [17, 21] and on future customer behavior. Customers receive a
diverse set of mass and customized physical and digital messaging. Given that the mode,
quantity, and combination of messaging can impact customer behavior [59], we propose
a novel approach for extracting messaging patterns most likely to result in future
customer purchases. Adapting highly efficient new methods [65], we employed association
rule mining to generate messaging features from hundreds of millions of messaging
records. We generated frequent item sets of messages, retaining only those sets that
culminated in a purchase in order to find messages strongly associated with purchases.
Details regarding this method are provided in online Appendix C. In addition to features
related to these association rule-mined messages, we included various other
messaging-related features, including frequency of mail/e-mail, discounts, promoted products,
overlap with categories from a customer’s initial purchase, and so forth.
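
To make the idea of purchase-ending frequent message sets concrete, here is a brute-force sketch. It is not the efficient method adapted from [65], nor the procedure in online Appendix C; the message identifiers, the support threshold, and the function name are hypothetical, and exhaustive enumeration like this would not scale to hundreds of millions of records.

```python
from collections import Counter
from itertools import combinations


def frequent_message_sets(sequences, min_support, max_size=2):
    """Return message sets that are frequent among purchase-ending sequences.

    `sequences` holds (messages, purchased) pairs. Only sequences that
    culminated in a purchase are counted, mirroring the filter in the text.
    """
    converting = [set(msgs) for msgs, purchased in sequences if purchased]
    if not converting:
        return {}
    counts = Counter()
    for msgs in converting:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(msgs), size):
                counts[itemset] += 1
    n = len(converting)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical message histories; the flag marks whether a purchase followed.
sequences = [
    (["catalog", "email_promo"], True),
    (["email_promo", "email_newsletter"], True),
    (["catalog"], False),
    (["catalog", "email_promo", "email_newsletter"], True),
]
frequent = frequent_message_sets(sequences, min_support=0.6)
```

Each retained set could then become a binary feature indicating whether a customer received that combination of messages.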
Data Preparation: Once the feature generation stage was completed, an initial data
matrix encompassing 1,003 variables for each of the 664,737 customers was constructed,
comprising approximately 667 million values. Various necessary data
preparation steps were undertaken to handle veracity issues. Features or customer
records with a high volume of missing values were removed. Outlier removal was
applied to eliminate records with abnormal entries (e.g., negative age or no transaction
history). Feature weighting by information gain and chi-squared statistics was
used to identify and remove features with zero variance or negligible information. Ultimately,
435 features were retained. Moreover, random undersampling of the majority class
(i.e., churn) was used on the training data to achieve class balance.
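
The matrix size and the class-balancing step can be illustrated with a short sketch. The record layout, the `churn` key, the fixed seed, and the function name are assumptions made for illustration; the source does not specify how the undersampling was implemented.

```python
import random

# Size of the initial data matrix stated in the text.
n_values = 1_003 * 664_737  # 666,731,211, i.e., roughly 667 million


def undersample(records, label_key="churn", seed=0):
    """Randomly drop majority-class records until both classes are equal in size."""
    positives = [r for r in records if r[label_key]]
    negatives = [r for r in records if not r[label_key]]
    if len(positives) >= len(negatives):
        majority, minority = positives, negatives
    else:
        majority, minority = negatives, positives
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    balanced = rng.sample(majority, len(minority)) + minority
    rng.shuffle(balanced)
    return balanced


# Hypothetical training set: 8 churners and 2 returning customers.
train = [{"id": i, "churn": i < 8} for i in range(10)]
balanced = undersample(train)
```

Applying the sampler to the training data only, as the text states, keeps the test set at its natural class distribution.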
Model
Offline Modeling/Evaluation: Kernel-based machine learning methods have garnered
attention from the information systems (IS) community in recent years for their ability
to derive patterns from large quantities of heterogeneous, noisy, high-velocity data [3, 4].
These methods have outperformed state-of-the-art rule-based, tree-based, Bayesian, and
deep learning models in recent benchmarking studies pertaining to voice-of-the-customer
tasks [4], while simultaneously providing the added benefit of greater transparency than
other big data machine learning methods through greater explanatory potential and
provisions for theory-driven design [2]. In sum, kernel-based methods afford the following
potential opportunities and benefits for our advanced customer analytics context:
● Principled, theory-driven kernel design by leveraging key customer nuances
elucidated by RMT
● Capability to effectively incorporate different types of customer–firm interaction
patterns manifested in diverse structured and unstructured enterprise data through
use of custom kernels designed for tabular, graphical, and string-based inputs
● Potential to efficiently fuse these diverse custom kernels through a meta-level
composite convolution kernel, providing robust and flexible ensemble-like
performance capabilities across a wide portfolio of customer analytics tasks
The tabular, hierarchical, and textual representations feed into the composite convolution
kernel, which comprises three underlying kernels: radial basis function (RBF), hybrid tree
(HT), and weighted cross entropy string (WCES). The main intuition guiding our
composite convolution setup is that customer–firm relationships embody dynamic,
multifaceted patterns that may encompass a plethora of manifestations, including
point-value quantifications of consumer decisions and actions, structural representations
of latent preferences, and semantic/stylistic indicators of customer proclivities. A critical
aspect of kernel-based methods is the kernel matrix comprising the similarity scores
between any two training instances. Next we describe each of the underlying kernels at a
high level (online Appendix D includes additional details).
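To make the notion of a kernel matrix concrete, the following sketch builds one for three hypothetical customers using a standard Gaussian (RBF) similarity on point-value features. The `gamma` value, the feature vectors, and the function names are assumptions chosen for illustration only.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def gram_matrix(instances, kernel):
    """Kernel matrix K with K[i][j] = kernel(instances[i], instances[j])."""
    n = len(instances)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Kernels are symmetric, so each pair is computed once.
            K[i][j] = K[j][i] = kernel(instances[i], instances[j])
    return K

# Hypothetical customers described by two point-value features each.
customers = [[0.0, 1.0], [0.0, 1.0], [3.0, 5.0]]
K = gram_matrix(customers, rbf_kernel)
```

Any function that scores a pair of customers, including the tree and string similarities described next, can be passed as `kernel` to produce its own matrix over the same training instances.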
Using the tabular data representation, the RBF kernel uses a Gaussian classifier to
capture nonlinear “localized learning” patterns from point-value features [10]. The
HT kernel combines two tree methods [15]: a novel weighted shortest path tree
approach and a subtree method, both utilizing the hierarchical product line taxon-
omy. The shortest path approach utilizes two large global probabilistic trees encom-
passing all product purchases for positive and negative class cases observed in the
training set (e.g., churn and return). Link strengths between tree nodes are propor-
tional to product co-occurrence likelihood across all class instances in the training
set. Any two customers in the training set are compared on both trees by computing
the shortest paths between all nodes in their respective trees. To account for
potentially nuanced complementary/substitutive relations between product purchases
across customers, a purchase co-occurrence matrix is used to weight similarities. In
Figure 6, the Weighted Shortest Path illustration depicts an example involving two
customers (X and Y nodes), each with two purchases (leaf nodes are products,
nonleaf nodes categories).
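One plausible reading of the shortest path comparison is sketched below. It is not the authors' implementation: the sketch assumes a single taxonomy graph, converts each link strength into a traversal cost of 1/strength, finds shortest paths with Dijkstra's algorithm, and averages 1/(1 + cost) over all cross-customer product pairs. The link strengths and node names are made up, and the purchase co-occurrence weighting is omitted.

```python
import heapq

def build_graph(links):
    """Turn (parent, child, strength) links into undirected costs of 1/strength."""
    graph = {}
    for parent, child, strength in links:
        graph.setdefault(parent, {})[child] = 1.0 / strength
        graph.setdefault(child, {})[parent] = 1.0 / strength
    return graph

def shortest_path_cost(graph, source, target):
    """Dijkstra's algorithm on a graph given as {node: {neighbor: cost}}."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, step in graph[node].items():
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")

def path_similarity(graph, items_x, items_y):
    """Average 1 / (1 + shortest-path cost) over all cross-customer item pairs."""
    pairs = [(a, b) for a in items_x for b in items_y]
    return sum(1.0 / (1.0 + shortest_path_cost(graph, a, b)) for a, b in pairs) / len(pairs)

# Hypothetical taxonomy: root -> categories -> products, with made-up link strengths.
links = [("root", "books", 0.5), ("root", "music", 0.5),
         ("books", "p1", 0.5), ("books", "p2", 0.25), ("music", "p3", 0.5)]
graph = build_graph(links)
sim_same_category = path_similarity(graph, ["p1"], ["p2"])   # path p1-books-p2
sim_cross_category = path_similarity(graph, ["p1"], ["p3"])  # path runs through root
```

Under these assumptions, products in the same category score higher than products that are connected only through the root.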
The second half of the HT kernel utilizes a labeled subtree method. Whereas the
shortest path approach is well-suited to capture probabilistic similarity between custo-
mer preferences, subtrees are effective in incorporating structural taxonomic similarity
[15] such as commonalities between customer preferences for certain specific cate-
gories, or category breadth versus depth similarities. Instead of relying on global trees,
the subtree method only compares the customer purchase trees. For instance, in the
Labeled SubTree example shown in Figure 6, X and Y gray customers each purchased
three items, including one common item (#3). The subtree method compares all unique
subtrees from each customer’s purchase tree. Similarity between any two customers is
computed as the proportion of matching subtrees from their respective purchase trees. In
both the shortest path and subtree kernels, we also incorporate various constructs from
the tabular representation for each customer order item (i.e., a single block).
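The subtree comparison can be sketched as follows, under simplifying assumptions: only the complete subtree rooted at each node is enumerated (convolution tree kernels typically consider a richer set of fragments), and "proportion of matching subtrees" is read as shared unique subtrees divided by all unique subtrees across both customers. The tree encoding and labels are hypothetical.

```python
def canonical(tree):
    """Serialize a (label, children) tree with children sorted,
    so structurally equal trees give equal strings."""
    label, children = tree
    return f"{label}({','.join(sorted(canonical(c) for c in children))})"

def unique_subtrees(tree):
    """Canonical strings for the complete subtree rooted at every node."""
    found = {canonical(tree)}
    for child in tree[1]:
        found |= unique_subtrees(child)
    return found

def subtree_similarity(tree_x, tree_y):
    """Shared unique subtrees divided by all unique subtrees of both trees."""
    sx, sy = unique_subtrees(tree_x), unique_subtrees(tree_y)
    return len(sx & sy) / len(sx | sy)

def leaf(label):
    return (label, ())

# Hypothetical purchase trees: each customer bought three items, sharing item #3.
customer_x = ("root", (("catA", (leaf("#1"), leaf("#2"))), ("catB", (leaf("#3"),))))
customer_y = ("root", (("catB", (leaf("#3"), leaf("#4"))), ("catC", (leaf("#5"),))))
similarity = subtree_similarity(customer_x, customer_y)
```

In this toy case only the leaf for item #3 matches, because the two `catB` subtrees contain different product sets; a fragment-based enumeration would also credit the partial category overlap.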
WCES is a novel string kernel [37] geared toward uncovering semantic and stylistic
customer tendencies hidden in text descriptions or logs. All training texts are tokenized
separately within each of the four categories of text, and unigram tokens are weighted
using the information gain heuristic. Due to class imbalance, undersampling, and use of a
single training set, we also incorporate a feature stability heuristic as part of our token
weights to alleviate overfitting [33]. These weights are used as part of a cross entropy-
based customer similarity scoring mechanism. Cross entropy has been shown to be
effective in identifying commonalities in unstructured data distributions [30]. In our
context, it can help identify hidden commonalities in customer proclivities based on
purchase of introductory versus advanced products or certain styles or genres of offerings,
specific sticky promotional language, specialized purchase use cases, subtle communica-
tion and interaction indicators, and so forth. As depicted in Figure 6, for each customer, we
randomly extract a predefined number of text windows of certain length (e.g., 50
characters) for each of the four textual representations. All such window pair combinations
between any two customers are compared using a weighted similarity comparison.
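A rough sketch of the window sampling and weighted comparison follows. The exact WCES formulation is given in online Appendix D and is not reproduced here; this sketch assumes a token-weighted cross entropy over whitespace-delimited unigrams with add-alpha smoothing, mapped to a similarity through exp(-H) and averaged over all window pairs. The token weights, texts, and parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def sample_windows(text, n_windows, length, rng):
    """Draw n_windows random character windows of a fixed length from text."""
    if len(text) <= length:
        return [text] * n_windows
    starts = [rng.randrange(len(text) - length + 1) for _ in range(n_windows)]
    return [text[s:s + length] for s in starts]

def weighted_cross_entropy(window_p, window_q, weights, alpha=1.0):
    """H_w(p, q) = -sum_t w_t * p(t) * log q(t), with add-alpha smoothing of q."""
    p_counts, q_counts = Counter(window_p.split()), Counter(window_q.split())
    vocab = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values()) + alpha * len(vocab)
    entropy = 0.0
    for token, count in p_counts.items():
        p = count / p_total
        q = (q_counts[token] + alpha) / q_total
        entropy -= weights.get(token, 1.0) * p * math.log(q)  # unlisted tokens get weight 1
    return entropy

def wces_similarity(text_x, text_y, weights, n_windows=5, length=50, seed=0):
    """Mean exp(-H_w) over all window pairs; higher means more similar."""
    rng = random.Random(seed)
    windows_x = sample_windows(text_x, n_windows, length, rng)
    windows_y = sample_windows(text_y, n_windows, length, rng)
    scores = [math.exp(-weighted_cross_entropy(a, b, weights))
              for a in windows_x for b in windows_y]
    return sum(scores) / len(scores)

# Hypothetical product descriptions and information-gain-style token weights.
token_weights = {"intro": 2.0, "advanced": 2.0}
text_a = "intro guitar lessons for beginners"
text_b = "intro guitar songbook for beginners"
text_c = "advanced jazz theory masterclass"
sim_ab = wces_similarity(text_a, text_b, token_weights)
sim_ac = wces_similarity(text_a, text_c, token_weights)
```

With these inputs the two introductory descriptions score as more similar to each other than to the advanced one.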
At the composite kernel stage, the RBF, HT, and WCES customer similarity kernel
matrices are fused using multiple kernel learning to allow robust predictive power
across an array of customer acquisition, retention, and expansion-related customer
analytics tasks. We employ this composite convolution kernel SVM model within
our prototype system to predict customer behaviors across a portfolio of applica-
tions. For comparison we also examined more traditional machine learning algo-
rithms, including stochastic gradient boosting, CART, boosted generalized linear
model, C5.0, random forest, and naive Bayes. Because of its ability to incorporate a
richer set of structured and unstructured data, the custom kernel SVM solution
described here significantly outperformed these methods. However, we found that
the more important factor for performance was the variety of data constructs
provided to the models. For further details, see online Appendix E.
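The fusion step can be illustrated with a simple stand-in for multiple kernel learning. The sketch below does not reproduce the authors' procedure: it weights each kernel matrix by its kernel-target alignment with the training labels and forms a convex combination, which remains a valid kernel whenever the inputs are valid kernels. The toy matrices and labels are hypothetical.

```python
def frobenius(A, B):
    """Frobenius inner product of two equally sized square matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

def alignment(K, y):
    """Kernel-target alignment <K, yy^T> / (||K|| * ||yy^T||) for labels in {-1, +1}."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frobenius(K, Y) / ((frobenius(K, K) * frobenius(Y, Y)) ** 0.5)

def fuse_kernels(kernels, y):
    """Convex combination of kernel matrices, weighted by non-negative alignment."""
    scores = [max(alignment(K, y), 0.0) for K in kernels]
    total = sum(scores)
    if total == 0:
        weights = [1.0 / len(kernels)] * len(kernels)  # fall back to uniform weights
    else:
        weights = [s / total for s in scores]
    n = len(y)
    fused = [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
             for i in range(n)]
    return fused, weights

# Hypothetical labels and two toy kernel matrices over three training customers.
y = [1, 1, -1]
K_block = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]     # mirrors the labels
K_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # uninformative
fused, weights = fuse_kernels([K_block, K_identity], y)
```

A fused matrix of this kind could then be supplied to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.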
Evaluation Metrics: To evaluate our system as well as determine the value of
individual data constructs, we created evaluation metrics based on the value of
actions taken as a result of model predictions. For the task of churn prediction, the
proposed action to be taken based on the model is to discontinue marketing physical
catalog mailings to the top 10 percent of customers most likely to churn. According
to CSI, the average cost of catalog mailings over the life of a customer is $66, and
the average lifetime revenue from a customer who does not churn after the first
purchase is $260. The costs and benefits of system-driven actions for the CLV and
conversion tasks are detailed in online Appendix B. Applied across an expected
population of 250,000 new customers each year, these figures are used to arrive at
the overall value for our system as well as each individual construct category.
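A worked example may clarify how such an action-based metric can be computed for churn. The cost ($66), revenue ($260), population (250,000), and 10 percent cutoff come from the text above; everything else is an assumption made for illustration. In particular, the sketch assumes the full mailing cost is saved for every flagged customer, that a flagged customer who would not have churned forfeits the full lifetime revenue, and that the precision of 0.90 is a hypothetical input rather than a reported result.

```python
def churn_action_value(population, top_fraction, precision,
                       mailing_cost=66, lifetime_revenue=260):
    """Net value of withholding catalogs from the customers ranked most likely to churn.

    Assumption: every flagged customer saves the mailing cost, and every flagged
    customer who would not have churned forfeits the full lifetime revenue.
    """
    flagged = round(population * top_fraction)
    non_churners = round(flagged * (1 - precision))
    savings = flagged * mailing_cost
    forfeited = non_churners * lifetime_revenue
    return savings - forfeited

# Hypothetical precision of 0.90 among the flagged top 10 percent.
net_value = churn_action_value(250_000, 0.10, precision=0.90)

# Precision at which savings and forfeited revenue cancel under these assumptions.
break_even_precision = 1 - 66 / 260
```

Under these assumptions the action is worthwhile only when precision in the flagged group exceeds roughly 75 percent, which is why accuracy within the top 10 percent is the quantity of interest.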
Online Implementation: After building the prototype system, we delivered the system
and our results to CSI. They are currently testing to validate expected outcomes for
customers and creating a plan to implement the system in a production capacity. In
addition, based on the results of construct evaluation discussed next, CSI has already
made investments in infrastructure for data management and integration (discussed at
the end of the following section), which provides support for the core benefit of our
framework.
Evaluation
Through evaluation of our prototype system implementation, we demonstrate the
utility as well as economic value of our contributed artifacts [49]. In the setting of
CSI, where customer outcomes are highly uncertain and costs of serving customers
are high, early predictions of customer behaviors are critical. Therefore, we use only
customer characteristics observable within the first 30 days of an initial purchase as
inputs to system predictions (see Note 2). To ensure realistic results, we utilized chronological
evaluation with nine-month rolling windows for training, and one-month windows
for capturing test observations.
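The chronological windowing scheme can be sketched as a generator of month-aligned splits. The date range used in the example is hypothetical, and the half-open interval convention is an assumption of this sketch.

```python
from datetime import date

def add_months(d, months):
    """First day of the month that falls `months` after the month of d."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def rolling_windows(first_month, last_month, train_months=9, test_months=1):
    """Yield (train_start, train_end, test_end) for half-open intervals
    [train_start, train_end) and [train_end, test_end)."""
    start = add_months(first_month, 0)
    data_end = add_months(last_month, 1)
    while add_months(start, train_months + test_months) <= data_end:
        train_end = add_months(start, train_months)
        yield start, train_end, add_months(train_end, test_months)
        start = add_months(start, test_months)  # slide forward by one test window

# Hypothetical data range covering January through December 2015.
windows = list(rolling_windows(date(2015, 1, 1), date(2015, 12, 1)))
```

Each test month follows its training window directly, so no model is evaluated on customers observed before the end of its training period.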
improves the model accuracy. Engagement, choice, and messaging appear to be the
most valuable constructs from the add-in perspective, improving accuracy signifi-
cantly over the add-in satisfaction model, which, in turn, provides a significant
improvement over the add-in channel. The add-in comparisons are useful for
providing initial direction to determine which single construct could add the most
value if focused on first. From the leave-out perspective, engagement, satisfaction,
and messaging are the most important constructs. Depending on costs of data
management, integration, and real-time feature engineering of various constructs,
the model could be further tested with various construct subsets to determine the set
providing optimal value. It is important to note that all of the leave-out models
significantly outperform all of the add-in models, pointing to significant synergy
among the relationship-oriented constructs in making accurate predictions.
included in their models based on data available for CSI. For instance, satisfaction
constructs were operationalized through online reviews contemporary to a customer’s
initial purchase, and engagement was operationalized through e-mail open and click
rates, as well as opt-in or opt-out of communications from CSI, as described previously.
Each model was evaluated using the same windowing strategy as the prototype system.
As shown in Table 2, the prototype system outperforms each of the benchmark
models by a wide margin. Based on z-tests for differences in model accuracies for
the top 10 percent most likely churners, the prototype system has significantly higher
accuracies than all other models at a significance of p < .001. The improvement of
the prototype over existing models is also highly economically significant, with net
savings of 7.0 percent of total marginal marketing spending, as compared to 2.9
percent to 4.8 percent saved by other models. This is despite the fact that the
benchmark models comprise leading approaches from marketing, IS, and machine
learning disciplines.
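A z-test of this kind can be sketched as a pooled two-proportion test. The exact test specification is not given in this section, so the sketch assumes the two accuracies come from independent samples; the counts in the example are hypothetical and are not taken from Table 2.

```python
import math

def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Pooled two-sided z-test for a difference between two accuracies."""
    p_a, p_b = correct_a / n_a, correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 800 of 1,000 correct versus 700 of 1,000 correct.
z, p_value = two_proportion_z_test(800, 1000, 700, 1000)
```

When two models are scored on the same customers, a paired test such as McNemar's may be more appropriate; the independent-sample form is shown only because it is the simplest instance.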
The CLV and conversion applications are detailed in online Appendix B. Table 3
demonstrates the total value achieved across the analytics portfolio as a whole, given
the inclusion or exclusion of various construct categories. Along with satisfaction and
engagement, which provided value to the churn prediction model, choice and messa-
ging are shown to have significant value across the portfolio of analytics applications.
This analysis illustrates how our proposed framework may be used to develop a
platform for agility in deploying customer analytics to achieve strategic value through
big data.
If, for example, investments in infrastructure were made to integrate all available
constructs into a feature generation pipeline for live deployment of the full model,
the estimated annualized value provided by the analytics portfolio would be
$5,702,675. This value (and that of future analytics) may be compared with the
costs of the infrastructure investment as a whole, as well as for each data construct/
source. For instance, if it is determined that the cost of obtaining and integrating data
for the engagement construct exceeds $708,100 (per annum), the firm may choose to
implement a system without it, resulting in a reduced $4,994,575 annualized value.
In addition to the various options shown in Table 3, other construct subsets may be
tested to choose the best complement of data. This evaluation provides value-based
justification for IT investment in data management and integration efforts to support
analytics initiatives.
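The include-or-exclude decision for a construct reduces to comparing its marginal value with its annual data cost. The two portfolio values below are the figures quoted above; the annual costs passed to `keep_construct` are hypothetical.

```python
FULL_VALUE = 5_702_675                # annualized value with all constructs
VALUE_WITHOUT_ENGAGEMENT = 4_994_575  # annualized value with engagement left out

def marginal_value(full_value, value_without):
    """Annualized value attributable to the construct that was left out."""
    return full_value - value_without

def keep_construct(full_value, value_without, annual_cost):
    """Keep the construct only if the value it adds exceeds its annual data cost."""
    return marginal_value(full_value, value_without) > annual_cost

engagement_value = marginal_value(FULL_VALUE, VALUE_WITHOUT_ENGAGEMENT)

# Hypothetical annual costs of obtaining and integrating engagement data.
keep_at_500k = keep_construct(FULL_VALUE, VALUE_WITHOUT_ENGAGEMENT, 500_000)
keep_at_800k = keep_construct(FULL_VALUE, VALUE_WITHOUT_ENGAGEMENT, 800_000)
```

The same comparison can be repeated for each construct or construct subset, with the leave-out value taken from the corresponding evaluation.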
Results Discussion
The instantiation of a portfolio of advanced customer analytics applications demon-
strates how our framework may be leveraged to develop capabilities for agility
through big data analytics and create strategic value and sustainable competitive
advantage in a dynamic market environment. By creating a value-justified infra-
structure for data integration and management to support advanced customer analy-
tics, firms can create a portfolio of analytics applications to serve a variety of
strategic purposes, providing significant value. The applications in our portfolio
combine to generate nearly $6 million annualized value and represent only a small
fraction of the opportunity for deploying analytics from this infrastructure to
strengthen and leverage customer relationships.
Evaluation of the system demonstrates that advanced customer analytics
systems built on relationship-oriented data from a variety of sources can
accurately predict customer behavior and add value. The prototype system
for churn prediction significantly outperformed each of the leading benchmarks
for comparison, including the well-adopted BG/NBD approach, evidencing the
impetus for relationship-oriented advanced customer analytics supported by IT-
enabled data infrastructure. Further, in addition to our system based on the
Conclusion
In this study, we present a framework for designing advanced customer analytics
solutions based on relationship-oriented constructs. This work encompasses three
key contributions. First and foremost, we contribute to the design science literature
in the creation of a synergistic ecosystem of novel IT artifacts for performing
advanced customer analytics in the era of big data [28]. Our framework provides
guidance for the agile development and deployment of advanced customer analytics
solutions that predict customer behavior and inform strategic business decisions.
Following guiding principles from our framework, we develop a novel kernel-based
machine learning method that is custom-designed to extract insight and value from a
rich variety of relationship-oriented data constructs. We also provide a prototype
system instantiation of a portfolio of advanced customer analytics applications for a
firm with high proportions of single or infrequent purchase customers, a problem
that cannot be addressed by the siloed approaches of existing customer analytics
methods [6, 41]. The system represents a rigorous proof-of-concept, while its results
offer practical relevance [28]. We show that this system enables significant strategic
value, contributing nearly $6 million in estimated annualized benefits. This value
will increase with additional analytics supported by the value-justified data infra-
structure informed by our framework.
Second, we contribute to managerial practice for firms attempting to employ big
data analytics to drive strategic value. Organizations are overwhelmed by available
data [40], and IT managers who are asked to support big data analytics through data
management and integration are in need of a blueprint for valuing various data
sources and justifying their efforts through return on investment in infrastructure [11,
35]. The framework provides a structure through which the various relationship-
oriented constructs can be evaluated based on added business value relative to costs
of data acquisition, management, integration, and real-time feature construction. Our
contributions in this area are validated by the direct impact of our results for CSI in
motivating strategic investment in data management and integration infrastructure.
Third, our research contributes to the nascent literature regarding predictive
analytics through the use of big data [2, 5, 13, 22]. The kernel theory that we
employ in our design, RMT, has investigated many of the constructs that are central
to our framework [7, 17, 44]. However, all of this work has been from an explana-
tory, rather than predictive, viewpoint. We provide a firm foundation that allows us
to answer recent calls by many in the IS field for predictive analytics [56], particu-
larly utilizing volume and variety of available data [13, 22] to predict micro-level
outcomes at the individual level [5, 22]. The advanced customer analytics our
framework allows are not feasible in the absence of big data from a variety of
sources providing a relationship-oriented view of customer behavior.
In this era of profound digital transformation, customer agility lies at the intersection
of customer analytics, big data, and IT strategy. Firms capable of taking advantage of
such agility are best positioned to achieve sustainable competitive advantage. Our study
makes important contributions to the nascent literature on this critical topic.
Supplemental File
Supplemental data for this article can be accessed on the publisher’s website at
https://doi.org/10.1080/07421222.2018.1451957
NOTES
1. All studies cited in the references for the main body of our manuscript were reviewed, as
well as additional papers listed in online Appendix F (over 200 relevant studies in total).
While it is not feasible to review all relevant studies, this creates a representative set from
which we may draw conclusions about the completeness of our construct set.
2. We evaluated other observation period lengths (60 and 90 days), but the small improve-
ments in predictive power provided were outweighed by the reduction in value caused by
longer lead times to predictions.
3. No data for the firm and environment characteristics construct is included in our
prototype system. As noted in the description of this construct within the broader framework,
although potentially useful for prediction, this construct is likely more informative of shifts
suggesting models be revisited as firms monitor analytics solutions after deployment.
4. We would like to thank Peter Fader for his support as we implemented this model, as
well as for providing the data set used in his paper so that we could verify our implementation
to be identical.
REFERENCES
1. Abbasi, A.; Albrecht, C.; Vance, A.; and Hansen, J. Metafraud: A meta-learning frame-
work for detecting financial fraud. MIS Quarterly, 36, 4 (2012), 1293–1327.
2. Abbasi, A.; Sarker, S.; and Chiang, R.H.L. Big data research in information systems:
Toward an inclusive research agenda. Journal of the Association for Information Systems, 17, 2
(2016), 1–32.
3. Abbasi, A.; Zahedi, F.; Zeng, D.; Chen, Y.; Chen, H.; and Nunamaker, J.F. Enhancing
predictive analytics for anti-phishing by exploiting website genre information. Journal of
Management Information Systems, 31, 4 (2015), 109–157.
4. Abbasi, A.; Zhou, Y.; Deng, S.; and Zhang, P. Text analytics for sense-making in social
media: A language-action perspective. MIS Quarterly, forthcoming.
5. Agarwal, R., and Dhar, V. Big data, data science, and analytics: The opportunity and
challenge for IS research. Information Systems Research, 25, 3 (2014), 443–448.
6. Ballings, M., and Van Den Poel, D. Customer event history for churn prediction: How
long is long enough? Expert Systems with Applications, 39, 18 (2012), 13517–13522.
7. Bolton, R.N. A dynamic model of the duration of the customer’s relationship with a
continuous service provider: The role of satisfaction. Marketing Science, 17, 1 (1998), 45–65.
8. Bolton, R.N.; Lemon, K.N.; and Verhoef, P.C. The theoretical underpinnings of custo-
mer asset management: A framework and propositions for future research. Journal of the
Academy of Marketing Science, 32, 3 (2004), 271–292.
9. Buckinx, W., and Van den Poel, D. Customer base analysis: Partial defection of
behaviourally loyal clients in a non-contractual FMCG retail setting. European Journal of
Operational Research, 164, 1 (2005), 252–268.
10. Burges, C.J. A tutorial on support vector machines for pattern recognition. Data Mining
and Knowledge Discovery, 2, 2 (1998), 121–167.
11. Cappiello, C.; Francalanci, C.; and Pernici, B. Time-related factors of data quality in informa-
tion multichannel systems. Journal of Management Information Systems, 20, 3 (2003), 71–91.
12. Chen, D.Q.; Preston, D.S.; and Swink, M. How the use of big data analytics affects
value creation in supply chain management. Journal of Management Information Systems, 32,
4 (2015), 4–39.
13. Chen, H., and Storey, V.C. Business intelligence and analytics: From big data to big
impact. MIS Quarterly, 36, 4 (2012), 1165–1188.
14. Chen, P.S., and Hitt, L.M. Measuring switching costs and the determinants of customer
retention in internet-enabled businesses: A study of the online brokerage industry. Information
Systems Research, 13, 3 (2002), 255–274.
15. Collins, M., and Duffy, N. Convolution kernels for natural language. In Advances in
Neural Information Processing Systems, Vancouver, British Columbia, Canada, 2002, pp.
625–632.
16. Coussement, K., and De Bock, K.W. Customer churn prediction in the online gambling
industry: The beneficial effect of ensemble learning. Journal of Business Research, 66, 9
(2013), 1629–1636.
17. Crosby, L.A., and Stephens, N. Effects of relationship marketing on satisfaction, reten-
tion, and prices in the life insurance industry. Journal of Marketing Research, 24, 4 (1987),
404–411.
18. Davenport, T.H. Analytics 3.0. Harvard Business Review, December 2013, 64–72.
19. van Doorn, J.; Lemon, K.N.; Mittal, V.; et al. Customer engagement behavior:
Theoretical foundations and research directions. Journal of Service Research, 13, 3 (2010),
253–266.
20. Fader, P.S.; Hardie, B.G.S.; and Lee, K.L. Counting your customers the easy way: An
alternative to the Pareto/NBD model. Marketing Science, 24, 2 (2005), 275–284.
21. Gázquez-Abad, J.C.; Canniére, M.H. De; and Martínez-López, F.J. Dynamics of custo-
mer response to promotional and relational direct mailings from an apparel retailer: The
moderating role of relationship strength. Journal of Retailing, 87, 2 (2011), 166–181.
22. Goes, P.B. Big data and IS research. MIS Quarterly, 38, 3 (2014), iii–viii.
23. Goodhue, D.L.; Kirsch, L.J.; Quillard, J.A.; and Wybo, M.D. Strategic data planning:
Lessons from the field. MIS Quarterly, 16, 1 (1992), 11–34.
24. Goodhue, D.L.; Wybo, M.D.; and Kirsch, L.J. The impact of data integration on the
costs and benefits of information systems. MIS Quarterly, 16, 3 (1992), 293–311.
25. Gunarathne, P.; Rui, H.; and Seidmann, A. Whose and what social media complaints
have happier resolutions? Evidence from Twitter. Journal of Management Information
Systems, 34, 2 (2017), 314–340.
26. Gupta, S.; Hanssens, D.; Hardie, B.; et al. Modeling customer lifetime value. Journal of
Service Research, 9, 2 (2006), 139–155.
27. Heudecker, N., and White, A. The data lake fallacy: All water and little substance.
Gartner, July 2014, 6.
28. Hevner, A.R.; March, S.T.; Park, J.; and Ram, S. Design science in information systems
research. MIS Quarterly, 28, 1 (2004), 75–105.
29. Hitt, L.M., and Frei, F.X. Do better customers utilize electronic distribution channels?
The case of PC banking. Management Science, 48, 6 (2002), 732–748.
30. Juola, P., and Baayen, H. A controlled-corpus experiment in authorship identification by
cross-entropy. Literary and Linguistic Computing, 20 (2005), 59–67.
31. Karimi, J., and Walter, Z. The role of dynamic capabilities in responding to digital
disruption: A factor-based study of the newspaper industry. Journal of Management
Information Systems, 32, 1 (2015), 39–81.
32. Keane, T.J., and Wang, P. Applications for the lifetime value model in modern news-
paper publishing. Journal of Direct Marketing, 9, 2 (1995), 59–66.
33. Koppel, M.; Akiva, N.; and Dagan, I. Feature instability as a criterion for selecting
potential style markers. JASIST, 57, 11 (2006), 1519–1525.
34. Kunz, W.; Aksoy, L.; Bart, Y.; et al. Customer engagement in a big data world. Journal
of Services Marketing, 31, 2 (2017), 161–171.
35. Laney, D. Why and how to measure the value of your information assets. Gartner,
August 2015, 1–22.
36. Lemmens, A., and Croux, C. Bagging and boosting classification trees to predict churn.
Journal of Marketing Research, 43, 2 (2006), 276–286.
37. Lodhi, H.; Saunders, C.; Shawe-Taylor, J.; Cristianini, N.; and Watkins, C. Text
classification using string kernels. Journal of Machine Learning Research, 2 (2002), 419–444.
38. Lu, Y., and Ramamurthy, K. Understanding the link between information technology
capability and organizational agility: An empirical examination. MIS Quarterly, 35, 4 (2011),
931–954.
39. Lyytinen, K., and Grover, V. Management misinformation systems: A time to revisit?
Journal of the Association for Information Systems, 18, 3 (2017), 206–230.
40. McAfee, A., and Brynjolfsson, E. Big data: The management revolution. Harvard
Business Review, October 2012, 60–68.
41. Miglautsch, J. Application of RFM principles: What to do with 1–1–1 customers?
Journal of Database Marketing, 9, 4 (2002), 319–324.
42. Miguéis, V.L.; Van den Poel, D.; Camanho, A.S.; and Falcão e Cunha, J. Modeling
partial customer churn: On the value of first product-category purchase sequences. Expert
Systems with Applications, 39, 12 (2012), 11250–11256.
43. Mithas, S.; Ramasubbu, N.; and Sambamurthy, V. How information management
capability influences firm performance. MIS Quarterly, 35, 1 (2011), 237–256.
44. Mittal, V., and Kamakura, W.A. Satisfaction, repurchase intent, and repurchase beha-
vior: Investigating the moderating effect of customer characteristics. Journal of Marketing
Research, 38 (February 2001), 131–142.
45. Neslin, S.A.; Taylor, G.A.; Grantham, K.D.; and McNeil, K.R. Overcoming the
“recency trap” in customer relationship management. Journal of the Academy of Marketing
Science, 41, 3 (2013), 320–337.
46. Neslin, S.A.; Gupta, S.; Kamakura, W.; Lu, J.; and Mason, C.H. Defection detection:
Measuring and understanding the predictive accuracy of customer churn models. Journal of
Marketing Research, 43 (May 2006), 204–211.
47. Nunamaker, J.F.; Briggs, R.O.; Derrick, D.C.; and Schwabe, G. The last research mile:
Achieving both rigor and relevance in information systems research. Journal of Management
Information Systems, 32, 3 (2015), 10–47.
48. Palmer, A. The evolution of an idea: An environmental explanation of relationship
marketing. Journal of Relationship Marketing, 1, 1 (2002), 79–94.
49. Prat, N.; Comyn-Wattiau, I.; and Akoka, J. A taxonomy of evaluation methods for informa-
tion systems artifacts. Journal of Management Information Systems, 32, 3 (2015), 229–267.
50. Ransbotham, B.S., and Kiron, D. Analytics as a source of business innovation. MIT
Sloan Management Review, February 2017, 1–16.
51. Ransbotham, S.; Kiron, D.; and Prentice, P.K. Minding the analytics gap. MIT Sloan
Management Review, 56 (Spring 2015), 63–68.
52. Reinartz, W.J., and Kumar, V. The impact of customer relationship characteristics on
profitable lifetime duration. Journal of Marketing, 67 (January 2003), 77–99.
53. Roberts, N., and Grover, V. Leveraging information technology infrastructure to facil-
itate a firm’s customer agility and competitive activity: An empirical investigation. Journal of
Management Information Systems, 28, 4 (2012), 231–270.
54. Sambamurthy, V.; Bharadwaj, A.; and Grover, V. Shaping agility through digital
options: Reconceptualizing the role of information technology in contemporary firms. MIS
Quarterly, 27, 2 (2003), 237–263.
55. Schmittlein, D.C.; Morrison, D.G.; and Colombo, R. Counting your customers: Who are
they and what will they do next? Management Science, 33, 1 (1987), 1–24.
56. Shmueli, G., and Koppius, O.R. Predictive analytics in information systems research.
MIS Quarterly, 35, 3 (2011), 553–572.
57. Teo, T.S.H., and King, W.R. Integration between business planning and information
systems planning: An evolutionary-contingency perspective. Journal of Management
Information Systems, 14, 1 (1997), 185–214.
58. Vanderveld, A., and Han, A. An engagement-based customer lifetime value system for
e-commerce. In 22nd ACM SIGKDD Conference on Knowledge Discovery and Data
Mining. 2016.
59. Venkatesan, R., and Kumar, V. Framework for customer selection. Journal of
Marketing, 68 (October 2004), 106–125.
60. Verbraken, T.; Verbeke, W.; and Baesens, B. A novel profit maximizing metric for
measuring classification performance of customer churn prediction models. IEEE
Transactions on Knowledge and Data Engineering, 25, 5 (2013), 961–973.
61. Verhoef, P.C. Understanding the effect of customer relationship management efforts on
customer retention and customer share development. Journal of Marketing, 67 (October 2003).
62. Voss, G.B.; Godfrey, A.; and Seiders, K. How complementarity and substitution alter the
customer satisfaction–repurchase link. Journal of Marketing, 74 (November 2010), 111–127.
63. Wagner, C., and Majchrzak, A. Enabling customer-centricity using wikis and the wiki
way. Journal of Management Information Systems, 23, 3 (2007), 17–43.
64. Walls, J.G.; Widmeyer, G.R.; and Sawy, O.A. El. Building an information system
design theory for vigilant EIS. Information Systems Research, 3, 1 (1992), 36–59.
65. Wang, K., and Skadron, K. Association rule mining with the Micron Automata
Processor. IEEE International Parallel and Distributed Processing Symposium, Hyderabad,
India. pp. 689–699, 2015.
66. Wixom, B., and Ross, J. How to monetize your data. MIT Sloan Management Review,
January 2017, 10–13.