Uplift Modelling
Uplift modelling, also known as incremental modelling, true lift modelling, or net modelling, is a predictive
modelling technique that directly models the incremental impact of a treatment (such as a direct marketing action)
on an individual's behaviour.
Uplift modelling has applications in customer relationship management for up-sell, cross-sell and retention
modelling. It has also been applied to political elections and personalised medicine. Unlike the related differential
prediction concept in psychology, uplift modelling assumes an active agent.
Introduction
Uplift modelling uses a randomised scientific control not only to measure the effectiveness of an action but also to
build a predictive model that predicts the incremental response to the action. The response could be a binary
variable (for example, a website visit)[1] or a continuous variable (for example, customer revenue).[2] Uplift
modelling is a data mining technique that has been applied predominantly in the financial services,
telecommunications and retail direct marketing industries to up-sell, cross-sell, churn and retention activities.
Measuring uplift
The uplift of a marketing campaign is usually defined as the difference in response rate between a treated group and
a randomized control group. This allows a marketing team to isolate the effect of a marketing action and measure
the effectiveness or otherwise of that individual marketing action. Honest marketing teams will only take credit for
the incremental effect of their campaign.
However, many marketers define lift (rather than uplift) as the difference in response rate between treatment and
control, so uplift modeling can be defined as improving (upping) lift through predictive modeling.
The table below shows the number of responses and the calculated response rate for a hypothetical marketing
campaign. This campaign would be defined as having a response rate uplift of 5%. It has created 50,000
incremental responses (100,000 - 50,000).
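The arithmetic behind these figures can be sketched in a few lines of Python. The response counts and the 5% uplift come from the text; the group sizes of 1,000,000 customers each are an assumption made here so that the rates can be computed, and the function name `uplift` is illustrative only.

```python
def uplift(treated_responses, treated_size, control_responses, control_size):
    """Difference in response rate between the treated and control groups."""
    return treated_responses / treated_size - control_responses / control_size


# Response counts are taken from the text; group sizes are ASSUMED for illustration.
treated_size = 1_000_000
control_size = 1_000_000
treated_responses = 100_000
control_responses = 50_000

campaign_uplift = uplift(treated_responses, treated_size,
                         control_responses, control_size)

# Incremental responses: treated responses minus what the control rate
# would have produced in a group the size of the treated group.
incremental_responses = treated_responses - control_responses * treated_size / control_size

print(f"uplift: {campaign_uplift:.1%}")
print(f"incremental responses: {incremental_responses:,.0f}")
```

When the two groups differ in size, the control responses must be rescaled as in the last calculation before subtracting; with equal group sizes it reduces to the simple difference quoted above.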
A traditional response model would be built using only the treated customers.
In contrast, uplift modeling uses both the treated and control customers to build a predictive model that focuses on
the incremental response. To understand this type of model, it helps to assume a fundamental segmentation
that separates customers into the following groups (their names were suggested by N. Radcliffe and explained in
[3]):
The Persuadables: customers who only respond to the marketing action because they were targeted
The Sure Things: customers who would have responded whether they were targeted or not
The Lost Causes: customers who will not respond irrespective of whether or not they are targeted
The Do Not Disturbs or Sleeping Dogs: customers who are less likely to respond because they were targeted
The only segment that provides true incremental responses is the Persuadables.
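One way to read this segmentation is in terms of two hypothetical outcomes per customer: whether they would respond if targeted and whether they would respond if not targeted. The sketch below encodes that reading. It is a deliberate simplification (it treats responses as deterministic, whereas the text describes Sleeping Dogs only as "less likely" to respond), and the function names are illustrative assumptions.

```python
def segment(responds_if_treated: bool, responds_if_untreated: bool) -> str:
    """Map a customer's two hypothetical outcomes to one of the four groups."""
    if responds_if_treated and not responds_if_untreated:
        return "Persuadable"
    if responds_if_treated and responds_if_untreated:
        return "Sure Thing"
    if not responds_if_treated and not responds_if_untreated:
        return "Lost Cause"
    return "Sleeping Dog"  # responds only when left alone


def individual_uplift(responds_if_treated: bool, responds_if_untreated: bool) -> int:
    """Incremental response caused by targeting: +1, 0 or -1."""
    return int(responds_if_treated) - int(responds_if_untreated)


for treated_outcome in (True, False):
    for untreated_outcome in (True, False):
        print(segment(treated_outcome, untreated_outcome),
              individual_uplift(treated_outcome, untreated_outcome))
```

Only one of the two outcomes is ever observed for a given customer, so segment membership cannot be read directly from the data. It has to be estimated by comparing treated and control customers.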
Uplift modelling provides a scoring technique that can separate customers into the groups described above.
Traditional response modelling often targets the Sure Things, being unable to distinguish them from the
Persuadables.
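The difference between the two kinds of score can be illustrated with a very simple estimator: within each customer group, subtract the control response rate from the treated response rate. This is only a minimal sketch, not any specific published method. The group names, the data and the helper names are all hypothetical.

```python
from collections import defaultdict


def group_rates(records):
    """Return {group: {treated_flag: response_rate}} from (group, treated, responded) rows."""
    tallies = defaultdict(lambda: {True: [0, 0], False: [0, 0]})  # [responses, size]
    for group, treated, responded in records:
        tally = tallies[group][treated]
        tally[0] += int(responded)
        tally[1] += 1
    return {
        group: {flag: resp / n for flag, (resp, n) in arms.items() if n > 0}
        for group, arms in tallies.items()
    }


def make_rows(group, treated, responders, size):
    """Hypothetical data helper: `responders` out of `size` customers respond."""
    return [(group, treated, i < responders) for i in range(size)]


# Hypothetical campaign data with a randomised control group per customer group.
records = (
    make_rows("new", True, 30, 100) + make_rows("new", False, 10, 100)
    + make_rows("loyal", True, 60, 100) + make_rows("loyal", False, 60, 100)
    + make_rows("lapsed", True, 5, 100) + make_rows("lapsed", False, 15, 100)
)

rates = group_rates(records)
# Response score: uses the treated customers only.
response_scores = {g: r[True] for g, r in rates.items() if True in r}
# Uplift score: uses both treated and control customers.
uplift_scores = {g: r[True] - r[False] for g, r in rates.items() if True in r and False in r}

response_ranking = sorted(response_scores, key=response_scores.get, reverse=True)
uplift_ranking = sorted(uplift_scores, key=uplift_scores.get, reverse=True)
print("ranked by response rate:", response_ranking)
print("ranked by uplift:", uplift_ranking)
```

In this made-up data the "loyal" group has the highest response rate but zero uplift, so it behaves like a group of Sure Things. A response score ranks it first, while the uplift score ranks the "new" group first and flags the "lapsed" group as having negative uplift.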
Return on investment
Because uplift modelling focuses only on incremental responses, it can provide very strong return on investment
cases when applied to traditional demand generation and retention activities. For example, by targeting only the
persuadable customers in an outbound marketing campaign, contact costs can be reduced and hence the return per
unit spend can be dramatically improved.
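A small worked example may make the effect on return per unit spend concrete. Every number below (segment sizes, value per response, cost per contact) is a hypothetical assumption, and the example idealises the segments as deterministic and perfectly identifiable, which no real model achieves.

```python
# Incremental responses per contacted customer in each segment (idealised).
SEGMENT_EFFECT = {"persuadable": 1, "sure_thing": 0, "lost_cause": 0, "sleeping_dog": -1}


def campaign_economics(contacted, value_per_response, cost_per_contact):
    """Incremental revenue, contact cost and return per unit spend for a contact plan.

    `contacted` maps segment name -> number of customers contacted.
    """
    incremental_responses = sum(SEGMENT_EFFECT[s] * n for s, n in contacted.items())
    incremental_revenue = incremental_responses * value_per_response
    contact_cost = sum(contacted.values()) * cost_per_contact
    return {
        "incremental_revenue": incremental_revenue,
        "contact_cost": contact_cost,
        "return_per_unit_spend": incremental_revenue / contact_cost,
    }


# HYPOTHETICAL customer base of 100,000 and campaign economics.
base = {"persuadable": 10_000, "sure_thing": 20_000,
        "lost_cause": 65_000, "sleeping_dog": 5_000}

everyone = campaign_economics(base, value_per_response=20, cost_per_contact=1)
persuadables_only = campaign_economics({"persuadable": base["persuadable"]},
                                       value_per_response=20, cost_per_contact=1)

print("contact everyone:         ", everyone)
print("contact persuadables only:", persuadables_only)
```

Under these assumptions, contacting everyone spends 100,000 to gain 100,000 of incremental revenue, while contacting only the persuadable customers spends 10,000 to gain 200,000. The gain comes both from lower contact costs and from no longer disturbing the Sleeping Dogs.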
Victor Lo also published on this topic in The True Lift Model (2002),[5] and later Radcliffe again with Using
Control Groups to Target on Predicted Lift: Building and Assessing Uplift Models (2007).[6]
Radcliffe also provides a very useful frequently asked questions (FAQ) section on his web site, Scientific
Marketer.[7] Lo (2008) provides a more general framework, from program design to predictive modeling to
optimization, along with future research areas.[8]
Uplift modelling has also been studied independently by Piotr Rzepakowski. Together with Szymon Jaroszewicz, he
adapted information theory to build multi-class uplift decision trees and published the paper in 2010.[9] In 2011
they extended the algorithm to the multiple-treatment case.[10]
Similar approaches have been explored in personalised medicine.[11][12] Szymon Jaroszewicz and Piotr
Rzepakowski (2014) designed uplift methodology for survival analysis and applied it to randomized controlled trial
analysis.[13] Yong (2015) combined a mathematical optimization algorithm via dynamic programming with
machine learning methods to optimally stratify patients.[14]
Uplift modelling is a special case of the older psychology concept of Differential Prediction.[15] In contrast to
differential prediction, uplift modelling assumes an active agent, and uses the uplift measure as an optimization
metric.
Uplift modeling has recently been extended to and incorporated into diverse machine learning approaches, such as
inductive logic programming,[15] Bayesian networks,[16] statistical relational learning,[12] support vector
machines,[17][18] survival analysis[13] and ensemble learning.[19]
Even though uplift modeling is widely applied in marketing practice (along with political elections), it has rarely
appeared in marketing literature. Kane, Lo and Zheng (2014) published a thorough analysis of three data sets using
multiple methods in a marketing journal and provided evidence that a newer approach (known as the Four Quadrant
Method) worked quite well in practice.[20] Lo and Pachamanova (2015) extended uplift modeling to prescriptive
analytics for multiple treatment situations and proposed algorithms to solve large deterministic optimization
problems and complex stochastic optimization problems where estimates are not exact.[21]
Recent research analyses the performance of various state-of-the-art uplift models in benchmark studies using large
amounts of data.[22][1]
A detailed description of uplift modeling, its history, the way uplift models are built, the differences from classical
model building, uplift-specific evaluation techniques, a comparison of various software solutions and an
explanation of different economic scenarios can be found in the literature.[23]
Implementations
In Python
CausalML (https://github.com/uber/causalml)
DoubleML (https://docs.doubleml.org/stable/index.html)
EconML (https://github.com/microsoft/EconML)
UpliftML (https://github.com/bookingcom/upliftml)
PyLift (https://github.com/wayfair/pylift)
scikit-uplift (https://github.com/maks-sh/scikit-uplift)
In R
DoubleML (https://docs.doubleml.org/stable/index.html)
uplift package (https://cran.r-project.org/web/packages/uplift/index.html)
Other languages
JMP by SAS
Portrait Uplift by Pitney Bowes
Uplift node for KNIME by Dymatrix
Uplift Modelling in Miró (http://www.stochasticsolutions.com/miro/) by Stochastic Solutions (http://www.stochasticsolutions.com/)
Datasets
Hillstrom Email Marketing dataset (https://blog.minethatdata.com/2008/05/best-answer-e-mail-analytics-challenge.html)
Criteo Uplift Prediction dataset (http://ailab.criteo.com/criteo-uplift-prediction-dataset/)
Lenta Uplift Modeling Dataset (https://www.uplift-modeling.com/en/latest/api/datasets/fetch_lenta.html#lenta-uplift-modeling-dataset)
X5 RetailHero Uplift Modeling Dataset (https://www.uplift-modeling.com/en/latest/api/datasets/fetch_x5.html#x5-retailhero-uplift-modeling-dataset)
MegaFon Uplift Competition Dataset (https://www.uplift-modeling.com/en/latest/api/datasets/fetch_megafon.html#megafon-uplift-competition-dataset)
See also
Lift (data mining)
External links
Abby Johnson explains how it works in this video broadcast (http://videos.smallbusinessnewz.com/2011/01/05/how-uplift-modeling-boosts-marketing-efforts/)
Introductory white paper with full references (http://www.predictiveanalyticsworld.com/signup-uplift-whitepaper.php)
Eric Siegel: Uplift Modeling (http://www.predictiveanalyticsworld.com/pdf/YTW03080USEN/Uplift-Modeling-Optimizes-Marketing-Decisions-White-Paper.pdf)
User guide for uplift modelling on uplift-modeling.com (https://www.uplift-modeling.com/en/latest/user_guide/index.html)