Beyond Prediction: Using Big Data for Policy Problems
Susan Athey

Machine-learning prediction methods have been extremely productive in applications
ranging from medicine to allocating fire and health inspectors in cities. However, there are
a number of gaps between making a prediction and making a decision, and underlying
assumptions need to be understood in order to optimize data-driven decision-making.

A recent explosion of analysis in science, industry, and government seeks to use
“big data” for a variety of problems. Increasingly, big-data applications make use
of the toolbox from supervised machine learning (SML), in which software programs

few assumptions are required for off-the-shelf prediction techniques to work: The environment
must be stable, and the units whose behavior is being studied should not interact or
“interfere” with one another. In many applications, SML techniques can be successfully
applied by data scien-

ment would die within a year from other causes, and identify patients who are at particularly
high risk and should not receive joint replacement surgery. They argue that “benefits accrue
over time, so surgery only makes sense if someone lives long enough to enjoy them; joint
replacement for someone who dies soon afterward is futile—a waste of money and an unnecessary
painful imposition on the last few months of life” (p. 493). In this class of problems, the
rationale for focusing on prediction is clear; the average effect of an intervention is known
to be negative in certain states of the world (if the patient will die soon), so that
predicting the state of the world is sufficient for the decision to forgo the surgery.
However, the authors highlight the
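The decision logic in the joint replacement example can be sketched as a simple threshold rule. The sketch below is illustrative only: the `Patient` record, the risk values, and the 0.5 threshold are hypothetical and do not come from the study. It assumes that some predictive model has already produced a one-year mortality risk for each patient.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    predicted_mortality_risk: float  # hypothetical model output in [0, 1]


def should_forgo_surgery(patient: Patient, risk_threshold: float = 0.5) -> bool:
    """Return True when the predicted risk meets or exceeds the threshold.

    The threshold is a hypothetical policy parameter, not a clinical guideline.
    """
    return patient.predicted_mortality_risk >= risk_threshold


# Hypothetical patients with made-up predicted risks.
patients = [
    Patient("A", 0.05),
    Patient("B", 0.62),
    Patient("C", 0.31),
]

# Patients for whom the rule recommends against surgery.
forgo = [p.patient_id for p in patients if should_forgo_surgery(p)]
print(forgo)
```

A rule of this kind is justified only because the sign of the intervention's effect in the high-risk state is already known; where that effect is itself uncertain, a prediction of risk alone would not settle the decision.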
(such as outreach by salespeople) to those customers at highest risk of churn. Ascarza (12)
documented firms following this type of practice, and then used methods from the causal
inference literature to provide empirical evidence that allocating resources according to a
simplistic predictive model is not optimal. The overlap between the group with highest risk
of churning and the group who would respond most to interventions was only 50%. Thus,
treating the problem of retaining customers as if it were a prediction problem yielded lower
payoffs to the firm.

A public-sector resource allocation problem is the question of how a city should allocate
building inspectors optimally to minimize safety or health violations. New York City’s
Firecast algorithm allocates fire inspectors according to the predicted probability of a
violation being detected upon inspection, and Glaeser et al. (6) developed

provement in overall quality of units (e.g., food poisoning rates) in the city under a new
inspector allocation regime?

Thus, prediction and causal inference are distinct (though closely related) problems.
Outside of randomized experiments, causal inference is only possible when the analyst makes
assumptions beyond those required for prediction methods, assumptions that typically are not
directly testable and thus require domain expertise to verify. A large literature in causal
inference that spans multiple disciplines (social science, computer science, medicine,
statistics, epidemiology, and engineering) has emerged to analyze this type of problem [see
Imbens and Rubin (13) for a review]. One approach to estimating causal effects using data
that were not generated from a randomized experiment is to adjust for factors that led to
differential inspection probabilities in the past, and then to estimate the effect of
inspection on restaurant-specific health outcomes (perhaps using audits). Recent
methodological advances focus on adjusting for observed confounders in big-data applications
[e.g., (14–16)]. A theme in this literature is that off-the-shelf prediction methods from
SML lead to biased

advertising, but it did not attempt to separate correlation from causality. Rather, eBay
measured advertising effectiveness with a simple predictive model in which clicks were used
to predict sales, finding that the return on investment for advertising clicks (that is, the
ratio of eBay sales attributed to clicks to the cost of the advertising clicks) was about
1400%. Using the experimental data to measure the causal effect of the advertisements, the
authors found that the true return on investment was –63%. Part of the gap between the naïve
analysis and the results from the experiment arose because many people who clicked on eBay
search advertisements would have purchased items from eBay, anyway. Although a click on an
eBay ad was a strong predictor of a sale (consumers typically purchased right after
clicking), the experiment revealed that a click did not have nearly as large a causal effect,
because the consumers who clicked were likely to purchase, anyway.

Beyond resource allocation problems, the distinction between pure prediction and causal
inference has been the subject of decades of methodological and empirical research in many
disciplines. Economics has placed particular focus on this distinction, perhaps because some
of the most fundamental economic questions, such as how con-

to answer this question, perhaps exploiting “natural experiments” in the data or an approach
known as “instrumental variables” [see (13) for a review of these techniques]. Recently,
several authors have combined advances from SML with this traditionally “small data” set of
methods, both for estimating average causal effects (18) and for personalized estimates of
causal effects (19).

Beyond the distinction between prediction and causal inference, methods optimized solely
for prediction also do not account for other factors that may be important in data-driven
policy analysis or resource allocation. For example, incentives and manipulability can be
important. If a building or restaurant owner anticipates a low probability of being inspected
based on these characteristics, he or she may reduce efforts for safety.

In an example of data-driven policy where manipulability played a role, the market pricing
system (MPS) of British Columbia is used to set prices for harvest of timber from
government-owned land that has been allocated to timber companies under long-term leases. The
MPS builds a predictive model using data from timber sold at auctions to predict the prices
that would have been obtained if a tract harvested under a long-term lease had instead been
sold via auction. However, a lease-holder could potentially have an incentive to bid
artificially low in auctions in order to influence the pre-

select among job applicants for interviews; but they might wish to incorporate diversity
objectives in the algorithm, or at least prevent inequities by gender or race. These issues
have received recent attention in the literature on SML [e.g., (21)].

Overall, for big data to achieve its full potential in business, science, and policy,
multidisciplinary approaches are needed that build on new computational algorithms from the
SML literature, but also that bring in the methods and practical learning from decades of
multidisciplinary research using empirical evidence to inform policy. A nascent but rapidly
growing body of research takes this approach: For example, the International Conference on
Machine Learning (ICML) in 2016 held separate workshops on causal inference,
interpretability, and reliability of SML methods, while multidisciplinary research teams at
Google (22), Facebook (23), and Microsoft (24) have made available

7. E. L. Glaeser, S. D. Kominers, M. Luca, N. Naik, Big data and big cities: The promises and limitations of improved measures of urban life (Technical Report, National Bureau of Economic Research, 2015).
8. J. Grimmer, B. M. Stewart, Polit. Anal. 21, 267–297 (2013).
9. J. S. Kang, P. Kuznetsova, M. Luca, Y. Choi, in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (Association for Computational Linguistics, 2013), pp. 1443–1448.
10. M. Bayati et al., PLOS ONE 9, e109264 (2014).
11. J. Kleinberg, J. Ludwig, S. Mullainathan, Z. Obermeyer, Am. Econ. Rev. 105, 491–495 (2015).
12. E. Ascarza, Retention futility: Targeting high risk customers might be ineffective (2016); available at SSRN.
13. G. W. Imbens, D. B. Rubin, Causal Inference in Statistics, Social, and Biomedical Sciences (Cambridge Univ. Press, 2015).
14. M. Dudík, J. Langford, L. Li, in Proceedings of the 28th International Conference on Machine Learning (ICML, 2011), pp. 1097–1104.
15. A. Belloni, V. Chernozhukov, C. Hansen, Rev. Econ. Stud. 81, 608–650 (2014).
16. S. Athey, G. Imbens, S. Wager, Approximate residual balancing: De-biased inference of average treatment effects in high dimensions; https://arxiv.org/abs/1604.07125 (2016).
17. T. Blake, C. Nosko, S. Tadelis, Econometrica 83, 155–174 (2015).
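As a worked illustration of the eBay advertising example discussed in the text, the gap between a predictive and a causal measure of return on investment can be sketched with simple arithmetic. This is a sketch under stated assumptions, not the method of the study: ROI is assumed here to be computed net of cost, as (revenue − cost) / cost, which may differ from the exact definition the authors used, and every dollar figure is hypothetical, chosen only so that the results match the percentages quoted in the text.

```python
def net_roi(attributed_revenue: float, ad_cost: float) -> float:
    """Net return on investment as a fraction: (revenue - cost) / cost."""
    if ad_cost <= 0:
        raise ValueError("ad_cost must be positive")
    return (attributed_revenue - ad_cost) / ad_cost


# Hypothetical figures, NOT taken from the eBay study.
AD_COST = 100.0
SALES_FOLLOWING_CLICKS = 1500.0  # naive view: every post-click sale is credited to ads
SALES_WITH_ADS = 10_037.0        # experiment: sales in the group shown ads
SALES_WITHOUT_ADS = 10_000.0     # experiment: sales in a comparable group shown no ads

# Predictive (correlational) attribution: clicks predict sales.
naive_roi = net_roi(SALES_FOLLOWING_CLICKS, AD_COST)

# Causal attribution: only the sales that would not have happened without ads.
incremental_sales = SALES_WITH_ADS - SALES_WITHOUT_ADS
causal_roi = net_roi(incremental_sales, AD_COST)

print(f"naive ROI:  {naive_roi:+.0%}")
print(f"causal ROI: {causal_roi:+.0%}")
```

With these made-up inputs the script prints +1400% for the naive estimate and -63% for the experimental one. The difference arises entirely from which sales are credited to the advertisements: all sales that follow a click, or only the increment over a comparable group that saw no ads.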
Science (print ISSN 0036-8075; online ISSN 1095-9203) is published by the American Association for the Advancement of
Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science is a registered trademark of AAAS.
Copyright © 2017, American Association for the Advancement of Science