Communications201611-Dl - Sex As An Algorithm
COMMUNICATIONS OF THE ACM
CACM.ACM.ORG 11/2016 VOL.59 NO.11
SEX
AS AN ALGORITHM
THE THEORY OF EVOLUTION
UNDER THE LENS OF COMPUTATION
Association for
Computing Machinery
37 Viewpoint
Technology and Academic Lives
Considering the need to create
new modes of interaction and
approaches to assessment given
a rapidly evolving academic realm.
By Jonathan Grudin
Association for Computing Machinery
Advancing Computing as a Science & Profession
Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.
ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO: Bobby Schnabel
Deputy Executive Director and COO: Patricia Ryan
Director, Office of Information Systems: Wayne Graves
Director, Office of Financial Services: Darren Ramdin
Director, Office of SIG Services: Donna Cappo
Director, Office of Publications: Bernard Rous
Director, Office of Group Publishing: Scott E. Delman

ACM COUNCIL
President: Vicki L. Hanson
Vice-President: Cherri M. Pancake
Secretary/Treasurer: Elizabeth Churchill
Past President: Alexander L. Wolf
Chair, SGB Board: Patrick Madden
Co-Chairs, Publications Board: Jack Davidson and Joseph Konstan
Members-at-Large: Gabriele Anderst-Kotis; Susan Dumais; Elizabeth D. Mynatt; Pamela Samuelson; Eugene H. Spafford
SGB Council Representatives: Paul Beame; Jenna Neefe Matthews; Barbara Boucher Owens

BOARD CHAIRS
Education Board: Mehran Sahami and Jane Chu Prey
Practitioners Board: George Neville-Neil

REGIONAL COUNCIL CHAIRS
ACM Europe Council: Dame Professor Wendy Hall
ACM India Council: Srinivas Padmanabhuni
ACM China Council: Jiaguang Sun

PUBLICATIONS BOARD
Co-Chairs: Jack Davidson; Joseph Konstan
Board Members: Ronald F. Boisvert; Karin K. Breitman; Terry J. Coatta; Anne Condon; Nikil Dutt; Roch Guerrin; Carol Hutchins; Yannis Ioannidis; Catherine McGeoch; M. Tamer Ozsu; Mary Lou Soffa; Alex Wade; Keith Webster

ACM U.S. Public Policy Office
Renee Dopplick, Director
1828 L Street, N.W., Suite 800
Washington, DC 20036 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Mark R. Nelson, Executive Director

STAFF
Director of Group Publishing: Scott E. Delman, [email protected]
Executive Editor: Diane Crawford
Managing Editor: Thomas E. Lambert
Senior Editor: Andrew Rosenbloom
Senior Editor/News: Larry Fisher
Web Editor: David Roman
Rights and Permissions: Deborah Cotton
Art Director: Andrij Borys
Associate Art Director: Margaret Gray
Assistant Art Director: Mia Angelica Balaquiot
Designer: Iwona Usakiewicz
Production Manager: Lynn D’Addesio
Advertising Sales: Juliet Chance
Columnists: David Anderson; Phillip G. Armour; Michael Cusumano; Peter J. Denning; Mark Guzdial; Thomas Haigh; Leah Hoffmann; Mari Sako; Pamela Samuelson; Marshall Van Alstyne

CONTACT POINTS
Copyright permission: [email protected]
Calendar items: [email protected]
Change of address: [email protected]
Letters to the Editor: [email protected]

WEBSITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/

ACM ADVERTISING DEPARTMENT
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 626-0686
F (212) 869-0481
Advertising Sales: Juliet Chance, [email protected]
For display, corporate/brand advertising:
Craig Pitcher, [email protected], T (408) 778-0300
William Sleight, [email protected], T (408) 513-3408
Media Kit: [email protected]

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

EDITORIAL BOARD
Editor-in-Chief: Moshe Y. Vardi, [email protected]

NEWS
Co-Chairs: William Pulleyblank and Marc Snir
Board Members: Mei Kobayashi; Michael Mitzenmacher; Rajeev Rastogi

VIEWPOINTS
Co-Chairs: Tim Finin; Susanne E. Hambrusch; John Leslie King
Board Members: William Aspray; Stefan Bechtold; Michael L. Best; Judith Bishop; Stuart I. Feldman; Peter Freeman; Mark Guzdial; Rachelle Hollander; Richard Ladner; Carl Landwehr; Carlos Jose Pereira de Lucena; Beng Chin Ooi; Loren Terveen; Marshall Van Alstyne; Jeannette Wing

PRACTICE
Co-Chair: Stephen Bourne
Board Members: Eric Allman; Peter Bailis; Terry Coatta; Stuart Feldman; Benjamin Fried; Pat Hanrahan; Tom Killalea; Tom Limoncelli; Kate Matsudaira; Marshall Kirk McKusick; George Neville-Neil; Theo Schlossnagle; Jim Waldo
The Practice section of the CACM Editorial Board also serves as the Editorial Board of .

CONTRIBUTED ARTICLES
Co-Chairs: Andrew Chien and James Larus
Board Members: William Aiello; Robert Austin; Elisa Bertino; Gilles Brassard; Kim Bruce; Alan Bundy; Peter Buneman; Peter Druschel; Carlo Ghezzi; Carl Gutwin; Yannis Ioannidis; Gal A. Kaminka; James Larus; Igor Markov; Gail C. Murphy; Bernhard Nebel; Lionel M. Ni; Kenton O’Hara; Sriram Rajamani; Marie-Christine Rousset; Avi Rubin; Krishan Sabnani; Ron Shamir; Yoav Shoham; Larry Snyder; Michael Vitale; Wolfgang Wahlster; Hannes Werthner; Reinhard Wilhelm

RESEARCH HIGHLIGHTS
Co-Chairs: Azer Bestovros and Gregory Morrisett
Board Members: Martin Abadi; Amr El Abbadi; Sanjeev Arora; Nina Balcan; Dan Boneh; Andrei Broder; Doug Burger; Stuart K. Card; Jeff Chase; Jon Crowcroft; Sandhya Dwaekadas; Alexei Efros; Alon Halevy; Norm Jouppi; Andrew B. Kahng; Sven Koenig; Xavier Leroy; Steve Marschner; Kobbi Nissim; Guy Steele, Jr.; David Wagner; Margaret H. Wright; Nicholai Zeldovich; Andreas Zeller

WEB
Chair: James Landay
Board Members: Marti Hearst; Jason I. Hong;

ACM Copyright Notice
Copyright © 2016 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
An annual subscription cost is included in ACM member dues of $99 ($40 of which is allocated to a subscription to Communications); for students, cost is included in $42 dues ($20 of which is allocated to a Communications subscription). A nonmember annual subscription is $269.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current advertising rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0686.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact [email protected].

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to
Communications of the ACM
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

Printed in the U.S.A.
Globalization, Computing,
and Their Political Impact
Perched as we are on the crest of the current tech and computing-enrollment boom, it is hard to remember the dark days of the early 2000s. The NASDAQ Index peaked on March 10, 2000, declining almost 80% over the next two years. The stock-market crash in the U.S. caused the loss of $5 trillion in market valuations from 2000 to 2002. Computing enrollments in North America went into a steep dive. At the same time, the Internet enabled the globalization of software production, giving rise to the phenomenon of offshore outsourcing. There were daily stories in the media describing major shifts in employment that were occurring largely as a result of software offshoring. Combined with the dot-com bust, these reports raised concerns about the future of information technology (IT) as a viable field of study and work in developed countries.

In response to these concerns, ACM Council commissioned a Task Force in 2004 to “look at the facts behind the rapid globalization of IT and the migration of jobs resulting from outsourcing and offshoring.” The Task Force, co-chaired by Frank Mayadas and myself, with the assistance of William Aspray as Editor, issued its report Globalization and Offshoring of Software (http://www.acm.org/globalizationreport) in 2006. The report concluded that there is no real reason to believe that IT jobs are migrating away from developed countries. The passing decade has vindicated that conclusion.

But while the report conceded that “trade gains may be distributed differentially,” meaning some individuals gain and some lose, some localities gain and some lose, it was focused narrowly on the IT industry. Had we looked at the broader impact of globalization on the economy, we might have reached somewhat less sanguine conclusions. Globalization exerted tremendous competitive pressure on manufacturing in developed countries. It is instructive to examine the response to this competitive pressure, taking U.S. manufacturing as an example. To survive in the intensely competitive global economy, U.S. manufacturing had to increase its productivity dramatically, substituting technology for labor. U.S. manufacturing productivity roughly doubled between 1995 and 2015. As a result, while U.S. manufacturing output today is essentially at an all-time high, employment peaked around 1980, and has been declining precipitously since 1995. Neoclassical economists argue that when technology destroys jobs “people find other jobs, albeit possibly after a long period of painful adjustment.” They are definitely right about the painful adjustment! The impact of globalization and automation over the past 20 years on working- and middle-class Americans has been quite harsh.

This impact has been succinctly captured recently by economist Branko Milanovic’s “Elephant Curve” (see http://prospect.org/article/worlds-inequality), which shows how people around the world, ranked by their income in 1998, saw their incomes increase by 2008. While the incomes of the very poor were stagnant, rising incomes in emerging economies lifted hundreds of millions of people out of poverty. People at the top of the income scale also benefited from globalization and automation. But the income of working- and middle-class people in the developed world stagnated over that period. In the U.S., for example, income of production workers today, adjusted for inflation, is at the level it was in 1968! Ironically, automation is now reaching developing-world economies. Manufacturing employment in China, where rising wages are driving automation, peaked around 1995, and a recent report from the International Labor Organization found that more than two-thirds of Southeast Asia’s 9.2 million textile and footwear jobs are threatened by automation.

Until very recently, the global professional class, which includes most computing professionals, was somewhat oblivious to the plight of working- and middle-class people in developed countries. In fact, some have argued that this class lives in “an economic and cultural bubble.” Political developments of the past few months have made it clear that the issue of shared prosperity cannot be ignored. It is now evident that the Brexit vote in the U.K., as well as the temporary rise of Bernie Sanders and the rise of Donald Trump in the U.S., were driven to a large extent by economic grievances. Whatever the outcome of this month’s U.S. presidential election, globalization and automation will remain policy issues of the utmost priority and will resist simplistic solutions.

Globalization and automation provide huge benefits to society, but their adverse effects cannot and should not be ignored. Technology is not destiny and public policy has a key role to play. As actors in and beneficiaries of this societal transformation, we have, I believe, a social responsibility that goes beyond our technical roles.

Follow me on Facebook, Google+, and Twitter.

Moshe Y. Vardi, EDITOR-IN-CHIEF
Copyright held by author.
cerf’s up
Heidelberg Anew
I have just returned from the fourth annual Heidelberg Laureate Forum and I want to emphasize how very important it has been for ACM Turing laureates to participate in the program. Each year 200 math and computer science undergraduates participate in the program, approximately 100 from each discipline. Speeches by laureates are mixed with undergraduate workshops and plenary open sessions. There is ample opportunity for interaction among students and laureates and between students.

This year, Brian Schmidt gave the Lindau lecture (from the annual Nobel Prize winners meeting). Schmidt discovered that the universe is not only expanding but that the expansion is accelerating. It would be difficult to imagine a more profound discovery. In the very long term, it appears the universe will expand to the point that only a certain amount of local gravity will hold a galaxy or small group of galaxies together. The rest will accelerate away and the universe will end in a cold whimper.

Fortunately, there was nothing but stimulating conversation at this year’s Heidelberg Forum. More than ever we are seeing how mathematics and computer science are interacting, especially with the arrival of neural networks and quantum computers that have capabilities quite different from the conventional von Neumann designs that have dominated computing for over seven decades. Fundamental questions about what is computable, illuminated by Gödel, are getting attention in the light of these new computing engines.

Leslie Lamport delivered another extraordinary lecture reinforcing the [...] programming. The value of abstraction to aid in reasoning about expected program function resonated very strongly with me and, I think, with others in attendance.

As always, the mathematics and computer science students were full of energy, ideas, and eagerness to interact with each other and with the laureates present. The organizers worked hard to maximize student opportunities to meet with laureates, including a number of workshops where some in-depth discussion could be supported. Some of the laureates voiced a strong recommendation that every effort should be made to allow rich interaction between students of the two disciplines.

Looking at the available laureate attendee lists, I can’t help but imagine that future Heidelberg events would benefit from a cohort of additional younger laureates, so I look forward to the possibility that other ACM awardees might be invited to attend the annual event.

Looking back on the Lindau event that I attended in late June, at which Nobel Prize winners mingle with students, I was struck by the increasingly important role of computing in discovery science. Simulations of physical phenomena are revealing new insights into the nature of our universe. One of the dramatic examples I have seen shows an evolving universe from the big bang that takes into account dark matter and dark energy and produces a simulated universe with many of the large-scale structures we actually see in the observable universe. We see huge reticular structures emerging that are largely the product of masses of dark matter that organize ordinary matter into a lacework of stars and gas. That these predictions can be tested through observation reinforces the importance of computing in our exploration of the natural world.

We are reaching an exciting period in scientific discovery in which computation is as important as laboratory experiment and observation. We can invent our own universes and test them for compatibility with the real one we can measure. Indeed, we may find that our predictions could draw our attention to phenomena we might never have looked for, were it not for the revelation of computation.

Vinton G. Cerf is vice president and Chief Internet Evangelist at Google. He served as ACM president from 2012 to 2014.
DOI:10.1145/3002205
No one likes being reduced to a number. For example, there is much more to my financial picture than my credit score alone. There is even scholarly work on weaknesses in the system to compute this score. Everyone may agree the number is far from perfect, yet it is used to make decisions that matter to me, as Moshe Y. Vardi discussed in his Editor’s Letter “Academic Rankings Considered Harmful!” (Sept. 2016). So I care what my credit score is. Many of us may even have made financial decisions taking into account their potential impact on credit score.

As an academic, I also produce such numbers. I assign grades to my students. I strive to have the assigned grade accurately reflect a student’s grasp of the material in my course. But I know this is imperfect. At best, the grade reflects the student’s knowledge today. When a prospective employer looks at it two years later, it is possible an A student had crammed for the exam and has since completely forgotten the material, while a B student deepened his or her understanding substantially through a subsequent internship. The employer must learn to get past the grade to develop a richer understanding of the student’s strengths and weaknesses.

As an academic, I am also a consumer of these numbers. Most universities, including mine, look at standardized test scores. No one suggests they predict success perfectly. But there is at least some correlation, enough that they are used, often as an initial filter. Surely there are students who could have done very well if admitted but were not considered seriously because they did not make the initial cutoff in test scores. A small handful of U.S. colleges and universities have recently stopped considering standardized test scores for undergraduate admission. I admire their courage. Most others have not followed suit because it takes a tremendous amount of work to get behind the numbers. Even if better decisions might result, the process simply requires too much effort.

As an academic, I appreciate the rich diversity of attributes that characterize my department, as well as peer departments at other universities. I know how unreasonable it is to reduce it all to a single number. But I also know there are prospective students, as well as their parents and others, who find a number useful. I encourage them to consider an array of factors when I am trying to recruit them to choose Michigan. But I cannot reasonably ask them not to look at the number. So it behooves me to do what I can to make it as good as it can be, and to work toward a system that produces numbers that are as fair as they can be. I agree it is not possible to come anywhere close to perfection, but the less bad we can make the numbers, the better off we all will be.
H.V. Jagadish, Ann Arbor, MI

Author Responds:
My Editor’s Letter did not question the need for quantitative evaluation of academic programs. I presume, however, that Dr. Jagadish assigns grades to his students rather than merely ranking them. These students then graduate with a transcript, which reports all their grades, rather than just their class rank. He argues that we should learn to live with numbers (I agree) but does not address any of the weaknesses of academic rankings.
Moshe Y. Vardi, Editor-in-Chief

More Negative Consequences of Academic Rankings
I could not agree more with Moshe Y. Vardi’s Editor’s Letter (Sept. 2016). The ranking systems, whether U.S.-focused (such as U.S. News and World Report) or global (such as Times Higher Education, World University Reputation Ranking, QS University Ranking, and Academic Ranking of World Universities, compiled by Shanghai Jiaotong University in Shanghai, China), have all acquired lives of their own in recent years. These rankings have attracted the attention of governments and funding bodies and are widely reported in the media. Many universities worldwide have reacted by establishing staff units to provide the diverse data requested by the ranking agencies and boosting their communications and public relations activities. There is also evidence that these league tables are beginning to (adversely) influence resource-allocation and hiring decisions despite their glaring inadequacies and limitations.

I have been asked to serve on the panels of two of the ranking systems but have had to abandon my attempts to complete the questionnaires because I just did not have sufficient information to provide honest responses to the kinds of difficult, comparative questions about such a large number of universities. The agencies seldom report how many “experts” they actually surveyed or their survey-response rates. As regards the relatively “objective” ARWU ranking, it uses measures like number of alumni and staff winning Nobel Prizes and Fields Medals, number of highly cited researchers selected by Thomson Reuters, number of articles published in journals of Nature and Science, number of articles indexed in Science and Social Science Citation Index, and “per capita performance” of a university. It is not at all clear to what extent the six narrowly focused indicators can capture the overall performance of modern universities, which tend to be large, complex, loosely coupled organizations. As well, the use of measures like number of highly cited researchers named by Thomson Reuters/ISI can exacerbate some of the known citation malpractices (such as excessive self-citations, citation rings, and journal-citation stacking). As Vardi noted, the critical role of commercial entities in the rankings (notably Times, QS, USNWR, and Thomson Reuters) is also a concern.
Joseph G. Davis, Sydney, Australia

Acknowledge Crowdworkers in Crowdwork Research
Crowdwork promises to help integrate human and computational processes while also providing a source of paid work for those who might otherwise be excluded from the global economy.
Daniel W. Barowy et al.’s Research Highlight “AutoMan: A Platform for Integrating Human-Based and Digital Computation” (June 2016) explored a programming language called AutoMan designed to integrate human workers recruited through crowdwork markets like Amazon Mechanical Turk alongside conventional computing resources. The language breaks new ground in how to automate the complicated work of scheduling, pricing, and managing crowdwork.

While the attempt to automate this managerial responsibility is clearly of value, we were dismayed by the authors’ lack of concern for those who carry out the actual work of crowdwork. Humans and computers are not interchangeable. Minimizing wages is quite different from minimizing execution time. For example, the AutoMan language is designed to minimize crowdwork requesters’ costs by iteratively running rounds of recruitment, with tasks offered at increasing wages. However, such optimization looks quite different from the perspective of the workers than from that of the requesters. The process is clearly not optimized for economic fairness. Systems that minimize payments could exert negative economic force on crowdworker wages, failing to account for the complexities of, say, Mechanical Turk as a global labor market.

Recent research published in the proceedings of Computer-Human Interaction and Computer-Supported Cooperative Work conferences by Lilly Irani, David Martin, Jacki O’Neill, Mary L. Gray, Aniket Kittur, and others shows how crowdworkers are not interchangeable cogs in a machine but real humans, many dependent on crowdwork to make ends meet. Designing for workers as active, intelligent partners in the functioning of crowdwork systems has great potential. Two examples where researchers have collaborated with crowdworkers are the Turkopticon system, as introduced by Irani and Silberman [1], which allows crowdworkers to review crowdwork requesters, and Dynamo, as presented by Salehi et al. [2], which supports discussion and [...] We hope future coverage of crowdwork in Communications will include research incorporating the perspective of workers in the design of such systems. This will help counteract the risk of creating programming languages that could actively, even if unintentionally, accentuate inequality and poverty. At a time when technology increasingly influences political debate, social responsibility is more than ever an invaluable aspect of computer science.

References
1. Irani, L.C. and Silberman, M.S. Turkopticon: Interrupting worker invisibility in Amazon Mechanical Turk. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Paris, France, Apr. 27–May 2). ACM Press, New York, 2013, 611–620.
2. Salehi, N., Irani, L.C., Bernstein, M.S. et al. We are Dynamo: Overcoming stalling and friction in collective action for crowd workers. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (Seoul, Republic of Korea, Apr. 18–23). ACM Press, New York, 2015, 1621–1630.

Barry Brown and Airi Lampinen, Stockholm, Sweden

Authors Respond:
We share the concerns Brown and Lampinen raise about crowdworker rights. In fact, AutoMan, by design, automatically addresses four of the five issues raised by workers, as described by Irani and Silberman in the letter’s Reference 1: AutoMan never arbitrarily rejects work; pays workers as soon as the work is completed; pays workers the U.S. minimum wage by default; and automatically raises pay for tasks until enough workers agree to take them. Our experience reflects how much workers appreciate AutoMan, consistently rating AutoMan-generated tasks highly on Turkopticon, the requester-reputation site.
Daniel W. Barowy, Charles Curtsinger, Emery D. Berger, and Andrew McGregor, Amherst, MA

Computational Biology Is Parallel
Bonnie Berger et al.’s article “Computational Biology in the 21st Century: Scaling with Compressive Algorithms” (Aug. 2016) described how modern biology and medical research benefit from intensive use of computing. Microbiology [...] the human genome at the start of the third millennium. Berger et al. pointed out that the biologist’s growth exponent is greater even than Moore’s Law. This growth has led to an increasing fraction of medical research funding being directed to data-rich ‘omics. But the difference between what Moore’s Law makes affordable and a greater medical exponent is itself in the long term also exponential. Berger et al. proposed smarter algorithms to plug the gap.

The article described bioinformatics’ ready adoption of cloud computing, but the true parallel nature of much of e-biology went unstated; for example, Illumina next-generation sequencers can generate more than one billion short DNA strings, each of which can be processed independently in parallel. Top-end graphics hardware (GPUs) already contains several thousand processing cores and delivers considerably more raw processing power than even multi-core CPUs. So it is no wonder that bioinformatics has turned to GPUs.

Berger et al. did mention BWA’s implementation of the Burrows-Wheeler compression transform. BarraCUDA is an established port of BWA to Nvidia’s hardware that was optimized for modern GPUs; results were presented at the 2015 ACM Genetic and Evolutionary Computation Conference [1]. Also, Nvidia keeps a list of the many bioinformatics applications and tools that run on its parallel hardware.

After CPU clocks maxed out at 3GHz more than 10 years ago, Moore’s Law pushed 21st century computing to be parallel. GPUs and GPU-style many-core hardware are today at the center of the leading general-purpose parallel computing architectures. Much of microbiology data processing is inherently parallel. Computational biology and GPUs are a good match and set to continue to grow together.

Reference
1. Langdon, W.B. et al. Improving CUDA DNA analysis software with genetic programming. In Proceedings of the 2015 ACM Genetic and Evolutionary Computation Conference (Madrid, Spain, July 11–15). ACM Press, New York, 2015, 1063–1070.

W.B. Langdon, London, U.K.
collective action among crowdworkers. has become data rich; for example, the
Both projects demonstrate how crowd- volume of sequence data (such as strings Communications welcomes your opinion. To submit a Letter
workers can be treated as active part- of DNA and RNA bases and protein se- to the Editor, please limit yourself to 500 words or less,
and send to [email protected].
ners in improving the various crowd- quences) has grown exponentially, par-
work marketplaces. ticularly since the initial sequencing of © 2016 ACM 0001-0782/16/11 $15.00
DOI:10.1145/2994590 http://cacm.acm.org/blogs/blog-cacm
Mark Guzdial
14 Years of a Learner-Centered Python IDE
http://bit.ly/2aNXAnC
August 10, 2016
I recently discovered that the latest version of our Python IDE for learners, JES (Jython Environment for Students), passed 10,000 downloads (see the count at http://bit.ly/2cyt1l2); 10,562 when I started writing this post. We created JES in 2002 in support of our Media Computation course at Georgia Tech, which is a required course for students in our colleges of Liberal Arts, Business, and Design. Schools that adopted our Python Media Computation textbook have also mostly adopted JES, or used options that support the same libraries, such as Pythy or Pyjama. The current download count is probably an underestimate of users, since some schools download JES only once and then distribute it across campus.
Most student Python programmers use IDLE, the integrated development environment (IDE) provided with Python. While Python is more learner-centered than many languages used by professionals (see “Five Principles for Programming Languages for Learners,” http://bit.ly/2chERzG), the IDE should also be tailored for learners. Learners have different goals and needs than professional programmers. They need programming environments that take a learner-centered design approach (http://bit.ly/2cEcQnz), like JES or DrPython.
JES is likely one of the most-used pedagogical programming environments for Python, and the fact that it is still frequently used at the ripe old age of 14 suggests that it has been pretty successful. It is probably worth considering why it has worked. As lead on the team that built it and maintained it for the last 14 years, I am not a good judge for why it has worked.
Instead, I offer four brief stories about JES’s development and maintenance over the last 14 years.
Keep It Simple: From DrScheme and DrJava, we took the principle to keep it simple. We did not want an interface with many options for many kinds of uses. We wanted an interface that worked very well for learners.
In JES, you can only edit one file at a time, with an interaction pane (a REPL, http://bit.ly/2cEcZrl) for testing and running code. Most Python programs by professionals use multiple files. We opted to design for the beginner who could get lost in a sea of files. Having more than one file open requires some interface for switching files. One file means no additional interface. You can build bigger things in JES, but we optimized for the most common learner case.
Imagineering Media Computation to be Normal Python: We built JES to facilitate students programming in Media Computation. Students can program anything in JES; it is a full Python implementation. We added additional features to JES to facilitate Media Computation with a minimum of cognitive load.
We imagineered a community of practice around Media Computation (a story we told in http://bit.ly/2bRCSmd). In JES, it is normal for Python to know how to make pictures with makePicture, access individual pixels with getPixels, change colors with setRed, and access sound samples with getSamples. Invisibly, we load libraries for the student so that JES is a Python that supports programming with multimedia from the first moment of class. In normal Python, you can print anything to get its value. In JES, you explore any picture or sound to open an explorer to see color values in pictures or to visualize the samples in a sound.
Later on, we tell students there are libraries and how JES is automatically loading them. But on Day One, JES is a friendly Python that knows about pixels and pictures, sounds and samples, and frames and videos. No imported libraries, no dots, no extra details.
Sometimes, you explain errors after they occur: When we first started teaching Python to non-technical majors at Georgia Tech, students struggled with indentation errors. Even today in our ebooks (http://bit.ly/2bW119J), indentation problems in Python are among the most common and the most difficult for students to fix.
We decided to add a small feature to JES to help with indentation errors. We wanted indentation to be salient and obvious. JES draws a small blue box around the lines in the same block as the current line (containing the cursor).
Now, when students come to me with indentation errors, I ask them “Where’s the blue box?” And they invariably ask, “What blue box?” I point it out to them. “Ohhhh, that blue box! Yeah, that’s useful.” Until students realize they have to attend to indentation, they do not see the support we are providing. Telling them that indentation is important and pointing out the blue box is not nearly as effective as encountering the error once.
Over the years, remove features: The original 2002 JES had more features in it than the most recent version. Two examples:
• We used to have a menu item to automatically turn in homework. It worked for a while at Georgia Tech, but then we changed how we handled homework. Other schools have always had their own mechanisms, and most schools try to make it easy to turn in homework. We dropped that.
• We worried about students loading a function from their homework file, then deleting the source code (by accident). The program might work correctly for them, but the file would be incomplete when grading. So, we wrote complicated code that compared the internal namespace to the source code, but that complicated code often caused problems of one sort. We decided it was not worth a big maintenance task for a rare error case, so we simply removed all of that complicated code.
The lesson here is Yogi Berra’s famous quote, “It’s tough to make predictions, especially about the future.” Some of the issues that seemed critical in the beginning simply were not all that important in actual practice later. The features with more staying power were the ones that we added in response to learner behavior. Our most successful features were the ones we added to meet real learners’ needs, not what we initially imagined the needs to be.
Valerie Barr is a professor in the Computer Science Department at Union College, Schenectady, NY.
Mark Guzdial is a professor in the College of Computing at the Georgia Institute of Technology, Atlanta, GA.
© 2016 ACM 0001-0782/16/11 $15.00
NOVEMBER 2016 | VOL. 59 | NO. 11 | COMMUNICATIONS OF THE ACM 11
news
Learning Securely
Because it is easy to fool, machine learning must be taught how to handle adversarial inputs.
Over the past five years, machine learning has blossomed from a promising but immature technology into one that can achieve close to human-level performance on a wide array of tasks. In the near future, it is likely to be incorporated into an increasing number of technologies that directly impact society, from self-driving cars to virtual assistants to facial-recognition software.
Yet machine learning also offers brand-new opportunities for hackers. Malicious inputs specially crafted by an adversary can “poison” a machine learning algorithm during its training period, or dupe it after it has been trained. While the creators of a machine learning algorithm usually benchmark its average performance carefully, it is unusual for them to consider how it performs against adversarial inputs, security researchers say.
[Figure: School bus + tiny adversarial perturbation = “ostrich”; dog + tiny adversarial perturbation = “ostrich.” Adversarial input can fool a machine-learning algorithm into misperceiving images. Images by Christian Szegedy et al.]
The emerging field of adversarial machine learning is exploring these vulnerabilities. In the past few years, researchers have figured out, for example, how to make tiny, imperceptible changes to an image to fool vision processing systems into interpreting an image humans see as a school bus as an ostrich instead. Such deceptions often can be carried out with virtually no knowledge about the inner workings of the machine learning algorithm under attack.
Machine learning can be easy to fool, computer scientists warn. “We don’t want to wait until machine learning algorithms are being used on billions of devices, and then wait for people to mount attacks,” said Nicholas Carlini, a graduate student in adversarial machine learning at the University of California, Berkeley, who has crafted audio files that sound like white noise to humans, but like commands to speech recognition algorithms. “We need to think of the attacks as early as possible.”
Attacking the Black Box
Adversarial machine learning has been studied for more than a decade in a few
they have confirmed the phenomenon in increasingly broad settings. In a paper posted online in May, Goodfellow, together with Nicolas Papernot and Patrick McDaniel of Pennsylvania State University, showed that adversarial examples transfer across five of the most commonly used types of machine learning algorithms: neural networks, logistic regression, support vector machines, decision trees, and nearest neighbors.
The team carried out “black box” attacks (with no knowledge of the model) on classifiers hosted by Amazon and Google. They found after only 800 queries to each classifier, they could create adversarial examples that fooled the two models 96% and 89% of the time, respectively.
Not every adversarial example crafted on one machine learning model will transfer to a different target, Papernot said, but “for an adversary, sometimes just a small success rate is enough.”
Getting Ready for Prime Time
Human vision systems have adversarial examples of our own, more commonly known as optical illusions. For the most part, though, people process visual data remarkably effectively. As computer vision systems approach human-level performance on particular tasks, adversarial examples offer a new way to benchmark performance, besides measuring how well an algorithm performs on typical inputs. Adversarial inputs “help find the flaws in neural networks that do really well on the more traditional kinds of benchmarks,” Goodfellow said.
“In a perfect world, what I would like to see in papers is, ‘Here is a new machine learning algorithm, and here is the standard benchmark for how it does on accuracy, and here is the standard benchmark on how it performs against an adversary,” Carlini said. Adversarial machine learning experts have some leads on what such a benchmark should look like, Papernot said, but no such benchmark has been established yet.
Software developers have a history of adding security to their products after the fact rather than integrating it into the development phase, Carlini said, even though that approach makes it easy to miss vulnerabilities. Now, he warned, the machine learning community is poised to make the same mistake: machine learning is a huge field, but only a tiny slice of the community is focused on security. “We should be developing security from the start,” he said. “It shouldn’t be an afterthought.”
The good news is that adversarial examples do not just offer a way to benchmark how vulnerable a machine learning model is; they also can be used to make the model more robust against an adversary and, in some cases, even improve its overall accuracy.
Some researchers are making machine learning algorithms more robust by essentially “vaccinating” them: adding adversarial examples, correctly labeled, into the training data. In a 2014 paper, Goodfellow and two colleagues demonstrated in a classification task involving handwritten numbers that this kind of training not only made it harder to fool a neural network, but even brought down its error rate on non-adversarial inputs. Kantchelian, Tygar, and Joseph have shown a similar vaccination process can greatly improve the robustness of boosted trees.
Most of the research on adversarial machine learning has focused on “supervised” learning, in which the algorithm learns from labeled data. Adversarial training offers a potential way for machine learning algorithms to learn from unlabeled data, an exciting prospect, Goodfellow said, since labeled data is expensive and time-consuming to create. In a paper posted online in May, Takeru Miyato of Kyoto University, along with Goodfellow and Andrew Dai of Google Brain, was able to improve the performance of a movie review classifier by taking unlabeled reviews, creating adversarial versions of them, and then teaching the classifier to group those reviews in the same category. With the plethora of texts available on the Internet, “you can get as many examples for unlabeled learning as you want,” Goodfellow said.
Meanwhile, in another paper published in May, Papernot, McDaniel, and other researchers detail the creation of another defense against adversarial examples, called “defensive distillation.” This approach uses a neural network to label images with probability vectors instead of single labels, and then trains a new neural network using the probability vector labels; this more nuanced training makes the second neural network less prone to overfitting. Fooling such a neural network, the researchers showed, required eight times as much distortion to the image as before the distillation.
The work done so far is just a start, Tygar said. “We haven’t even begun to understand all the potential different environments in which you might have an attack” on machine learning. The field of adversarial machine learning is full of opportunities, he said. “The area is ready to move.”
Although Tygar thinks it is a near certainty we will start seeing more attacks on machine learning as its use becomes prevalent, it would be a mistake, he said, to conclude machine learning is too risky to be used in adversarial settings. Rather, he said, the question should be, “How can we strengthen machine learning so it is ready for prime time?”
Further Reading
Goodfellow, I., Shlens, J., and Szegedy, C. Explaining and Harnessing Adversarial Examples. http://arxiv.org/pdf/1412.6572v3.pdf
Kantchelian, A., Tygar, J. D., and Joseph, A. Evasion and Hardening of Tree Ensemble Classifiers. http://arxiv.org/pdf/1509.07892.pdf
Miyato, T., Dai, A., and Goodfellow, I. Virtual Adversarial Training for Semi-Supervised Text Classification. http://arxiv.org/pdf/1605.07725v1.pdf
Papernot, N., McDaniel, P., Wu, X., Jha, X., and Swami, A. Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. Proceedings of the 37th IEEE Symposium on Security and Privacy, May 2016.
Papernot, N., McDaniel, P., and Goodfellow, I. Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples. http://arxiv.org/pdf/1605.07277v1.pdf
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. Intriguing Properties of Neural Networks. https://arxiv.org/pdf/1312.6199v4.pdf
Erica Klarreich is a mathematics and science journalist based in Berkeley, CA.
© 2016 ACM 0001-0782/16/11 $15.00
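To make the idea of a "tiny adversarial perturbation" concrete, here is a minimal sketch of one well-known way to craft one, the sign-of-the-gradient step from the Goodfellow, Shlens, and Szegedy paper listed above, applied to a toy logistic-regression classifier. The article does not say which method any of the quoted researchers used in a given experiment, so the choice of technique, the weights, the input, and the step size `eps` are all illustrative assumptions.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def gradient_sign_perturb(w, b, x, y, eps):
    """Nudge every feature by at most eps in the direction that raises the loss.

    For cross-entropy loss on logistic regression, the gradient of the loss
    with respect to the input is (p - y) * w, where p is the predicted
    probability and y the true label (0 or 1).
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


if __name__ == "__main__":
    n = 100  # many features, each contributing a little
    w = [0.1 if i % 2 == 0 else -0.1 for i in range(n)]   # toy weights (assumed)
    x = [0.52 if i % 2 == 0 else 0.48 for i in range(n)]  # toy input (assumed)
    x_adv = gradient_sign_perturb(w, 0.0, x, y=1, eps=0.05)
    print("clean prediction:      ", predict(w, 0.0, x))
    print("adversarial prediction:", predict(w, 0.0, x_adv))
```

No single feature moves by more than `eps`, yet the small changes all push in the same direction and add up across many features, which is why the predicted class can flip even though the input looks almost unchanged.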
Blockchain Beyond Bitcoin
Blockchain technology has the potential to revolutionize applications and redefine the digital economy.
Blockchain technology has attracted attention as the basis of cryptocurrencies such as Bitcoin, but its capabilities extend far beyond that, enabling existing technology applications to be vastly improved and new applications never previously practical to be deployed.
Also known as distributed ledger technology, blockchain is expected to revolutionize industry and commerce and drive economic change on a global scale because it is immutable, transparent, and redefines trust, enabling secure, fast, trustworthy, and transparent solutions that can be public or private. It could empower people in developing countries with recognized identity, asset ownership, and financial inclusion; and it could avert a repeat of the 2008 financial crisis, support effective healthcare programs, improve supply chains and, perhaps, clean up unethical behavior in high-value businesses such as diamond trading.
[Image by Imagentle]
Blockchain, like the Internet, is an open, global infrastructure that allows companies and individuals making transactions to cut out the middleman, reducing the cost of transactions and the time lapse of working through third parties. The technology is based on a distributed ledger structure and consensus process. The structure allows a digital ledger of transactions to be created and shared between distributed computers on a network. The ledger is not owned or controlled by one central authority or company, and can be viewed by all users on the network.
When a user wants to add a transaction to the ledger, the transaction data is encrypted and verified by other computers on the network using cryptographic algorithms. If there is consensus among the majority of computers that the transaction is valid, a new block of data is added to the chain and shared by all on the network. Transactions are secure, trusted, auditable, and immutable. They also avoid the need for copious, often duplicate, documentation, third-party intervention, and remediation.
Blockchains can be either public and unpermissioned, allowing anybody to use them (bitcoin is a case in point) or private and permissioned, creating a closed group of known participants working, perhaps, in a particular industry or supply chain.
Michael Versace, global research director for digital strategies at research firm IDC, describes blockchain as an industry and innovation accelerator based on the capability of the third platform of technology (the first platform being mainframes and their networks, the second Internet, personal computers, and local area networks). The third platform delivers computing anywhere, immediately, and allows organizations to deploy and consume computing resources in shared communities.
Says Versace, “The core capabilities of the third platform of technology are beyond any we have seen before. Innovation accelerators like blockchain mean we can achieve technology value outcomes that we couldn’t achieve before.”
This is promising, but there are caveats. Sandeep Kumar, managing director of capital markets and a blockchain specialist at digital business consulting and technology services firm Synechron, names data privacy, scalability, and interoperability as three key challenges to blockchain technology that are pervasive across applica-
technology and building the foundation of a standardized, production-grade digital ledger.
Deloitte is working with clients and startups to develop solutions including Smart Identity, which can support banks’ regulatory client onboarding and
Benefits of Blockchain in Financial
Data from Border Devices’ project.
Everledger’s focus is on the identity and legitimacy of objects. Blockchain works well here because its history cannot be changed and it enables trust by consensus. The company’s initial work provides a distributed ledger of dia-
surance industry by using smart contracts to pay out against insurance policies without policyholders having to make a claim. He adds: “The Internet of Things could be an enormous application area where people want to communicate with devices, but not through intermediaries. There is no killer app yet, but it is likely to feature the transparency of blockchain.”
While start-ups can skip some of the challenges presented by blockchain technology, established firms must set up a network of blockchain participants, perhaps suppliers and customers, and agree on technology protocols. Commercial firms, like others, will also hit the interoperability barrier identified by Synechron and by Microsoft in feedback from early blockchain adopters. Kumar explains: “Blockchain is evolving in many ecosystems, such as Hyperledger and Ethereum, but there needs to be a native way to integrate blockchains that would allow, for example, a transaction on Hyperledger to invoke information from Ethereum.”
Sirer warns of less-advantageous applications such as gambling and ongoing security problems. He cites the spectacular rise and fall of The DAO, a distributed autonomous organization based on Ethereum technology that acted as an investment vehicle, raising $220 million, then swiftly losing $53 million to a hacker. “We looked at the DAO code and found it was written so badly it was open to attack from nine different angles. Incidents like this uncover the need for more multi-disciplinary research on blockchain technology.”
Developing Countries
The potential of blockchain is also diverse in developing countries, but where the commercial world is concentrating on outstanding technology challenges, developing countries are initially focusing on the trust element of blockchain.
[Pull quote: The potential of blockchain is diverse in developing countries, where the initial focus is on the trust element.]
Mariana Dahan, senior operations officer at the World Bank in charge of the 2030 development agenda and United Nations (U.N.) relations, says, “We believe blockchain is a major breakthrough and has great potential. It will make an impact on, and bring value to, any transaction that requires trust, a social resource that is all too often in short supply.”
Dahan suggests the trust element of blockchain will play well into the 2030 Sustainable Development Goals adopted by U.N. members in 2015 and designed to end poverty, protect the planet, and ensure prosperity for all. More specifically, she notes high-potential applications of blockchain in land registration, digital identity, and finance for small and medium-sized enterprises.
Land registration and awareness of its relevance to issues such as food security, climate change, urbanization, and indigenous people’s rights has increased over recent years, yet the Independent Evaluation Group of the World Bank says 70% of the world’s population lacks access to proper land titling or demarcation.
Beginning to solve this problem are projects like one in the Republic of Georgia, where the National Agency of Public Registry is working with BitFury on a pilot project that will use a transparent, secure ledger to manage land titles and, if successful, cut property registration fees by up to 95%, increase transparency of land ownership, and reduce fraud. A similar project partially funded by the World Bank is being developed in Honduras, where Factom is working with the government to prototype a blockchain-based land registry.
Beyond land registration, Dahan explains how the ability to store and update property titles on a blockchain could, for the first time, allow poor people to assert reliable title claims to their homes and use them as collateral for borrowing. Small and medium-sized enterprises also could prove ownership of assets, perhaps equipment or livestock, and provide access to working capital and, by extension, a wider market.
Digital identity enabled by blockchain has the potential to change lives. Says Dahan, “If blockchain technology can be used to secure robust, self-sovereign digital identities around personal data, there’s a real possibility that people in places with poor documents, registries and rules of law can establish trusted measures of their good reputation. This would allow them to assert who they are and access proof of their digital identity anywhere using a private key.”
With the benefit of digital identity, many of the world’s two billion unbanked individuals could store their identities on a blockchain, permission banks to fulfill regulatory requirements such as Know Your Customer, and gain access to bank accounts, loans, and other financial services previously inaccessible to them.
The potential of blockchain to revolutionize applications and drive global economic change is certainly there, but problems persist in wide-scale execution. As Kumar concludes: “Blockchain is not yet ready for prime time.”
Further Reading
The future of financial infrastructure: An ambitious look at how blockchain can reshape financial services, World Economic Forum, https://www.weforum.org/reports/the-future-of-financial-infrastructure-an-ambitious-look-at-how-blockchain-can-reshape-financial-services/
Cuomo, J. How Businesses and Governments Can Capitalize on Blockchain, http://www.ibm.com/blogs/think/2016/03/16/how-businesses-and-governments-can-capitalize-on-blockchain/
Sirer, E.G. Introducing Virtual Notary, Hacking, Distributed, http://hackingdistributed.com/2013/06/20/virtual-notary-intro/
Casey, M., and Dahan, M. Blockchain technology: Redefining trust for a global, digital economy, http://blogs.worldbank.org/ic4d/blockchain-technology-redefining-trust-global-digital-economy?cid=EXT_WBBlogSocialShare_D_EXT
Sarah Underwood is a technology writer based in Teddington, U.K.
© 2016 ACM 0001-0782/16/11 $15.00
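The ledger mechanics the article describes (a transaction is checked by other computers, and a new block joins the chain only if a majority agree) can be sketched in a few lines. This is a toy illustration under stated assumptions, not how Bitcoin, Ethereum, or Hyperledger work: the names `make_block`, `is_valid_chain`, and `append_if_majority` are hypothetical, each "computer" is modeled as a plain function that approves or rejects a batch of transactions, and blocks are linked with SHA-256 hashes rather than the encryption and networked consensus protocols real systems use.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "tx": transactions}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(chain, transactions):
    """Build the block that would follow the current end of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    return {
        "index": index,
        "prev": prev_hash,
        "tx": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }


def is_valid_chain(chain):
    """Recompute every hash; any edit to an earlier block breaks the links."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev:
            return False
        if block["hash"] != block_hash(i, prev, block["tx"]):
            return False
        prev = block["hash"]
    return True


def append_if_majority(chain, transactions, validators):
    """Append a new block only if more than half of the validators approve."""
    votes = sum(1 for approve in validators if approve(transactions))
    if 2 * votes <= len(validators):
        return False
    chain.append(make_block(chain, transactions))
    return True


if __name__ == "__main__":
    def positive_amounts(txs):
        return all(t["amount"] > 0 for t in txs)

    ledger = []
    validators = [positive_amounts, positive_amounts, positive_amounts]
    append_if_majority(ledger, [{"from": "a", "to": "b", "amount": 5}], validators)
    print(len(ledger), is_valid_chain(ledger))
```

Because each block's hash covers the previous block's hash, altering an old transaction changes every later link, which is the sense in which the article calls the history immutable and auditable.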
Farm Automation Gets Smarter
As fewer people work the land, robots pick up the slack.
Field farming is “the world’s oldest profession,” and not just because food plants have been cultivated for over 10,000 years. Its individual practitioners are old as well, the median age rising rapidly as young people abandon the farming lifestyle (the U.S. Department of Agriculture reports a median age of 58 in 2012, up from 55 in 2002, with other countries showing similar data). Those who remain face the same repetitive work of seeding, weeding, feeding, and harvesting, the tedium of each task increasing as farms grow ever larger.
[Figure: The BoniRob is a multipurpose robotic platform for agricultural applications featuring independently steerable drive wheels and adjustable track width. Image courtesy of Bosch Deepfield Robotics.]
However, today’s agricultural robots excel at repetitive tasks, letting farmers tend to more strategic matters.
“When we develop robots, people always say they should increase yields or lower costs, but that’s not always the case,” says Eldert van Henten, a professor in Wageningen University’s Farm Technology Group. “Robots that offer the farmer time to devote attention to business and management also have economic value.”
Allen Lash, CEO of the farm management company Family Farms Group, sees agricultural robots filling in for the shrinking agricultural workforce. “In a lot of our small, rural communities, we just don’t have adequate labor to run big, sophisticated equipment anymore,” he says. “With autonomous equipment, you’ll need less labor, although the people you have will need more computer technology skills. They’ll control equipment remotely to a great extent, from a control room, scanning at least three cameras all the time. So we’ll get more acres covered in a more efficient way; maybe several thousand acres per employee, instead of today’s 1,000.”
Proponents believe robotic agriculture also will help ensure food is available for everyone, everywhere, from local sources. “Every country is now worried about food security, and the country next door has cheaper labor,” said Salah Sukkarieh, director of Research and Innovation at the Australian Centre for Field Robotics (ACFR) at the University of Sydney.
Food security is one of the driving forces behind “The Vegetable Factory,” an indoor farm that the Japan-based company SPREAD Co. Inc. plans to open in Kyoto next year. In addition to computer-controlled lighting, temperature, and humidity, virtually every aspect of a plant’s development from seedling to harvest will be automated. “We expect automation to reduce human error, therefore stabilizing the quality and quantity of production,” says SPREAD global marketing manager J.J. Price. “Such ‘plant factories’ will increase in the future to cover agricultural issues, food shortages, and environmental problems.”
Command and Control
Farm automation is particularly difficult because the environment varies so much, especially when compared to clean labs. “Computer scientists need to get outdoors more,” said ACFR’s Sukkarieh. “A lot of the [computer vision] algorithms get developed on very structured data, like a camera focusing on objects in a kitchen. But in the field, the wind blows leaves around, lighting conditions change, you get a bit of dust; these all affect the algorithm. We try things out in the lab, but then have to play with them because they don’t work as well outdoors, in an unstructured environment.”
Traditional farm equipment also adds unpredictability to the mix, as Autonomous Tractor Corporation (ATC) president and CEO Kraig Schulz learned when his company started planning to build an autonomous tractor. “Traditional [drive train] vendors couldn’t deliver what we needed, so we turned to electric drive trains, like Google and Tesla have for their cars. A traditional drive train is hard to control; there’s a lot of mechanical nonsense between the driver and the ground. But I can control an electric motor to milliseconds and fractions of an inch.” The company later released its “eDrive” system as an electric engine replacement for tractors already in the field.
As another step toward autonomous farming, ATC next supplemented traditional GPS navigation with a local laser-and-radio system named AutoDrive. “If you talk to farmers, they’ll say that GPS is good, but far from perfect. They’re in very rural areas, where reception isn’t great, and
GPS tends to suffer from problems like sunspots, reflections, and tree coverage,” says Schulz. “A lot of driverless cars use optics and radar and such, because there’s stuff around them to see. When you’re in a field, there’s nothing to see! So you need a second or third source of data to get relative positioning in the field.” The company is now working on a system that can be retrofit to any tractor with its eDrive, adding AutoDrive and its own “FieldSmart” artificial intelligence software.

An “App” for Planting
Another challenge for autonomous farming is the variety of tasks that agriculture demands. The hardest, according to Wageningen University’s van Henten, is selective harvesting. “Robots can do the trick now at relatively slow speeds, and without full success,” he says. “We still have a long way to go. But humans excel in terms of eye-hand coordination, intelligence, flexibility, and robustness against variation in the environment. So harvesting is done by humans very effectively.”
One problem, he says, is that “Plants that are members of the same type can differ a lot due to a lack of water, light, or nutrition.” Switching from one variety to another is also difficult. “Consider the tomato; usually a tomato is red, but there are some that are yellow. There are also different shapes, and beefsteak and cherry tomatoes have very different sizes. But the robot still has to recognize them as tomatoes.”
A somewhat simpler task is weeding, which is the bailiwick of the French company Naïo Technologies. “We started with weeding because it’s very arduous for farmers,” says Julien Laffont, the company’s international business developer, “and it’s a repetitive task, so it’s perfect for robots.” The company’s currently available “Oz” system uses two cameras and a laser to guide itself between rows of crops as it drags a weeding tool behind it.
Oz controls only the robot, not the tool it drags behind. Laffont said the company’s future “Dino” robot also will control the tools themselves, letting them do more than simply avoid harming crops. “We’ll use cameras that analyze the image to detect, circle, and count crops. We might also have applications to analyze the images for things like disease.”
A different approach to weeding can be found in “Lettucebot,” a product from U.S.-based Blue River Technology that puts its intelligence in the implement, combining vision technology with microspraying to apply herbicides to weeds and overplanted lettuce sprouts. According to the company, Lettucebot can identify 1.5 million plants per hour, weeding and thinning a 40-acre field within a day. For vice president of business development Ben Chostner, speed is only one benefit to this precision approach. “Most inputs or chemicals on the farm are applied in a broadcast way, with hardly any sensors. It’s like if someone in San Francisco had an infection, and the only solution was to give everyone an antibiotic; that would solve the problem, but it would be very expensive and inefficient, and would lead to negative consequences like resistance. That’s the same thing that’s happening with weed control today, because those are the tools we have.”
ATC’s Schulz looks forward to the day that weeding, planting, and harvesting are all treated as “apps” that a single machine (with specialized attachments) could learn to perform through artificial intelligence (AI) techniques. “The computer has to become a farmer just like a person becomes a farmer, over time, by learning. That’s where AI comes in. The driver should think of the tractor riding along like a son or daughter: it observes what the farmer is doing and thinks, ‘this is what I have to replicate.’ The tractor will start with small tasks and grow competence over time. Its tillage ‘app’ is very different from a planting ‘app’ because the task has fundamentally different controls, but the learning process could be similar.”

From Automation to Automatons
Kubota Corp. director and senior managing executive officer Satoshi Iida differentiates modern, computer-driven agents from centuries-old farming technology like the seed drill. “An agricultural machine can be defined as a ‘robot’ when it has the three technology components,” says Iida, who also is general manager of Kubota’s Research and Development (R&D) headquarters, and of the company’s Water and Environment R&D function. “It must have sensors that perceive its own situation; intelligence and control mechanism for determining and selecting the appropriate action by itself; and driving mechanisms for achieving what it has determined.”
Blue River Technology’s Chostner says, “A robot sees, decides, and acts. Then it closes that loop by seeing how it acted and adjusting its future actions.” On the other hand, Chostner says, “We’d much rather focus on the simplest tool that will create value for our customers, rather than the ultimate ‘robot’ that is the company vision. Robots are always something out in the future; automation is something that’s here today, creating value. We found that farmers don’t like to buy robots; they like to buy machines that work.”
Except for limited tasks by “machines that work,” fully automated growing may be years away. Kubota’s Iida believes the technology is now in the second of four stages: we have achieved automation of individual functions and auto-steering under operator control, but have yet to master unguided operation and fully autonomous farming.
Sukkarieh believes economic changes have cleared the way to reach that final phase. “Farming is probably the last major primary industry to focus on automation, and that’s because of the cost,” he says, “but technology costs have dropped dramatically. Even places with cheaper labor are considering agricultural automation. I think over the next year or two, you’ll see a large amount of activity happening in this space.”

Further Reading
Robotics & Automation Society, Technical Committee on Agricultural Robotics and Automation, IEEE: http://www.fieldrobot.com/ieeeras/
Past IEEE webinars on agricultural robotics: http://www.fieldrobot.com/ieeeras/Events.html
Video of Lettucebot: https://www.youtube.com/watch?v=yCPe9Sy0TFY
Robohub focus on agricultural robotics: http://robohub.org/tag/robohub-focus-on-agricultural-robotics/
Australian Centre for Field Robotics (Agriculture Projects): http://sydney.edu.au/acfr/agriculture

Tom Geller is an Oberlin, Ohio-based writer and documentary producer.

© 2016 ACM 0001-0782/16/11 $15.00
NOVEMBER 2016 | VOL. 59 | NO. 11 | COMMUNICATIONS OF THE ACM 19
viewpoints
IN THE EARLY days of computers, security was easily provided by physical isolation of machines dedicated to security domains. Today’s systems need high-assurance controlled sharing of resources, code, and data across domains in order to build practical systems. Current approaches to cyber security are more focused on saving money or developing elegant technical solutions than on working and protecting lives and property. They largely lack the scientific or engineering rigor needed for a trustworthy system to defend the security of networked computers in three dimensions at the same time: mandatory access control (MAC) policy, protection against subversion, and verifiability, which together I call a defense triad.
Fifty years ago the U.S. military recognized subversion[a] as the most serious threat to security. Solutions such as cleared developers and technical development processes were neither scalable nor sustainable for advancing computer technology and growing threats. In a 1972 workshop, I proposed “a compact security ‘kernel’ of the operating system and supporting hardware, such that an antagonist could provide the remainder of the system without compromising the protection provided.” I concluded: “We are confident that from the standpoint of technology there is a good chance for secure shared systems in the next few years. However, from a practical standpoint the security problem will remain as long as manufacturers remain committed to current system architectures, produced without a firm requirement for security. As long as there is support for ad hoc fixes and security packages for these inadequate designs, and as long as the illusory results of penetration teams are accepted as a demonstration of computer system security, proper security will not be a reality.”8

Current Approaches Aren’t Working
Our confidence in “security kernel” technology was well founded, but I never expected decades later to find the same denial of proper security so widespread. Although Forbes reports spending on information security reached $75 billion for 2015, our adversaries are still greatly outpacing us. With that large financial incentive for vested interests, resources are mostly devoted to doing more of what we knew didn’t work then, and still doesn’t.

[a] As characterized by Anderson, et al.,2 “System subversion involves the hiding of a software or hardware artifice in the system that creates a ‘backdoor’ known only to the attacker.”
Why does cyber security seem so difficult? Today’s emphasis on surveillance and monitoring tries to discover that an adversary has found and exploited a vulnerability to penetrate security and cause damage, or worse, subverted the security mechanism itself. Then that hole is patched. But science tells us trying to make a system secure in this way is effectively non-computable. Even after fixing known flaws, uncountable flaws remain. Recently, Steven Lipner, formerly of Microsoft, wrote a Commu-
advocating technical “secure develop-
remaining hole. Even worse, a witted adversary has numerous opportunities to subvert or sabotage a computer’s protection software itself to introduce insidious new flaws. This is an example of “malware,” a preferred attack for many of the most serious breaches. An IBM executive a few years ago described the penetrate-and-patch cycle as “an arms race we cannot win.”5

Figure 1. Cyber security defense triad. Triad labels: Mandatory [access control policy], Verifia[ble], Limit Subversion; center: Secure Systems. Legible notes: subversion is the tool of choice for the witted adversary; only label-based MAC policy can enforce secure information flow; security kernel (reference monitor) is only known
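The figure's note that only a label-based MAC policy can enforce secure information flow can be made concrete with a small sketch. It assumes a Bell-LaPadula-style rule set (no read up, no write down), a model the column itself does not name; the level names, the `Label` class, and the `may_read` and `may_write` helpers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# Hypothetical ordered sensitivity levels; a real policy defines its own lattice.
LEVELS = {"unclassified": 0, "secret": 1, "top_secret": 2}


@dataclass(frozen=True)
class Label:
    level: str
    categories: frozenset = frozenset()

    def dominates(self, other: "Label") -> bool:
        """True when this label is at least as high and covers all categories."""
        return (LEVELS[self.level] >= LEVELS[other.level]
                and self.categories >= other.categories)


def may_read(subject: Label, obj: Label) -> bool:
    # "No read up": the subject's label must dominate the object's label.
    return subject.dominates(obj)


def may_write(subject: Label, obj: Label) -> bool:
    # "No write down": the object's label must dominate the subject's label.
    return obj.dominates(subject)


# Illustrative subject and object labels (assumptions, not from the column).
analyst = Label("secret", frozenset({"crypto"}))
report = Label("unclassified")

print(may_read(analyst, report))   # reading lower-labeled data is allowed
print(may_write(analyst, report))  # writing into lower-labeled data is denied
```

In a kernel-based design, checks of this kind would sit in the reference monitor, so that every access is mediated by the labels rather than by the discretion of individual programs.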
Legally Speaking
Fair Use Prevails in
Oracle v. Google
Two software giants continue with legal sparring
after an initial judicial decision.
ORACLE AND GOOGLE have
been battling in the courts
for more than six years
about whether Google in-
fringed Oracle copyrights
by using 37 packages of the Java ap-
plication program interface (API) in
developing the Android platform for
smartphones. In May 2016, a jury re-
jected Oracle’s copyright claim and
decided that Google’s use of these 37
packages was fair and non-infringing.
Oracle’s lawyers have announced that
the company plans to appeal. This col-
umn will explain why Oracle’s appeal is
unlikely to succeed and why that’s good
news for Java programmers, for the
software industry, and for the public.
But before getting to that, this col-
umn will relate some facts about the
litigation, about fair use as a defense
to copyright infringement, and about
Oracle and Google’s arguments about the fair-use defense.
IMAGE COLLAGE BY ANDRIJ BORYS ASSOCIATES/SHUTTERSTOCK

Background About Oracle v. Google
Oracle acquired Sun Microsystems in 2010. Sun’s assets included intellectual property rights in Java technologies. Before that acquisition, Google negotiated with Sun about a possible license to use Java technologies in Android. Although those negotiations broke down, Google went ahead with using certain packages of the Java API, and in particular, the declarations that invoke implementing code for specific functions and the structure, sequence and organization (SSO) of classes within each package, without a license from Sun. It also developed more than 100 new packages in Java and C++ for smartphone functions. Soon after acquiring Sun, Oracle sued Google claiming that Android infringed Oracle’s patents and copyrights. In an earlier trial, a jury decided against the patent claims.
Initially, Google’s main defense to Oracle’s copyright claim was not fair use. Instead, Google asserted that the Java API packages, classes, and declarations it used in Android were not protectable by copyright law because they were too functional as components of the Java API system. An alternative formulation was that any expressiveness the Java API elements Google used in Android had “merged” with the API functionality and so should not be a basis for copyright liability. Because Judge Alsup found Google’s copyrightability defense persuasive, there was no need to reach Google’s backup fair-use defense.
Oracle appealed that ruling and convinced the Court of Appeals for the Federal Circuit (CAFC) that the Java API elements incorporated into Android (principally, the 7,000 declarations that Google had literally copied in Android source code) were protectable expression under U.S. copyright law. (My November 2012 Communications Legally Speaking column incorrectly predicted that the CAFC would affirm; my March 2015 column criticized the CAFC decision and incorrectly predicted that the Supreme Court would take Google’s appeal. Oh well.) The CAFC sent the case back for trial on the fair-use issue.

What Is Fair Use?
Fair use is a statutorily recognized defense to a claim of copyright infringement in the U.S. When this defense is successful, the defendant will be vindicated and no copyright liability will be found. (Most nations do not have fair-use provisions in their copyright laws.) The copyright statute says that four factors should be considered:
Purpose of Use. The purpose and character of the defendant’s use of the plaintiff’s work is the first factor to consider. This includes subfactors such as whether the use was commercial or noncommercial. Also significant is whether the use was “transformative.” That is, did the defendant’s use enable the creation of a new work that builds upon the plaintiff’s work, giving it a different purpose, meaning, or message, as a parody might do? Non-transformative uses consume the work for its original purpose, as a photocopy might do. Transformative uses are more likely to be fair uses than non-transformative ones.
Nature of Plaintiff’s Work. The nature of the copyrighted work is a second fair-use factor. If the plaintiff’s work is highly creative, entertaining, fanciful, or artistic, fair use is likely to be narrow. If the plaintiff’s work is functional or factual, fair use will tend to be broader.
Amount Taken. The substantiality of the defendant’s taking of expression from the plaintiff’s work is often measured quantitatively, that is, in terms of what proportion of expression from the plaintiff’s work the defendant appropriated. But qualitative assessments of substantiality are sometimes made, especially if the defendant copied the “heart” of the work. When the defendant’s use is transformative, however, the question becomes whether what the defendant took was reasonable in light of its transformative purpose.
Harm to Market. The effect of the defendant’s use of expression from the plaintiff’s work on the market for or value of the plaintiff’s work is also important. When the defendant’s use is transformative, the focus is on whether the defendant’s work would serve as a substitute for the plaintiff’s work, not whether the plaintiff wants the defendant to pay a license fee.

Fair-Use Arguments in Oracle v. Google
Oracle and Google had starkly different views about whether the reuse of 37 Java API packages could be fair use. Oracle emphasized that Google acted in bad faith because some internal email correspondence showed that some Google employees thought Google needed a license to use the Java API. Oracle also contended that Google had made non-transformative use of the Java API packages because it copied the declarations verbatim. Its purpose, moreover, was commercial. Oracle argued all three considerations weighed against fair use.
As for the other fair-use factors, Oracle insisted that the Java API was highly creative. Google’s appropriation was substantial because the Android source code included 11,500 lines of declaring code that Google copied from the Java API. Oracle’s economic expert witness testified that Google’s reuse of the Java API packages had caused substantial harm to the market for the Java API because Oracle had been unable to collect licensing revenues from Google and from others as well. Oracle also contended that the network effects arising from the success of the Android platform had made it impossible for Oracle to make a successful entrance into the smartphone market.
Google’s lawyer urged the jury to find that Google had made transformative uses of the Java API packages by building them into the Android platform. He compared the Java API to a file cabinet with folders, a functional device that was far from the core of copyright. He argued that Google took no more from the Java API than was necessary to achieve its transformative purpose. The transformative nature of Google’s use mitigated the harm factor because Android did not supplant demand for the Java API for its original purpose. Fair use is intended to promote ongoing innovation, which Google’s lawyer argued Android had done.

Oracle’s Effort to Overturn the Jury Verdict
Oracle v. Google was tried to a jury because the two software giants did not agree about key facts pertinent to the resolution of their dispute about Android. The role of a jury is to decide which litigant’s view of the facts is most persuasive, and then to apply the law to those facts in accordance with the instructions the judge reads to jurors after the trial testimony has ended and the lawyers have made their closing arguments. Juries generally come back with a verdict for one party or the other. They do not have to explain their findings or the reasons for their verdict.
After the jury found in favor of Google’s fair-use defense, Oracle’s lawyers filed a motion asking Judge Alsup to set aside the jury verdict and rule in its favor as a matter of law. The judge denied that motion and wrote an opinion to explain why a reasonable jury might have found in Google’s favor on several key fact issues:
For one thing, Oracle made much during the trial about Google acting in bad faith in using the Java API declarations when it should have gotten a license. Counteracting that evidence was testimony by some witnesses that Google’s reimplementation of interfaces was customary in the software industry. A reasonable jury, Judge Alsup concluded, could have found that Google’s good faith defense was more persuasive than Oracle’s bad faith accusation.
Second, Oracle insisted that the Java language, which all were free to use without permission, was distinct from the Java API declarations. Google
Economic and
Business Dimensions
Visualization to
Understand Ecosystems
Mapping relationships between stakeholders in an ecosystem
to increase understanding and make better-informed strategic decisions.
COMPANIES CAN NO longer
rely on just their internal
competencies. They must
complement their own
competencies with those of
other firms through alliances, part-
nerships, and digital relationships.
Google, by opening its mobile An-
droid platform to manufacturers and
developers, has become the leading
apps provider in the mobile space.
These multifaceted interdependen-
cies create complex connections, or
ecosystems, that affect and are affect-
ed by multiple stakeholders such as
suppliers, distributors, outsourcing
firms, makers of related products or
services, and technology providers.3
Competition today is between eco-
systems. Understanding ecosystems
can then help managers improve
strategic decisions and reshape the
boundaries of their industries. In-
terpreting this network of interac-
tions, however, can be difficult. For
example, a nascent ecosystem like
the Internet of Things (IoT) has no
dominant player and is increasing in
scale, scope, and complexity.4 Under these circumstances, companies can have difficulty determining the most effective business opportunities.
effectively, many entities, activities, and tools are required. Capturing, generating, and interpreting ecosystems requires a sponsor who wants to learn to interpret the visuals and make recommendations. Finally, companies must manage the data, visual tools, and lessons learned in order to be effective.
IMAGERY BY SUMKINN
Figure 1. An Internet of Things (IoT) stack.
Application Layer for Sense Making: for example, Analytics
System Integrators: for example, Accenture, Cognizant
Infrastructure: for example, Azure, Amazon Web Services, Telstra
Platforms for Device Connectivity: for example, Thingworkx, Xively, Iotivity
Sensory Layer: for example, ARM, Intel, Sigfox

approach and several visual representation methods can be used.1 The methodology consists of steps discussed here. This approach is not linear; there are many feedback loops between the stages. Progress is guided by human decision making along the way. We use The Internet of Things (IoT) as an illustration of this methodology.
Determine industry structure: Think of an industry as a company, its complementors and competitors. Identify the value chain or stacks of activities that deliver something of benefit to customers. These are inferred from industry publications and company websites. Industry structure helps the analyst identify the companies that are the major platform companies in an industry. For the IoT industry, the stack shown in Figure 1 is an illustration based on current industry consensus. Refinements to the stack can be made as new companies emerge. An independent list of available platforms can be collected from trade publications and mapped onto the stack.
Identify companies and their attributes: Platforms are important to technology-centric industries.5 Identifying platform companies for the ecosystem can be done by searching articles in industry publications. Use Internet search engines, news portals, socially curated news websites, or social media sources to locate these articles. Industry publications complemented with discussions with practitioners identified 34 platform companies for the IoT ecosystem. Lists can be augmented by semi-open sources such as Crunchbase or Angel.co or paid services such as Dun & Bradstreet, DataFox, and CB Insights. Once finalized, the analyst can visit each company’s website to identify listed partners that provide services or components for platforms and document dependencies. Most company websites list strategic alliances or partners, and describe the types of relationships (for example, technical, marketing, licensing, and so forth) and content (for example, date formed, nature, and other characteristics).
Finalize semantics for nodes and dependencies: Based on the data collected so far, the analyst can prepare it for visualization. Some software requires explicit specification of the visual encodings of nodes and edges (including the attributes that drive color, shape, size, and dependencies), and particular consideration to data type (that is, quantitative, ordinal, categorical). More recent software packages (for example, Tableau, Gephi, ecoxight) allow dynamic selection of attributes and assignments. Node sizes are usually based on a company’s revenue or number of employees. Dependencies are often color coded, based on relationship.

Figure 2. The IoT ecosystem.

Visualize, analyze, and interpret: The visualizations in the figures in this column use Gephi, an open source graph visualization tool.2 Many visualization packages offer different network layout algorithms. Most commonly used are derivations of force-directed layouts, in which prominent nodes are drawn in the center and less prominent ones are pushed to the periphery. Nodes close to each other have stronger associations. After visualization, the sponsoring organization can be asked for feedback to determine if they have any insights about companies in key network positions or clusters of interest, find any surprises, and identify companies that did not make the list. Corrections are incorporated in subsequent analysis. This iterative process creates confidence in the visualizations.
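The step of finalizing semantics for nodes and dependencies can be sketched in plain Python before any visualization tool is involved. This is only an illustration under stated assumptions: the `Company` and `Dependency` types, the firm names, the revenue figures, and the edge colors are all hypothetical placeholders, not data from the column. The red and gray node colors follow the convention the column describes for platforms and component providers.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Company:
    name: str
    kind: str            # categorical attribute: "platform" or "component"
    revenue_musd: float  # quantitative attribute; placeholder values only


@dataclass(frozen=True)
class Dependency:
    platform: str
    component: str
    relationship: str    # for example "technical", "marketing", "licensing"


# Hypothetical firms and links, invented for illustration.
companies = [
    Company("PlatformA", "platform", 10.0),
    Company("PlatformB", "platform", 5.0),
    Company("ChipCo", "component", 150.0),
    Company("SensorCo", "component", 200.0),
]
dependencies = [
    Dependency("PlatformA", "ChipCo", "technical"),
    Dependency("PlatformB", "SensorCo", "technical"),
    Dependency("PlatformB", "ChipCo", "licensing"),
]

# Visual encodings: node color by kind, edge color by relationship type.
NODE_COLOR = {"platform": "red", "component": "gray"}
EDGE_COLOR = {"technical": "blue", "marketing": "green", "licensing": "orange"}


def node_size(c: Company, lo: float = 5.0, hi: float = 50.0) -> float:
    """Scale revenue linearly into a drawable size range."""
    top = max(x.revenue_musd for x in companies)
    return lo + (hi - lo) * c.revenue_musd / top


# Degree (number of links) is a simple proxy for prominence in the network.
degree = Counter()
for d in dependencies:
    degree[d.platform] += 1
    degree[d.component] += 1

for c in companies:
    print(c.name, NODE_COLOR[c.kind], round(node_size(c), 1), degree[c.name])
```

A table like the one printed here is roughly what an analyst would export as node and edge lists for a layout tool; the layout algorithm itself is left to that tool.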
player. In Figure 2, platform companies are the red-colored circles and component providers are the gray-colored circles. Dependencies between platforms and components are shown as links. This visualization can be used to identify key players and determine business opportunities.4

Figure 3. Vertical differentiation in the IoT ecosystem.

Alliance Strength. Density of connections, number of partners, and network centrality all provide a sense of the resources available to any given firm that serves as an axis. In Figure 1, it’s easy to see, for example, that the Zigbee Alliance and Industrial Internet Consortium each represent an important nexus of activity. If one is a small player looking to join a winning alliance, then choosing from among the bigger players could be wise. If one is a competing nexus, however, choosing compatibility with other midsize players could reshape the network landscape.
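The alliance-strength measures named above (density of connections, number of partners, and centrality) can be computed directly from a list of links. The following is a minimal sketch using the standard textbook definitions of graph density and degree centrality, which the column does not spell out; the firm names and links are hypothetical.

```python
from collections import defaultdict

# Hypothetical undirected partnership links between firms.
edges = [
    ("AllianceX", "FirmA"), ("AllianceX", "FirmB"), ("AllianceX", "FirmC"),
    ("ConsortiumY", "FirmC"), ("ConsortiumY", "FirmD"),
]

# Build an adjacency map so each firm knows its set of partners.
neighbors = defaultdict(set)
for a, b in edges:
    neighbors[a].add(b)
    neighbors[b].add(a)

n = len(neighbors)

# Density: actual links divided by the n * (n - 1) / 2 possible links.
density = len(edges) / (n * (n - 1) / 2)

# Number of partners per firm, and degree centrality (partners / (n - 1)).
partners = {firm: len(ns) for firm, ns in neighbors.items()}
centrality = {firm: k / (n - 1) for firm, k in partners.items()}

for firm in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{firm}: partners={partners[firm]}, centrality={centrality[firm]:.2f}")
print(f"density={density:.2f}")
```

Firms at the top of the printed ranking are the candidates for the "nexus" role discussed above; other centrality measures (betweenness, eigenvector) could be substituted if the analysis calls for them.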
Vertical Integration. Companies that provide end-to-end solutions become vertically integrated and tightly controlled. For example, in its early days, the software industry was dominated by IBM, Digital Equipment, and International Computers. These end-to-end solutions providers fragmented into multiple companies and divided technical leadership. The IoT ecosystem (see Figure 3) reveals many clusters, suggesting they are either providing end-to-end solutions or leaving integration to the user. Of those three early pioneers, IBM survives today, but Bluemix (integrating IBM products and services), Kaa (providing middleware services that work with any stack), and 365 Farmnet (specialized for agricultural services) might be relevant for individual specializations yet doomed from a network perspective.

Figure 4. Preference of Android to IFTTT.

Preferential Attachment. With proliferation of platforms in the IoT ecosystem, solutions providers must select one platform over another. If they simultaneously commit to too many platforms they make significant resource commitments. In the software industry, solutions providers attach themselves to one or two platforms based on incentives, momentum, or technological superiority. Where solutions providers are unsure, they can
hedge their bets by selecting multiple platforms. Oracle built huge market share by supporting multiple platforms like Windows and Linux. Android (see Figure 4) is preferential to IFTTT, ensuring that Android can integrate with most other products.
New Entrant and Network Effects. Major players in non-IoT industries (for example, Samsung, Apple, and Alphabet) still fight for dominance in the IoT ecosystem. To gain developers, these companies open up their IoT platforms and programming interfaces. Brand recognition, existing devices, platforms, and relationships with developers might give them an advantage. As Alphabet expands into home automation with Nest, many Google Play developers might move with it. Developer familiarity with Google’s API in one setting could also be used to develop products in other settings. A similar trend in the software industry enabled Microsoft’s strong position in operating systems to help it dominate the web browser market. In Figure 3, Apple Homekit occupies a key position in the network. With Apple’s decision to open its API, it instantly brings several hundreds of thousands of developers into the equation.
Focus on the Core. An emergent ecosystem builds infrastructure first (see the core of the ecosystem in Figure 5). Personal computers needed an operating system before the explosion of applications could occur. The IoT ecosystem needs constant monitoring. Communicating data across networks is critical. Low-power solutions and connectivity are needed in the infrastructure (see the figure showing the core of the current IoT ecosystem) to help connect devices and ensure efficiency.
The Power of Interoperability
Emergent ecosystems require integration of devices and services. They also require interoperability between competing platforms. These significant challenges benefit from learning about the early days in the software industry. Application Program Interfaces (APIs) allowed firms to interact and share information with other firms, and made it easy to achieve integration and interoperability across platforms and devices. They could be used for more: APIs could track the ways users integrate services and devices, providing insights into how users derive value through a product’s use. Alphabet’s Nest has used this very effectively to implement the “Works With Nest” program. Because of this program, customers buying Nest will have the option to integrate it with the most common devices like LG washing machines or Amazon’s Echo. Nest’s position could be further solidified if it became interoperable with highly connected platforms (as shown in Figure 3). This position gives it access to critical knowledge flows across the network.

Figure 5. Core of the IoT ecosystem.

Challenges
Visualizations can help with understanding industry structure and emergence of key players. The dynamic nature of competition makes every new entrant and every move by an existing player relevant to how the topology of the ecosystem changes and to the competitive position of companies. Companies must constantly track nodes and dependencies in networks. Computational tools for data gathering and AI techniques for text processing and understanding can be automated so executives can focus on understanding and exploring interactive visuals. Ecosystem visualization continues to grow and mature as data scientists, graph theorists, and visualization researchers create new techniques. In this way, ecosystem visualization is likely to become ubiquitous in high-velocity business environments.

References
1. Basole, R.C. et al. Understanding business ecosystem dynamics: A data-driven approach. ACM Trans. Management Information Systems (TMIS) 6, 2 (Feb. 2015), 6.
2. Bastian, M., Heymann, S., and Jacomy, M. Gephi: An open source software for exploring and manipulating networks. ICWSM 8, (2009), 361–362.
3. Iansiti, M. and Levien, R. The Keystone Advantage: What the New Dynamics of Business Ecosystems Mean for Strategy, Innovation, and Sustainability. Harvard Business Press, 2004.
4. Iyer, B. To predict the trajectory of the Internet of Things, look to the software industry. Harvard Business Review (Feb. 25, 2016).
5. Parker, G.G., Van Alstyne, M.W., and Choudary, S.P. Platform Revolution. Norton Publishing, New York, 2016.

Bala R. Iyer ([email protected]) is Professor and Chair TOIM Division, Babson College, MA. Twitter: @BalaIyer
Rahul C. Basole ([email protected]) is Associate Professor, School of Interactive Computing, and Director, Tennenbaum Institute, Georgia Institute of Technology, GA. Twitter: @basole

Copyright held by authors.
Education
Growing Computer Science
Education Into a STEM
Education Discipline
Seeking to make computing education
as available as mathematics or science education.
COMPUTING EDUCATION IS
changing. At this year’s
CRA Snowbird Conference,
there was a plenary talk and
three breakout sessions
dedicated to CS education and enroll-
ments. In one of the breakout ses-
sions, Tracy Camp showed that much
of the growth in CS classes is coming
from non-CS majors, who have dif-
ferent goals and needs for comput-
ing education than CS majors.a U.S.
President Obama in January 2016 an-
nounced the CS for All initiative with a
goal of making computing education
available to all students.b
a http://cra.org/wp-content/uploads/2016/07/BoomCamp.pdf
b https://www.whitehouse.gov/blog/2016/01/30/computer-science-all
High school students and teachers engaging in collaborative meetings about computer science represents an important step toward making computing education as available as science or mathematics education.
PHOTO BY CALVIN LIN, UNIVERSITY OF TEXAS, AUSTIN
Last year, the U.S. Congress passed the STEM Education Act of 2015, which officially made computer science part of STEM (science, technology, engineering, and mathematics). The federal government offers incentives to grow participation in STEM, such as scholarships to STEM students and to prepare STEM teachers. Declaring CS part of STEM is an important step toward making computing education as available as mathematics or science education.
The declaration is just a first step. Mathematics and science classes are common in schools today. Growing computing education so it is just as common requires recognition that education in computer science is different in important ways from education in STEM. We have to learn to manage those differences.
Computer Science Is Invisible, Formally and Informally
Students enter mathematics or science classes at the post-secondary level with years of knowledge and experience behind them. Informally, many students talk to their parents about science issues (“Why is the sky blue?”), think about the numbers in their lives (“How much is the tax?”), visit science museums, and see media about mathematics and science. Students live in a world where living, chemical, and physical behavior is explained by biology, chemistry, and physics. They develop ideas about how the world works, some of which are wrong (like simple Lamarckian evolution). Students start formal learning
NOVEMBER 2016 | VOL. 59 | NO. 11 | COMMUNICATIONS OF THE ACM 31
viewpoints
ics education, we do not have learning
want students to be able to build programs they find motivating and engaging, without mastering all the skills of textual programming first. David Weintrop and Uri Wilensky have shown that students using blocks-based languages (like Scratch, Snap, and Blockly) make far fewer errors than in textual programming languages. Again, part of their achievement is in measurement. They developed a commutative assessment6 that allowed them to compare the same concepts in both textual and blocks-based programming languages. We will need many kinds of measures to develop learning progressions for computer science.
Computer Science Is Valued but Misunderstood
Students enter undergraduate mathematics and science classes after years of formal education, so they enter with a good idea about what those fields mean. That’s not true for computer science. Even undergraduate CS majors do not know what computer science is. Mike Hewner showed that even undergraduate students who declare a major in computer science only have an unclear idea of what the field is about.1
Large-scale surveys in collaboration between Google and Gallup have shown that parents and principals think courses in computer science are about how to use personal computers.c The surveys show parents and principals highly value computing, and want more computing education for their children. But the parents and principals mostly do not understand what it is.
Students want computer science, whatever they think it is. Many of them want it because of the economic value of knowledge of computer science. They don’t know what it is, but they know it can get them a good job and make them more effective at the job they want.
The CS situation is different from science or mathematics. Contrast the number of coding boot camps available in your area to the number of biology boot camps or algebra boot camps. While having a demand for CS is mostly positive, it creates a strange dynamic in the computer science class. Students demand the “real thing” (which we might interpret as “what will help me in a job”), even if they don’t know what that is. For example, students might complain about learning a blocks-based language or using a pedagogical IDE because it’s not “real,” even if they are not quite clear what “real” is.
Building the Infrastructure for CS Classes
In many countries and U.S. states, you can learn the number of students taking primary or secondary school reading, mathematics, or science classes. In the U.S., hardly any state can tell you the number of students in their computer science classes at any level, or what is being taught in those courses. (In many states, “computer science” and “computing applications” courses are considered the same.) Because computer science has only recently been declared part of STEM, it has not been tracked like other STEM subjects. We don’t know with certainty how much computing education is offered in the U.S. today nor where it is offered, which makes it difficult to plan and grow access to computing education.
We believe (from looking at data about Advanced Placement CS and in those states that do track) that far less than 30% of secondary school students even have the opportunity to take a computer science course in the U.S. today, and less than 10% of primary school
cation classes, we need to increase the number of computer science classes and teachers by a magnitude. That is an enormous change with dramatic implications. What we have today may not tell us much about tomorrow. The preparation, abilities, and preferences of existing computer science teachers may not be predictive when we have a 10-times-larger population of teachers. We have to invent whole new teacher education programs.
Steps Toward Pervasive Computing Education
While computer science is now part of STEM in the U.S. by fiat, students cannot access computer science classes as easily as mathematics and science education. Many countries are ramping up computing education, so the situation is going to change. As it does, we will have to develop more accurate expectations of how students learn CS, improve our ability to measure learning in computing, develop learning progressions, and create an infrastructure to develop teachers and track progress as we reach the pervasiveness of mathematics and science education.
References
1. Hewner, M. Undergraduate conceptions of the field of computer science. In Proceedings of the Ninth Annual International ACM Conference on International Computing Education Research (ICER ’13). ACM, New York, 2013, 107–114.
2. McCracken, M. et al. A multi-national, multi-institutional study of assessment of programming skills of first-year CS students. ACM SIGCSE Bulletin 33, 4 (2001), 125–140.
3. Morrison, B.B., Margulieux, L.E., and Guzdial, M. Subgoals, context, and worked examples in learning computing problem solving. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research (Omaha, Neb., 2015), 21–29.
4. Morrison, B.B., Margulieux, L.E., Ericson, B., and Guzdial, M. Subgoals help students solve Parsons problems. Paper presented at the Proceedings of the 47th ACM Technical Symposium on Computing Science Education (Memphis, Tenn., 2016).
5. Soloway, E. Learning to program = learning to construct mechanisms and explanations. Commun. ACM 29, 9 (Sept. 1986), 850–858.
6. Weintrop, D. and Wilensky, U. 2015. Using commutative assessments to compare conceptual understanding in blocks-based and text-based programs. In Proceedings of the Eleventh Annual International Conference on International Computing Education Research (ICER ’15). ACM, New York, NY, 2015, 101–110; DOI: http://dx.doi.org/10.1145/2787622.2787721
Mark Guzdial ([email protected]) is a professor at the Georgia Institute of Technology.
Briana Morrison ([email protected]) is an assistant professor of computer science at the University of Nebraska Omaha.
DOI:10.1145/2908733 Jack Copeland, Eli Dresner, Diane Proudfoot, and Oron Shagrir
Viewpoint
Time to Reinspect
the Foundations?
Questioning if computer science is outgrowing its traditional foundations.
THE THEORY OF computability
was launched in the 1930s, by
a group of logicians who pro-
posed new characterizations
of the ancient idea of an algo-
rithmic process. The most prominent
of these iconoclasts were Kurt Gödel,
Alonzo Church, and Alan Turing. The
theoretical and philosophical work that
they carried out in the 1930s laid the
foundations for the computer revolu-
tion, and this revolution in turn fueled
the fantastic expansion of scientific
knowledge in the late 20th and early 21st
centuries. Thanks in large part to these
groundbreaking logico-mathematical
investigations, unimagined number-
crunching power was soon boosting
all fields of scientific enquiry. (For an
account of other early contributors to
the emergence of the computer age see
Copeland and Sommaruga.9)
The motivation of these three revolu-
tionary thinkers was not to pioneer the
disciplines now known as theoretical
and applied computer science, although
with hindsight this is indeed what they
did. Nor was their objective to design
electronic digital computers, although
Turing did go on to do so. The found-
ing fathers of computability would have
thought of themselves as working in a
most abstract field, far from practical computing. They sought to clarify and define the limits of human computability in order to resolve open questions at the foundations of mathematics.
The 1930s revolution was a critical moment in the history of science: ideas devised at that time have become cornerstones of current science and technology. Since then many diverse computational paradigms have blossomed, and still others are the object of current theoretical enquiry: massively parallel and distributed computing, quantum computing, real-time interactive asynchronous computing, hypercomputing, nano-computing, DNA computing, neuron-like computing, computing over the reals, and computing involving quantum random-number generators. The list goes on … (for example, see Cooper et al.,3 and recent issues of Natural Computing and International Journal of Unconventional Computing).
IMAGE BY GANZALESS
Few of these forms of computation were even envisaged at the time of the 1930s analysis of computability, and yet the ideas forged then are still typically regarded as constituting the very basis of computing.
So here’s the elephant in the room. Do the concepts introduced by the 1930s pioneers provide a logico-mathematical foundation for what we call computing today, or do we need to overhaul the foundations in order to fit the 21st century? Much work has been devoted in recent years to analysis of the foundations and theoretical bounds of computing. However, the results of this diverse work (carried out by computer scientists, mathematicians, and philosophers) do not so far form a unified and coherent picture. It is time for the reexamination of the logico-mathematical foundations of computing to move center stage. Not that we should necessarily expect an entirely unified picture to emerge from these investigations, since it is possible that there is no common thread uniting the very different styles of computation countenanced today. Perhaps investigation will conclude that nothing more than ‘family resemblance’ links them.
A question pressed by foundational revisionists is: How are the bounds of computability, as delineated by the 1930s pioneers (bounds that theoretical computer science has by and large simply inherited and enshrined in the textbooks), related to the bounds of physical computing? The famous Church-Turing thesis delineates the bounds of computability in terms of the action of a Turing machine. But could there be mathematical functions that are physically computable and yet not Turing-machine computable? Various physical processes have been proposed that allegedly allow for computation beyond Turing-machine computability, appealing to physical scenarios that involve, for example, special or general relativity11,14,16,17,19 (see Davis10 for a critique of the whole idea of super-Turing computation). The possibility of super-Turing computation or hypercomputation is not ruled out by the Church-Turing thesis, once the latter is understood as originally intended, namely as an analysis of the bounds of what is computable by an idealized human computer: a mathematical clerk who works by rote with paper and pencil.5–7,21–23 The limits of a human computer don’t necessarily set the limits of every conceivable physical process (for example, see Pour-El and Richards18), nor the limits of every conceivable machine. So it is an open question whether the Turing-machine model of computation captures the physical, or even the logical, limits of machine computability.
The Turing machine has also traditionally held a central place in complexity theory, with the so-called ‘complexity-theoretic’ Church-Turing2 or ‘extended’ Church-Turing thesis1 asserting that there is always at most a polynomial difference between the time complexity of any reasonable model of computation and that of a probabilistic Turing machine. (A deterministic version of the thesis is also known as the Cobham-Edmonds thesis.13) This traditional picture is today under threat, when some propose counterexamples to the extended Church-Turing thesis. Bernstein and Vazirani2 gave what they called ‘the first formal evidence’ that quantum Turing machines violate the extended Church-Turing thesis. Shor’s quantum algorithm for prime factorization is arguably another counterexample.20 All this is highly controversial, but once again the signs are that we should take a systematic look at the foundations.
The most fundamental question of all is, of course: what is computation? Some will argue that the Turing-machine model gives an adequate answer to this question. But, given the enormous diversity in the many species of computation actually in use or under theoretical consideration, does the Turing-machine model still capture the nature of computation? For example, what about parallel asynchronous computations? Or interactive, process-based computations?15,24 Now that such diversity is on the table, it may be that the Turing-machine model no longer nails the essence of computation. If so, is there any sufficiently flexible and general model to replace it? Some of the authors of this Viewpoint would bet on Turing’s model, while the others would not. Either way this question is central.
Turning to the mysterious computer in our skulls … just as the limits of an idealized human clerk working by rote do not necessarily dictate the limits of computing hardware, nor do they necessarily dictate the limits of the human brain. For example, do human beings qua creative mathematicians carry out brain-based computations that transcend the limits of human beings qua mathematical clerks? The question is not merely philosophical: it seems highly likely that the brain will prove to be a rich source of models for new computing technologies. This gives us yet more reason to return to the writings of the 1930s pioneers, since both Gödel and Turing appear to have held that mathematical thinking possesses features going beyond the classical Turing-machine model of computation (see Copeland and Shagrir8).
In 2000, the first International Hypercomputation Workshop was held at University College, London. Now there is a growing community of hundreds of researchers in this and allied fields, communicating results at conferences (for example, ‘Computability in Europe,’ ‘Theory and Applications of Computing,’ and ‘Unconventional Computing’) and in journals (for example, Applied Mathematics and Computation, Theoretical Computer Science, Computability, and Minds and Machines). We hope this is only the beginning. If it was important to clarify the logico-mathematical foundations of computing in the 1930s, when computer science was
no more than a gleam in Turing’s eye, how much more important it seems today, when computing technology is diversifying and mutating at an unprecedented rate. Via the great pioneers of electronic computing (such as Freddie Williams, Tom Kilburn, Harry Huskey, Jay Forrester, John von Neumann, Julian Bigelow, and of course Turing himself) the 1930s analysis of computation led to the modern computing era. Who knows where a 21st-century overhaul of that classical analysis might lead.
References
1. Aharonov, D. and Vazirani, U.V. Is quantum mechanics falsifiable? A computational perspective on the foundations of quantum mechanics. In B.J. Copeland, C. Posy, and O. Shagrir, Eds., Computability: Gödel, Turing, Church and Beyond, MIT Press, Cambridge, MA, 2013.
2. Bernstein, E. and Vazirani, U.V. Quantum complexity theory. STOC ’93 Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing (1993), 11–20.
3. Cooper, B., Lowe, B. and Sorbi, A., Eds. New Computational Paradigms: Changing Conceptions of What Is Computable. Springer, 2008.
4. Copeland, B.J. The Church-Turing Thesis. The Stanford Encyclopedia of Philosophy. E.N. Zalta, Ed., 2002; http://plato.stanford.edu/entries/church-turing/.
5. Copeland, B.J. Narrow versus wide mechanism. The Journal of Philosophy 97 (2000), 5–32.
6. Copeland, B.J. Hypercomputation: Philosophical issues. Theoretical Computer Science 317 (2004), 251–267.
7. Copeland, B.J. and Proudfoot, D. Alan Turing’s forgotten ideas in computer science. Scientific American 280 (1999), 76–81.
8. Copeland, B.J. and Shagrir, O. Turing versus Gödel on computability and the mind. In B.J. Copeland, C. Posy, and O. Shagrir, Eds., Computability: Gödel, Turing, Church and Beyond, MIT Press, Cambridge, MA, 2013.
9. Copeland, B.J. and Sommaruga, G. The stored-program universal computer: Did Zuse anticipate Turing and von Neumann? In G. Sommaruga and T. Strahm, Eds., Turing’s Revolution, Birkhauser, Basel, 2006.
10. Davis, M. The myth of hypercomputation. In C. Teuscher, Ed., Alan Turing: Life and Legacy of a Great Thinker. Springer Verlag, Berlin, 2004, 195–212.
11. Etesi, G. and Németi, I. Non-Turing computations via Malament-Hogarth space-times. International Journal of Theoretical Physics 41 (2002), 341–370.
12. Gandy, R. Church’s thesis and principles for mechanisms. In J. Barwise, D. Kaplan, H.J. Keisler, P. Suppes, and A.S. Troelstra, Eds., The Kleene Symposium, North-Holland, Amsterdam, 1980.
13. Goldreich, O. Computational Complexity: A Conceptual Perspective. Cambridge University Press, 2008.
14. Hogarth, M.L. Does general relativity allow an observer to view an eternity in a finite time? Foundations of Physics Letters 5 (1992), 173–181.
15. Milner, R. Elements of interaction: Turing Award lecture. Commun. ACM 36, 1 (Jan. 1993), 78–89.
16. Németi, I. and Andréka, H. Can general relativistic computers break the Turing barrier? In A. Beckmann, U. Berger, B. Löwe, and J.V. Tucker, Eds., Logical Approaches to Computational Barriers, Springer-Verlag, Berlin, 2006, 398–412.
17. Pitowsky, I. The physical Church thesis and physical computational complexity. Iyyun 39 (1990), 81–99.
18. Pour-El, M. and Richards, I. The wave equation with computable initial data such that its unique solution is not computable. Advances in Mathematics 39 (1981), 215–239.
19. Shagrir, O. and Pitowsky, I. Physical hypercomputation and the Church-Turing Thesis. Special issue on hypercomputation. Minds and Machines 13, 1 (2003), 87–101.
20. Shor, P.W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM Journal on Computing 26, 5 (1997), 1484–1509.
21. Sieg, W. Calculations by man and machine: Conceptual analysis. In W. Sieg, R. Sommer, and C. Talcott, Eds., Reflections on the Foundations of Mathematics, Association for Symbolic Logic, Natick, MA, 2002, 390–409.
22. Sieg, W. On computability. In A. Irvine, Ed., Handbook of the Philosophy of Mathematics, Elsevier, 2009.
23. Stannett, M. X-machines and the halting problem: Building a super-Turing machine. Formal Aspects of Computing 2 (1990), 331–341.
24. Wegner, P. and Goldin, D. Computation beyond Turing machines. Commun. ACM 46, 4 (Apr. 2003), 100–102.
Jack Copeland ([email protected]) is a professor of Philosophy at the University of Canterbury, New Zealand, where he is also the director of the Turing Archive for the History of Computing.
Eli Dresner ([email protected]) is a member of the Gershon H. Gordon Faculty of Social Sciences at Tel Aviv University.
Diane Proudfoot ([email protected]) is an associate professor (reader) of Philosophy at the University of Canterbury, where she is also the co-director of the Turing Archive for the History of Computing.
Oron Shagrir ([email protected]) is a professor of Philosophy and Cognitive Science at the Hebrew University of Jerusalem.
Copyright held by authors.
Viewpoint
Technology and
Academic Lives
Considering the need to create new modes of interaction and
approaches to assessment given a rapidly evolving academic realm.
“I’M OVERCOMMITTED AT the moment. I find the university at the center of a huge work speedup.”
—Vice provost and dean
sessed in three worlds: technology-intensive university departments in 1975, when few had Internet access; in 1995, with the Internet but no Web services; and today. I was a graduate student in 1975, climbing the aca-
a My three snapshots of academic life are from graduate work at Stanford, MIT, and UCSD; rising from assistant to full professor at UC Irvine in the 1990s; and now an affiliate professor at the University of Washington.
b Facsimile machines existed but were not widely used. I first sent a fax in the late 1980s.
ideas, and possible collaborations. Departmental colloquia were well attended; faculty often commented cogently on talks outside their area. At UCSD, each graduate student presented a project to the department at the end of the first year, providing all faculty with a view into their colleagues’ work, methods, and student selection and training. Faculty helped one another, building the department’s reputation in an outside world that was known mainly through journal articles.
It wasn’t idyllic: there were factional disputes, tenure anxiety, and ‘publish or perish.’ Nevertheless, high familiarity enabled faculty promotion cases to be handled effectively within a department and school.
The Internet Arrives
In late 1979, halfway through my Ph.D. journey, our lab connected to the ARPANET. Faculty and staff could access the computer and the ARPANET via modem from home. For students, the Internet precursor was largely a curiosity. For faculty, it turned out to be more significant.
Taking a postdoc in the U.K. in 1982 was like parachuting in with the clothes on my back. Communication with U.S. family, friends, and colleagues was epistolary and about one in 10 airmail letters went by sea and arrived a month late. On different occasions, the two senior professors from my UCSD lab visited and presented research co-authored with people outside our lab. I was stunned. This work had begun when I was a student but was never discussed in the lab. Prior to the ARPANET, everything was discussed. Suddenly, faculty could work with distant colleagues. They no longer had to educate departmental colleagues to get strong feedback. With faculty more focused on external discussions and collaborations, departmental meetings became less central to academic life.
1995: Assessment from Outside
After a stint in industry, I joined a remarkably democratic department: All faculty participated in all evaluations. Within months, as an assistant professor, I was voting on associate and full professor cases. Faculty were no longer well acquainted with one another’s work; we relied extraordinarily heavily on external letters. This was not good news for those of us in sub-fields in which most professors had recently arrived from industry (such as IBM Research and Bell Labs), where reference letters were more balanced. The academic norm was lavish praise; readers looked for subtle negatives. My own tenure and full professor promotions encountered reference-letter turbulence but survived.
Being able to interact at a distance has phenomenal benefits, but it does reduce local interaction. This diminishing of community was reflected in the increased outsourcing of faculty assessment. It didn’t seem ideal to weight the subjective opinions of outsiders so heavily, but it was manageable.
A highly beneficial technology can have an undesirable side effect.
2015: The Information Age
As one of few full professors in HCI in the early 2000s, I was asked to write many letters for appointments and promotions. Over time, urgent personal requests gave way to form letters that often omit key information. External letters appeared to be losing their dominant role. This hypothesis was supported in my recent inquiries. A wealth of digital data is now available and often relied upon. One department chair wrote, “There is a pervasive atmosphere of pressure and obsessive quantification, and yes, it affects senior people, too. In part this is because as the senior ranks fill with people who came up through the obsessive quantification, the attitudes become entrenched ... People who don’t buy into obsessive quantification get filtered out.”
Budget-strapped legislatures demand that state universities show evidence of impact. Great teaching may not compensate for weak research, but bad teaching is tolerated less. Public universities with reduced state support and private universities facing higher costs count on rainmaking. “Funding is the first consideration for promotion to full, even though that is
practice
DOI:10.1145/2980932
The Power of Babble
Expect to be constantly and pleasantly befuddled.
BY PAT HELLAND
METADATA DEFINES THE shape, the form, and how to understand our data. It is following the trend taken by natural languages in our increasingly interconnected world. While many concepts can be communicated using shared metadata, no one can keep up with the number of disparate new concepts needed to have a common understanding.
English is the lingua franca of the world, yet there are many facets of humanity and the concepts held by different people that simply cannot be captured in English no matter how pervasive the language. In fact, English itself has nooks, crannies, dialects, meetups, and teenager slang that innovate and extend its permutations with usages that usually do not converge. My personal idiolect shifts depending on whether I am speaking to a computer science audience, my team at work with its contextual usages, my wife, my grandkids, or the waiter at a local restaurant. Different communities of people extend English in different ways.
Computer systems have an emerging and increasing common metadata for interoperability. XML and now JSON fill similar roles by making the parsing of messages easy and common. It’s great we are no longer arguing over ASCII versus EBCDIC, but that is hardly the most challenging problem of understanding.
As we move up the stack of understanding, new subtleties constantly emerge. Just when we think we understand, the other guy has some crazy new ideas!
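The gap between easy parsing and hard understanding can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two messages, their field names (`customer`, `weight`), and their units are hypothetical and do not come from any real system. Both messages parse cleanly with the standard `json` module, yet they agree on little beyond syntax.

```python
import json

# Two hypothetical services describe "the same" shipment.
# Field names and units are invented for illustration.
msg_a = '{"customer": "ACME", "weight": 12.5}'         # weight as a number (kilograms)
msg_b = '{"customer": {"id": 42}, "weight": "27 lb"}'  # weight as text (pounds)

a = json.loads(msg_a)
b = json.loads(msg_b)

def same_shape(x, y):
    """Return True when both dicts have the same keys and value types."""
    return x.keys() == y.keys() and all(type(x[k]) is type(y[k]) for k in x)

# Parsing succeeds for both, and the key names even match...
keys_match = a.keys() == b.keys()
# ...but the values differ in type, and the units differ in meaning.
shapes_match = same_shape(a, b)

print(keys_match, shapes_match)  # prints: True False
```

Even a shape check like `same_shape` catches only the type mismatch. Knowing that one weight is in kilograms and the other in pounds still requires shared understanding that no parser supplies.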
healthcare and manufacturing standards). Many companies have internal communication standards as well.
Dave Clark of MIT observed that successful standards happen only if
dard typically contains the union of all ideas discussed by the committee. Natural selection relegates these standards and their clutter to history books.
extensions onto the side of the app, this impacts the shape, form, and meaning of its internal and shared data.
When there’s a common application lineage, there’s a common understanding of its data. Popular ERP (enterprise resource planning), CRM (customer relationship management), and HRM (human resources management) applications have their ways of solving business problems, and different compa-
[Figure 1. The Apocalypse of Two Elephants. Plot of activity over time, with curves labeled “research” and “billion-dollar investment.”]
Internecine Interop
Still, challenges of understanding may exist even across departments or divisions of the same company. A large conglomerate may sell many products, including light bulbs, dishwashers, locomotives, and nuclear power plants. I would hazard a guess that it doesn’t have a single canonical customer record type.
Of course, mergers and divestitures impact a company’s metadata. I know from personal experience how difficult it is to change my mailing address with a bank or insurance company. They can’t seem to track down all the systems that record my address even over the course of a year. It’s not a big surprise that they have a hard time managing their metadata.
[Figure 2. Least-lossy conversion. Diagram of services connected pairwise.]
form of translation. Unfortunately, this results in a boatload of translators. Creating a specific conversion for each source and destination pair results in great conversion fidelity but also results in N² converters (see Figure 2).

What to do? Many times, we simply capture a canonical representation and do two data translations: first, a lossy translation into the canonical representation; then, a lossy translation from the canonical representation into the target representation. This is double-lossy and just doesn’t supply as good a result.

Why do the translation to a canonical form? Because only 2*N translators are needed for N sources, and that is a heck of a lot fewer than N², as N gets large. Using canonical metadata as a common translation reduces the number of converters but results in a double-lossy conversion (see Figure 3).

In most cases, people use canonical metadata to bound complexity but add specific source-to-target translators when the lossiness is too large.

What Color Are Your Rose-Colored Glasses?
We all see stuff couched in terms of a set of assumptions. This is a worldview that allows us to interpret incoming information. This interpretation may be right or wrong, but, more importantly, it is right or wrong for our subjective usage.

Computer systems are invariably designed for a certain company, department, or group. The data is typically cast into a meaning and use that are appropriate for one side but lose their deeper meaning through the translation.

Sometimes, the meaning and understanding of some data are deeply couched in cultural issues. Any translation to a new environment and culture simply loses all meaning. Reading about daily life in Medieval Europe doesn’t help much unless you study the relationships between serfs and lords as well as between men and women. Only then can you understand the actions described in the book. Similarly, in any discussion of privacy, cultural expectations must be addressed. In North America and Europe, protecting against the damage that may result from disclosing a medical challenge is paramount. In India, the essential need to vet a prospective spouse for your child is deemed more important than holding an illness private. Communication cannot take place without understanding the assumptions and interpreting through that lens.

The artificial language Esperanto was created in 1887 with the hope of achieving a common shared natural language for all people. Some folks grabbed hold and used it to write and share. Some say a few million speak it today.

The use of Esperanto has been waning, however. Each of the approximately 6,000 languages spoken by different communities in the world has its own flavor and nuance. You can say certain things in one language that you just can’t say in another one.

Diverse and Homogeneous
The words and phrases people use and the metadata that applications use follow a similar pattern. With a common codebase DNA and history, some meanings are the same. As time, evolution, and commingling occur, it’s more difficult to understand one another.

New software applications either in the cloud or on premises sometimes offer enough business benefit that enterprises adapt their ways of doing business to fit the application. The new user adopts the canonical representation of data and business processes by sheer hard work. When the business value of the software is high enough, mapping to it is cost effective. Now the enterprise is much more closely aligned to the new approach and to interoperating with other enterprises sharing the new data and process.

Next, the enterprise will begin to extend the system using extensibility features. These extensions can then become a source of misunderstanding, but they bring business value to the enterprise.

The U.S., Canada, and many other Western countries have tremendous diversity in their populations. New arrivals bring new customs. They work to understand the existing customs in their new home. While there are many differences at first, in a few short years the immigrants fit in. Their children are deeply ingrained in the new country, even though they still like some of that food their mom cooked at home. That food becomes as American (or English or German) as pizza, tacos, and falafel.

Similarly, the base metadata continues to move and adjust as it assimilates those new messages and fields that made no sense at all a short time ago.

Relishing Diversity
While not understanding another party is a pain, it probably means that innovation and growth have occurred. Economic forces will drive when and where it’s worth the bother to invest in deeper understanding.

Playing loose with understanding allows for better cohesion, as exemplified by Amazon’s product catalog and the search results from Google or Bing.

Remember that in many cases, cultural and contextual issues will drive how something is interpreted. Extensible data does not have a prearranged understanding. Translating between representations is lossy and frequently involves a painful trade-off between expensive handcrafted translators and even lossier multiple translations.

Personally, as the years have gone by, I’ve gotten much more relaxed about the things I don’t know and don’t understand. A lot of stuff confuses me! As we interoperate across disparate boundaries, it would do us well to remember that the less stressed we are about perfect understanding and agreement, the better we will all get along. Moving forward, I expect to be constantly and pleasantly befuddled by the power of babble.

Related articles on queue.acm.org

Immutability Changes Everything
Pat Helland
http://queue.acm.org/detail.cfm?id=2884038

Standardized Storage Clusters
Garth Goodson et al.
http://queue.acm.org/detail.cfm?id=1317402

Search Considered Integral
Ryan Barrows and Jim Traverso
http://queue.acm.org/detail.cfm?id=1142068

Reference
1. Clark, D. 2009. The Apocalypse of Two Elephants, or ‘what I really said.’ Advanced Network Architecture. MIT CSAIL; http://groups.csail.mit.edu/ana/People/DDC/Apocalypse.html.

Pat Helland has been implementing transaction systems, databases, application platforms, distributed systems, fault-tolerant systems, and messaging systems since 1978. He currently works at Salesforce.

Copyright held by author. Publication rights licensed to ACM. $15.00
DOI:10.1145/2980987
Article development led by queue.acm.org

Scaling Synchronization in Multicore Programs

Advanced synchronization methods can boost the performance of multicore software.

BY ADAM MORRISON

IMAGE BY STEVE BALL

DESIGNING SOFTWARE FOR modern multicore processors poses a dilemma. Traditional software designs, in which threads manipulate shared data, have limited scalability because synchronization of updates to shared data serializes threads and limits parallelism. Alternative distributed software designs, in which threads do not share mutable data, eliminate synchronization and offer better scalability. But distributed designs make it challenging to implement features that shared data structures naturally provide, such as dynamic load balancing and strong consistency guarantees, and are simply not a good fit for every program.

Often, however, the performance of shared mutable data structures is limited by the synchronization methods in use today, whether lock-based or lock-free. To help readers make informed design decisions, this article describes advanced (and practical) synchronization methods that can push the performance of designs using shared mutable data to levels that are acceptable to many applications.

To get a taste of the dilemmas involved in designing multicore software, let us consider a concrete problem: implementing a work queue, which allows threads to enqueue and dequeue work items—events to handle, packets to process, and so on. Issues similar to those discussed here apply in general to multicore software design.14

Centralized shared queue. One natural work queue design (depicted in Figure 1a) is to implement a centralized shared (thread-safe) version of the familiar FIFO (first in, first out) queue data structure—say, based on a linked list. This data structure supports enqueuing and dequeuing with a constant number of memory operations. It also easily facilitates dynamic load balancing: because all pending work is stored in the data structure, idle threads can easily acquire work to perform. To make the data structure thread-safe, however, updates to the head and tail of the queue must be synchronized, and this inevitably limits scalability.

Using locks to protect the queue serializes its operations: only one core at a time can update the queue, and the others must wait for their turns. This ends up creating a sequential bottleneck and destroying performance very quickly. One possibility to increase scalability is by replacing locks with lock-free synchronization, which directly manipulates the queue using atomic instructions,1,11 thereby reducing the amount of serialization. (Serialization is still a problem because the hardware cache coherence mechanism1 serializes atomic instructions updating the same memory location.) In practice, however, lock-free synchronization often does not outperform lock-based synchronization, for reasons to be discussed later.

Partially distributed queue. Al-
ternative work-queue designs seek scalability by distributing the data structure, which allows for more parallelism but gives up some of the properties of the centralized shared queue. For example, Figure 1b shows a design that uses one SPMC (single-producer/multiple-consumer) queue per core. Each core enqueues work into its queue. Dequeues can be implemented in various ways—say, by iterating over all the queues (with the starting point selected at random) until finding one containing work.

This design should scale much better than the centralized shared queue: enqueues by different cores run in parallel, as they update different queues, and (assuming all queues contain work) dequeues by different cores are expected to pick different queues to dequeue from, so they will also run in parallel.

What this design trades off, though, is the data structure’s consistency guarantee. In particular, unlike the centralized shared queue, the distributed design does not maintain the cause and effect relation in the program. Even if core P1 enqueues x1 to its queue after core P0 enqueues x0 to its queue, x1 may be dequeued before x0. The design weakens the consistency guarantees provided by the data structure.

The fundamental reason for this weakening is that in a distributed design, it is difficult (and slow) to combine the per-core data into a consistent view of the data structure—one that would have been produced by the simple centralized implementation. Instead, as in this case, distributed designs usually weaken the data structure’s consistency guarantees.5,8,14 Whether the weaker guarantees are acceptable or not depends on the application, but figuring this out—reasoning about the accept-

Figure 1. Possible designs for a work queue. (a) all cores access a centralized shared queue; (b) each core has an SPMC queue. (Diagram labels: cores P0, P1, P2; centralized FIFO queue; SPMC and SPSC queues.)
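The per-core design of Figure 1b can be made concrete with a small sketch. The C code below is illustrative only and is not taken from the article: it stands in for each SPMC queue with a fixed-size ring buffer guarded by its own spinlock, and the names `NUM_CORES`, `QUEUE_CAP`, `wq_init`, `wq_enqueue`, and `wq_dequeue` are hypothetical. The caller passes the queue at which the scan starts (normally chosen at random) so that the behavior is easy to reproduce.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_CORES 4   /* hypothetical number of cores */
#define QUEUE_CAP 64  /* hypothetical per-core capacity */

/* One FIFO ring buffer per core, each guarded by its own spinlock.
 * This is a simplification standing in for a real SPMC queue. */
struct core_queue {
    atomic_flag busy;
    void *items[QUEUE_CAP];
    size_t head;  /* next slot to dequeue */
    size_t tail;  /* next slot to enqueue */
};

static struct core_queue queues[NUM_CORES];

static void cq_lock(struct core_queue *q) {
    while (atomic_flag_test_and_set_explicit(&q->busy, memory_order_acquire)) {
        /* spin until the flag is released */
    }
}

static void cq_unlock(struct core_queue *q) {
    atomic_flag_clear_explicit(&q->busy, memory_order_release);
}

void wq_init(void) {
    for (int i = 0; i < NUM_CORES; i++) {
        atomic_flag_clear(&queues[i].busy);
        queues[i].head = queues[i].tail = 0;
    }
}

/* Enqueue a non-NULL item on the calling core's own queue.
 * Returns 0 on success, -1 if that queue is full. */
int wq_enqueue(int core, void *item) {
    assert(core >= 0 && core < NUM_CORES && item != NULL);
    struct core_queue *q = &queues[core];
    int rc = -1;
    cq_lock(q);
    if (q->tail - q->head < QUEUE_CAP) {
        q->items[q->tail % QUEUE_CAP] = item;
        q->tail++;
        rc = 0;
    }
    cq_unlock(q);
    return rc;
}

/* Scan every queue once, beginning at `start`, and return the first
 * item found, or NULL if all queues were empty when visited. */
void *wq_dequeue(int start) {
    assert(start >= 0);
    for (int i = 0; i < NUM_CORES; i++) {
        struct core_queue *q = &queues[(start + i) % NUM_CORES];
        void *item = NULL;
        cq_lock(q);
        if (q->head != q->tail) {
            item = q->items[q->head % QUEUE_CAP];
            q->head++;
        }
        cq_unlock(q);
        if (item != NULL) return item;
    }
    return NULL;
}
```

If core 0 enqueues x0 and core 1 then enqueues x1, a call to `wq_dequeue(1)` returns x1 first, which is the reordering the text describes.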
struct Node {
  struct Node *next;
  void *value;
};
Semantics-based optimizations. The server thread has a global view of concurrently pending operations, which it can leverage to optimize their execution in two ways:
• Combining. The server can combine multiple operations into one and thereby save repeated accesses to the data structure. For example, multiple counter-increment operations can be converted into one addition.
• Elimination. Mutually canceling operations, such as a counter increment and decrement, or an insertion and removal of the same item from a set, can be executed without modifying the data structure at all.

Deferring delegation. For operations that only update the data structure but do not return a value, such as enqueue(), delegation facilitates an optimization that can sometimes eliminate serialization altogether. Since these operations do not return a response to the invoking core, the core does not have to wait for the server to execute them; it can just log the requested operation in the publication list and keep running. If the core later invokes an operation whose return value depends on the state of the data structure, such as a dequeue(), it must then wait for the server to apply all its prior operations. But until such a time—which in update-heavy workloads can be rare—all of its operations execute asynchronously.

The original implementation of this optimization still required cores to synchronize when executing these deferred operations, since it logged the operations in a centralized (lock-free) queue.7 Boyd-Wickizer et al.,3 however, implemented deferred delegation without any synchronization on updates by leveraging systemwide synchronized clocks. Their OpLog library logs invocations of responseless update operations in a per-core log, along with their invocation times. Operations that read the data structure become servers: they acquire the locks of all per-core logs, apply the operations in timestamp order, and then read the updated data-structure state. OpLog thus creates scalable implementations of data structures that are updated heavily but read rarely, such as LRU (least recently used) caches.

Figure 5. Critical path of lock-free updating head of linked list. (a) ideal case; (b) effect of CAS failures. (Timelines for cores P0, P1, and P2, each reading head and then performing a CAS.)

Performance. To demonstrate the benefits of delegation, let’s compare a lock-based work queue to a queue implemented using delegation. The lock-based algorithm is Michael and Scott’s two-lock queue.11 This algorithm protects updates to the queue’s head and tail with different locks, serializing operations of the same type but allowing enqueues and dequeues to run in parallel. Queue-based CLH (Craig, Landin, and Hagersten) locks are used in the evaluated implementation of the lock-based algorithm. The delegation-based queue is Fatourou and Kallimanis’s CC-Queue,4 which adds delegation to each of the two locks in the lock-based algorithm. (It thus has two serv-
ers running: one for dequeues and one for enqueues.)

Figure 3 shows enqueue/dequeue throughput comparison (higher is better) of the lock-based queue and its delegation-based version. The benchmark models a generic application. Each core repeatedly accesses the data structure, performing pairs of enqueue and dequeue operations, reporting the throughput of queue operations (that is, the total number of queue operations completed per second). To model the work done in a real application, a period of “think time” is inserted after each queue operation. Think times are chosen uniformly at random from 1 to 100 nanoseconds to model challenging workloads in which queues are heavily exercised. The C implementations of the algorithms from Fatourou and Kallimanis’s benchmark framework (https://github.com/nkallima/sim-universal-construction) are used, along with a scalable memory allocation library to avoid malloc bottlenecks. No semantics-based optimization is implemented.

This benchmark (and all other experiments reported in this article) was run on an Intel Xeon E7-4870 (Westmere EX) processor. The processor has [...]

Figure 3 shows the benchmark throughput results, averaged over 10 runs. The lock-based algorithm scales to two threads, because it uses two locks, but fails to scale beyond that amount of concurrency because of serialization. In contrast, the delegation-based algorithm scales and ultimately performs almost 30 million operations per second, which is more than 3.5 times that of the lock-based algorithm’s throughput.

Avoiding CAS Failures in Lock-Free Synchronization
Lock-free synchronization (also referred to as nonblocking synchronization) directly manipulates shared data using atomic instructions instead of locks. Most lock-free algorithms use the CAS (compare-and-swap) instruction (or equivalent) available on all multicore processors. A CAS takes three operands: a memory address addr, an old value, and a new value. It atomically updates the value stored in addr from old to new; if the value stored in addr is not old, the CAS fails without updating memory.

CAS-based lock-free algorithms synchronize with a CAS loop pattern: A core reads the shared state, computes a new value, and uses CAS to update a shared variable to the new value. If the CAS succeeds, this read-compute-update sequence appears to be atomic; otherwise, the core must retry. Figure 4 shows an example of such a CAS loop for linking a node to the head of a linked list, taken from Treiber’s classic LIFO (last in, first out) stack algorithm.15 Similar ideas underlie the lock-free implementations of many basic data structures such as queues, stacks, and priority queues, all of which essentially perform an entire data-structure update with a single atomic instruction.

The use of (sometimes multiple) atomic instructions can make lock-free synchronization slower than a lock-based solution when there is no (or only light) contention. Under high contention, however, lock-free synchronization has the potential to be much more efficient than lock-based synchronization, as it eliminates lock acquire and release operations from the critical path, leaving only the data structure operations on it (Figure 5a). In addition, lock-free algorithms guarantee that some operation can always complete and thus behave gracefully under high load, whereas a lock-based algorithm can grind to a halt if the operating system preempts a thread that holds a lock.

Figure 6. Lock-free synchronization CAS failure problem. (a) Lock-free vs. Lock-based Queue Throughput; (b) Effect of CAS Failures. (Axes: M ops/second and CAS/op versus threads; series: lock-based, lock-free, CAS, CAS loop, CAS per op.)

In practice, however, lock-free algorithms may not live up to these performance expectations. Consider, for example, Michael and Scott’s lock-free queue algorithm.11 This algorithm implements a queue using a linked list, with items enqueued to the tail and removed from the head using CAS loops. (The exact details are not as important as the basic idea, which is similar in spirit to the example in Figure 4.) De-
spite this, as Figure 6a shows, the lock-free algorithm fails to scale beyond four threads and eventually performs worse than the two-lock queue algorithm.

The reason for this poor performance is CAS failure: as the amount of concurrency increases, so does the chance that a conflicting CAS gets interleaved in the middle of a core’s read-compute-update CAS region, causing its CAS to fail. CAS operations that fail in this way pile useless work on the critical path. Although these failing CASes do not modify memory, executing them still requires obtaining exclusive access to the variable’s cache line. This delays the time at which later operations obtain the cache line and complete successfully (see Figure 5b, in which only two operations complete in the same time that three operations completed in Figure 5a).

To estimate the amount of performance wasted because of CAS failures, Figure 6b compares the throughput of successful CASes executed in a CAS loop (as in Figure 4) to the total CAS throughput (including failed CASes). Observe that the system executes contending atomic instructions at almost three times the rate ultimately observed in the data structure. If there were a way to make every atomic instruction useful toward completing an operation, you would significantly improve performance. But how can this be achieved, given that CAS failures are inherent?

The key observation to make is that the x86 architecture supports several atomic instructions that always succeed. One such instruction is FAA (fetch-and-add), which atomically adds an integer to a variable and returns the previous value stored in that variable. The design of a lock-free queue based on FAA instead of CAS is described next. The algorithm, named LCRQ (for linked concurrent ring queue),12 uses FAA instructions to spread threads among items in the queue, allowing them to enqueue and dequeue quickly and in parallel. LCRQ operations typically perform one FAA to obtain their position in the queue, providing exactly the desired behavior.

Figure 8. Enqueue/dequeue throughput comparison of all queues.

Figure 7. Infinite array queue.

// The following defines a node
struct Cell {
  void *value;
};

// Queue is infinite array of nodes,
// with head and tail pointers.
Cell Q[] = { ⊥, ⊥, ... };
int head = 0;
int tail = 0;

void enqueue(void *x) {
  while (true) {
    t = FAA(&tail, 1)
    if ( CAS(&Q[t], ⊥, x) ) return
  }
}

void *dequeue() {
  while (true) {
    h = FAA(&head, 1)
    if ( !CAS(&Q[h], ⊥, ⊤) ) return Q[h]
    if ( tail ≤ h+1 ) return NULL
  }
}
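As a concrete illustration of FAA, the sketch below bounds the infinite-array queue of Figure 7 to a fixed number of cells, each used at most once. It is an illustration under stated assumptions, not LCRQ: `CAPACITY`, `BOTTOM`, `TOP`, and `top_marker` are hypothetical names, NULL stands in for ⊥, and the address of a private marker stands in for ⊤. C11 `atomic_fetch_add` plays the role of FAA and `atomic_compare_exchange_strong` the role of CAS.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define CAPACITY 1024  /* hypothetical bound replacing the infinite array */

static char top_marker;                /* its address plays the role of ⊤ */
#define BOTTOM ((void *)0)             /* ⊥: empty cell */
#define TOP ((void *)&top_marker)      /* ⊤: cell closed by a dequeuer */

static _Atomic(void *) Q[CAPACITY];    /* zero-initialized: every cell is ⊥ */
static atomic_size_t head = 0;
static atomic_size_t tail = 0;

/* Returns 0 on success, -1 once the bounded array is used up. */
int enqueue(void *x) {
    assert(x != BOTTOM && x != TOP);   /* reserved values may not be enqueued */
    for (;;) {
        size_t t = atomic_fetch_add(&tail, 1);   /* FAA: always succeeds */
        if (t >= CAPACITY) return -1;            /* artifact of the bound */
        void *expected = BOTTOM;
        if (atomic_compare_exchange_strong(&Q[t], &expected, x))
            return 0;                            /* cell was ⊥, now holds x */
        /* A dequeuer already closed this cell; try the next index. */
    }
}

/* Returns an item, or NULL if the queue is empty. */
void *dequeue(void) {
    for (;;) {
        size_t h = atomic_fetch_add(&head, 1);   /* FAA: always succeeds */
        if (h >= CAPACITY) return NULL;          /* artifact of the bound */
        void *expected = BOTTOM;
        if (!atomic_compare_exchange_strong(&Q[h], &expected, TOP))
            return expected;   /* CAS failed, so the cell held an item */
        if (atomic_load(&tail) <= h + 1)
            return NULL;       /* no enqueuer is ahead of this index */
    }
}
```

Calling `dequeue()` on the empty queue marks cell 0 with ⊤, so the next `enqueue` fails its first CAS and retries on cell 1. Each retry costs one more FAA, but the FAA itself never fails.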
(Figure 8, panels (a) and (b): throughput in M ops/second versus number of threads for LCRQ (FAA), LCRQ (CAS), delegation, and lock-free (CAS) queues.)

The LCRQ Algorithm
This section presents an overview of the LCRQ algorithm; for a detailed description and evaluation, see the paper.12 Conceptually, LCRQ can be viewed as a practical realization of the following simple but unrealistic queue algorithm (Figure 7). The unrealistic algorithm implements the queue using an infinite array, Q, with (unbounded) head and tail indices that identify the part of Q that may contain items. Initially, each cell Q[i] is empty and contains a reserved value ⊥ that may not be enqueued. The head and tail indices are manipulated using FAA and are used to spread threads around the cells of the array, where they synchronize using (uncontended) CAS.

An enqueue(x) operation obtains a cell index t via an FAA on tail. The enqueue then atomically places x in Q[t] using a CAS to update Q[t] from ⊥ to x. If the CAS succeeds, the enqueue operation completes; other-
wise, it repeats this process.

A dequeue, D, obtains a cell index h using FAA on head. It tries to atomically CAS the contents of Q[h] from ⊥ to another reserved value ⊤. This CAS fails if Q[h] contained some x ≠ ⊥, in which case D returns x. Otherwise, the fact that D stored ⊤ in the cell guarantees an enqueue operation that later tries to store an item in Q[h] will not succeed. D then returns NULL (indicating the queue is empty) if tail ≤ h + 1 (the value of head following D’s FAA is h + 1). If D cannot return NULL, it repeats this process.

This algorithm can be shown to implement a FIFO queue correctly, but it has two major flaws that prevent it from being relevant in practice: using an infinite array and susceptibility to livelock (when a dequeuer continuously writes ⊤ into the cell an enqueuer is about to access). The practical LCRQ algorithm addresses these flaws.

The infinite array is first collapsed to a concurrent ring (cyclic array) queue—CRQ for short—of R cells. The head and tail indices still strictly increase, but now the value of an index modulo R specifies the ring cell to which it points. Because now more than one enqueuer and dequeuer can concurrently access a cell, the CRQ uses a more involved CAS-based protocol for synchronizing within each cell. This protocol enables an operation to avoid waiting for the completion of operations whose FAA returns smaller indices that also point to the same ring cell.

The CRQ’s crucial performance property is that in the common fast path, an operation executes only one FAA instruction. The LCRQ algorithm then builds on the CRQ to prevent the livelock problem and handle the case of the CRQ filling up. The LCRQ is essentially a Michael and Scott linked list queue11 in which each node is a CRQ. A CRQ that fills up or experiences livelock becomes closed to further enqueues, which instead append a new CRQ to the list and begin working in it. Most of the activity in the LCRQ therefore occurs in the individual CRQs, making contention (and CAS failures) on the list’s head and tail a nonissue.

Performance. This section compares the LCRQ to Michael and Scott’s classic lock-free queue,11 as well as to the delegation-based variant presented previously. The impact of CAS failures is explored by testing LCRQ-CAS, a version of LCRQ in which FAA is implemented with a CAS loop.

Figure 8a shows the results. LCRQ outperforms all other queues beyond two threads, achieving peak throughput of ≈ 40 million operations per second, or about 1,000 cycles per queue operation. From eight threads onward, LCRQ outperforms the delegation-based queue by 1.4 to 1.5 times and the MS (Michael and Scott) queue by more than three times. LCRQ-CAS matches LCRQ’s performance up to four threads, but at that point its performance levels off. Subsequently, LCRQ-CAS exhibits the throughput “meltdown” associated with CAS failures. Similarly, the MS queue’s performance peaks at two threads and degrades as concurrency increases.

Oversubscribed workloads can demonstrate the graceful behavior of lock-free algorithms under high load. In these workloads the number of software threads exceeds the hardware-supported level, forcing the operating system to context-switch between threads. If a thread holding a lock is preempted, a lock-based algorithm cannot make progress until it runs again. Indeed, as Figure 8b shows, when the number of threads exceeds 20, the throughput of the lock-based delegation algorithm plummets by 15 times, whereas both LCRQ and the MS queue maintain their peak throughput.

Conclusion
Advanced synchronization methods can boost the performance of shared mutable data structures. Synchronization still has its price, and when performance demands are extreme (or if the properties of centralized data structures are not needed), then distributed data structures are probably the right choice. For the many remaining cases, however, the methods described in this article can help build high-performance software. Awareness of these methods can assist those designing software for multicore machines.

Related articles on queue.acm.org

Nonblocking Algorithms and Scalable Multicore Programming
Samy Al Bahra
http://queue.acm.org/detail.cfm?id=2492433

Maximizing Power Efficiency with Asymmetric Multicore Systems
Alexandra Fedorova et al.
http://queue.acm.org/detail.cfm?id=1658422

Software and the Concurrency Revolution
Herb Sutter and James Larus
http://queue.acm.org/detail.cfm?id=1095421

References
1. Al Bahra, S. Nonblocking algorithms and scalable multicore programming. Commun. ACM 56, 7 (July 2013), 50–61.
2. Boyd-Wickizer, S., Frans Kaashoek, M., Morris, R. and Zeldovich, N. Non-scalable locks are dangerous. In Proceedings of the Ottawa Linux Symposium, 2012, 121–132.
3. Boyd-Wickizer, S., Frans Kaashoek, M., Morris, R. and Zeldovich, N. OpLog: A library for scaling update-heavy data structures. Technical Report MIT-CSAIL-TR2014-019, 2014.
4. Fatourou, P. and Kallimanis, N.D. Revisiting the combining synchronization technique. In Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012, 257–266.
5. Haas, A., Lippautz, M., Henzinger, T.A., Payer, H., Sokolova, A., Kirsch, C.M. and Sezgin, A. Distributed queues in shared memory: multicore performance and scalability through quantitative relaxation. In Proceedings of the ACM International Conference on Computing Frontiers, 2013, 17:1–17:9.
6. Hendler, D., Incze, I., Shavit, N., Tzafrir, M. Flat combining and the synchronization-parallelism trade-off. In Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures, 2010, 355–364.
7. Klaftenegger, D., Sagonas, K. and Winblad, K. Delegation locking libraries for improved performance of multithreaded programs. In Proceedings of the 20th International European Conference on Parallel and Distributed Computing, 2014, 572–583.
8. Kulkarni, M., Pingali, K., Walter, B., Ramanarayanan, G., Bala, K., Chew, L.P. Optimistic parallelism requires abstractions. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, 211–222.
9. Lozi, J.-P., David, F., Thomas, G., Lawall, J. and Muller, G. Remote core locking: migrating critical section execution to improve the performance of multithreaded applications. In Proceedings of the 2012 USENIX Annual Technical Conference, 65–76.
10. Mellor-Crummey, J.M. and Scott, M.L. Algorithms for scalable synchronization on shared-memory multiprocessors. ACM Trans. Computer Systems 9, 1 (1991), 21–65.
11. Michael, M.M. and Scott, M.L. Simple, fast, and practical non-blocking and blocking concurrent queue algorithms. In Proceedings of the 15th Annual ACM Symposium on Principles of Distributed Computing, 1996, 267–275.
12. Morrison, A. and Afek, Y. Fast concurrent queues for x86 processors. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013, 103–112.
13. Oyama, Y., Taura, K. and Yonezawa, A. Executing parallel programs with synchronization bottlenecks efficiently. In Proceedings of International Workshop on Parallel and Distributed Computing for Symbolic and Irregular Applications, 1999, 182–204.
14. Shavit, N. Data structures in the multicore age. Comm. ACM 54, 3 (Mar. 2011), 76–84.
15. Treiber, R.K. Systems programming: coping with parallelism. Technical Report RJ5118 (2006). IBM Almaden Research Center.

Adam Morrison works on making parallel and distributed systems simpler to use without compromising their performance. He is an assistant professor at the Blavatnik School of Computer Science, Tel Aviv University, Israel.

Copyright held by author. Publication rights licensed to ACM. $15.00
DOI:10.1145/2949033
Article development led by queue.acm.org

Research for Practice: [...] and Implications of [...]

Expert-curated guides to the best of CS research for practitioners.

BY PETER BAILIS, CAMILLE FOURNIER, JOY ARULRAJ, AND ANDREW PAVLO

and remained primarily academic concerns. As a result, these protocols were largely ignored by industry. The rise of Internet-scale services and demands for automated solutions to cluster management, failover, and sharding in the 2000s finally led to the widespread practical adoption of these techniques. Adoption proved difficult, however, and the process in turn led to new (and ongoing) research on the subject. The papers in this selection highlight the challenges and the rewards of making the theory of consensus practical—both in theory and in practice.

Second, while consensus concerns distributed shared state, our second selection concerns the impact of hardware trends on single-node shared [...] NVM (non-volatile memory) technologies on modern storage systems. NVM promises to overhaul the traditional
Can We Make This Easier?
Ongaro, D., Ousterhout, J.
In search of an understandable consensus algorithm.
Proceedings of the Usenix Annual Technical Conference, 2014, 305–320.
https://www.usenix.org/conference/atc14/technical-sessions/presentation/ongaro
Finally we come to the question, have we built ourselves into unnecessary complexity by taking it on faith that Paxos and its close cousins are the only way to implement consensus? What if there was an algorithm that we could also show to be correct but was designed to be easier for people to comprehend and implement correctly?
Raft is a consensus algorithm written for managing a replicated log but designed with the goal of making the algorithm itself more understandable than Paxos. This is done both by decomposing the problem into pieces that can be implemented and understood independently and by reducing the number of states that are valid for the system to hold.
Consensus is decomposed into issues of leader election, log replication, and safety. Leader election uses randomized election timeouts to reduce the likelihood of two candidates for leader splitting the vote and requiring a new round of elections. It allows candidates for leader to be elected only if they have the most up-to-date logs. This prevents the need for transferring data from follower to leader upon election. If a follower’s log does not match the expected state for a new entry, the leader will replay entries from earlier in its log until it reaches a point at which the logs match, thus correcting the follower. This also means that a history of changes is stored in the logs, providing a side value of letting clients read (some) historical entries, should they desire.
The authors then show that after teaching a set of students both Paxos and Raft, the students were quizzed on their understanding of each and scored meaningfully higher on the Raft quiz. Looking around the current state of consensus systems in industry, we can see this play out in another way: namely, several new consensus systems have been created since 2014 based on Raft, where previously there were very few reliable and successful open-source systems based on Paxos.
Bottlenecks, Single Points of Failure, and Consensus
Developers are often tempted to use a centralized consensus system to serve as the system of record for distributed coordination. Explicit coordination can make certain problems much easier to reason about and correct for; however, that puts the consensus system in the position of the bottleneck or critical point of failure for the other systems that rely on it to make progress. As we can see from these papers, making a centralized consensus system production-ready can come at the cost of adding optimizations and recovery mechanisms that were not dreamed of in the original Paxos literature.
What is the way forward? Arguably, writing systems that do not rely on centralized consensus brokers to operate safely would be the best option, but we are still in the early days of coordination-avoidance research and development. While we wait for more evolution on that front, Raft provides an interesting alternative, an algorithm designed for readability and general understanding. The impact of having an easier algorithm to implement is already being felt, as far more developers are embedding Raft within distributed systems and building specifically tailored Raft-based coordination brokers. Consensus remains a tricky problem, but one that is finally seeing a diversity of approaches to reaching a solution.
Camille Fournier is a writer, speaker, and entrepreneur. Formerly the CTO of Rent the Runway, she serves on the technical oversight committee for the Cloud Native Computing Foundation, as a Project Management Committee member of the Apache ZooKeeper project, and a project overseer of the Dropwizard Web framework.
Implications of NVM on Database Management Systems
By Joy Arulraj and Andrew Pavlo
The advent of non-volatile memory (NVM) will fundamentally change the dichotomy between memory and durable storage in a database management system (DBMS). NVM is a broad class of technologies, including phase-change memory, memristors, and STT-MRAM (spin-transfer torque-magnetoresistive random-access memory), that provide low-latency reads and writes on the same order of magnitude as DRAM (dynamic random-access memory), but with persistent writes and large storage capacity like an SSD (solid-state drive). Unlike DRAM, writes to NVM are expected to be more expensive than reads. These devices also have limited write endurance, which necessitates fewer writes and wear-leveling to increase their lifetimes.
The first NVM devices released will have the same form factor and block-oriented access as today’s SSDs. Thus, today’s DBMSs will use this type of NVM as a faster drop-in replacement for their current storage hardware. By the end of this decade, however, NVM devices will support byte-addressable access akin to DRAM. This will require additional CPU architecture and operating-system support for persistent memory. This also means that existing DBMSs are unable to take full advantage of NVM because their internal architectures are predicated on the assumption that memory is volatile. With NVM, many of the components of legacy DBMSs are unnecessary and will degrade the performance of data-intensive applications.
We have selected three papers that focus on how the emergence of byte-addressable NVM technologies will impact the design of DBMS architectures. The first two present new abstractions for performing durable atomic updates on an NVM-resident database and recovery protocols for an NVM DBMS. The third paper addresses the write-endurance limitations of NVM by introducing a collection of write-limited query-processing algorithms. Thus, this selection contains novel ideas that can help leverage the unique set of attributes of NVM devices for delivering the features required by modern data-management applications. The common theme for these papers is that you cannot just run an existing DBMS on NVM and expect it to leverage its unique set of properties. The only way to achieve that is to come up with novel architectures, protocols, and algorithms that are tailor-made for NVM.
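To make the read/write asymmetry described above concrete, here is a minimal Python sketch of a linear cost model. Everything in it is an assumption for illustration: the function name `access_cost`, the cost ratios, and the operation counts are invented values, not measurements of any NVM device or figures from the selected papers.

```python
# Hypothetical linear cost model for illustration only: one read costs
# 1 unit and one write costs `write_read_ratio` units. The ratios and
# operation counts below are made-up values, not device measurements.


def access_cost(reads: int, writes: int, write_read_ratio: float) -> float:
    """Return the total cost of an access pattern under asymmetric write cost."""
    return reads + write_read_ratio * writes


# Two invented plans that are assumed to compute the same result.
BALANCED_PLAN = {"reads": 1_000, "writes": 1_000}      # as many writes as reads
WRITE_LIMITED_PLAN = {"reads": 3_000, "writes": 250}   # extra reads, fewer writes

# 1.0 models symmetric costs; 4.0 is an assumed ratio where writes cost more.
for ratio in (1.0, 4.0):
    balanced = access_cost(write_read_ratio=ratio, **BALANCED_PLAN)
    write_limited = access_cost(write_read_ratio=ratio, **WRITE_LIMITED_PLAN)
    print(f"ratio={ratio}: balanced={balanced}, write-limited={write_limited}")
```

With symmetric costs the balanced plan is cheaper, but once a write is assumed to cost several times a read, the plan that spends extra reads to avoid writes comes out ahead. That trade-off is what the write-limited algorithms in the third paper of this selection exploit.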
ARIES Redesigned for NVM
Coburn, J., et al.
From ARIES to MARS: Transaction support for next-generation, solid-state drives.
Proceedings of the 24th ACM Symposium on Operating Systems Principles, 2013, 197–212.
http://queue.acm.org/rfp/vol14iss3.html
ARIES is considered the standard for recovery protocols in a transactional DBMS. It has two key goals: first, it provides an interface for supporting scalable ACID (atomicity, consistency, isolation, durability) transactions; second, it maximizes performance on disk-based storage systems. In this paper, the authors focus on how ARIES should be adapted for NVM-based storage.
Since random writes to the disk whenever a transaction updates the database obviously decrease performance, ARIES requires that the DBMS first record a log entry in the write-ahead log (a sequential write) before updating the database itself (a random write). It adopts a no-force policy wherein the updates are written to the database lazily after the transaction commits. Such a policy assumes that sequential writes to non-volatile storage are significantly faster than random writes. The authors, however, demonstrate that this is no longer the case with NVM.
The MARS protocol proposes a new hardware-assisted logging primitive that combines multiple writes to arbitrary storage locations into a single atomic operation. By leveraging this primitive, MARS eliminates the need for an ARIES-style undo log and relies on the NVM device to apply the redo log at commit time. We are particularly fond of this paper because it helps in better appreciating the intricacies involved in designing the recovery protocol in a DBMS for guarding against data loss.
Near-Instantaneous Recovery Protocols
Arulraj, J., Pavlo, A., Dulloor, S.R.
Let’s talk about storage and recovery methods for non-volatile memory database systems.
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2015, 707–722.
http://queue.acm.org/rfp/vol14iss3.html
This paper takes a different approach to performing durable atomic updates on an NVM-resident database than the previous paper. In ARIES, during recovery the DBMS first loads the most recent snapshot. It then replays the redo log to ensure that all the updates made by committed transactions are recovered. Finally, it uses the undo log to ensure that the changes made by incomplete transactions are not present in the database. This recovery process can take a lot of time, depending on the load on the system and the frequency with which snapshots are taken. Thus, this paper explores whether it is possible to leverage NVM’s properties to speed up recovery from system failures.
The authors present a software-based primitive called non-volatile pointer. When a pointer points to data residing on NVM, and is itself stored on NVM, then it will remain valid even after the system recovers from a power failure. Using this primitive, the authors design a library of non-volatile data structures that support durable atomic updates. They propose a recovery protocol that, in contrast to MARS, obviates the need for an ARIES-style redo log. This enables the system to skip replaying the redo log, and thereby allows the NVM DBMS to recover the database almost instantaneously.
Both papers propose recovery protocols that target an NVM-only storage hierarchy. The generalization of these protocols to a multitier storage hierarchy with both DRAM and NVM is a hot topic in research today.
Trading Expensive Writes for Cheaper Reads
Viglas, S.D.
Write-limited sorts and joins for persistent memory.
Proceedings of the VLDB Endowment 7, 5 (2014), 413–424.
http://www.vldb.org/pvldb/vol7/p413-viglas.pdf
The third paper focuses on the higher write costs and limited write-endurance problems of NVM. For several decades algorithms have been designed for the random-access machine model where reads and writes have the same cost. The emergence of NVM devices, where writes are more expensive than reads, opens up the design space for new write-limiting algorithms. It will be fascinating to see researchers derive new bounds on the number of writes that different kinds of query-processing algorithms must perform.
Viglas presents a collection of novel query-processing algorithms that minimize I/O by trading off expensive NVM writes for cheaper reads. One such algorithm is the segment sort. The basic idea is to use a combination of two sorting algorithms, external merge sort and selection sort, that splits the input into two segments, each processed using a different algorithm. The selection-sort algorithm uses extra reads, and writes out each element in the input only once at its final location. By using a combination of these two algorithms, the DBMS can optimize both the performance and the number of NVM writes.
Game Changer for DBMS Architectures
NVM is a definite game changer for future DBMS architectures. It will require system designers to rethink many of the core algorithms and techniques developed over the past 40 years. Using these new storage devices in the manner prescribed by these papers will allow DBMSs to achieve better performance than what is possible with today’s hardware for write-heavy database applications. This is because these techniques are designed to exploit the low-latency read/writes of NVM to enable a DBMS to store less redundant data and incur fewer writes. Furthermore, we contend that existing in-memory DBMSs are better positioned to use NVM when it is finally available. This is because these systems are already designed for byte-addressable access methods, whereas legacy disk-oriented DBMSs will require laborious and costly overhauls in order to use NVM correctly, as described in these papers. Word is bond.
Joy Arulraj is a Ph.D. candidate at Carnegie Mellon University. He is interested in the design and implementation of next-generation database management systems.
Andy Pavlo is an assistant professor of databaseology in the Department of Computer Science at Carnegie Mellon University, Pittsburgh, PA.
Copyright held by authors. Publication rights licensed to ACM. $15.00.
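The segment sort summarized in the Viglas discussion above can be illustrated with a minimal in-memory Python sketch. It is not the paper's algorithm. The function names, the `merge_fraction` parameter, and the read/write counters are assumptions made for illustration; plain lists stand in for persistent memory, and an ordinary merge sort stands in for external merge sort. One segment is sorted by merge sort, the other by a selection sort that rescans its input but writes each element once, and the two sorted segments are then merged.

```python
def merge(left, right, stats):
    """Merge two sorted lists, counting one write per element placed."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        stats["reads"] += 2  # each comparison reads one element from each run
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
        stats["writes"] += 1
    tail = left[i:] + right[j:]  # at most one of the two slices is non-empty
    stats["reads"] += len(tail)
    stats["writes"] += len(tail)
    merged.extend(tail)
    return merged


def merge_sort(items, stats):
    """Plain top-down merge sort; every merge level rewrites its elements."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid], stats),
                 merge_sort(items[mid:], stats), stats)


def selection_sort(items, stats):
    """Write-limited selection sort: rescans the input, writes each element once."""
    emitted = [False] * len(items)
    out = []
    for _ in range(len(items)):
        best = None
        for idx, value in enumerate(items):
            if emitted[idx]:
                continue
            stats["reads"] += 1
            if best is None or value < items[best]:
                best = idx
        emitted[best] = True
        out.append(items[best])
        stats["writes"] += 1
    return out


def segment_sort(items, merge_fraction):
    """Sort `items`, giving the first `merge_fraction` of them to merge sort
    and the rest to selection sort. Returns (sorted_list, stats)."""
    stats = {"reads": 0, "writes": 0}
    cut = int(len(items) * merge_fraction)
    sorted_a = merge_sort(items[:cut], stats)
    sorted_b = selection_sort(items[cut:], stats)
    return merge(sorted_a, sorted_b, stats), stats


if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
    for fraction in (0.0, 0.5, 1.0):
        result, counters = segment_sort(data, fraction)
        print(fraction, result, counters)
```

Lowering `merge_fraction` shifts work toward the selection sort, which lowers the write count at the price of a number of reads that grows quadratically with the segment size. Choosing the split is where a cost model for the target device would come in; this sketch leaves that choice to the caller.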
contributed articles
DOI:10.1145/2934664
Apache Spark: A Unified Engine for Big Data Processing
This open source computing framework unifies streaming, batch, and interactive big data workloads to unlock new applications.
BY MATEI ZAHARIA, REYNOLD S. XIN, PATRICK WENDELL, TATHAGATA DAS, MICHAEL ARMBRUST, ANKUR DAVE, XIANGRUI MENG, JOSH ROSEN, SHIVARAM VENKATARAMAN, MICHAEL J. FRANKLIN, ALI GHODSI, JOSEPH GONZALEZ, SCOTT SHENKER, AND ION STOICA
key insights
• A simple programming model can capture streaming, batch, and interactive workloads and enable new applications that combine them.
• Apache Spark applications range from finance to scientific data processing and combine libraries for SQL, machine learning, and graphs.
• In six years, Apache Spark has grown to 1,000 contributors and thousands of deployments.
The growth of data volumes in industry and research poses tremendous opportunities, as well as tremendous computational challenges. As data sizes have outpaced the capabilities of single machines, users have needed new systems to scale out computations to multiple nodes. As a result, there has been an explosion of new cluster programming models targeting diverse computing workloads [1, 4, 7, 10]. At first, these models were relatively specialized, with new models developed for new workloads; for example, MapReduce [4] supported batch processing, but Google also developed Dremel [13] for interactive SQL queries and Pregel [11] for iterative graph algorithms. In the open source Apache Hadoop stack, systems like Storm [1] and Impala [9] are also specialized. Even in the relational database world, the trend has been to move away from “one-size-fits-all” systems [18]. Unfortunately, most big data applications need to combine many different processing types. The very nature of “big data” is that it is diverse and messy; a typical pipeline will need MapReduce-like code for data loading, SQL-like queries, and iterative machine learning. Specialized engines can thus create both complexity and inefficiency; users must stitch together disparate systems, and some applications simply cannot be expressed efficiently in any engine.
In 2009, our group at the University of California, Berkeley, started the Apache Spark project to design a unified engine for distributed data processing. Spark has a programming model similar to MapReduce but extends it with a data-sharing abstraction called “Resilient Distributed Datasets,” or RDDs [25]. Using this simple extension, Spark can capture a wide range of processing workloads that previously needed separate engines, including SQL, streaming, machine learning, and graph processing [2, 26, 6] (see Figure 1). These implementations use the same optimizations as specialized engines (such as column-oriented processing and incremental updates) and achieve similar performance but run as libraries over a common engine, making them easy and efficient to compose. Rather than being specific
to these workloads, we claim this result is more general; when augmented with data sharing, MapReduce can emulate any distributed computation, so it should also be possible to run many other types of workloads [24].
Spark’s generality has several important benefits. First, applications are easier to develop because they use a unified API. Second, it is more efficient to combine processing tasks; whereas prior systems required writing the data to storage to pass it to another engine, Spark can run diverse functions over the same data, often in memory. Finally, Spark enables new applications (such as interactive queries on a graph and streaming machine learning) that were not possible with previous systems. One powerful analogy for the value of unification is to compare smartphones to the separate portable devices that existed before them (such as cameras, cellphones, and GPS gadgets). In unifying the functions of these devices, smartphones enabled new applications that combine their functions (such as video messaging and Waze) that would not have been possible on any one device.
Since its release in 2010, Spark has grown to be the most active open source project for big data processing, with more than 1,000 contributors. The project is in use in more than 1,000 organizations, ranging from technology companies to banking, retail, biotechnology, and astronomy. The largest publicly announced deployment has
more than 8,000 nodes [22]. As Spark has grown, we have sought to keep building on its strength as a unified engine. We (and others) have continued to build an integrated standard library over Spark, with functions from data import to machine learning. Users find this ability powerful; in surveys, we find the majority of users combine several of Spark’s libraries in their applications.
As parallel data processing becomes common, the composability of processing functions will be one of the most important concerns for both usability and performance. Much of data analysis is exploratory, with users wishing to combine library functions quickly into a working pipeline. However, for “big data” in particular, copying data between different systems is anathema to performance. Users thus need abstractions that are general and composable.
In this article, we introduce the Spark programming model and explain why it is highly general. We also discuss how we leveraged this generality to build other processing tasks over it. Finally, we summarize Spark’s most common applications and describe ongoing development work in the project.
Figure 1. Apache Spark software stack, with specialized processing libraries implemented over the core engine. (Diagram: Streaming, SQL, ML, and Graph libraries.)
Programming Model
The key programming abstraction in Spark is RDDs, which are fault-tolerant collections of objects partitioned across a cluster that can be manipulated in parallel. Users create RDDs by applying operations called “transformations” (such as map, filter, and groupBy) to their data.
Spark exposes RDDs through a functional programming API in Scala, Java, Python, and R, where users can simply pass local functions to run on the cluster. For example, the following Scala code creates an RDD representing the error messages in a log file, by searching for lines that start with ERROR, and then prints the total number of errors:
    val lines = spark.textFile("hdfs://...")
    val errors = lines.filter(s => s.startsWith("ERROR"))
    println("Total errors: " + errors.count())
The first line defines an RDD backed by a file in the Hadoop Distributed File System (HDFS) as a collection of lines of text. The second line calls the filter transformation to derive a new RDD from lines. Its argument is a Scala function literal or closure (footnote a). Finally, the last line calls count, another type of RDD operation called an “action” that returns a result to the program (here, the number of elements in the RDD) instead of defining a new RDD.
a. The closures passed to Spark can call into any existing Scala or Python library or even reference variables in the outer program. Spark sends read-only copies of these variables to worker nodes.
Spark evaluates RDDs lazily, allowing it to find an efficient plan for the user’s computation. In particular, transformations return a new RDD object representing the result of a computation but do not immediately compute it. When an action is called, Spark looks at the whole graph of transformations used to create an execution plan. For example, if there were multiple filter or map operations in a row, Spark can fuse them into one pass, or, if it knows that data is partitioned, it can avoid moving it over the network for groupBy [5]. Users can thus build up programs modularly without losing performance.
Finally, RDDs provide explicit support for data sharing among computations. By default, RDDs are “ephemeral” in that they get recomputed each time they are used in an action (such as count). However, users can also persist selected RDDs in memory for rapid reuse. (If the data does not fit in memory, Spark will also spill it to disk.) For example, a user searching through a large set of log files in HDFS to debug a problem might load just the error messages into memory across the cluster by calling
    errors.persist()
After this, the user can run a variety of queries on the in-memory data:
    // Count errors mentioning MySQL
    errors.filter(s => s.contains("MySQL"))
      .count()
    // Fetch back the time fields of errors that
    // mention PHP, assuming time is field #3:
    errors.filter(s => s.contains("PHP"))
      .map(line => line.split('\t')(3))
      .collect()
This data sharing is the main difference between Spark and previous computing models like MapReduce; otherwise, the individual operations (such as map and groupBy) are similar. Data sharing provides large speedups, often as much as 100×, for interactive queries and iterative algorithms [23]. It is also the key to Spark’s generality, as we discuss later.
Fault tolerance. Apart from providing data sharing and a variety of parallel operations, RDDs also automatically recover from failures. Traditionally, distributed computing systems have provided fault tolerance through data replication or checkpointing. Spark uses a different approach called “lineage” [25]. Each RDD tracks the graph of transformations that was used to build it and reruns these operations on base data to reconstruct any lost partitions. For example, Figure 2 shows the RDDs in our previous query, where we obtain the time fields of errors mentioning PHP by applying two filters and a map. If any partition of an RDD is lost (for example, if a node holding an in-memory partition of errors fails), Spark will rebuild it by applying the filter on the corresponding block of the HDFS file. For “shuffle” operations that send data from all nodes to all other nodes (such as reduceByKey), senders persist their output data locally in case a receiver fails.
Figure 2. Lineage graph for the third query in our example; boxes represent RDDs, and arrows represent transformations.
    lines -> filter(line.startsWith("ERROR")) -> errors -> filter(line.contains("PHP")) -> PHP errors -> map(line.split('\t')(3)) -> time fields
Lineage-based recovery is significantly more efficient than replication in data-intensive workloads. It saves both time, because writing data over the network is much slower than writing it to RAM, and storage space in memory. Recovery is typically much faster than simply rerunning the program, because a failed node usually contains multiple RDD partitions, and these partitions can be rebuilt in parallel on other nodes.
A longer example. As a longer example, Figure 3 shows an implementation of logistic regression in Spark. It uses batch gradient descent, a simple iterative algorithm that computes a gradient function over the data repeatedly as a parallel sum. Spark makes it easy to load the data into RAM once and run multiple sums. As a result, it runs faster than traditional MapReduce. For example, in a 100GB job (see Figure 4), MapReduce takes 110 seconds per iteration because each iteration loads the data from disk, while Spark takes only one second per iteration after the first load.
Figure 3. A Scala implementation of logistic regression via batch gradient descent in Spark.
    // Load data into an RDD
    val points = sc.textFile(...).map(readPoint).persist()
    // Start with a random parameter vector
    var w = DenseVector.random(D)
    // On each iteration, update param vector with a sum
    for (i <- 1 to ITERATIONS) {
      val gradient = points.map { p =>
        p.x * (1/(1+exp(-p.y*(w.dot(p.x))))-1) * p.y
      }.reduce((a, b) => a+b)
      w -= gradient
    }
Figure 4. Performance of logistic regression in Hadoop MapReduce vs. Spark for 100GB of data on 50 m2.4xlarge EC2 nodes. (Chart: Running Time (s) for Hadoop and Spark.)
RDDs usually store only temporary data within an application, though some applications (such as the Spark SQL JDBC server) also share RDDs across multiple users [2]. Spark’s design as a storage-system-agnostic engine makes it easy for users to run computations against existing data and join diverse data sources.
Higher-Level Libraries
The RDD programming model provides only distributed collections of objects and functions to run on them. Using RDDs, however, we have built a variety of higher-level libraries on Spark, targeting many of the use cases of specialized computing engines. The key idea is that if we control the data structures stored inside RDDs, the partitioning of data across nodes, and the functions run on them, we can implement many of the execution techniques in other engines. Indeed, as we show in this section, these libraries often achieve state-of-the-art performance on each task while offering significant benefits when users combine them. We now discuss the four main libraries included with Apache Spark.
SQL and DataFrames. One of the most common data processing paradigms is relational queries. Spark SQL [2] and its predecessor, Shark [23], implement such queries on Spark, using techniques similar to analytical databases. For example, these systems support columnar storage, cost-based optimization, and code generation for query execution. The main idea behind these systems is to use the same data layout as analytical databases (compressed columnar storage) inside RDDs. In Spark SQL, each record in an RDD holds a series of rows stored in binary format, and the system generates
code to run directly against this layout.
Beyond running SQL queries, we have used the Spark SQL engine to provide a higher-level abstraction for basic data transformations called DataFrames [2], which are RDDs of records with a known schema. DataFrames are a common abstraction for tabular data in R and Python, with programmatic methods for filtering, computing new columns, and aggregation. In Spark, these operations map down to the Spark SQL engine and receive all its optimizations. We discuss DataFrames more later.
One technique not yet implemented in Spark SQL is indexing, though other libraries over Spark (such as IndexedRDDs [3]) do use it.
Spark Streaming. Spark Streaming [26] implements incremental stream processing using a model called “discretized streams.” To implement streaming over Spark, we split the input data into small batches (such as every 200 milliseconds) that we regularly combine with state stored inside RDDs to produce new results. Running streaming computations this way has several benefits over traditional distributed streaming systems. For example, fault recovery is less expensive due to using lineage, and it is possible to combine streaming with batch and interactive queries.
GraphX. GraphX [6] provides a graph computation interface similar to Pregel and GraphLab [10, 11], implementing the same placement optimizations as these systems (such as vertex partitioning schemes) through its choice of partitioning function for the RDDs it builds.
MLlib. MLlib [14], Spark’s machine learning library, implements more than 50 common algorithms for distributed model training. For example, it includes the common distributed algorithms of decision trees (PLANET), Latent Dirichlet Allocation, and Alternating Least Squares matrix factorization.
Combining processing tasks. Spark’s libraries all operate on RDDs as the data abstraction, making them easy to combine in applications. For example, Figure 5 shows a program that reads some historical Twitter data using Spark SQL, trains a K-means clustering model using MLlib, and then applies the model to a new stream of tweets. The data tasks returned by each library (here the historic tweet RDD and the K-means model) are easily passed to other libraries. Apart from compatibility at the API level, composition in Spark is also efficient at the execution level, because Spark can optimize across processing libraries. For example, if one library runs a map function and the next library runs a map on its result, Spark will fuse these operations into a single map. Likewise, Spark’s fault recovery works seamlessly across these libraries, recomputing lost data no matter which libraries produced it.
Performance. Given that these libraries run over the same engine, do they lose performance? We found that by implementing the optimizations we just outlined within RDDs, we can often match the performance of specialized engines. For example, Figure 6 compares Spark’s performance on three simple tasks (a SQL query, streaming word count, and Alternating Least Squares matrix factorization) versus other engines. While the results vary across workloads, Spark is generally comparable with specialized systems like Storm, GraphLab, and Impala (footnote b). For stream processing, although we show results from a distributed implementation on Storm, the per-node throughput is also comparable to commercial streaming engines like Oracle CEP [26].
b. One area in which other designs have outperformed Spark is certain graph computations [12, 16]. However, these results are for algorithms with low ratios of computation to communication (such as PageRank) where the latency from synchronized communication in Spark is significant. In applications with more computation (such as the ALS algorithm) distributing the application on Spark still helps.
Even in highly competitive benchmarks, we have achieved state-of-the-art performance using Apache Spark. In 2014, we entered the Daytona GraySort benchmark (http://sortbenchmark.org/) involving sorting 100TB of data on disk, and tied for a new record with a specialized system built only for sorting on a similar number of machines. As in the other examples, this was possible because we could implement both the communication and CPU optimizations necessary for large-scale sorting inside the RDD model.
Applications
Apache Spark is used in a wide range of applications. Our surveys of Spark
users have identified more than 1,000 companies using Spark, in areas from Web services to biotechnology to finance. In academia, we have also seen applications in several scientific domains. Across these workloads, we find users take advantage of Spark’s generality and often combine multiple of its libraries. Here, we cover a few top use cases. Presentations on many use cases are also available on the Spark Summit conference website (http://www.spark-summit.org).
Batch processing. Spark’s most common applications are for batch processing on large datasets, including Extract-Transform-Load workloads to convert data from a raw format (such as log files) to a more structured format and offline training of machine learning models. Published examples of these workloads include page personalization and recommendation at Yahoo!; managing a data lake at Goldman Sachs; graph mining at Alibaba; financial Value at Risk calculation; and text mining of customer feedback at
memory, many of the applications in this category run only on disk. In such cases, Spark can still improve performance over MapReduce due to its sup-
Figure 5. Example combining the SQL, machine learning, and streaming libraries in Spark.
    // Load historical data as an RDD using Spark SQL
    val trainingData = sql(
      "SELECT location, language FROM old_tweets")
    // Train a K-means model using MLlib
    val model = new KMeans()
      .setFeaturesCol("location")
      .setPredictionCol("language")
      .fit(trainingData)
    // Apply the model to new tweets in a stream
    TwitterUtils.createStream(...)
      .map(tweet => model.predict(tweet.location))
Figure 6. Comparing Spark’s performance with several widely used specialized systems for SQL, streaming, and machine learning. Data is from Zaharia [24] (SQL query and streaming word count) and Sparks et al. [17] (alternating least squares matrix factorization). (Three bar charts with axes Response Time (sec), Throughput (records/s), and Response Time (hours); bars labeled Impala (disk), Redshift, Spark (disk), Spark (mem), MATLAB, Mahout, GraphLab, and Spark.)
making applications. Published use cases for Spark Streaming include network security monitoring at Cisco, prescriptive analytics at Samsung SDS, and log mining at Netflix. Many of these applications also combine streaming with batch and interactive queries. For example, video company Conviva uses Spark to continuously maintain a model of content distribution server performance, querying it automatically when it moves clients
across servers, in an application that requires substantial parallel work for both model maintenance and queries.

Scientific applications. Spark has also been used in several scientific domains, including large-scale spam detection,19 image processing,27 and genomic data processing.15 One example that combines batch, interactive, and stream processing is the Thunder platform for neuroscience at Howard Hughes Medical Institute, Janelia Farm.5 It is designed to process brain-imaging data from experiments in real time, scaling up to 1TB/hour of whole-brain imaging data from organisms (such as zebrafish and mice). Using Thunder, researchers can apply machine learning algorithms (such as clustering and Principal Component Analysis) to identify neurons involved in specific behaviors. The same code can be run in batch jobs on data from previous runs or in interactive queries during live experiments. Figure 8 shows an example image generated using Spark.

Figure 8. Visualization of neurons in the zebrafish brain created with Spark, where each neuron is colored based on the direction of movement that correlates with its activity. Source: Jeremy Freeman and Misha Ahrens of Janelia Research Campus.

Spark components used. Because Spark is a unified data-processing engine, the natural question is how many of its libraries organizations actually use. Our surveys of Spark users have shown that organizations do, indeed, use multiple components, with over 60% of organizations using at least three of Spark’s APIs. Figure 9 outlines the usage of each component in a July 2015 Spark survey by Databricks that reached 1,400 respondents. We list the Spark Core API (just RDDs) as one component and the higher-level libraries as others. We see that many components are widely used, with Spark Core and SQL as the most popular. Streaming is used in 46% of organizations and machine learning in 54%. While not shown directly in Figure 9, most organizations use multiple components; 88% use at least two of them, 60% use at least three (such as Spark Core and two libraries), and 27% use at least four components.

Deployment environments. We also see growing diversity in where Apache Spark applications run and what data sources they connect to. While the first Spark deployments were generally in Hadoop environments, only 40% of deployments in our July 2015 Spark survey were on the Hadoop YARN cluster manager. In addition, 52% of respondents ran Spark on a public cloud.

Why Is the Spark Model General?
While Apache Spark demonstrates that a unified cluster programming model is both feasible and useful, it would be helpful to understand what makes cluster programming models general, along with Spark’s limitations. Here, we summarize a discussion on the generality of RDDs from Zaharia.24 We study RDDs from two perspectives. First, from an expressiveness point of view, we argue that RDDs can emulate any distributed computation, and will do so efficiently in many cases unless the computation is sensitive to network latency. Second, from a systems point of view, we show that RDDs give applications control over the most common bottleneck resources in clusters (network and storage I/O) and thus make it possible to express the same optimizations for these resources that characterize specialized systems.
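To make the expressiveness claim concrete, here is a minimal single-process sketch, in plain Python rather than Spark, of emulating a message-passing computation with alternating rounds of local computation and all-to-all exchange. The task (nodes on a ring propagating the largest value seen so far) and the names map_step, reduce_step, and emulate are hypothetical, chosen only for illustration; they are not part of Spark or MapReduce.

```python
from collections import defaultdict


def map_step(node_id, state, inbox, num_nodes):
    """Local computation for one node in one timestep.

    Hypothetical task: keep the largest value seen so far and forward it
    to the right-hand neighbour on a ring of num_nodes nodes.
    """
    new_state = max([state] + inbox)
    outgoing = [((node_id + 1) % num_nodes, new_state)]
    return new_state, outgoing


def reduce_step(all_messages):
    """All-to-all exchange: group every message by its destination node."""
    inboxes = defaultdict(list)
    for dest, payload in all_messages:
        inboxes[dest].append(payload)
    return inboxes


def emulate(initial_states, num_rounds):
    """Run the computation as a fixed number of map/reduce rounds."""
    states = list(initial_states)
    num_nodes = len(states)
    inboxes = defaultdict(list)
    for _ in range(num_rounds):
        messages = []
        for node_id in range(num_nodes):
            states[node_id], outgoing = map_step(
                node_id, states[node_id], inboxes[node_id], num_nodes)
            messages.extend(outgoing)
        # Messages are batched and delivered only at the end of the round.
        inboxes = reduce_step(messages)
    return states


final_states = emulate([3, 9, 4, 1], num_rounds=4)
print(final_states)
```

Each round costs one full synchronization, which is why this style of emulation suits computations that can batch a lot of work per round and suits latency-sensitive ones less well.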
Figure 9. Percent of organizations using each Spark component, from the Databricks 2015 Spark survey; https://databricks.com/blog/2015/09/24/.
[Bar chart omitted. Components: Core, SQL, Streaming, MLlib, GraphX. Axis: Fraction of Users, 0% to 100%.]

Expressiveness perspective. To study the expressiveness of RDDs, we start by comparing RDDs to the MapReduce model, which RDDs build on. The first question is what computations can MapReduce itself express? Although there have been numerous discussions about the limitations of MapReduce, the surprising answer here is that MapReduce can emulate any distributed computation.

To see this, note that any distributed computation consists of nodes that perform local computation and occasionally exchange messages. MapReduce offers the map operation, which allows local computation, and reduce, which allows all-to-all communication. Any distributed computation can thus be emulated, perhaps somewhat inefficiently, by breaking down its work into timesteps, running maps to perform the local computation in each timestep, and batching and exchanging messages at the end of each step using a reduce. A series of MapReduce steps will capture the whole result, as in Figure 10. Recent theoretical work has formalized this type of emulation by showing that MapReduce can simulate many computations in the Parallel Random Access Machine model.8 Repeated MapReduce is also equivalent to the Bulk Synchronous Parallel model.20

Figure 10. Emulating an arbitrary distributed computation with MapReduce.
(a) MapReduce provides primitives for local computation and all-to-all communication.
(b) By chaining these steps together, we can emulate any distributed computation. The main costs for this emulation are the latency of the rounds and the overhead of passing state across steps.

While this line of work shows that MapReduce can emulate arbitrary computations, two problems can make the “constant factor” behind this emulation high. First, MapReduce is inefficient at sharing data across timesteps because it relies on replicated external storage systems for this purpose. Our emulated system may thus become slower due to writing out its state after each step. Second, the latency of the MapReduce steps determines how well our emulation will match a real network, and most MapReduce implementations were designed for batch environments with minutes to hours of latency.

RDDs and Spark address both of these limitations. On the data-sharing front, RDDs make data sharing fast by avoiding replication of intermediate data and can closely emulate the in-memory “data sharing” across time that would happen in a system composed of long-running processes. On the latency front, Spark can run MapReduce-like steps on large clusters with 100ms latency; nothing intrinsic to the MapReduce model prevents this. While some applications need finer-grain timesteps and communication, this 100ms latency is enough to implement many data-intensive workloads, where the amount of computation that can be batched before a communication step is high.

In summary, RDDs build on MapReduce’s ability to emulate any distributed computation but make this emulation significantly more efficient. Their main limitation is increased [...]

[...] bottleneck resources in cluster computations? And can RDDs use them efficiently? Although cluster applications are diverse, they are all bound by the same properties of the underlying hardware. Current datacenters have a steep storage hierarchy that limits most applications in similar ways. For example, a typical Hadoop cluster might have the following characteristics:
Local storage. Each node has local memory with approximately 50GB/s of bandwidth, as well as 10 to 20 local disks, for approximately 1GB/s to 2GB/s of disk bandwidth;
Links. Each node has a 10Gbps (1.3GB/s) link, or approximately 40× less than its memory bandwidth and 2× less than its aggregate disk bandwidth; and
Racks. Nodes are organized into racks of 20 to 40 machines, with 40Gbps–80Gbps bandwidth out of each rack, or 2×–5× lower than the in-rack network performance.

Given these properties, the most important performance concern in many applications is the placement of data and computation in the network. Fortunately, RDDs provide the facilities to control this placement; the interface lets applications place computations near input data (through an API for “preferred locations” for input sources25), and RDDs provide control over data partitioning and colocation (such as specifying that data be hashed by a given key). Libraries (such as GraphX) can thus implement the same placement strategies used in specialized systems.6

Beyond network and I/O bandwidth, the most common bottleneck tends to be CPU time, especially if data is in memory. In this case, however, Spark can run the same algorithms and libraries used in specialized systems on each node. For example, it uses columnar storage and processing in Spark SQL, native BLAS libraries in MLlib, and so on. As we discussed earlier, the only area where RDDs clearly add a cost is network latency, due to the synchronization at parallel communication steps.

One final observation from a systems perspective is that Spark may incur extra costs over some of today’s specialized systems due to fault tolerance. For example, in Spark, the map tasks in each shuffle operation save their output to local files on the machine where they ran, so reduce tasks can refetch it later. In addition, Spark implements a barrier at shuffle stages, so the reduce tasks do not start until all the maps have finished. This avoids some of the complexity that would be needed for fault recovery if one “pushed” records directly from maps to reduces in a pipelined fashion. Although removing some of these features would speed up the system, Spark often performs competitively despite them. The main reason is an argument similar to our previous one: many applications are bound by an I/O operation (such as shuffling data across the network or reading it from disk) and beyond this operation, optimizations (such as pipelining) add only a modest benefit. We have kept fault tolerance “on” by default in Spark to make it easy to reason about applications.

Ongoing Work
Apache Spark remains a rapidly evolving project, with contributions from both industry and research. The codebase size has grown by a factor of six since June 2013, with most of the activity in new libraries. More than 200 third-party packages are also available.c In the research community, multiple projects at Berkeley, MIT, and Stanford build on Spark, and many new libraries (such as GraphX and Spark Streaming) came from research groups. Here, we sketch four of the major efforts.

c One package index is available at https://spark-packages.org/

DataFrames and more declarative APIs. The core Spark API was based on functional programming over distributed collections that contain arbitrary types of Scala, Java, or Python objects. While this approach was highly expressive, it also made programs more difficult to automatically analyze and optimize. The Scala/Java/Python objects stored in RDDs could have complex structure, and the functions run over them could include arbitrary code. In many applications, developers could get suboptimal performance if they did not use the right operators; for example, the system on its own could not push filter functions ahead of maps.

To address this problem, we extended Spark in 2015 to add a more declarative API called DataFrames2 based on the relational algebra. Data frames are a common API for tabular data in Python and R. A data frame is a set of records with a known schema, essentially equivalent to a database table, that supports operations like filtering and aggregation using a restricted “expression” API. Unlike working in the SQL language, however, data frame operations are invoked as function calls in a more general programming language (such as Python and R), allowing developers to easily structure their program using abstractions in the host language (such as functions and classes). Figure 11 and Figure 12 show examples of the API.

Figure 11. Example of Spark’s DataFrame API in Python. Unlike Spark’s core API, DataFrames have a schema with named columns (such as age and city) and take expressions in a limited language (such as age > 20) instead of arbitrary Python functions.

    users.where(users["age"] > 20)
         .groupBy("city")
         .agg(avg("age"), max("income"))

Figure 12. Working with DataFrames in Spark’s R API. We load a distributed DataFrame using Spark’s JSON data source, then filter and aggregate using standard R column expressions.

    people <- read.df(context, "./people.json", "json")

    # Filter people by age
    adults = filter(people, people$age > 20)

Spark’s DataFrames offer a similar API to single-node packages but automatically parallelize and optimize the computation using Spark SQL’s query planner. User code thus receives optimizations (such as predicate pushdown, operator reordering, and join algorithm selection) that were not available under Spark’s functional API. To our knowledge, Spark DataFrames are the first library to perform such relational optimizations under a data frame API.d

d One reason optimization is possible is that Spark’s DataFrame API uses lazy evaluation where the content of a DataFrame is not computed until the user asks to write it out. The data frame APIs in R and Python are eager, preventing optimizations like operator reordering.

While DataFrames are still new, they have quickly become a popular API. In our July 2015 survey, 60% of respondents reported using them. Because of the success of DataFrames, we have also developed a type-safe interface over them called Datasetse that lets Java and Scala programmers view DataFrames as statically typed collections of Java objects, similar to the RDD API, and still receive relational optimizations. We expect these APIs to gradually become the standard abstraction for passing data between Spark libraries.

e https://databricks.com/blog/2016/01/04/introducing-spark-datasets.html

Performance optimizations. Much of the recent work in Spark has been on performance. In 2014, the Databricks team spent considerable effort to optimize Spark’s network and I/O primitives, allowing Spark to jointly set a new record for the Daytona GraySort challenge.f Spark sorted 100TB of data 3× faster than the previous record holder based on Hadoop MapReduce using 10× fewer machines. This benchmark was not executed in memory but rather on (solid-state) disks. In 2015, one major effort was Project Tungsten,g which removes Java Virtual Machine overhead from many of Spark’s code paths by using code generation and non-garbage-collected memory. One benefit of doing these optimizations in a general engine is that they simultaneously affect all of Spark’s libraries; machine learning, streaming, and SQL all became faster from each change.

f http://sortbenchmark.org/ApacheSpark2014.pdf
g https://databricks.com/blog/2015/04/28/

R language support. The SparkR project21 was merged into Spark in 2015 to provide a programming interface in R. The R interface is based on DataFrames and uses almost identical syntax to R’s built-in data frames. Other Spark libraries (such as MLlib) are also easy to call from R, because they accept DataFrames as input.

Research libraries. Apache Spark continues to be used to build higher-level data processing libraries. Recent projects include Thunder for neuroscience,5 ADAM for genomics,15 and Kira for image processing in astronomy.27 Other research libraries (such as GraphX) have been merged into the main codebase.

Conclusion
Scalable data processing will be essential for the next generation of computer applications but typically involves a complex sequence of processing steps with different computing systems. To simplify this task, the Spark project introduced a unified programming model and engine for big data applications. Our experience shows such a model can efficiently support today’s workloads and brings substantial benefits to users. We hope Apache Spark highlights the importance of composability in programming libraries for big data and encourages development of more easily interoperable libraries.

All Apache Spark libraries described in this article are open source at http://spark.apache.org/. Databricks has also made videos of all Spark Summit conference talks available for free at https://spark-summit.org/.

Acknowledgments
Apache Spark is the work of hundreds of open source contributors who are credited in the release notes at https://spark.apache.org. Berkeley’s research on Spark was supported in part by National Science Foundation CISE Expeditions Award CCF-1139158, Lawrence Berkeley National Laboratory Award 7076018, and DARPA XData Award FA8750-12-2-0331, and gifts from Amazon Web Services, Google, SAP, IBM, The Thomas and Stacey Siebel Foundation, Adobe, Apple, Arimo, Blue Goji, Bosch, C3Energy, Cisco, Cray, Cloudera, EMC2, Ericsson, Facebook, Guavus, Huawei, Informatica, Intel, Microsoft, NetApp, Pivotal, Samsung, Schlumberger, Splunk, Virdata, and VMware.

References
1. Apache Storm project; http://storm.apache.org
2. Armbrust, M. et al. Spark SQL: Relational data processing in Spark. In Proceedings of the ACM SIGMOD/PODS Conference (Melbourne, Australia, May 31–June 4). ACM Press, New York, 2015.
3. Dave, A. Indexedrdd project; http://github.com/amplab/spark-indexedrdd
4. Dean, J. and Ghemawat, S. MapReduce: Simplified data processing on large clusters. In Proceedings of the Sixth OSDI Symposium on Operating Systems Design and Implementation (San Francisco, CA, Dec. 6–8). USENIX Association, Berkeley, CA, 2004.
5. Freeman, J., Vladimirov, N., Kawashima, T., Mu, Y., Sofroniew, N.J., Bennett, D.V., Rosen, J., Yang, C.-T., Looger, L.L., and Ahrens, M.B. Mapping brain activity at scale with cluster computing. Nature Methods 11, 9 (Sept. 2014), 941–950.
6. Gonzalez, J.E. et al. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the 11th OSDI Symposium on Operating Systems Design and Implementation (Broomfield, CO, Oct. 6–8). USENIX Association, Berkeley, CA, 2014.
7. Isard, M. et al. Dryad: Distributed data-parallel programs from sequential building blocks. In Proceedings of the EuroSys Conference (Lisbon, Portugal, Mar. 21–23). ACM Press, New York, 2007.
8. Karloff, H., Suri, S., and Vassilvitskii, S. A model of computation for MapReduce. In Proceedings of the ACM-SIAM SODA Symposium on Discrete Algorithms (Austin, TX, Jan. 17–19). ACM Press, New York, 2010.
9. Kornacker, M. et al. Impala: A modern, open-source SQL engine for Hadoop. In Proceedings of the Seventh Biennial CIDR Conference on Innovative Data Systems Research (Asilomar, CA, Jan. 4–7, 2015).
10. Low, Y. et al. Distributed GraphLab: A framework for machine learning and data mining in the cloud. In Proceedings of the 38th International VLDB Conference on Very Large Databases (Istanbul, Turkey, Aug. 27–31, 2012).
11. Malewicz, G. et al. Pregel: A system for large-scale graph processing. In Proceedings of the ACM SIGMOD/PODS Conference (Indianapolis, IN, June 6–11). ACM Press, New York, 2010.
12. McSherry, F., Isard, M., and Murray, D.G. Scalability! But at what COST? In Proceedings of the 15th HotOS Workshop on Hot Topics in Operating Systems (Kartause Ittingen, Switzerland, May 18–20). USENIX Association, Berkeley, CA, 2015.
13. Melnik, S. et al. Dremel: Interactive analysis of Web-scale datasets. Proceedings of the VLDB Endowment 3 (Sept. 2010), 330–339.
14. Meng, X., Bradley, J.K., Yavuz, B., Sparks, E.R., Venkataraman, S., Liu, D., Freeman, J., Tsai, D.B., Amde, M., Owen, S., Xin, D., Xin, R., Franklin, M.J., Zadeh, R., Zaharia, M., and Talwalkar, A. MLlib: Machine learning in Apache Spark. Journal of Machine Learning Research 17, 34 (2016), 1–7.
15. Nothaft, F.A., Massie, M., Danford, T., Zhang, Z., Laserson, U., Yeksigian, C., Kottalam, J., Ahuja, A., Hammerbacher, J., Linderman, M., Franklin, M.J., Joseph, A.D., and Patterson, D.A. Rethinking data-intensive science using scalable analytics systems. In Proceedings of the SIGMOD/PODS Conference (Melbourne, Australia, May 31–June 4). ACM Press, New York, 2015.
16. Shun, J. and Blelloch, G.E. Ligra: A lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN PPoPP Symposium on Principles and Practice of Parallel Programming (Shenzhen, China, Feb. 23–27). ACM Press, New York, 2013.
17. Sparks, E.R., Talwalkar, A., Smith, V., Kottalam, J., Pan, X., Gonzalez, J.E., Franklin, M.J., Jordan, M.I., and Kraska, T. MLI: An API for distributed machine learning. In Proceedings of the IEEE ICDM International Conference on Data Mining (Dallas, TX, Dec. 7–10). IEEE Press, 2013.
18. Stonebraker, M. and Cetintemel, U. ‘One size fits all’: An idea whose time has come and gone. In Proceedings of the 21st International ICDE Conference on Data Engineering (Tokyo, Japan, Apr. 5–8). IEEE Computer Society, Washington, D.C., 2005, 2–11.
19. Thomas, K., Grier, C., Ma, J., Paxson, V., and Song, D. Design and evaluation of a real-time URL spam filtering service. In Proceedings of the IEEE Symposium on Security and Privacy (Oakland, CA, May 22–25). IEEE Press, 2011.
20. Valiant, L.G. A bridging model for parallel computation. Commun. ACM 33, 8 (Aug. 1990), 103–111.
21. Venkataraman, S. et al. SparkR; http://dl.acm.org/citation.cfm?id=2903740&CFID=687410325&CFTOKEN=83630888
22. Xin, R. and Zaharia, M. Lessons from running large-scale Spark workloads; http://tinyurl.com/large-scale-spark
23. Xin, R.S., Rosen, J., Zaharia, M., Franklin, M.J., Shenker, S., and Stoica, I. Shark: SQL and rich analytics at scale. In Proceedings of the ACM SIGMOD/PODS Conference (New York, June 22–27). ACM Press, New York, 2013.
24. Zaharia, M. An Architecture for Fast and General Data Processing on Large Clusters. Ph.D. thesis, Electrical Engineering and Computer Sciences Department, University of California, Berkeley, 2014; https://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-12.pdf
25. Zaharia, M. et al. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the Ninth USENIX NSDI Symposium on Networked Systems Design and Implementation (San Jose, CA, Apr. 25–27, 2012).
26. Zaharia, M. et al. Discretized streams: Fault-tolerant streaming computation at scale. In Proceedings of the 24th ACM SOSP Symposium on Operating Systems Principles (Farmington, PA, Nov. 3–6). ACM Press, New York, 2013.
27. Zhang, Z., Barbary, K., Nothaft, N.A., Sparks, E., Zahn, O., Franklin, M.J., Patterson, D.A., and Perlmutter, S. Scientific Computing Meets Big Data Technology: An Astronomy Use Case. In Proceedings of IEEE International Conference on Big Data (Santa Clara, CA, Oct. 29–Nov. 1). IEEE, 2015.

Matei Zaharia is an assistant professor of computer science at Stanford University, Stanford, CA, and CTO of Databricks, San Francisco, CA.
Reynold S. Xin is the chief architect on the Spark team at Databricks, San Francisco, CA.
Patrick Wendell is the vice president of engineering at Databricks, San Francisco, CA.
Tathagata Das is a software engineer at Databricks, San Francisco, CA.
Michael Armbrust is a software engineer at Databricks, San Francisco, CA.
Ankur Dave is a graduate student in the Real-Time, Intelligent and Secure Systems Lab at the University of California, Berkeley.
Xiangrui Meng is a software engineer at Databricks, San Francisco, CA.
Josh Rosen is a software engineer at Databricks, San Francisco, CA.
Shivaram Venkataraman is a Ph.D. student in the AMPLab at the University of California, Berkeley.
Michael Franklin is the Liew Family Chair of Computer Science at the University of Chicago and Director of the AMPLab at the University of California, Berkeley.
Ali Ghodsi is the CEO of Databricks and adjunct faculty at the University of California, Berkeley.
Joseph E. Gonzalez is an assistant professor in EECS at the University of California, Berkeley.
Scott Shenker is a professor in EECS at the University of California, Berkeley.
Ion Stoica is a professor in EECS and co-director of the AMPLab at the University of California, Berkeley.

Copyright held by the authors. Publication rights licensed to ACM. $15.00

Watch the authors discuss their work in this exclusive Communications video. http://cacm.acm.org/videos/spark
contributed articles
DOI:10.1145/2934663

Enterprises that impose stringent password-composition policies appear to suffer the same fate as those that do not.

BY DINEI FLORÊNCIO, CORMAC HERLEY, AND PAUL C. VAN OORSCHOT

Pushing on String: [...] Region of Password Strength

Our review of tools available to improve attack-resistance finds that, for example, compelling returns are offered by password blacklists, throttling, and hash iteration, while current password-composition policies fail to provide demonstrable improvement in outcomes against offline guessing attacks.

Suppose a system administrator is tasked with defending a corporate, government, or university network or site. The user population’s passwords are targeted by attackers seeking access to network resources. Passwords can be attacked by guess- [...]

[...] is a practical question faced by millions of administrators. Little actionable guidance exists on how to do so in a principled fashion. Sensible policies must consider how to protect the population of user accounts. What is the best measure of the strength of a population of passwords? Is a good proxy for the overall ability of the network to resist guessing attacks the average, median, strongest, or weakest password? Slogans (such as suggesting that all passwords be “as strong as possible”) are too vague to guide action, and also suggest that infinite user effort is both available and achievable.
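To see why the choice of summary measure matters, here is a minimal sketch in Python that computes the four candidate measures for one small population. The numbers are invented purely for illustration (each is a hypothetical count of guesses an attacker would need before that account's password falls), and the function name summarize_population is an assumption, not something from the article.

```python
import statistics


def summarize_population(guesses_to_crack):
    """Return candidate summary measures for a population of passwords.

    Each entry is the (hypothetical) number of guesses an attacker needs
    before that account's password falls.
    """
    return {
        "weakest": min(guesses_to_crack),
        "median": statistics.median(guesses_to_crack),
        "average": statistics.mean(guesses_to_crack),
        "strongest": max(guesses_to_crack),
    }


# Invented population: mostly weak passwords plus one very strong one.
population = [10, 100, 1_000, 10_000, 10**12]
summary = summarize_population(population)
for name, value in summary.items():
    print(f"{name:>9}: {value:,.0f}")
```

In this made-up population the average is dominated by the single strongest password and says almost nothing about how quickly the weaker accounts would fall, which is one reason the question of which measure to use is not a formality.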
[...] perhaps not; if half of them have been, he almost certainly should. What about a 1% compromise rate, or 5%, or 10%? At what point is global reset the right answer?

[...] attack channel, was it online guessing? Or, somehow, did the attacker get hold of the password hash file and succeed with an offline guessing attack?

When attacks on passwords suc- [...]

[...]ministrator consider the network only 50% compromised, or fully overrun? The answer depends of course on the nature of the network in question. If a compromised account never has implications for any other account, then we might say the damage grows more-or-less linearly with α. However, for an enterprise network, a compromised account almost certainly has snowballing effects;3 a single credential might give access to many network resources, so the damage to the network grows faster than α (one possible curve is shown in Figure 1). At α = 0.5, a system is arguably completely overrun. For many enterprise environments in this scenario, there would be few if any resources the attacker cannot access; in many social networks, the network value would approach zero, as spam would probably render things unusable; access to (probably well under) 50% of email inboxes likely yields a view to almost all company email, as an attacker requires access to only one of the sender(s) or recipient(s) of each message.

All password-based systems must tolerate some compromise; passwords will be keylogged, cross-site scripted, and spear-phished, and a network unable to handle this reality will not be able to function in a modern threat environment. As an attacker gains more and more credentials in an enterprise network, she naturally reaches some saturation point after which the impact of additional fallen credentials is negligible, having relatively little effect on the attacker’s ability to inflict harm. The first credential gives initial access to the network, and the second, third, and fourth solidify the beachhead, but the benefit brought by each additional credential decreases steadily. The gain is substantial when k is small, less when large; the second password guessed helps a lot more than, say, password 102.

Let αsat be the threshold value at which the attacker effectively has control of the password-protected part of the system, in the sense that there is negligible marginal gain from compromising additional credentials. That is, if an attacker had control of a fraction αsat of account credentials, there are very few resources that she could not access; so the difference between controlling αsat and (αsat + ε) is negligible. In what follows, our main focus is enterprise networks to consider possible values for αsat.

There are a variety of tools attackers can use, once they have one set of credentials, to get others. Phishing email messages that originate from an internal account are far more likely to deceive co-workers. Depending on the attacker’s objective, a toehold in the network may be all she requires. The 2011 attack on RSA1 (which forced a recall of all SecurID tokens) began with phishing email messages to “two small groups of employees,”13 none “particularly high-profile or high-value targets.” Dunagan et al.3 in examining a corporate network of more than 100,000 machines, found that 98.1% of the machines allowed an outward snowballing effect allowing compromise of 1,000 additional machines. Edward Snowden was able to compromise an enormous fraction of secrets on the NSA network starting from just one account.14 Given that RSA and the NSA (organizations we might expect to have above-average security standards) experienced catastrophic failures caused by handfuls of credentials in attackers’ hands, we suggest a reasonable upper bound on the saturation point for a corporate or government network is αsat ≈ 0.1; saturation likely occurs at much lower values.

The argument, “stronger passwords are always better,” while deeply ingrained, thus appears untrue.

It seems likely that enterprises will have the lowest values of αsat; at consumer Web services, compromise of one account has less potential to affect the whole network. As noted earlier, our focus here is on enterprise; nonetheless, we suggest that damage probably also grows faster than linearly at websites. For example, online accounts at a bank should have minimal crossover effects, but at 25% compromise all confidence in the legitimacy of transaction requests and the privacy of customer data is likely lost.

For guessing attacks, the most easily guessed passwords fall first. It is thus the weakest passwords that determine αsat; the number of guesses it takes to gather a cumulative fraction αsat of accounts is what it takes to reach the saturation point. Since the attacker’s ability to harm saturates once she reaches αsat, the excess strength of the remaining (1 − αsat) of user passwords is wasted. For example, the strongest 50% of passwords might indeed be very strong, but from a systemwide viewpoint, that strength will come into play only when the other 50% of credentials has already been compromised, and
the attacker already has the run of everything that is password-protected in the enterprise network.

In summary, password strength, or guessing resistance, is not an abstract quantity to be pursued for its own sake. Rather, it is a tool we use to deny an attacker access to network resources. There is a saturation point, αsat, where the network is so thoroughly penetrated that additional passwords gain the attacker very little; resistance to guessing beyond that point is wasted since it denies the attacker nothing. There is thus a “don’t care” region after that saturation level of compromise, and for enterprise networks, it appears quite reasonable to assume that αsat is no higher than 0.1. At that level, it is not simply the case that the weakest 10% of credentials is most important, but that

[Figure 1. The fraction of network resources under attacker control grows faster than the fraction of credentials compromised. For example, given half of a system’s credentials, an attacker likely effectively has access to all resources. For enterprise networks, we expect the saturation threshold, αsat, where the network is completely penetrated, is likely under 0.1. Vertical axis: fraction of resources controlled by attacker. Other text recovered from the plot area: “Effective attacker control”; “Online-offline chasm.”]

for typical hash computations required to test candidate guesses. Offline attacks can thus test many-orders-of-magnitude more guesses than online attacks, whether or not online attacks are rate-limited by system defenses. We consider online and offline guessing attacks separately.

order to verify correctness of guesses offline), and, in this case, an offline attack is necessary only if that file has been properly salted and hashed; otherwise, simpler attacks are possible⁶ (such as rainbow tables for unsalted hashed passwords).

There is an enormous difference between the strength required to resist online and offline guessing attacks. Naturally, the probability of falling either to an online or offline attack decreases gradually with the number of guesses a password will withstand. One hundred guesses per account might be easy for an online attacker, but 1,000 is somewhat more difficult, and so on; at some point, online guessing is no longer feasible. Similarly, at some point the risk from an offline attack begins to gradually decrease. Let T0 be a threshold representing the maximum number of guesses expected from an online attack and T1 correspondingly the minimum
NOVEMBER 2016 | VOL. 59 | NO. 11 | COMMUNICATIONS OF THE ACM 69
contributed articles
number of guesses expected from a credible offline attack. (The asymmetry in these definitions is intentional to provide a conservative estimate in reasoning about the size of the gap.) A password withstanding T0 guesses is then safe from online guessing attacks, while one that does not withstand T1 guesses will certainly not survive offline attacks. Our own previous work⁶ suggests T0 ≈ 10⁶ and T1 ≈ 10¹⁴ are reasonable coarse estimates, giving a gap eight orders of magnitude wide (see Figure 2); we emphasize, however, that the arguments herein are generic, regardless of the exact values of T0 and T1. While estimates for T1 in particular are inexact, depending as they do on assumptions about attacker hardware and strategy (a previous estimate⁶ assumed four months of cracking against one million accounts using 1,000 GPUs, each doing one billion guesses per second), clearly T1 vastly exceeds T0.

A critical observation is that for a password P whose guessing-resistance falls between T0 and T1, incremental effort that fails to move it beyond T1 is wasted in the sense that there is no guessing attack to which the original P falls, that the stronger password resists, since guessing attacks are either online or offline, with no continuum in between. Passwords in this online-offline gap are thus in a “don’t care” region in which they do both too much and not enough: too much if the attack vector is online guessing and not enough if it is offline. Once a password is able to withstand T0 guesses, to stop additional attacks, any added strength must be sufficient to move to the right of T1. While both online and offline attacks are countered by guessing-resistance, the amount needed varies enormously. In practical terms, distinct defenses are required to stop offline and online attacks. This online-offline chasm then gives us a second “don’t care” region, besides the one defined by αsat.

The “Don’t Care” Region

The compromise saturation point and the online-offline chasm each imply regions where there is no return on effort. The marginal return on effort is zero for improving any password that starts in the zone greater than T0 yet remains less than T1 and for passwords with guessing-resistance above the αsat threshold. Figure 2 depicts this situation, with shaded areas denoting the “don’t care” regions. A password withstanding T1 − ε guesses has the same survival properties as one surviving T0 + ε; both survive online but fall to offline attack, that is, equivalent outcomes. From the administrator’s point of view, the strongest password in the population, or having the greatest guessing-resistance (the password at α = 1.0) is similar to one at αsat + ε, in that both fall after the attacker’s ability to harm is already saturated.

In summary, shaded regions denote these areas where there is no return-on-effort for “extra strength”: first, passwords whose guessing-resistance lies within the online-offline gap; and second, passwords beyond where an attacker gains little from additional credentials. The size of the “don’t care” region naturally depends on the particular values of αsat, T0, and T1, but in all cases the shape of the password distribution (as defined by the colored curves in Figure 2) matters only in the areas below αsat AND (left of T0 OR right of T1). An important observation is that, under reasonable assumptions, the “don’t care” regions cover a majority of the design space. The relatively small unshaded regions are shown in Figure 2; outside these regions, changes to the password distribution accomplish nothing, at least from the administrator’s viewpoint.

To anchor the discussion, based on what we know of enterprise networks and attacker abilities, we have offered estimates of αsat = 0.1, T0 = 10⁶, and T1 = 10¹⁴; but choosing different values does not alter the conclusion that in large areas of the guess-resistance vs. credentials-compromised space, changing the distribution of user-chosen passwords improves little of use for the enterprise defender; it causes no direct damage but, like pushing on string, is ineffective and wasteful of user energy.

[Figure 3. Defensive elements aiming to improve guessing resistance. (1) αsat (the point at which attacker control saturates) can be raised by implementing basic security principles (such as least-privilege and compartmentalization). (2) T0 (the maximum number of online guesses) can be reduced by throttling mechanisms. (3) T1 (the minimum number to be expected from an offline attack) can be reduced by iterated hash functions. (4) The cumulative fraction of accounts that have fallen at a given number of attacker guesses can be reduced (pushing the blue curve down) by improving the guessing-resistance of user-chosen passwords (such as to the left of T0 by password blacklisting and by password composition policies generally). Changing αsat, T0 and T1 alters the size of white- and blue-shaded “don’t care” regions. Vertical axis: fraction of credentials compromised, α, with αsat marked; horizontal axis: log10 (number of guesses), with T0 and T1 marked.]

To understand the consequences of the “don’t-care” regions, Figure 2 also depicts guessing resistance of three hypothetical password distributions. The blue (top) and green (middle) curves diverge widely on the right side of the figure; for a fixed number of guesses, far fewer green-curve accounts will be compromised than accounts from the blue-curve distribution. The green curve might appear better since those passwords are much more guess-resistant than those from the blue curve. Nonetheless, they have identical attack survival outcomes since their divergence between T0 and T1 has minimal
by the network topology and might be relatively difficult to control or change in a given environment. Basic network hygiene and adherence to security principles (such as least privilege, isolation, and containment) can help minimize damage when intrusion occurs. These defenses are also effective against intrusions that do not involve password guessing. We assume these defenses will already be in force and in the rest of this section concentrate on measures that mostly affect password-guessing attacks.

Improving T0. There are few, if any, good reasons for an authentication system to allow hundreds of thousands of distinct guesses on a single account. In cases of actual password forgetting, it is unlikely that the legitimate user types more than one dozen or so distinct guesses. Mechanisms that limit the number of online guesses (thus reducing T0) include various throttling mechanisms (rate limiting) and IP address blacklisting. The possibility of denial-of-service attacks can usually be dealt with by IP address whitelisting. (We mean, not applying the throttling triggered by new IP addresses to known addresses from which a previous login succeeded; wrong guesses from that known address should still be subject to throttling.) A simple, easily implemented throttling mechanism may suffice for many sites. When denial-of-service attacks are a possibility, more complex mechanisms may be necessary, perhaps including IP address white- and blacklisting, and methods requiring greater effort from site administrators. Such defensive improvements should come without additional burdens of effort and inconvenience to users. Together with password blacklisting (discussed in the following section), throttling may almost completely shut down generic online guessing attacks.

Improving T1. A password must withstand a relatively large number of guesses to have any hope of withstanding credible offline attacks. A lower bound T1 on this number may vary depending on the defenses put in place and can be quite high. For example, if an attacker can make 10 billion guesses per second on each of 1,000 GPUs,⁷ then in a four-month period, she can try approximately T1 = 10¹⁴ guesses against each of one million accounts.⁶ Furthermore, technology advances typically aid attackers more than defenders; few administrators replace hardware every year, while attackers can be assumed to have access to the latest resources. This moves T1 to the right, as computing speeds and technology advance and customized hardware gives attackers further advantages. Since the hardware is controlled by the attacker, little can be done to directly throttle offline guessing. However, an effective way to reduce the number of trials per second is to use a slow hash, that is, one that consumes more computation. The most obvious means is hash iteration;¹² recent research is also exploring the design of hash functions specifically designed to be GPU unfriendly. For example, ignoring the counteracting force of speed gains due to advancing technology, an iteration count of n = 10⁴ reduces T1 from 10¹⁴ to 10¹⁰. Even with iteration it is difficult to move T1 all the way left to T0 to make the online-offline chasm disappear. Limits on further increasing n arise from the requirement that the time to verify legitimate users must be tolerable to both the users (wait time) and system hardware; for example, n might allow verifying 100 legitimate users/second (10ms per user). If 10ms is a tolerable delay, an attacker with access to 1,000 GPUs can compute a total of 1,000 × 4 × 30 × 24 × 60 × 60/10⁻² ≈ 10¹² guesses in four months. Directing this effort at 100 accounts would mean each would have to withstand a minimum of T1 = 10¹⁰ guesses. Since these are conservative assumptions, it appears challenging for a typical enterprise to decrease T1 below this point.

Note that, as technology evolves, the number of hash iterations can easily be increased, invisibly to users and on the fly, by updating the stored password hashes in the systemside file to reflect new iteration counts.¹² Among the appealing aspects of iterated hashing, it has long been known as an effective defensive tool, and costs are borne systemside rather than by user effort. However, hash iteration is not a miracle cure-all; for a password whose guess-resistance is 10⁶, online throttling is still important. An online attacker who could test one password every 10ms (matching the system rate noted earlier) will succeed in 10⁴ seconds = 2 hours, 47 minutes.
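The back-of-the-envelope estimates in the “Improving T1” discussion can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are hypothetical, and it assumes the same 30-day month and per-GPU rates quoted in the text.

```python
# Rough reproduction of the offline-guessing arithmetic quoted in the text.
# All names here are illustrative; they do not come from the article.

SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month, as in the text


def guesses_per_account(rate_per_gpu, gpus, months, accounts):
    """Guesses an attacker can spend on each account over the period."""
    total = rate_per_gpu * gpus * months * SECONDS_PER_MONTH
    return total // accounts


# Fast hash: 10 billion guesses/s per GPU, 1,000 GPUs, 4 months, 1M accounts.
t1_fast = guesses_per_account(10**10, 1_000, 4, 10**6)

# Iterated hash costing 10 ms per guess (100 guesses/s per GPU), 100 accounts.
t1_slow = guesses_per_account(100, 1_000, 4, 100)

# Online attacker testing one guess every 10 ms against a password
# whose guess-resistance is 10^6.
online_seconds = 10**6 * 10 // 1_000  # 10 ms per guess, in whole seconds
hours, rem = divmod(online_seconds, 3600)
minutes, seconds = divmod(rem, 60)

print(f"T1 (fast hash):     {t1_fast:.2e}")
print(f"T1 (iterated hash): {t1_slow:.2e}")
print(f"Online attack time: {hours} h {minutes} min {seconds} s")
```

Both T1 values land within a few percent of the powers of ten used in the text (10¹⁴ and 10¹⁰), and the online attack time comes to 2 h 46 min 40 s, which the text rounds to 2 hours, 47 minutes.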
Eliminating offline attacks altogether. The emphasis by any parties who encourage users to choose stronger passwords can obscure the fact that offline attacks are only a risk when the password hash file “leaks,” a euphemism for “is somehow stolen,” or otherwise becomes available to an attacker. Any means or mechanisms that prevent the password hash file from leaking entirely remove the need for individual passwords to withstand an offline attack. Since we defined T1 as the minimum number of guesses a password must withstand to resist an offline attack, any such mechanism effectively reduces T1 to zero. We next discuss one particularly appealing such mechanism.

Hardware security modules (HSMs). A properly used HSM eliminates the risk of a hash file leaking⁶ or, equivalently, eliminates the risk of a decryption key (or backup thereof) leaking in the case that the means used to protect the information stored systemside to verify passwords is reversible encryption. In such a proper HSM architecture, rather than a file of the one-way hashes of salted passwords, what is stored in each file entry is a message authentication code (MAC) computed over the corresponding password using a secret key. When a password candidate is presented for verification, the candidate plus the corresponding MAC from the system file are provided as HSM inputs. The HSM holds the system secret key used to compute the MAC; importantly, this secret key is by design never available outside the HSM. Upon receiving the (MAC, candidate password) input pair, the HSM independently computes a MAC over the input candidate, compares it to the input MAC, and answers yes (if they agree) or no if they do not. Stealing the password hash file (in this case a password MAC file) is now useless to the offline attacker, because the HSM is needed to verify guesses; that is, offline attacks are no longer possible.

Another interesting scheme to mitigate offline attacks, proposed by Crescenzo et al.,² bounds the number of guesses that can be made by restricting the bandwidth of the connection between the authentication server and a specially constructed hash server that requires a very large collection of random bits. This approach limits an attacker to online guessing, since, by design, the connection is too small to support the typical guessing rate an offline attacker needs, or to allow export of any file that would be useful to an offline attacker.

Improving the password distribution. Finally, we consider changes to the password distribution as a means of improving outcomes. Recall the curves in Figure 2 represent the cumulative fraction of accounts compromised as a function of the number of guesses per account. In general, it has been accepted without much second thought that the lower this cumulative fraction the better; a great deal of effort has gone into coercing users to choose supposedly “stronger” passwords, thus pushing the cumulative distribution curve downward in one or more of the three regions induced by T0 and T1. However, as explained earlier, lower is of tangible benefit only outside the “don’t care” region; improvements to the curve inside the “don’t care” region have negligible effects on outcomes in any attack scenario.

First, note that tools to influence the cumulative distribution are mostly indirect; users choose passwords, not administrators. For example, by some combination of education campaigns, password policies, and password meters, administrators may try to influence this curve toward “better” passwords. However, the cumulative distribution is ultimately determined by user password choice; if users ignore advice, do the minimum to comply with policies, or are not motivated by meters, then efforts to lower the curve may have little impact on user choices.

Second, note that many policy and education mechanisms are unfocused in the sense that they cannot be targeted at the specific part of the cumulative distribution where they make the most difference, and away from the “don’t care” region where they make none. Even if they succeed, exhortations to “choose better passwords” are not concentrated at one part of the curve or another; if all users respond to such a request by improving their passwords marginally, the related effort of 90% of users is still wasted for an enterprise where αsat = 0.1. We now examine common approaches to influencing password choices in this light.

Password blacklisting. Attackers exploit the fact that certain passwords are common. Explicitly forbidding passwords known to be common can thus reduce risk. Blacklists concentrate at the head of the distribution, blocking the choice of most common passwords. For example, a user who attempts to choose “abcdefg” is informed this is not allowed and asked to choose another. Certain large services do this; for example, Twitter blacklisted 380 common passwords after an online guessing incident in 2009, and Microsoft applied a blacklist of several hundred to its online consumer properties in 2011. With improvements of password crackers and the recent wide availability of password lists, blacklists need to be longer.

A blacklist of, say, 10⁶ common passwords may help bring guessing resistance to the 10⁵ level. A natural concern with blacklists is that users may not understand why particular choices are forbidden. Kelley et al.⁸ examined the guessing resistance of blacklists of various sizes, but the question of how long one can be before the decisions appear capricious is an open one. Komanduri et al.⁹ pursue this question with a meter that displays, as a user types, the most likely password completion.

A further unknown is the improvement achieved when users are told their password choice is forbidden. It appears statistically unlikely that all of the users who initially selected one of Twitter’s 380 blacklisted passwords would collide again on so small a list when making their second choice, but we are not aware of any measurements of the dispersion achieved. A promising recent practical password strength estimator is zxcvbn,¹⁵ a software tool that can also be of use for password blacklisting.

Observe that blacklists are both direct and focused: they explicitly prevent choices known to be bad rather than rely on indirect measures, and they target those users making bad choices, leaving the rest of the population unaffected. A blacklist appears to be one of the simplest measures to meaningfully improve the distribution in resisting online attacks.

Composition policies. Composition policies attempt to influence user-chosen passwords by mandating a
minimum length and the inclusion of certain character types; a typical example is “length at least eight characters, and use three of the four character sets {lowercase, uppercase, digits, special characters}.” Certain policies may help improve guess-resistance in the 10⁵ to 10⁸ range. However, for what we have suggested as the reasonable values αsat = 0.1 and T1 = 10¹⁴, the evidence strongly suggests that none of the password-composition policies in common use today or seriously proposed⁸,¹⁰ can help; for these to become relevant, one must assume that an attacker’s ability to harm saturates much higher than αsat = 0.1 or that the attacker can manage far fewer than T1 = 10¹⁴ offline guesses. Such policies thus fail to prevent total penetration of the network. Their ineffectiveness is perhaps the reason why a majority of large Web services avoid onerous policies.⁴

Note that composition policies are indirect: the constraints they impose are not themselves the true end objectives, but it is hoped they result in a more defensive password distribution. This problem is compounded by the fact that whether the desired improvement is indeed achieved is concealed from an administrator. The justifiably recommended practice of storing passwords as salted hashes means the password distribution is obscured, as are any improvements caused by policies.

Composition policies are also unfocused in that they affect all users rather than being directed specifically where they may matter most. A policy may greatly affect user password choice and still have little effect on outcome (such as if all of the change in the cumulative distribution happens inside the “don’t care” region).

Conclusion

Password strength, which actually means guessing resistance, is not a universal good to be pursued for its inherent benefits; it is useful only to the extent it denies things to adversaries. When we consider a population of accounts, there are large areas where increased guessing resistance accomplishes nothing, either because passwords fall between the online and offline thresholds or because so many accounts have already fallen that attacker control has already saturated. If increases in password guessing resistance were free this would not matter, but such increases are typically achieved at great cost in user effort; for example, there is a void of evidence that current approaches based on password composition policies significantly improve defensive outcomes and strong arguments that they waste much user effort. This situation thus creates risk of a false sense of security.

It is common to assume that users must choose passwords that will withstand credible offline attacks. However, if we assume an offline attacker can mount T1 = 10¹⁴ guesses per account and has all the access she needs by the time she compromises a fraction αsat = 0.1 of accounts, we must acknowledge that trying to stop offline attacks by aiming user effort toward choosing “better passwords” is unachievable in practice. The composition policies in current use seem so far from reaching this target that their use appears misguided. This is not to say that offline attacks are not a serious threat. However, it appears that enterprises that impose stringent password-composition policies on their users suffer the same fate as those that do not. If the hashed password file leaks, burdening users with complex composition policies does not alter the fact that a competent attacker likely gets access to all the accounts she could possibly need. Nudging users in the “don’t care” region (where most passwords appear to lie) is simply a waste of user effort.

The best investments to defend against offline attacks appear to involve measures transparent to users. Iteration of password hashes lowers the T1 boundary; however, even with very aggressive iteration, we expect that at least 10¹⁰ offline guesses remain quite feasible for attackers. Use of MACs, that is, keyed hash functions, instead of regular (unkeyed) password hashes, provides effective defense against offline attacks that exploit leaked hash files, provided the symmetric MAC key is not also leaked. Use of HSMs is one method of protecting MAC keys, as discussed; though more expensive than software-only defenses, HSMs can eliminate offline attacks entirely. Online guessing attacks, in contrast, cannot be entirely eliminated, but effective defenses include password blacklists and throttling. There appear few barriers to implementing these simple defenses.

References

1. Bright, P. RSA finally comes clean: SecurID is compromised. Ars Technica (June 6, 2011); http://arstechnica.com/security/news/2011/06/rsa-finally-comes-clean-securid-is-compromised.ars/
2. Crescenzo, G.D., Lipton, R.J., and Walfish, S. Perfectly secure password protocols in the bounded retrieval model. In Proceedings of the Theory of Cryptography Conference (New York, Mar. 4–7). Springer-Verlag, 2006, 225–244.
3. Dunagan, J., Zheng, A.X., and Simon, D.R. Heat-Ray: Combating identity snowball attacks using machine learning, combinatorial optimization and attack graphs. In Proceedings of the ACM Symposium on Operating Systems Principles (Big Sky, MT, Oct. 11–14). ACM Press, New York, 305–320.
4. Florêncio, D. and Herley, C. Where do security policies come from? In Proceedings of the SOUPS Symposium On Usable Privacy and Security (Redmond, WA, July 14–16, 2010).
5. Florêncio, D., Herley, C., and van Oorschot, P. Password portfolios and the finite-effort user: Sustainably managing large numbers of accounts. In Proceedings of the 23rd USENIX Security Symposium (San Diego, CA, Aug. 20–22). USENIX Association, Berkeley, CA, 2014, 575–590.
6. Florêncio, D., Herley, C., and van Oorschot, P.C. An administrator’s guide to Internet password research. In Proceedings of the USENIX LISA Conference (Seattle, WA, Nov. 9–14). USENIX Association, Berkeley, CA, 2014, 35–52.
7. Goodin, D. Why passwords have never been weaker and crackers have never been stronger. Ars Technica (Aug. 20, 2012); http://arstechnica.com/security/2012/08/passwords-under-assault/
8. Kelley, P.G., Komanduri, S., Mazurek, M.L., Shay, R., Vidas, T., Bauer, L., Christin, N., Cranor, L.F., and Lopez, J. Guess again (and again and again): Measuring password strength by simulating password-cracking algorithms. In Proceedings of the IEEE Symposium on Security and Privacy (San Francisco, May 20–23). IEEE Press, 2012, 523–537.
9. Komanduri, S., Shay, R., Cranor, L.F., Herley, C., and Schechter, S. Telepathwords: Preventing weak passwords by reading users’ minds. In Proceedings of the 23rd USENIX Security Symposium (San Diego, CA, Aug. 20–22). USENIX Association, Berkeley, CA, 2014, 591–606.
10. Mazurek, M.L., Komanduri, S., Vidas, T., Bauer, L., Christin, N., Cranor, L.F., Kelley, P., Shay, R., and Ur, B. Measuring password guessability for an entire university. In Proceedings of the 20th ACM Conference on Computer and Communications Security (Berlin, Germany, Nov. 4–8). ACM Press, New York, 2013.
11. Tippett, P. Stronger passwords aren’t. Information Security Magazine (June 2001), 42–43.
12. Provos, N. and Mazières, D. A future-adaptable password scheme. In Proceedings of the 1999 USENIX Annual Technical Conference, FREENIX Track (Monterey, CA, June 6–11). USENIX Association, Berkeley, CA, 1999, 81–91.
13. RSA FraudAction Research Labs. Anatomy of a hack. RSA, Bedford, MA, Apr. 1, 2011; https://blogs.rsa.com/anatomy-of-an-attack/
14. Toxen, B. The NSA and Snowden: Securing the all-seeing eye. Commun. ACM 57, 5 (May 2014), 44–51.
15. Wheeler, D. zxcvbn: Low-budget password strength estimation. In Proceedings of the 25th USENIX Security Symposium (Austin, TX, Aug. 10–12). USENIX Association, Berkeley, CA, 2016.

Dinei Florêncio ([email protected]) is a senior researcher in the Multimedia and Interactive Experiences group of Microsoft Research, Redmond, WA.

Cormac Herley ([email protected]) is a principal researcher at Microsoft Research, Redmond, WA.

Paul C. van Oorschot ([email protected]) is a professor of computer science and Canada Research Chair in Authentication and Computer Security at Carleton University, Ottawa, Canada.

Copyright held by the authors. Publication rights licensed to ACM. $15.00
A Theory on Power in Networks

A “network” consists of a crowd of actors and a set of binary relations that tie pairs of actors. Networks are pervasive in the real world. Nature, society, information, and technology are supported by ostensibly different networks that in fact share an amazing number of interesting structural properties.

Networks are modeled in mathematics as “graphs,” with actors represented as points (also called nodes or vertices) and relations depicted as lines (also called edges or arcs) connecting pairs of points. In this article, we focus on undirected graphs, where the edges do not have a particular orientation. A meaningful question on networks is: Which are the most central (important) nodes? Many measures have been proposed to address it. Among them, “eigenvector centrality” (or simply centrality in this article) states that an actor is central if it is connected with central actors. This circular definition is captured by an elegant recursive equation

λx = Ax, (1)
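As an illustration of Equation (1), the sketch below computes eigenvector centrality for a small undirected graph by power iteration. It rests on assumptions not stated in the passage above: A is taken to be the 0/1 adjacency matrix of a connected graph, λ its largest eigenvalue, and the iteration multiplies by A + I rather than A (a common shift that leaves the eigenvectors unchanged but avoids oscillation on bipartite graphs). The function name and the example graph are hypothetical.

```python
# Sketch of eigenvector centrality (lambda * x = A x) via shifted power iteration.
# Assumes `adj` is a symmetric 0/1 adjacency matrix of a connected graph.


def eigenvector_centrality(adj, iterations=200):
    n = len(adj)
    x = [1.0] * n
    for _ in range(iterations):
        # Multiply by (A + I): same eigenvectors as A, but the dominant
        # eigenvalue becomes strictly largest in magnitude.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(y)
        x = [v / top for v in y]  # rescale so the largest score is 1
    # Estimate lambda from A x = lambda x using the converged vector.
    ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
    lam = sum(ax) / sum(x)
    return lam, x


# Star graph: node 0 is tied to nodes 1, 2, and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

lam, x = eigenvector_centrality(star)
print(round(lam, 4), [round(v, 4) for v in x])
```

In the star graph the hub receives the highest score and the three leaves tie, which matches the circular reading of the definition: the hub is central because it is connected to every other node, and each leaf inherits its score from the hub alone.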
covered and rediscovered many times over in different contexts. It has been investigated, in chronological order, in econometrics, sociometry, bibliometrics, Web information retrieval, and network science; see Franceschet12 for an historical overview.
In some circumstances, however, centrality (the quality of being connected to central ones) has limited utility in predicting the locus of “power” in networks.2,8,11 Consider exchange networks, where the relationship in the network involves the transfer of valued items like information, time, money, or energy. A set of exchange relations is positive if exchange in one relation promotes exchange in others and negative if exchange in one relation inhibits exchange in others.7 In “negative exchange networks,” power comes from being connected to those with few options. Being connected to those with many possibilities reduces one’s power. Think of, for instance, a social network in which time is the exchanged value. Imagine every actor has a limited time to listen to others and that each actor divides its time among its neighbors. Exchange of time in one relation clearly precludes exchange of the same time in other relations. Which actors receive the most attention? They are the nodes that are connected to many neighbors with few options, since they receive almost full attention from all their neighbors. On the other hand, actors connected to few neighbors with many options receive little consideration because their neighbors are mostly busy with others.
In this article, we propose a theory on power in the context of networks. We start by this thesis: An actor is powerful if it is connected to powerless actors. We implement this circular thesis through this equation
x = Ax÷, (2)
where x is the sought power vector, A is a matrix encoding the network, and x÷ is the vector whose entries are the reciprocal of those of x. Equation (2) states two important properties of power: the power of an actor is directly correlated with the number of its neighbors and is inversely correlated with the power of its neighbors. The first property seems reasonable; the more ties an actor has, the more powerful the actor is. The second property characterizes power; for an equal number of ties, actors linked to powerless others are powerful. On the other hand, actors tied to powerful others are powerless.
We investigate the existence and uniqueness of a solution for Equation (2), exploiting well-known results in combinatorial matrix theory. We study how to regain the solution when it does not exist, by perturbing the matrix representing the network. We formally relate the introduced notion of power with alternative notions and empirically compare them on the European natural gas pipeline network.
Motivating Example
In his seminal work on power-dependence relations, from 1962, Richard Emerson11 claimed that power is a property of the social relation, not an attribute of the person: “X has power” is meaningless, unless we specify “over whom.” Power resides implicitly in others’ dependence, and dependence of an actor A upon actor B is directly proportional to A’s motivational investment in goals mediated by B and inversely proportional to the availability of those goals to A outside the A–B relation. The availability of such goals outside that relation refers to alternative avenues of goal achievement, most notably through other social relations.11 This type of relational power is endogenous with respect to the network structures, meaning it is a function of the position of the node in the network. Exogenous factors (such as allure or charisma) external to the network structure might be added to endogenous power to complete the picture.
We begin with some small archetypal examples typically used in exchange-network theory to informally illustrate the notion of power and sometimes to distinguish it from the intersecting concept of centrality.10 Consider a two-node path
A−B.
The situation is perfectly symmetric, and a reasonable prediction is that both actors have the same power. In a three-node path
A−B−C,
much is changed. Intuitively, B is powerful and A and C are not. Indeed, both A and C have no alternative venues besides B (both depend on B), while B can exclude one of them by choosing the other.a In a four-node path
A−B−C−D,
actors B and C hold power, while A and D are dependent on either B or C. Nevertheless, the power of B is less here than in the three-node path; in both cases, A depends on B, but in the three-node path, C also depends on B, while in the four-node path, C has an alternative, node D. Hence, node B is less powerful in the four-node path with respect to the three-node path since its neighbors are more powerful. Finally, the five-node path
A−B−C−D−E
is interesting since it discriminates power from centrality. All traditional centrality measures (eigenvector, closeness, betweenness) claim that C is the central one. Nevertheless, B and D are reasonably the powerful ones. Again, this is because they negotiate with weak partners (A and C or E and C), while C bargains with strong parties (B and D). This example is useful for illustrating an additional subtle aspect of power. Notice that in both the five-node path and the four-node path, B is surrounded by nodes (A and C) that are locally similar; for instance, they have the same degree in both paths. However, the power of C is reasonably less in the five-node path than in the four-node path; hence, we might expect the power of B is greater in the five-node path with respect to the four-node path. This separation is possible only if the notion of power spans beyond the local neighborhood of a node, if, say, power is recursively defined.
As a larger and more realistic example, consider Figure 1, which depicts the European natural gas pipeline network. Nodes are European countries (country codes according to ISO 3166-1), and there is an undirected edge between two nations if a natural gas pipeline crosses the borders of the two countries. Data has been downloaded from the website of the International Energy Agency (http://www.iea.org). The original data corresponds to a directed, weighted multigraph, with edge weights corresponding to the maximum flow of the pipeline. We simplified and symmetrized the network, mapping the edge weights in a consistent way.
Figure 1. The European natural gas pipeline network.
This is a negative exchange network because the exchange of gas with a country precludes the exchange of the same gas with others. Intuitively, powerful countries are those that are connected with states with few options for exchanging the gas. Suppose country B is connected to countries A and C, and B is the only connection for them, or A−B−C. Countries A and C can sell or buy gas only from B, while country B can choose between A and C. Reasonably, the bargaining power of B is greater, which translates into higher revenues or less expense for B in the gas negotiation.
A Theory on Power
Let G be an undirected, weighted graph. The graph G may contain “loops,” or edges from a node to itself. The edges of G are labeled with positive weights. Let A be the adjacency matrix of G; that is, A_{i,j} is the weight of edge (i, j) if such edge exists and A_{i,j} = 0 otherwise. Hence, A is a square, symmetric, nonnegative matrix. Loops in G correspond to elements in the main diagonal of A.
The “centrality problem” is as follows: find a vector x with positive entries such that
λx = Ax, (3)
where λ > 0 is a constant. This means λx_i = ∑_j A_{i,j} x_j; that is, the centrality of a node is proportional to the weighted sum of centralities of its neighbors. This is the main idea behind PageRank, Google’s original webpage ranking algorithm. PageRank determines the importance of a webpage in terms of the importance assigned to the pages that hyperlink to it. Besides Web information retrieval, this thesis has been successfully exploited in disparate contexts, including bibliometrics, sociometry, and econometrics.12
We define the “power problem” as follows: find a vector x with positive entries such that
x = Ax÷, (4)
where we denote with x÷ the vector whose entries are the reciprocal of those of x. This means x_i = ∑_j A_{i,j}/x_j; that is, the power of a node is equal to the weighted sum of reciprocals of power of its neighbors. Notice that if λx = Ax÷, then, setting
y = √λ x, we have that y = Ay÷; hence, the proportionality constant λ is not necessary in the power equation. This notion of power is relevant on negative exchange networks.2,8 In these networks, when a value is exchanged between actors along a relation, it is consumed and cannot be exchanged along another relation. Hence, important actors are those in contact with many actors with few exchanging possibilities.
Finally, the “balancing problem” is the following: find a diagonal matrix D with positive main diagonal such that
S = DAD
is doubly stochastic; that is, all rows and columns of S sum to 1. The balancing problem is a fundamental question that is claimed to have first been used in the 1930s to calculate traffic flow4 and since then has been applied in many disparate contexts.14
It turns out that the power problem is intimately related to the balancing problem. Given a vector x, let D_x be the diagonal matrix whose diagonal entries coincide with those of x. We thus have the following result.
Theorem 1. The vector x is a solution of the power problem if and only if the diagonal matrix D_{x÷} is a solution for the balancing problem.
Proof. If DAD is doubly stochastic, then DADe = e and e^T DAD = e^T, where e is a vector of all 1s. Actually, since A and D are symmetric, it holds that DADe = e ⇔ e^T DAD = e^T. If the vector x does not have zero entries, then D_x is invertible and D_x^{-1} = D_{x÷}. We have that x = Ax÷ ⇔ D_x e = A D_{x÷} e ⇔ e = D_x^{-1} A D_{x÷} e ⇔ e = D_{x÷} A D_{x÷} e.
Existence and uniqueness of a solution. The link between the balancing problem and the power problem we established in Theorem 1 allows us to investigate a solution of the power problem (Equation 4) using the well-established theory of matrix balancing.
Recall that the “diagonal” of a square n × n matrix is a sequence of n elements that lies on different rows and columns of the matrix. A permutation matrix is a square n × n matrix that has exactly one entry equal to one in each row and each column, while all the other entries are equal to zero. Each diagonal clearly corresponds to a permutation matrix where the positions of the diagonal elements correspond to those of the unity entries of the permutation matrix. In particular, the identity matrix I is a permutation matrix, and the diagonal of A associated with I is called the main diagonal of A. A diagonal is positive if all its elements are greater than 0. A matrix A is said to have “support” if it contains a positive diagonal and is said to have “total support” if A ≠ 0 and every positive element of A lies on a positive diagonal. Total support clearly implies support.
A matrix is “indecomposable” (irreducible) if it is not possible to find a permutation matrix P such that
P A P^T = [ X Y ]
          [ 0 Z ],
where X and Z are both square matrices and 0 is a matrix of 0s; otherwise A is “decomposable” (reducible). A matrix is “fully indecomposable” if it is not possible to find permutation matrices P and Q such that
P A Q = [ X Y ]
        [ 0 Z ],
where X and Z are both square matrices; otherwise, A is “partly decomposable.” Clearly, a fully indecomposable matrix is also irreducible. It also holds that full indecomposability implies total support.5 Moreover, the adjacency matrix of a bipartite graph is never fully indecomposable, while the adjacency matrix of a non-bipartite graph is fully indecomposable if and only if it has total support and is irreducible.9 We say a graph has support, has total support, is irreducible, and is fully indecomposable if the corresponding adjacency matrix has these properties.
The combinatorial notions just outlined are rather terse. Fortunately, most of them have a simple interpretation in graph theory. It is known that irreducibility of the adjacency matrix corresponds to connectedness of the graph. Moreover, given an undirected graph G, let us define a “spanning cycle forest” of G as a spanning subgraph of G whose connected components are single edges or cycles, including loops that are cycles of length 1. It is easy to realize that there exists a correspondence between diagonals in the adjacency matrix and spanning cycle forests in the graph. Hence, a graph has support if and only if it contains a spanning cycle forest and total support if and only if each edge is included in a spanning cycle forest. Four examples are given in Figure 2.
The following is a well-known necessary and sufficient condition for the solution of the balancing problem.9,17
Figure 2. (top left) The graph has no support since a spanning cycle forest is missing. (top right) The graph has support formed by edges (1, 4) and (2, 3), but the support is not total; edges (1, 3) and (1, 2) are not part of any spanning cycle forest. (bottom left) The graph has total support but is not irreducible, hence is not fully indecomposable. (bottom right) The graph has total support and is irreducible and not bipartite, so is fully indecomposable.
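The notions of support and total support can be checked directly on small graphs. The following is a minimal sketch, not the authors' code: it enumerates all permutations (feasible only for a handful of nodes), and the helper names `positive_diagonals`, `has_support`, `has_total_support`, and `adjacency` are hypothetical. The four-node example uses the edges named in the Figure 2 caption for the top-right graph, with nodes renumbered from 0.

```python
from itertools import permutations

def positive_diagonals(A):
    """Return every permutation p such that A[i][p[i]] > 0 for all rows i."""
    n = len(A)
    return [p for p in permutations(range(n))
            if all(A[i][p[i]] > 0 for i in range(n))]

def has_support(A):
    """A has support if it contains at least one positive diagonal."""
    return bool(positive_diagonals(A))

def has_total_support(A):
    """A has total support if A != 0 and every positive entry lies on a positive diagonal."""
    n = len(A)
    positive = {(i, j) for i in range(n) for j in range(n) if A[i][j] > 0}
    covered = {(i, p[i]) for p in positive_diagonals(A) for i in range(n)}
    return bool(positive) and positive <= covered

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on nodes 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1
    return A

path3 = adjacency(3, [(0, 1), (1, 2)])                       # A-B-C
fig2_top_right = adjacency(4, [(0, 3), (1, 2), (0, 2), (0, 1)])
triangle = adjacency(3, [(0, 1), (1, 2), (0, 2)])

for name, A in [("path3", path3), ("fig2_top_right", fig2_top_right), ("triangle", triangle)]:
    print(name, has_support(A), has_total_support(A))
```

On these inputs the three-node path has no support, the four-node graph has support that is not total, and the triangle has total support, which matches the spanning cycle forest reading given above.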
Theorem 2. Let A be a symmetric nonnegative square matrix. A necessary and sufficient condition for the existence of a doubly stochastic matrix S of the form DAD, where D is a diagonal matrix with positive main diagonal, is that A has total support. If S exists, then it is unique. If A is fully indecomposable, then matrix D is unique.
It follows that the power problem x = Ax÷ has a solution on the class of graphs that has total support. Moreover, if the graph is fully indecomposable, then the solution is also unique.
Perturbation (regaining the solution). What about the power problem on graphs whose adjacency matrix lacks total support? For such graphs, the power problem has no solution. Nevertheless, a solution can be regained by perturbing the adjacency matrix of the graph in a suitable way. We investigate two perturbations on the adjacency matrix A
1. Diagonal perturbation: A_α^D = A + αI, where α > 0 is a damping parameter and I is the identity matrix.
2. Full perturbation: A_α^F = A + αE, where α > 0 is a damping parameter and E is a full matrix of all 1s.
Matrix A_α^F is clearly fully indecomposable, has total support, and is irreducible. Hence, the power problem (as well as the centrality problem) on a fully perturbed matrix has a unique solution. On the other hand, matrix A_α^D has total support. Indeed, if A_{i,j} > 0 and i = j, then the main diagonal A_{k,k} for 1 ≤ k ≤ n is positive and contains A_{i,j}. If i ≠ j, then the diagonal A_{i,j}, A_{j,i}, A_{k,k} for 1 ≤ k ≤ n and k ≠ i, j is positive and contains A_{i,j}. The power problem on a diagonally perturbed matrix thus has a solution. Moreover, the solution is unique if A is irreducible, since it is known that for a symmetric matrix A it holds that A is irreducible if and only if A + I is fully indecomposable.6 Interestingly, the diagonal perturbation, besides providing convergence of the method, is useful for incorporating exogenous power in the model. By setting a positive value in the ith position of the diagonal, we are saying that node i has a minimal amount of power that is not a function of the position of the node in the network. We can thus play with the diagonal of the adjacency matrix to assign nodes with potentially different entry levels of exogenous power.
Intuitively, the diagonal perturbation is less invasive than its full counterpart; the former modifies the diagonal elements only, and the latter touches all matrix elements. But how invasive is the perturbation with respect to the resulting power? To investigate this issue, we computed the correlation between original and perturbed power solutions. A simple and intuitive measure of the correlation between two rankings of size n is the Kendall rank correlation coefficient k, which is the difference between the fraction of concordant pairs c (the number of concordant pairs divided by n(n − 1)/2) and that of discordant pairs d in the two rankings: k = c − d. The coefficient runs from −1 to 1, with negative values indicating negative correlation, positive values indicating positive correlation, and values close to 0 indicating independence. We used the following network datasets: a social network among dolphins, the Madrid train bombing terrorist network, a social network of jazz musicians, a network of friendships between members of a karate club, a collaboration network of scholars in the field of network science, and a co-appearance network of characters in the novel Anna Karenina by Leo Tolstoy.
The main outcomes of the current experiment are as follows (see Figure 3): as long as the damping parameter is small, both diagonal and full perturbations do not significantly change the original power; power with diagonal perturbation is closer to original power than power with full perturbation; and the larger the damping parameter, the lower the adherence of perturbed solutions to the original one.
Computing power. Due to the established relationship between the balancing problem and the power problem, we can use known methods for the former in order to solve the latter. The simplest approach for solving Equation (4) is to set up the iterative method
x_{k+1} = A x_k÷, (5)
known as the Sinkhorn–Knopp method (SKM).17 If we set x_0 = e, the vector of all 1s, then the first iteration x_1 = Ae; that is, x_1(i) = ∑_j A_{i,j} is the degree d_i of i. The second iteration x_2 = A(Ae)÷; that is, x_2(i) = ∑_j A_{i,j}/d_j is the sum of reciprocals of the degrees of the neighbors of i. If A has total support, then the SKM converges, or, more precisely, the even and odd iterates of the method converge to power vectors that differ by a multiplicative constant. The convergence is linear with a rate of convergence that depends on the subdominant eigenvalue of the balanced matrix S = DAD (see Theorem 2).17 In some cases, however, the convergence can be very slow. Knight and Ruiz14 proposed a faster algorithm based on Newton’s method (NM) that we now describe according to our setting and notations. In order to solve Equation (4), we apply NM for finding the zeros of the function f: R^n → R^n defined by f(x) = x − Ax÷. It is not difficult to check that

Figure 3. Correlation between original and perturbed powers varying the damping parameter from 0 to 1 on the largest biconnected component of the social network among dolphins (which has total support). The horizontal line corresponds to the correlation with diagonal perturbation and maximum damping. The correlation on the other networks is similar. [Plot: correlation (vertical axis, 0.70 to 1.00) against damping (horizontal axis, 0.0 to 1.0), one curve each for diagonal perturbation and full perturbation.]
Table 1. Complexity of computation of power with different methods: PM (benchmark); SKM (SKM without perturbations for totally supported networks); SKM-D (SKM with diagonal perturbation and damping 0.15); SKM-F (SKM with full perturbation and damping 0.01); NM (NM without perturbations for totally supported networks); and NM-D (NM with diagonal perturbation and damping 0.15).

Network    PM    SKM   SKM-D   SKM-F   NM    NM-D
Dolphin    73    294   300     72      47    30
Madrid     28    416   320     78      46    27
Jazz       42    300   288     78      37    27
Karate     42    –     494     52      –     31
Collab     65    –     9740    30      –     33
Karenina   24    –     1006    32      –     32

and PM; and NM with diagonal perturbation is even faster than NM, and the larger the damping parameter, the faster the method.
Relationship with alternative power measures. Bonacich2 proposed a family of parametric measures depending on two parameters: α and β. If A is the adjacency matrix of the graph, the Bonacich index x is defined as
x = αAe + βAx. (6)
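Equation (6) is a linear system, (I − βA)x = αAe, so the Bonacich index can be computed with any linear solver. The sketch below instead repeats the substitution x ← αAe + βAx, which is an assumption of this example rather than a method taken from the article; it converges only when |β| times the spectral radius of A is below 1. The function name `bonacich` and the parameter values are illustrative.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def bonacich(A, alpha, beta, iterations=200):
    """Fixed-point iteration for x = alpha*A*e + beta*A*x.

    Assumes |beta| * (spectral radius of A) < 1; otherwise the loop diverges.
    """
    n = len(A)
    degree_term = [alpha * d for d in matvec(A, [1.0] * n)]   # alpha * A e
    x = [0.0] * n
    for _ in range(iterations):
        x = [d + beta * ax for d, ax in zip(degree_term, matvec(A, x))]
    return x

# Three-node path A-B-C with a negative beta, the regime the text associates with power.
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
alpha, beta = 1.0, -0.2
x = bonacich(path3, alpha, beta)
print(x)
```

With a negative β, each neighbor's score is subtracted rather than added, so the middle node B, whose neighbors have low scores, ends up ahead of the two end nodes.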
But this first approximation is a member of the family of Bonacich’s measures, with α = 2γ and β = −γ². Since β is negative, we indeed are facing a measure of power. Hence, Bonacich power can be considered as a first-order approximation of power using NM.
Bozzo et al.3 investigated power measures on sets of nodes. Given a node set T let B(T) be the set of nodes whose neighbors all belong to T. Notice that nodes in B(T) do not have connections outside T, hence are potentially at the mercy of nodes in T. We define a power function p such that p(T) = |B(T)| − |T|. Hence, a set T is powerful if it has potential control over a much larger set of neighbors B(T). The power measure is interpreted as the characteristic function of a coalition game played on the graph, and the “Shapley value” of the game, or the average marginal contribution to power carried by a node when it is added to any node set, is proposed as a measure of power for single nodes. Interestingly, the discovered game-theoretic power measure corresponds to the second iteration of SKM for the computation of power as defined by Equation (4); that is, to the sum of reciprocals of neighbors’ degrees.
The study of power has a long history in economics (in its recognition of bargaining power) and sociology (in its interpretation of social power).10 Consider the most basic case where just two actors, A and B, are involved in a negotiation over how to divide one unit of money. Each actor has an alternate option, a backup amount it can collect in case negotiations fail, say, α for A and β for B. A natural prediction, known as “Nash’s bargaining solution,”15 is that the two actors will split the surplus s = 1 − α − β, if any, equally between them; that is, if s < 0 no agreement between A and B is possible, since any division is worse than the backup option for at least one of the parties. On the other hand, if s ≥ 0, then A and B will agree on α + s/2 for A and β + s/2 for B.
A natural extension of the Nash bargaining solution from pairs of actors [...] following, we describe the dynamics that capture such an extension. Let A be the adjacency matrix of an undirected, unweighted graph G. Hence, A_{i,j} = 1 if there is an edge (i, j) in G and A_{i,j} = 0 otherwise. Negotiation among actors is possible only along edges; each pair of actors on an edge negotiates for a fixed amount of €1, and each actor may conclude a negotiation with at most one neighbor (one-exchange rule). For every edge (i, j), define
• R_{i,j} as the amount of “revenue” actor i receives in a negotiation with j.
• L_{i,j} as the amount of revenue actor i receives in the best alternative negotiation, excluding the one with j.
Notice that matrices R and L have the same zero-non-zero pattern as A. More precisely, consider the following iterative process. We start with R_{i,j}^{(0)} = 1/2 for all edges (i, j) and R_{i,j}^{(0)} = 0 elsewhere. Let N(i) be the set of neighbors of node i. For t > 0, the best alternative matrix L^{(t)} at time t is
L_{i,j}^{(t)} = max_{k ∈ N(i)\{j}} R_{i,k}^{(t−1)}.
Let the surplus S_{i,j}^{(t)} = 1 − L_{i,j}^{(t)} − L_{j,i}^{(t)} be the amount for which actors i and j will negotiate at time t; notice that actor i will never accept an offer from j less than his alternate option L_{i,j}^{(t)}, and actor j will never accept an offer from i less than her alternate option L_{j,i}^{(t)}. The profit matrix R^{(t)} at time t is then
R_{i,j}^{(t)} = L_{i,j}^{(t)} + S_{i,j}^{(t)}/2.
Notice that R_{i,j}^{(t)} + R_{j,i}^{(t)} = 1; that is, R_{i,j}^{(t)} and R_{j,i}^{(t)} are the Nash bargaining solution of a negotiation between actors i and j, given their alternate options L_{i,j}^{(t)} and L_{j,i}^{(t)}. Let R be the fixpoint of the iterative process R^{(t)} for growing time t. The “Nash power” x_i of node i is the best revenue of actor i among its neighbors; that is
x_i = max_{j ∈ N(i)} R_{i,j}.
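The bargaining dynamics can be simulated on the small paths used as motivating examples. The sketch below rests on assumptions that should be read as such: the displayed update rules are not legible in this copy, so the code takes the best alternative of i against j to be the largest revenue i obtained from any other neighbor at the previous step (0 if there is none), and splits the surplus equally, R_{i,j} = L_{i,j} + S_{i,j}/2, which is the reading suggested by the surrounding definitions. The function name `nash_power` and the fixed iteration count are hypothetical.

```python
def nash_power(neighbors, iterations=200):
    """Approximate Nash power by iterating the bargaining dynamics.

    neighbors maps each node to the list of its adjacent nodes (undirected graph).
    """
    # R[(i, j)]: revenue of i when negotiating with j; start from an even split.
    R = {(i, j): 0.5 for i in neighbors for j in neighbors[i]}
    for _ in range(iterations):
        # Best alternative of i excluding j, taken from the previous revenues.
        L = {(i, j): max((R[(i, k)] for k in neighbors[i] if k != j), default=0.0)
             for (i, j) in R}
        # Equal split of the surplus 1 - L[i,j] - L[j,i] on top of the alternative.
        R = {(i, j): L[(i, j)] + (1.0 - L[(i, j)] - L[(j, i)]) / 2.0
             for (i, j) in R}
    # Nash power: best revenue of each node among its neighbors.
    return {i: max((R[(i, j)] for j in neighbors[i]), default=0.0) for i in neighbors}

path3 = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
path4 = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

print(nash_power(path3))
print(nash_power(path4))
```

Under these assumptions the three-node path drives B's revenue toward 1 and that of A and C toward 0, while in the four-node path B settles at 3/4, lower than in the three-node path.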
[...] for an actor i depends directly on the revenues of i among its neighbors, which directly depend on the alternate options of i among its neighbors, which inversely depend on the revenues of neighbors of i, which determine the power of neighbors of i. Hence, power of an actor somewhat inversely depends on the power of its neighbors.
Using Kendall correlation, we assessed the overlapping of power, as defined in this article, with centrality and degree, as well as Bonacich power (Bonacich index with negative parameter β), Shapley power (the sum of reciprocals of neighbors’ degrees), and Nash power on the social networks mentioned earlier. The main empirical outcomes are summarized in the following (see Table 2): as expected, both power and centrality are positively correlated with degree, but power is negatively correlated with centrality when the effect of degree is excluded (we used partial correlation); power is positively correlated with Bonacich power, and the association increases as the parameter β declines below 0 down to −1/r, with r the spectral radius of the adjacency matrix of the graph (moreover, the association is greater when the adjacency matrix is perturbed); power is positively correlated with Shapley power, and the association is generally stronger than with Bonacich power; and power is positively correlated with Nash bargaining network power, but the [...] maps the power scores of the nodes of the surveyed networks into a small set of values, with very high frequency for values close to 0, 0.5, and 1. Hence, it is difficult to discriminate different [...]

Table 3. Matrix of correlations among power and centrality measures.

     S      B      P      N      C
S    1.00   0.82   0.90   0.69   0.41
B    0.82   1.00   0.84   0.61   0.46
P    0.90   0.84   1.00   0.72   0.47
N    0.69   0.61   0.72   1.00   0.36
C    0.41   0.46   0.47   0.36   1.00

S, Shapley power; B, Bonacich power; P, power as defined in this article; N, Nash power; C, centrality.

Table 4. The top 10 powerful and central countries in the European natural gas exchange network.

P    TR     DE     IT     ES     HU     RU     BG     BE     AT     UK
     6.26   6.09   5.54   5.50   4.62   4.53   3.99   3.60   3.29   3.09
B    DE     IT     HU     TR     AT     RU     ES     BE     NO     BG
     7.07   4.58   4.06   3.73   3.41   3.37   3.29   3.23   2.92   2.76
S    TR     ES     IT     DE     RU     HU     BG     RO     UK     AT
     2.92   2.70   2.56   2.54   2.46   2.23   1.95   1.67   1.53   1.51
N    ES     TR     BG     RU     IT     HU     UK     RO     DK     LV
     1.00   1.00   1.00   0.87   0.83   0.83   0.75   0.75   0.75   0.75
C    DE     NO     BE     NL     FR     AT     DK     CH     CZ     UK
     1.00   0.71   0.68   0.62   0.56   0.52   0.46   0.45   0.40   0.39

P, power as defined in this article; B, Bonacich power; S, Shapley power; N, Nash power; C, centrality.

Figure 4. Scatterplot of power versus centrality. Vertical and horizontal lines correspond to third quartile.
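The Kendall coefficient k = c − d behind these correlation figures is simple to compute from two score lists. The sketch below is a plain quadratic-time version written for illustration; the function name `kendall` is hypothetical, and tied pairs are counted as neither concordant nor discordant, which is an assumption of this example since the text does not say how ties were treated.

```python
from itertools import combinations

def kendall(a, b):
    """Kendall rank correlation k = c - d between two equally long score lists.

    c and d are the fractions of concordant and discordant pairs among the
    n(n - 1)/2 pairs; tied pairs contribute to neither (an assumption here).
    """
    n = len(a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = n * (n - 1) / 2
    return (concordant - discordant) / pairs

print(kendall([1, 2, 3, 4], [1, 2, 3, 4]))   # identical order
print(kendall([1, 2, 3, 4], [4, 3, 2, 1]))   # reversed order
print(kendall([1, 2, 3, 4], [1, 2, 4, 3]))   # one swapped pair
```

Identical rankings give 1, reversed rankings give −1, and a single swapped pair out of six gives (5 − 1)/6.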
[...] we set Bonacich index parameters α = 1 and β = −0.85/r, where r is the spectral radius of the graph.
In the two-node path, all methods agree to give identical power to both nodes. In the three-node path A−B−C, all methods agree B is the powerful one. Notably, Nash power assigns all power (1) to B and no power (0) to A and C, while the other methods say A and C hold a small amount of power. In the four-node path A−B−C−D, all methods claim B and C are the powerful ones. Moreover, all methods recognize that the power of B in this instance is less than its power in the three-node path. Finally, in the five-node path A−B−C−D−E, all methods discriminate B and D as the most powerful nodes, followed by C and finally A and E, with the only exception of Nash power, which assigns all power (1) to B and D and null power (0) to all other nodes; hence, the central node C has the same power as the peripheral nodes A and E, according to this method. All methods, with the exception of Shapley, notice that the power of B is greater in the five-node path with respect to the four-node path. This is because Shapley is a local method, while the others are global (recursive) methods.
Let us now revisit the natural gas pipeline example. We ranked all countries according to the following power and centrality measures: Shapley power (S), Bonacich power (B), power as defined in this article (P), Nash power (N), and eigenvector centrality (C). Table 3 shows the corresponding Kendall correlation matrix. As expected, P is well correlated with its approximations B and S. Moreover, P is positively correlated with N, but the correlation strength is weaker. Also, the association between P and C is positive but weak and [...]
These associations are mirrored in the top-10 rankings and ratings listed in Table 4, as well as in the scatterplot [...] but not powerful; and there are many countries that are neither powerful nor central outside the rankings. For instance, Italy contracts with nations that are both powerless and peripheral, namely Austria, Switzerland, Croatia, Tunisia, Libya, and Slovenia, with only Austria included in the top-10 power list and only Austria and Switzerland included in the top-10 centrality list (not in the first positions). The ranking according to Nash power is somewhat unusual if compared with the other power measures; for instance, Germany has bargaining power 0.5 and is only in 14th position, tied with the other countries. It is fair to note that the generalized Nash bargaining solution was originally proposed in the context of assignment problems (such as in matching apartments to tenants and students to colleges) and was not suggested as a rating-and-ranking method for nodes in a network. For instance, in balanced matching over the gas network, Italy preferably negotiates with Libya and Turkey with Georgia. In fact, Cook and Yamagishi8 proposed using the negotiation values obtained by each node in such a solution as a structural power measure; see also Easley and Kleinberg10 (chapter 12) for a similar interpretation. According to the experiments we conducted for this article, this interpretation might seem debatable, but further investigation is necessary to reach a solid conclusion.
Conclusion
We proposed a theory on power in the context of networks. The philosophy underlying our notion of power maintains that an actor is powerful if it is connected with many powerless actors. This thesis has its roots and applications [...]
The virtues of our definition of power are: it is a simple, elegant, and understandable measure; it is theo- [...] parametric; and it is global (the power of a node depends on the entire network) and can be approximated with a simple local measure (the sum of reciprocals of node degrees) that has a game-theoretic interpretation and can be efficiently computed on all networks. The definition has limitations as well, mainly that an exact solution exists only on the class of totally supported networks and is not immediately normalizable, so care is needed when comparing power values for nodes in different networks.
References
1. Bayati, M., Borgs, C., Chayes, J., Kanoria, Y., and Montanari, A. Bargaining dynamics in exchange networks. J. Econ. Theory 156 (2015), 417–454.
2. Bonacich, P. Power and centrality: A family of measures. Am. J. Sociol. 92, 5 (1987), 1170–1182.
3. Bozzo, E., Franceschet, M., and Rinaldi, F. Vulnerability and power on networks. Network Sci. 3, 2 (2015), 196–226.
4. Brown, J.B., Chase, P.J., and Pittenger, A.O. Order independence and factor convergence in iterative scaling. Linear Algebra Appl. 190 (1993), 1–38.
5. Brualdi, R.A. Matrices of 0s and 1s with total support. J. Combin. Theory Ser. A 28, 3 (1980), 249–256.
6. Brualdi, R.A. and Ryser, H.J. Combinatorial Matrix Theory, Vol. 39 of Encyclopedia of Mathematics and Its Applications. Cambridge University Press, Cambridge, U.K., 1991.
7. Cook, K.S., Emerson, R.M., Gillmore, M.R., and Yamagishi, T. The distribution of power in exchange networks: Theory and experimental results. Am. J. Sociol. 89, 2 (1983), 275–305.
8. Cook, K.S. and Yamagishi, T. Power in exchange networks: A power-dependence formulation. Social Networks 14 (1992), 245–265.
9. Csima, J. and Datta, B.N. The DAD theorem for symmetric nonnegative matrices. J. Combin. Theory 12, 1 (1972), 147–152.
10. Easley, D. and Kleinberg, J. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, New York, 2010.
11. Emerson, R.M. Power-dependence relations. Am. Sociol. Rev. 27, 1 (1962), 31–41.
12. Franceschet, M. PageRank: Standing on the shoulders of giants. Commun. ACM 54, 6 (2011), 92–101.
13. Kleinberg, J. and Tardos, E. Balanced outcomes in social exchange networks. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (2008), 295–304.
14. Knight, P.A. and Ruiz, D. A fast algorithm for matrix balancing. IMA J. Numer. Anal. 33, 3 (2013), 1029–1047.
15. Nash, J. The bargaining problem. Econometrica 18 (1950), 155–162.
16. Rochford, S.C. Symmetrically pairwise-bargained allocations in an assignment market. J. Econ. Theory 34, 2 (1984), 262–281.
17. Sinkhorn, R. and Knopp, P. Concerning nonnegative matrices and doubly stochastic matrices. Pac. J. Math. 21 (1967), 343–348.
Massimo Franceschet (massimo.franceschet@uniud.it) teaches network science and generative art in the Department of Mathematics, Computer Science, and Physics at the University of Udine, Udine, Italy.
review articles
DOI:10.1145/2934662
Sex as an Algorithm
The Theory of Evolution Under the Lens of Computation
Looking at the mysteries of evolution from a computer science point of view yields some unexpected insights.
BY ADI LIVNAT AND CHRISTOS PAPADIMITRIOU
[...] of Charles Darwin’s The Origin of Species in 1859.8 But of course nothing is that simple: during the first half of the 19th century, several scientists were convinced the diversity of life we see around us must be the result of evolution (see the sidebar on Charles Babbage and the accompanying figure). Darwin’s immense contribution lies in three things: His identification of natural selection as the engine of evolution; his articulation of the common descent hypothesis, stating that different species came from common ancestors, and further implying that all life came from a common source; and the unparalleled force of argument with which he empowered his theory. But of course The Origin was far from the last word on the subject: Darwin knew nothing about genetics, and had no clue about the role of sex in evolution, among several important gaps. On the ultimate reason for [...] ics by appreciating mathematical patterns in the ratios of sibling pea plants exhibiting different characteristics and model building. When his laws were rediscovered 40 years later, their discrete nature was misunderstood as being at loggerheads with Darwin’s
key insights
L OOK AROU ND YO U, and you will be stunned by the ˽˽ Sexual reproduction is nearly ubiquitous
work of evolution. According to Nobel Laureate in nature. Yet, despite a century of intense
research, its evolutionary role and origin
Jacques Monod, a strange thing about evolution is that is still a mystery.
all educated persons think they understand it fairly ˽˽ Recent research at the interface of
evolution and CS has revealed that
well, and yet very few—if any, one may grumble— evolution under sex possesses a
surprising and multifaceted computational
actually do. Understanding evolution is essential: nature: It can be seen as a coordination
game between genes played according
“Nothing in biology makes sense except in the light to the powerful Multiplicative Weights
Update Algorithm; or as a randomized
of evolution,” famously said the eminent 20th century
MICROSCOPE PH OTO BY ARM IN STAU DT
More so than most scientific fields, the theory of ˽˽ Computational models and considerations
are becoming an indispensable tool for
evolution has a sharp beginning: the publication unlocking the secrets of evolution.
sition of a population changes over the generations. This mathematical theory of population genetics is introduced briefly in the sidebar “The Equation that Reconciled Darwin and Mendel.” It is key to what is called the “modern evolutionary synthesis,” the 20th-century view of evolution, because it proposed one way of unifying Darwinism and Mendelism.

During the near-century since then, the study of evolution has flourished into a mature, comprehensive, and prestigious scientific discipline, while over the past two decades it has been inundated by a deluge of molecular data, a vast scientific gold mine that informs (and often challenges) its tenets. And yet, despite the towering accomplishments of modern evolutionary biology, there are several important questions that are beyond our current understanding:
• What is the role of sex in evolution? Reproduction with recombination is almost ubiquitous in life (even bacteria exchange genetic material), while obligate asexual species appear to be rare evolutionary dead ends. And yet there is no agreement among the experts as to what makes sex so advantageous.
• What exactly is evolution optimizing, if anything? With all the evolution-inspired optimization heuristics coded by computer scientists (as we discuss next), this question, also very much contemplated by biologists over the decades, comes to the fore.
• The paradox of variation. Genetic variation in humans, and in many other species, is much higher than the theory predicted due to selection; and assuming the variation is neutral has its problems too.
• Are mutations completely random? We know they are far from uniformly distributed in the genome, but can they be the results of elaborate genetic mechanisms?

much center stage in life, the basis of a fantastic variety of behavior and structure.

Evolution and Computer Science
Over the past 70 years, computer scientists, starting with von Neumann,[37] have been inspired and intrigued by evolution. During the 1950s, computer scientists working in optimization developed local search heuristics: Start with a random solution and repeat the following step: If there is a “mutation” of this solution that is better than the current one, change to that, until a local optimum is discovered. By “mutation” (much more often the word “neighbor” is used), we mean a solution differing from the present one in a very small number of features; in the traveling salesman problem, for example, a mutation could change two, or three, edges of the tour to form a new tour. This process is repeated many times from random starts, a stratagem that can be seen as a sequential way of maintaining a population (see Papadimitriou and Steiglitz[32] for a survey on the 1980s).

This basic idea of local search was enhanced in the 1980s by a thermodynamic metaphor,[20] to help the algorithm escape local optima and barriers: Simulated annealing, as this variant of local search is called, allows the adoption of even a disadvantageous mutation, albeit with a probability decreasing with the disadvantage, and with time. A further variant called go with the winners[1] is closer to evolution in that it keeps a population of solutions, teleporting the individuals that are stuck at local optima to the more promising spots. Notice that all these heuristics are inspired by asexual evolution (no recombination between solutions happens); heuristics of this genre have been used successfully in many realms of practice, and there are several practically important hard problems, such as graph partitioning, for which such heuristics are competitive with the best known.
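The two asexual heuristics described above, plain local search and simulated annealing, can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the bit-string problem, the names `toy_cost` and `flip_neighbors`, and the cooling schedule `t0 / (1 + k)` are hypothetical choices made for the example and do not come from the article or from the cited references.

```python
import math
import random


def local_search(cost, neighbors, start):
    """Hill climbing: move to the best improving neighbor until none exists."""
    current = start
    while True:
        better = [n for n in neighbors(current) if cost(n) < cost(current)]
        if not better:
            return current  # no improving "mutation": a local optimum
        current = min(better, key=cost)


def simulated_annealing(cost, neighbors, start, steps=2000, t0=1.0, rng=None):
    """Local search that sometimes accepts a worse neighbor.

    The acceptance probability shrinks with the disadvantage (delta)
    and with time (the temperature falls as k grows).
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    current = best = start
    for k in range(steps):
        temperature = t0 / (1 + k)  # hypothetical cooling schedule
        candidate = rng.choice(neighbors(current))
        delta = cost(candidate) - cost(current)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if cost(current) < cost(best):
            best = current  # remember the best solution seen so far
    return best


def toy_cost(bits):
    """Hypothetical rugged cost: distance of the 1-count from 5, plus a penalty for odd counts."""
    ones = sum(bits)
    return abs(ones - 5) + (2 if ones % 2 else 0)


def flip_neighbors(bits):
    """All solutions that differ from `bits` in exactly one position."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


start = (0,) * 10
hill = local_search(toy_cost, flip_neighbors, start)
annealed = simulated_annealing(toy_cost, flip_neighbors, start)
```

From the all-zeros start, plain local search stops immediately, because every single-bit flip is worse under this toy cost. The annealing variant is allowed to cross such barriers, and it returns the best solution it has visited.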
ideal fitness function by a polynomially large population of genotypes in polynomially many generations[35] through learning by mutations. Notably, there is no recombination (sex) in his theory, even though it can be added for a modest advantage.[17] Several natural classes of functions are evolvable in this sense; in fact, functions susceptible to a limited weak form of learning called statistical learning.[18] In the section “A Game between Genes,” we discuss another interesting connection between machine learning algorithms and evolution.

The Role of Sex
Sex is nearly universal in life: it occurs in animals and plants by the coming together of sperm and egg, in fungi by the fusion of hyphae, and even in bacteria:[34] Two bacterial cells can pair up, for example, and build a bridge between them through which genes are transferred. Many species engage in asexual reproduction, or in selfing, at some times, but also engage in sexual reproduction at other times, keeping their genotypes well shuffled. In contrast, species that do not exchange genes in any form or manner, called “obligate asexuals,” are extremely rare, inhabiting sparse, recent twigs of the tree of life, coming from sexual ancestral species that lost their sexuality, and heading toward eventual extinction without producing daughter species.[22]

Not only is sex essentially universal, but it seems to be very much center stage in life, the basis of a fantastic variety of behavior and structure: from bacterial conjugation to the intense molecular machinery of meiosis (cell division producing gametes), from flower coloration to bird courtship dances, from stag fights all the way to the drama of human passion, much of life seems to revolve around sex. So why? What role might sex play in evolution?

One common answer is that sex generates vast genetic diversity, and hence it must help evolution. But, just as sex puts together genetic combinations, it also breaks them down: a highly successful genotype will be absent from the next generation, as children inherit half their genes from each parent. To say the role of sex is to create particular, highly favorable genetic combinations is like watching a man catch fish only to toss them back to sea, and concluding that he wants to bring food to his family’s table. (Incidentally, the designers of genetic algorithms are well aware of this downside of sex, and often allow the most successful individuals into the next generation, a stratagem known as “elitism,” which however cannot be easily imitated by nature.)

Evolutionary theorists have labored for about a century to find other explanations for the role of sex in evolution, but all 20th century explanations are valid only under specific conditions, contradicting the prevalence of sex in nature.[9][a]

This is not a small problem. Imagine, for example, that even though much of the terrestrial world is green, we had no clue why leaves exist. That would have been a pretty big gap in our understanding of nature. Not knowing the role of sex is an even bigger gap, because far more life forms exchange genes than photosynthesize. It is no wonder the role of sex has been called “the queen of problems” in evolutionary biology.[6]

Since sex breaks down genetic combinations, it has been mainly thought in evolution that effective selection acts on individual alleles,[38] that is, each (non-neutral) allele is either beneficial or detrimental on its own. According to this line of thinking, two main forces drive allele frequencies: selection acting on alleles as independent actors (where alleles are often assumed to be making additive contributions to fitness), and random genetic drift (chance sampling effects on allele frequency, as discussed previously). The interaction between alleles, within and between loci, even though it has been of interest in population genetics from the start,[10,39] has played a secondary role, often being treated as a mere correction to the above, under the term “epistasis.” A few years ago, while working with biologists Marcus Feldman and Jonathan Dushoff and computer scientist Nicholas Pippenger, we asked whether interactions between alleles could be crucial to the understanding of the role of sex in a yet unexplored manner.[23–26] Based on the standard equations used to describe how genotype frequencies change over generations (see the Darwin and Mendel sidebar), we demonstrated an important difference between sex and asex: In asexual evolution, the best combination of alleles always prevails. In the presence of sex, however, natural selection favors “mixable” alleles, those alleles that, even though they may not participate in any truly great genetic combinations, perform adequately across a wide variety of different combinations.[23–26] To put it differently, in the hypothetical three-by-three fitness landscape in the sidebar, the winner of asexual evolution will be the largest entry of the fitness matrix (in this case, 1.05). In contrast, sexual evolution will favor, roughly speaking, those alleles (rows and columns) with larger average value; where “average” takes into account the prevalence of these genotypes in the population, as we will explore.[b]

Sex as a Randomized Algorithm
One of the most central and striking themes of algorithms research in the past few decades has been the surprising power of randomization.[29] Paradoxically, evoking chance is often the safest and most purposeful and effective way of solving a computational problem. For one, it helps avoid the worst case, as in Quicksort. Second, sampling from a distribution D helps decide between competing hypotheses about D: Randomized algorithms for software testing, as well as for testing primality, or validity of polynomial identities, are like this.

Evolution under sex can be seen as an instance of a randomized algorithm of the latter type. Suppose we want to design a hypothetical evolutionary experiment for determining whether a new allele of a particular gene performs better than its alternative, across all genetic combinations. If the population is asexual, this could be done by inserting this mutation in the genome of one individual, and gauging the lineage thus founded to see if it thrives. This kind of sampling

[a] See the appendix available in the ACM Digital Library (dl.acm.org) under Source Material for a more extensive bibliography on this and other subjects.
[b] The original paper[24] refers to the unweighted average fitness as mixability, instead of the more natural average weighted by genotype frequency.
is very inefficient, because we sample from a small pool (the genotypes that happen to be available in the population), and must repeat the insertion many times, in many individuals. But if the population is sexual, then by inserting the mutation once, after log n generations, where n is the number of genes with which this particular allele interacts,[33] we will be sampling from all possible genetic combinations that could in principle be constructed. Sex enables evolution to sample quickly from the entire space of genetic combinations, in the distribution under which they appear in the population. What is more, evolution under sex not only decides among the competing hypotheses (which allele performs better), but also implements this decision (eventually, and with high probability, it will fix the winner).

Finally, for yet another take on explaining the ubiquity of sex in life, in terms and concepts familiar to computer scientists, view The Red-Blue Tree Theorem sidebar.

A Game between Genes
The search in population genetics for a quantity that is optimized by natural selection has a long history. Fisher wanted a theory for evolution with a mathematical law as clean and central as the second law of thermodynamics,[10] while Wright pointed out that the frequency of an allele in a diploid locus changes in the direction that increases the population’s mean fitness.[40] Later, investigators tried their hand again at looking for a Lyapunov function that will describe evolution, albeit with little success.

Our search for an analytical maximization principle involving mixability ended with a surprise: We did not answer the question “What is evolution optimizing?” but, perhaps more interestingly, we identified the quantity that each gene seems to be optimizing during evolution under sex. Together with Erick Chastain and Umesh Vazirani,[7] we focused on the standard equations described in the Darwin and Mendel sidebar, in a particular evolutionary regime known as weak selection.[30] Weak selection is the widely held assumption that fitness differences between genotypes are small. The fitness of a genotype g in this regime is written $F_g = 1 + s\Delta_g$, where $s$ is small and $\Delta_g$ is the differential fitness of the genotype, ranging in $[-1, 1]$.

Working with the weak selection assumption, and after some algebraic manipulation, we noticed the equations of the evolution of a population under sex are mathematically equivalent to a novel process, which entails an entirely different way of looking at evolution:
• The process is a game between the genes of the organism. Recall that a game is a mathematical model of the interaction between several players, each player having a set of available actions or pure strategies, and a utility function: the objective the player is trying to maximize in the game, a mapping from choices of actions, one for each player, to a real number. The game here is repeated, that is, the same precise game is played over and over for many rounds, with each round corresponding to a generation of the population.
• The available actions of each gene/player correspond to the gene’s alleles.
• At every generation, each gene chooses and plays a mixed strategy: it randomizes over its actions, that is to say, its alleles. The probability assigned to each allele in this mixed strategy corresponds to the frequency of the allele in the population during this generation.
• The intricacy of game theory is mostly due to conflicts between the objectives of the players. But in the game played by the genes there are no conflicts (this is not Dawkins’s selfish gene metaphor): All players have the same utility function, which is the fitness of the genotype resulting from the players’ choices. Games of identical utilities are called coordination games in game theory, and are the simplest possible kind. They are of interest only in cases in which the players are cognitively weak, or cannot communicate effectively.
• In a repeated game, the players must update their mixed strategies from one round to the next, based on the outcome of the previous rounds.

Here lies the biggest surprise: The update rule used by the genes in this game is identical to a venerable learning algorithm, well known to computer scientists for its prowess in working surprisingly well in a multitude of difficult problems and contexts: the multiplicative weights update algorithm (MWU),[2] also known in machine learning as “no-regret learning” or “hedge” (see more in box “The Experts Problem” in the online appendix). MWU changes the frequency $x_i$ of the i-th allele of the gene as follows

$$x_i \leftarrow x_i (1 + s m_i), \qquad (1)$$

where $m_i$ is the expected differential fitness, positive or negative, of the i-th allele in the current gene pool. This quantity $m_i$ is a measure of what we have called the mixability of allele i, its ability to form fit combinations with alleles of other genes in the current genetic mix. To summarize, at each generation in sexual evolution, each gene boosts the frequency of each of its alleles by a factor that increases with the mixability of this allele in the current generation. Naturally, the quantities resulting from the equation are normalized appropriately so as to add to one.

This is a completely new way of looking at evolution. And it is a productive view, because it gets more interesting: Let us look back at the update Equation 1 and ask once again the question: This choice of the new probabilities for the alleles by the gene, is it optimizing something? For once, the answer is very clean: Yes, the choice of allele frequencies by the gene shown in Equation 1 optimizes the following function, specific to this gene, of the allele frequencies:

$$\Phi(x) = \sum_i x_i M_i - \frac{1}{s} \sum_i x_i \ln x_i. \qquad (2)$$

Here $M_i$ denotes the cumulative relative fitness of allele i, that is, the sum of the $m_i$ in Equation 1 over all generations up to and including $t - 1$. It is easy to notice that $\Phi$ is a strictly concave function, and thus has one maximum, and this maximum can be checked by routine calculation to be exactly the new frequencies as updated in Equation 1! Now notice that the second term of $\Phi$ is plainly the entropy of the distribution $x$, a well-known measure of a distribution’s diversity.

There is much that is unexpected and evocative here, but perhaps most surprising of all is that this radically new interpretation of evolution was lurking for almost a century so close to the surface of these well-trodden equa-
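The update rule in Equation 1 can be tried out on a toy coordination game between two genes. What follows is a minimal sketch, not the authors' implementation: the three-by-three matrix `DELTA` of differential fitness values, the selection strength `S`, the uniform starting frequencies, and the helper names `mixabilities` and `mwu_step` are all hypothetical choices made for illustration.

```python
def mixabilities(delta, x, y):
    """Expected differential fitness of each allele against the other gene's current mix.

    delta[i][j] is the differential fitness of the genotype that pairs
    allele i of the first gene with allele j of the second gene.
    """
    m_rows = [sum(yj * dij for yj, dij in zip(y, row)) for row in delta]
    m_cols = [sum(xi * delta[i][j] for i, xi in enumerate(x)) for j in range(len(y))]
    return m_rows, m_cols


def mwu_step(freqs, mixability, s):
    """Apply x_i <- x_i (1 + s m_i), then normalize so the frequencies add to one."""
    boosted = [x * (1 + s * m) for x, m in zip(freqs, mixability)]
    total = sum(boosted)
    return [b / total for b in boosted]


# Hypothetical differential fitness values in [-1, 1].
# Row 2 contains the single best genotype (1.0) but mixes poorly overall;
# row 1 is never outstanding yet has the highest row average.
DELTA = [
    [0.5, -0.5, 0.2],
    [0.3, 0.3, 0.3],
    [-1.0, 1.0, -0.4],
]
S = 0.01  # hypothetical weak-selection strength

uniform = [1 / 3, 1 / 3, 1 / 3]
m_rows, m_cols = mixabilities(DELTA, uniform, uniform)
first_x = mwu_step(uniform, m_rows, S)  # first gene after one generation

# Both genes play the repeated game, each updating against the other's current mix.
x, y = list(uniform), list(uniform)
for _ in range(200):
    m_x, m_y = mixabilities(DELTA, x, y)
    x, y = mwu_step(x, m_x, S), mwu_step(y, m_y, S)
```

After the first generation in this toy setup, the allele in row 1 gains the most frequency even though the largest single entry of the matrix sits in row 2. That matches the description of mixability as good average performance across the current genetic mix rather than membership in the single best combination.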
happen at a C base which comes before a G after that C is chemically modified (methylated);[11] methylation is known to be the result of complex enzymatic processes. As to rearrangement mutations, there are powerful agents of mutagenicity (creation of mutations) in the genome, such as transposable elements: DNA sequences prone to “jump” from one place of the genome to another, carrying other DNA sequences with them.[13] A key step in mammalian pregnancy (decidualization), for instance, was the result of massive evolutionary rewiring of about 1,500

There is a mismatch between heuristics and evolution. Heuristics should strive to create populations that

an image of evolution that is even more explicitly algorithmic. It also means that genes interacting in one organism can leave hereditary effects on the organism’s offspring.[22] It no longer matters that a lucky genetic combination created by sex is doomed to vanish from the face of the earth (that the fisherman of our earlier metaphor throws the fish away): It may have achieved a lasting effect on the population through mutagenicity. Finally, the biological mechanisms affecting mutations may themselves evolve.
complexity of local search;[15] this theory examines how difficult it is for a local search process to reach a local optimum, and the conclusion has in many cases been: “pretty hard.” By applying this point of view to selection on interacting genes, we showed that there are n-gene systems, in which the fitness is the sum total of contributions of certain pairs of alleles (that is, the next step beyond selection on single alleles), for which fixation takes a number of generations proportional to $2^n$ to happen. A stronger result can also be obtained under the well-accepted complexity assumption that local search is intractable in general (see Johnson et al.[15] for details). The implication is that, if gene interactions are taken into account, fixation may take much longer than in the regime of selection on individual genes.

Does this insight explain the mystery of variation? Not yet, because our analysis so far has been disregarding two other powerful forces in evolution, besides mutation and selection, acting on variation: the finiteness of the population, and heterozygosity (a diploid organism carrying two different alleles of a gene).

First, finiteness. Because the number of individuals carrying the alleles in question is finite, say N, the number of individuals carrying each allele at each generation evolves as in a kind of a random walk within the confines of [0, N], and, ignoring selection, this results in fixation after O(N) generations.[19] Second, diploidy introduces the possibility of overdominance, in which organisms with two different alleles of a gene are more fit than organisms with two copies of one allele or two copies of the other. In overdominance, the equations of selection point to stable variation, with both alleles enjoying stably high frequency in the population.

How these three effects, of finite population, of heterozygosity, and of selection acting on combinations of alleles across loci, interact with one another is an important subject for further research.

Epilogue
pristine solutions to difficult problems such as communication, cooperation, vision, locomotion, and reasoning, among so many more. One is tempted to ask: What algorithm could create all this in just $10^{12}$ steps? The number $10^{12}$ (one trillion) comes up because this is believed to be the number of generations since the dawn of life $3.5 \cdot 10^{9}$ years ago (notice that most of our ancestors could not have lived for much more than a day). And it is not a huge number: cellphone processors do many more steps in an hour.

Over the past decade, computer scientists and evolutionary biologists working together have come up with new insights about central open problems surrounding evolution (including, rather surprisingly, a proposed answer to the “algorithm” question) by looking at evolution from a computational point of view. And, of course, many more questions, inviting similar investigation, were opened up in the process.

Additional background and literature appears in an online appendix available with this article in the ACM Digital Library (dl.acm.org) under Source Material.

References
1. Aldous, D. and Vazirani, U. ‘Go with the winners’ algorithms. In Proceedings of the 35th Annual IEEE Symposium on Foundations of Computer Science (1994), 492–501.
2. Arora, S., Hazan, E. and Kale, S. The multiplicative weights update method: A meta-algorithm and applications. Theory of Computing 8, 1 (2012), 121–164.
3. Athreya, K. and Ney, P. Branching Processes. Springer, 1972.
4. Babbage, C. The Ninth Bridgewater Treatise. 2nd edn. John Murray, London, 1838.
5. Barton, N.H., Novak, S. and Paixão, T. Diverse forms of selection in evolution and computer science. In Proceedings of the National Academy of Sciences 111, 29 (2014), 10398–10399.
6. Bell, G. The Masterpiece of Nature: The Evolution and Genetics of Sexuality. University of California Press, Berkeley, CA, 1982.
7. Chastain, E., Livnat, A., Papadimitriou, C. and Vazirani, U. Algorithms, games, and evolution. In Proceedings of the National Academy of Sciences 111, 29 (2014), 10620–10623.
8. Darwin, C. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. Murray, London, 1859.
9. Feldman, M.W., Otto, P. and Christiansen, F.B. Population genetic perspectives on the evolution of recombination. Annual Review of Genetics 30 (1997), 261–295.
10. Fisher, R.A. The Genetical Theory of Natural Selection. The Clarendon Press, Oxford, U.K., 1930.
11. Fryxell, K.J. and Moon, W.-J. CpG mutation rates in the human genome are highly dependent on local GC content. Molecular Biology and Evolution 22 (2005), 650–658.
12. Goldberg, D. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, Reading, MA, 1989.
13. Graur, D. and Li, W.-H. Fundamentals of Molecular Evolution. Sinauer Associates, Sunderland, MA, 2000.
14. Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. U Michigan
18. Kearns, M. Efficient noise-tolerant learning from statistical queries. J. ACM 45, 6 (1998), 983–1006.
19. Kimura, M. and Ohta, T. The average number of generations until fixation of a mutant gene in a finite population. Genetics 61, 3 (1969), 763.
20. Kirkpatrick, S., Gelatt, Jr., C.D. and Vecchi, M.P. Optimization by simulated annealing. Science 220 (1983), 671–680.
21. Lewontin, R.C. and Hubby, J.L. A molecular approach to the study of genic heterozygosity in natural populations; amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54 (1966), 595–609.
22. Livnat, A. Interaction-based evolution: How natural selection and nonrandom mutation work together. Biology Direct 8, 1 (2013), 24.
23. Livnat, A., Feldman, M.W., Papadimitriou, C. and Pippenger, N. On the advantage to sexual species in diversification rates. Unpublished manuscript.
24. Livnat, A., Papadimitriou, C., Dushoff, J. and Feldman, M.W. A mixability theory for the role of sex in evolution. In Proceedings of the National Academy of Sciences 105, 50 (2008), 19803–19808.
25. Livnat, A., Papadimitriou, C. and Feldman, M.W. An analytical contrast between fitness maximization and selection for mixability. J. Theoretical Biology 273, 1 (2011), 232–234.
26. Livnat, A., Papadimitriou, C., Pippenger, N. and Feldman, M.W. Sex, mixability, and modularity. In Proceedings of the National Academy of Sciences 107, 4 (2010), 1452–1457.
27. Lynch, V.J., Leclerc, R.D., May, G. and Wagner, G.P. Transposon-mediated rewiring of gene regulatory networks contributed to the evolution of pregnancy in mammals. Nature Genetics 43 (2011), 1154–1159.
28. Mitchell, M. An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA, 1996.
29. Motwani, R. and Raghavan, P. Randomized Algorithms. Cambridge University Press, 1995.
30. Nagylaki, T., Hofbauer, J. and Brunovský, P. Convergence of multilocus systems under weak epistasis or weak selection. J. Mathematical Biology 38, 2 (1999), 103–133.
31. Nevo, E., Beiles, A. and Ben-Shlomo, R. The evolutionary significance of genetic diversity: Ecological, demographic and life history correlates. Lecture Notes in Biomathematics 53 (1984), 13–213.
32. Papadimitriou, C. and Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity. Dover, 1998.
33. Rabani, Y., Rabinovich, Y. and Sinclair, A. A computational view of population genetics. In Proceedings of the 27th Annual ACM Symposium on Theory of Computing (1995), 83–92.
34. Stearns, S.C. and Hoekstra, R.F. Evolution: An Introduction. Oxford University Press, New York, 2005.
35. Valiant, L. Probably Approximately Correct: Nature’s Algorithms for Learning and Prospering in a Complex World. Basic Books, 2013.
36. Valiant, L.G. Evolvability. J. ACM 56, 1 (2009), 3.
37. Von Neumann, J. and Burks, A.W. Theory of self-reproducing automata. IEEE Transactions on Neural Networks 5, 1 (1966), 3–14.
38. Williams, G.C. Adaptation and Natural Selection, 8th edition. Princeton University Press, 1996.
39. Wright, S. Evolution in Mendelian populations. Genetics 16 (1931), 97–159.
40. Wright, S. The distribution of gene frequencies in populations. In Proceedings of the National Academy of Sciences of the United States of America, 23, 6 (1937), 307–320.

Adi Livnat ([email protected]) is a Senior Lecturer in the Department of Evolutionary and Environmental Biology, and Institute of Evolution at the University of Haifa, Israel.

Christos Papadimitriou ([email protected]) is the C. Lester Hogan Professor in the Computer Science Division of the University of California at Berkeley.

Copyright held by authors. Publication rights licensed to ACM. $15.00.
DOI:10.1145/2891406

The future success of these systems depends on more than a Netflix challenge.

BY DIETMAR JANNACH, PAUL RESNICK, ALEXANDER TUZHILIN, AND MARKUS ZANKER

Recommender Systems—Beyond Matrix Completion

THE USE OF recommender systems has exploded over the last decade, making personalized recommendations ubiquitous online. Most of the major companies, including Google, Facebook, Twitter, LinkedIn, Netflix, Amazon, Microsoft, Yahoo!, eBay, Pandora, Spotify, and many others use recommender systems (RS) within their services.

critical technology in several companies. For example, Netflix reports that at least 75% of its downloads and rentals come from their RS, thus making it of strategic importance to the company.[a]

In some ways, the systems that produce these recommendations are remarkable. They incorporate a variety of signals about characteristics of the users and items, including people’s explicit or implicit evaluations of items. The systems process these signals at a massive scale, often under real-time constraints. Most importantly, the recommendations are of significant quality on average. In empirical tests, people choose the suggested items far more often than they choose suggested items based on unpersonalized benchmark algorithms that are based on overall item popularity.

In other ways, the systems that produce these recommendations are sometimes remarkably bad. Occasionally, they make recommendations that are embarrassing for the system, such as recommending to a faculty member an introductory book from the “for dummies” series on a topic she is expert in. Or, they continue recommending items the user is no longer interested in. Shortcomings like these motivate ongoing research both in industry and academia, and recommender systems are a very active field of research today.

To provide an understanding of the state of the art of recommender systems, this article starts with a bit of history, culminating in the million-dollar Netflix challenge. That challenge led to a formulation of the recommendation problem as one of matrix completion: Given a matrix of users by items, with item ratings as cells, how well can an algorithm predict the values in some

• Today, the scientific community operationalizes the research problem mainly on principles from information retrieval and machine learning, leading to a well-defined but narrow problem characterization.
• We briefly review the history of the field, report on the recent advances, and propose a more comprehensive research approach that considers both the consumer's and the provider's perspective.

[a] http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html
A Brief History
Many fields have contributed to recommender systems research, including information systems, information retrieval (IR), machine learning (ML), human-computer interaction (HCI), and even more distant disciplines like marketing and physics. The common starting point is that recommenda-

(IF) systems that scan and filter text documents based on personal user preferences or interests. The idea of using a computer to filter a stream of incoming information according to the preferences of a user dates back to the 1960s, when first ideas were published under the term “selective dissemination of information.”17 Early systems used explicit keywords that were pro-

systems based on these techniques are typically called “content-based filtering” approaches.

Leveraging the opinions of others. As early as 1982, then ACM president Peter J. Denning complained about “electronic (email) junk” and advocated the development of more intelligent systems that help to organize, prioritize, and filter the incoming
NOVEMBER 2016 | VOL. 59 | NO. 11 | COMMUNICATIONS OF THE ACM 95
review articles
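As a rough illustration of the matrix-completion setup described in the introduction, the following Python sketch hides a random fraction of the known rating cells, fits a deliberately simple bias baseline on the remaining cells, and reports RMSE on the hidden ones. The toy ratings, the helper names `split_cells` and `fit_bias_baseline`, and the 25% holdout fraction are assumptions made for this example; the baseline is not the method of any Netflix Prize team.

```python
import math
import random
from collections import defaultdict


def rmse(pairs):
    """Root-mean-square error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def split_cells(ratings, holdout_fraction=0.25, seed=0):
    """Hide a random fraction of the known (user, item, rating) cells."""
    rng = random.Random(seed)
    cells = list(ratings)
    rng.shuffle(cells)
    cut = int(len(cells) * holdout_fraction)
    return cells[cut:], cells[:cut]  # (train, held_out)


def fit_bias_baseline(train):
    """Hypothetical baseline: global mean plus user and item offsets."""
    mu = sum(r for _, _, r in train) / len(train)
    by_user, by_item = defaultdict(list), defaultdict(list)
    for user, _, r in train:
        by_user[user].append(r - mu)
    user_bias = {u: sum(d) / len(d) for u, d in by_user.items()}
    for user, item, r in train:
        by_item[item].append(r - mu - user_bias[user])
    item_bias = {i: sum(d) / len(d) for i, d in by_item.items()}

    def predict(user, item):
        raw = mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)
        return min(5.0, max(1.0, raw))  # clamp to the 1-5 star scale

    return predict


# Invented toy data: (user, item, stars). Unlisted cells are unknown.
ratings = [
    ("ann", "A", 5), ("ann", "B", 4), ("ann", "C", 1),
    ("bob", "A", 4), ("bob", "B", 5), ("bob", "D", 2),
    ("cat", "A", 5), ("cat", "C", 2), ("cat", "D", 1),
    ("dan", "B", 4), ("dan", "C", 1), ("dan", "D", 2),
]

train, held_out = split_cells(ratings)
predict = fit_bias_baseline(train)
score = rmse([(predict(u, i), r) for u, i, r in held_out])
print(f"held-out cells: {len(held_out)}, RMSE: {score:.3f}")
```

Note that the hidden cells are sampled only from ratings that users chose to provide, so a score of this kind says how well past ratings are reproduced, not how well the model would do on items the user never rated.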
outperform Netflix’s in-house system by 10% on RMSE (see the sidebar). Interest in this competition was huge. More than 5,000 teams registered for the competition and the prize was finally awarded in 2009. Substantial progress was made with respect to the application of ML approaches for the rating prediction task. In particular, various forms of matrix factorization as well as ensemble learning techniques were further developed in the course of the competition and proved to be highly successful.

Beyond Matrix Completion
At the conclusion of the Netflix Prize competition, it might have been plausible to think that recommender systems were a solved problem. After all, many very talented researchers had devoted themselves for an extended period of time to improve the prediction of withheld ratings. The returns on that effort seemed to be diminishing quite rapidly, with the final small improvements that were sufficient to win the prize coming from combining the efforts of many independent contestants.

However, it turns out that recommender systems are far from a solved problem. Here, we first give examples of why optimizing the prediction accuracy for held-out historical ratings might be insufficient or even misleading. Then we discuss selected quality factors of recommender systems not covered by the matrix completion task at all and give examples of recent research that goes beyond matrix completion.

Pitfalls of matrix completion setups. Postdiction ≠ prediction. Predicting held-out matrix entries is really predicting the past rather than the future. If the held-out rating entries are representative of the hidden rating entries, then the distinction does not matter. However, in many recommender settings, the held-out ratings are not representative of the missing ratings.

One reason is the missing ratings are generally not missing at random. Even for items that people have experienced, if rating requires any effort at all they are more likely to rate items that they love or hate rather than those that they feel lukewarm about. Moreover, people are more likely to try items they expect to like. For example, in one empirical study, researchers found that ratings of songs randomly assigned to users had a very different distribution than ratings of songs users had chosen to rate.26

As a result, algorithms that predict well on held-out ratings that users provided may predict poorly on a random set of items the user has not rated. This can mean that algorithms tuned to perform well on past ratings are not the best algorithms for recommending in the real world.7

In addition, the matrix completion problem setup is not suitable to assess the value of reminding users of items they have already purchased or consumed in the past. However, such repeated recommendations can be a desired functionality of recommenders, for example, in domains like music recommendation or the recommendation of consumables.

In the end, the standardized evaluation setup and the availability of public rating datasets made it attractive for researchers to focus on accuracy measures and the matrix completion setup and may have lured them away from investigating the value of other information sources and alternative ways of evaluating the utility of recommendations.

Today, a growing number of academic studies try to evaluate the performance of their methods using A/B tests on live customers in real industrial settings (for example, Dias et al.,9 Garcin et al.,12 and Gorgoglione et al.16). This is a very positive trend that requires cooperation from a commercial vendor who may not agree to make data publicly available, thus making it difficult for results to be checked or reproduced by others.

Not all items and errors are equally important. RMSE, the evaluation metric used in the Netflix Prize, equally weights errors of prediction on all items. However, in most practical settings items with low predicted ratings are never shown to users, so it hardly matters if the correct prediction for those items is 1, 2, or 3 stars. Intuitively, it is more appropriate in these domains to optimize a ranking criterion that focuses on having the top items correct.

In recent years, a number of learning-to-rank approaches have been proposed in the literature to address this issue, which aim to optimize (proxies of) rank-based measures. When applying such IR measures in the recommendation domain, the problem remains that the “ground truth” (that is, whether or not an item is actually relevant for a user) is available only for a tiny fraction of the items. The results of an empirical evaluation can depend on how the items with unknown ground truth are treated when determining the accuracy metrics. In addition, the problem of items not missing at random also exists for learning-to-rank approaches and at least some of them exhibit a strong bias to recommend blockbusters to everyone, which might be of little value for the users.19

In some domains, like music recommendation, it is also important to avoid very “bad” recommendations as they can greatly impact the user’s quality perception.6,21 Omitting some “good” recommendations is not nearly so harmful, which would argue for risk-averse algorithm designs that mostly recommend items with a high average rating and low rating variance. Recommending only such generally liked, non-controversial items might however not be particularly helpful for some of the users.

System quality factors beyond accuracy. The Netflix Prize with its focus on accuracy has undoubtedly pushed recommender systems research forward. However, it has also partially overshadowed many other important challenges when building a recommender system and today even Netflix states “there are much better ways to help people find videos to watch than focusing only on those with a high predicted star rating.”15 Next, we give examples of quality factors other than single-item accuracy, review how recent research has approached these problems, and sketch open challenges.

Novelty, diversity, and other components of utility. Making good rating predictions for as-yet unrated items is almost never the ultimate goal. The true goal of providing recommendations is rather some combination of a certain value for the user and profit for the site. In some domains, user ratings may represent a general quality assessment but still not imply the item should be recommended. As an example, consider the problem of recommending restaurants to travelers. Most people dining at a Michelin-starred location may give it five stars, but budget travelers may be
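The point that RMSE weights all errors equally, while users only ever see the top of the list, can be made concrete with a small Python sketch. The two sets of predictions, the relevance threshold of 4 stars, and the helper names are invented for illustration and do not come from the article; precision at k stands in here for the broader family of rank-based measures.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error over all items, each weighted equally."""
    squared = [(predicted[item] - actual[item]) ** 2 for item in actual]
    return math.sqrt(sum(squared) / len(squared))


def precision_at_k(predicted, actual, k, relevant_from=4):
    """Share of the k highest-predicted items the user actually likes."""
    top_k = sorted(predicted, key=predicted.get, reverse=True)[:k]
    hits = sum(1 for item in top_k if actual[item] >= relevant_from)
    return hits / k


# Hypothetical true ratings of one user (1-5 stars).
actual = {"a": 5, "b": 5, "c": 3, "d": 2, "e": 1, "f": 1}

# Model A: small errors overall, but it lifts a mediocre item to the top.
model_a = {"a": 4.2, "b": 3.9, "c": 4.4, "d": 2.0, "e": 1.0, "f": 1.0}

# Model B: large errors on items that would never be shown,
# yet the two best items are ranked first.
model_b = {"a": 5.0, "b": 4.8, "c": 3.0, "d": 3.5, "e": 3.0, "f": 2.8}

for name, model in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: RMSE={rmse(model, actual):.2f}, "
        f"precision@2={precision_at_k(model, actual, 2):.2f}"
    )
```

In this constructed case the model with the lower RMSE produces the worse top-2 list, which is the kind of disagreement between the two criteria that the text describes.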
sites, such as Spotify, ask listeners for their current mood or adapt the recommendations depending on the time of the day. Online shopping sites look at the very recent navigation behavior and infer short-term shopping goals of their visitors. Mobile recommender systems finally constitute a special case of context-aware recommenders, as more and more sensor information becomes available, for example, about the user’s location and local time.

From a research perspective, context is a multifaceted concept that has been studied in various research disciplines. Over the last 10 years significant progress has been made also in the field of context-aware recommenders and the first comparative evaluations and benchmark datasets were published.31 Nonetheless, much more work is required to fully understand this multifaceted concept and to go beyond what is called the representational approach with its predefined and fixed set of observable attributes.

Interacting with users. Coming back to our restaurant recommender, let us assume we have extended its capabilities and it now considers the user’s time and geographical location when making recommendations. But what happens if the user, in contrast to her past preferences, is in the mood to try out something different, for example, a vegan restaurant. How would she tell the system? And if she did not like it afterward, how would she inform the system not to recommend vegan restaurants again in the future?

In many application domains, short-term preferences must be elicited and recommending cannot be a one-shot process of determining and presenting a ranked list of items. Instead, various forms of user interactions might be required or useful to put the user in control. Examples of typical interaction patterns are interactive preference elicitation and refinement procedures, the presentation of explanations and persuasive arguments, or the provision of mechanisms that help users explore the space of available options. The design of the user experience and the provided means of interacting with the system can be a key quality factor for the success of the recommendation service.

In the research literature, conversational recommender systems were proposed to elicit user preferences interactively and engage in a “propose, feedback, and revise” cycle with users.37 They are employed in domains where consumers are confronted with high involvement buying decisions, such as financial services, real estate, or tourism. Most approaches use forms-based dialogues to let users choose from predefined options or use natural language processing techniques to cope with free-text or oral user input. Recent alternative approaches also include more emotional ways of expressing preferences, for example, based on additional sensors to determine the user’s emotional state or by supporting alternative ways of user input such as selecting from pictures.30 Furthermore, the integration of better recommendation functionality in voice-controlled virtual assistants like Apple’s Siri represents another promising path to explore by the RS research community.

One key insight in conversational systems is that users may not initially understand the space of available items, and so do not have well-formed preferences that can be expressed in terms of attributes of items. An interactive or visualization-based recommender can help users explore the item space and incrementally reveal their preferences to the system. For example, critiquing-based interfaces invite users to say “Show me more like restaurant A, but cheaper.” Although these approaches attracted considerable interest in research, they are not yet mainstream in practice.27

Overall, with interactive systems, the design challenge is no longer simply one of choosing items to recommend but also to choose a sequence of conversational moves as proposed by Mahmood et al.24 who developed an adaptive conversational recommendation system for the tourism domain.

Manipulation resistance. Moving on from the specific problems of how the preferences are acquired and which algorithms are used in our restaurant recommender, the question could arise of whether we can trust that the ratings of the community are honest and fair. Interested parties might manipulate the output of a recommender to their advantage, for example, by creating fake profiles and ratings.

In the long run, customers who were misled by such manipulated reviews would distrust the recommendations made by the system and in the worst case the online service as a whole. Being resilient against such manipulations can therefore be crucial to the long-term success of a system.

There has been considerable research on manipulation resistance, where resistance is defined as attackers having only a limited ability to change the rating predictions that are made. Most of it identifies archetypal attack strategies and proposes ways to detect and counteract them. For example, one “shilling” or “profile injection” attack creates profiles for fake users, with ratings for many items close to the overall average for all users. Then, these fake users give top (or bottom) ratings to the items that are being manipulated.22 This line of research has identified algorithms that are more or less resistant to particular attack strategies.29

In recent research, the textual reviews provided by users on platforms like TripAdvisor are used instead or in combination with numerical ratings to understand long-term user preferences. These textual reviews do not only carry more detailed information than the ratings, they can also be automatically analyzed to detect fake entries.20 Research suggests that in some domains the fraction of manipulated entries can be significant.

Generally, to resist manipulation, algorithms take some countermeasure that discards or reduces the influence given to ratings or reviews that are suspected of not being trustworthy. However, this has the effect of throwing away some good information. There is a lower bound on the good information that must be discarded in any attempt to prevent attacks by statistical means of noticing anomalous patterns.34 No easy solution to this problem seems to exist, unless attackers can be prevented from injecting fake profiles.

Trust and loyalty. Manipulation resistance is not the only requirement for building a trustworthy system. Let us return to our restaurant recommender and assume that our user has eventually decided to try out one of our recommendations. Thus, from a provider perspective, we were successful in driving the user’s short-term behavior. But what if the user is dissatisfied afterward with her choice and in particular feels that our recommendations were biased and not objective?

As a result, she might not trust the service in the future and even the most relevant recommendations might be ignored. In the worst case, she will even distrust the competence and integrity of the service provider.6 An important quality factor of a recommendation system is that it is capable of building long-term loyalty through repeated positive experiences.

In e-commerce settings, users can rightly assume that economic considerations might influence what is placed in the recommendation lists and can be worried that what is being proposed is not truly optimal for them but for the seller. Transparency is therefore an important factor that has been shown to positively influence the user’s trust in a system: What data does the RS consider? How does the data lead to recommendations? Explanations put the focus on providing additional information in order to answer these questions and justify the proposed recommendations.

In the research literature, a number of explanation strategies have been explored over the past 10 years. Many of them are based on “white-box” strategies that expose how a system derived the recommendations.11 However, many challenges remain open. One is how to explain recommendations that are created by complex ML models. Another is how to leverage additional information such as the browsing history or the user’s social graph to make recommendations look more plausible or familiar to the user.32

From Algorithms to Systems
Our brief survey on the history of the field indicates that recommender systems have arrived at the Main Street with broad industry interest and an active research community. Furthermore, we have seen the recommender systems community address a variety of topics beyond rating prediction and item ranking, for example, concerning the system’s user interface or long-term effects.

Beyond the computer science perspective. Many of the proposals discussed earlier focus on algorithmic aspects, for example, how to combine context information with matrix completion approaches, how to find the most “informative” items that users should be asked to rate, or how to design algorithms that balance diversity and accuracy in an optimal way. As suspected in Wagstaff38 for the ML community, the RS research community, to some extent, still seems too focused on benchmark datasets and abstract performance metrics. Whether or not the reported improvements actually matter in the real world for a certain application domain, and the needs of the users (are they, for instance, actually looking for new items to discover or are they seeking “more of the same” for comparison purposes): this question is seldom asked.

Due to their high practical relevance, RS are naturally a field of research in disciplines other than computer science (CS), including information systems (IS), e-commerce, consumer research, or marketing. Research work like Xiao and Benbasat39 that develop a comprehensive conceptual model of the characteristics, use, and impact of e-commerce “recommendation agents” are largely unnoticed in the CS literature. In their work, the authors develop 28 propositions that center around two practically relevant questions in e-commerce settings: How can RS help to improve the user’s decision process and quality? and Which factors influence the user’s adoption of and trust toward the system? The process of actually generating the recommendations, which is the focus in the CS field, is certainly important, but only one of several factors that contribute to the success of an RS.

Research questions in the context of a RS should therefore be viewed from a more comprehensive perspective as sketched in Figure 1. Whenever new technological proposals are made, we should ask which specific need or requirement in a given domain are addressed. Making better buying decisions can be one need from the user’s perspective; guiding customers to other parts of the product spectrum can be a desired effect from the provider’s side. Correspondingly, these goals determine the choice of the evaluation measure that is chosen to assess the effectiveness of the approach.

Figure 1. A more comprehensive view on the recommendation problem. (Diagram labels: User, with needs, e.g., information filtering, item discovery, decision making, and data, e.g., preferences, context, demographics, long-term profile; Service Provider/Business, with business goals and desired impact on users, e.g., customer loyalty, revenue increase, sales diversification; Recommender System, with design decisions on algorithms, interactivity, explanations, ..., and data, e.g., ratings, item features, social network, ...; Environment.)

At the end, the goals of a recommendation system can be very diverse, ranging from improved decision making over item filtering and discovery, to increased conversion or user engagement on the platform. Abstract, domain-independent accuracy measures as often used today are typically insufficient to assess the true value of a new technique.15

Focusing on business- and utility-oriented measures and the consideration of novelty, diversity, and serendipity aspects of recommendations, as discussed earlier, are important steps into that direction. In any case, which measures are actually chosen for the evaluation, always has to be justified by the specific goals that should be achieved with the system. Furthermore, in offline experimentations, multi-metric evaluation schemes, application-specific measures, and the consideration of recommendation biases represent one way of assessing desired and potentially undesired effects of a RS on its users.19

Figure 2. A new characterization of the recommendation problem. (Diagram labels: Recommendation System; axes "actions" and "time"; Set of actions: acquire preferences, present recommendations, display explanations, adapt recommendation strategy, ...; Purpose and Goals: User/Provider, Short-term/Long-term.)

However, to better understand the effectiveness of a RS and its impact on users, more user-centric and utility-oriented research is generally required within the CS community and
the algorithmic works should be better connected with the already existing insights from neighboring fields.

Putting the user back in the loop. A recommender system is usually one component within an interactive application. The minimal interaction level provided by such a component is that a list of recommendations is displayed and users can select them for inspection or immediate consumption, for example, on media streaming platforms.

RS have one of their roots in the field of human-computer interaction (HCI) and the design of the user interface, the choice of the supported forms of interactivity, or the selection of the content to be displayed can all have an impact on the success of a recommender. However, the amount of research dedicated to these questions is comparably low, particularly when compared to the huge amount of research on item-ranking algorithms.

Therefore, our second tenet is that the CS community should put more effort on the HCI perspective of RS, as has been advocated earlier, for example, in Konstan and Riedl21 and McNee et al.28 Current research largely relies on explicit ratings and automatically observable user actions (often called implicit feedback) as preference indicators. Many real-world systems allow users to explicitly specify their preferences, for example, in terms of preferred item categories. Recommendation components on websites could be much more interactive and act, for example, in the e-commerce domain as “virtual advisers”39 and social actors that ask questions, adapt their communication to the current user, provide explanations when desired, present alternative or complementary shopping proposals and, in general, put the user more into control and allow for new types of interactions.

When looking at the recommendations provided by Amazon.com on their websites, we can see that various forms of user interaction already exist that are underexplored in academia. Amazon.com, for example, provides multiple recommendation lists on its landing page. Amazon's system also supports explanations for the made recommendations and even lets the user indicate if a past user action observed by the system (for example, a purchase) should no longer be considered in the recommendation process. However, many questions, such as how to design such interactive elements in the best possible way, how much cognitive load for the user is acceptable, or how the system can stimulate or persuade people to do certain actions are largely unexplored.

Furthermore, in case a system supports various forms of interactivity and is at the same time capable of acquiring additional information from the user, additional algorithmic and computational challenges arise. An intelligent system might, for example, decide on the next conversational move, or whether to display an explanation or not, depending on the current state of the interaction or the estimated expertise and competence of the user. Some approaches in that direction were proposed in the literature in the past, but they often come at expense of considerable ramp-up costs in terms of knowledge engineering and they might appear to be quite static if they have no built-in learning capabilities.10,24

Finally, mobile and wearable devices have become the personal digital assistants of today. With the recent developments in speech recognition, gesture-based interactions, and a multitude of additional sensors of these devices, new opportunities arise regarding how we interact with recommender systems.

Toward a more comprehensive characterization of the recommendation task. In the research literature, an often-cited definition of the recommendation problem is to find a function that outputs a relevance score for each item given information about the user profile and her contextual situation, “content” information about the items, and information about preference patterns in the user community.1 Although the development of even better techniques for item selection and ranking will remain at the core of the research problem, the discussions here indicate this definition seems too narrow. To conclude our considerations related to the HCI perspective on recommenders and the more comprehensive consideration of the interplay between users, organizations, and the recommendation system, we propose a new characterization of the recommendation problem (Figure 2).

A new problem characterization. A recommendation problem has the following three components: an overall goal that governs the selection and ranking of items; a set of available actions centered on the presentation of recommended items; and an optimization timeframe:
• The overall goal constitutes the operationalized measure or a set of
measures that should be optimized by an appropriate selection and ranking of items from a (large) set. Optimizing a specific rank measure can be such a goal, but more utility-oriented goals and corresponding measures like user satisfaction, decreased decision efforts, revenues, or loyalty might be equally important. Generally, the goals can be derived from the user’s perspective, the provider’s perspective, or both.
• Depending on the application domain, a set of actions is available for the recommendation system to take. The central action typically is the selection and presentation of a set of items. Additional possible moves are varying its strategy to recommend items, providing specific explanations or other communication content, requesting feedback or alternative variants of user input. These conversational moves are building blocks for goal achievement. The selection of the most helpful next action and its timing can be the result of a reasoning process itself.
• The timeframe or optimization horizon signifies the time window over which the goal should be optimized. The explicit consideration of the time dimension allows us to differentiate between single one-shot interactions and longer time spans that can be more relevant to businesses and users.

The recommendation problem finally can be defined as: Find a sequence of conversational actions and item recommendations for each particular user that optimizes the overall goal over the specified timeframe.

Summary
Recommender systems have become a natural part of the user experience in today’s online world. These systems are able to deliver value both for users and providers and are one prominent example where the output of academic research has a direct impact on the advancements in industry.

search. For example, the huge amounts of user data and preference signals that become available through the Social Web and the Internet of Things not only leads to technical challenges such as scalability, but also to societal questions concerning user privacy.

Based on our reflections on the developments in the field, we finally emphasize the need for a more holistic research approach that combines the insights of different disciplines. We urge that research focuses even more on practical problems that matter and are truly suited to increase the utility of recommendations from the viewpoint of the users.

References
1. Adomavicius, G. and Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowledge and Data Engineering 17, 6 (2005), 734–749.
2. Adomavicius, G. and Tuzhilin, A. Context-aware recommender systems. Recommender Systems Handbook. Springer, 2011, 217–253.
3. Billsus, D. and Pazzani, M.J. Learning collaborative information filters. In Proceedings ICML ‘98 (1998), 46–54.
4. Breese, J.S., Heckerman, D. and Kadie, C.M. Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings UAI ‘98 (1998), 43–52.
5. Castells, P., Wang, J., Lara, R. and Zhang, D. Introduction to the special issue on diversity and discovery in recommender systems. ACM Trans. Intell. Syst. Technology 5, 4 (2014), 52:1–52:3.
6. Chau, P.Y.K., Ho, S.Y., Ho, K.K.W. and Yao, Y. Examining the effects of malfunctioning personalized services on online users’ distrust and behaviors. Decision Support Syst. 56 (2013), 180–191.
7. Cremonesi, P., Garzotto, F. and Turrin, R. Investigating the persuasion potential of recommender systems from a quality perspective: An empirical study. ACM Trans. Interact. Intell. Syst. 2, 1 (2012), 11:1–11:41.
8. Denning, P.J. ACM president’s letter: Electronic junk. Commun. ACM 25, 3 (Mar. 1982), 163–165.
9. Dias, M.B., Locher, D., Li, M., El-Deredy, W. and Lisboa, P.J. The value of personalised recommender systems to e-business: A case study. In Proceedings RecSys ‘08 (2008), 291–294.
10. Felfernig, A., Friedrich, G., Jannach, D. and Zanker, M. An integrated environment for the development of knowledge-based recommender applications. Int. J. Electron. Commerce 11, 2 (2006), 11–34.
11. Friedrich, G. and Zanker, M. A taxonomy for generating explanations in recommender systems. AI Magazine 32, 3 (2011), 90–98.
12. Garcin, F., Faltings, B., Donatsch, O., Alazzawi, A., Bruttin, C. and Huber, A. Offline and online evaluation of news recommender systems at swissinfo.ch. In Proceedings RecSys ‘14 (2014), 169–176.
13. Ghose, A., Ipeirotis, P.G. and Li, B. Designing ranking systems for hotels on travel search engines by mining user-generated and crowdsourced content. Marketing Science 31, 3 (2012), 493–520.
14. Goldberg, D., Nichols, D., Oki, B. and Terry, D. Using collaborative filtering to weave an information

CHI ‘95 (1995), 194–201.
19. Jannach, D., Lerche, L., Kamehkhosh, I. and Jugovac, M. What recommenders recommend: An analysis of recommendation biases and possible countermeasures. User Modeling and User-Adapted Interaction (2015), 25:1–65.
20. Jindal, N. and Liu, B. Opinion spam and analysis. In Proceedings WSDM ‘08, (2008), 219–230.
21. Konstan, J. and Riedl, J. Recommender systems: From algorithms to user experience. User Modeling and User-Adapted Interaction 22, 1-2 (2012), 101–123.
22. Lam, S.K. and Riedl, J. Shilling recommender systems for fun and profit. In Proceedings of WWW ‘04, (2004), 393–402.
23. Linden, G., Smith, B. and York, J. Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 7, 1 (2003), 76–80.
24. Mahmood, T., Ricci, F. and Venturini, A. Improving recommendation effectiveness: Adapting a dialogue strategy in online travel planning. J. of IT & Tourism 11, 4 (2009), 285–302.
25. Malone, T.W., Grant, K.R., Turbak, F.A., Brobst, S.A. and Cohen, M.D. Intelligent information-sharing systems. Commun. ACM 30, 5 (May 1987), 390–402.
26. Marlin, B.M. and Zemel, R.S. Collaborative prediction and ranking with non-random missing data. In Proceedings RecSys ‘09 (2009), 5–12.
27. McGinty, L. and Reilly, J. On the evolution of critiquing recommenders. Recommender Systems Handbook, Springer, 2011, 419–453.
28. McNee, S.M., Riedl, J. and Konstan, J.A. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Proceedings CHI ‘06, (2006), 1097–1101.
29. Mobasher, B., Burke, R., Bhaumik, R. and Williams, C. Toward trustworthy recommender systems: An analysis of attack models and algorithm robustness. ACM Trans. Internet Technology 7, 4 (Oct. 2007).
30. Neidhardt, J., Seyfang, L., Schuster, R. and Werthner, H. A picture-based approach to recommender systems. J. of IT & Tourism 15 (2015), 1–21.
31. Panniello, U., Tuzhilin, A. and Gorgoglione, M. Comparing context-aware recommender systems in terms of accuracy and diversity. User Modeling and User-Adapted Interaction 24, 1-2 (2014), 35–65.
32. Papadimitriou, A., Symeonidis, P. and Manolopoulos, Y. A generalized taxonomy of explanations styles for traditional and social recommender systems. Data Min. Knowl. Discovery 24, 3 (2012), 555–583.
33. Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P. and Riedl, J. Grouplens: An open architecture for collaborative filtering of netnews. In Proceedings of CSCW ‘94 (1994), 175–186.
34. Resnick, P. and Sami, R. The information cost of manipulation-resistance in recommender systems. In Proceedings RecSys ‘08 (2008), 147–154.
35. Schafer, J.B., Konstan, J. and Riedl, J. Recommender systems in e-commerce. In Proceedings ACM EC ‘99 (1999), 158–166.
36. Shardanand, U. and Maes, P. Social information filtering: Algorithms for automating “word of mouth.” In Proceedings CHI ‘95 (1995), 210–217.
37. Shimazu, H. Expertclerk: Navigating shoppers’ buying process with the combination of asking and proposing. In Proceedings IJCAI ‘01 (2001), 1443–1448.
38. Wagstaff, K. Machine learning that matters. In Proceedings ICML (2012), 529–536.
39. Xiao, B. and Benbasat, I. E-commerce product recommendation agents: Use, characteristics, and impact. MIS Q. 31, 1 (Mar. 2007), 137–209.

Dietmar Jannach is a Chaired Professor of Computer Science at TU Dortmund, Germany.

Paul Resnick is the Michael D. Cohen
In this article, we have briefly re- tapestry. Commun. ACM (1992), 61–70. Collegiate Professor of Information at the University of
15. Gomez-Uribe, C.A. and Hunt, N. The Netflix Michigan School of Information, Ann Arbor, MI.
viewed the history of this multidis- Recommender System: Algorithms, business value,
Alexander Tuzhilin ([email protected]) is the
ciplinary field and looked at recent and innovation. ACM Trans. Manage. Inf. Syst. 6, 4
Leonard N. Stern Professor of Business in the Stern
(2015), 13:1–13:19.
efforts in the research community School of Business, New York University, NY.
16. Gorgoglione, M., Panniello, U. and Tuzhilin, A.
to consider the variety of factors that The effect of context-aware recommendations Markus Zanker ([email protected]) is an associate
on customer purchasing behavior and trust. In professor of computer science at Free University
may influence the long-term success Proceedings RecSys ‘11 (2011), 85–92. of Bozen-Bolzano, Italy.
17. Hensley, C.B. Selective dissemination of information
of a recommender system. The list of (SDI): State of the art in May, 1963. In Proceedings of
open issues and success factors is still AFIPS ‘63 (Spring), 1963, 257–262.
18. Hill, W., Stead, L., Rosenstein, M. and Furnas,
far from complete and new challenges G. Recommending and evaluating choices in a Copyright held by authors.
arise constantly that require further re- virtual community of use. In Proceedings Publication rights licensed to ACM. $15.00.
Technical Perspective
FPGA Compute Acceleration Is First About Energy Efficiency
By James C. Hoe

A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services
By Andrew Putnam, Adrian M. Caulfield, Eric S. Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, Michael Haselman, Scott Hauck, Stephen Heil, Amir Hormati, Joo-Young Kim, Sitaram Lanka, James Larus, Eric Peterson, Simon Pope, Aaron Smith, Jason Thong, Phillip Yi Xiao, and Doug Burger
Technical Perspective
If I Could Only Design One Circuit …
By Kurt Keutzer
To view the accompanying paper, visit doi.acm.org/10.1145/2996864

NEURAL-INSPIRED COMPUTING MODELS have captured our imagination from the very beginning of computer science; however, victories of this approach were modest until 2012 when AlexNet, a “deep” neural net of eight layers, achieved a dramatic improvement on the image classification problem. One key to AlexNet’s success was its use of the increased computational power offered by graphics processing units (GPUs), and it’s natural to ask: Just how far can we push the efficient computing of neural nets?
Computing capability has advanced with Moore’s Law over these last three decades, but integrated circuit design costs have grown nearly as fast. Thus, any discussion of novel circuit architectures must be met with a sobering discussion of design costs. That said, a neural net accelerator has two big things going for it. First, it is a special-purpose accelerator. Since the end of single-thread performance scaling due to power density issues, integrated circuit architects have searched for clever ways to exploit the increasing transistor counts afforded by Moore’s Law without increasing power dissipation. This has led to a resurgence of special-purpose accelerators that are able to provide 10–100x better energy efficiency than general-purpose processors when accelerating their special functions, and which consume practically no power when not in use.
Second, a neural net accelerator can accelerate a broad range of applications. Deep neural nets have begun to realize the promise that has intrigued so many for so long: a single, neuron-inspired computational model that offers superior results on a diverse variety of problems. In particular, modern deep neural net models are winning competitions in computer vision, speech recognition, and text analytics. Without exaggeration, the list of victories achieved through the use of deep neural nets grows every week.
Like many other machine learning approaches, neural net development has two phases. The training phase is essentially an optimization problem in which parameter weights of neural net models are adjusted to minimize the error of the neural net on its training set. This is followed by the implementation or inference phase, in which the resulting neural net is deployed in its target application, such as a speech recognizer in a cellphone.
Training neural nets is a highly distributed optimization problem in which interprocessor-communication costs quickly dominate local computational costs. On the other hand, the implementation of neural nets in embedded applications, such as cellphones, calls out for a special-purpose, energy-efficient accelerator. Thus, if I could cajole my circuit designer colleagues into designing only one circuit, it would surely be a special-purpose, energy-efficient accelerator that is flexible enough to provide efficient implementations of the growing family of neural net models. This is the goal of DianNao (diàn nǎo, Chinese for computer, or, literally “electric brain”).
The DianNao accelerator family comprehensively considers the problem of designing a neural net accelerator, and the following paper shows a deep understanding of both neural net implementations and the issues in computer architecture that arise when building an accelerator for them. Neural net models are evolving rapidly, and a significant new neural network model is proposed every month, if not every week. Thus, a computer architect building an accelerator for neural nets must be familiar with their variety. A specializer-architecture that isn’t sufficiently flexible to accommodate a broad range of neural net models is certain to become quickly outdated, wasting the extensive chip design effort.
The DianNao family also engages the issues associated with building a processor architecture for a neural net accelerator and puts a particularly strong focus on efficiently supporting the memory access patterns of neural net computations. This includes minimizing both on-chip and off-chip memory transfers. Other members of the DianNao family include DaDianNao, ShiDianNao, and PuDianNao. DaDianNao (big computer) focuses on the challenges of efficiently computing neural nets with one billion or more model parameters. ShiDianNao (vision computer) is further specialized to reduce memory access requirements of Convolutional Neural Nets, a neural net family that is used for computer vision problems. While the number of problems solved by neural nets grows every week, some might wonder: Is this a fundamental change in the field, or will the pendulum swing back to favor a broader range of machine learning approaches? With the PuDianNao (general computer) architecture, the architects hedge their bets on this question by providing an accelerator for more traditional machine learning algorithms.
Despite, or perhaps because of, DianNao’s two Best Paper Awards, some readers may think that building a neural network accelerator is just an academic enterprise. These doubts should be allayed by Google’s announcement of the Tensor Processing Unit, a novel neural network accelerator deployed in their datacenters. These processors were recently used to help AlphaGo win at Go. It may be quite some time before we learn of TPU’s architecture, but details on the DianNao family are only a page away.

Kurt Keutzer ([email protected]) is a professor of electrical engineering and computer science at the University of California at Berkeley.

Copyright held by author.
Name        Process (nm)  Peak performance (GOP/s)  Peak power (W)  Area (mm²)  Applications
DianNao     65            452                       0.485           3.02        Neural networks
DaDianNao   28            5585                      15.97           67.73       Neural networks
ShiDianNao  65            194                       0.32            4.86        Convolutional neural networks
PuDianNao   65            1056                      0.596           3.51        Seven representative machine learning techniques
network techniques (e.g., deep learning), and inherits the broad application scope of neural networks.

2.1. Architecture
DianNao has the following components: an input buffer for input neurons (NBin), an output buffer for output neurons (NBout), and a third buffer for synaptic weights (SB), connected to a computational block (performing both synapse and neuron computations) which we call the Neural Functional Unit (NFU), and the control logic (CP); see Figure 1.

Figure 1. Accelerator architecture of DianNao. (Block labels: Control Processor (CP), Instructions, DMA and Inst. ports on each buffer, NBout, SB, the three NFU stages NFU-1, NFU-2, and NFU-3, and the Memory Interface.)

Neural Functional Unit (NFU). The NFU implements a functional block of Ti inputs/synapses and Tn output neurons, which can be time-shared by different algorithmic blocks of neurons. Depending on the layer type, computations at the NFU can be decomposed in either two or three stages. For classifier and convolutional layers: multiplication of synapses × inputs, additions of all multiplications, sigmoid.
for (int nnn = 0; nnn < Nn; nnn += Tnn) { // tiling for output neurons
  for (int iii = 0; iii < Ni; iii += Tii) { // tiling for input neurons
    for (int nn = nnn; nn < nnn + Tnn; nn += Tn) {
      for (int n = nn; n < nn + Tn; n++)
        sum[n] = 0;
      for (int ii = iii; ii < iii + Tii; ii += Ti)
        for (int n = nn; n < nn + Tn; n++)
          for (int i = ii; i < ii + Ti; i++)
            sum[n] += synapse[n][i] * neuron[i];
      for (int n = nn; n < nn + Tn; n++)
        neuron[n] = sigmoid(sum[n]);
} } }
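The loop nest above can be checked against an untiled reference with a small Python sketch. This is only an illustration under stated assumptions: the function names `classifier_naive` and `classifier_tiled` are hypothetical, outputs go to a separate list instead of overwriting `neuron`, and partial sums are kept across input tiles with the sigmoid applied once at the end, which is an assumption about the intended behavior rather than a transcription of the listing. The `min(...)` bounds are added so tile sizes need not divide the layer sizes evenly.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def classifier_naive(synapse, neuron_in):
    """Untiled reference: one dot product and one sigmoid per output neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, neuron_in))) for row in synapse]


def classifier_tiled(synapse, neuron_in, Tnn, Tii, Tn, Ti):
    """Same layer, visited in the tiled order of the listing above.

    Assumption: partial sums persist across input tiles (iii) and the
    sigmoid is applied only after every input has been accumulated.
    """
    Nn, Ni = len(synapse), len(neuron_in)
    sums = [0.0] * Nn
    for nnn in range(0, Nn, Tnn):                          # tiling for output neurons
        for iii in range(0, Ni, Tii):                      # tiling for input neurons
            for nn in range(nnn, min(nnn + Tnn, Nn), Tn):  # Tn neurons per NFU pass
                for ii in range(iii, min(iii + Tii, Ni), Ti):
                    for n in range(nn, min(nn + Tn, Nn)):
                        for i in range(ii, min(ii + Ti, Ni)):
                            sums[n] += synapse[n][i] * neuron_in[i]
    return [sigmoid(s) for s in sums]


random.seed(0)
weights = [[random.uniform(-1, 1) for _ in range(13)] for _ in range(10)]
inputs = [random.uniform(-1, 1) for _ in range(13)]
reference = classifier_naive(weights, inputs)
tiled = classifier_tiled(weights, inputs, Tnn=4, Tii=6, Tn=2, Ti=3)
print(all(math.isclose(a, b) for a, b in zip(reference, tiled)))
```

Tiling changes only the order in which synapses and inputs are touched, not the arithmetic result, which is why the two functions agree; the benefit claimed in the text comes from reusing the small input and partial-sum tiles while they sit in on-chip buffers.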
factor Tnn for output neurons partial sums. The layer memory behavior is now dominated by synapses. In a classifier layer, all synapses are usually unique, and thus there is no reuse within the layer. Overall, tiling drastically reduces the total memory bandwidth requirement of the classifier layer, and we observe a ∼50% reduction in our empirical study.5

Figure 3. Snapshot of DianNao’s layout.
Layer   Nx   Ny   Kx  Ky  Ni   No   Description
CONV1   500  375  9   9   32   48   Street scene parsing (CNN)12 (e.g., identifying “building,” “vehicle,” etc.)
POOL1   492  367  2   2   12   –
CLASS1  –    –    –   –   960  20
CONV2*  200  200  18  18  8    8    Detection of faces in YouTube videos (DNN),24 largest NN to date (Google)
CONV3   32   32   4   4   108  200  Traffic sign identification for car navigation (CNN)39
POOL3   32   32   4   4   100  –
CLASS3  –    –    –   –   200  100
CONV4   32   32   7   7   16   512  Google Street View house numbers (CNN)38
CONV5*  256  256  11  11  256  384  Multi-Object recognition in natural images (DNN),16 winner 2012 ImageNet competition
POOL5   256  256  2   2   256  –
CONV, convolutional; POOL, pooling; CLASS, classifier.
Nx × Ny is the size of an input feature map, Kx × Ky is the size of a convolutional/pooling window, and Ni and No are numbers of input and output feature maps, respectively.
* Indicates private kernels.
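As a worked example of how the table's parameters relate, the following Python sketch derives an output feature map size and a synapse count from Nx, Ny, Kx, Ky, Ni, and No. The helper names are hypothetical, and the formulas are assumptions, not taken from the text: a stride of 1 with no padding (which is consistent with POOL1's 492 × 367 input following CONV1), one Kx × Ky kernel per input/output feature-map pair for shared kernels, and one such kernel per output position for the private-kernel layers marked with *.

```python
def output_map_size(nx, ny, kx, ky, stride=1):
    """Output feature map size for an unpadded sliding window (assumed stride 1)."""
    return ((nx - kx) // stride + 1, (ny - ky) // stride + 1)


def synapse_count(kx, ky, ni, no, out_positions=1):
    """Number of distinct synaptic weights in a convolutional layer.

    Shared kernels: out_positions stays 1, since every output position
    reuses the same weights. Private kernels: pass the number of output
    positions, since each position has its own weights (assumption).
    """
    return kx * ky * ni * no * out_positions


# CONV1: shared kernels
conv1_out = output_map_size(500, 375, 9, 9)
conv1_synapses = synapse_count(9, 9, 32, 48)

# CONV2*: private kernels, one kernel set per output position
conv2_out = output_map_size(200, 200, 18, 18)
conv2_synapses = synapse_count(18, 18, 8, 8, out_positions=conv2_out[0] * conv2_out[1])

print(conv1_out, conv1_synapses)
print(conv2_out, conv2_synapses)
```

Under these assumptions CONV1 needs on the order of 10^5 weights while CONV2* needs several hundred million, which illustrates why layers with private kernels put far more pressure on synapse storage and bandwidth.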
energy cost of main memory accesses. We tried to artificially set the energy cost of the main memory accesses in both the SIMD and accelerator to 0, and we observed that the average energy reduction of the accelerator increases by more than one order of magnitude, in line with previous results.
We have also explored different parameter settings for DianNao in our experimental study, where we altered the size of NFU as well as sizes of NBin, NBout, and SB. For example, we evaluated a design with Tn = 8 (i.e., the NFU only has 8 hardware neurons), and thus 64 multipliers in NFU-1. We correspondingly reduced widths of all buffers to fit the NFU. As a result, the total area of that design is 0.85 mm², which is 3.59x smaller than for the case of Tn = 16.

Figure 4. Speedups of DianNao over SIMD. (Y-axis: Speedup, logarithmic scale from 0.01 to 10000; series: DianNao over SIMD; x-axis: the CONV, POOL, and CLASS benchmark layers and GeoMean.)
Figure 6. DaDianNao architecture: tile-based organization of a node (left) and tile architecture (right). (Diagram labels: tiles around a central eDRAM router, HT2.0 north and south links, and within each tile an NFU with 16 input neurons, 16 output neurons, a data path to SB, and eDRAM banks Bank0 to Bank3.)
Figure 7. Snapshot of DaDianNao’s node layout. (Layout labels: numbered tiles, Central Block, and the HT0 to HT3 controllers and PHYs.)
average; a 64-node system achieves a speedup of 450.65x over the NVIDIA K20M GPU and reduces energy by 150.31x on average.6

4. SHIDIANNAO: A LOW-POWER ACCELERATOR FOR CONVOLUTIONAL NEURAL NETWORK
DaDianNao targets high-performance ML applications, and integrates eDRAMs in each node to avoid main memory accesses. In fact, the same principle is also applicable to embedded systems, where energy consumption is a critical dimension that must be taken into account. In a recent study, we focused on image applications in embedded systems, and designed a dedicated accelerator (ShiDianNao9) for a state-of-the-art deep learning technique called CNN.26
In a broad class of CNNs, it is assumed that each neuron (of a feature map) shares its weights with other neurons, making the total number of weights far smaller than in fully connected networks. For instance, a state-of-the-art CNN has 60 million weights21 versus up to 1 billion24 or even 10 billion for state-of-the-art deep networks. This simple property has profound implications: we know that the highest energy expense is related to memory behavior, in particular main memory (DRAM) accesses, rather than computation.40 Due to the small memory footprint of weights in CNNs, it is possible to store a whole CNN within a small on-chip SRAM next to the functional units, and as a result, there is no longer a need for DRAM memory accesses to fetch the CNN model (weights) in order to process each input.
The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design the ShiDianNao accelerator, which is 60× more energy efficient than the DianNao accelerator (cf. Figure 8). We present a full design down to the layout at a 65 nm process, with an area of 4.86 mm² and a power of 320 mW, yet still over 30x faster than the NVIDIA K20M GPU. The detailed ShiDianNao architecture was presented at the 42nd ACM/IEEE International Symposium on Computer Architecture (ISCA’15).9

classification of linearly separable data, complex neural networks can easily overfit, and perform worse than a linear classifier. In application domains such as financial quantitative trading, linear regression is more widely used than neural networks due to the simplicity and interpretability of linear models.3 The famous No-Free-Lunch theorem, though developed under certain theoretical assumptions, is a good summary of the above situation: no learning technique can perform universally better than another learning technique.46 In this case, it is a natural idea to further extend DianNao/DaDianNao to support a basket of diverse ML techniques, and the extended accelerator will have much broader application scope than its ancestors do.
PuDianNao is a hardware accelerator accommodating seven representative ML techniques, that is, k-means, k-NN, naive bayes, support vector machine, linear regression, classification tree, and deep neural network. PuDianNao consists of several Functional Units (FUs), three data buffers (HotBuf, ColdBuf, and OutputBuf), an instruction buffer (InstBuf), a control module, and a DMA; see Figure 9. The functional unit for machine learning (MLU) is designed to support several basic yet important computational primitives. As illustrated in Figure 10, the MLU is divided into 6 pipeline stages (Counter, Adder, Multiplier, Adder tree, Acc, and Misc), and different combinations of selected stages collaboratively compute primitives that are common in representative ML techniques, such as dot product, distance calculations, counting, sorting, nonlinear functions (e.g., sigmoid and tanh) and so on. In addition, there are some less common operations that

Figure 9. Accelerator architecture of PuDianNao. (Block labels: InstBuf, HotBuf, ColdBuf, Control Module, and FUs each pairing an MLU with an ALU and a bypass path.)
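The idea that different combinations of MLU stages yield different primitives can be sketched in Python. This is a loose software analogy, not the hardware design: the stage functions, the `LANES` width, and in particular which stages each primitive uses (dot product as Multiplier, Adder tree, Acc; squared distance as Adder, Multiplier, Adder tree, Acc; sigmoid in Misc) are assumptions made for illustration.

```python
import math

LANES = 4  # hypothetical number of operand pairs processed per pass


def adder_stage(a, b):
    """Adder stage used as an element-wise subtractor."""
    return [x - y for x, y in zip(a, b)]


def multiplier_stage(a, b):
    """Multiplier stage: element-wise products."""
    return [x * y for x, y in zip(a, b)]


def adder_tree_stage(values):
    """Adder tree stage: pairwise reduction of one pass to a single value."""
    level = list(values)
    if not level:
        return 0.0
    while len(level) > 1:
        nxt = [level[k] + level[k + 1] for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def misc_sigmoid(x):
    """Misc stage used for a nonlinear function."""
    return 1.0 / (1.0 + math.exp(-x))


def dot_product(a, b, lanes=LANES):
    """Multiplier -> Adder tree -> Acc, one pass per group of lanes."""
    acc = 0.0
    for start in range(0, len(a), lanes):
        products = multiplier_stage(a[start:start + lanes], b[start:start + lanes])
        acc += adder_tree_stage(products)  # Acc stage accumulates across passes
    return acc


def squared_distance(a, b, lanes=LANES):
    """Adder -> Multiplier -> Adder tree -> Acc."""
    acc = 0.0
    for start in range(0, len(a), lanes):
        diff = adder_stage(a[start:start + lanes], b[start:start + lanes])
        acc += adder_tree_stage(multiplier_stage(diff, diff))
    return acc


print(dot_product([1, 2, 3], [4, 5, 6]))
print(squared_distance([1, 2, 3], [4, 6, 3]))
print(misc_sigmoid(0.0))
```

The two primitives share the Multiplier, Adder tree, and Acc steps and differ only in whether the Adder stage runs first, which is the kind of stage reuse that lets one functional unit serve k-NN and k-means (distances) as well as linear models and neural networks (dot products).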
Figure 8. Snapshot of ShiDianNao’s layout. (Layout labels: NBin, NBout, SB, NFU, IB.)
References
1. Cadambi, S., Durdanovic, I., Jakkula, V., Sankaradas, M., Cosatto, E., Chakradhar, S., Graf, H.P. A massively parallel FPGA-based coprocessor for support vector machines. In 17th IEEE Symposium on Field Programmable Custom Computing Machines, 2009. FCCM’09 (2009). IEEE, 115–122.
2. Chakradhar, S., Sankaradas, M., Jakkula, V., Cadambi, S. A dynamically configurable coprocessor for convolutional neural networks. In International Symposium on Computer Architecture (Saint Malo, France, June 2010). ACM 38(3): 247–257.
3. Chan, E. Algorithmic Trading: Winning Strategies and Their Rationale. John Wiley & Sons, 2013.
4. Chen, T., Chen, Y., Duranton, M., Guo, Q., Hashmi, A., Lipasti, M., Nere, A., Qiu, S., Sebag, M., Temam, O. BenchNN: On the broad potential application scope of hardware neural network accelerators. In International Symposium on Workload Characterization, 2012.
5. Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., Temam, O. Diannao: A small-footprint high-throughput accelerator for ubiquitous machine-learning. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), (March 2014). ACM 49(4): 269–284.
6. Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N., Temam, O. Dadiannao: A machine-learning supercomputer. In ACM/IEEE International Symposium on Microarchitecture (MICRO) (December 2014). IEEE Computer Society, 609–622.
7. Coates, A., Huval, B., Wang, T., Wu, D.J., Ng, A.Y. Deep learning with cots HPC systems. In International Conference on Machine Learning, 2013: 1337–1345.
8. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Conference on Computer Vision and Pattern Recognition (CVPR) (2009). IEEE, 248–255.
9. Du, Z., Fasthuber, R., Chen, T., Ienne, P., Li, L., Luo, T., Feng, X., Chen, Y., Temam, O. Shidiannao: Shifting vision processing closer to the sensor. In Proceedings of the 42nd ACM/IEEE International Symposium on Computer Architecture (ISCA’15) (2015). ACM, 92–104.
10. Esmaeilzadeh, H., Blem, E., Amant, R.S., Sankaralingam, K., Burger, D. Dark silicon and the end of multicore scaling. In Proceedings of the 38th International Symposium on Computer Architecture (ISCA) (June 2011). IEEE, 365–376.
11. Esmaeilzadeh, H., Sampson, A., Ceze, L., Burger, D. Neural acceleration for general-purpose approximate programs. In Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (Dec 2012). IEEE Computer Society, 449–460.
12. Farabet, C., Martini, B., Corda, B., Akselrod, P., Culurciello, E., LeCun, Y. NeuFlow: A runtime reconfigurable dataflow processor for vision. In CVPR Workshop (June 2011). IEEE, 109–116.
13. Farabet, C., Martini, B., Corda, B., Akselrod, P., Culurciello, E., LeCun, Y. Neuflow: A runtime reconfigurable dataflow processor for vision. In 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2011). IEEE, 109–116.
14. Frery, A., de Araujo, C., Alice, H., Cerqueira, J., Loureiro, J.A., de Lima, M.E., Oliveira, M., Horta, M., et al. Hyperspectral images clustering on reconfigurable hardware using the k-means algorithm. In Proceedings of the 16th Symposium on Integrated Circuits and Systems Design, 2003. SBCCI 2003 (2003). IEEE, 99–104.
15. Hameed, R., Qadeer, W., Wachs, M., Azizi, O., Solomatnikov, A., Lee, B.C., Richardson, S., Kozyrakis, C., Horowitz, M. Understanding sources of inefficiency in general-purpose chips. In International Symposium on Computer Architecture (New York, New York, USA, 2010). ACM, 38(3): 37–47.
16. Hinton, G., Srivastava, N. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv: …, 1–18, 2012.
17. Hussain, H.M., Benkrid, K., Seker, H., Erdogan, A.T. FPGA implementation of k-means algorithm for bioinformatics application: An accelerated approach to clustering microarray data. In 2011 NASA/ESA Conference on Adaptive Hardware and Systems (AHS) (2011). IEEE, 248–255.
18. Keckler, S. Life after Dennard and how I learned to love the Picojoule (keynote). In International Symposium on Microarchitecture, Keynote presentation, Sao Paolo, Dec. 2011.
19. Kim, J.Y., Kim, M., Lee, S., Oh, J., Kim, K., Yoo, H.-J. A GOPS 496 mW real-time multi-object recognition processor with bio-inspired neural perception engine. IEEE Journal of Solid-State Circuits 45, 1 (Jan. 2010), 32–45.
20. Krizhevsky, A., Sutskever, I., Hinton, G. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (2012), 1–9.
21. Krizhevsky, A., Sutskever, I., Hinton, G. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (2012), 1–9.
22. Larkin, D., Kinane, A., O’Connor, N.E. Towards hardware acceleration of neuroevolution for multimedia processing applications on mobile devices. In Neural Information Processing (2006). Springer, Berlin Heidelberg, 1178–1188.
23. Le, Q.V. Building high-level features using large scale unsupervised learning. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2013). IEEE, 8595–8598.
24. Le, Q.V., Ranzato, M.A., Monga, R., Devin, M., Chen, K., Corrado, G.S., Dean, J., Ng, A.Y. Building high-level features using large scale unsupervised learning. In International Conference on Machine Learning, June 2012.
25. LeCun, Y., Bengio, Y., Hinton, G. Deep learning. Nature 521, 7553 (2015), 436–444.
26. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (1998), 2278–2324.
27. Li, S., Ahn, J.H., Strong, R.D., Brockman, J.B., Tullsen, D.M., Jouppi, N.P. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42 (New York, NY, USA, 2009). ACM, 469–480.
28. Liu, D., Chen, T., Liu, S., Zhou, J., Zhou, S., Temam, O., Feng, X., Zhou, X., Chen, Y. Pudiannao: A polyvalent machine learning accelerator. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (2015). ACM, 369–381.
29. Maashri, A.A., Debole, M., Cotter, M., Chandramoorthy, N., Xiao, Y., Narayanan, V., Chakrabarti, C. Accelerating neuromorphic vision algorithms for recognition. In Proceedings of the 49th Annual Design Automation Conference (2012). ACM, 579–584.
30. Maeda, N., Komatsu, S., Morimoto, M., Shimazaki, Y. A 0.41 µA standby leakage 32 kb embedded SRAM with low-voltage resume-standby utilizing all digital current comparator in 28 nm hkmg CMOS. In International Symposium on VLSI Circuits (VLSIC), 2012.
31. Majumdar, A., Cadambi, S., Becchi, M., Chakradhar, S.T., Graf, H.P. A massively parallel, energy efficient programmable accelerator for learning and classification. ACM Trans. Arch. Code Optim. (TACO) 9, 1 (2012), 6.
32. Majumdar, A., Cadambi, S., Chakradhar, S.T. An energy-efficient heterogeneous system for embedded learning and classification. Embedded Systems Letters 3, 1 (2011), 42–45.
33. Manolakos, E.S., Stamoulias, I. IP-cores design for the KNN classifier. In Proceedings of 2010 IEEE International Symposium on Circuits and Systems (ISCAS) (2010). IEEE, 4133–4136.
34. Maruyama, T. Real-time k-means clustering for color images on reconfigurable hardware. In 18th International Conference on Pattern Recognition (ICPR) (Aug 2006). IEEE, Volume 2, 816–819.
35. Muller, M. Dark silicon and the internet. In EE Times “Designing with ARM” Virtual Conference, 26, 70(2010), 285–288.
36. Papadonikolakis, M., Bouganis, C. A heterogeneous FPGA architecture for support vector machine training. In 2010 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (May 2010). IEEE, 211–214.
37. Qadeer, W., Hameed, R., Shacham, O., Venkatesan, P., Kozyrakis, C., Horowitz, M.A. Convolution engine: Balancing efficiency & flexibility in specialized computing. In International Symposium on Computer Architecture (2013). ACM, 41(3), 24–35.
38. Sermanet, P., Chintala, S., LeCun, Y. Convolutional neural networks applied to house numbers digit classification. In Pattern Recognition (ICPR), …, 2012.
39. Sermanet, P., LeCun, Y. Traffic sign recognition with multi-scale convolutional networks. In International Joint Conference on Neural Networks (July 2011). IEEE, 2809–2813.
40. Stamoulias, I., Manolakos, E.S. Parallel architectures for the KNN classifier–design of soft IP cores and FPGA implementations. ACM Transactions on Embedded Computing Systems (TECS) 13, 2 (2013), 22.
41. Swanson, S., Michelson, K., Schwerin, A., Oskin, M. Wavescalar. In ACM/IEEE International Symposium on Microarchitecture (MICRO) (Dec 2003). IEEE Computer Society, 291.
42. Temam, O. The rebirth of neural networks. In International Symposium on Computer Architecture, (2010).
43. Temam, O. A defect-tolerant accelerator for emerging high-performance applications. In International Symposium on Computer Architecture (Sep 2012). Portland, Oregon, 40(3), 356–367.
44. Vanhoucke, V., Senior, A., Mao, M.Z. Improving the speed of neural networks on CPUs. In Deep Learning and Unsupervised Feature Learning Workshop (NIPS) (2011). Vol. 1.
45. Wang, G., Anand, D., Butt, N., Cestero, A., Chudzik, M., Ervin, J., Fang, S., Freeman, G., Ho, H., Khan, B., Kim, B., Kong, W., Krishnan, R., Krishnan, S., Kwon, O., Liu, J., McStay, K., Nelson, E., Nummy, K., Parries, P., Sim, J., Takalkar, R., Tessier, A., Todi, R., Malik, R., Stiffler, S., Iyer, S. Scaling deep trench based EDRAM on SOI to 32 nm and beyond. In IEEE International Electron Devices Meeting (IEDM) (2009). IEEE, 1–4.
46. Wolpert, D.H. The lack of a priori distinctions between learning algorithms. Neural Comput. 8, 7 (1996), 1341–1390.
47. Yeh, Y.-J., Li, H.-Y., Hwang, W.-J., Fang, C.-Y. FPGA implementation of KNN classifier based on wavelet transform and partial distance search. In Image Analysis (June 2007). Springer Berlin Heidelberg, 512–521.

Yunji Chen, Tianshi Chen, Zhiwei Xu, and Ninghui Sun ({cyj, chentianshi, zxu, snh}@ict.ac.cn), SKL of Computer Architecture, ICT, CAS, China.
Olivier Temam ([email protected]), Inria Saclay, France.

© 2016 ACM 0001-0782/16/11 $15.00
Technical Perspective
To view the accompanying paper,
visit doi.acm.org/10.1145/2996868
THE FOLLOWING PAPER presents a re- formance improvements by the middle these abstraction overheads and at the
search deployment of Field Program- of the last decade. same time gains unrestricted optimiza-
mable Gate Arrays (FPGAs) in a Micro- In the present power-constrained tion freedom, in particular the ability
soft Bing datacenter. The FPGAs enabled design regime—whether dictated by to exploit all forms and granularity of
more efficient search processing directly the cooling and packaging of a micro- parallelism in a task. In return, FPGA
at the digital logic level. This pioneering processor or the air handling capacity computing places a great burden on ap-
work is the first to successfully demon- of a datacenter, plus nowadays actual plication developers to design at a low
strate at scale the idea of FPGAs as an ef- supply-side considerations in battery level of specificity to meet functional
fective large-scale, first-class component powered devices—the problem of get- and performance objectives. How to
in cloud computing. Following this landmark effort, Microsoft has launched full-scale production deployment of FPGA accelerators in its new datacenter servers for a range of cloud services.

For most of its 30-year existence, FPGA technology had primarily served as an alternative to application-specific integrated circuits (ASICs), with only niche applications in computing. Today, besides Microsoft’s activities, we are also finding Intel and IBM adding FPGAs as programmable computing substrates to their product lines. At the root of the computing industry’s current embrace of FPGAs is the same Power Wall struggle that brought about the transition to multicore microprocessors in the last decade.

In the decades prior, single-threaded microprocessors enjoyed a regular doubling of compute performance with each new Very Large Scale Integration (VLSI) scaling generation, taking advantage of the more numerous and faster transistors. However, each new generation of faster microprocessors had also required more power. The Power Wall is less about supplying power and more about removing the resulting heat fast enough. For set upper bounds in cost, weight, size, and noise of the cooling apparatus, there is a limit to how fast heat can be extracted from a microprocessor die. Microprocessors remained well below market-set economical cooling limits until the 1990s. Despite the best concerted efforts from software down to material science to rein in the power increase in the ensuing years, single-threaded microprocessors ran out of cooling headroom to sustain their per-

ting more performance (“operations per second”) requires a solution that can somehow do so while expending less “energy per operation.”

Parallelism provided the first solution to running faster while using less energy in each step. As a rule of thumb, it takes disproportionately more power to increase the performance of a sequential task. Therefore, when parallelism is available, one could use parallel execution to reduce power rather than to gain speedup. In an illustrative example, given design A with throughput performance ThroughputA at PowerA and a lower-performing design B with ThroughputB = ThroughputA/N and PowerB < PowerA/N, we can use N copies of B to complete the same number of tasks per second as A at less power, and we can use >N copies of B to exceed the throughput of A at the same PowerA. Both multicore microprocessors and GPGPUs are practitioners of this principle. FPGA computing can take this to further extremes, delivering high overall performance using a sea of individually slow processing elements that are very energy efficient per operation. This route to energy efficiency is of course only applicable when the computing task of interest is amenable to a high degree of parallelization.

Relative to microprocessors, FPGA computing has another source of energy efficiency. The majority of the power in a microprocessor is consumed by the overhead in presenting the simplifying von Neumann abstraction to the programmer and in ensuring good performance across a wide range of program behaviors. FPGA computing avoids

simplify application development, while retaining FPGA’s efficiency advantage, is an important and challenging problem.

Mapping computations to ASICs has the same benefits as FPGAs. In fact, an ASIC implementation can be significantly more energy efficient than an FPGA implementation due to the overhead in making the FPGAs’ logic substrate reprogrammable. But reprogrammability is very important to computing. As argued in the following paper, besides the utility of repurposing a hardware investment over its multiyear ownership, a particular accelerated task of interest can evolve too quickly to be committed into ASIC development time and cost scales.

Finally, it should be noted that FPGA computing is not the answer in every scenario or across all trade-offs. Ultimately, the quest for efficiency is a matter of using the right tool for the job.

Driven by the contradicting needs for more performance and better energy efficiency, parallel computing in the form of multicore microprocessors seemingly exploded into the commercial mainstream in the last decade. With the Catapult effort, perhaps we are again seeing the start of something equally transformative in the pursuit of still higher performance and energy efficiency. As we watch the current exciting developments unfold across the computing industry, we should also recognize the multiple decades of prior work that led up to this pivotal time.

James C. Hoe ([email protected]) is a professor of electrical and computer engineering at Carnegie Mellon University, Pittsburgh, PA.

Copyright held by author.
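The replication argument in the perspective above (N copies of a slower design B matching design A at lower power) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the throughput and power figures are hypothetical, the helper `replicate` is not from the article, and it assumes perfectly parallel work with no coordination overhead.

```python
def replicate(throughput, power, copies):
    """Aggregate throughput and power of `copies` independent instances."""
    return throughput * copies, power * copies

# Hypothetical design A: 100 tasks/s at 80 W.
throughput_a, power_a = 100.0, 80.0

# Hypothetical design B: N times slower, but using less than PowerA / N.
n = 4
throughput_b = throughput_a / n   # ThroughputB = ThroughputA / N = 25 tasks/s
power_b = 12.0                    # PowerB < PowerA / N = 20 W
assert power_b < power_a / n

# N copies of B match A's throughput at lower power.
match_tp, match_power = replicate(throughput_b, power_b, n)

# More than N copies of B fit in A's power budget and exceed A's throughput.
copies_in_budget = int(power_a // power_b)
exceed_tp, exceed_power = replicate(throughput_b, power_b, copies_in_budget)

print(match_tp, match_power)                      # 100.0 48.0
print(copies_in_budget, exceed_tp, exceed_power)  # 6 150.0 72.0
```

With these made-up numbers, four copies of B deliver A's throughput at 48 W instead of 80 W, and six copies fit within A's 80 W budget while delivering 1.5 times its throughput.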
are typically developed in a hardware description language (HDL) such as VHDL or Verilog; these designs are compiled down to an FPGA program (called a bitstream), similar to how a software language is compiled into a software executable. However, specifying a design in HDL is significantly more complex than for a typical software language, and compilation may take hours to complete. Once the bitstream is generated, it can be loaded into an FPGA in a matter of seconds, configuring the FPGA to implement the desired computation.

FPGAs cover a middle ground in performance/efficiency and flexibility compared to fully custom ASICs and general-purpose CPUs. General-purpose CPUs are the most flexible platform, capable of implementing any application. But this flexibility generally comes at the cost of a 100× or greater reduction in performance and energy efficiency compared to an ASIC designed for the same task. However, when the application changes, the CPU can still run the new application by simply recompiling the software. When the target application for an ASIC changes, it likely requires designing a new chip, a process that typically takes months and carries huge development costs.

FPGAs combine aspects of both CPUs and ASICs. They can be reprogrammed as applications change to provide CPU-like flexibility, but the program produces hardware which is specialized to the application, providing performance and power efficiency closer to ASICs. The generic FPGA logic can be configured to exploit huge amounts of fine-grained parallelism and can implement very complex pipelined structures that can be several orders of magnitude faster and lower power than their software equivalents.

Modern FPGAs provide millions of gates of random logic and can include multiple megabytes of internal storage that supports thousands of reads and writes on every clock cycle. These chips also incorporate smaller fully custom blocks, such as complex embedded arithmetic elements (DSP blocks), high-speed I/O, IP protocol blocks, and may include complete microprocessors as well, either as hardened subsystems or by mapping the microprocessor logic into the programmable logic of the chip.

FPGAs have been available for several decades and they have huge potential for accelerating a wide variety of applications. Yet despite this promise, FPGAs have not been widely deployed as compute accelerators, even in datacenter environments where their potential for flexibility, power efficiency, and performance should make them extremely attractive.

The Catapult architecture described here unlocks the potential of FPGAs in the datacenter. To achieve this, the architecture has to be distributed, scalable, robust, and work on real-world datacenter-scale workloads. In the following sections, we describe the architecture, and show how production workloads can finally unlock the potential efficiency and performance gains promised for so long by FPGAs.

3. CATAPULT ARCHITECTURE
Datacenters are a challenging environment for any new technology to succeed. The scale alone demands extremely high reliability, efficiency, and low cost. The rapid pace of application development requires flexibility and robustness. And the physical constraints and uptime requirements make it largely impractical to modify or upgrade the hardware after initial delivery.

To succeed in the datacenter environment, an FPGA-based reconfigurable fabric must (at a minimum) meet the following requirements:

• preserve server homogeneity to avoid complex management of heterogeneous servers,
• scale to large workloads that might not fit into a single FPGA,
• avoid consuming too much power or network bandwidth,
• avoid single points of failure, and have minimal impact on reliability,
• provide positive return on investment (ROI), and
• operate within the space and power confines of existing servers.

These requirements guided the architectural choices we made throughout the Catapult system development.

3.1. Integration
How to integrate FPGAs into the datacenter is perhaps the most important consideration when designing a reconfigurable fabric. We investigated a variety of approaches which can be roughly broken into two categories based on how they integrate with conventional servers: “networked” and “integrated.” Networked designs add FPGAs to special FPGA-enabled servers, and arrange the servers either as entire racks of specialized servers or by embedding some number of specialized servers in otherwise conventional racks. Integrated designs add FPGAs directly inside the conventional servers, requiring no specialized servers and no network communication to reach the FPGA.

Networked designs have been developed and deployed in High Performance Computing (HPC) environments. While HPC systems are not subject to the same scale, cost, power, and homogeneity constraints as the datacenter, they are one place where integrating FPGAs with CPUs has seen some success. Entire large systems have been built using only specialized servers, including the Cray XD-1 [7], Novo-G [11], and QP [18]. Examples of specialized servers that could be integrated with racks of conventional datacenter servers include the Convey HC-2 [6], Maxeler MPC series [17], BeeCube BEE4 [4], and SRC MAPstation [19]. In fact, embedding a few specialized servers into each rack is the approach that we first took early in the project. However, as we learned from our first prototype, the networked design approach is inappropriate for datacenter use for several reasons.

First, specialized racks and servers are single points of failure, where the failure of the specialized nodes impacts many dependent conventional servers, amplifying the impact to uptime and overall service reliability.

Second, specialized racks and servers require separate cooling and power provisioning, as well as different software, firmware, and spare parts provisioning, making management and maintenance more difficult.
[Figure: FPGA block diagram. Recoverable labels: Role, Application, Host CPU, x8 PCIe Core, DMA Engine, Config Flash (RSU), QSPI, JTAG, LEDs, Temp Sensors, I2C, xcvr reconfig, SEU, Inter-FPGA Router, and North, South, East, and West SLIII links.]

fabric in a production datacenter. The deployment consisted of a total of 1632 machines, that were organized into 17 server racks. Each server uses two 12-core Intel Xeon CPUs, 64 Gbytes of DRAM, and two solid-state drives (SSDs) in addition to four hard-disk drives. The machines have a standard 10-Gbit Ethernet network card connected to a 48-port top-of-rack switch, which in turn connects to the broader datacenter network. The FPGA daughter cards and cable assemblies were tested at manufacture and again at system integration. At deployment, we discovered that seven cards (0.4%) had a hardware failure and that one of the 3264 links (0.03%) in the cable assemblies was defective. Since then, after several months of operation, we have seen no additional hardware failures.
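The failure percentages quoted above follow directly from the deployment counts. The short check below reproduces them; it assumes one FPGA daughter card per machine, which the text implies but does not state outright, and the variable names are illustrative.

```python
# Figures quoted in the text above.
machines = 1632        # servers in the deployment
links = 3264           # links in the cable assemblies
failed_cards = 7
failed_links = 1

# Assumption: one FPGA daughter card per machine, so cards == machines.
card_failure_pct = 100 * failed_cards / machines
link_failure_pct = 100 * failed_links / links

print(f"card failures: {card_failure_pct:.1f}%")   # card failures: 0.4%
print(f"link failures: {link_failure_pct:.2f}%")   # link failures: 0.03%
```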
[Figure: ranking pipeline diagram. Recoverable labels: Spare, FFE 0, FFE 1, Hit vector preprocessing FSM, Feature-gathering network, Compress, Scoring 0, Scoring 1, Scoring 2, Requests & Responses, CPU 2, CPU 3, CPU 5, CPU 6.]
(such as adding two features) to large and complex (with thousands of operations including conditional execution and complex floating-point operators such as ln, pow, and fpdiv). FFEs vary greatly across models, making it impractical to synthesize customized datapaths for each expression.

One potential solution is to tile many off-the-shelf soft processor cores (such as Nios II [1]), but these single-threaded cores are inefficient at processing thousands of threads with long-latency floating-point operations in the desired amount of time per macropipeline stage (8 ms). Instead, we developed a custom multicore processor with massive multithreading and long-latency operations in mind. The result is the FFE processor shown in Figure 7. The FFE microarchitecture is highly area efficient, letting us instantiate 60 cores on a single FPGA.

The custom FFE processor has three key characteristics that make it capable of executing all of the expressions within the required deadline. First, each core supports four simultaneous threads that arbitrate for functional units on a cycle-by-cycle basis. When one thread is stalled on a long operation such as a floating-point divide or natural log operation, other threads continue to make progress. All functional units are fully pipelined, so any unit can accept a new operation on each cycle.

Second, rather than fair thread scheduling, threads are statically prioritized using a priority encoder. The assembler maps the expressions with the longest expected latency to thread slot 0 on all cores, then fills in slot 1 on all cores, and so forth. Once all cores have one thread in each thread slot, the remaining threads are appended to the end of previously mapped threads, starting again at thread slot 0.

Third, the longest-latency expressions are split across multiple FPGAs. An upstream FFE unit can perform part of the computation and produce an intermediate result called a metafeature. These metafeatures are sent to the downstream FFEs like any other feature, effectively replacing that part of the expression computation with a simple feature read.

4.4. Document scoring
The last stage of the pipeline is a machine-learned model evaluator that takes the features and FFEs as inputs and produces a single floating-point score. This score is sent back to the search software, and all of the resulting scores for the query are sorted and returned to the user in sorted order as the search results.

4.5. Parallelism
To overcome the slower clock frequency of FPGAs relative to CPUs and GPUs, each of the scoring stages takes advantage of two forms of parallelism that are not easily handled by the other architectures. First, each processing stage described here is configured with deep pipelines that match the amount of pipeline parallelism available in the application.

Second, FE and FFE exhibit multiple instruction single data (MISD) parallelism, a cousin of the more commonly known single instruction multiple data (SIMD) parallelism exploited by GPUs and the vector processing units in CPUs. A single source of data (the document) is operated on by a very large number of independent instruction streams (feature extractors and free form expressions for FE and FFE, respectively). SIMD architectures require the opposite: a large number of independent data elements operated on by the same instruction stream.

While SIMD architectures can efficiently process applications with MISD parallelism by batching many sets of data together, this comes at the cost of increased latency, which is often prohibitive in interactive cloud applications such as web search. As such, web ranking is an example of a cloud application that FPGAs can accelerate more effectively than other parallel processing architectures.

5. EVALUATION
We evaluated the Catapult fabric by deploying and measuring the described Bing ranking engine on a bed of 1632 servers with FPGAs. Six hundred and seventy-two ran the ranking service, and the other machines ran the selection service to feed documents and queries to the ranking servers. We compare the average and tail latency distributions of Bing’s production-level ranker running with and without FPGAs on that bed.

User experience is determined more by tail latencies than by average latencies: users care little if their search results come back faster than expected, but they become unhappy quickly if the results are slower than expected. As such, we report performance at the latency of the 95th percentile of queries, the time at which only 5% of queries are slower. Performance results are very similar for average latency queries (50th percentile), and are even better for higher tail latencies (99th percentile and 99.9th percentile), which have the biggest impact on user experience.
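To make the tail-latency metric used in the evaluation concrete, the sketch below computes the 50th and 95th percentiles of a set of query latencies with the nearest-rank method. The latency values are invented for illustration, and the nearest-rank definition is an assumption; the article does not say how its percentiles were computed.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical per-query latencies in milliseconds (not measured data).
latencies_ms = [5, 6, 6, 7, 7, 7, 8, 8, 9, 9,
                10, 10, 11, 12, 12, 13, 14, 15, 40, 90]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
mean = sum(latencies_ms) / len(latencies_ms)

# Fraction of queries slower than the 95th-percentile latency.
slower = sum(1 for x in latencies_ms if x > p95) / len(latencies_ms)

print(p50, mean, p95, slower)   # 9 14.95 40 0.05
```

In this made-up sample two slow queries pull the 95th percentile (40 ms) far above both the median (9 ms) and the mean (14.95 ms), which is why a report based on averages alone would hide the queries that users actually notice.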
Figure 7. Free-form expressions (FFEs) placed and routed on an FPGA. Sixty cores fit on a single FPGA. [Figure labels: Cluster 0; Core 0 through Core 5; FST; Complex; Output.]

Figure 8 illustrates how the FPGA-accelerated ranker substantially reduces the end-to-end scoring latency and improves throughput relative to software. There are two ways to view the performance improvements on this graph. First, for a fixed point on the x-axis, it shows the improvement in throughput at that specified query latency. For example, at 1.0 (which represents the maximum acceptable latency to Bing at the 95th percentile), the FPGA achieves a 95% gain in scoring throughput relative to software.

Second, for a fixed point on the y-axis, it shows the improvement in response time at a given throughput. At 1.0, representing the average query load on a server, the FPGA reduces query latency by 29%. The improvement in FPGA scoring latency increases further at higher injection rates, because the variability of software latency increases at higher loads (due to contention in the CPU’s memory hierarchy), whereas the FPGA’s performance remains stable. This improved stability means that the FPGA is capable of absorbing bursts of traffic better than software alone, which may reduce the need for overprovisioning for bursty traffic.

[Figure 8: plot comparing FPGA and Software. x-axis: Latency (normalized to 95th percentile target), 0 to 2; y-axis: throughput, ticks 0 to 4. Annotations: “95% more throughput” and “29% lower latency.”]

Given that FPGAs can be used to improve both latency and throughput, Bing could reap the benefits in many ways. For example, for equivalent ranking capacity, fewer servers can be purchased. At the current average query rate, Bing could use roughly half the number of servers and still achieve its performance targets while achieving massive cost savings. As another example, the faster response time means that additional capabilities and features can be added to the software and/or hardware stack to improve the quality of searches without exceeding the maximum allowed latency. Of course, a combination of the two is also possible.

6. CONCLUSION
For many years, FPGAs have shown promise for accelerating many computational tasks. Yet despite the huge potential, they have not yet become mainstream in modern datacenters. Our goal in building the Catapult fabric was to understand what problems must be solved to operate FPGAs at datacenter scale, and whether significant performance improvements are achievable for large-scale production workloads, especially workloads that change over the lifetime of the servers.

We found that efficiently mapping a significant portion of a complex datacenter workload to FPGAs is possible and provides a significant ROI. We showed that an at-scale deployment of FPGAs can increase ranking throughput in a production search infrastructure by 95% at comparable latency to a software-only solution, making possible both cost savings with fewer servers needed and headroom for improved search algorithms. We achieved this without breaking the homogeneous architecture of datacenter servers, and without increasing the server failure rate. The added FPGA boards increased power consumption by only 10%, did

the SIMD parallelism that GPUs handle so efficiently are not a good match for latency-sensitive, but highly divergent ranking stages (such as FE). In addition, the high power consumption of GPUs meant that they couldn’t be easily incorporated into conventional servers, which only have power and cooling provisioning for a standard 25W PCIe card. Instead, they are likely more appropriate for HPC environments rather than widespread datacenter deployment.

We conclude that distributed reconfigurable fabrics are a viable path forward as increases in server performance level off, and will be crucial at the end of Moore’s law for continued cost and capability improvements in cloud computing. Reconfigurability is a critical means by which hardware acceleration can keep pace with the rapid rate of change in datacenter services.

Going forward, the biggest obstacle to widespread adoption of FPGAs in the datacenter is likely to be programmability. FPGA development still requires extensive hand-coding in Register Transfer Level and manual tuning. Yet we believe that incorporating careful HW/SW co-design of custom ISAs such as those used in FE and FFE, domain-specific languages such as OpenCL, FPGA-targeted C-to-gates tools, and libraries of reusable components and design patterns, will be sufficient to permit high-value services to be productively targeted to FPGAs. Longer term, improvements in FPGA architectures for computing, more integrated development tools, and the design of languages and tools that consider accelerator offload as a core functionality will be necessary to increase the programmability of these fabrics beyond teams of specialists working with large-scale service developers. Within 10–15 years, well past the end of Moore’s Law, we believe that compilation to a combination of hardware and software will be commonplace. Reconfigurable systems, such as the Catapult fabric presented here, will be necessary to support these hybrid computation models.

Acknowledgments
Many people across many organizations contributed to this system’s construction, and although they are too numerous to list here individually, we thank our collaborators in Microsoft Global Foundation Services, Bing, the Autopilot team, and our colleagues at Altera and Quanta for their excellent partnership and hard work. We thank Reetuparna Das, Ofer Dekel, Alvy Lebeck, Neil Pittman, Karin Strauss, and David Wood for their valuable feedback and contributions. We also thank Qi Lu, Harry Shum, Craig Mundie, Eric Rudder, Dan Reed, Surajit Chaudhuri, Peter Lee, Gaurav Sareen, Darryn Dieken, Darren Shakib, Chad Walters, Kushagra Vaid, and Mark Shaw for their support.
Subject Areas
Baylor University candidates working on their dissertation with re- are expected to develop a vibrant, high-quality ex-
Assistant, Associate or Full Professor of search interests in computer science or computer ternally sponsored research program, supervise
Computer Science in the area of Software information systems. The Tenure Track Lecturer graduate students, and interact and collaborate
Engineering position requires an MS in Computer Science or a with faculty across the department and campus.
closely related field. Please visit www.bradley.edu/ Applicants must submit (i) a cover letter, (ii)
The Department of Computer Science seeks a dy- humanresources/opportunities for full position current curriculum vita, (iii) statement of research
namic scholar to fill this position beginning August, description and application process. interests, and (iv) statement of teaching interests
2017. For position details and application informa- and (v) arrange to have at least three references.
tion please visit: www.baylor.edu/hr/facultypositions.
Baylor University is a private Christian univer- Cal Poly State University, Application materials may be sent to:
sity and a nationally ranked research institution, San Luis Obispo Faculty Search Committee
consistently listed with highest honors among The Computer Science Department of Electrical
Chronicle of Higher Education’s “Great Colleges The Electrical Engineering Department and Engineering and Computer Science
to Work For.” Chartered in 1845 by the Republic of Computer Engineering Program within the Col- Case Western Reserve University
Texas through the efforts of Baptist pioneers, Bay- lege of Engineering at Cal Poly San Luis Obispo 10900 Euclid Avenue
lor is the oldest continuously operating university invite applications for a full-time, academic year, Cleveland, OH 44106-7071
in Texas. The university provides a vibrant cam- tenure-track faculty appointment in electrical
pus community for over 15,000 students from all and computer engineering. Salary is commensu-
50 states and more than 80 countries by blending rate with background and experience. California State University,
interdisciplinary research with an international Sacramento, Department of
reputation for educational excellence and a faculty Computer Science.
commitment to teaching and scholarship. Baylor California Institute of Technology
is actively recruiting new faculty with a strong com- (Caltech) Two tenure-track assistant professor positions to
mitment to the classroom and an equally strong begin with the Fall 2017 semester. Applicants spe-
commitment to discovering new knowledge as we The Computing and Mathematical Sciences (CMS) cializing in any area of computer science will be
pursue our bold vision, Pro Futuris. department at the California Institute of Technology considered. Those with expertise in areas related
Baylor University is a private not-for-profit uni- (Caltech) invites applications for tenure-track or ten- to embedded systems, software engineering, or
versity affiliated with the Baptist General Conven- ured faculty positions. CMS is a unique environment data science are especially encouraged to apply.
tion of Texas. As an Affirmative Action/Equal Oppor- where innovative, interdisciplinary, and foundation- Ph.D. in Computer Science, Computer Engineer-
tunity employer, Baylor is committed to compliance al research is conducted in a collegial atmosphere. ing, or closely related field required by August
with all applicable anti-discrimination laws, includ- Candidates in all areas of computing and math- 2017. For detailed position information, includ-
ing those regarding age, race, color, sex, national ematical sciences are invited to apply, including (but ing application procedure, please see http://www.
origin, marital status, pregnancy status, military not limited to) learning and computational statis- csus.edu/about/employment/. Screening will
service, genetic information, and disability. As a reli- tics, security and privacy, networked and distributed begin January 10, 2017, and remain open until
gious educational institution, Baylor is lawfully per- systems, optimization and computational math- filled. AA/EEO employer. Clery Act statistics avail-
mitted to consider an applicant’s religion as a selec- ematics, control and dynamical systems, theory of able. Mandated reporter requirements. Criminal
tion criterion. Baylor encourages women, minorities, computation and algorithmic economics, scientific background check will be required.
veterans and individuals with disabilities to apply computing, etc. Additionally, we are seeking candi-
dates who have demonstrated strong connections to
other fields, including the mathematical, physical, Dartmouth College
Boston College biological, and social sciences.
A commitment to world class research, high- The Dartmouth College Department of Com-
The Boston College Computer Science Department quality teaching, and mentoring is expected. The puter Science invites applications for a tenured
invites applications for a tenure-track Assistant initial appointment at the Assistant-Professor faculty position at the level of associate or full
Professorship starting September 2017. A PhD in level is for four years and is contingent upon the professor. We seek candidates who will be excel-
Computer Science, research conducive to sustained completion of a Ph.D. degree in Computer Sci- lent researchers and teachers in the broad range
external funding, and a commitment to quality in ence, Applied Mathematics or related field. of areas related to cyber-security. This position is
undergraduate teaching are required. Interest in in- Applicants are encouraged to have all their ap- the first of three hires that the College anticipates
terdisciplinary collaboration with broader impact is plication materials on file by October 21st, 2016, making in the area of cyber-security. We particu-
desirable. See http://cs.bc.edu for more information. but applications will be accepted until the end of larly seek candidates who will help lead, initiate,
Application review begins October 15, 2016. Submit December. For a list of documents required and and participate in collaborative research projects
applications at http://apply.interfolio.com/37623 full instructions on how to apply on-line, please within Computer Science and beyond, including
visit http://www.cms.caltech.edu/search. Ques- Dartmouth researchers from other Arts & Sci-
tions about the application process may be di- ences departments, Geisel School of Medicine,
Bradley University rected to: [email protected]. Thayer School of Engineering, and Tuck School
The Department of Computer Science and Caltech is an Equal Opportunity/Affirmative of Business.
Information Systems at Bradley Action Employer. Women, minorities, veterans, The Computer Science department is home
University invites applications for a Tenure and disabled persons are encouraged to apply. to 21 tenured and tenure-track faculty members
Track Assistant Professor or and two research faculty members. Research ar-
eas of the department encompass the areas of
Tenure Track Lecturer position for the 2017-2018 Case Western Reserve University security, computational biology, machine learn-
academic year. The Tenure Track Assistant Pro- ing, robotics, systems, algorithms, theory, digital
fessor position requires a PhD in Computer Sci- Applicants should have potential for excellence arts, vision, and graphics. The Computer Science
ence or a closely related field; we will consider in innovative research. All successful candidates department has strong Ph.D. and M.S. programs
and outstanding undergraduate majors. The ed status. Applications by members of all under- (on the Vermont border). Dartmouth has a beau-
department’s security faculty are affiliated with represented groups are encouraged. tiful, historic campus, located in a scenic area on
Dartmouth’s Institute for Security, Technology, Application review will begin November 1, the Connecticut River. Recreational opportuni-
and Society (ISTS), which also involves faculty 2016, and continue until the position is filled. ties abound in all four seasons.
from Engineering, Sociology, and Business. We seek candidates who have a demonstrated
Dartmouth College, a member of the Ivy ability to contribute to Dartmouth’s undergradu-
League, is located in Hanover, New Hampshire Dartmouth College ate diversity initiatives in STEM research, such as
(on the Vermont border). Dartmouth has a beau- the Women in Science Program, E. E. Just STEM
tiful, historic campus, located in a scenic area on The Dartmouth College Department of Computer Scholars Program, and Academic Summer Un-
the Connecticut River. Recreational opportuni- Science invites applications for a tenure-track fac- dergraduate Research Experience (ASURE). We
ties abound in all four seasons. ulty position at the level of assistant professor. We are especially interested in applicants with a
We seek candidates who have a demonstrated seek candidates who will be excellent researchers demonstrated track record of successful teaching
ability to contribute to Dartmouth’s undergradu- and teachers in the area of machine learning. We and mentoring of students from all backgrounds
ate diversity initiatives in STEM research, such as particularly seek candidates who will help lead, (including first-generation college students, low-
the Women in Science Program, E. E. Just STEM initiate, and participate in collaborative research income students, racial and ethnic minorities,
Scholars Program, and Academic Summer Un- projects within Computer Science and beyond, women, LGBTQ, etc.).
dergraduate Research Experience (ASURE). We including Dartmouth researchers from other Arts Applicants are invited to submit application
are especially interested in applicants with a & Sciences departments, Geisel School of Medi- materials via Interfolio at https://apply.interfolio.
demonstrated track record of successful teaching cine, Thayer School of Engineering, and Tuck com/37189. Upload a CV, research statement, and
and mentoring of students from all backgrounds School of Business. teaching statement, and request at least four ref-
(including first-generation college students, low- The Computer Science department is home erences to upload letters of recommendation, at
income students, racial and ethnic minorities, to 21 tenured and tenure-track faculty members least one of which should comment on teaching.
women, LGBTQ, etc.). and two research faculty members. Research ar- Email [email protected] with
Applicants are invited to submit a cover let- eas of the department encompass the areas of any questions.
ter and CV via Interfolio at http://apply.interfolio. security, computational biology, machine learn- Dartmouth College is an equal opportunity/
com/36691. ing, robotics, systems, algorithms, theory, digital affirmative action employer with a strong com-
Email [email protected] with arts, vision, and graphics. The Computer Science mitment to diversity and inclusion. We prohibit
any questions. department is in the School of Arts & Sciences, discrimination on the basis of race, color, reli-
Dartmouth College is an equal opportunity/ and it has strong Ph.D. and M.S. programs and gion, sex, age, national origin, sexual orientation,
affirmative action employer with a strong com- outstanding undergraduate majors. The depart- gender identity or expression, disability, veteran
mitment to diversity and inclusion. We prohibit ment is affiliated with Dartmouth’s M.D.-Ph.D. status, marital status, or any other legally protect-
discrimination on the basis of race, color, reli- program and has strong collaborations with Dart- ed status. Applications by members of all under-
gion, sex, age, national origin, sexual orientation, mouth’s other schools. represented groups are encouraged.
gender identity or expression, disability, veteran Dartmouth College, a member of the Ivy Application review will begin January 1, 2017,
status, marital status, or any other legally protect- League, is located in Hanover, New Hampshire and continue until the position is filled.
The School of Computer and Communication Sciences at EPFL invites applications for faculty positions in computer and communication sciences. We are seeking candidates for tenure-track assistant professor as well as for senior positions.

Successful candidates will develop an independent and creative research program, participate in both undergraduate and graduate teaching, and supervise PhD students.

The school is seeking candidates in the fields of data science and machine learning – including application of these techniques in bioinformatics, natural language processing, and speech recognition – and in security and privacy, and biocomputing. Candidates in other areas will also be considered.

The following documents are requested in PDF format: cover letter, curriculum vitae including publication list, brief statements of research and teaching interests, names and addresses (including e-mail) of 3 references for junior positions and 6 for senior positions. Screening will start on December 1, 2016.

Further questions can be addressed to:

Prof. Arjen Lenstra
Chairman of the Recruiting Committee
Email: [email protected]

For additional information on EPFL, please consult:
http://www.epfl.ch or http://ic.epfl.ch
of Computer Science
high University to start in August 2017. Outstanding candidates in all areas of computer science will be considered, with priority areas including computer systems (including parallel and distributed systems, database systems, operating systems, and systems aspects of data mining), data analytics, cybersecurity, algorithms, and pervasive intelligence (robotics, the Internet of Things, and human-computer interaction). Rank will be commensurate with experience.

The successful applicants will hold a Ph.D. in Computer Science, Computer Engineering, or a closely related field. The candidates must demonstrate a strong commitment to quality undergraduate and graduate education, and the potential to develop and conduct a high-impact research program with external support. The successful applicants will also be expected to contribute to interdisciplinary research programs, including the Data X Initiative (http://lehigh.edu/datax), which includes not only data analytics, but also the underlying algorithms and systems that make large-scale data analytics possible.

The faculty in Computer Science and Engineering maintains an outstanding international reputation in a variety of research areas, and includes ACM and IEEE fellows as well as five NSF CAREER award winners. In academic year 2017-18, the department will move to a new home in a large former industrial building now undergoing a major renovation to create a spectacular collaborative space for the Data X Initiative.

Applications are accepted online at http://academicjobsonline.org/ajo/jobs/7774 and should include a cover letter, curriculum vita, both teaching and research statements, and contact information for at least three references. Review of applications will begin December 1, 2016 and will continue until the position is filled.

Lehigh University is an affirmative action/equal opportunity employer and does not discriminate on the basis of age, color, disability, gender, gender identity, genetic information, marital status, national or ethnic origin, race, religion, sexual orientation, or veteran status. Lehigh University is a 2010 recipient of an NSF ADVANCE Institutional Transformation Grant for promoting the careers of women in academic science and engineering. Lehigh University provides comprehensive benefits including domestic partner benefits (see also http://www.lehigh.edu/worklifebalance/). Lehigh Valley Inter-regional Networking & Connecting (LINC) is a newly created regional network of diverse organizations designed to assist new hires with dual career, community and cultural transition needs. Please contact [email protected] for more information. Questions concerning this search may be sent to [email protected].

→ The Department of Computer Science (www.inf.ethz.ch) at ETH Zurich invites applications for assistant professorships (tenure track) with focus on the following broad areas within computer science. For each area, several possible examples (not exhaustive) of expertise are provided.

- Programming Languages and Software Engineering (language design and implementation, testing and debugging, compilers and language runtimes, programming models, dynamic languages)

- Robotics and Cyber-physical Systems (artificial intelligence, human-robot interaction, planning and control, virtual/augmented reality, internet of things, embedded systems, data acquisition systems)

- Data Science (machine learning, language/media processing, data privacy, medical applications, data centers architecture and management, programming and runtime platforms for data centers and cloud computing)

- All other areas in Computer Science (while there is focus on the three areas above, ETH Zurich is broadly looking in all areas)

→ Please only apply for one of the above four areas as all applications will be jointly reviewed.

→ Applicants should be strongly rooted in computer science, have internationally recognized expertise in their field and pursue research at the forefront of computer science. Successful candidates should establish and lead a strong research program. They will be expected to supervise doctoral students and teach both undergraduate and graduate level courses (in German or in English). Collaboration in research and teaching is expected both within the department and with other groups of ETH Zurich and related institutions.

→ Assistant professorships have been established to promote the careers of younger scientists. ETH Zurich implements a tenure track system equivalent to other top international universities. For candidates with exceptional research accomplishments, applications for a tenured associate or full professorship will also be considered.

→ Please apply online (application period starts on 31 October 2016) at:
www.facultyaffairs.ethz.ch

→ Applications include a curriculum vitae, a list of publications with the three most important ones marked, a statement of future research and teaching interests, the names of three references, and a description of the three most important achievements. The letter of application should be addressed to the President of ETH Zurich, Prof. Dr. Lino Guzzella. The closing date for applications is 15 December 2016. ETH Zurich is an equal opportunity and family friendly employer and is further responsive to the needs of dual career couples. We specifically encourage women to apply.

Georgia Institute of Technology
Computational Science and Engineering solves real-world problems in science, engineering, health, and social domains, by using high-performance computing, modeling and simulation, and large-scale “big data” analytics. The School of Computational Science and Engineering of the College of Computing at the Georgia Institute of Technology seeks tenure-track faculty at all levels. Our school seeks candidates who may specialize in a broad range of application areas including biomedical and health; urban systems and smart cities; social good and sustainable development; materials and manufacturing; and national security. Applicants must have an outstanding record of research, a sincere commitment to teaching, and interest in engaging in substantive interdisciplinary research with collaborators in other disciplines.

Georgia Tech is located in the heart of metro Atlanta, home to more than 5.3 million people and nearly 150,000 businesses, a world-class airport, lush parks and green spaces, competitive schools and numerous amenities for entertainment, sports and restaurants that all offer a top-tier quality of life. From its diverse economy, global access, abundant talent and low costs of business and lifestyle, metro Atlanta is a great place to call “home.” Residents have easy access to arts, culture, sports and nightlife, and can experience all four seasons, with mild winters that rarely require a snow shovel.

For best consideration, applications are due by December 16, 2016. The application material should include a full academic CV, a personal narrative on teaching and research, a list of at least three references and up to three sample publications. Georgia Tech is an Affirmative Action/Equal Opportunity Employer. Applications from women and under-represented minorities are strongly encouraged.

For more information about Georgia Tech’s School of Computational Science and Engineering please visit: http://www.cse.gatech.edu/

Mississippi State University
Bagley College of Engineering
Assistant Professors

Mississippi State University, through its Bagley College of Engineering, is seeking four new tenure-track faculty at the rank of Assistant Professor. Applicants should have teaching and research interests that can enhance the strengths of the college in one or more of the following areas of interest: (1) Energy, (2) Human Health Enhancement, (3) Information and Decision Systems, (4) Materials – Science and Engineering, (5) Transportation and Vehicular Systems, and (6) Water and the Environment. The successful applicants from this strategic college-level search will be placed into the most appropriate academic department. A PhD in an appropriate engineering or computer science discipline is required. Screening of applications will begin November 28, 2016 and will continue until the position is filled.

For a complete job description and requirements, visit https://www.bagley.msstate.edu. Interested candidates must apply on-line at https://www.msujobs.msstate.edu (search for positions in the Dean of Engineering).

MSU is an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, ethnicity, sex (including pregnancy and gender identity), national origin, disability status, age, sexual orientation, genetic information, protected veteran status, or any other characteristic protected by law. We always welcome nominations and applications from women, members of any minority group, and others who share our passion for building a diverse community that reflects the diversity in our student population.

New York University/Courant Institute of Mathematical Sciences
Department of Computer Science
Faculty

The department expects to have several regular faculty positions and invites candidates at all levels to apply. We will consider outstanding candidates in any area of computer science, in particular in systems, machine learning and data science, scientific computing and verification.

Faculty members are expected to be outstanding scholars and to participate in teaching at all levels from undergraduate to doctoral. New appointees will be offered competitive salaries and startup packages, with affordable housing within a short walking distance of the department. New York University is located in Greenwich Village, one of the most attractive residential areas of Manhattan.

The department has 34 regular faculty members and several clinical, research, adjunct, and visiting faculty members. The department’s current research interests include algorithms, cryp-
The Hong Kong Polytechnic University (PolyU) is a government-funded tertiary institution in Hong Kong. It offers programmes at various
levels including Doctorate, Master’s and Bachelor’s degrees. It has a full-time academic staff strength of around 1,200. The total consolidated
expenditure budget of the University is about HK$6.6 billion (US$1 = HK$7.8 approximately) per year. Committed to academic excellence in
a professional context, PolyU aspires to become a world-class university with an emphasis on the application value of its programmes and
research. Its vision is to become a leading university that excels in professional education, applied research and partnership for the
betterment of Hong Kong, the nation and the world.
The University is now inviting applications or nominations for the following post:
www.polyu.edu.hk
Coordination of our Capstone project series, Design of short term tutorials dedicated to Data Science topics of current interest and Coordination of Data Science Workshops.

To apply, please go to http://apply.interfolio.com/37381.

Swarthmore College
Assistant Professor

The Computer Science Department invites applications for one tenure-track position and multiple visiting positions at the rank of Assistant Professor to begin Fall semester 2017.

Swarthmore College is a small, selective, liberal arts college located 10 miles outside of Philadelphia. The Computer Science Department offers majors and minors at the undergraduate level.

Swarthmore College has a strong institutional commitment to excellence through diversity and inclusivity in its educational program and employment practices. The College actively seeks and welcomes applications from candidates with exceptional qualifications, particularly those with demonstrated commitments to a more inclusive society and world. For more information on Faculty Diversity and Excellence at Swarthmore, see http://www.swarthmore.edu/faculty-diversity-excellence/information-candidates-new-faculty

Applicants must have teaching experience and should be comfortable teaching a wide range of courses at the introductory and intermediate level. Candidates should additionally have a strong commitment to involving undergraduates in their research. A Ph.D. in Computer Science at or near the time of appointment is required.

For the tenure-track position, we are interested in applicants whose areas fit broadly into theory and algorithms, systems, or programming languages. Priority will be given to complete applications received by November 15, 2016.

For the visiting position, strong applicants in any area will be considered. Priority will be given to complete applications received by February 1, 2017.

Applications for both positions will continue to be accepted after these dates until the positions are filled.

Applications should include a cover letter, vita, teaching statement, research statement, and three letters of reference, at least one (preferably two) of which should speak to the candidate’s teaching ability. In your cover letter, please briefly describe your current research agenda; what would be attractive to you about teaching in a liberal arts college environment; and what background, experience, or interests are likely to make you a strong teacher of a diverse group of Swarthmore College students.

Tenure-track applications are being accepted online at
https://academicjobsonline.org/ajo?job-890-8018
Visiting applications are being accepted online at
https://academicjobsonline.org/ajo?job-890-8020
Candidates may apply for both positions.

Swarthmore College
Computer/Electrical Engineering Faculty (all ranks)

Swarthmore College invites applications for a tenure-track or tenured position at any rank in the area of Computer/Electrical Engineering, to start during the Fall semester of 2017. A doctorate in Computer or Electrical Engineering or a related field is required. The appointee will pursue a research program that encourages involvement by undergraduate students. Strong interests in undergraduate teaching, supervising senior design projects, and student mentoring are also required. Teaching responsibilities include courses in computer hardware such as computer architecture and digital logic, and electives in the appointee’s area of specialization.

Located in the suburbs of Philadelphia, Swarthmore College is a highly selective undergraduate liberal arts institution with 1500 students, whose mission combines academic excellence and social responsibility. Eight full-time faculty members in the Department of Engineering offer a rigorous, ABET-accredited program for the Bachelor of Science in Engineering to approximately 120 students. Sabbatical leave with support is available every fourth year. The department has an endowed equipment budget, and there is support for faculty/student collaborative research. For program details, see http://engin.swarthmore.edu/.

Please upload your CV, brief statements describing teaching philosophy and research interests, along with three letters of reference to:
ADVERTISING IN CAREER OPPORTUNITIES

How to Submit a Classified Line Ad: Send an e-mail to [email protected]. Please include text, and indicate the issue or issues where the ad will appear, and a contact name and number.

Estimates: An insertion order will then be e-mailed back to you. The ad will be typeset according to CACM guidelines. NO PROOFS can be sent. Classified line ads are NOT commissionable.

Rates: $325.00 for six lines of text, 40 characters per line. $32.50 for each additional line after the first six. The MINIMUM is six lines.

Deadlines: 20th of the month/2 months prior to issue date. For latest deadline info, please contact: [email protected]

Career Opportunities Online: Classified and recruitment display ads receive a free duplicate listing on our website at: http://jobs.acm.org

The newly launched ShanghaiTech University is built as a world-class research university located in Zhangjiang High-Tech Park. We invite highly qualified candidates to fill tenure-track/tenured faculty positions as its core team in the School of Information Science and Technology (SIST). Candidates should have exceptional academic records or demonstrate strong potential in cutting-edge research areas of information science and technology. They must be fluent in English. Overseas academic connection or background is highly desired.

Academic Disciplines:
We seek candidates in all cutting-edge areas of information science and technology. Our recruitment focus includes, but is not limited to: computer architecture and technologies, nano-scale electronics, high speed and RF circuits, intelligent and integrated signal processing systems, computational foundations, big data, data mining, visualization, computer vision, bio-computing, smart energy/power devices and systems, next-generation networking, as well as interdisciplinary areas involving information science and technology.

Compensation and Benefits:
Salary and startup funds are highly competitive, commensurate with experience and academic accomplishment. We also offer a comprehensive benefit package to employees and eligible dependents, including housing benefits. All regular ShanghaiTech faculty members will be within its new tenure-track system commensurate with international practice for performance evaluation and promotion.

Qualifications:
• A detailed research plan and demonstrated record/potentials;
• Ph.D. (Electrical Engineering, Computer Engineering, Computer Science, or related field);
• A minimum relevant research experience of 4 years.
at the undergraduate and graduate levels, obtaining funded research, publishing in peer-reviewed publications, and performing public service. Candidates must have a PhD degree or all but dissertation in Computer Science or equivalent. Starting Salary: $81,000 DOE. Based on a 9 month academic contract.

The department offers an ABET-accredited BS, MS, and interdisciplinary PhD degree. UAF (www.uaf.edu) is the major research campus in the University of Alaska system and hosts several research institutes. Fairbanks, a modern community with approximately 98,000 residents, is located in interior Alaska between the Alaska and Brooks mountain ranges and noted for the scope of unique outdoor activities. We seek candidates who will strengthen our degree programs and appreciate the unique geography and climate of interior Alaska.

Required documentation includes a cover letter, curriculum vitae, list of three professional references with contact information, and statement of teaching and research interests. Interested candidates must apply online at http://www.alaska.edu/jobs/ or http://careers.alaska.edu/cw/en-us/job/504751?lApplicationSubSourceID= by submitting all the required documents.

Please direct questions regarding this recruitment to Dr. Jon Genetti ([email protected]) or the department’s web site: www.cs.uaf.edu.

The successful candidate must be eligible to work in the United States in compliance with the Immigration Reform and Control Act. Review of applications will begin November 21, 2016 and the position will remain open until filled.

The University of Alaska Fairbanks is an equal opportunity/affirmative action employer and educational institution.

LECTURER POSITIONS
Department of Electrical and Systems Engineering

The University of Pennsylvania’s Department of Electrical and Systems Engineering invites applicants for two full-time Lecturer positions. The department seeks individuals with exceptional promise for, or a proven record of, excellence in teaching, course and curriculum innovation. Applicants should have a Ph.D. degree in Electrical or Systems Engineering or related field. We are particularly interested in candidates that enhance our educational curricula in the broad areas of:

1. Computer engineering & embedded systems (embedded programming, distributed systems, hardware/software co-design, model-based design, internet of things), and
2. Information & systems engineering (control systems, optimization, robotics, signal processing, stochastic systems, model-based systems engineering, systems engineering projects).

The department is strongly interested in individuals that will balance principles-based lectures with hands-on projects addressing emerging application domains.

Diversity candidates are strongly encouraged to apply. Interested persons should submit an online application at http://www.ese.upenn.edu/faculty-positions and include curriculum vitae, statement of teaching interests, and three references.

The University of Pennsylvania is an Equal Opportunity Employer. Minorities/Women/Individuals with Disabilities/Veterans are encouraged to apply.

University at Buffalo, The State University of New York
Department of Computer Science and Engineering

The Department of Computer Science and Engineering, University at Buffalo invites candidates to apply for multiple tenured and tenure-track faculty positions beginning in the 2017-2018 academic year. Candidates at all ranks from all areas of computer science and engineering, including but not limited to areas covered by existing faculty strength such as Algorithms, Big Data, Cyber Security, Cyber Physical Systems (or Internet of Things), Databases, Distributed Systems, Embedded Systems, Machine Learning, Mobile Computing, Multimedia, Pattern Recognition, Robotics, and Theory. Applicants must have a Ph.D. in computer science or a related area by August 2017 and demonstrate potential for excellence in research, teaching, service and mentoring. Applicants from underrepresented groups, especially women and minorities, are strongly encouraged. We are looking for candidates who can operate effectively in a diverse community of students and faculty and share our vision of helping all constituents reach their potential.

Applications will be accepted from October 15, 2016 to January 15, 2017. Applicants must submit their application electronically via www.ubjobs.buffalo.edu. Posting number 1600687. Any questions can be directed to Search Committee Co-Chairs, Prof. Rohini Srihari and Chang-Wen Chen at [email protected]. The University at Buffalo is an Equal Opportunity Employer.

Computer Science and Engineering Department
Housed in a new $75M building, and as a part of the School of Engineering and Applied Sciences, the Computer Science and Engineering department offers both BA and BS degrees in Computer Science and a BS in Computer Engineering (accredited by ABET) as well as MS and PhD programs.

The department currently has 38 tenured/tenure-track faculty, 7 teaching faculty, and approximately 900 undergraduate majors, 450 masters students, and 160 PhD students. Eighteen faculty, including 16 junior faculty, have been hired since 2010, and we are continuing to expand. Two members of our faculty currently hold key university leadership positions and eight members of our faculty are IEEE and/or ACM Fellows. Our faculty members are actively involved in cutting-edge research and successful interdisciplinary programs and centers devoted to biometrics; bioinformatics; biomedical computing; computational and data science and engineering; document analysis and recognition; high performance computing; information assurance and cyber security; embedded, networked and distributed systems; and sustainable transportation. Our annual research expenditure is about $5 million.

University at Buffalo (UB)
UB is New York’s largest and most comprehensive public university, with approximately 20,000 undergraduate students and 10,000 graduate students.

City and Region
The city of Buffalo is the second largest city in New York state, and was recently voted as one of the top ten best places to live and raise a family by Forbes magazine. Buffalo is near the world-famous Niagara Falls, the Finger Lakes, and the Niagara Wine Trail. The city is renowned for its architecture and features excellent museums, dining, cultural attractions, and several professional sports teams, and has a packed year-round calendar of cultural events and sporting activities, coupled with relatively low house prices and great schools. The economic renaissance of the region is underlined by a revitalized downtown waterfront and an energetic tech and start-up community. In an extraordinary recognition of Western New York’s potential, Governor Andrew M. Cuomo has committed an historic $1 billion investment in the Buffalo area economy to create thousands of jobs and spur billions in new investment and economic activity over the next several years.

The State University of New York at Buffalo
Department of Computer Science and Engineering
Non-Tenure Track Lecturer

The State University of New York at Buffalo Department of Computer Science and Engineering invites candidates to apply for non-tenure track lecturer positions beginning in fall 2017. We invite applications from candidates from all areas of Computer Science and Computer Engineering who have a passion for teaching. We are particularly looking for candidates who can operate effectively in a diverse community of students and faculty and share our vision of helping all constituents reach their potential. Applicants from underrepresented groups, especially women and minorities, are strongly encouraged.

Lecturer’s duties include teaching and development of undergraduate Computer Science and Computer Engineering courses (with an emphasis on lower division), advising undergraduate students, as well as participation in department and university governance (service). Contribution to research is encouraged.

Computer Science and Engineering Department
Computer Science and Engineering department is housed in a new $75M building and, as a part of the School of Engineering and Applied Sciences, the department offers both BA and BS degrees in Computer Science, a BS in Computer Engineering (accredited by ABET), a combined 5-year BS/MS program, a minor in Computer Science, two joint programs (a BA/MBA and with Computational Physics), and MS and PhD programs.

The department currently has 38 tenured/tenure-track faculty, 7 teaching faculty, and approximately 900 undergraduate majors, 450 masters students, and 160 PhD students. Eighteen faculty, including 16 junior faculty, have been hired since 2010, and we are continuing to expand. Two members of our faculty currently hold key university leadership positions and eight members of our faculty are IEEE and/or ACM Fellows. Our faculty
The Department of Computer Science (cs. To be considered an applicant, the following Chicago epitomizes the modern, livable, vi-
uchicago.edu) is the hub of a large, diverse com- materials are required: brant, and diverse city. Its airports are among
puting community of two hundred researchers ˲˲ Curriculum vitae with a list of publications the busiest in the world, with frequent non-stop
focused on advancing foundations of comput- ˲˲ One page teaching statement flights to virtually anywhere. Yet the cost of living,
ing and driving its most advanced applications. ˲˲ Three reference letters, one of which must whether in an 88th floor condominium down-
Long distinguished in theoretical computer sci- address the candidate’s teaching ability town or on a tree-lined street in one of the na-
ence and artificial intelligence, the Department tion’s finest school districts, is surprisingly low.
is now building strong systems and machine Reference letter submission information will Applications must be submitted at https://
learning groups. The larger community in these be provided during the application process. jobs.uic.edu/. Include a curriculum vitae, teach-
areas at the University of Chicago includes the Review of complete applications, including ing and research statements, and names and ad-
Department of Statistics, the Computation Insti- reference letters, will begin October 3, 2016, and dresses of at least three references in the online
tute, the Toyota Technological Institute at Chi- continue until the position is filled. application. Applicants needing additional infor-
cago (TTIC), the Polsky Center for Entrepreneur- The University of Chicago is an Affirmative Ac- mation may contact the Faculty Search at search-
ship and Innovation, and the Mathematics and tion/Equal Opportunity/Disabled/Veterans Em- [email protected]. For fullest consideration, ap-
Computer Science Division of Argonne National ployer and does not discriminate on the basis of ply by December 1, 2016, but applications will
Laboratory. race, color, religion, sex, sexual orientation, gen- be accepted until the positions are filled. The
The Chicago metropolitan area provides a di- der identity, national or ethnic origin, age, status University of Illinois is an Equal Opportunity, Af-
verse and exciting environment. The local econ- as an individual with a disability, protected veter- firmative Action employer. Minorities, women,
omy is vigorous, with international stature in an status, genetic information, or other protected veterans and individuals with disabilities are en-
banking, trade, commerce, manufacturing, and classes under the law. For additional information couraged to apply. The University of Illinois con-
transportation, while the cultural scene includes please see the University’s Notice of Nondiscrimi- ducts background checks on all job candidates
diverse cultures, vibrant theater, world-renowned nation at http://www.uchicago.edu/about/non_ upon acceptance of contingent offer of employ-
symphony, opera, jazz, and blues. The University discrimination_statement/. Job seekers in need ment. Background checks will be performed in
is located in Hyde Park, a Chicago neighborhood of a reasonable accommodation to complete the compliance with the Fair Credit Reporting Act.
on the Lake Michigan shore just a few minutes application process should call 773-702-5671 or
from downtown. email [email protected] with
The University of Chicago is an Affirmative Ac- their request. University of Kentucky
tion/Equal Opportunity/Disabled/Veterans Em- Department of Computer Science
ployer and does not discriminate on the basis of
race, color, religion, sex, sexual orientation, gen- University of Illinois at Chicago The University of Kentucky Computer Science
der identity, national or ethnic origin, age, status Department of Computer Science Department invites applications for multiple
as an individual with a disability, protected veter- Information Retrieval / Natural Language tenure-track faculty positions to begin in either
an status, genetic information, or other protected Processing / Theoretical Computer Science January or August of 2017.
classes under the law. For additional information Faculty We are seeking energetic and creative faculty
please see the University’s Notice of Nondiscrimi- who have a passion for teaching students and
nation at http://www.uchicago.edu/about/non_ The Computer Science Department at the Uni- for building a research program centered on ad-
discrimination_statement/. Job seekers in need versity of Illinois at Chicago (UIC) invites applica- vanced computing. We will consider all ranks,
of a reasonable accommodation to complete the tions for multiple full-time tenure-track positions with preference for candidates at the assistant
application process should call 773-702-5671 or at the rank of Assistant Professor (exceptional se- professor level. Outstanding candidates at the
email [email protected] with nior level candidates will also be considered). All rank of assistant professor will be considered for
their request. candidates must have a doctorate in Computer an endowed fellowship.
Science or a closely related field by the appoint- We value collaborative and interdisciplinary
ment’s starting date. Candidates will be expected research. Our faculty members have established
University of Chicago to conduct world class research and teach effec- research programs with other members of the
Lecturer tively at the undergraduate and graduate levels. department and with a wide variety of other de-
Senior candidates must have an outstanding partments and programs, including statistics,
The Department of Computer Science at the research record, a strong record of funded re- biology, linguistics, internal medicine, electrical
University of Chicago invites applications for the search, demonstrated leadership in collaborative engineering, computer engineering, and the hu-
position of Lecturer. Subject to the availability of research, and an excellent teaching record at the manities. We favor researchers who are eager to
funding, this would be a two year position with undergraduate and graduate level. collaborate to solve problems that extend beyond
the possibility of renewal. This position involves This search primarily seeks candidates in their own research areas.
teaching in the fall, winter and spring quarters. three research areas. Please clearly indicate for We seek applications from excellent can-
The successful candidate will have competence which one of those areas you wish to be consid- didates in all areas, with a particular desire for
in teaching and superior academic credentials, ered. Exceptional candidates from other areas expertise in computer networking, security and
and will carry responsibility for teaching comput- may also be considered. The focused research ar- privacy, machine learning, big data and data min-
er science courses and laboratories. Completion eas of faculty search are: ing, visualization and computer vision, artificial
of all requirements for a Ph.D. in Computer Sci- 1. Information Retrieval and Web search. intelligence, and software engineering. These ar-
ence or a related field is required at the time of ap- 2. Natural Language Processing and computa- eas complement the department’s Laboratory for
pointment and candidates must have experience tional linguistics. Advanced Networking, Software Verification and
teaching Computer Science at the College level. 3. Theoretical Computer Science. Validation Lab, and established collaborations
The Chicago metropolitan area provides a di- with the Center for Computational Sciences and
verse and exciting environment. The local econ- In addition, we may also have a position in the Center for Biomedical Informatics.
omy is vigorous, with international stature in cyber-physical systems. We value teaching and the student experi-
banking, trade, commerce, manufacturing, and The Computer Science department has 31 ence. Candidates should be eager and prepared
transportation, while the cultural scene includes tenure-system faculty and offers BS, MS and to teach upper-level courses in their areas of ex-
diverse cultures, vibrant theater, world-renowned PhD degrees. Our faculty includes 11 NSF CA- pertise, as well as (ultimately) core courses in our
symphony, opera, jazz and blues. The University REER award recipients. UIC has an advanced ABET-accredited undergraduate Computer Sci-
is located in Hyde Park, a Chicago neighborhood computing and networking infrastructure in ence and Computer Engineering programs.
on the Lake Michigan shore just a few minutes place for data-intensive scientific research that The University of Kentucky Computer Science
from downtown. is well-connected regionally, nationally and in- Department, one of the first CS departments in
Applicants must apply on line at the Universi- ternationally. Further information about the po- the United States, has 21 faculty members com-
ty of Chicago Academic Careers website at http:// sitions can be found at https://www.cs.uic.edu/ mitted to excellence in education, research, and
tinyurl.com/h84fu8p. MainShowJob?name=facINT. service. The Department offers programs leading
University of Oregon (including but not limited to architecture, data- York University
Department of Computer and Information driven systems, mobile and embedded systems,
Science Faculty Position networks, operating systems, and parallel and The Department of Electrical Engineering and
distributed systems), privacy and security, and Computer Science, York University, is seeking a
The Department of Computer and Information human computer interaction. We also welcome 3-year Contractually Limited Appointment (CLA)
Science (CIS) seeks applications for two tenure applicants in other areas of traditional strength at the rank of Sessional Assistant Lecturer (alter-
track faculty positions at the rank of Assistant for us including natural language processing. nate stream) to serve as Course Coordinator of all
Professor, beginning September 2017. The Uni- Candidates must have a PhD in computer science first-year Major and Service computing courses
versity of Oregon is an AAU research university lo- or a related discipline. offered by the Department, to commence July 1,
cated in Eugene, two hours south of Portland, and Apply online via https://www.rochester.edu/ 2017, subject to budgetary approval. The success-
within one hour’s drive of both the Pacific Ocean faculty-recruiting/login. ful candidate has demonstrated excellence in
and the snow-capped Cascade Mountains. Consideration of applications at any rank teaching, is licensed as a Professional Engineer
The open faculty positions are targeted to- will begin immediately and continue until all in- in Canada or could obtain licensure in the very
wards the following three research areas: 1) high terview slots are filled. Candidates should apply short term, and will assume the teaching of up
performance computing, 2) networking and dis- no later than January 1, 2017, for full consider- to 6 course sections. For full position details, see
tributed systems and 3) data sciences. We are par- ation. Applications that arrive after this date risk http://www.yorku.ca/acadjobs. Applicants should
ticularly interested in applicants whose research being overlooked or arriving after the interview complete the on-line process at http://lassonde.
addresses security and privacy issues in these schedule has been filled. The department also yorku.ca/new-faculty/. A complete application
sub-disciplines; additionally, we are interested in has a search in progress for a full-time lecturer includes a cover letter, a detailed curriculum vi-
applicants whose research complements existing position. Details on eligibility and position re- tae, statement of contribution to research, teach-
strengths in the department, so as to support in- sponsibilities may be obtained via http://www. ing and curriculum development, three sample
terdisciplinary research efforts. Applicants must cs.rochester.edu/about/recruit.html. research publications and three reference let-
have a Ph.D. in computer science or closely relat- The Department of Computer Science is re- ters. Complete applications must be received by
ed field, a demonstrated record of excellence in search focused, with a distinguished history of December 31, 2016. York University is an Affirma-
research, and a strong commitment to teaching. contributions in artificial intelligence, HCI, sys- tive Action (AA) employer. The AA Program can be
A successful candidate will be expected to con- tems, and theory. We have a highly collaborative found at http://www.yorku.ca/acadjobs or a copy
duct a vigorous research program and to teach at culture and strong ties to electrical and computer can be obtained by calling the AA office at 416-
both the undergraduate and graduate levels. engineering, brain and cognitive science, linguis- 736-5713. All qualified candidates are encour-
We offer a stimulating, friendly environment tics, and several departments in the medical cen- aged to apply; however, Canadian citizens and
for collaborative research both within the depart- ter. We also have a growing Institute for Data Sci- permanent residents will be given priority.
ment, which expects to grow substantially in the ence with potential for synergistic collaboration
next few years, and with other departments on opportunities and more joint hires. Over the past
campus. The department hosts two research cen- decade, a third of the department’s PhD gradu- York University
ters, the Center for Cyber Security and Privacy and ates have won tenure-track faculty positions, and
the NeuroInformatics Center. Successful candi- its alumni include leaders at major research labo- The Department of Electrical Engineering and
dates will have access to a new high-performance ratories such as Google, Microsoft, and IBM. Computer Science, York University, is seeking a
computing facility that opens in October 2016. The University of Rochester is a private, Tier I 3-year Contractually Limited Appointment (CLA)
The CIS Department is part of the College of research institution located in western New York at the rank of Sessional Assistant Lecturer (alter-
Arts and Sciences and is housed within the Lorry State. It consistently ranks among the top 30 insti- nate stream) to serve as Course Coordinator of all
Lokey Science Complex. The department offers tutions, both public and private, in federal fund- first-year Major and Service computing courses
B.S., M.S. and Ph.D. degrees. More information ing for research and development. The university offered by the Department, to commence July 1,
about the department, its programs and faculty has made substantial investments in computing 2017, subject to budgetary approval. The success-
can be found at http://www.cs.uoregon.edu. infrastructure through the Center for Integrated ful candidate has demonstrated excellence in
Applications will be accepted electronically Research Computing (CIRC) and the Health Sci- teaching, is licensed as a Professional Engineer
through the department’s web site. Applica- ences Center for Computational Innovation (HSC- in Canada or could obtain licensure in the very
tion information can be found at http://www. CI). The university includes the Eastman School short term, and will assume the teaching of up
cs.uoregon.edu/Employment/. Applications re- of Music, a premier music conservatory, and the to 6 course sections. For full position details, see
ceived by December 15, 2016 will receive full con- University of Rochester Medical Center, a major http://www.yorku.ca/acadjobs. Applicants should
sideration. Review of applications will continue medical school, research center, and hospital sys- complete the on-line process at http://lassonde.
until the positions are filled. Please address any tem. The greater Rochester area is home to over yorku.ca/new-faculty/. A complete application
questions to [email protected]. a million people, including 80,000 students who includes a cover letter, a detailed curriculum vi-
The University of Oregon is an equal oppor- attend its 8 colleges and universities. Tradition- tae, statement of contribution to research, teach-
tunity/affirmative action institution committed ally strong in optics research and manufacturing, ing and curriculum development, three sample
to cultural diversity and is compliant with the it was recently selected by the Department of De- research publications and three reference let-
Americans with Disabilities Act. The University fense as the hub of a $360M-plus Integrated Pho- ters. Complete applications must be received by
encourages all qualified individuals to apply, and tonics Institute for Manufacturing Innovation. December 31, 2016. York University is an Affirma-
does not discriminate on the basis of any protect- The University of Rochester has a strong com- tive Action (AA) employer. The AA Program can be
ed status, including veteran and disability status. mitment to diversity and actively encourages found at http://www.yorku.ca/acadjobs or a copy
The successful candidate will have the ability to applications from groups underrepresented in can be obtained by calling the AA office at 416-
work effectively with faculty, staff, and students higher education. The University is an Equal Op- 736-5713. All qualified candidates are encour-
from a variety of diverse backgrounds. portunity Employer. aged to apply; however, Canadian citizens and
EOE Minorities / Females / Protected Veter- permanent residents will be given priority.
ans/Disabled
University of Rochester
Faculty Positions in Computer Science
US Naval Academy
The University of Rochester Department of Com-
puter Science seeks applicants for multiple ten- USNA Electrical & Computer Engineering is seek-
ure track positions. Candidates in all areas of ing applicants to fill tenure-track Assistant Pro-
computer science who see a good synergistic fit fessor positions in Computer Engineering. Ap-
with research initiatives at the university are en- plicants with teaching & research interests in all
couraged to apply. We are especially interested areas of computer engineering will be considered.
in growing our research strengths in systems Applications accepted through “Apply URL” only.
Future Tense
The Candidate
Seeking the programmer vote, an AI delivering a slogan
like “Make Coding Great Again” could easily be seen as a threat.
entity called Halle, topped with a base- is needed for that role?” we’re way more flexible.”
ball cap labeled “Make Coding Great “I could turn that back on you,” said “Now let’s bring in the union’s me-
Again.” “As president of the Code Cut- Halle, “and ask why we need human dia spokesman, Adam Selene,” said
ters Association, I would be represent- intelligence in any role involving rou- the anchor. What’s wrong with Halle
ing thousands of highly skilled and tine or repetitive effort. AI is here to the anchor. “What’s wrong with Halle
well-compensated American workers stay. The deep neural networks in my Who knows [CONTINUED ON P. 135]
HUMANS,
MACHINES
AND THE
FUTURE OF WORK
The conference will focus on issues created by the impact RENOWNED SPEAKERS AND PANELISTS:
of information technology on labor markets over the next
Diane Bailey John Markoff
25 years, addressing questions such as: Associate Professor, School of Senior Writer, The New York Times
Information, The University of
Texas at Austin Lawrence Mishel
• What advances in artificial intelligence, robotics President, Economic Policy Institute
and automation are expected over the next 25 Guruduth Banavar
Vice President, Cognitive Computing, Joel Mokyr
years? IBM Research Robert H. Strotz Professor,
Northwestern University;
John Seely Brown Sackler Professor by Special
• What will be the impact of these advances on job Co-chairman, Deloitte’s Center for the Appointment, Eitan Berglas School
Edge; Adviser to the provost at USC of Economics, Tel Aviv University
creation, job destruction and wages in the labor
market? Daniel Castro David Nordfors
Vice President, Information Technology Co-founder and Co-chair,
and Innovation Foundation Innovation for Jobs Summit
• What skills are required for the job market of the Stuart Elliott Debra Satz
future? Directorate for Education and Skills Marta Sutton Weeks Professor
Organization for Economic Co- of Ethics in Society, Professor of
operation and Development (OECD) Philosophy, Senior Associate Dean for
• Can education prepare workers for that job the Humanities and Arts, J. Frederick
Richard B. Freeman and Elisabeth Brewer Weintz University
market? Herbert Ascherman Chair in Fellow in Undergraduate Education,
Economics, Harvard University Stanford University