Communications201710-Dl - Barriers To Refactoring
COMMUNICATIONS OF THE ACM
CACM.ACM.ORG 10/2017 VOL.60 NO.10
Internet Advertising
Metaphors We Compute By
3D-Printing Body Parts
What Can Agile Methods Bring
to Software Development?
Association for
Computing Machinery
Previous
A.M. Turing Award
Recipients
Metaphors We Compute By
Code is a story that explains how to solve a particular problem.
By Alvaro Videla

Research for Practice: Technology for Underserved Communities; Personal Fabrication
Expert-curated guides to the best of CS research.

Four Ways to Make CS and IT More Immersive
Why the Bell curve hasn’t transformed into a hockey stick.
By Thomas A. Limoncelli

Articles’ development led by queue.acm.org

Barriers to Refactoring
Developers know refactoring improves their software, but many find themselves unable to do so when they want to.
By Ewan Tempero, Tony Gorschek, and Lefteris Angelis
Watch the authors discuss their work in this exclusive Communications video.
https://cacm.acm.org/videos/barriers-to-refactoring

Millennials’ Attitudes Toward IT Consumerization in the Workplace
Millennials entering the workforce ignore the risks of using privately owned devices on the job.
By Heiko Gewald, Xuequn Wang, Andy Weeger, Mahesh S. Raisinghani, Gerald Grant, Otavio Sanchez, and Siddhi Pittayachawan

Internet Advertising: Technology, Ethics, and a Serious Difference of Opinion
Exploring the technical and ethical issues surrounding Internet advertising and ad blocking.
By Stephen B. Wicker and Kolbeinn Karlsson

Research Highlights

Technical Perspective: Broadening and Deepening Query Optimization Yet Still Making Progress
By Jeffrey F. Naughton

Multi-Objective Parametric Query Optimization
By Immanuel Trummer and Christoph Koch

Technical Perspective: Shedding New Light on an Old Language Debate
By Jeffrey S. Foster

A Large-Scale Study of Programming Languages and Code Quality in GitHub
By Baishakhi Ray, Daryl Posnett, Premkumar Devanbu, and Vladimir Filkov
Communications of the ACM is the leading monthly print and online magazine for the computing and information technology fields.
Communications is recognized as the most trusted and knowledgeable source of industry information for today’s computing professional.
Communications brings its readership in-depth coverage of emerging areas of computer science, new trends in information technology,
and practical applications. Industry leaders use Communications as a platform to present and debate various technology implications,
public policies, engineering challenges, and market trends. The prestige and unmatched reputation that Communications of the ACM
enjoys today is built upon a 50-year commitment to high-quality editorial content and a steadfast dedication to advancing the arts,
sciences, and applications of information technology.
ACM, the world’s largest educational and scientific computing society, delivers resources that advance computing as a science and profession. ACM provides the computing field’s premier Digital Library and serves its members and the computing profession with leading-edge publications, conferences, and career resources.

Executive Director and CEO
Bobby Schnabel
Deputy Executive Director and COO
Patricia Ryan
Director, Office of Information Systems
Wayne Graves
Director, Office of Financial Services
Darren Ramdin
Director, Office of SIG Services
Donna Cappo
Director, Office of Publications
Scott E. Delman

ACM COUNCIL
President
Vicki L. Hanson
Vice-President
Cherri M. Pancake
Secretary/Treasurer
Elizabeth Churchill
Past President
Alexander L. Wolf
Chair, SGB Board
Jeanna Matthews
Co-Chairs, Publications Board
Jack Davidson and Joseph Konstan
Members-at-Large
Gabriele Anderst-Kotis; Susan Dumais; Elizabeth D. Mynatt; Pamela Samuelson; Eugene H. Spafford
SGB Council Representatives
Paul Beame; Jenna Neefe Matthews; Barbara Boucher Owens

BOARD CHAIRS
Education Board
Mehran Sahami and Jane Chu Prey
Practitioners Board
Terry Coatta and Stephen Ibaraki

REGIONAL COUNCIL CHAIRS
ACM Europe Council
Dame Professor Wendy Hall
ACM India Council
Srinivas Padmanabhuni
ACM China Council
Jiaguang Sun

PUBLICATIONS BOARD
Co-Chairs
Jack Davidson; Joseph Konstan
Board Members
Phoebe Ayers; Karin K. Breitman; Terry J. Coatta; Anne Condon; Nikil Dutt; Roch Guerrin; Chris Hankin; Carol Hutchins; Yannis Ioannidis; Michael L. Nelson; M. Tamer Ozsu; Eugene H. Spafford; Stephen N. Spencer; Alex Wade; Keith Webster; Julie R. Williamson

ACM U.S. Public Policy Office
1701 Pennsylvania Ave NW, Suite 300, Washington, DC 20006 USA
T (202) 659-9711; F (202) 667-1066

Computer Science Teachers Association
Deborah Seehorn, Interim Executive Director

STAFF
DIRECTOR OF PUBLICATIONS
Scott E. Delman
[email protected]
Executive Editor
Diane Crawford
Managing Editor
Thomas E. Lambert
Senior Editor
Andrew Rosenbloom
Senior Editor/News
Lawrence M. Fisher
Web Editor
David Roman
Rights and Permissions
Deborah Cotton
Editorial Assistant
Jade Morris
Art Director
Andrij Borys
Associate Art Director
Margaret Gray
Assistant Art Director
Mia Angelica Balaquiot
Production Manager
Bernadette Shade
Advertising Sales Account Manager
Ilia Rodriguez
Columnists
David Anderson; Phillip G. Armour; Michael Cusumano; Peter J. Denning; Mark Guzdial; Thomas Haigh; Leah Hoffmann; Mari Sako; Pamela Samuelson; Marshall Van Alstyne

CONTACT POINTS
Copyright permission
[email protected]
Calendar items
[email protected]
Change of address
[email protected]
Letters to the Editor
[email protected]

WEBSITE
http://cacm.acm.org

AUTHOR GUIDELINES
http://cacm.acm.org/about-communications/author-center

ACM ADVERTISING DEPARTMENT
2 Penn Plaza, Suite 701, New York, NY 10121-0701
T (212) 626-0686
F (212) 869-0481
Advertising Sales Account Manager
Ilia Rodriguez
[email protected]
Media Kit [email protected]

Association for Computing Machinery (ACM)
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA
T (212) 869-7440; F (212) 869-0481

EDITORIAL BOARD
EDITOR-IN-CHIEF
Andrew A. Chien
[email protected]
SENIOR EDITOR
Moshe Y. Vardi

NEWS
Co-Chairs
William Pulleyblank and Marc Snir
Board Members
Monica Divitini; Mei Kobayashi; Michael Mitzenmacher; Rajeev Rastogi; François Sillion

VIEWPOINTS
Co-Chairs
Tim Finin; Susanne E. Hambrusch; John Leslie King; Paul Rosenbloom
Board Members
William Aspray; Stefan Bechtold; Michael L. Best; Judith Bishop; Stuart I. Feldman; Peter Freeman; Mark Guzdial; Rachelle Hollander; Richard Ladner; Carl Landwehr; Carlos Jose Pereira de Lucena; Beng Chin Ooi; Loren Terveen; Marshall Van Alstyne; Jeannette Wing

PRACTICE
Chair
Stephen Bourne and Theo Schlossnagle
Board Members
Eric Allman; Samy Bahra; Peter Bailis; Terry Coatta; Stuart Feldman; Nicole Forsgren; Camille Fournier; Benjamin Fried; Pat Hanrahan; Tom Killalea; Tom Limoncelli; Kate Matsudaira; Marshall Kirk McKusick; Erik Meijer; George Neville-Neil; Jim Waldo; Meredith Whittaker

CONTRIBUTED ARTICLES
Co-Chairs
James Larus and Gail Murphy
Board Members
William Aiello; Robert Austin; Elisa Bertino; Gilles Brassard; Kim Bruce; Alan Bundy; Peter Buneman; Carl Gutwin; Yannis Ioannidis; Gal A. Kaminka; Karl Levitt; Igor Markov; Gail C. Murphy; Bernhard Nebel; Lionel M. Ni; Adrian Perrig; Sriram Rajamani; Marie-Christine Rousset; Krishan Sabnani; Ron Shamir; Yoav Shoham; Josep Torrellas; Michael Vitale; Hannes Werthner; Reinhard Wilhelm

RESEARCH HIGHLIGHTS
Co-Chairs
Azer Bestavros and Gregory Morrisett
Board Members
Martin Abadi; Amr El Abbadi; Sanjeev Arora; Michael Backes; Maria-Florina Balcan; Andrei Broder; Doug Burger; Stuart K. Card; Jeff Chase; Jon Crowcroft; Alexei Efros; Alon Halevy; Sven Koenig; Steve Marschner; Tim Roughgarden; Guy Steele, Jr.; Margaret H. Wright; Nicholai Zeldovich; Andreas Zeller

WEB
Chair
James Landay
Board Members
Marti Hearst; Jason I. Hong; Jeff Johnson; Wendy E. MacKay

ACM Copyright Notice
Copyright © 2017 by Association for Computing Machinery, Inc. (ACM). Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.

For other copying of articles that carry a code at the bottom of the first or last page or screen display, copying is permitted provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center; www.copyright.com.

Subscriptions
An annual subscription cost is included in ACM member dues of $99 ($40 of which is allocated to a subscription to Communications); for students, cost is included in $42 dues ($20 of which is allocated to a Communications subscription). A nonmember annual subscription is $269.

ACM Media Advertising Policy
Communications of the ACM and other ACM Media publications accept advertising in both print and electronic formats. All advertising in ACM Media publications is at the discretion of ACM and is intended to provide financial support for the various activities and services for ACM members. Current advertising rates can be found by visiting http://www.acm-media.org or by contacting ACM Media Sales at (212) 626-0686.

Single Copies
Single copies of Communications of the ACM are available for purchase. Please contact [email protected].

COMMUNICATIONS OF THE ACM (ISSN 0001-0782) is published monthly by ACM Media, 2 Penn Plaza, Suite 701, New York, NY 10121-0701. Periodicals postage paid at New York, NY 10001, and other mailing offices.

POSTMASTER
Please send address changes to
Communications of the ACM
2 Penn Plaza, Suite 701
New York, NY 10121-0701 USA

Printed in the U.S.A.
Computing Is a Profession
The notion of what constitutes a profession has been studied extensively through exploration of the attributes of the activities, roles, and community that lead to their rise and definition, and of how they achieve importance and influence in society.[1] Common among these attributes are a deep technical expertise, an essential, valued, societal contribution, and the need to adhere to high ethical and technical standards. Professions such as medicine, law, and accounting exemplify these attributes. Computing exhibits all of the attributes of a profession.

Deep Technical Expertise. Every day we witness and drive computing’s rapid technical advance: new technologies, advancing sophistication, and outright new capabilities. The compounding of this continued and accelerating advance gives rise to a deep technical expertise. The behavioral and internal complexity of algorithms and systems is a peer to the greatest complexities humanity has known in biology, society, and the universe.

Societal Recognition. Computing’s evident importance to society is deep and growing: sophisticated collection and information processing underpins decision-making, logistics, and optimization in industry and commerce. Web, email, and messaging platforms are the information backbone of government and commerce. Social application platforms expand these roles from official to social, insinuating computing into the core of the social fabric. A world without cheap, pervasive computing of extraordinary capability is, if not inconceivable, at least so distant as to be unrecognizable, just as modern existence without the practice of medicine or law would be unimaginable.

Necessity for High Ethical and Technical Standards. The practice of computing demands adherence to the highest technical and ethical standards, and failure has significant consequences at personal, corporate, national, and international scale. Computing applications reach into every corner. Appropriation, misuse, or just free flow of this information has demonstrably destroyed individual privacy, relationships, and careers (for example, Ashley Madison) and grand corporate plans (for example, Sony), and has radically changed international relations (for example, the U.S. Govt OPM penetration and the Snowden releases).

With growing excitement for artificial intelligence, computing is being thrust into new societal roles (recommender, decider) and given autonomy to make decisions with life-changing human impact (personal assistant, sentencing guidelines, self-driving cars). While deep technical challenges abound, the ethical challenges, principles, and standards are even more daunting. Compare them to Hanks’ “rulebook” in Bridge of Spies, the underlying principles of a representative government as embodied in the Constitution of the U.S., and one can only wonder if the issues are any less thorny.

So yes, computing is a profession, and we should proudly embrace the responsibility. We should welcome, educate, and mentor new generations not just as “coders” and “hackers” or programmers, but as computing professionals.

Computing professionals have a responsibility to practice at the state-of-the-art, and maintain their knowledge at the forefront. In addition, professionals have an obligation to share clear understanding of the technology and its implications with non-professionals, and operate in accord with professional ethics.[2] These are daunting goals for an individual, and so professional societies play a critical role in cultivating and supporting professionalism. What roles specifically?

First, a professional society should be a conservator and disseminator of deep technical knowledge and expertise: championing the advance of the field by leading technologists worldwide, documenting the state-of-the-art in technology and application, and accelerating the dissemination and availability of such knowledge to computing professionals.

Second, societies develop and advocate principles for ethical technical conduct that frame the role of computing professionals, and buttress them with the stature and role of the profession in society. Examples include the articulation of best practices and intellectual challenges for the field,[3] as well as addressing societal questions that require deep technical perspective, such as the USACM joint release on the Internet of Things.[4]

An independent professional society must transcend any individual, organization, government, or cause. Necessarily so, as technical knowledge and professional ethics must inform professional conduct, and inevitably come into conflict with personal interest, corporate interest, government or national interest, or even overt coercion.

As the leading computing professional society, the ACM seeks to fill these roles for computing!

Andrew A. Chien, EDITOR-IN-CHIEF

Andrew A. Chien is the William Eckhardt Distinguished Service Professor in the CS Department at the University of Chicago, Director of the CERES Center for Unstoppable Computing, and a Senior Scientist at Argonne National Lab.

References
1. Abbott, A. The System of Professions: An Essay on the Division of Expert Labor. University of Chicago Press, 1988, ISBN-13: 978-0226000695
2. ACM Code of Ethics, (1992); https://www.acm.org/about-acm/code-of-ethics
3. Denning, P.J. The Profession of IT. Columns in Commun. ACM, 2008–2017.
4. USACM. Statement on Internet of Things Privacy and Security, 2017; https://www.acm.org/

Copyright held by author.
A.M. Turing Award: ACM’s most prestigious award recognizes contributions of a technical nature which are of lasting and major technical importance to the computing community. The award is accompanied by a prize of $1,000,000 with financial support provided by Google.

ACM Prize in Computing (previously known as the ACM-Infosys Foundation Award in the Computing Sciences): recognizes an early- to mid-career fundamental, innovative contribution in computing that, through its depth, impact and broad implications, exemplifies the greatest achievements in the discipline. The award carries a prize of $250,000. Financial support is provided by Infosys Ltd.

Distinguished Service Award: recognizes outstanding service contributions to the computing community as a whole.

Doctoral Dissertation Award: presented annually to the author(s) of the best doctoral dissertation(s) in computer science and engineering, and is accompanied by a prize of $20,000. The Honorable Mention Award is accompanied by a prize totaling $10,000. Financial support of the award is provided by Google, Inc. Winning dissertations are published in the ACM Digital Library and the ACM Books Series.

ACM – IEEE CS George Michael Memorial HPC Fellowship: honors exceptional PhD students throughout the world whose research focus is on high-performance computing applications, networking, storage, or large-scale data analysis using the most powerful computers that are currently available. The Fellowship includes a $5,000 honorarium.

Grace Murray Hopper Award: presented to the outstanding young computer professional of the year, selected on the basis of a single recent major technical or service contribution. The candidate must have been 35 years of age or less at the time the qualifying contribution was made. A prize of $35,000 accompanies the award. Financial support is provided by Microsoft.

Paris Kanellakis Theory and Practice Award: honors specific theoretical accomplishments that have had a significant and demonstrable effect on the practice of computing. This award is accompanied by a prize of $10,000 and is endowed by contributions from the Kanellakis family, and financial support by ACM’s SIGACT, SIGDA, SIGMOD, SIGPLAN, and the ACM SIG Project Fund, and individual contributions.

Karl V. Karlstrom Outstanding Educator Award: presented to an outstanding educator who is appointed to a recognized educational baccalaureate institution, recognized for advancing new teaching methodologies, effecting new curriculum development or expansion in computer science and engineering, or making a significant contribution to ACM’s educational mission. Teachers with 10 years or less experience are given special consideration. The Karlstrom Award is accompanied by a prize of $10,000. Financial support is provided by Pearson Education.

ACM – AAAI Allen Newell Award: presented to individuals selected for career contributions that have breadth within computer science, or that bridge computer science and other disciplines. The $10,000 prize is provided by ACM and AAAI, and by individual contributions.

Outstanding Contribution to ACM Award: recognizes outstanding service contributions to the Association. Candidates are selected based on the value and degree of service overall.

ACM Policy Award: recognizes an individual or small group that had a significant positive impact on the formation or execution of public policy affecting computing or the computing community. This can be for education, service, or leadership in a technology position; for establishing an innovative program in policy education or advice; for building the community or community resources in technology policy; or other notable policy activity. The biennial award is accompanied by a $10,000 prize. The first award will be the 2017 award.

Software System Award: presented to an institution or individuals recognized for developing a software system that has had a lasting influence, reflected in contributions to concepts, in commercial acceptance, or both. A prize of $35,000 accompanies the award with financial support provided by IBM.

ACM Athena Lecturer Award: celebrates women researchers who have made fundamental contributions to computer science. The award includes a $25,000 honorarium provided by Google.

For SIG-specific Awards, please visit http://awards.acm.org/sig-awards.

Vinton G. Cerf, ACM Awards Committee Co-Chair
John R. White, ACM Awards Committee Co-Chair
Insup Lee, SIG Governing Board Awards Committee Liaison
Rosemary McGuinness, ACM Awards Committee Liaison
cerf’s up
Six Education
How many ways can you make numbers add up to six? The obvious pairs are (1,5), (2,4), (3,3), (4,2) and (5,1). OK, I left out (0,6) and (6,0), which are of negligible interest.

Of course, if you permit negative numbers the number of pairs is infinite: (-1,7), (-2,8), and so on. When I was growing up, we learned our addition and times tables, memorized them, and used that information to do simple arithmetic. I recently had the opportunity to join several thousand teachers in Southern California for an annual confab on teaching.[a] I came away with a very different view of elementary and secondary education than I had going in.

For years we have tended to measure how well our students have learned by testing what they know. In the discussions at California State University, Fullerton, a different mindset emerged. Why not test what students can do? The shift in focus takes into account that we have myriad ways in the 21st century to find out what we want to know. Part of knowing how to do things is finding information when we need it. Moreover, it seems evident that we learn better not from rote memorization, but from using knowledge to do things. The Montessori schools approach learning from an experiential and exploratory perspective, for example. We learn from our mistakes sometimes more than from our successes. Certainly science tends to work that way.

The Common Core State education standards[b] initiative gets at this idea by focusing on the reasons why different methods get the same mathematical results. The point is not to use every possible way to do computation but, rather, to understand in a more fundamental way the nature of computation. Students are asked to show their work (their reasoning) as a way of determining how well they have absorbed basic concepts of mathematics, for example. Another way in which this depth of knowledge is assessed is to ask students to show they can make practical estimates of the magnitude of answers they should expect to get. This can be a quick way of ensuring the answers are in the right ballpark. For example:

   197
  +332
  ----
   529

To estimate the ballpark, we could sum 200+330 to get 530, which is a good approximation to 529, or even 200+300 = 500, which is easier to do in our heads.

In the No Child Left Behind legislation, testing knowledge was the gold standard but it led to behavior such as “teach to the test.” While this might produce good scores, it might not produce good understanding. What I think we want is to produce graduate students who know how to learn and how to find information they need from a variety of sources. This will also prove to be important as children grow into adults and experience longer working careers spanning decades. Without much doubt these adults will need to learn new things to stay current as technology continues its relentless evolutionary pace. Adapting to change will be a career-enhancing capability.

One of the keynote speakers was Jill Biden, a teacher of teachers who, astonishingly, taught at Northern Virginia Community College while serving the U.S. as Second Lady. She referred to the ineffable pain as her stepson, Beau Biden, lost his battle with cancer and drew to mind Vice President Biden’s earlier loss of his first wife, Neilia, and daughter, Naomi, in a car crash years before. In her moving account of life before, during, and after the Obama/Biden Administration she summarized: “What matters most in life is how well we will walk through fire.” This, too, was a life lesson I took away from this very thought-provoking event.

In the last couple of months I have also encountered new ways to teach mathematics in online settings. Alexander Khachatryan founded the ReasoningMind organization to create courses in mathematics for pre-K, K, elementary, and secondary school levels.[c] I tried some of the lessons for the youngest children and they plainly laid a foundation for understanding the notion of sets and set membership and equivalence classes, all without using unnecessarily complex and obscure terminology.

a Better Together: California Teachers Summit 2017 (http://cateacherssummit.com/)
b https://en.wikipedia.org/wiki/Common_Core_State_Standards_Initiative
c https://www.reasoningmind.org/

Vinton G. Cerf is vice president and Chief Internet Evangelist at Google. He served as ACM president from 2012–2014.

Copyright held by author.
DOI:10.1145/3135241
In their article “The Science of Brute Force” (Aug. 2017), Marijn J.H. Heule and Oliver Kullmann humorously asked whether a mathematician using brute force is really “a kind of barbaric monster.” While applying

of all possible ways to describe physician-documented insomnia.[1] We tested this approach extensively, and it played an important role in all our subsequent publications. We even extended it for additional uses (such as to extract family histories of coronary
Dear Colleague,
Without computing professionals like you, the world might not know the modern
operating system, digital cryptography, or smartphone technology to name an obvious few.
For over 60 years, ACM has helped computing professionals be their most creative, connect
to peers, and see what’s next, and inspired them to advance the profession and make a
positive impact.
ACM offers the resources, access and tools to invent the future. No one has a larger
global network of professional peers. No one has more exclusive content. No one
presents more forward-looking events. Or confers more prestigious awards. Or provides
a more comprehensive learning center.
Here are just some of the ways ACM Membership will support your professional growth
and keep you informed of emerging trends and technologies:
Joining ACM means you dare to be the best computing professional you can be. It means
you believe in advancing the computing profession as a force for good. And it means
joining your peers in your commitment to solving tomorrow’s challenges.
Sincerely,
Vicki L. Hanson
President
Association for Computing Machinery
Join ACM-W: ACM-W supports, celebrates, and advocates internationally for the full engagement of women in
computing. Membership in ACM-W is open to all ACM members and is free of charge.
The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we’ll publish selected posts or excerpts.
DOI:10.1145/3131066 http://cacm.acm.org/blogs/blog-cacm
Representations,
mathematics in the n-dimensional space). This is certainly an intriguing result, in accord with our un-

queen
The female ruler of an independent state, especially one who inherits the position by right of birth

king
The male ruler of an independent state, especially one who inherits the position by right of birth

ruler
A person exercising government or dominion

These definitions [2] show gender can be “factored out,” and in common usage the gender aspect of sovereigns is notable. We would expect those phenomena to show up in vast text corpora. In fact, we would expect that to show up in text corpora because of the dictionary entries. Since we base word use on definitions captured by the dictionary, it is natural for any graph-theoretic distance metric based on node placement to (somehow) reflect that cross-semantic structure.

Suppose that, employing the English slang terms “gal” and “guy” for female and male, the word for queen were “rulergal,” and for king “rulerguy” (perhaps the word for mother were “parentgal,” and for father, “parentguy”). Then the word vector offsets calculated would not appear as remarkable, the relationships being exposed in the words themselves.

The system word2vec constructs and operates through the implicit framework of a dictionary, which gave rise to the input data to word2vec. How could it be otherwise? As we understand the high degree of contextual dependency of word meanings in a language, any representation of word meaning to a significant degree will reflect context, where context is its interassociation with other words.

The result is still intriguing. We have to ask how co-occurrence of words can reliably lay out semantic relationships. We might explore the aspects of semantics missing from context analysis, if any. We might (and should) ask what sort of processing of a dictionary would deliver the same sort of representations, if any.

The word vectors produced by the method of training on a huge natural text dataset, in which words are given distributed vector representations refined through associations present in the input context, reflect the cross-referential semantic compositionality of a dictionary. That close reflection is derived from the fact that words in natural text will be arranged in accordance with dictionary definitions. The word2vec result is a revelation of an embedded regularity.

References
1. Mikolov, T., Chen, K., Corrado, C., and Dean, J. Efficient Estimation of Word Representations in Vector Space, https://arxiv.org/abs/1301.3781.
2. Oxford Living Dictionary, 2017, Oxford University Press, https://en.oxforddictionaries.com/.
3. Rong, X. word2vec Parameter Learning Explained, https://arxiv.org/abs/1411.2738.

Mark Guzdial
Coding in Schools as New Vocationalism: Larry Cuban on What Schools Are For
http://bit.ly/2tpSgip
July 18, 2017

Larry Cuban is an educational historian who has written before on why requiring coding in schools is a bad idea (http://bit.ly/2uocMLQ). Jane Margolis and Yasmin Kafai wrote an excellent response about the importance of coding in schools (http://bit.ly/2vmi52U). Cuban penned a three-part series about “Coding: The New Vocationalism,” likely inspired by a recent New York Times article about the role tech firms are having on school policy (http://nyti.ms/2uodaKi).

• In Part 1 (http://bit.ly/2v2ULoo), he describes the ‘dance’ schools have had with industry over more than 100 years, between preparing future citizens and preparing future workers.

Preparation for the workplace is not the only goal for public schooling, yet that has been the primary purpose for most reformers over the past three decades. A century ago, reformers also elevated workplace preparation as the overarching purpose for tax-supported public schools.

In the new vocationalism, Cuban sees schools have been tied to economic growth and the needs of information-age society. He sees coding advocates blending the roles of school in preparing citizens and school as preparing workers by arguing that computing is necessary for modern society.

• In Part 2 (http://bit.ly/2vmjCG8), he points out any education reform faces the reality of what teachers know and what they will actually do in the classroom. He draws on efforts in the 1950s and 1960s, and uses the story of Logo to explain how reformers get it wrong.

The lessons to be learned from earlier school reformers are straightforward.
  • Build teacher capabilities in content and skills since both determine to what degree, if any, a policy gets past the classroom door.
  • With or without enhanced capabilities and expertise, teachers will adapt policies aimed at altering how and what they teach to the contours of the classrooms in which they teach. If policymakers hate teacher fingerprints over innovations, if they seek fidelity in putting desired reforms into practice, they wish for the impossible.
  • Ignoring both of the above lessons ends up with incomplete implementation of desired policies and sorely disappointed school reformers.

• In Part 3 (http://bit.ly/2wpo9o8), he returns to the question of what school is for. He describes successful reform as a collaboration between top-down designers and policy-makers and bottom-up teachers. He describes a successful model for reform that created “work circles” of researchers and teachers (at Northwestern University) to achieve the goals of the researchers’ curriculum by adapting it with the help of the teachers.

Cuban is not necessarily against teaching computing in schools; he says it doesn’t make sense to impose it as a mandate from industry. More importantly, he offers a path forward: mutual adaptation of curricular goals, between designers and teachers.

Mutual adaptation can benefit teachers and students. While this is only one study of four teachers wrestling with teaching a science unit, it is suggestive of what can occur.

Will similar efforts involve teachers and make the process of mutual adaptation work for both teachers and students? I have yet to read of such initiatives as districts and states mandating computer science courses and requiring young children to learn to code. Repeating the errors of the past and letting mutual adaptation roll out thoughtlessly has been the pattern thus far. The “New Vocationalism,” displaying a narrowed purpose for tax-supported public schools, marches on unimpeded.

Robin K. Hill is an adjunct professor in the Department of Philosophy at the University of Wyoming. Mark Guzdial is a professor in the College of Computing at Georgia Institute of Technology.

Copyright held by authors.
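The word-vector offsets discussed in Robin K. Hill's post above can be illustrated with a toy sketch. It assumes hand-made two-dimensional vectors whose axes stand for "rulership" and "gender"; these numbers, the `VECTORS` table, and the `analogy` helper are hypothetical and are not output of word2vec, which learns its vectors from text. The point is only to show how a gender component can be "factored out" by subtraction and addition.

```python
import math

# Hand-made 2-D vectors: (rulership, gender). Purely illustrative, not trained.
VECTORS = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "ruler": (1.0, 0.0),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c]))
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


print(analogy("king", "man", "woman"))  # prints "queen"
```

In this toy space the offset from "king" to "queen" equals the offset from "man" to "woman" by construction, which mirrors the post's observation that the regularity is already present in the definitions the vectors encode.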
O C TO B E R 2 0 1 7 | VO L. 6 0 | N O. 1 0 | C OM M U N IC AT ION S OF T HE ACM 13
SEC 2017
The 2nd ACM/IEEE Symposium on Edge Computing
San Jose, CA, October 12-14, 2017
http://acm-ieee-sec.org/2017/
Register Now

General Chair
Junshan Zhang, Arizona State Univ.

Program Chairs
Mung Chiang, Purdue Univ.
Bruce Maggs, Akamai/Duke Univ.

Keynote Speakers
Prof. Mahadev Satyanarayanan, Carnegie Mellon Univ.
Dr. Pablo Rodriguez, CEO of Telefonica Alpha

Steering Committee
Victor Bahl, Microsoft Research
Flavio Bonomi, Nebbiolo Technologies, Inc
Rong N. Chang, IBM Research
Dejan Milojicic, HP Labs
Michael Rabinovich, Case Western Reserve Univ.
Weisong Shi (Chair), Wayne State Univ.
Tao Zhang, Cisco
news
3D-Printing
Human Body Parts
Bioprinting has generated bones, cartilage, and some muscles;
hearts and livers are still years away.
The advent of three-dimensional (3D) printing is already yielding benefits in many fields by improving the speed and efficiency of product development, prototyping, and manufacturing, while also enabling true one-off customization to suit individual needs. While this is certainly a boon for manufacturers, 3D printing holds even greater promise when considering the possibilities for creating body parts or tissues to replace or repair organs or limbs that have worn out, become damaged, or have been lost due to injury or disease.

Indeed, significant ongoing research is being conducted at the university level into the use of 3D printing to create a variety of replacement parts for aspects of the human anatomy.
Photo by Chris Ratcliffe/Bloomberg via Getty Images
The process was far too slow and labor-intensive to be practical, so Atala’s team, which had relocated to the Wake Forest Institute for Regenerative Medicine in Winston-Salem, NC, reconfigured standard desktop inkjet printers so they could use the inkjets to shoot out cells. The idea behind bioprinting is to use a small nozzle to precisely control the placement of cells, and the biomaterials around the cells, to form a structural framework, while also creating a construct that can deliver nutrients to ensure the cells’ growth.

One of the key considerations with bioprinting is the need to maintain a 3D structure. The liquid “inks,” which contain cells and structural biomaterial such as collagen, must gel as quickly as they are deposited, to maintain their form. The printers, with multiple nozzles, allow for the simultaneous printing of multiple cell types, structural biomaterials, and other chemicals. This, of course, requires significant pre-planning calculations focused on cell density, type, and proportions of other supporting biomaterial prior to the printing, although according to researchers, much of this can be modeled by observing existing tissue taken from patients, and then replicating that structure with biomaterial.

More recently, Wake Forest University’s Sang Jin Lee and his team have been working on the development of an integrated tissue-organ printer that can fabricate stable, human-scale tissue constructs of any shape. To ensure the shape of the tissue construct is correct, clinical imaging data is fed into a computer model of the damaged or missing tissue, and that model data is then translated into a program that controls the motions of the printer nozzles, which dispense cells to specific locations. The printer also incorporates micro-channels into the constructs, which allow the precise diffusion of nutrients to the printed cells, ensuring their survival.

The key benefit to using 3D printers is the ability to create complex parts relatively quickly and efficiently. “Using 3D-printing techniques, we can easily build complex structures,” Lee says, noting that in a clinical situation with real patients, “you need more complex tissue and more control; 3D printing can provide that.”

The Wake Forest University team has successfully fabricated mandible and calvarial (skullcap) bone, cartilage, and skeletal muscle, and expects to conduct additional clinical trials over the next three to five years to demonstrate the effectiveness of its process.

Researchers on Long Island, NY, have addressed 3D bioprinting in a different manner, using relatively inexpensive consumer-grade 3D printers to print replacement body parts, including bone, cartilage, and a trachea. Indeed, Todd Goldstein, a 3D bioprinting researcher and director of Northwell Ventures 3D Printing Laboratory, Manhasset, NY, has printed tracheas made of living cells, using a Makerbot Replicator 2, a 3D printer that has been modified to print biomaterial, at a cost of under $2,000.

Goldstein modified the Makerbot printer with a syringe that dispenses bio-ink, a material made of living cells with a viscous consistency. The nozzle that comes standard on the machine is used to emit an organic plastic called PLA (polylactic acid), which creates the scaffold of the trachea. The material costs just a few dollars, and is biodegradable in the body so only the body part is left after implantation is completed.

Goldstein says his team’s work is mostly focused on bone, cartilage, and trachea printing, due to the current limitations surrounding the efficient and effective printing of constructs that support blood vessels.

“The biggest issue on the whole right now is blood supply,” Goldstein says. “You can’t make the constructs too large without running into the issue of having a necrotic core (an area of dead cells) in the middle. Just like our body needs blood circulating all the time, so do these living tissues.”

Studies have shown the printed tracheal tissue was just as strong as the tracheal tissue found naturally in mammals, and since the success of printed tracheas, the team has gone on to print other types of body parts, including bone and cartilage. These parts are particularly strong candidates for 3D printing because each individual will require uniquely sized parts. Furthermore, because bone, cartilage, and tendons don’t require a blood supply, they can be implanted without the possibility of rejection.

Indeed, that is why the printing of more-complex body parts is still likely several years away. Creating pieces of soft tissue to mend organs such as the liver, kidney, and heart can be quite challenging to bioprint, given the need to support constant, consistent blood flow.

“The biggest challenge is creating blood vessels that will work immediately,” Goldstein says. “When you bioprint cells, if they don’t have a blood supply, they’ll be dead in five hours or less.”

Goldstein acknowledges that at present, “there isn’t a good solution for that right now. That’s why a lot of research is going into 3D-printing small capillaries and a vascular network that can support larger 3D-printed constructs, like the liver, kidney, or heart.”

Work is progressing on creating structures that can support blood flow. Adam Feinberg, associate professor of materials science and biomedical engineering at Carnegie Mellon’s College of Engineering, has developed a new, less-expensive way of 3D printing biological soft materials, including coronary arteries and embryonic hearts. Feinberg also uses modified consumer-grade 3D printers that use open-source software to control and fine-tune their print parameters.

What makes Feinberg’s work
Digital Hearing
Advances in audio processing help separate
the conversation from background noise.
Like many other technologies, hearing aids have improved rapidly over the last two decades. Digital audio processing has enabled more sophisticated and personalized features, and users report higher satisfaction. Still, in noisy social situations such as cocktail parties, hearing-impaired users must work hard to track the conversation of companions, just as someone with normal hearing may struggle to follow a lively conference call.

Researchers and companies continue to advance the technology by understanding how listeners process a complex “auditory scene,” which requires more than just amplifying sounds.

“Restoring audibility is simply not enough,” said Christine Jones, director of the Phonak Audiology Research Center in Warrenville, IL. “There’s a loss of fidelity that prevents individuals from having full restoration of function even when the sound becomes audible again.”

Research also has shown that people process sound in different ways, so the “best” hearing aid technology will differ, too. Even as researchers struggle to measure these differences, companies are providing new ways to address them and to develop fitting procedures and user controls that work best for each individual.

The Earlens Light-Driven Hearing Aid converts sounds into pulses of light, which activate a lens on the eardrum.
Photo by Edward Olive

Digital Revolution
Digital signal processors (DSPs) were first used in hearing aids in 1996. In the ensuing years, almost all hearing devices adopted them because of their flexibility in providing programmable features.

For example, simple amplification makes soft sounds perceptible, but loud sounds uncomfortable. DSPs make it easy to adjust both the gain (sound boost) and the dynamic-range compression to match an individual’s hearing loss at different frequencies. DSPs also suppress the feedback that arises from the close proximity of microphone and speaker, and can reduce distracting background noise, such as wind.

In recent years, hearing aid companies also have introduced frequency-lowering technology. Patients with extreme hearing loss at high frequencies can benefit when sounds are shifted to frequencies where they still have some function, said Brent Edwards, CTO of Earlens in Menlo Park, CA. “It’s a distortion, but if you do it effectively, you can get some phonemes still through the damaged auditory system.”

Hearing researchers have explored other processing strategies over the years, but many of them remain in the labs because of the severe design constraints posed by small device size and the need for low power consumption to preserve battery life. In addition, processing must be done with very limited delay, especially because the processed signal must coexist with sound passing through bypass channels that avoid uncomfortable ear-canal blockage.

In special situations, engineers can dramatically improve listeners’ experiences by incorporating external devices. For example, asking conversation partners to clip a microphone on their collar greatly reduces background noise, and combining the output of several microphones on a table can enhance a particular speaker. External loudspeakers, as at a business, may also provide a more compelling spatial perception of sound than in-ear microphones. Yet everyday use still requires small, low-power devices.

One useful type of external device that many people already carry is the smartphone. A Bluetooth connection to a phone app gives users a richer and more intuitive interface, for example, even with a low data rate. A more bandwidth-intensive approach offloads some signal processing to the phone, with its larger battery and higher processing power, but long delays are unacceptable.

Auditory Scenes
Although increased processing power has clearly benefited hearing aid technology, designs must extend beyond electrical engineering to encompass the complex and idiosyncratic ways that people process and interpret sounds. A particularly influential concept is auditory scene analysis, which posits that people perceive sound not as frequencies but as individual sources, each emitting waveforms with similar attributes. In this view, hearing aids should preserve or accentuate the cues that let a listener identify and attend to sources of interest while disregarding others.

As a step in that direction, hearing aid companies often categorize the sound environment and vary the processing algorithms accordingly. “We have at least 12 different attributes we look at,” including the frequency spectrum, the modulation characteristics, and the angle of the sound deduced by comparing two microphones, said Francis Kuk of the Office of Research in Clinical Amplification (ORCA-USA) of Widex, in Lisle, IL.

Based on these parameters, Widex will predict that the environment contains music, speech in a quiet environment, speech in a train station, or other settings. The inferred scene then determines the default parameters of the hearing aid, such as the noise reduction for a noisy environment.

The auditory scene can also be used to modify beamforming, which wirelessly combines audio signals from both earpieces. The technique can enhance sounds coming from straight ahead, for example, because they arrive at the ears in sync. “It’s one of our most effective features,” says Jones, but “we have different levels of beamforming depending on the scene.” As background noise increases, for example, the effective beam is made increasingly narrow. But if a car scene is detected, the beamforming is turned off, because speakers may well be to the side or behind the listener.

Individual Strategies
There are limits to such automated selection of algorithms, however, due not only to varying preferences of individuals, but to differences in how their brains process sound.

One feature that helps distinguish sources in an auditory scene is the overall modulation of the auditory signal, Kuk said. Some people, particularly those with both hearing impairment and cognitive limitations, “are more reliant on the temporal waveform” to identify a source, for example when different frequencies get louder at the same time. Hearing aids that respond rapidly to protect users from sudden loud sounds distort the temporal waveform, making it difficult for these listeners to use this important cue.

The effort and attention required to decipher speech can be exhausting for hearing-impaired people, but that is not easy to assess. “We used to say speech understanding and sound-quality assessment were the two measures of what a technology was doing. Now we can add a third dimension, and that’s cognitive impact,” said Edwards. “Some technologies may not improve speech understanding that much, but reduce cognitive load.”

Edwards said the “hottest topic in hearing science right now” is called “hidden hearing loss,” in which patients lose some of the normal nerve connections. “It’s like losing half the pixels on your high-def TV,” he said, but it does not affect the audibility of pure tones.

The standard test of detection threshold “is not telling you, say in a restaurant, how salient the representation of the auditory scene is,” Edwards warned. “As you develop new technologies, you have to develop new ways of measuring its benefit,” he said. “That’s part of what engineers don’t understand about hearing loss. There’s a lot about the individual patient that we can’t tell from the diagnostics.”

Fitting the Listener
Fine-tuning settings is already a key aspect of “fitting” of a new hearing aid by a trained audiologist, which contributes to the devices’ price. Edwards compares this process to improving a TV picture by yelling instructions to a technician who can’t see the picture. “What people are now doing is developing systems where the patient can make adjustments themselves,” he said. “There has to be a balance between diagnostic-driven fitting by a professional and this sort of self-adjustment.”

“We’ve done a really good job now of creating algorithms that work on average, based on the degree and configuration of the hearing loss,” said Jones, who has practiced as an audiologist, including fitting hearing aids to pediatric patients with limited communication skills. “But two patients who present with the same hearing loss could have very different levels of functional auditory behavior. There’s still room to do some fine-tuning based on their own personal capabilities.”

In light of the high price of current hearing aids (which can cost as much as $4,000 per unit), there is increasing pressure to open the market to cheaper over-the-counter (OTC) devices. Lowered regulatory requirements, advocates suggest, might encourage companies like Dolby and Bose to leverage their expertise in entertainment audio for personal devices.

“I support creating an OTC category, as long as it doesn’t affect the traditional distribution and provision of hearing aids,” Edwards said. Kuk cautioned such over-the-counter products are unlikely to have the advanced capabilities of medically-approved hearing aids like those his company offers, and may instead resemble the generic reading glasses sold in drug stores.

“As an audiologist, I believe that for many people with hearing loss, taking something off the shelf will not do the best job,” Jones agreed, although patients with limited impairment may find such devices helpful. “There are still people that benefit from the ability of an audiologist to more closely fine-tune the amplification and the behavior of the hearing instruments to both their hearing loss and also their lifestyle needs.”

Distinguishing Sources
In spite of steady advances in hearing aids, users still struggle to distinguish voices in a hubbub of other voices, said DeLiang Wang of Ohio State University. “Every hearing-aid company recognizes that the cocktail-party problem is not solved, and needs to be solved.”

In the laboratory, Wang and his colleagues divide the audio stream into 20ms-long segments in various frequency bands, and ask whether each segment contributes to the speech of interest or can be disregarded as noise. A similar binary classification and masking is used to make MP3 audio encoding more efficient. “Once you have formulated it as a classification problem, then modern machine learning techniques can be utilized; in our case, deep neural networks,” Wang said. Lab testing shows that masking the “noisy” time-frequency segments significantly improves listeners’ speech comprehension.

If research like this can be exploited in practical devices, it could work with listeners’ brains to help extract specific sources from a complex audio scene. But when speech is also the noise, Jones said, “there’s no acoustic separation, other than direction and maybe level, between your target and your interference. That is the challenge.”

In spite of these aspirations for the future, “the satisfaction rate for hearing aids today is much, much higher than several years ago, and especially before digital hearing aids,” Kuk said. Industry-sponsored surveys show that “almost 90% of patients who wear today’s hearing aids are satisfied,” he explained. “Having said that, satisfaction is a moving target. As technology improves, people’s expectations for what a satisfying hearing aid is also increase.”

Further Reading

Abrams, H., and Kihm, J.
An Introduction to MarkeTrak IX: A New Baseline for the Hearing Aid Market, Hearing Review, June 2015, http://bit.ly/2sKl3cL

Edwards, B.
A Model of Auditory-Cognitive Processing and Relevance to Clinical Applicability, Ear & Hearing, July/August 2016, http://bit.ly/2rX9DpQ

Rönnberg, J., et al.
The Ease of Language Understanding (ELU) model: theoretical, empirical, and clinical advances, Front. Syst. Neurosci., 13 July 2013, http://bit.ly/2rtSD93

Wang, D.
Deep Learning Reinvents the Hearing Aid, IEEE Spectrum, December 6, 2016, http://bit.ly/2qYmhQm

Stix, G.
How Hearing Works [Video], Scientific American, August 1, 2016, http://bit.ly/2qQsbE0

Don Monroe is a science and technology writer based in Boston, MA.

© 2017 ACM 0001-0782/17/10 $15.00
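The time-frequency masking that Wang describes under “Distinguishing Sources” can be sketched in a few lines. This is a minimal illustration under strong assumptions, not Wang's system: the article says the keep-or-discard decision is made by a deep neural network, whereas this sketch uses an oracle that already knows the speech and noise energy in each unit. The function names, the grid layout (rows as 20 ms frames, columns as frequency bands), and the 0 dB threshold are all hypothetical.

```python
def binary_mask(speech_energy, noise_energy, threshold_db=0.0):
    """Label each time-frequency unit 1 (keep) or 0 (discard).

    A unit is kept when its speech energy exceeds its noise energy by more
    than threshold_db. Knowing both energies is an oracle assumption; a real
    system would have to predict the label from the mixture alone.
    """
    ratio = 10.0 ** (threshold_db / 10.0)
    return [
        [1 if s > n * ratio else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_energy, noise_energy)
    ]


def apply_mask(mixture_energy, mask):
    """Zero out the units labeled as noise, leaving the rest unchanged."""
    return [
        [e * m for e, m in zip(e_row, m_row)]
        for e_row, m_row in zip(mixture_energy, mask)
    ]


# Toy grid: 2 frames (rows) x 2 frequency bands (columns).
speech = [[4.0, 0.1], [0.2, 3.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mixture = [[s + n for s, n in zip(s_row, n_row)] for s_row, n_row in zip(speech, noise)]

mask = binary_mask(speech, noise)
print(mask)                       # [[1, 0], [0, 1]]
print(apply_mask(mixture, mask))  # [[5.0, 0.0], [0.0, 4.0]]
```

Framing the decision as one label per unit is what turns the problem into the classification task the article mentions; replacing the oracle comparison with a trained classifier is the hard part this sketch leaves out.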
Milestones
Last year, the American Academy of Pediatrics (AAP) updated its guidelines on how much access children should have to electronic devices amid growing concerns among parents about the effects of electronic media. Yet extrapolation of the evidence linking television and behavior may obscure potentially more subtle and diverse effects. Recent developments in work with interactive devices represent an increased understanding of how children learn and the importance of social interaction.

Photo by Ty Lim

Concerns over the mental effects of electronic devices have been largely driven by fears that the prevalence of conditions such as attention-deficit/hyperactivity disorder (ADHD) seems to follow their adoption. In 2011, the U.S. Centers for Disease Control and Prevention (CDC) reported an increase of 33% in ADHD prevalence among children from 1997 to 2008. A 2016 follow-up study by the CDC found the increase continued to 2012, but then began to fall through 2015 among children of poorer families, although that reduction was not reflected in wealthier homes.

From 1999 onward, a number of studies looked at heavy television use by young children and identified possible effects on their development. Dr. Dimitri Christakis, director of the Center for Child Health, Behavior, and Development at Seattle Children’s Hospital, says: “We found that the more television children watched at age three, the more likely they were to have attention problems.” The fast-changing images and sounds in many television programs that are made to capture the attention of young children “condition the mind to a reality that doesn’t exist,” he notes.

The question is whether such a link extends to portable devices, says Dr. Danielle Erkoboni-Wilbur, a pediatrician at St. Christopher’s Hospital for Children in Philadelphia. “When those studies on behavior were done, technology just meant television. There is nothing that’s formally in the literature linking those outcomes to portable digital media.”

A 2015 study by pediatrician Dr. Hilda Kabali and colleagues at the Einstein Medical Center in Philadelphia found much of the average child’s time with portable devices is spent watching online TV, rather than using interactive applications. However, as less passive experiences become more common over time, they may have different effects on children’s behavior.

Says Christakis, “I think the touchscreen interactive experience that began with the iPad is, for the lack of a better word, a transformative technology. It’s the interactivity that makes it different. With traditional media, a child never thinks or says ‘I did it’: it’s a completely passive experience. But it’s so gratifying to make something happen by touching an object on the screen.”

“What we are finding in our lab is that these devices command attention much better than other things. It can make it more difficult for parents to interact with their children,” Christakis adds. “I tend to think of the effects being mediated through two different pathways. One is the direct pathway, which is the actual content. Interactive media could lead to the same kind of overstimulation as fast-paced TV, although being interactive, it means the child can control the pacing in a way that isn’t possible with television.

“There is also the indirect pathway, which works through displacement. This is about what could they be doing that they aren’t, whether it’s singing, reading, or going outside to play. Even if someone developed the perfect app that was perfectly paced and shown to be beneficial, if children used that app eight hours a day, we would recognize that behavior as being a problem,” Christakis adds.

Christakis is concerned about the addictiveness of applications on tablets and smartphones, and the potential for them to eat too far into the child’s daytime activities. “I tell parents to limit time on interactive devices to no more than 30 minutes. People ask, ‘How do I know?’ I came up with that limit when we looked at what people have done in the past, using techniques such as time diaries. Children in the pre-iPad days typically spent a maximum of 20–30 minutes a day with a particular toy, but they will often spend much more time than that with an iPad game; there is something very different about the experience.

“The makers of many of these games design them to be addictive. We live in an attentional economy, so apps writers are often trying to get people drawn in and get people addicted,” Christakis adds.
Alexis Hiniker, a Ph.D. candidate in Human Centered Design and Engineering at the University of Washington, says parents quickly detect over-addictive apps. “A lot of the families we talked to, they said ‘we tried X but it was so hard to get them to put it away that we had to take it away’. They had to go cold turkey. So we find apps that are super-grabby are not successful, at least when it comes to younger children.”

Taking the device away inevitably causes tension. Hiniker’s own work involves finding ways to design apps for children that make it easier to set limits on usage time without causing tantrums when device time has to stop. “Our approach was to look at what is helpful to children in acquiring self-regulation. We found if children have explicit opportunities to make plans, they will take more ownership over their behavior,” she explains.

The Plan & Play system trialed by Hiniker lets children work out how long they want to spend on an activity before starting. “As they are playing, we surface that information back to them and remind them of what their plan was.

“In terms of practical considerations for industry, I would love to see design for things like parental controls shift to include these ideas, [and] move away from lockout mechanisms that simply force children to do other things.”

Erkoboni-Wilbur says a focus on designing devices and apps for children can avoid the potential for harm and make them support better development. “There are a variety of futures in front of us. We can see how easy it is for technology to isolate folks, but we do see pointers in research as to what causes great outcomes; what we can do to optimize technology to improve developmental outcomes.”

Allison Druin, a professor in the College of Information Studies at the University of Maryland, College Park, agrees: “Technology, as with any tool in a person’s life, is either going to amplify the challenges or support the strengths. We have the situation where autistic kids finally feel comfortable socially because they have this technology bridge, but there is also the potential to amplify inattention.”

One underappreciated limitation of screen-based devices that has emerged in studies is their two-dimensional (2D) nature. That realization is helping to bolster research on how children learn. Research by developmental psychologist Rachel Barr and colleagues at Georgetown University revealed infants find it more difficult than older children and adults to translate learning from the 2D space to three dimensions.

“We call it the transfer deficit. A very common problem in learning in general is when we have to take something from one context to another. What you are faced with on the screen is different to what you are faced with in the real world,” Barr says. “We need to understand that this transfer deficit is important. Children can seem to be digital natives when they use these devices so easily, but it’s actually more cognitively demanding to learn from them.”

What makes it easier for children to learn is what Barr calls the “social scaffold.” The Georgetown researchers showed this effect in an experiment based on a jigsaw-like puzzle. Researchers demonstrated the game to half the children. The other half only viewed a “ghost version”: a puzzle that assembles itself onscreen without human intervention. When given the tablet to solve the puzzle, the young children who only saw the ghost version were often unable to complete the exercise. “We gave them the touchscreen and they were baffled, but if someone showed them first, they were really good at it.”

Barr points to a 2014 study performed by Northwestern University doctoral candidate Courtney Blackwell (now a research assistant professor of medical social sciences at Northwestern’s Feinberg School of Medicine) with kindergarten-age children at three schools in Chicago as another example of the social scaffold in action, this time with children working together. At the beginning of the school year, children at one school each received an iPad, another school did not supply them at all, and a third gave iPads to pairs of students. Those who shared devices scored higher on literacy tests than the other two groups.

“This human ability is really quite an amazing one that we have. We can take information from someone and link information really quite rapidly,” Barr says. “There is something about a person helping you that is different, but the device itself can often seem so compelling. I wonder if maybe parents sometimes think maybe the device can teach them better,” Barr adds.

Researchers such as Christakis insist parents should resist the temptation to believe the device will be better at teaching and be confident in what they can bring to the situation. “One has to keep in mind that our brains have evolved over millennia; they are contingent on social interaction. Every time we are interacting with a child, we are laying down new synapses and making new connections in the brain.”

Erkoboni-Wilbur concludes: “The message that we are focusing on is that technology can be a social experience. We need to guide it away from being an isolated type of interaction. We are really encouraging parents to sit down together and learn with technology. Parents and their children, teachers and children, children and their peers; experience it as a social medium, and enhance that social interaction.”

Further Reading

Kabali, H. K., Irigoyen, M. M., Nunez-Davis, R., Budacki, J. G., Mohanty, S. H., Leister, K. P., and Bonner Jr, R. L.
Exposure and use of mobile media devices by young children, Pediatrics, 136, 1044–1050 (2015).

Chassiakos, Y.R., Radesky, J., Christakis, D., Moreno, M.A., and Cross, C.
Children and Adolescents and Digital Media, American Academy of Pediatrics Technical Report.

Barr, R.
Memory Constraints on Infant Learning From Picture Books, Television, and Touchscreens, Child Development Perspectives (2013).

Hiniker, A., Lee, B., Sobel, K., and Choe, E.K.
Plan and Play: Supporting Intentional Media Use in Early Childhood, Proceedings of the 16th Annual ACM Conference on Interaction Design and Children (IDC ’17).

Blackwell, C. K.
iPads in Kindergarten? Investigating the effects of tablet computers on student achievement, Proceedings of the International Communication Association (2015).

Chris Edwards is a Surrey, U.K.-based writer who reports on electronics, IT, and synthetic biology.

© 2017 ACM 0001-0782/17/10 $15.00
viewpoints
Technology Strategy
and Management
Amazon and Whole Foods:
Follow the Strategy
(and the Money)
Checking out the recent Amazon acquisition of Whole Foods.
In June 2017, Amazon announced it would acquire Whole Foods, the national grocery chain with nearly 500 stores, for $13.7 billion in cash. This is less than Whole Foods’ 2016 revenues of $15.7 billion, which included $507 million in net profits (3.2% of revenues). Whole Foods became a takeover target because of declining sales and profits as well as increasing competition in the cutthroat supermarket business.1 By comparison, Amazon’s 2016 revenues were $136 billion, up 27% over 2015, with operating profits of $4.1 billion and net profits of $2.4 billion (1.8% of revenues). Amazon’s market value of around $480 billion is 3.5 times last year’s revenues. It is buying Whole Foods for less than one times its last year’s revenues (about 0.87 times). Given that Whole Foods is actually more profitable than Amazon, which plows most of its potential profits into new business development and expansion, and since Amazon does not have to dilute its stock, this seems like a great deal for CEO Jeff Bezos. But is it? And what does this acquisition say about Amazon’s strategy?

Launched by Bezos in 1994 as a pioneering online bookstore, Amazon has always made deft use of physical assets as well as the Internet. It soon offered two million or more titles, far more than actual bookstores were able to stock. Later in the 1990s, Amazon added a platform service that linked buyers who wanted less-popular books with third-party sellers that held the inventory, a business now called Amazon Marketplace. Unlike Google and other Internet platforms, Amazon was never a completely virtual business. Bezos operated out of a warehouse that intercepted shipments from distributors and then re-sent the books to customers. The company also began to buy large numbers of best-seller books and stock them so Amazon could benefit from scale economies in purchasing and earn premiums from delivering the books quickly.

Eventually, Amazon expanded to other popular items suitable for sale over the Internet, ranging from electronic goods to digital content (music and videos) to clothing. Amazon now sells some two million items.4 A decade ago it also launched Amazon Web Services (AWS) to sell excess computing and storage capacity from its massive data centers. In addition, Bezos experimented with his own products such as the successful Kindle tablet for e-books, the failed Fire smartphone, and the intriguing Echo/Alexa digital assistant and speaker device. More recently, Amazon opened a bookstore in Seattle to complement the online store, and it is experimenting with new technology (Amazon Go) that eliminates cashiers.

Until now, Amazon’s pieces fit together pretty well. Now the company is moving deeper into the low-margin grocery business. We have to wonder why.

Amazon’s financials are not particularly strong compared to other leading technology companies and Internet platforms, and mix products and services with varying profit rates. In 2016, sales of products accounted for 70% of Amazon’s revenues.2 Sales of services (mostly from selling third-party goods on Amazon Marketplace, revenues from AWS, some allocations from Amazon Prime membership fees, and some advertising) accounted for the other 30%. AWS produced only $12.2 billion or 9% of revenues (compared to 5% in 2014) but a whopping $3.1 billion (74%) of operating profits. By contrast, Amazon lost $1.3 billion on overseas sales of $44 billion, nearly one-third of revenues. It is also estimated that Amazon loses up to $2 billion per year on Prime Memberships.6 With 70- to 80-million members, Prime generated $6.8 billion in 2016 revenue.5 Prime offers free shipping, a lot of free digital content, and other benefits for a $99 annual fee or less in some cases. The free services are costly but encourage customer loyalty and seem to drive long-term sales growth.

Bezos’ interest in groceries goes back a decade, when he launched Amazon Fresh. Why? Volume. The annual grocery business in the U.S. alone was worth some $1.3 trillion in 2016, with Walmart (which has 4,500 stores) having the largest market share at 18%.7 Amazon and Whole Foods together would have a market share of about 3.5%.11 Amazon has figured out how to sell perishable and non-perishable groceries via the Web. It is not clear the company has figured out how to do this at a profit. Amazon can learn from Whole Foods how to sell groceries to upscale customers, but this market is shrinking.

Many customers are the same, however, so Amazon can try cross-marketing. About 62% of Whole Foods customers are Amazon Prime members.9 Perhaps Amazon can convert the 38%
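
The margins and revenue multiples quoted at the start of this column can be rechecked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are made up for this example, and the inputs are the rounded figures given in the text (in billions of U.S. dollars), so the results are approximate.

```python
# Rounded figures quoted in the column, in billions of U.S. dollars.
WHOLE_FOODS_PRICE = 13.7          # cash price Amazon agreed to pay
WHOLE_FOODS_REVENUE_2016 = 15.7
WHOLE_FOODS_NET_PROFIT_2016 = 0.507
AMAZON_REVENUE_2016 = 136.0
AMAZON_NET_PROFIT_2016 = 2.4
AMAZON_MARKET_VALUE = 480.0


def summarize() -> dict:
    """Recompute net margins and value-to-revenue multiples from the quoted figures."""
    return {
        # Net profit as a fraction of revenue.
        "whole_foods_net_margin": WHOLE_FOODS_NET_PROFIT_2016 / WHOLE_FOODS_REVENUE_2016,
        "amazon_net_margin": AMAZON_NET_PROFIT_2016 / AMAZON_REVENUE_2016,
        # Market value (or purchase price) as a multiple of last year's revenue.
        "amazon_value_to_revenue": AMAZON_MARKET_VALUE / AMAZON_REVENUE_2016,
        "whole_foods_price_to_revenue": WHOLE_FOODS_PRICE / WHOLE_FOODS_REVENUE_2016,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

With these inputs the net margins come out near 3.2% for Whole Foods and 1.8% for Amazon, Amazon's market value is roughly 3.5 times its revenue, and the purchase price is roughly 0.87 times Whole Foods' revenue.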
Inside Risks
The Real Risks of
Artificial Intelligence
Incidents from the early days of AI research
are instructive in the current AI environment.
The vast increase in speed, memory capacity, and communications ability allows today’s computers to do things that were unthinkable when I started programming six decades ago. Then, computers were primarily used for numerical calculations; today, they process text, images, and sound recordings. Then, it was an accomplishment to write a program that played chess badly but correctly. Today’s computers have the power to compete with the best human players.

The incredible capacity of today’s computing systems allows some purveyors to describe them as having “artificial intelligence” (AI). They claim that AI is used in washing machines, the “personal assistants” in our mobile devices, self-driving cars, and the giant computers that beat human champions at complex games.

Remarkably, those who use the term “artificial intelligence” have not defined that term. I first heard the term more than 50 years ago and have yet to hear a scientific definition. Even now, some AI experts say that defining AI is a difficult (and important) question, one that they are working on. “Artificial intelligence” remains a buzzword, a word that many think they understand but nobody can define.

Recently, there has been growing alarm about the potential dangers of artificial intelligence. Famous giants of the commercial and scientific world have expressed concern that AI will eventually make people superfluous. Experts have predicted AI will even replace specialized professionals such as lawyers. A Microsoft researcher recently made headlines saying, “As artificial intelligence becomes more powerful, people need to make sure it’s not used by authoritarian regimes to centralize power and target certain populations.”[a]

Automation has radically transformed our society, and will continue to do so, but my concerns about “artificial intelligence” are different. Application of AI methods can lead to devices and systems that are untrustworthy and sometimes dangerous.

An Early Introduction to AI
As a student at Carnegie Mellon University (CMU),[b] I learned about “artificial intelligence” from some of the field’s founders. My teachers were clever but took a cavalier, “Try it and fix it,” attitude toward programming. I missed the disciplined approach to problem solving that I had learned as a student of physics, electrical engineering, and mathematics. Science and engineering classes stressed careful (measurement-based) definitions; the AI lectures used vague concepts with unmeasurable attributes. My engineering teachers showed me how to use physics and mathematics to thoroughly analyze problems and products; my AI teachers relied almost entirely on intuition.

I distinguished three types of AI research:
• building programs that imitate human behavior in order to understand human thinking;
• building programs that play games well; and
• showing that practical computerized products can use the methods that humans use.

Computerized models can help researchers understand brain function. However, as illustrated by Joseph Weizenbaum,2 a model may duplicate the “black-box” behavior of some mechanism without describing that mechanism.

[a] Interview with Kate Crawford in The Guardian, March 13, 2017.
[b] CMU was then known as Carnegie Institute of Technology.
professor Joseph Weizenbaum.3 In the mid-1960s, he created Eliza, a program that imitated a practitioner of Rogerian psychotherapy.[d] Eliza had interesting conversations with users. Some “patients” believed they were dealing with a person. Others knew that Eliza was a machine but still wanted to consult it. Nobody who examined Eliza’s code would consider the program to be intelligent. It had no information about the topics it discussed and did not deduce anything from facts that it was given. Some believed Weizenbaum was seriously attempting to create intelligence by creating a program that could pass Turing’s test. However, in talks and conversations, Weizenbaum emphasized that was never his goal. On the contrary, by creating a program that clearly was not “intelligent” but could pass as human, he showed that Turing’s test was not an intelligence test.

Robert Dupchak’s Penny-Matcher
Around 1964,[e] the late Robert Dupchak, a CMU graduate student, built a small box that played the game of “penny matching.”[f] His box beat us consistently. Consequently, we thought it must be very intelligent.

It was Dupchak who was intelligent, not his machine. The machine only remembered past moves by its opponent and assumed that patterns would repeat. Like Weizenbaum, Dupchak demonstrated that a computer could appear smart without actually being intelligent. He also demonstrated that anyone who knew what was inside his box would defeat it. In a serious application, it would be dangerous to depend on such software.

Character Recognition
A popular topic in early AI research and courses was the character recognition problem. The goal was to write programs that could identify hand-drawn or printed characters. This task, which most of us perform effortlessly, is difficult for computers. The optical character recognition software that I use to recognize characters on a scanned printed page frequently errs. The fact that character recognition is easy for humans but still difficult for computers is used to try to keep programs from logging on to Internet sites. For example, the website may display[g] a CAPTCHA as shown here

[CAPTCHA image]

and require the user to type “s m w m.” This technique works well because the character recognition problem has not been solved.

Early AI experts taught us to design character recognition programs by interviewing human readers. For example, readers might be asked how they distinguished an “8” from a “B.” Consistently, the rules they proposed failed when implemented and tested. People could do the job but could not explain how.

Modern software for character recognition is based on restricting the fonts that will be used and analyzing the properties of the characters in those fonts. Most humans can read a text in a new font without studying its characteristics, but machines often cannot. The best solution to this problem is to avoid it. For texts created on a computer, both a human-readable image and a machine-readable string are available. Character recognition is not needed.

An AI System for Constructing Parsers
As a new professor, I made appointments with three famous colleagues to ask how to recognize a good topic for my students’ Ph.D. theses. The late Alan Perlis, the first recipient of ACM’s prestigious Turing Award, gave the best answer. Without looking up from his work, he said, “Dave, you’ll know one when you see it. I’m busy; get out of here!” Two other Turing Award winners, the late Allen Newell and the late Robert Floyd, met with me. Separately, both said that while they could not answer my question directly, they would discuss both a good thesis and a bad one. Interestingly, Newell’s example of a good thesis was Floyd’s example of a bad one.

The disputed thesis presented an AI program that would generate parsers[h] from grammars. Newell considered it good because it demonstrated that AI could solve practical problems. Floyd, a pioneer in the field of parsing, explained that nobody could tell him what class of grammars the AI parser generator could handle, and he could prove that that class was smaller than the class of languages that could be handled by previously known mathematical techniques. In short, while the AI system appeared to be useful, it was inferior to systems that did not use heuristic methods. Bob Floyd taught me that an AI program may seem impressive but come out poorly when compared to math-based approaches.

An AI System that “Understood” Drawings and Text
A 1967 AI Ph.D. thesis described a program that purportedly “understood” both natural language text and pictures. Using a light pen and a graphics display,[i] a user could draw geometric figures. Using the keyboard, users could ask questions about the drawing. For example, one could ask “Is there a triangle inside a rectangle?” When the author demonstrated it, the program appeared to “understand” both the pictures and the questions. As a member of the examining committee, I read the thesis and asked to try it myself. The system used heuristics that did not always work. I repeatedly input examples that caused the system to fail. In production use, the system would have been completely untrustworthy.

The work had been supervised by another Turing Award recipient, Herbert Simon, whose reaction to my observing that the system did not work was, “The system was not designed for antagonistic users.” Experience has shown that computer systems must be prepared for users to be careless and, sometimes, antagonistic. The techniques used in that thesis would not be acceptable in any commercial product. If heuristics are used in critical applications, legal liability will be a serious problem.

An AI Assembly-Line Assistant
An assembly line could run faster after tool-handling assistants were hired: Whenever workers finished using a tool, they tossed it in a box; when a tool was needed, the assistant retrieved it for the workers.

A top research lab was contracted to replace the human assistants with robots. This proved unexpectedly difficult. The best computer vision algorithms could not find the desired tool in the heap. Eventually, the problem was changed. Instead of tossing the tool into the box, assemblers handed it to the robot, which put it in the box. The robot remembered where the tool was and could retrieve it easily. The AI-controlled assistant could not imitate the human but could do more. It is wiser to modify the problem than to accept a heuristic solution.

“Artificial Intelligence” in German[j]
When AI was young, a German psychology researcher visited pioneer AI researchers Seymour Papert and Marvin Minsky (both now deceased) at MIT. He asked how to say “artificial intelligence” in German because he found the literal translation (Künstliche Intelligenz[k]) meaningless.

Neither researcher spoke German. However, they invited him to an AI conference, predicting that he would know the answer after hearing the talks. Afterward, he announced that the translation was “natürliche Dummheit” (natural stupidity) because AI researchers violated basic rules of psychology research. He said that psychology researchers do not generally ask subjects how they solve a problem because the answers might not be accurate; if they do ask, they do not trust the answers. In contrast, AI researchers were asking chess players how they decide on their next move and then writing programs based on the player’s answers.

Artificial Neural Networks
Another approach to AI is based on modeling the brain. Brains are a network of units called neurons. Some researchers try to produce AI by imitating the structure of a brain. They create models of neurons and use them to simulate neural networks. Artificial neural networks can perform simple tasks but cannot do anything that cannot be done by conventional computers. Generally, conventional programs are more efficient. Several experiments have shown that conventional mathematical algorithms outperform neural networks. There is intuitive appeal to constructing an artificial brain based on a model of a biological brain, but no reason to believe this is a practical way to solve problems.

The Usefulness of Physics and Mathematics
A researcher presented a paper on using AI for image processing to an audience that included experts in radar signal processing. They observed that the program used special cases of widely used signal-processing algorithms and asked “What is new in your work?” The speaker, unaware of techniques used in signal processing, replied, “My methods are new in AI.” AI researchers are often so obsessed with imitating human beings that they ignore practical approaches to a problem.

A study of building temperature-control systems compared an AI approach with one developed by experienced engineers. The AI program monitored individual rooms and turned on the cooling/heating as needed. The engineers used a heat-flow model that included the building’s orientation, the amount of sunlight hitting sections of the building, the heat absorption/loss characteristics of the building, and so on. Using this model, which allowed their system to anticipate needs, and the ability to pump heat from one part of the building to another, they designed a system that reduced temperature fluctuations and was more energy efficient.

Humans do not have the measurement and calculation ability that is available to a modern computer system; a system that imitates people won’t do as well as one based on physical models and modern sensors.

Humans solve complex physics problems all the time. For example, running is complex. Runners maintain balance intuitively but have no idea how they do it. A solution to a control problem should be based on physical laws and mathematics, not mimicking people. Computers can rapidly search complex spaces completely; people cannot. For example, a human who wants to drive to a previously unvisited location is likely to modify a route to a previously visited nearby place. Today’s navigation devices can obtain the latest data and calculate a route from scratch and often find better routes than a human would.

Machine Learning
Another approach to creating artificial intelligence is to construct programs that have minimal initial capability but improve their performance during use. This is called machine learning. This approach is not new. Alan Turing speculated about building a program with the capabilities of a child that would be taught as a child is taught.1 Learning is not magic; it is the use of data collected during use to improve future performance. That requires no “intelligence.” Robert Dupchak’s simple penny-matching machine used data about an opponent’s behavior and appeared to “learn.” Use of anthropomorphic terms obscures the actual mechanism.

Building programs that “learn” seems easier than analyzing the actual problem, but the programs may be untrustworthy. Programs that “learn” often exhibit the weaknesses of “hill-climbing”[l] algorithms; they can miss the best

[d] Practitioners of Rogerian psychotherapy echo the patient’s words in their responses.
[e] Dupchak’s accidental death prevented publication of his work. I cannot give a precise date.
[f] Penny-matching is a two-player game. Each player uses a coin to make a head or tail choice. One player wins if both pick the same face; the other wins if the choices are different.
[g] This example was found in Wikipedia.
[h] Parsers, an essential component of compilers, divide a program into its constituent parts. Before Floyd’s work, parsers were created by humans. Floyd’s algorithm automatically generated parsers for a large class of languages.
[i] Advanced hardware for the time.
[j] I cannot warrant the truth of this story; it was related to me as true, but I was not present for the events. I include it because it contains an important lesson.
[k] Current terminology in German.
[l] Hill-climbing algorithms are analogous to hikers who always walk uphill. They may end up at the top of a foothill far below the mountain peak.
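
The hiker analogy in the hill-climbing footnote can be made concrete with a toy example. The following is a minimal sketch, not taken from the column: the terrain values and the function name `hill_climb` are invented for illustration, and the search space is a simple one-dimensional list of heights.

```python
def hill_climb(heights, start):
    """Greedy ascent: step to the higher neighbor until no neighbor is higher."""
    position = start
    while True:
        # Neighbors are the positions immediately left and right, if they exist.
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
        if not neighbors:
            return position
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[position]:
            return position  # no uphill step left: a local maximum
        position = best


# Hypothetical terrain: a foothill (height 5 at index 2) and a peak (height 12 at index 7).
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 12, 8]

if __name__ == "__main__":
    for start in (0, 4):
        stop = hill_climb(LANDSCAPE, start)
        print(f"start={start} stops at index {stop}, height {LANDSCAPE[stop]}")
```

Started at index 0, the climber stops on the foothill and never sees the peak; started at index 4, it reaches the peak. The outcome depends entirely on the starting point, which is the weakness the footnote describes.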
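
The column says only that the penny-matching box remembered its opponent's past moves and assumed that patterns would repeat; the details of the box were never published. The sketch below is therefore an assumed reconstruction of that idea, not Dupchak's design: the class `PennyMatcher`, the two-move context, and the default guess of heads are all choices made for this illustration.

```python
from collections import Counter, defaultdict


class PennyMatcher:
    """Guess the opponent's next move from what followed the same recent moves before."""

    def __init__(self, context_length=2):
        self.context_length = context_length
        self.history = []                        # opponent's moves so far, "H" or "T"
        self.follow_ups = defaultdict(Counter)   # recent context -> counts of the next move

    def _context(self):
        return tuple(self.history[-self.context_length:])

    def predict(self):
        counts = self.follow_ups.get(self._context())
        if not counts:
            return "H"  # nothing remembered for this context yet: arbitrary default
        return counts.most_common(1)[0][0]

    def observe(self, move):
        self.follow_ups[self._context()][move] += 1
        self.history.append(move)


def play(matcher, opponent_moves):
    """Return how many rounds the matcher wins (it wins when its guess matches)."""
    wins = 0
    for move in opponent_moves:
        if matcher.predict() == move:
            wins += 1
        matcher.observe(move)
    return wins


def informed_opponent_wins(rounds):
    """An opponent who knows the mechanism runs a private copy and plays the opposite."""
    matcher, shadow = PennyMatcher(), PennyMatcher()
    opponent_wins = 0
    for _ in range(rounds):
        guess = matcher.predict()
        move = "T" if shadow.predict() == "H" else "H"
        if guess != move:
            opponent_wins += 1
        matcher.observe(move)
        shadow.observe(move)
    return opponent_wins


if __name__ == "__main__":
    print(play(PennyMatcher(), "HHT" * 10))   # opponent with a repeating habit
    print(informed_opponent_wins(50))         # opponent who knows what is in the box
```

Against an opponent who keeps repeating heads, heads, tails, this matcher loses one early round and then wins every round after it. Against an opponent who runs the same bookkeeping and plays the opposite of the predicted guess, it loses every round. Nothing in the code does more than count.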
Economic and
Business Dimensions
FinTech Platforms
and Strategy
Integrating trust and automation in finance.
Ten years from now, will Google and Amazon play more of a role in managing your investment portfolio than Fidelity? Finance has traditionally been about trust. We trust banks to hold our money and give it back to us when we want it; and we trust brokerage firms to buy the securities we want at market prices and to debit and credit our accounts accordingly. Because trust is so important, we have historically held banks and financial firms to much higher standards of compliance and control than other businesses. Financial institutions are required to follow well-defined processes with oversight and failsafe plans aimed at minimizing risk and maximizing public trust. These processes have traditionally involved humans, even as they have been increasingly augmented with technology over time.

But the last 20 years have seen the emergence of a fundamentally new and different phenomenon enabled by Internet connectivity and access: the “platform” business. In platform businesses, many complex processes, including compliance and checks-and-balance procedures, are performed securely by machines. Along with this platform phenomenon, newer generations of humans have co-evolved to be more comfortable trusting machines to make or support decisions. It would have seemed unthinkable 10 years ago to imagine trusting autonomous driving vehicles to take over the wheel, or trust robots to perform surgery on us. But we are increasingly doing just that in a growing number of real-world areas. Broadly, we seem comfortable trusting machine-made decisions in domains in which the risks are acceptable: mistakes are relatively infrequent and their consequences do not exceed a reasonable tolerance threshold.1

The question is, will future investors trust FinTech platforms to the degree that previous generations have trusted traditional banks? Conversely, what will it take for FinTech platforms to be trusted sufficiently by future generations?
minimize market impact. Now, robots match market participants wishing to transact, without third-party intermediation. In so doing they increase liquidity at a much larger scale and with more efficiency.

Lack of open access is not the only way in which a finance platform may be incomplete. Yahoo Finance, for example, is incomplete because it lacks key business processes to complete transactions. Ratings agencies are incomplete because most of their key value-adding business processes require human expertise and are not amenable to codification in IT systems.

The Venn diagram shown in the accompanying figure provides a graphical view of our framework, which combines the three platform components and also provides some examples. The intersection at the center defines complete platforms; all of the other regions denote incomplete ones. The Venn diagram representation is intuitive. The businesses on the periphery of the diagram tend to be those that play supporting roles, rather than central ones, in the life cycle of financial transactions. The platform framework can be useful for analyzing incumbent and new entrant business models in a number of ways. To start, the framework identifies which capabilities are required for each type of platform to attain completeness.

Conversely, the framework suggests which business functions are vulnerable to different types of disruptions due to their incompleteness. It also provides a way to think about transitions from partial to complete platforms, the implications of these, and the opportunities that would motivate such transitions. Finally, the framework permits us to examine new technologies in terms of the businesses they are likely to impact. Technologies that facilitate platform completion along a specific dimension will likely affect businesses that are incomplete along that dimension.

Complete platforms may also be created, by design, de novo. Amazon and PayPal are early examples in retail and payments. Peer-to-peer lending and robo-advisor platforms are recent instances of complete platforms in finance. Lending Tree, for example, is “open,” and provides the technology infrastructure and processing required to connect lenders and borrowers directly. Robo-advisors are openly accessible to retail investors and aim
Figure. The three core components of a complete platform with examples of platform businesses exhibiting various levels of completeness.
viewpoints
Kode Vicious
IoT: The Internet of Terror
If it seems like the sky is falling, that’s because it is.

Dear KV,
We are deploying a consumer IoT (Internet of Things) device, with each device connected to a cloud service that acts as the platform from which it will be controlled. The device itself is not dangerous: it is a simple, slimmed-down tablet to be used in hotel chains to replace an alarm clock and TV remote, and to provide access to room service. The device is battery operated, rechargeable, and cheap enough that hopefully no one will want to steal it. Guests cannot load any information into it, and—unlike a typical tablet—it does not serve as a Web browser.

We have an engineer who seems overly worried about the security of the communication between the device and the back end. We are working on the alpha version of the system now. All communication between the device and the cloud is unencrypted, and no user data crosses the network. Who cares what TV channel you are watching or what you ate for breakfast? On such a lightweight embedded device, the battery-life cost of encryption is significant and reduces battery life by 25% in our tests. We expect many users will not replace them in their charging cradles, since most people cannot remember to charge their own phones every night; thus, battery life will be very important to the project.

Even if we do turn on some encryption, we would like it to be as little as possible, again, to preserve battery life. I know a longer key is harder to break, but it also means that we will be using a lot more of the battery protecting what is, in reality, data that is hardly a state secret. What do you think is the right level of encryption, if any, to use in such a system, and how can we get the annoying engineer to shut up?
Not So Secret

Dear Not So,
It is true that many security-focused engineers can sound like Chicken Little, running around announcing that the sky is falling, but, unless you have been living under a rock, you will notice that, indeed, the sky is falling. Not a day goes by without a significant attack against networked systems making the news, and the Internet of Terror is leading the charge in taking distributed systems down the road to hell—a road that you wish to pave with your good intentions.

Before I get to the question of encryption and key length, I would like
to point out two things. An IoT device is nothing more than an embedded system with a TCP/IP stack. It is not a magical object that is somehow protected from attackers because of how cool, interesting, or colorful it is.

Second, and I cannot believe I have to point this out to people who would read or write to KV, but the Internet has a lot of stuff on it, and it is getting bigger every day. Once upon a time, there were fewer than 100 nodes on the Internet, and that time is long past. KV has three Internet-enabled devices within arm’s reach, and, if you think a billion users of the Internet aren’t scary, try multiplying that by 10 once every fridge, microwave, and hotel alarm clock can spew packets into the ether(net).

Let’s get this straight: if you attach something to a network—any network—it had better be secured as well as possible, because I am quite tired of being awakened by the sound of the gnashing of teeth caused by each and every new network security breach. The Internet reaches everywhere, and if even one-tenth of 1% of the people on it are bad actors, then that is one million potential attackers across the earth, and each one may be in control of far more than three devices!

Encryption, in and of itself, is not a solution to securing a system. There are many components that go into building a secure system and service, most of which I will not go into here due to limitation of space and my blood pressure medicine, which my doctor tells me not to grind between my teeth. Encrypted communication is one tool to be used when securing a networked service. You say there are no “state secrets” in your system, and, I guess, so long as the president is not staying in one of your IoT-equipped hotels, perhaps that is true. Even if the president no longer eats children for breakfast, there are plenty of other risks involved in building a system as you have described.

If attackers gain access to the private Wi-Fi network the hotel uses to deploy your system—which is likely, because the hotel will probably pick a password such as “1234567890”—then they can see all the traffic used to perform operations. Want to wake your neighbors at 01:00, 03:00, and 05:00? Simple, just create a request with their room number and the appropriate settings. Perhaps you would like to put a hotel out of business? Easy enough, just order expensive champagne from room service for everyone staying in the hotel. Hotel guests will gladly accept it and then sue to have it removed from their bills. Attacks do not have to occur at the NSA/FSB/GCHQ or similar levels to run you and your customers out of business. I think you can see why your engineer is talking about encryption.

You did not mention authentication at all, so let’s talk a bit about that as well, because authentication protocols also require the use of cryptographic algorithms—the ones, you complain, that drain your batteries. The system you describe needs the ability to discern who made a request as a way of preventing the attack I just described, where I ordered champagne for everyone in the hotel. How are you going to ensure that room 243 really ordered those drinks?

Now that I have convinced you that connections to the back-end system need to be both encrypted and authenticated, we can split hairs on how big a key to use, and, you know what they say, the bigger the key, the deeper the lock! Choosing algorithms and key sizes is where we can start to talk about device features and power usage.

At the moment, the AES (Advanced Encryption Standard) set of algorithms is what is used most often for encryption, and this protocol has at least three standard key lengths: 128, 192, and 256 bits. There are various recommendations on key lengths, but I rather like the comparison of several sets of recommendations on this website: https://www.keylength.com/en/4/. NSA—whoops, I mean NIST—suggests AES-128 as a minimum from 2016 through 2030, and as your system (likely) does not contain state secrets, it ought to be sufficient for your use.

Many embedded microprocessors now come with dedicated circuits to offload cryptographic operations. The offloading of crypto algorithms is nothing new and is subject to the same “cycle of life” that we see in other areas of software and systems engineering, where specialized services first appear as add-on peripherals, then show up in the CPU itself. In the 1990s, microprocessors were unable to keep up with the then-new algorithms being used to set up virtual private networks (VPNs), so companies produced cryptographic offload chips.

Processors got faster and larger, and what could not be handled by frequency scaling was put into special instructions to handle the harder parts of crypto. Embedded processors, like all processors, continue to gain more circuits, including those for cryptography. Any device, such as yours, that drives a display is bound to have some sort of cryptographic engine, and the only reason to ignore it is sheer laziness. It is very likely that your embedded processor already has acceleration for AES, and if it does not, spend a quarter and buy a better one.
KV

Related articles on queue.acm.org

Kode Vicious Battles On
George Neville-Neil
A koder with attitude, KV answers your questions. Miss Manners he ain’t.
http://queue.acm.org/detail.cfm?id=1059801

Pervasive, Dynamic Authentication of Physical Items
Mandel Yu and Srinivas Devadas
The use of silicon PUF circuits
http://queue.acm.org/detail.cfm?id=3047967

Security Collapse in the HTTPS Market
Axel Arnbak, et al.
Assessing legal and technical solutions to secure HTTPS
http://queue.acm.org/detail.cfm?id=2673311

George V. Neville-Neil ([email protected]) is the proprietor of Neville-Neil Consulting and co-chair of the ACM Queue editorial board. He works on networking and operating systems code for fun and profit, teaches courses on various programming-related subjects, and encourages your comments, quips, and code snips pertaining to his Communications column.

Copyright held by author.
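KV's question about how the back end can know that room 243 really placed an order can be illustrated with a message authentication code. The following is only a minimal sketch under stated assumptions, not a design from the column: it assumes each device shares a secret key with the back end, and the names `sign_request`, `verify_request`, and `room_243_key` are hypothetical. It uses HMAC-SHA256 from the Python standard library; it authenticates a request but does not encrypt it, and it does nothing about replayed requests, which a real system would also need to handle.

```python
import hashlib
import hmac
import json


def sign_request(device_key: bytes, request: dict) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical encoding of the request."""
    # Sorting keys gives the device and the back end the same byte string.
    payload = json.dumps(request, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(device_key, payload, hashlib.sha256).hexdigest()


def verify_request(device_key: bytes, request: dict, tag: str) -> bool:
    """Recompute the tag and compare in constant time; False means forged or altered."""
    expected = sign_request(device_key, request)
    return hmac.compare_digest(expected, tag)


# Hypothetical per-device secret, assumed to be provisioned for room 243.
room_243_key = b"example-secret-for-room-243"

order = {"room": 243, "item": "champagne", "qty": 1}
tag = sign_request(room_243_key, order)
```

With this in place, a request built by someone who only sniffed the hotel Wi-Fi but does not hold the device key fails verification, as does a genuine request whose contents were changed in transit.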
Viewpoints
What Can Agile Methods Bring to High-Integrity Software Development?
Considering the issues and opportunities raised by Agile practices in the development of high-integrity software.

There is much interest in Agile engineering, especially for software development. Agile’s proponents promote its flexibility, leanness, and ability to manage changing requirements, and deride the plan-driven or waterfall approach. Detractors criticize Agile’s free-for-all.

At Altran U.K., we use disciplined and planned engineering, particularly when it comes to high-integrity systems that involve safety, security, or other critical properties. A shallow analysis is that Agile is anathema to high-integrity systems development, but this is a naïve reaction. Pertinent questions include:
• Is Agile compatible with high-integrity systems development?
• Where is Agile inappropriate?
• Do Agile’s assumptions hold for high-integrity or embedded systems?
ways able to accept delivery of the product and use it immediately. This is not realistic for high-integrity projects. Some customers have their own acceptance process, and regulators may have to assess the system before deployment. These processes can be orders-of-magnitude slower than a typical Agile tempo.

iFACTS uses a deeper pipeline and multiple iteration rates, with at least four builds in the pipeline:
• Build N: in operation with the customer.
• Build N+1: undergoing customer acceptance. This process is subject to regulatory requirements, and so can take months.
• Build N+2: in development and test.
• Build N+3: undergoing requirements and formal specification.

All four pipeline stages run concurrently with multiple internal iteration rates and delivery standards. The development team can deliver to our test team several times a day. A rapid build can be delivered to the customer (in, say, 24 hours), but comes with limitations on its assurance package and allowed use: it is not intended for operational use, but for feedback from the customer on a new feature. A full build (perhaps once every six months) has a complete assurance package, including a safety case, and is designed for eventual operation. The trick is to make the iteration rates harmonic, both with each other and with the customer and regulator’s ability to accept and deploy releases.

Embedded Systems Issues. Agile presumes plentiful availability of fast testing resources to drive the development pipeline. For embedded systems, if the hardware exists, there may be just one or two target rigs that are slow, hostile to automation, and difficult to access. We have seen projects revert to 24-hour-a-day shift-working to allow access to the target hardware.

In mitigation, we reduce on-target testing with more static verification. Secondly, if we know that code is completely unambiguous, then we can justify testing on host development machines and reduce the need to repeat the test runs on target. Hardware simulation can give each developer a desktop virtual target or a fast cloud for the deployment pipeline. While virtualization of popular microprocessors is common, high-fidelity simulation of a target’s operating environment remains a significant challenge.

On one embedded project, all development of code, static analysis, and testing is done on developers’ host machines, which are plentiful, fast, and offer a friendly environment. A final re-run of the test cases is performed on the target hardware with the expectation of pass-first-time, and allowing the collection of structural coverage data at the object-code level.

Opportunities
High-integrity practices can complement Agile. We previously mentioned the use of static verification tools. While we have a preference for developer-led, sound analysis, we recognize that some projects might find more benefit in unsound, but easier to
practice
DOI:10.1145/3106625
Article development led by queue.acm.org

Metaphors We Compute By
Code is a story that explains how to solve a particular problem.

In their now classic book Metaphors We Live By,6 George Lakoff and Mark Johnson set out to show the linguistic and philosophical worlds that metaphor isn’t just a matter of poetry and rhetorical flourish. They presented how metaphor permeates all areas of our lives, and in particular that metaphor dictates how we understand the world, how we act in it, how we live in it. They showed that our conceptual system is based on metaphors, too, but since we are not normally aware of our own conceptual system, they had to study it via a proxy: language.

By studying language, Lakoff and Johnson tried to understand how metaphors work by imposing meaning in our lives. The basic example they present is the conceptual metaphor “argument is war.” We understand the act of arguing with another person in the same way we understand war. This leads to the following expressions in our daily language:
• Your claims are indefensible.
• He attacked every weak point in my argument.
• I demolished his argument.
• I never won an argument with him.

… attacking our positions, so we structure our arguments as if we were at war with the other person. This means the metaphor is not just language flourish; we live it. Lakoff and Johnson propose the exercise of imagining a culture where arguments are not viewed in terms of war—of winners and losers—but in which language is a dance, where you have to cooperate with a partner in order to achieve a desired goal, reaching conclusions as a team.

The book goes on to analyze the different aspects of language and metaphors and how they affect our concepts and view of the world. The authors give many examples to defend the thesis that our understanding of the world is based on metaphors and that those metaphors are the foundation of behavior.

One of the book’s biggest takeaways is that metaphors enable certain ways of thinking, while restricting others, as the argument-as-war example illustrates. This article applies this idea to computer science. How do metaphors shape the way we understand computing and its related problems? What kinds of problems are enabled by the metaphors in use? And, no, monads are not like burritos!8

First, the article looks at how metaphors help us understand the relatively young world of computers and how they affect the way we structure code or design algorithms and data structures. We even solve problems based on which metaphors are part of our arsenal, or toolbox. “Sometimes our tools do what we tell them to. Other times, we adapt ourselves to our tools’ requirements,” states author Nicholas
… but people understood them as automatic brains. Actually, the word computer existed at that time, but it was used to refer to the person who did calculations for engineers. Think of engineers needing to know the trajectory of a projectile or how the wind would affect an airplane’s wing shape; they would throw a couple of formulas and numbers to their human computers to get …

… new metaphors is the original meaning of the word is used at face value. The word being used to explain a new concept may actually limit understanding of that very concept. In his book The Information,5 James Gleick gives a fascinating account of the invention of the telegraph. The word tele-graph means far-writing. Lo and behold, early telegraphs were strange machines that …

… understanding that people don’t have to “write at a distance” to have actual long-distance communications.

Thanks to mathematician Claude Shannon and others like him, we managed to escape from the problems of the telegraph metaphor and build the whole discipline of information theory beginning in the late 1940s. Shannon’s seminal book, The Math-
the other does so in LIFO (last in, first out) order, respectively.

Even if this looks like an everyday thing for most programmers, the moment you choose to use a stack instead of a queue, you are deciding how to explain your program. The stack is a very good metaphor for the collection of items that a program works with, because it tells a future reader of the program in which order to expect the items to process. You don’t even need to read how the stack is implemented, since you can assume you will get the items in LIFO order. This is why types are so important in computer science—not types as in static type checking of programs, but types as the concepts used to describe programs: persons, users, stacks, trees, nodes, you name it. Types are the characters that tell the story of your program; without types, you just have operations on streams of bytes.

Cognitive Leaps
The goal is to find the right metaphor that describes and explains a problem. As explained earlier with the queueing theory example, a cognitive leap was needed to go from tasks that have to be processed in a certain order to understanding that this is a queueing problem. Once you manage to make the cognitive leap, all the mathematical tools from queueing theory are at your disposal. Graph theory is filled with examples of mundane tasks that, once converted to a graph problem, have well-known solutions. Whenever you ask Google Maps to get you to your destination, Google translates your problem to a graph representation and suggests one or more paths in the graph. Graphs are the right metaphor, understood by mathematicians and computers alike. Are there other instances of problems that seem difficult but that can be solved by finding the right metaphor? The distributed-systems literature has a very interesting one.

In the late 1980s Alan Demers and his colleagues from Xerox tried to find … tell two other computers about the update, and so on until the information was replicated across the system. This metaphor gave way to a new area of study called gossip algorithms. The gossip metaphor makes the idea easy to explain, but the Xerox team was still lacking the mathematical tools that would help analyze the effectiveness of the algorithms.

During their research, they discovered another metaphor related to their problem: epidemics. They understood their algorithms replicated data the same way in which an epidemic disseminates across a population. By using this new metaphor, they got immediate access to all the knowledge in The Mathematical Theory of Epidemics,1 which fit their work like a glove. Not only did they name their paper “Epidemic Algorithms for Replicated Database Maintenance,”4 they also took the nomenclature of that discipline to explain their algorithms. It was a matter of finding the right metaphor to get access to a new world of explanatory power.

Metaphors Everywhere
Do we really use that many metaphors in programming? Let’s take a look at an example from the distributed systems literature (metaphors are in italics):

Whenever nodes need to agree on a common value, a consensus algorithm is used to decide on a value. There’s usually a leader process that takes care of making the final decision based on the votes it has received from its peers. Nodes communicate by sending messages over a channel, which might become congested because of too much traffic. This could create an information bottleneck, with queues at each end of the channels backing up. These bottlenecks might render one or more nodes unresponsive, causing network partitions. Is the process that’s taking too long to respond dead? Why didn’t it acknowledge the heartbeat and trigger a timeout? This could go on, but you get the point.

… will be able to explain a concept, but you must have enough skill to choose the right one that’s able to convey your ideas to future programmers who will read the code.

Thus, you cannot use every metaphor you know. You must master the art of metaphor selection, of meaning amplification. You must know when to add and when to subtract. You will learn to revise and rewrite code as a writer does. Once there is nothing else to add or remove, you have finished your work. The problem you started with is now the solution. Is that the meaning you intended to convey in the first place?

Acknowledgments
Thanks to Jordan West and Carlos Baquero for our discussions about how metaphors permeate computing, and for their feedback for this article.

Related articles on queue.acm.org

First, Do No Harm: A Hippocratic Oath for Software Developers
Phillip A. Laplante
http://queue.acm.org/detail.cfm?id=1016991

Coding for the Code
Friedrich Steimann and Thomas Kühne
http://queue.acm.org/detail.cfm?id=1113336

A Nice Piece of Code
George V. Neville-Neil
http://queue.acm.org/detail.cfm?id=2246038

References
1. Bailey, N.T.J. The Mathematical Theory of Epidemics. C. Griffin and Co., 1957.
2. Baker, C. (Ed.). What a programmer does. Datamation (Apr. 1967); http://archive.computerhistory.org/resources/text/Knuth_Don_X4100/PDF_index/k-9-pdf/k-9-u2769-1-Baker-What-Programmer-Does.pdf.
3. Carr, N. The Shallows. W.W. Norton, 2011.
4. Demers, A., Greene, D., Hauser, C., Irish, W. and Larson, J. Epidemic algorithms for replicated database maintenance. In Proceedings of the 6th Annual ACM Symposium on Principles of Distributed Computing, (1987), 1–12.
5. Gleick, J. The Information: A History, a Theory, a Flood. Pantheon, 2011.
6. Lakoff, G. and Johnson, M. Metaphors We Live By. University of Chicago Press, 1980.
7. Shannon, C.E. and Weaver, W. The Mathematical Theory of Communication. University of Illinois Press, 1949.
8. Yorgey, B. Abstraction, intuition, and the ‘monad tutorial fallacy.’ https://byorgey.wordpress.com/2009/01/12/abstraction-intuition-and-the-monad-tutorial-fallacy/
practice
DOI:10.1145/3080188
Article development led by queue.acm.org

Research for Practice: Technology for Underserved Communities; Personal Fabrication
Expert-curated guides to the best of CS research.

This installment of Research for Practice provides curated reading guides to technology for underserved communities and to new developments in personal fabrication. First, Tawanna Dillahunt describes design considerations and technology for underserved and impoverished communities. Designing for the more than 1.6 billion impoverished individuals worldwide requires special consideration of community needs, constraints, and context. Her selections span protocols for poor-quality communication networks, community-driven content generation, and resource and public service discovery. Second, Stefanie Mueller and Patrick Baudisch provide an overview of recent advances in personal fabrication (for example, 3D printers). Their selection covers new techniques for fabricating (and emulating) complex materials (for example, by manipulating the internal structure of an object), for more easily specifying object shape and behavior, and for human-in-the-loop rapid prototyping. Combined, these two guides provide a fascinating deep dive into some of the latest human-centric computer science research results.

As always, our goal in this article is to allow our readers to become experts in the latest topics in computer science research in a weekend afternoon’s worth of reading. To facilitate this process, we have provided open access to the ACM Digital Library for the relevant citations from these selections so you can enjoy these research results in full. Please enjoy!
—Peter Bailis

… the design and implementation of next-generation data-intensive systems.

Technology For Underserved Communities
By Tawanna Dillahunt

According to the Global Multidimensional Poverty Index 2013, 1.6 billion people—or more than 30% of the combined populations of the 104 countries analyzed—were impoverished in terms of health, education, and living conditions.1 Here, these individuals and those facing similar conditions are referred to as underserved.

Designing and building technology to support people in these underserved communities has several complexities. Overcoming these complexities requires the following:
• Understanding the needs of a specific underserved population and empowering or enabling individuals from that population to produce information and develop their own solutions.
• Understanding the context and constraints that underserved indi-
going forward.

In summary, the authors contributed Sangeet Swara, a system that empowered low-income community members to produce information and build social capital, which is vital for connecting and strengthening communities. The community moderated the forums by categorizing and rating posts, thus eliminating the need for a single moderator—a step toward creating a self-sustaining system.

Technology Connecting Low-Resource Communities to High-Resource Communities for Information Sharing
Dillahunt, T.R., Bose, N., Diwan, S., and Chen-Phang, A.
Designing for disadvantaged job seekers: Insights from early investigations. In Proceedings of the ACM Conference on Designing Interactive Systems, 2016, 905–910; http://dl.acm.org/citation.cfm?id=2901865

This article focuses on unemployed job seekers from low-income communities in the U.S. The authors conducted a user-centered design process to create and deploy Review-Me, a Web-based application that sourced résumé feedback for individuals who were unemployed from local individuals who were employed. While the application deployment successfully connected job seekers who were currently students, it uncovered limitations and constraints among underserved job seekers. For example, underserved job seekers did not always have access to digital résumés or understand how to recreate physical résumés in digital form. Job seekers rarely had access to an email address, and those with email access often forgot their passwords. Finally, low literacy levels made it very difficult for these job seekers to sign up to use the application.

The results of this work suggested three fundamental design principles to address these shortcomings: compatibility (for example, systems should accept photographed images of résumés); practicality (for example, systems should provide ways to submit résumés offline, such as through kiosks or offline networking devices); and familiarity (for example, systems should allow registration via SMS or familiar mobile accounts such as Instagram or Facebook).

The research findings raise concerns as employers increasingly require job seekers to apply online; government services, such as unemployment benefits, must be applied for online (in some cases, this can be done via phone); and other services, such as health care and even housing, are researched and obtained online. Connecting underserved job seekers with individuals who are employed is commendable. Without making these technologies more inclusive, however, many individuals will continue to be left behind.

Conclusion
For a computer scientist who cares about designing, implementing, and deploying inclusive tools, it is vital to build software and software techniques that aim to:
• Improve poor infrastructures such as unstable network connectivity and electricity or provide solutions that do not require these infrastructures to be stable.
• Empower individuals to produce information, not only to consume it.2
• Connect individuals to others with more resources and help support development within and across communities.

When creating software, computer scientists and practitioners should consider issues that persist in limited-resource areas, such as unstable access to electricity and water and overall poor infrastructure. Designing for other constraints such as limited reading and digital literacy and limited Internet access is often beneficial to populations beyond underserved groups. Empowering individuals to produce information and strengthen bonds within and across communities is critical to reducing systemic issues that exist beyond our control, such as social or income inequality.

References
1. Alkire, S., Roche, J.M. and Seth, S. Multidimensional Poverty Index 2013. Oxford Poverty and Human Development Initiative; http://www.ophi.org.uk/wp-content/uploads/Global-Multidimensional-Poverty-Index-2013-8-pager.pdf?0a8fd7.
2. Dell, N. and Kumar, N. The ins and outs of HCI for development. In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2016, 2220–2232.
3. Heeks, R. and Bhatnagar, S.C. Understanding success and failure in Information Age reform. Reinventing Government in the Information Age: International Practice in IT-enabled Public-sector Reform. R. Heeks, (Ed). Routledge, London, 1999, 49–74.

Tawanna Dillahunt is an assistant professor at the University of Michigan’s School of Information and holds a courtesy appointment with the electrical engineering and computer science department. She leads the Social Innovations Group (www.socialinnovations.us), an interdisciplinary group specializing in the R&D of ubiquitous and social computing technologies.

Personal Fabrication
By Stefanie Mueller and Patrick Baudisch

Personal fabrication tools such as 3D printers are paving the way to a future in which nontechnical users will be able to create custom objects. With the recent decrease in price for 3D-printing hardware, these tools are about to enter the mass market. The cost of the average consumer 3D printer has dropped from about $14,000 in 2007 to $500 today. Given the decreasing price, it is not surprising that the number of consumer 3D printers sold has doubled every year—from 6,000 in 2010 to more than 270,000 in 2015.

While the hardware is now more affordable and the number of people who own a 3D printer is increasing, only a few people use the printers to create 3D models. Most users download models from a platform such as Thingiverse, and after downloading, fabricate them on their 3D printers. At most, users adjust a few parameters of the model, such as changing its color or browsing among predetermined shape options.

Personal fabrication has the potential for more: instead of only consuming existing content, nontechnical users in the future could use 3D printers to create objects that only trained experts can create today. In the past few years, human-computer interaction (HCI) and graphics researchers have worked on a number of challenges to move us toward such a future.

In our paper, “Personal Fabrication” (Foundations and Trends in Human–Computer Interaction 10, 3–4, 2017), we provide a comprehensive overview of these challenges and include a survey of more than 200 recent papers. Here, we summarize three of those papers, which are representative of the larger field.
Multimaterial Fabrication
Vidimce, K., Kaspar, A., Wang, Y., Matusik, W.
Foundry: Hierarchical material design for multi-material fabrication. In
Proceedings of the 29th Annual Symposium on User Interface Software and
Technology, 2016, 563–574; http://dl.acm.org/citation.cfm?id=2984516.

With 3D printing, users can design every aspect of an object: they can create
a specific appearance (for example, a desired shape, color, and reflectance), a
specific feel (for example, by printing tactile textures or by using soft
materials), and they can make an object perform a desired function (for
example, using conductive materials for printed electronics or optically clear
materials for printing light pipes).

Functional properties can also be achieved by designing the internal structure
of an object, for example, making an object stand by redistributing its infill
to shift its center of mass. Finally, by using microstructures with
structurally varying cells, researchers have shown how to emulate different
material behaviors using a single material (so-called metamaterials).

This paper provides an editing environment for designing such multimaterial
composite objects. Composing a set of operators into an operator graph creates
the material definitions. The operators are implemented using a domain-specific
language for multimaterial fabrication, and users can easily extend the library
by writing their own operators.

Domain Knowledge
Umetani, N., Koyama, Y., Schmidt, R. and Igarashi, T.
Pteromys: Interactive design and optimization of free-formed free-flight model
airplanes. ACM Transactions on Graphics 33, 4 (2014), Article 65;
http://dl.acm.org/citation.cfm?id=2601129.

Professional CAD tools require years of engineering training to gain the
necessary expertise as they provide fine control over every parameter in the
design. HCI and graphics researchers have looked at how to create design tools
that abstract away the necessary domain knowledge by letting users specify the
shape and motion of the desired object; the system then simulates the
mechanical behavior and either critiques the user’s design or automatically
adjusts the design to make it comply with respect to forces.

Consider even the simple example of designing a paper glider as presented in
this paper. Depending on the orientation of the glider, it will be subject to
drag forces that make the glider resist the airflow and lift forces that move
the glider upward. All forces depend not only on the shape of the glider, but
also on its velocity and orientation at a certain moment in time; thus, they
are constantly changing as the glider moves through the air. This creates a
large parameter space that is infeasible for the user to tackle manually. The
design tool described in this paper lets users design the shape of the glider,
and as they design, provides real-time feedback on the flight performance.

From Batch-Processing, To Turn-Taking, To Direct Manipulation For Physical ‘Data’
Willis, K.D.D., Xu, C., Wu, K.-J., Levin, G., Gross, M.D.
Interactive fabrication: New interfaces for digital fabrication. In
Proceedings of the 5th International Conference on Tangible, Embedded, and
Embodied Interaction, 2010, 69–72; http://dl.acm.org/citation.cfm?id=1935716.

When simulation is not possible, such as when testing the aesthetics and
ergonomics of a design, users have to fabricate the object to evaluate it.
Since 3D printing is so slow, users have to think carefully before printing, as
every mistake may imply another hour-long or even overnight print.

Planning every step, however, is not feasible for nontechnical users as they
lack the experience to reason about the consequences of their design decisions.
To solve the problem, researchers proposed repeating the evolution of the user
interface from personal computing for personal fabrication. Computing also
started with machines that ran a program in one go overnight. Then turn-taking
systems such as the command line evolved, which provided users with feedback
after every input; finally, direct-manipulation interfaces (such as today’s
multitouch devices) provided users with continuous feedback during editing.
Applying the same interaction concept to fabrication results in a workflow that
looks quite different from what is seen today.

In the new “interactive fabrication” workflow presented in this paper, users
work hands-on on the physical workpiece using physical tools (much like in
crafting), and see the physical workpiece change immediately as they edit. The
fast feedback loop allows users to evaluate every intermediate step, allowing
them to adjust their decisions along the way. This potentially makes editing of
physical data as easy as editing digital data on a multitouch device today.

Conclusion
When will personal fabrication reach consumers? The journey has only just
begun. If the starting point is deemed to be 2009 (that is, when the first
patents ran out and the first low-cost 3D printer, the MakerBot Cupcake CNC,
appeared on the market), then we are clearly still at the very beginning of
putting personal fabrication into the hands of consumers.

If personal fabrication today feels like a niche technology for hobbyists, it
is most likely because there are still decades to go. We should look at the
in-between progress with patience. The success of other technologies such as
personal computing could certainly not have been predicted until decades after
their inception (consider the words of Thomas Watson, president of IBM, in
1943: “I think there is a world market for maybe five computers”).

If personal fabrication should turn out anything like personal computing in
its adoption, we still have an amazing journey ahead of us.

Stefanie Mueller is an assistant professor in MIT’s electrical engineering and
computer science department, joint with mechanical engineering, and is a member
of MIT CSAIL. She develops novel hardware and software systems that advance
personal fabrication technologies.

Patrick Baudisch is a professor of computer science at Hasso Plattner Institute
at Potsdam University and chair of the Human Computer Interaction Lab.
Previously, he worked as a research scientist in the Adaptive Systems and
Interaction Research Group at Microsoft Research and at Xerox PARC.

Copyright held by owners/authors.
Publication rights licensed to ACM.
O C TO B E R 2 0 1 7 | VO L. 6 0 | N O. 1 0 | C OM M U N IC AT ION S OF T HE ACM 49
practice
DOI:10.1145/3106633
Article development led by queue.acm.org

Four Ways to Make CS and IT More Immersive
Why the Bell curve hasn’t transformed into a hockey stick.
BY THOMAS A. LIMONCELLI

The environment I was trying to reproduce, however, was not normal, or more
accurately, it was not typical. A typical IT organization is, in comparison, in
utter disarray. The quality of IT organizations follows a bell curve: A few
percent run like fine-tuned machines, a few percent look like toxic waste dumps
on fire, and the vast majority are somewhere in the middle.

Fortunately for me, I won the IT career lottery. Early in my career I saw what
the best in class looked like and considered it normal. Later, this high
standard made me look like a visionary. The truth is I just didn’t know any
other way.

Most IT practitioners are not so fortunate. They are not blessed with the same
experience I was afforded, and they literally do not know any better. This, I
believe, is why the bell curve has not transformed into a hockey stick, or even
a lopsided blob. This is why we cannot have nice things.
Here are a few small and big things that universities could do.

1. Use DevOps tools from the start. Students should use source-code
repositories such as Git, and CI/CD tools such as Jenkins, as they do their CS
homework. These processes should be established as the normal way to work.
Professors should expect homework assignments to be turned in by linking to

spoke to a roomful of third-year CS majors and was shocked to learn that most
didn’t know HTML. The curriculum was fairly standard: undergraduate algorithms
and such. HTML was something you learned in the art department; the computer
science department was for serious students.

I think there is a middle ground between serious computer science theory

a result of an HTTP request and, by default, generates logging information,
monitoring metrics, and so on.

Yes, that is a bit much for an introductory student’s “Print your name 10
times” program. But after that, generate a Web page!

3. IT curricula should be immersive. How could formal education better emulate
the immersive experience that
I was lucky enough to benefit from?

Most IT curricula are bottom up. Students are taught individual subsystems,
followed by higher levels of abstractions. At the end, they learn how it all
fits together. Toward the end of their college careers, they learn the best
practices that make all of it sustainable. Or, more typically, those
sustainability practices are not learned until later, when the new graduate has
a job and is assigned to a coworker who explains “how things work in the real
world.”

Instead, an IT curriculum should start with a working system that follows all
the best practices. Students should see this as the norm. They can dissect the
individual subsystems and put them back together, rather than building them
from scratch.

The Masters of System Administration curriculum at the University of Oslo
includes a multiweek immersive experience called the Uptime Challenge.1
Students are divided into two teams, and each team is given a Web-based
application, including multiple Web servers, a load balancer, a database, and
so on. The application is a simple social network application called BookFace.

Once the system is running, the instructor enables a system that sends an
ever-increasing amount of simulated traffic to the application. Each team’s
system is checked for uptime every five minutes. The team receives a certain
amount of fake money (points) if the site is up, and a small bonus if the page
loads within 0.5 seconds. If the site is down, money is deducted from the team.
This simulates a typical website business model: you make money only if the
site is up. Faster sites are more appealing and profitable. Customers react to
down or slow sites by switching to competitors; thus, those lower-performing
sites lose money.

The challenge lasts multiple weeks, during which the students learn to perform
common web-operation tasks such as software upgrades, bug fixing, task
automation, performance tuning, and so on. Inspired by Netflix’s Chaos
Monkey,2 individual hosts are randomly rebooted to test the resiliency of the
overall system.

The Uptime Challenge enables students to understand IT’s value to the
organization and to identify the IT processes that impact this value and permit
continuous improvement. As a result, students are more motivated and better
able to assess their own work. This leads to improved engagement and fosters
more practical class discussions. It creates a direct feedback loop between a
student’s actions and the value they create. Most importantly, it better
prepares students for the real world.

4. Be immersive from the start. IT projects usually involve some kind of legacy
system. The most apt analogy is being asked to change the tires on a truck
while it is being driven down the highway.

Software engineers spend more time reading other people’s code than writing
their own. We evolve existing systems. Green-field or “fresh start”
opportunities are rare. Many people I have met have never been in a situation
where they designed a new network, application, or infrastructure from scratch.
Why can’t education better prepare students for this?

Could something like the Uptime Challenge be introduced even earlier in the
educational process?

Perhaps on the first day of class students should be handed not only copies of
the syllabus, but also the username and passwords to the administrative control
panel of a working system. Instruction and labs could be oriented around
maintaining this system. Students would have their own wikis to maintain
documentation and operational runbooks.

Each student would have their own working system, but I suggest that every few
weeks students be randomly reassigned to administer a different system. Seeing
how their fellow students had done things differently would be educational.
Also, the best way to learn the value of a well-written runbook is to inherit
someone else’s badly maintained runbook.

Institutions are developing more immersive educational strategies. In
cooperation with industry, Bossier Parish Community College(a) has created an
Associate of Applied Science in Systems Administrator degree, which is highly
immersive and covers core DevOps principles, including automation technologies
(infrastructure as code and software-defined networking), Lean/KanBan ideas,
cloud fundamentals, and even DevSecOps.

Conclusion
Education should seek to normalize best practices from the start. Working
outside these best practices should be considered a bug. Students should not
struggle to learn best practices after graduation, and they should be shocked
if potential new employers do not already have these practices in place.

Both IT and CS curricula could be structured to be more immersive, as immersive
education more reliably reflects the real world. It prepares students for
industry and better informs the research of those who choose that path. Seeing
the forest, and then understanding the trees, helps students understand why
they are learning something before they learn it. It is more hands-on and
therefore more engaging, and lends itself to gamification.

Our first experiences cement what becomes normal for us. Students should start
off seeing a well-run system, dissect it, learn its parts, and progressively
dig down into the details. Don’t let them see what a badly run system looks
like until they have experienced one that is well run. A badly run system
should then disgust them.

Related articles on queue.acm.org

Undergraduate Software Engineering
Michael J. Lutz, et al.
http://queue.acm.org/detail.cfm?id=2653382

A Conversation with Alan Kay
http://queue.acm.org/detail.cfm?id=1039523

Evolution of the Product Manager
Ellen Chisa
http://queue.acm.org/detail.cfm?id=2683579

References
1. Begnum, K. and Anderssen, S.S. The Uptime challenge: A learning environment
for value-driven operations through gamification. Usenix J. Education in System
Administration 2, 1 (2016); https://www.usenix.org/jesa/0201/begnum.
2. Tseitlin, A. The antifragile organization. Commun. ACM 56, 8 (Aug. 2013),
40–44.

(a) https://www.bpcc.edu/catalog/current/tech-
nologyengineeringmathematics/aas-system-
administration.html

Thomas A. Limoncelli is a site reliability engineer at Stack Overflow Inc. in
NYC. He blogs at EverythingSysadmin.com and tweets at @YesThatTom.

Copyright held by owner/author.
Publication rights licensed to ACM. $15.00
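
The Uptime Challenge scoring rule described in this article (points when the site is up, a small bonus when the page loads within 0.5 seconds, a deduction when it is down, with a check every five minutes) can be sketched in a few lines of Python. This is only an illustration, not the University of Oslo implementation: the point values and the names Check, score_check, and team_balance are assumptions, since the article gives only the 0.5-second threshold and the five-minute interval.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical amounts: the article does not state the real values.
REWARD_UP = 10.0        # credited when a check finds the site up
BONUS_FAST = 2.0        # extra credit when the page loads fast enough
PENALTY_DOWN = 15.0     # deducted when a check finds the site down
FAST_THRESHOLD_S = 0.5  # page-load threshold named in the article


@dataclass
class Check:
    """Result of one uptime probe (the article says one runs every five minutes)."""
    site_up: bool
    load_seconds: float = 0.0


def score_check(check: Check) -> float:
    """Return the change in a team's balance for a single probe."""
    if not check.site_up:
        return -PENALTY_DOWN
    reward = REWARD_UP
    if check.load_seconds <= FAST_THRESHOLD_S:
        reward += BONUS_FAST
    return reward


def team_balance(checks: Iterable[Check]) -> float:
    """Sum rewards and penalties over a sequence of probes."""
    return sum(score_check(c) for c in checks)


if __name__ == "__main__":
    history = [Check(True, 0.3), Check(True, 0.9), Check(False), Check(True, 0.5)]
    print(team_balance(history))
```

Making the penalty larger than the reward, as assumed here, mirrors the article's point that downtime costs a site more than a single good interval earns; a real course would tune these amounts to its own goals.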
Barriers to Refactoring

key insights
˽˽ Developers understand the value of refactoring but are often prevented from
doing it due to factors beyond their control.

REFACTORING6 IS SOMETHING software developers like to do. They refactor a lot.
But do they refactor as much as they would like? Are there barriers that
prevent them from doing so? Refactoring is an important tool for improving
quality. Many development methodologies rely on refactoring, especially agile
methodologies, but also those in more plan-driven organizations. If barriers
exist, they would undermine the effectiveness of many product-development
organizations. We conducted a large-scale survey in 2009 of 3,785
practitioners’ use of object-oriented concepts,7 including questions as to
whether they would refactor to deal with certain design problems. We expected
either that practitioners would tell us our choice of design principles was
inappropriate for basing a refactoring decision or that refactoring is the
right decision to take when designs were believed to have quality problems.
However, we were told the decision of whether or not to refactor was

rized as follows:

Resources. Concern over the resources required was a frequently cited reason
for not refactoring. The resource mentioned most often was time, as in
“Deadlines often don’t allow refactorings”; “Sometimes there is just no time”;
and “No time no time.”

Risk. Also frequently cited was the risk involved in making a change, in
particular introducing new faults or other problems, as in “That kind of
refactoring is time consuming and there is a large risk of introducing bugs”
and “If it’s working I leave it alone.”

Difficulty. Another concern was the difficulty in making the change, as in
“Inheritance is tricky to refactor correctly” and “This kind of refactoring is
usually difficult.”

ROI. Participants acknowledged that while there may be benefits from
refactoring, there are also costs, and the return on investment, or ROI, has to
be considered, as in “Again, I have to weigh the costs and benefits. The
benefits have to be clear before taking on the costs of refactoring, retesting,
etc.”

Technical. Participants reported a variety of constraints due to
characteristics of the project that restricted
contributed articles
We classified participants responding “Always” or “Most” to questions Q16 and
Q20 as an indication that they would refactor classes they perceived as being
“bad.” Considerably more (952) agreed with having a limit on the depth of a
class than on the number of methods (452). For those participants inclined to
limit the number of methods, 265/452 (58.6%) would refactor classes that exceed
their choice of limit (in bold). For those inclined to limit the depth of a
class, 364/952 (38.2%) would refactor if the choice of depth was exceeded.

So developers are more inclined to limit the depth of a class than the number
of methods but less inclined to do anything when that limit is exceeded. In
both cases a significant proportion would not refactor their designs even when,
by their own assessment, a particular design had problems. We wanted to know
why this was so.

One possibility is there is something about the participants’ backgrounds that
influences their decision to refactor or not. For example, project managers may
be more concerned about resources and risk, and inexperienced developers may
view refactoring as too difficult. We collected demographic information for all
participants, including amount of experience, programming language, operating
system, role, type of development, and highest qualification. This information
indicated our sample provided a broad representation of developers. Half of the
3,785 participants reported more than four years of experience, half had a
bachelor’s, and another 30% had master’s or higher academic credentials. Within
the role category, almost all reported having a programmer role (95.6%), but
all other roles also had good representation; for example, architects made up
64.1% of the total. Participants could report more than one role. The highest
reported language was C# (55.7%), though Java (49.4%) and C++ (45%) were also
well represented; full details are available in Gorschek et al.7

We examined just those participants who agreed with having a limit, as they
were responding to whether they would refactor to deal with a design issue. Our
statistical analysis, which was based on logistic regression, showed two
groups, architects and experienced C# programmers, were more inclined to
refactor if the number of methods exceeded the participants’ chosen limit. If
the class depth exceeded the limit, C# developers and architects were again
more inclined to refactor. Those in the programmer role were less inclined to
refactor. Given that almost all participants indicated having a programmer
role, this result may simply reflect the overall response.

These results are interesting, although none of the variables are good
predictors, as they only weakly indicate whether or not developers will
refactor. Neither participant role nor background gave a clear answer as to why
they chose not to refactor even if their class design contradicted their view
of good design. We had to dig deeper into the motivation offered by the
respondents.

Participant Comments
To better understand what influences a decision to refactor, we examined the
comments provided by those participants who indicated there should be a limit
but would not refactor if the limit was exceeded. As outlined in Table 2, 162
participants who would limit the number

Table 1. Likelihood of refactoring a class if limit of number of methods or
class depth is exceeded (all participants).

Response   Q16 Refactor Methods   Q20 Refactor Depth
Always       125    3.3%            120    3.2%
Most         845   22.3%            536   14.2%
Obvious    1,725   45.6%          1,660   43.9%
Nothing      415   11.0%            366    9.7%
Forces       128    3.4%            224    5.9%
Hardly       547   14.5%            879   23.2%
Total      3,785                  3,785

Table 2. Likelihood of refactoring by those who limit number of methods or
class depth.

Response to   Q14 Limit     Comments      Q18 Limit     Comments
Q16/Q20       Methods       for Q16       Depth         for Q20
Always         38   8.4%    14   8.6%      76   8.0%     29   8.4%
Most          227  50.2%    84  51.9%     288  30.2%    102  29.7%
Obvious       145  32.1%    50  30.9%     415  43.6%    138  40.1%
Nothing        28   6.2%     7   4.3%      77   8.1%     21   6.1%
Forces          4   0.9%     2   1.2%      26   2.7%     12   3.5%
Hardly         10   2.2%     5   3.1%      70   7.4%     42  12.2%
Total         452          162            952           344

Table 3. Comment category for those who would not refactor (“No”) and those who
would refactor (“Yes”); participants may be in more than one category.

Category       Limit Methods         Limit Depth
               No            Yes     No             Yes
Resources      23   47.9%    27       49   36.0%     8
Risk           19   39.6%     6       43   31.6%     1
Technical      11   22.9%    11       25   18.4%    17
Difficulty      3    6.2%     5       13    9.6%     6
ROI             0    0%       0       13    9.6%     3
Management      1    2.1%     0       11    8.1%     3
Tools           1    2.1%     1        0    0%       0
Participants   48            46      136            35
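
The two proportions quoted for Table 2 (265/452 and 364/952) follow from adding the “Always” and “Most” rows, which the article treats as “would refactor.” The short Python sketch below re-derives them from the table's counts; the variable and function names are illustrative assumptions, not part of the survey instrument.

```python
# Counts copied from Table 2 (participants who agreed a limit should exist).
# The dictionary names are hypothetical labels chosen for this sketch.
LIMIT_METHODS = {"Always": 38, "Most": 227, "Obvious": 145,
                 "Nothing": 28, "Forces": 4, "Hardly": 10}
LIMIT_DEPTH = {"Always": 76, "Most": 288, "Obvious": 415,
               "Nothing": 77, "Forces": 26, "Hardly": 70}

# The article classifies "Always" and "Most" responses as "would refactor".
WOULD_REFACTOR = ("Always", "Most")


def refactor_share(counts):
    """Return (would-refactor count, total, percentage rounded to one decimal)."""
    total = sum(counts.values())
    yes = sum(counts[key] for key in WOULD_REFACTOR)
    return yes, total, round(100 * yes / total, 1)


if __name__ == "__main__":
    print(refactor_share(LIMIT_METHODS))  # (265, 452, 58.6)
    print(refactor_share(LIMIT_DEPTH))    # (364, 952, 38.2)
```

The same function applied to Table 1's counts would give the share of all 3,785 participants who answered “Always” or “Most.”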
None of the material we used to recruit participants mentioned refactoring or
gave specifics of the concepts we would be asking about. Consequently, unlike
other studies, our sample did not include only developers known to refactor, so
there should have been no obvious bias for or against the use of refactoring.
We recruited through personal contacts, word of mouth, and social media,
supported by our website http://se-
folklore.com and a YouTube video. The demographic data we collected reflected a
variety of experience, roles, languages, level of qualification, and type of
development.

We set a high bar in assessing whether a participant would agree with placing a
limit on number of methods or class depth or would still refactor when that
limit is exceeded. Different choices would change the distribution of the
frequencies in the different categories somewhat, but the same trends would be
evident.

Our wording of the possible responses might have affected what responses
participants chose, but the barriers we identified came from the free-text
commentary, which was unlikely to be affected by such concerns. Even those who
would refactor mentioned some of the same concerns (as reported in Table

tor, whereas ours included those who chose not to refactor. That these more
recent surveys report findings similar to ours suggests there has been little
change in barriers to refactoring since our survey.

We used two metrics to identify potential design-quality issues based on
published theories. It is possible the theories are wrong. However, our
conclusions are based on comments by participants who (rightly or wrongly)
believed the theories.

Conclusion
There are significant barriers preventing developers from refactoring to remove
software design-quality issues, no matter how they are identified. Reducing or
even eliminating the barriers has the potential to significantly improve
software quality. One means is to provide refactoring support that is
goal-directed rather than operations-directed. Another is to provide better
quantification of the benefits, thus better informing the decision as to
whether or not to refactor.

Acknowledgments
We thank our survey participants, many of whom made the extra effort to provide
the comments we included and discussed here. We also thank the

8. Harun, M.F. and Lichter, H. Towards a technical debt-management framework
based on cost-benefit analysis. In Proceedings of the 10th International
Conference on Software Engineering Advances (Barcelona, Spain). International
Academy, Research, and Industry Association, 2015.
9. Kazman, R., Cai, Y., Mo, R., Xiao, L., Feng, Q., Haziyev, S., Fedak, V., and
Shapochka, A. A case study in locating the architectural roots of technical
debt. In Proceedings of the 37th International Conference on Software
Engineering (Firenze, Italy). IEEE Press, 2015.
10. Kim, M., Zimmermann, T., and Nagappan, N. A field study of refactoring
challenges and benefits. In Proceedings of the ACM SIGSOFT 20th International
Symposium on the Foundations of Software Engineering (Cary, NC). ACM Press, New
York, 2012, article 50.
11. Murphy-Hill, E. and Black, A.P. Refactoring tools: Fitness for purpose.
IEEE Software 25, 5 (Sept.-Oct. 2008), 38–44.
12. Murphy-Hill, E., Parnin, C., and Black, A.P. How we refactor, and how we
know it. IEEE Transactions on Software Engineering 38, 1 (Jan. 2012), 5–18.
13. Robson, C. and McCartan, K. Real World Research, Fourth Edition. John Wiley
& Sons, Inc., 2015.
14. Saldana, J. The Coding Manual for Qualitative Researchers, Third Edition.
SAGE Publications Ltd., 2016.
15. Shatnawi, R., Li, W., Swain, J., and Newman, T. Finding software metrics
threshold values using ROC curves. Journal of Software Maintenance and
Evolution: Research and Practice 22, 1 (Jan. 2010), 1–16.
16. Szöke, G., Nagy, C., Ferenc, R., and Gyimóthy, T. A case study of
refactoring large-scale industrial systems to efficiently improve source code.
In Proceedings of the 14th International Conference on Computational Science
and Its Applications (Guimarães, Portugal, June 30–July 3). Springer
International Publishing, Cham, Switzerland, 2014, 524–540.
17. Vakilian, M., Chen, N., Negara, S., Rajkumar, B.A., Bailey, B.P., and
Johnson, R.E. Use, disuse, and misuse of automated refactorings. In Proceedings
of the 34th International Conference on Software Engineering (Zurich,
Switzerland). IEEE Press, 2012, 233–243.
18. Xing, Z. and Stroulia, E. Refactoring practice: How it is and how it should
be supported (an Eclipse case study). In Proceedings of the 22nd International
Conference on Software Maintenance (Philadelphia, PA). IEEE Press, 2006,
458–468.
DOI:10.1145/3132745

Millennials’ Attitudes Toward IT Consumerization in the Workplace
Millennials entering the workforce ignore the risks of using privately owned
devices on the job.
BY HEIKO GEWALD, XUEQUN WANG, ANDY WEEGER, MAHESH S. RAISINGHANI, GERALD GRANT,
OTAVIO SANCHEZ, AND SIDDHI PITTAYACHAWAN

key insights
˽˽ Members of Generation Y (so-called millennials) see the use of their
privately owned devices for work as a necessity, not an option.
˽˽ Millennials focus on their personal benefit, generally ignoring the risks
they may introduce into corporate networks when using their own devices and
accounting for risk only if it threatens them directly.
˽˽ When it comes to weighing risks against benefits, such behavior is seen
across developed economies, with no significant cultural influence across an
international sample.

PEOPLE BORN AFTER 1980, often called “millennials” by demographic researchers,
behave differently from older generations in significant ways. They are the
first “digital natives,” the “always on generation” that expects to have
information instantly and always available at its fingertips. Their attitudes
have been described by previous research in often unfavorable terms. And when
they enter the workplace, they pose a major challenge to managers from older
generations, who, it has been shown, typically follow a different set of
values.

Our research investigates the attitudes of millennials who have not yet entered
the workforce toward the use of information technology (IT) in terms of “IT
consumerization.” Specifically, we want to know how this significant part of
the population weighs benefits against risks when it comes to intention to use
technology in a business environment.

In 2013, we conducted an international study involving 402 students in their
final year of undergraduate study just before entering the workplace. We
received feedback from students at Neu-Ulm University of Applied Sciences
(Germany), Dongbei University of Finance & Economics (China), Texas Woman’s
University (U.S.), Carleton University (Canada), Fundação Getulio Vargas
(Brazil), and RMIT University (Australia). We found they share a common set of
values regardless of nationality, including motivational drivers that would
alarm corporate IT managers, if known. The individuals in our sample value
their own benefit highly and dramatically neglect the risks their actions might
pose.

The way we work, think, and behave is heavily influenced by the Internet,
email, smartphones, and other technological innovations that have proliferated
over the past 20 to 30 years. The generation of people born after 1980 is the
first to grow up with information everywhere, anytime24 and is referred to as
digital natives, or, more commonly, millennials.15

Many studies have sought to analyze them.20 For example, in their 2010
literature review, Ng et al.24 characterized them as “want it all” and “want it
now.” It seems generally accepted in research and practice alike that
millennials are difficult to cope with
To partially close this gap, we conducted our study on the motivational factors that shape millennials’ intention to use technology in the context

tablets, and smartphones are used for business tasks but also to software when online email services or cloud-storage solutions are used for busi-

ee (such as when responding to email messages on weekends when technically not at work). Not only users, but their employers as well, face notable
challenges from this trend. Anecdotal evidence indicates corporate IT departments are under pressure to give in to user demands to be allowed to use privately owned IT for work purposes. However, granting myriad different consumer devices access to the corporate network is a nightmare for anyone concerned about IT security.

Consumerization of IT seems to be a key characteristic of millennials, as their desire to be always on is not limited only to the workplace. In the same vein, they are accustomed to always using state-of-the-art technology, something not every work environment is able to provide, especially because the definition of “state of the art” is subjective.

Our study focused on mobile devices as an exemplary technology to explain millennials’ behavioral intentions when it comes to the use of technology. This area is of great concern to practitioners, including CIOs and senior IT managers.31

To understand the role of such contradictory factors in individual decision making, social psychology provides net-valence models (NVMs) that assume individuals intend to perform an action only if the perceived benefits outweigh the associated costs.7,26 Prior research found NVMs help explain the adoption of technology-related services.17 Other prominent theories on technology adoption (such as Unified Theory of Acceptance and Use of Technology33) do not capture the risks associated with technology use, which is why we chose NVMs as our theoretical lens (see Figure 1).

Cultural values and IT use behavior. Millennials’ use of corporate IT involves multiple challenges for IT executives worldwide. However, the literature suggests behavioral models do not apply universally across all cultures.30 Research by Srite and Karahanna30 showed the significance of factors determining technology use is notably dependent on espoused cultural values, particularly those reflecting national culture. National culture refers to “the collective programming of the mind that distinguishes the members of one group or category of people from another.”11 Hofstede and Bond10 proposed five dimensions of national culture: power distance; individualism/collectivism; masculinity/femininity; uncertainty avoidance; and long-term orientation. Other research found that national cultural values strongly affect an individual’s IT-adoption behavior.12,32

In order to understand how millennials perceive benefits and risks associated with IT consumerization across different cultures, we identified uncertainty avoidance (UA) and individualism/collectivism (IC) as the most relevant dimensions. Power distance, masculinity/femininity, and long-term orientation are important in the more general context of technology adoption but less relevant in understanding the effect of perceived risks and benefits in the context of our study.

“Uncertainty avoidance” refers to “the extent to which individuals feel vulnerable to unpredictable and unknown situations.”9 People with strong UA values fear uncertainty. In the context of work-related technology, they need the predictability often provided by rules, policies, and structure in organizations that IT consumerization contradicts or dilutes. UA can thus help understand how millennials perceive the risks associated with IT consumerization.

“Individualism/collectivism” is one of the most widely studied cultural values in cross-cultural research,30 referring to “an individual’s preference for a social framework where individuals take care of themselves (individualism), as opposed to how they expect the group to take care of them in exchange for their loyalty (collectivism).”9 Individuals with individualistic values have a more complex and more frequently sampled private self. Consequently, their own goals, beliefs, and values are more salient. Considering technology use at work, they are more concerned with the benefits they might achieve than the disadvantages that could arise for others. IC is thus useful in understanding how millennials perceive benefits associated with IT consumerization, especially when there could be conflict between themselves and their employers.

Research Model
Based on NVMs, it is assumed an individual’s behavioral intention to use privately owned devices on the job is determined by the outcome of weighing perceived benefits vs. perceived risks/associated costs, as in Figure 1.

Perceived benefits. Perceived benefits “include all benefits which the customer perceives as having been received.”18 In the context of IT use, they reflect the overall positive utility individuals expect when using a particular technology.12 Prior research demonstrated that perceived benefits significantly affect behavioral intention regarding the use of IT.16,19

We define perceived benefits as individuals’ assessment of the functional benefits they associate with using a privately owned device for work purposes. Building on the premises of technology acceptance and use models,22,29,33 we propose that the benefits of using a privately owned device for work purposes are related to the characteristics of the technology and the functional advances it provides.29 We thus assume perceived benefits as a multidimensional construct comprising three facets of employment behavior: performance expectancy; effort expectancy; and compatibility.

Employees may realize productivity gains when allowed to select devices on their own.14 Consequently, performance expectancy reflects the extent to which individuals perceive that using privately owned devices supports their ability to perform better at work.33 Moreover, devices selected by individual employees are usually perceived as easier to use and more intuitive than those provided by an IT department.25 We thus define effort expectancy as the degree of ease an individual associates with using a privately owned device as compared to using a device provided by an IT department. Overall benefit perceptions are also formed by an individual’s work style and associated needs and values. To capture these influential factors, Moore and Benbasat22 proposed the construct “compatibility” as the degree to which using a privately owned device for work purposes fits the individual’s work style. Employees who agree to be available for work responsibilities (such as to respond to email messages) after work hours are more likely to see the use of their devices for business purposes as beneficial.
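The weighing of perceived benefits against perceived risks that underlies the research model can be illustrated with a minimal sketch. The function names, the 1-to-7 rating scale, the sample ratings, and the unweighted averaging are all assumptions made for illustration; the study estimates the contribution of each facet statistically rather than averaging them.

```python
from statistics import mean

def net_valence(benefit_ratings, risk_ratings):
    """Return mean perceived benefit minus mean perceived risk.

    Both arguments map a facet name to a rating on a hypothetical
    1-to-7 agreement scale. Equal weighting is an assumption.
    """
    return mean(benefit_ratings.values()) - mean(risk_ratings.values())

def intends_to_use(benefit_ratings, risk_ratings):
    """Net-valence rule: act only if perceived benefits outweigh perceived risks."""
    return net_valence(benefit_ratings, risk_ratings) > 0

# Hypothetical respondent; the ratings are invented for illustration.
benefits = {"performance_expectancy": 6, "effort_expectancy": 5, "compatibility": 6}
risks = {"perceived_risk": 3}
decision = intends_to_use(benefits, risks)
```

A respondent whose benefit ratings average higher than the risk ratings gets a positive net valence, which the model reads as an intention to use a privately owned device at work.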
Following these insights, we hypothesize that perceived benefits influence individuals’ consumerization behavior:

Hypothesis 1. The greater the perceived benefits of using privately owned devices for work purposes, the greater an individual’s intention to participate in a BYOD program.

“Perceived risk” reflects negative utility from a subjective perspective, a concept introduced by Bauer2 as part of his “Perceived Risk Theory,” which assumes subjective risk perceptions directly influence an individual’s intention to perform a certain action.4 Perceived risk is defined by Cunningham4 as “the amount that would be lost, or that which is at stake, if the consequences of an act were not favorable, and the individual’s subjective feeling of certainty that the consequences will be unfavorable.” Featherman and Pavlou6 and Hoehle et al.9 found perceived risk plays a significant role in individuals’ IT-use behavior.

To reflect the perceived cost associated with using privately owned devices, we define perceived risk as the belief of individuals about the potential negative outcomes caused by using privately owned devices on the job. The negative consequences of such behaviors can be classified into multiple types of loss, indicating that, as with perceived benefit, perceived risk is a multidimensional construct.6,16 Based on the arguments discussed earlier regarding consumerization and its effects on corporate IT, we hypothesize that using privately owned devices for business purposes encompasses three facets of risk: performance; privacy; and security.

Using privately owned devices for work purposes generally shifts responsibility from the IT department to the individual. For instance, the individual is, at least psychologically, accountable for “how well the [device] will perform relative to expectations.”1 The risk associated with using one’s own devices on the job includes the potential that the device the individual is responsible for is not sufficient for its intended business purpose. Performance risk thus reflects the potential for not being able to perform business activities as expected.

Using a device for both private and business purposes entails the risk that personal information is disclosed to the employer without the employee’s consent and knowledge.21 Privacy risk, defined by Featherman and Pavlou6 as the “potential loss of control over personal information,”6 encompasses this facet of risky behavior. Business data, as well as personal data, is at risk. The potential for corporate data to be exposed to unauthorized third parties also increases when individuals use their private devices for work purposes.25 Information security is one of the most important topics related to IT consumerization, as 90% of all corporate data breaches fall into four patterns:34 lost and stolen devices, user-initiated crimeware, insider misuse, and miscellaneous human errors. To capture this facet of risky behavior, we assume security risk, or potential loss due to fraud or a hacker compromising corporate information security,16 contributes to overall perceived risk.

We thus hypothesize that perceived risk negatively affects individuals’ decisions regarding use of privately owned devices at work:

Hypothesis 2. The greater the perceived risk of using privately owned devices for work purposes, the lower an individual’s intention to participate in a BYOD program.

We also assume the perceived risk associated with IT consumerization influences behavioral intention indirectly by negatively affecting perceived benefits. For instance, as a measure of safeguarding IT security, firms usually adopt policies that allow them to erase data when an employee’s device is lost or stolen. Such “loss of full ownership” significantly affects the perceived benefits of BYOD.28 We thus propose:

Hypothesis 3. The perceived risk of using privately owned devices for work purposes negatively affects an individual’s perception of benefit.

Cultural values. Research provides evidence that millennials’ cultural values influence their technology-use behavior.30 However, it remains to be demonstrated whether the proposed NVM holds across the general population of millennials who reflect a variety of cultural values.30 We propose
that the explanatory power of the theoretical model depends on how distinctively millennials espouse characteristic cultural values.

Based on these arguments, we expect to see distinctions in individualistic values and perceptions toward uncertainty. Using their own devices for work purposes enables millennials to express their sense of self and better achieve their own goals and follow their own beliefs and values. Also, using a privately owned device can indicate that individuals are more strongly concerned with their own needs than with those of the collective, including their employers. We thus expect individuals who espouse individualistic cultural values to more strongly value the benefits of using privately owned devices. Likewise, we assume individuals who experience less difficulty dealing with uncertainty to put less emphasis on the potential risks

[Figure: research model. Performance risk, privacy risk, and security risk feed perceived risks; performance expectancy, effort expectancy, and compatibility feed perceived benefits. Paths: H1 (+) from perceived benefits to behavioral intention; H2 (-) from perceived risks to behavioral intention; H3 (-) from perceived risks to perceived benefits. H4a is drawn next to uncertainty avoidance and H4b next to individualism/collectivism.]

Methodology
Here, we discuss our data collection, sample clustering, data analysis, and results:

Data collection. To test our research model, we developed a questionnaire with a set of measurement items for each construct, and to safeguard measurement validity, we adapted items from prior research, as outlined in the online appendix (dl.acm.org/citation.cfm?doid=3132745&picked=formats).

We distributed the questionnaire among students in their final year of undergraduate studies and with relevant work experience (most respondents worked full time for at least six months during their studies) using an online survey tool. Our approach is consistent with Vodanovich et al.35 who suggested conducting surveys with students to understand how millennials (“digital natives” in their terminology) use technology. We collected data from students with “technology-affine” majors (“information systems,” “industrial engineering,” and “business administration”) in a number of universities worldwide. We chose countries with different values of UA and IC, according to Hofstede.11 After stripping out incomplete ques-
tionnaires, we received a total of 402 valid responses.

Clustering for espoused cultural values. We conducted exploratory factor analysis, a statistical method used to uncover the underlying structure in a large set of variables, to test the “unidimensionality” of the measurement items for IC and UA. It revealed three items measuring IC and two items measuring UA load with a high coefficient on the factors they are intended to measure (loadings > 0.79). Using the factor scores of these items, we then conducted a K-means clustering. Cluster analysis revealed two clusters (see Table 1) where the first cluster (referred to as A) encompassed respondents with high IC scores (cluster center 0.13) and low UA scores (cluster center −0.61), and the second cluster (referred to as B) encompassed

The formative measures of “perceived benefits” were significant, at least at the .05 level, and path coefficients were greater than .1, suggesting the chosen characteristics of each category were relevant for the formation of the construct (see Table 2). Moreover, the variance inflation factor (VIF) was less than 2, supporting our assumption for indicator validity.

The formative measures of “perceived risks” revealed mixed results regarding the risk facets’ contribution to the formation of the formative index. We found privacy risk was not relevant regardless of dataset used; performance risk was significant only in subset B and the complete dataset; and security risk contributed significantly only to the formative index of the complete dataset. Although not all facets of risky behavior contributed significantly, VIF was less than 2 within all three datasets, confirming indicator validity. Consequently, low redundancy of indicators’ information was confirmed.

The results show performance risk contributed significantly to overall risk perception in dataset B and C, while security risk contributed to overall risk perception in only the complete dataset C. Privacy risk did not significantly contribute to perceived risk, regardless of dataset. And performance expectancy, effort expec-

Table 2. Formative constructs measurements.
Construct | Facet | Cluster A | Cluster B | Complete Set C
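The clustering step described in the methodology can be sketched as a plain K-means (Lloyd's algorithm) with two clusters over two-dimensional (IC, UA) factor scores. The sample points, the choice of initial centers, and the function names are invented for illustration; the article does not state which software or initialization the authors used.

```python
from math import dist

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for each point."""
    return [min(range(len(centers)), key=lambda c: dist(p, centers[c])) for p in points]

def update(points, labels, centers):
    """Move each center to the mean of its points; keep it if the cluster is empty."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and update until centers stop moving."""
    labels = assign(points, centers)
    for _ in range(max_iter):
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers)
    return labels, centers

# Hypothetical (IC score, UA score) pairs, not data from the study.
scores = [(0.2, -0.7), (0.1, -0.5), (0.3, -0.6), (-0.2, 0.6), (-0.1, 0.8), (-0.3, 0.7)]
labels, centers = kmeans(scores, centers=[scores[0], scores[3]])
```

Each resulting center is the per-dimension mean of its members, which is how cluster centers such as those quoted for IC and UA would be read.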
review articles
DOI:10.1145/3048384

Internet Advertising: Technology, Ethics, and a Serious Difference of Opinion

Exploring the technical and ethical issues surrounding Internet advertising and ad blocking.

BY STEPHEN B. WICKER AND KOLBEINN KARLSSON

“Every time you block an ad, what you’re really blocking is food from entering a child’s mouth.”25

“In reality, ad blockers are one of the few tools that we as users have if we want to push back against the perverse design logic that has cannibalized the soul of the Web.”26

key insights
- Internet advertisers use networks of supply- and demand-side platforms and automated auctions to deliver targeted advertising to readers in a matter of milliseconds.
- Ad blocking is an existential threat to the Internet advertising industry, with costs to advertisers ranging in the tens of billions of dollars.
- The argument that ad blockers violate an implicit contract between the reader and

IN FALL 2015, Apple introduced a “content blocking”

are rendered.27 As it turned out, large numbers of people wanted to do just that.28 Ad blockers had been available for some time, but their potential use in the world’s most popular mobile browser heightened their saliency and brought the debate over their use, a debate sometimes serious and nuanced but often frivolous, into the mainstream media.29

To put the issue into perspective, consider the following provided by PageFair, “a leading provider of counter ad block solutions to Web publishers,” in its 2015 (ref. 30) and 2016 (ref. 31) reports on ad blocking:
- Ad blocking was estimated to have cost publishers nearly $22 billion during 2015.
- As of November 2016, at least 309 million people are blocking advertising on their smartphones.
- 298 million of these people use an ad blocking browser, more than twice the number using blocking browsers in 2015.
- Ad blocking is particularly popular in emerging markets, with the largest number of active monthly users in China, India, and Indonesia. The U.S. is in ninth place.

In its 2016 report, PageFair made the following prediction: Mobile ad blocking is a serious threat to the future of media and journalism in emerging markets, where people are coming online for the first time via relatively expensive
or slow mobile connections. Usage in Western economies is likely to grow as more manufacturers and browsers start to include ad blocking as a feature.31

Given the amount of money involved in advertising, one might expect a certain amount of invective on the subject of ad blocking. One would be correct. Ad blocking has been referred to as “evil” and as a form of “theft.”32 Ad Age, an advertising industry trade magazine, accused ad blockers of being exploitative, extortionate, and anti-democratic, all within the space of a single sentence:

As abetted by for-profit technology companies, ad blocking is robbery, plain and simple, an extortionist scheme that exploits consumer disaffection and risks distorting the economics of democratic capitalism.33

Randall Rothenberg, president and chief executive officer for the Interactive Advertising Bureau, accuses ad blocking “profiteers” of “stealing from publishers, subverting freedom of the press, operating a business model predicated on censorship of content, and ultimately forcing consumers to pay more money for less (and less diverse) information.”34

On the other side of the debate, many have pointed to the ads themselves as fostering needless consumption while being tasteless, intrusive, and evil (this word occurs a lot in these discussions), while suggesting that the advertising industry brought ad blocking upon itself.1

There are purely technical issues as well. The technology that allows Internet advertisers to better target potential consumers slows the loading of Web pages and places a significant burden on wireless cellular links, a burden that is usually funded by unwilling users. The ad-blocking software provider Shine, an Israeli startup that began life in 2011 as an anti-virus software developer, estimates that advertising consumes between 10% and 50% of user data plans, depending on user location. A typical mobile gaming app with advertising was found to consume 5Mb over a five-minute session, but only 50Kb with ad blocking in place.17

Shine produces ad-blocking software that can be incorporated into cellular datacenters. In June 2016, the U.K. cellular service provider Three became the first to conduct trials using this software to block ads on cellular data connections.12 Given that marketers are expected to have spent over $100 billion on mobile ads in 2016,10 the response is expected to be extreme.

In this article, we explore how advertising networks and ad blockers work. We further consider how ad blockers are subverted, and whether they are ethical. The ethical analysis yields mixed results, but it does suggest a solution that empowers users, allowing them to select the types of ads that they see and how often they see them.

The Technology of Ad Networks and Ad Blockers
Web browsers request a Web page from a server by sending an HTTP GET command to the appropriate Internet host. The host responds with HTML code that the Web browser uses to render the desired page and present it to the user. This much is both simple and ubiquitous, but the details, particularly when advertising is involved, are much more complicated.

Suppose the Web browser requests a page from a content publisher that supports his or her work through advertising (this is represented in the accompanying figure by link 1). Most publishers do not generate their own advertising content, so they will embed requests for advertising into the HTML files they send to requesting users (link 2). When the requesting host attempts to render the HTML file, it will generate requests for advertisements from an ad exchange. The ad exchange, as shown in the figure, sits at the center of a network consisting of supply side and demand side entities. The supply side entities provide information about the user, while the demand side entities provide advertising in response to requests from publishers.

The HTML code provided by the publisher directs the host to a supply-side platform (SSP, link 3). The request sent to the SSP includes a cookie, a small string of information that was previously stored by the SSP on the user’s computer. The cookie enables the SSP to craft a response that is specifically tailored to the requesting user. In this case, the cookie will include a user ID that the ad exchange can use to coordinate bidding for an advertisement.

The ad exchange forwards the user ID and any other information that it may have about the requesting user to one or more demand-side platforms (DSPs) that place bids on behalf of advertisers for the opportunity to display their ads. Through a process known as cookie syncing, the DSPs are able to match the SSP cookie ID to a user profile, which is often stored and managed by a separate entity called a data management platform (DMP).
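To make the cookie-syncing step more concrete, here is a minimal sketch of how a DSP might keep a match table from SSP cookie IDs to its own user profiles. All class, method, and ID names are hypothetical, and real platforms exchange IDs through HTTP redirects and pixel requests rather than direct method calls.

```python
class DemandSidePlatform:
    """Toy DSP that maps SSP cookie IDs onto its own user profiles."""

    def __init__(self):
        self.profiles = {}      # DSP user ID -> profile data (stands in for the DMP)
        self.match_table = {}   # SSP cookie ID -> DSP user ID

    def sync(self, ssp_cookie_id, dsp_user_id):
        """Record that both IDs were seen on the same browser (cookie syncing)."""
        self.match_table[ssp_cookie_id] = dsp_user_id

    def lookup(self, ssp_cookie_id):
        """Return the profile for a bid request, or None if the user is unknown."""
        dsp_user_id = self.match_table.get(ssp_cookie_id)
        return self.profiles.get(dsp_user_id)

# Hypothetical IDs and profile contents, invented for illustration.
dsp = DemandSidePlatform()
dsp.profiles["dsp-42"] = {"interests": ["cycling"]}
dsp.sync("ssp-abc", "dsp-42")
known_profile = dsp.lookup("ssp-abc")
unknown_profile = dsp.lookup("ssp-unknown")
```

Once the two IDs are linked, a bid request carrying only the SSP cookie ID is enough for the DSP to retrieve what it knows about the user.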
[Figure: Internet ad delivery is a complex process involving multiple redirections, synchronization of user information, and an auction, all in a few tens of milliseconds. Nodes: Publisher Website, SSP, Ad Exchange, DSP, DMP, Data Economy, Ad Server; links numbered 1 to 8. Legend: normal HTTP requests/redirects; cookie ID included in HTTP request; actual ad served.]

As multiple SSPs and DSPs can use the same DMP, the DMP may link a wide range of user IDs to the same person. This enables all interested parties (other than the user) to exchange information on the user and form a more complete picture of that user’s browsing history. First-party websites may also participate in the process, providing yet more user information. For example, if a user supplies an email address to a website to sign up for its newsletter, the email address can be linked at the DMP to the cookie IDs associated with that user. If the user provides a name and address to a website, that information may also
be linked to the cookie IDs. The DMP may take this a step further by including information inferred from the user’s social media activity, purchase history on various sites, search history, and email messages. Finally, the DMP may have access to data gathered offline. Data aggregators are known to collect data from publicly available records, including licensing records (for example, licenses for doctors, lawyers, pilots, hunters, or fishermen), voter registration databases, court records, and DMV records, as well as buying data from commercial sources including brick-and-mortar store purchase histories and transaction information from financial services companies.9 Data aggregators also buy and sell information from each other. This whole system of transactions is often referred to as “the data economy.” Through this data economy, the DMP is able to build a strikingly detailed simulacrum of an individual consumer, a simulacrum whose accuracy drives the advertisers’ return on investment, and whose inaccuracy may drive the consumer to distraction.

If a DSP determines the user profile fits its target audience, it places a bid for advertising space on the web page being rendered by the host computer. The ad exchange selects an ad from among the bidding DSPs; a Vickrey auction is generally used, where the highest offer is selected and the amount paid is that offered by the second highest bidder. The winning DSP provides a URL for retrieving the ad. In an “impression”-based system, an agency ad server determines whether the ad is actually downloaded, and pays the publisher accordingly, with the cost per thousand impressions (CPM) being the most widely used statistic in Internet advertising.35 All of this happens within tens of milliseconds, though the actual loading of the winning ad into the user’s browser may take far longer depending on the bandwidth of the user connection and the size of the ad.

As a result of this process, the publisher of the content often has limited control over the safety, quality, or tastefulness of the ad seen by the content consumer. A publisher may, for example, be able to prevent advertisements from a particular advertiser or class of advertisers, but she may not be able to exercise finer control.

Ad blockers can use several methods to disrupt the process described earlier, thus preventing ads from being displayed. Many prominent ad blockers, such as Adblock Plus and its variants, block ads by preventing the browser from sending HTTP requests to certain URLs. The URL blacklist for a given blocker is often a crowdsourced effort, such as EasyList, the default blacklist used in Adblock Plus. EasyList is probably the most widely used blacklist; the number of EasyList downloads was used by PageFair and Adobe to estimate the prevalence of ad blocking in their 2015 joint report.30

While URL blacklisting appears to be the most common method of ad blocking, the Electronic Frontier Foundation’s Privacy Badger takes a different approach,36 attempting to learn which domains and sites are tracking a user and blocking the ones that do. It detects behavior such as the use of uniquely identifying cookies, canvas fingerprinting, and the appearance of the same third-party site at multiple domains. As such, it blocks very few domains at first, but the more it is used, the more it learns to block. It should be noted that Privacy Badger aims to prevent tracking, not ads; but since the two are intimately connected, it often serves both purposes.

A third ad blocking method blocks website elements fitting certain patterns; for example, it could look for the “iframe” HTML tag and check to see if it contains text strings like “Sponsored” or links to a URL with the word “ad” in it. This method can block advertising served by the Web site itself as opposed to just third parties, advertising that includes ads embedded in search results and social media feeds.

This content filtering can happen at the client or at an intermediate proxy. Some ad blockers use a root certificate to redirect browser requests to a VPN or proxy that removes ad content using the methods previously mentioned before forwarding the HTML code to the browser. This approach can block ads for mobile apps as well as browsers, but it comes with the risks associated with having third parties interfere with browser traffic, risks that include the classic man-in-the-middle attacks. Apple recently removed several ad-blocking apps from its app store on this basis.37
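The second-price rule of the Vickrey auction mentioned above, where the highest offer wins but pays the second-highest offer, can be sketched in a few lines. The bidder names, the bid amounts, and the convention that bids are expressed as CPM values are assumptions for illustration; real exchanges add details such as reserve prices that the article does not describe.

```python
def second_price_auction(bids):
    """Pick the winner of a sealed-bid second-price (Vickrey) auction.

    bids maps a bidder name to its offer. Returns (winner, price), where
    price is the second-highest offer. With a single bidder the winner
    pays its own bid here, which is a simplifying assumption.
    """
    if not bids:
        raise ValueError("at least one bid is required")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_offer = ranked[0]
    price = ranked[1][1] if len(ranked) > 1 else top_offer
    return winner, price

# Hypothetical DSP bids, assumed to be dollars per thousand impressions (CPM).
winner, cpm = second_price_auction({"dsp_a": 2.50, "dsp_b": 3.10, "dsp_c": 1.75})
cost_of_one_impression = cpm / 1000
```

Dividing the clearing price by 1,000 gives the cost of a single impression, which is how a CPM figure translates into what is paid for one ad view.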
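URL blacklisting, the first blocking method described above, amounts to checking each outgoing request against a list before the browser sends it. The sketch below matches on domain names only; the domains are placeholders, and real lists such as EasyList use a richer rule syntax that is not reproduced here.

```python
from urllib.parse import urlsplit

# Hypothetical blocked domains, chosen for illustration only.
BLOCKED_DOMAINS = {"ads.example.com", "tracker.example.net"}

def is_blocked(url, blocked_domains=BLOCKED_DOMAINS):
    """True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)

def filter_requests(urls):
    """Return only the requests the browser would still be allowed to send."""
    return [u for u in urls if not is_blocked(u)]

allowed = filter_requests([
    "https://example.com/article",
    "https://ads.example.com/banner.js",
    "https://cdn.tracker.example.net/pixel.gif",
])
```

Matching on the host rather than the full URL string avoids blocking a page merely because a blocked domain name appears somewhere in its path.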
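The pattern-based method described above, which looks for “iframe” elements containing “Sponsored” or pointing at a URL with “ad” in it, can be sketched with Python's standard HTML parser. The class name and sample markup are hypothetical, and the substring test is deliberately as naive as the description: it would also flag a URL containing a word like “download”.

```python
from html.parser import HTMLParser

class IframeAdDetector(HTMLParser):
    """Flags iframes whose URL contains "ad" or whose text contains "Sponsored"."""

    def __init__(self):
        super().__init__()
        self.flagged = []          # src values of iframes judged to be ads
        self._open_iframe = None   # src of the iframe currently being parsed
        self._is_ad = False

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self._open_iframe = dict(attrs).get("src") or ""
            self._is_ad = "ad" in self._open_iframe.lower()

    def handle_data(self, data):
        if self._open_iframe is not None and "sponsored" in data.lower():
            self._is_ad = True

    def handle_endtag(self, tag):
        if tag == "iframe" and self._open_iframe is not None:
            if self._is_ad:
                self.flagged.append(self._open_iframe)
            self._open_iframe = None

# Hypothetical page fragment, invented for illustration.
page = (
    '<p>Story</p>'
    '<iframe src="https://example.com/ad/banner"></iframe>'
    '<iframe src="https://example.com/video">Sponsored</iframe>'
    '<iframe src="https://example.com/map"></iframe>'
)
detector = IframeAdDetector()
detector.feed(page)
```

A real blocker would remove or hide the flagged elements; this sketch only collects their URLs.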
Publishers will sometimes try to circumvent attempts at ad blocking. Anti-ad blocking usually works by serving a fake ad in some way and verifying that it has been loaded or displayed. If it fails to load, the site stops displaying the primary content or refuses to load it in the first place. For example, a site can con-

the ad, only serving malvertising every 10th or 20th time, and not serving malvertising to certain IP addresses.5 Even large and reputable websites have been known to accidentally serve malvertising, making malvertising a potential problem for every Internet user.
implied-in-fact contract. In 1917, the More generally, a third or more (39% ˲˲ Advertisers will seek other venues
U.S. leased a pier from the Baltimore and Ohio railroad for the purpose of handling supplies destined for the war in Europe. An earlier fire was believed to have been an act of sabotage, so soldiers were deployed to guard the pier and surrounding equipment. The weather was cold, and the troop commander often complained about the tents in which his men were forced to live. A railroad official offered to build temporary barracks. Though there was never any discussion of compensation, the barracks were built. The railroad later sued to recover the cost of the construction, arguing there had been an implied-in-fact contract. In what became the 1923 case of Baltimore & Ohio R. Co. v. United States,2 the Supreme Court disagreed. The Court stated that an “implied agreement” required “a meeting of minds inferred, as a fact, from conduct of the parties in the light of surrounding circumstances.” The Court found there had been no such meeting of minds, as the railroad company never intimated that it would expect payment from the government.
It follows that there are several reasons the alleged quid pro quo of viewing ads in return for free Internet content fails to rise to the level of an implied-in-fact contract. First, as with the Baltimore & Ohio case, there was no unambiguous offer. The Internet content consumer is rarely told precisely what is going to be loaded into his or her Web browser, and what is expected in return. Content consumers suffer the embedding of ads and, on occasion, trackers and other forms of spyware into their Web browsers without receiving any notice from the content provider whatsoever. In fact, as we have seen, the content provider may not know what is being injected into the consumer’s browser.
Second, the alleged agreement fails to satisfy the unambiguous acceptance element. Unlike the lawn-mowing example, there is no prior conduct that indicates a general understanding that an agreement is in place. The popularity of ad blockers20,30,31 indicates that most consumers do not want to see the ads, and clearly have not agreed to do so. A Reuters survey provides further evidence, indicating that even those who do not employ ad blockers are ignoring or avoiding the ads:
in the U.K. and 30% in the U.S.) say they ignore ads. Around three in 10 (31%/29%) say they actively avoid sites where ads interfere with the content.20

Are Ad Blockers Unethical?
In After Virtue, Alasdair MacIntyre describes the breakdown in ethical argument that occurs when the foundations for ethical systems are cut away, leaving proponents of differing perspectives to argue past each other without any basis for decisive engagement.14 Ad blocking provides a canonical example, as we have one group arguing for individual rights (the right to receive payment for one’s effort in providing content), while the other group argues for the general welfare (an Internet devoid of continual distraction caused by tasteless advertising). It is not clear how the two arguments can be reconciled, or how one can clearly overcome the other. We suggest a solution lies in a technologically mediated meeting of minds, but before we consider the solution, we offer a more detailed account of the ethical arguments.
The utilitarian approach, first propounded by Jeremy Bentham and John Stuart Mill in the late 18th and early 19th centuries, is based on the familiar precept that “it is the greatest happiness of the greatest number that is the measure of right and wrong.”3 In what follows, we will consider act utilitarianism, which focuses on the consequences of individual actions. We will also substitute “well-being” for “happiness” to counter some of the more obvious criticisms of utilitarianism.
Does the use of ad blockers create the greatest well-being for the greatest number? Those affected by the decision to block ads include the following:
• Ad blocking users,
• Ad viewing users,
• Content generators,
• Content publishers, and
• Advertisers
Should users choose to employ ad blockers, the following will arguably result:
• The ad blocking users will see fewer advertisements.
• The content generators will receive less revenue per reading user.
• The content publishers will receive less revenue per reading user.
for their advertising dollars.
• Some content generators will stop generating content.
• Some content publishers will stop publishing content.
• Some content publishers will publish content of lower quality.
• There will be less free content available to all users on the Internet, and the content that remains freely available will, in some cases, be of reduced quality.
It is important to provide some context for the suggestion that the quality of online content will be diminished by a general acceptance of ad blocking. Newspaper journalism was in decline well before the advent of ad blocking, or even the advent of the Internet, primarily because of the failure of its core business model.15 The business model was that of a quasi-monopoly: competition was limited, so that a local paper could charge higher prices for advertising, and then use the revenue to maintain reporters across the world. In essence, the local Wal-Mart paid for the Baghdad bureau through its advertising dollars. The limit on competition was due to a fact of technology: printing presses were very expensive to operate and maintain, so all but the largest municipalities could only sustain one or two (print) newspapers at any given point in time.22 In a pre-Internet world, the papers acted as an intermediary between advertisers and consumers, charging both for the opportunity to communicate. In a multi-newspaper market, the equilibrium was often unstable; a notable scoop could send more advertising dollars to the scooping paper, allowing the scooper to grow (literally) fatter and more attractive to the buyers.
The unraveling of this relationship began with the television era and the movement of affluent readers from the inner city to the suburbs. National and retail advertisers moved their dollars to television, and newspapers came to depend more on classified ads.6 With the advent of the Internet in general and Craig’s List in particular (founded in 1995), classified advertising revenue also began to leave the newspapers’ balance sheets. By 2010 the newspaper industry was in deep decline, with many major players facing bankruptcy (for example, the Tribune Company in
O C TO B E R 2 0 1 7 | VO L. 6 0 | N O. 1 0 | C OM M U N IC AT ION S OF T HE ACM 75
review articles
2008), and others left to cope with dramatically reduced staffs.
The consequences of a general use of ad blockers may thus be characterized as a further reduction in the quality of free online content through the departure of some Internet content generators and publishers to other ways of making a living. For large numbers of consumers these are apparently acceptable outcomes given what they avoid: the problems associated with spyware and the relentless distraction of advertising. There is also evidence that Internet readers do not greatly value what they are reading; given the choice between paying for the content and losing it, most prefer the latter. The aforementioned Reuters survey found that only 10% of online users appeared to be willing to pay for once-free news content.
After a sharp upturn in 2012–2013—when a large number of paywalls were introduced—our data shows very little change in the absolute number of people paying for digital news over the past year. In most countries the number paying for any news is hovering around 10% of online users and in some cases less than that.20
If Internet readers and users of ad blockers are rational actors who are making decisions based on their individual well-being, and as the readers outnumber the writers and advertisers, one may conclude that the use of ad blockers provides the greatest well being for the greatest number. From a utilitarian perspective, ad blocking is ethical; the content providers should look for a better business model.
The counterargument is ready at hand: this analysis clearly does not take into account all stakeholders; the content generators and publishers, for example, would almost certainly not be pleased with the consequences of this utilitarian calculus. This is an example of a key criticism of utilitarianism; namely, that in emphasizing aggregate well being, some individuals may be left in far worse condition than before.
A deontic analysis avoids this particular problem. Immanuel Kant suggested in his Groundwork of the Metaphysic of Morals that there is a single primary moral obligation, which he referred to as the “categorical imperative” (CI). Kant offered several formulations of CI, including one that sounds very much like the golden rule: “act so that you use humanity, as much in your own person as in the person of every other, always at the same time as end and never merely as means.”13 Ad blocking readers arguably do not satisfy this formulation—they treat the content generators as means rather than an end in themselves, taking their work product without respecting their efforts to make a living. It appears that Kant is on the side of the advertisers, while Bentham favors the general reader.
Contractualism, an ethical theory related to Kant’s deontological approach,18 more clearly takes into account all interested parties, while pointing to a potential solution. In What We Owe Each Other, T.M. Scanlon21 offers the following ethical rule for action (emphasis added):
An act is wrong if its performance under the circumstances would be disallowed by any set of principles for the general regulation of behavior that no one could reasonably reject as a basis for informed, unforced, general agreement.
In establishing rules for behavior, Scanlon suggests that we must consider the perspectives of all stakeholders, and define a basis for informed general agreement. This would require communication between all stakeholders, something that is sorely lacking in the context of online advertising. We will return to this point when we consider possible solutions.
The third and final approach to be considered shifts the balance of the argument in favor of the general reader, but on a far firmer basis than the arguments of Bentham et al. Aretaic, or virtue ethics, emphasizes virtues of mind and character.8,11 Virtue ethics originated with Aristotle’s Nicomachean Ethics and his notion that the ultimate aim (telos) of an individual is to live a virtuous life. A virtuous life is a life lived according to reason, where decisions are based on a set of values held dear by the individual. Virtue ethics thus involve the questions of “what is desirable, good or morally worthwhile in life?” “What values should we pursue for ourselves and others?”8
Virtue ethics has enjoyed a recent resurgence, both in philosophy departments and in schools of technology. With regard to the latter, value-based design practices have been developed based on various lists of fundamental human values. For example, in Ethical IT Innovation, Sarah Spiekermann points to both Aristotle and Maslow while concluding that technical design must be based on an understanding that knowledge, freedom, and autonomy are preconditions for human growth, self-esteem, friendship and self-actualization.24
At best, the design of advertising technology shows little concern for knowledge, freedom, and autonomy of consumers. At worst, advertising technology actively works to subvert these values. This subversion can be seen through the lens of the “attention economy,” a term coined by Herbert Simon to capture the finite nature of the individuals’ attention in the face of a seemingly infinite amount of information.23 The attention economy is reflected in advertisers’ insertion of themselves into virtually all personal interactions in everyday life, ranging from highway billboards to doctors’ offices to the bottoms of the trays at airport security. Writing for the “Practical Ethics” blog of Oxford University, James Williams argues the resulting distractions are more than an annoyance, they “keep us from living the lives we want to live:”
In the short term, distractions can keep us from doing the things we want to do. In the longer term, however, they can accumulate and keep us from living the lives we want to live, or, even worse, undermine our capacities for reflection and self-regulation, making it harder, in the words of Harry Frankfurt, to “want what we want to want.” Thus there are deep ethical implications lurking here for freedom, well-being, and even the integrity of the self.19
From a virtue ethics standpoint, it follows that the design of Internet advertising technology is itself unethical in that it works against the human project of self-creation. Ad blockers are thus not only ethical, but are literally a matter of self-defense. Quoting the Practical Ethics blog once again:
In reality, ad blockers are one of the few tools that we as users have if we want to push back against the perverse design logic that has cannibalized the soul of the Web.6

Solutions and Conclusion
The advertising delivery systems described in this article are the antithesis of value-based design. The values that Spiekermann and others point to as a foundation for virtue-based
design—knowledge, freedom, and autonomy—are precisely the values that online advertising systems most systematically undermine.
Internet advertisers exchange information about users without their knowledge or control, using that information to manipulate users into behavior they might not otherwise have exhibited. This summary may seem harsh, and some may argue that advertisers would happily engage in more ethical behavior if better channels of communication were provided to interested consumers. A solution beneficial to all may lie in a virtue-based redesign. Such a redesign would embed T.M. Scanlon’s suggestion that there be an “informed, unforced, general agreement” among all parties. The agreement would be based on a system that provides revenue for content generators and connects advertisers to interested consumers while reducing the deleterious impact of the current system of advertising on the reading public. The key step lies in empowering the reading/consuming public—letting them choose whether they will download ads, and if so, what type of ads. Should a reader choose not to download ads, he or she should be given the opportunity to pay for ad-free content.
The supporting technology for such a system already exists in the current ad networks. Recall the current scheme of directed Internet advertising relies on the use of cookies stored on user machines. These cookies are sent to service-side and demand-side platforms to obtain directed advertising for insertion into content initially requested by a user. Suppose the cookies are replaced by information explicitly provided by the user that indicates buying habits and interest in specific consumer goods. The data management platform (DMP) would request, coordinate, and update this user-supplied information as necessary. Rather than inferring potential sales from browsing habits, advertising networks could make advertising bidding decisions based on the clearly expressed desires of potential consumers. Such a system would increase the agency of the browsing user, while potentially increasing return on investment for advertisers.
Such a solution will require careful design and far more communication between stakeholders than currently takes place, but it offers the potential for clearly informing readers of their options, options upon which they can exercise rational choice in pursuit of their own individual goals. We hope that advertisers see this as an opportunity.
We have argued that ad blocking is not a violation of an existing contract (at least in U.S. law). This does not mean that ad blocking is beyond the reach of earnest lobbyists and subsequent legislation. One might expect, however, that such legislation would not be very popular with the general public. We hope the agreement suggested here takes form before the battle between advertisers and ad blockers escalates any further.

Acknowledgments
The authors greatly acknowledge the assistance of Sarah Wicker and Adam Engst with several technical, legal, and ethical issues. We also thank the editor and reviewers for their time, effort, and expertise.

References
1. Alexander, J., Crompton, T. and Shrubsole, G. Think Of Me As Evil? Opening The Ethical Debates In Advertising. Public Interest Research Centre (PIRC) and WWF-UK, Oct. 2011.
2. Baltimore & Ohio R. Co. v. United States 261 U.S. 592 (1923)
3. Bentham, J. A fragment on government. (1776). Reprinted in The Collected Works of Jeremy Bentham. J.H. Burns and H.L.A. Hart, eds. The Athlone Press, London, U.K. 1977, 393.
4. Cisco. Cisco 2016 Midyear Cybersecurity Report; http://www.cisco.com/c/m/en_us/offers/sc04/2016-midyearcybersecurity-report/index.html.
5. Cyphort Labs. The Rise of Malvertising, 2015; http://go.cyphort.com/Malvertising-Report-15-Page.html.
6. Downie, Jr., L. and Schudson, M. The reconstruction of American journalism. Will the Last Reporter Please Turn out the Lights: The Collapse of Journalism and What Can Be Done To Fix It. R.W. McChesney and V. Pickard, eds. The New Press, NY, 2011.
7. EFFector 15, 15 (May 24, 2002); https://www.eff.org/effector/15/15
8. Frankena, W. Ethics (2nd Edition). Prentice Hall, Upper Saddle River, NJ, 1973.
9. FTC. Data Brokers: A Call for Transparency and Accountability; https://www.ftc.gov/system/files/documents/reports/data-brokerscall-transparency-accountability-report-federal-trade-commissionmay-2014/140527databrokerreport.pdf
10. Hof, R. Mobile ads will smash $100 billion mark worldwide in 2016. Forbes (Apr. 2, 2015); http://www.forbes.com/sites/roberthof/2015/04/02/mobile-adswill-smash-100-billion-mark-worldwide-in-2016
11. Hursthouse, R. On Virtue Ethics. Oxford University Press, 2002.
12. Jackson, J. Three network to run 24-hour ad blocking trial. The Guardian (May 25, 2016); https://www.theguardian.com/media/2016/may/26/three-networkto-run-24-hour-adblocking-trial
13. Kant, I. Groundwork for the Metaphysics of Morals. A. Wood, ed. Yale University Press, 2002, 37.
14. MacIntyre, A. After Virtue: A Study in Moral Theory (3rd Edition). University of Notre Dame Press, South Bend, IN, 2007.
15. McChesney, R.W. and Pickard, V. Eds. Will the Last Reporter Please Turn out the Lights: The Collapse of Journalism and What Can Be Done To Fix It. The New Press, NY, 2011.
16. Nithyanand, R. et al. Adblocking and Counter-Blocking: A Slice of the Arms Race (2016); arXiv:1605.05077.
17. O’Reilly, L. This ad blocking company has the potential to tear a hole right through the mobile Web—and it has the support of carriers. BusinessInsider.com, May 13, 2015.
18. Parfit, D. On What Matter 1 (2011), 412–413. Oxford University Press.
19. Practical Ethics Blog. University of Oxford; http://blog.practicalethics.ox.ac.uk/2015/10/why-its-ok-to-blockad
20. Reuters Institute Digital News Report 2015. Reuters Institute, University of Oxford, https://reutersinstitute.politics.ox.ac.uk/sites/default/files/Reuters%20Institute%20Digital%20News%20Report%202015_Full%20Report.pdf
21. Scanlon, T.M. What We Owe Each Other. Belknap Press, Cambridge, MA, 2000.
22. Shirky, C. Newspapers and thinking the unthinkable. Will the Last Reporter Please Turn out the Lights: The Collapse of Journalism and What Can Be Done To Fix It. R.W. McChesney and V. Pickard, Eds. The New Press, NY, 2011.
23. Simon, H.A. Designing organizations for an information-rich world. Computers, Communications, and the Public Interest. Johns Hopkins University Press, Baltimore, MD, 1971.
24. Spiekermann, S. Ethical IT Innovation: A Value-Based System Design Approach. CRC Press, Boca Raton, FL, 2015.
25. http://www.tomsguide.com/us/ad-blocking-is-stealing,news-20962.html
26. http://blog.practicalethics.ox.ac.uk/2015/10/why-its-ok-toblock-ads/
27. https://developer.apple.com/library/ios/releasenotes/General/WhatsNewIniOS/Articles/iOS9.html
28. https://adblockplus.org/blog/adblock-plus-for-ios-9-finallyhere-and-pssst-it-s-free
29. http://time.com/4052033/apple-iphone-ios-9-ad-blockers/
30. https://blog.pagefair.com/2015/ad-blocking-report/
31. https://pagefair.com/downloads/2016/05/Adblocking-Goes-Mobile.pdf
32. http://fortune.com/2015/09/18/dear-apple-i-may-rob-yourstore/ http://www.tomsguide.com/us/ad-blocking-isstealing,news-20962.html
33. http://adage.com/article/digitalnext/ad-blocking-unnecessaryinternet-apocalypse/300470/
34. http://www.iab.com/news/rothenberg-says-ad-blocking-is-awar-against-diversity-and-freedom-of-expression/
35. http://www.allbusiness.com/web-advertising-and-cpm-aquick-guide-for-small-businesses-2646-1.html
36. https://www.eff.org/privacybadger
37. http://www.theguardian.com/technology/2015/oct/09/appleremoves-iphone-adblockers-facebook-third-party-apps
38. http://www.wsj.com/articles/facebook-will-force-advertisingon-ad-blocking-users-1470751204
39. http://www.cnet.com/news/facebook-adblock-plusworkaround-august-11
40. http://www.extremetech.com/computing/209888-mozillafirefox-kills-flash-by-default-security-chief-calls-for-adobe-toissue-an-end-of-life-date, http://www.bbc.com/news/technology-36301904
41. http://www.computerworld.com/article/2487367/ecommerce/ad-blockers--a-solution-or-a-problem-.html
42. https://www.law.cornell.edu/wex/contract_implied_in_fact

Stephen B. Wicker ([email protected]) is a professor in the School of Electrical and Computer Engineering at Cornell University, Ithaca, NY.
Kolbeinn Karlsson ([email protected]) is a graduate student in the School of Electrical and Computer Engineering at Cornell University, Ithaca, NY.

© 2017 ACM 0001-0782/17/10 $15.00

Watch the authors discuss their work in this exclusive Communications video. https://cacm.acm.org/videos/internet-advertising
This book, a revised version of the 2014 ACM Dissertation Award winning dissertation, proposes an architecture for cluster computing systems that can tackle emerging data processing workloads at scale. Whereas early cluster computing systems, like MapReduce, handled batch processing, our architecture also enables streaming and interactive queries, while keeping MapReduce’s scalability and fault tolerance. And whereas most deployed systems only support simple one-pass computations (e.g., SQL queries), ours also extends to the multi-pass algorithms required for complex analytics like machine learning. Finally, unlike the specialized systems proposed for some of these workloads, our architecture allows these computations to be combined, enabling rich new applications that intermix, for example, streaming and batch processing.
We achieve these results through a simple extension to MapReduce that adds primitives for data sharing, called Resilient Distributed Datasets (RDDs). We show that this is enough to capture a wide range of workloads. We implement RDDs in the open source Spark system, which we evaluate using synthetic and real workloads. Spark matches or exceeds the performance of specialized systems in many domains, while offering stronger fault tolerance properties and allowing these workloads to be combined. Finally, we examine the generality of RDDs from both a theoretical modeling perspective and a systems perspective.
This version of the dissertation makes corrections throughout the text and adds a new section on the evolution of Apache Spark in industry since 2014. In addition, editing, formatting, and links for the references have been added.
research highlights
P. 80 Technical Perspective: Broadening and Deepening Query Optimization Yet Still Making Progress
By Jeffrey F. Naughton
P. 81 Multi-Objective Parametric Query Optimization
By Immanuel Trummer and Christoph Koch
P. 90 Technical Perspective: Shedding New Light on an Old Language Debate
By Jeffrey S. Foster
P. 91 A Large-Scale Study of Programming Languages and Code Quality in GitHub
By Baishakhi Ray, Daryl Posnett, Premkumar Devanbu, and Vladimir Filkov
ACM Conference Proceedings Now Available via Print-on-Demand!
Did you know that you can now order many popular ACM conference proceedings via print-on-demand? Institutions, libraries and individuals can choose from more than 100 titles on a continually updated list through Amazon, Barnes & Noble, Baker & Taylor, Ingram and NACSCORP: CHI, KDD, Multimedia, SIGIR, SIGCOMM, SIGCSE, SIGMOD/PODS, and many more. For available titles and ordering info, visit: librarians.acm.org/pod

Technical Perspective
Broadening and Deepening Query Optimization Yet Still Making Progress
By Jeffrey F. Naughton
DOI:10.1145/3068610
To view the accompanying paper, visit doi.acm.org/10.1145/3068612

QUERY OPTIMIZATION IS a fundamental problem in data management. Simply put, most database query languages are declarative rather than imperative—that is, they specify properties the answer should satisfy, rather than give an algorithm to compute the answer. The best known and most widely used database query language—SQL—is a prime example of a language for which optimization is essential.
By “essential,” I mean that database optimization is not a matter of shaving 10% or even a factor of 2x from a query’s execution time. In database query evaluation, the difference between a good plan and a bad or even average plan can be multiple orders of magnitude—so successful query optimization makes the difference between a plan that runs quickly and one that never finishes at all. Accordingly, since the seminal papers in the 1970s, query optimization has received and continues to receive a great deal of attention from both the industrial and research database communities.
Early work on optimization focused on a scenario in which the query was fully specified, and the optimization goal was query evaluation time. That is, the problem was: What is the fastest way to evaluate this query? While this problem was (and is!) challenging, it is not broad enough to capture the optimization problem faced by modern systems. As an important example, many times the query is not fully specified in advance (for example, it may contain variables, or “parameters” that are only discovered at runtime). This generalization gives rise to parametric query optimization, where the problem is as follows: Given a partially specified query, find a set of good evaluation plans, one of which will be chosen at runtime when the parameter is instantiated.
Yet another necessary generalization involves the optimization goal. Sometimes execution time is not the only criterion by which plans should be selected. As a prominent and current example, if the query is being run in the cloud, the system may obviously want to find fast evaluation plans, but may also desire inexpensive ones. That is, now we have two objectives: running time and cost. This gives rise to multi-objective query optimization, where the problem is: Given a query and a set of objectives, find a set of plans that are Pareto-optimal for these objectives (a plan is “Pareto-optimal” if it is not dominated in all objectives by other plans.)
Both parametric and multi-objective query optimization have been studied in the past, but the following paper by Trummer and Koch is a remarkable tour de force exploration of the combination of the two. Here, the problem is the following: Given a partially specified query, and multiple objectives for the resulting plan, find a set of Pareto-optimal plans that can be chosen at runtime by filling in all parameters.
Since the original query optimization problem and its variants are already very difficult, one might despair that simultaneously treating two substantial extensions would yield a hopelessly intractable problem. This paper is surprising in its elegance and effectiveness. It embeds the problem in an insightful and expressive formal framework, and specifies a solution that combines aspects of piecewise linear functions, dynamic programming with pruning based upon Pareto polytope analyses, and linear programming. A thorough set of experiments with an implementation of their algorithm completes the paper, indicating all of this actually works.

Jeffrey F. Naughton ([email protected]) is a principal scientist at Google and previously a professor of computer science at the University of Wisconsin, Madison.

Copyright held by author.
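The notion of a Pareto-optimal plan set in the Technical Perspective above can be illustrated with a small sketch. It assumes each plan is summarized by a fixed pair of costs (running time, monetary cost) and uses the common convention that one plan dominates another when it is no worse in every objective and strictly better in at least one; the plan names and numbers are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """Return True if cost vector a dominates cost vector b.

    Assumed convention: a is no worse than b in every objective
    and strictly better in at least one (lower is better).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_optimal(plans):
    """Keep only the plans that no other plan dominates."""
    return {
        name: cost
        for name, cost in plans.items()
        if not any(
            dominates(other_cost, cost)
            for other_name, other_cost in plans.items()
            if other_name != name
        )
    }


# Hypothetical plans: (running time in seconds, monetary cost in dollars).
plans = {
    "hash_join": (10.0, 5.0),    # fast but expensive
    "index_join": (25.0, 1.0),   # slow but cheap
    "nested_loop": (30.0, 6.0),  # worse than both in every objective
}

print(sorted(pareto_optimal(plans)))
```

Neither of the first two plans dominates the other, so both stay in the result; a system would pick between them at runtime according to the user's preference for speed or price.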
Multi-Objective Parametric
Query Optimization
By Immanuel Trummer and Christoph Koch
the desired algorithm output. In summary, our algorithm can be written as follows:

• Iterate over all subqueries s of the input query in ascending order of query size:
  – If subquery s is an atomic subquery then consider all possible plans for s.
  – Otherwise, if s is not an atomic subquery, then iterate over all possibilities to decompose s into two subqueries s1 and s2:
    For each split into two subqueries s1 and s2, consider all plans that are combinations of a relevant plan for s1 and a relevant plan for s2.
  – Prune all considered plans to obtain the set of relevant plans for s.

As many query optimization algorithms,8, 14, 16 our algorithm is based on dynamic programming. We can use dynamic programming since the principle of optimality holds for query optimization.14 Formulated in general terms, the principle of optimality designates the problem property that optimal solutions can be obtained by combining optimal solutions to subproblems. In the context of query optimization, the principle of optimality means more specifically that optimal query plans can be obtained by combining optimal plans for subqueries. The principle of optimality has been shown to hold for all common execution cost metrics in multi-objective query optimization.16 This means that a Pareto-optimal query plan can be combined from Pareto-optimal plans for subqueries. A relevant plan is Pareto-optimal for some points in the parameter space. It is therefore intuitive that a relevant query plan can be combined from relevant plans for the subqueries (we omit the formal proof). In other words, the principle of optimality holds for MPQ as well. It is the fundament of our MPQ algorithm.

3.2. Pruning
Many query optimization algorithms for classical query optimization,14 multi-objective query optimization,16 or parametric query optimization8 are based on dynamic programming. The primary difference between all those algorithms is the realization of the pruning function. As we treat a novel problem variant, we must design a novel pruning function. In the following, we describe how our algorithm prunes query plans, that is, how it compares plans for the same query and identifies irrelevant plans.
Our pruning function is based on the key concept of the

plans p1 and p2 and we find that plan p1 has better cost than or equivalent cost to p2 according to all cost metrics for the parameter space region X then we reduce the Pareto region of p2 by subtracting X. Pareto regions can only shrink during a pruning operation. Once the region of one plan becomes empty, it is irrelevant and can be safely discarded. We discard plans as soon as possible in order to avoid unnecessary comparisons.
More precisely, the pruning function iterates over all plan pairs and executes for each pair the following steps. First, it identifies the region in which one plan dominates the other plan. Second, it updates the Pareto region of the dominated plan by subtracting the region in which it is dominated. Third, it checks whether the Pareto region of the dominated plan becomes empty after the update. In that case, the plan is discarded and does not participate in further comparisons. Figure 2 illustrates how the Pareto region of a plan is reduced after comparing it to another plan. The example refers to a scenario where two parameters and two cost metrics are considered (execution time and fees).
Note that two plans can mutually dominate each other in different parameter space regions. Having determined that a first plan dominates a second plan for some parameter space region, we must therefore still verify if the second plan dominates the first plan as well.

3.3. Data structures
We describe the data structures by which we represent plan cost functions and Pareto regions. Our plan cost model is based on piecewise-linear functions. A piecewise-linear function is linear in parameter space regions that form convex polytopes. A linear function can be represented by a constant and by weights capturing the slope of the function for each parameter. Hence a piecewise-linear function can be represented by a set of convex polytopes where each convex polytope is associated with a constant and weights. We consider multiple plan cost metrics. Each query plan is therefore associated with one piecewise-linear cost function per plan cost metric.
We consider the class of piecewise-linear cost functions to represent plan cost. We decided to use that class of functions since it allows to approximate arbitrary functions

Figure 2. We subtract the area in which a plan is dominated from its Pareto region.
Pareto region. Each query plan is associated with a Pareto
region. This is a parameter space region in which it real- Before Comparison Comparison After Comparison
izes Pareto-optimal cost tradeoffs. A plan is irrelevant if its
Parameter 2
Parameter 2
Parameter 2
Parameter 2
Parameter 2
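The three pruning steps of Section 3.2 can be illustrated with a small sketch. It is a simplification under stated assumptions, not the authors' implementation: the parameter space is a single parameter sampled on a finite grid, so a Pareto region is just a set of sample points, whereas the paper represents regions as convex polytopes and compares plans with linear programs. The names `prune`, `dominates`, `example_plans`, and `example_grid` are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Cost = Tuple[float, ...]           # one value per cost metric, e.g. (time, fees)
CostFn = Callable[[float], Cost]   # maps a parameter value to a cost vector


def dominates(a: Cost, b: Cost) -> bool:
    """True if a is no worse than b on every metric and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def prune(plans: Dict[str, CostFn], grid: List[float]) -> Dict[str, Set[float]]:
    """Return the sampled Pareto region of every relevant plan.

    Assumption: regions are sets of grid points rather than polytopes.
    """
    regions = {name: set(grid) for name in plans}
    for a, cost_a in plans.items():
        for b, cost_b in plans.items():
            # Plans whose Pareto region is already empty are discarded and
            # take no part in further comparisons.
            if a == b or not regions[a] or not regions[b]:
                continue
            # Step 1: region in which plan a dominates plan b.
            dominated = {x for x in grid if dominates(cost_a(x), cost_b(x))}
            # Step 2: subtract that region from the Pareto region of plan b.
            regions[b] -= dominated
    # Step 3: plans whose Pareto region became empty are irrelevant.
    return {name: region for name, region in regions.items() if region}


# Hypothetical plans with cost vectors (time, fees) over one parameter x.
example_plans: Dict[str, CostFn] = {
    "A": lambda x: (1.0 + x, 2.0),  # time grows with the parameter
    "B": lambda x: (2.0, 2.0),      # constant time, same fees as A
    "C": lambda x: (5.0, 5.0),      # worse than A and B everywhere
}
example_grid = [0.0, 0.5, 1.0, 1.5, 2.0]
pareto_regions = prune(example_plans, example_grid)
```

In this example plans A and B dominate each other in different parts of the grid, which is why both orderings of every pair are compared, and plan C is dropped because its region becomes empty.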
condition. The algorithm by Bemporad verifies whether the union of a given set of convex polytopes forms a convex polytope again. If this is the case then the algorithm constructs that polytope. The condition  can only be verified if  forms a convex polytope. In that case, the algorithm by Bemporad constructs the polytope and a linear solver can verify whether P− and P+ are equivalent.

4. ANALYSIS
We analyze the formal properties of the freshly introduced MPQ problem in this section. We also analyze the complexity of the algorithm described in the last section.

4.1. Problem analysis
MPQ generalizes parametric query optimization since it allows multiple plan cost metrics to be considered instead of only one. We compare the formal properties of MPQ to the properties of parametric query optimization in the following.
The parametric query optimization problem with linear cost functions has the following property: if the same query plan is optimal at all vertices of a convex polytope in the parameter space then that plan must be optimal inside the polytope as well.6 This property is commonly known as one of the “guiding principles of parametric query optimization.”5 Many algorithms for parametric query optimization exploit this property as follows:6, 9 they recursively decompose the parameter space into convex polytopes and calculate optimal query plans at the vertices. Due to the guiding principle, the decomposition of the parameter space can be stopped once the same plan is optimal at all vertices of a polytope. Such algorithms transform the parametric query optimization problem into a series of traditional query optimization problems (calculating the optimal plan at a polytope vertex is a traditional query optimization problem). This has the advantage that traditional query optimizers can be used for parametric query optimization with minimal changes. It is therefore interesting to verify whether an analogous property holds for MPQ.
Unfortunately this is not the case, as we show next. The following property for MPQ would be analogous to the guiding principle of parametric query optimization: if the same set of plans is Pareto-optimal at all vertices of a polytope in the parameter space then that set of plans must be Pareto-optimal inside the polytope as well. Figure 4 illustrates a counterexample showing that this property does not hold. The figure refers to a scenario in which two cost metrics, namely execution time and execution fees, are of interest. Cost functions depend on a single parameter, called “Parameter 1” in the figure, that could refer to unspecified predicates in the input query template. We see the cost functions of three plans. For parameter value 0, plan 1 is Pareto-optimal since it has the lowest execution fees. Plan 3 is Pareto-optimal since it has lower execution time than all other plans. Plan 2 is, however, dominated by plan 1 since plan 1 has equivalent execution time and lower execution fees. This means that plan 2 is not Pareto-optimal for parameter value 0. For parameter value 2, the situation is similar and plans 1 and 3 are Pareto-optimal while plan 2 is not. For parameter values between 0.5 and 1.5, however, plan 2 is Pareto-optimal. Even though the same set of plans is Pareto-optimal at the borders of the parameter value interval [0, 2], additional plans can be Pareto-optimal for values in the interior of that interval. All plan cost functions are linear in the example and an interval is a special case of a convex polytope. The example is minimal for MPQ: having fewer than two cost metrics would lead to parametric query optimization. Having fewer than one parameter would lead to multi-objective query optimization. Hence, we can conclude from this example that the guiding principles do not apply for MPQ in general.

4.2. Algorithm analysis
The space and time complexity of dynamic programming-based query optimization algorithms depends on the number of plans stored per subquery. In traditional query optimization, plans are compared according to one cost metric and cost functions do not depend on parameters. If we assume that alternative query plans are compared based on their cost values alone then exactly one plan, a plan with minimal cost, remains after pruning an arbitrary set of plans. In parametric query optimization, plans are compared according to one cost metric but cost functions depend on parameters. This means that different plans can be optimal for different parameter values. In multi-objective query optimization, we compare plans according to different cost metrics. Hence multiple plans can be Pareto-optimal for each subquery. As a result, we generally need to store multiple plans per subquery in parametric and in multi-objective query optimization. The number of plans to store depends on many factors. Research in parametric query optimization has focused on analyzing how the number of plans per subquery depends on the number of parameters. Research in multi-objective query optimization has focused on the dependency between the number of plans and the number of cost metrics. Such analysis is necessarily based on simplifying assumptions. Traditionally, the weights that define the cost functions of different query plans are assumed to follow indepen-

Figure 4. The guiding principle of parametric query optimization does not hold for multi-objective parametric query optimization. [Axis labels: Time, Fees]
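The counterexample of Section 4.1 can be checked numerically. The sketch below uses hypothetical linear cost functions chosen only to reproduce the qualitative situation described for Figure 4 (they are not the functions plotted in the figure): plan 1 always has the lowest fees, plan 3 always has the lowest time, and plan 2 is dominated only at the two borders of the interval [0, 2].

```python
from typing import Callable, Dict, Set, Tuple

Cost = Tuple[float, float]  # (execution time, execution fees)

# Hypothetical linear cost functions of a single parameter x in [0, 2].
PLANS: Dict[int, Callable[[float], Cost]] = {
    1: lambda x: (1.0 + x, 1.0),  # lowest fees everywhere
    2: lambda x: (1.0, 2.0),
    3: lambda x: (0.5, 4.0 - x),  # lowest time everywhere
}


def dominates(a: Cost, b: Cost) -> bool:
    """True if a is no worse than b on both metrics and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_optimal_plans(x: float) -> Set[int]:
    """Plans that no other plan dominates at parameter value x."""
    costs = {plan: cost_fn(x) for plan, cost_fn in PLANS.items()}
    return {
        p for p in costs
        if not any(dominates(costs[q], costs[p]) for q in costs if q != p)
    }


at_left_border = pareto_optimal_plans(0.0)   # vertex of the interval
in_interior = pareto_optimal_plans(1.0)      # interior point
at_right_border = pareto_optimal_plans(2.0)  # vertex of the interval
```

With these assumed functions the same set of plans is Pareto-optimal at both vertices, yet plan 2 joins the Pareto-optimal set in the interior, so checking only the vertices of a polytope is not sufficient for MPQ.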
query and plans for subqueries), and the number of solved linear programs. We generated query templates joining between 2 and 12 tables and having one or two parameters.
Optimization time increases with the number of tables. As predicted by our formal analysis in the previous section, optimization time also increases with the number of parameters. Optimization time grows faster in the number of query tables for star queries than for chain queries. The reason is that the number of admissible join orders grows faster in the number of query tables for star queries. By admissible join orders, we mean join orders that comply with the restriction mentioned before. Optimization time, the number of generated plans, and the number of solved linear programs are all correlated. This is intuitive as the number of generated plans relates to the number of plan comparisons that are required during pruning. The number of linear programs is related to the number of plan comparisons since plan comparisons are realized by solving linear programs. The time required for generating plans and for solving linear programs adds to optimization time.
The query sizes that we consider in our benchmark are typical of the query sizes that appear in standard benchmarks: the queries in the popular TPC-H benchmark[b] join at most eight tables. MPQ takes longer than traditional query optimization. In contrast to traditional query optimization, however, MPQ takes place before run time. This makes higher optimization times acceptable.

[b] http://www.tpc.org/tpch/.

Figure 5. Optimization time, number of generated plans, and number of solved linear programs. [Panels: Chain queries, Star queries; vertical axes: Time (ms), # Created plans, # Linear programs]

6. RELATED WORK
Figure 6 shows how MPQ optimization relates to prior query optimization variants. The figure shows for each variant the type of cost function c that is associated with each query plan. Arrows point from a more restricted to a more general query optimization variant. Multi-objective query optimization1, 7, 11, 16, 17 and parametric query optimization3, 4, 6, 8, 10, 13 both generalize the traditional query optimization model.14 Multi-objective parametric query optimization generalizes both of the aforementioned variants.
The algorithm that we propose in this paper makes it possible to solve query optimization problems that prior algorithms cannot solve. Algorithms for parametric query optimization are not applicable to MPQ since they cannot handle multiple cost metrics. Algorithms for multi-objective query optimization are not applicable to MPQ since they cannot handle parameters. Note that parameters and cost metrics have different semantics, so it is not possible to model parameters as cost metrics or vice versa. Intuitively, we want to “cover” the entire parameter space (by finding plans for each possible parameter value combination) while we do not want to cover the entire cost space (plans with higher cost values than necessary are not part of the result plan set).
The algorithm that we describe in this article is based on dynamic programming. It calculates optimal plans for a query by combining optimal plans for its subqueries. Many query optimization algorithms for traditional query optimization,14 multi-objective query optimization,16, 17 and parametric query optimization8 use the same dynamic programming scheme. The difference between our algorithm and all prior algorithms lies in the implementation of the pruning function. We use linear programming in the pruning function. Our algorithm shares this property with prior algorithms for parametric query optimization.8 We support, however, multiple cost metrics, and hence the definition of the pruning function, the type of the data structures used, and the implementation of elementary operations on those data structures differ.
Many algorithms for parametric query optimization are based on the guiding principles of parametric query optimization.5 They partition the parameter space in a more and more fine-grained manner until a single query plan is optimal in each partition.6, 9 The condition that allows to verify
References
1. Agarwal, S., Iyer, A., Panda, A. Blink and it’s done: Interactive queries on very large data. In VLDB 5, 12 (2012), 1902–1905.
2. Bemporad, A., Fukuda, K., Torrisi, F. Convexity recognition of the union of polyhedra. Comput. Geom. 18, 3 (2001), 141–154.
3. Bizarro, P., Bruno, N., DeWitt, D. Progressive parametric query optimization. KDE 21, 4 (2009), 582–594.
4. Darera, P., Haritsa, J. On the production of anorexic plan diagrams. PVLDB (2007), 1081–1092.
5. Dey, A., Bhaumik, S., Haritsa, J. Efficiently approximating query optimizer plan diagrams. In VLDB 1, 2 (2008), 1325–1336.
6. Ganguly, S. Design and analysis of parametric query optimization algorithms. In VLDB (1998), 228–238.
7. Ganguly, S., Hasan, W., Krishnamurthy, R. Query optimization for parallel execution. In SIGMOD (1992), 9–18.
8. Hulgeri, A., Sudarshan, S. Parametric query optimization for linear and piecewise linear cost functions. In VLDB (2002), 167–178.
9. Hulgeri, A., Sudarshan, S. AniPQO: Almost non-intrusive parametric query optimization for nonlinear cost functions. In VLDB (2003), 766–777.
10. Ioannidis, Y.E., Ng, R.T., Shim, K., Sellis, T.K. Parametric query optimization. VLDBJ 6, 2 (May 1997), 132–151.
11. Papadimitriou, C., Yannakakis, M. Multiobjective query optimization. In PODS (2001), 52–59.
12. Park, H., Widom, J. Query optimization over crowdsourced data. In VLDB (2013), 781–792.

© 2017 ACM 0001-0782/17/10 $15.00
DOI:10.1145/3126907
Technical Perspective
To view the accompanying paper, visit doi.acm.org/10.1145/3126905
AS COMPUTER SCIENTISTS, we use programming languages to turn our ideas into reality. It is no surprise, then, that programming language design has been a major concern since at least the 1950s, when John Backus introduced FORTRAN, usually considered the first high-level programming language. The revolutionary innovation of FORTRAN, the thing that made it high-level, was that it included concepts, such as loops and complex expressions, that made the programmer’s job easier. To put it another way, FORTRAN showed that a programming language could introduce new abstractions that were encoded via a compiler, rather than directly implemented in the hardware.
Not long after the introduction of FORTRAN, other programming languages appeared with somewhat different sets of abstractions: John McCarthy’s LISP, which introduced functional programming, and Grace Murray Hopper’s COBOL, which aimed to support business, rather than scientific or mathematical, applications. Thus, for at least the last 60 years, programmers have been faced with the question: What programming language should I use?
Today, answering this question has only gotten more difficult. Myriad languages have been developed in the last six decades, with at least a few dozen in common usage today. Moreover, even if in a theoretical sense any general-purpose language can implement any algorithm, in practice the different abstractions provided by different languages seem to have a strong influence on programming tasks. The Internet is filled with heated debates about the merits of functional versus imperative programming, the costs versus the benefits of object orientation, and the trade-offs between dynamic and static typing, among many others.
The following paper aims to bring empiricism to this debate by studying whether programming language choice and code quality are related. To do so, the authors perform an observational study on a corpus of 728 popular GitHub projects, totaling 63 million lines of code. The largest single project considered is Linux, comprising 15+ million lines of code. The authors apply a variety of techniques to extract data from the subject programs to try to answer the question at hand. They rely on GitHub Linguist to identify the primary language of a project. As a proxy for code quality, they count the total number of defect-fixing commits per project, determined by textually searching for one of a fixed set of keywords in the commit messages. And they use supervised machine learning to approximately classify bug reports into categories.
The authors then apply a range of statistical methods and reach four main conclusions. First, they report that projects in Clojure, Haskell, Ruby, and Scala are slightly less likely to have defect-fixing commits, and those in C, C++, Objective-C, and Python are slightly more likely. Second, they report that languages that are functional, disallow implicit type conversion, have static typing, and/or use managed memory have slightly fewer defects than languages without these characteristics. Third, they report that defect-proneness does not depend on domain, for example, user application versus library versus middleware. Last, they report that some defect types, such as memory errors, are strongly associated with certain languages.
These are interesting results. While they suggest that the language does indeed matter, almost all of the observed effects are small … except that some particular language features, such as a lack of memory safety, do have profound effects.
Like any empirical study, the results here have threats to validity: noise in the data, such as the classification of a commit as defect-fixing, is difficult to account for; defects may have been made and fixed without an intervening commit, for example, defects prevented by a static type checker are likely not included; projects vary significantly in software engineering practices, for example, Linux is an outlier, with an extremely large user base with many developers and testers; tool support for different languages varies significantly; there may be a strong relationship between programmer skill and language choice; language design can obviate classes of errors, for example, buffer overflows can occur in C and C++ but not Java; and in practice the choice of programming language is often constrained both by external factors (for example, the language of existing codebases) and the problem domain (for example, device drivers are likely to be written in C or C++).
Finally, while the use of regression analysis in such observational studies can control for confounds and strongly suggest relationships, it cannot definitively establish causation. Even so, this paper raises intriguing questions in an effort to shed light on one of the oldest debates in computer science.

Jeffrey S. Foster (jfoster at cs.umd.edu) is a professor in the computer science department at the University of Maryland, College Park.

Copyright held by author.
A Large-Scale Study of
Programming Languages and
Code Quality in GitHub
By Baishakhi Ray, Daryl Posnett, Premkumar Devanbu, and Vladimir Filkov
2.1. Data collection
We choose the top 19 programming languages from GitHub. We disregard CSS, Shell script, and Vim script as they are not considered to be general purpose languages. We further include TypeScript, a typed superset of JavaScript. Then, for each of the studied languages we retrieve the top 50 projects that are primarily written in that language. In total, we analyze 850 projects spanning 17 different languages.
Our language and project data was extracted from the GitHub Archive, a database that records all public GitHub activities. The archive logs 18 different GitHub events including new commits, fork events, pull requests, developers’ information, and issue tracking of all the open source GitHub projects on an hourly basis. The archive data is uploaded to Google BigQuery to provide an interface for interactive data analysis.
Identifying top languages. We aggregate projects based on their primary language. Then we select the languages with the most projects for further analysis, as shown in Table 1. A given project can use many languages; assigning a single language to it is difficult. GitHub Archive stores information gathered from GitHub Linguist, which measures the language distribution of a project repository using the source file extensions. The language with the maximum number of source files is assigned as the primary language of the project.
Retrieving popular projects. For each selected language, we filter the project repositories written primarily in that language by popularity, based on the associated number of stars. This number indicates how many people have actively expressed interest in the project, and is a reasonable proxy for its popularity. Thus, the top 3 projects in C are linux, git, and php-src; and for C++ they are node-webkit, phantomjs, and mongo; and for Java they are storm, elasticsearch, and ActionBarSherlock. In total, we select the top 50 projects in each language.
To ensure that these projects have a sufficient development history, we drop the projects with fewer than 28 commits (28 is the first quartile commit count of considered projects). This leaves us with 728 projects. Table 1 shows the top 3 projects in each language.
Retrieving project evolution history. For each of the 728 projects, we downloaded the non-merged commits, commit logs, author date, and author name using git. We compute code churn and the number of files modified per commit from the number of added and deleted lines per file. We retrieve the languages associated with each commit from the extensions of the modified files (a commit can have multiple language tags). For each commit, we calculate its commit age by subtracting the date of the first commit of the corresponding project from its commit date. We also calculate other project-related statistics, including maximum commit age of a project and the total number of developers, used as control variables in our regression model, and discussed in Section 3. We identify bug fix commits made to individual projects by searching for error related keywords: “error,” “bug,” “fix,” “issue,” “mistake,” “incorrect,” “fault,” “defect,” and “flaw,” in the commit log, similar to a prior study.18
Table 2 summarizes our data set. Since a project may use multiple languages, the second column of the table shows the total number of projects that use a certain language in some capacity. We further exclude some languages from a project that have fewer than 20 commits in that language, where 20 is the first quartile value of the total number of commits per project per language. For example, we find 220 projects that have more than 20 commits in C. This ensures sufficient activity for each language–project pair.
In summary, we study 728 projects developed in 17 languages with 18 years of history. This includes 29,000 different developers, 1.57 million commits, and 564,625 bug fix commits.

2.2. Categorizing languages
We define language classes based on several properties of the language thought to influence language quality,7, 8, 12 as shown in Table 3. The Programming Paradigm indicates whether the
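The keyword search for bug fix commits and the commit age described in Section 2.1 can be sketched as follows. This is a minimal illustration, not the authors' tooling: the keyword list comes from the text, but the matching rule (a keyword must start a word), the unit of commit age (days), and the names `is_bug_fix` and `commit_age_days` are assumptions.

```python
import re
from datetime import date

# Error-related keywords listed in Section 2.1.
BUG_KEYWORDS = ("error", "bug", "fix", "issue", "mistake",
                "incorrect", "fault", "defect", "flaw")

# Assumption: a keyword counts when it starts a word, so "fixed" and "bugs"
# match while "prefix" and "default" do not.
_BUG_PATTERN = re.compile(
    r"\b(?:" + "|".join(BUG_KEYWORDS) + r")\w*", re.IGNORECASE
)


def is_bug_fix(commit_message: str) -> bool:
    """Flag a commit as a bug fix if its log message contains a keyword."""
    return _BUG_PATTERN.search(commit_message) is not None


def commit_age_days(commit_date: date, first_commit_date: date) -> int:
    """Age of a commit relative to the project's first commit (assumed in days)."""
    return (commit_date - first_commit_date).days
```

A looser substring match would flag more commits but would also produce more false positives, for example on "prefix" or "default"; the source does not say which rule was used.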
domains and so we assign them to a catchall domain labeled as Other. This classification of projects into domains was subsequently checked and confirmed by another member of our research group. Table 4 summarizes the identified domains resulting from this process.

Table 4. Characteristics of domains.
Domain name | Domain characteristics | Example projects | Total projects
(APP) Application | End user programs | bitcoin, macvim | 120
(DB) Database | SQL and NoSQL | mysql, mongodb | 43
(CA) CodeAnalyzer | Compiler, parser, etc. | ruby, php-src | 88
(MW) Middleware | OS, VMs, etc. | linux, memcached | 48
(LIB) Library | APIs, libraries, etc. | androidApis, opencv | 175
(FW) Framework | SDKs, plugins | ios sdk, coffeekup | 206
(OTH) Other | – | Arduino, autoenv | 49

2.4. Categorizing bugs
While fixing software bugs, developers often leave important information in the commit logs about the nature of the bugs; for example, why the bugs arise and how to fix the bugs. We exploit such information to categorize the bugs, similar to Tan et al.13, 24
First, we categorize the bugs based on their Cause and Impact. Causes are further classified into disjoint subcategories of errors: Algorithmic, Concurrency, Memory, generic Programming, and Unknown. The bug Impact is also classified into four disjoint subcategories: Security, Performance, Failure, and Other unknown categories. Thus, each bug-fix commit also has an induced Cause and an Impact type. Table 5 shows the description of each bug category. This classification is performed in two phases:
(1) Keyword search. We randomly choose 10% of the bug-fix messages and use a keyword-based search technique to automatically categorize them as potential bug types. We use this annotation, separately, for both Cause and Impact types. We chose a restrictive set of keywords and phrases, as shown in Table 5. Such a restrictive set of keywords and phrases helps reduce false positives.
(2) Supervised classification. We use the annotated bug fix logs from the previous step as training data for supervised learning techniques to classify the remainder of the bug fix messages by treating them as test data. We first convert each bug fix message to a bag-of-words. We then remove words that appear only once among all of the bug fix messages. This reduces project-specific keywords. We also stem the bag-of-words using standard natural language processing techniques. Finally, we use a Support Vector Machine to classify the test data.
To evaluate the accuracy of the bug classifier, we manually annotated 180 randomly chosen bug fixes, equally distributed across all of the categories. We then compare the result of the automatic classifier with the manually annotated data set. The performance of this process was acceptable, with precision ranging from a low of 70% for performance bugs to a high of 100% for concurrency bugs, with an average of 84%. Recall ranged from 69% to 91% with an average of 84%.
The result of our bug classification is shown in Table 5. Most of the defect causes are related to generic programming errors. This is not surprising as this category involves a wide variety of programming errors such as type errors, typos, compilation errors, etc. Our technique could not classify 1.04% of the bug fix messages in any Cause or Impact category; we classify these as Unknown.

2.5. Statistical methods
We model the number of defective commits against other factors related to software projects using regression. All models use negative binomial regression (NBR) to model the counts of project attributes such as the number of commits. NBR is a type of generalized linear model used to model non-negative integer responses.4
In our models we control for several language per-project dependent factors that are likely to influence the outcome. Consequently, each (language, project) pair is a row in our regression and is viewed as a sample from the population of open source projects. We log-transform dependent count variables as it stabilizes the variance and usually improves the model fit.4 We verify this by comparing transformed with non-transformed data using the AIC and Vuong’s test for non-nested models.
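Part of the classification pipeline of Section 2.4 can be sketched with the standard library alone: converting messages to bags of words, dropping words that occur only once across all messages, and scoring a classifier against manual annotations with precision and recall. Stemming and the Support Vector Machine itself are omitted here, and the tokenizer, function names, and example messages are assumptions for illustration only.

```python
import re
from collections import Counter
from typing import List, Sequence, Tuple


def tokenize(message: str) -> List[str]:
    """Lower-case a message and split it into alphabetic words (assumed rule)."""
    return re.findall(r"[a-z]+", message.lower())


def bags_of_words(messages: Sequence[str]) -> List[Counter]:
    """One bag per message, without words that appear only once overall."""
    tokenized = [tokenize(m) for m in messages]
    totals = Counter(word for tokens in tokenized for word in tokens)
    return [Counter(w for w in tokens if totals[w] > 1) for tokens in tokenized]


def precision_recall(predicted: Sequence[str], actual: Sequence[str],
                     label: str) -> Tuple[float, float]:
    """Precision and recall of one category against manual annotations."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == label and a == label for p, a in pairs)
    fp = sum(p == label and a != label for p, a in pairs)
    fn = sum(p != label and a == label for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical bug fix messages.
example_messages = [
    "fix memory leak in parser",
    "fix race condition in scheduler",
    "memory leak on shutdown",
]
example_bags = bags_of_words(example_messages)
```

Words such as "parser" or "scheduler" occur once in this toy corpus and are dropped, which mirrors the stated goal of reducing project-specific keywords.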
Bug type | Bug description | Search keywords/phrases | Count | % count
Memory (Mem) | Incorrect memory handling | Memory leak, null pointer, buffer overflow, heap overflow, null pointer, dangling pointer, double free, segmentation fault | 30,437 | 5.44
Programming (Prog) | Generic programming errors | Exception handling, error handling, type error, typo, compilation error, copy-paste error, refactoring, missing switch case, faulty initialization, default value | 495,013 | 88.53
Impact
Security (Sec) | Runs, but can be exploited | Buffer overflow, security, password, oauth, ssl | 11,235 | 2.01
Performance (Perf) | Runs, but with delayed response | Optimization problem, performance | 8651 | 1.55
Failure (Fail) | Crash or hang | Reboot, crash, hang, restart | 21,079 | 3.77
Unknown (Unkn) | Not part of the above categories | | 5792 | 1.04
Defective commits model | Coef. (Std. Err.)
(Intercept) | −2.04 (0.11)***
Log age | 0.06 (0.02)***
Log size | 0.04 (0.01)***
Log devs | 0.06 (0.01)***
Log commits | 0.96 (0.01)***
C | 0.11 (0.04)**
C++ | 0.18 (0.04)***
C# | −0.02 (0.05)
Objective-C | 0.15 (0.05)**
Go | −0.11 (0.06)
Java | −0.06 (0.04)
CoffeeScript | 0.06 (0.05)
JavaScript | 0.03 (0.03)
TypeScript | 0.15 (0.10)
Ruby | −0.13 (0.05)**
Php | 0.10 (0.05)*
Python | 0.08 (0.04)*
Perl | −0.12 (0.08)
Clojure | −0.30 (0.05)***
Erlang | −0.03 (0.05)
Haskell | −0.26 (0.06)***
Scala | −0.24 (0.05)***
Response is the number of defective commits. Languages are coded with weighted effects coding. AIC=10432, Deviance=1156, Num. obs.=1076.
***p < 0.001, **p < 0.01, *p < 0.05

We can read the model coefficients as the expected change in the log of the response for a one-unit change in the predictor with all other predictors held constant; that is, for a coefficient βi, a one-unit change in the corresponding predictor multiplies the expected response by e^{βi}. For the factor variables, this expected change is compared to the average across all languages. Thus, if, for some number of commits, a particular project developed in an average language had four defective commits, then the choice to use C++ would mean that we should expect one additional defective commit since e^{0.18} × 4 = 4.79. For the same project, choosing Haskell would mean that we should expect about one fewer defective commit as e^{−0.26} × 4 = 3.08. The accuracy of this prediction depends on all other factors remaining the same, a challenging proposition for all but the most trivial of projects. All observational studies face similar limitations; we address this concern in more detail in Section 5.
Result 1: Some languages have a greater association with defects than other languages, although the effect is small.
In the remainder of this paper we expand on this basic result by considering how different categories of application, defect, and language lead to further insight into the relationship between languages and defect proneness.
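The arithmetic behind the C++ and Haskell examples can be reproduced directly from the coefficient table. The sketch below only restates the interpretation given in the text, that a language coefficient β scales the expected count for an average language by e^β with everything else held constant; the function name is hypothetical.

```python
import math


def expected_defective_commits(baseline: float, coefficient: float) -> float:
    """Scale a baseline count by exp(coefficient), all else held constant."""
    return baseline * math.exp(coefficient)


# Baseline of four defective commits for a project in an "average" language.
cpp = expected_defective_commits(4, 0.18)       # C++ coefficient from the table
haskell = expected_defective_commits(4, -0.26)  # Haskell coefficient from the table

print(round(cpp, 2), round(haskell, 2))  # 4.79 3.08
```

Because the effect is multiplicative, the same coefficient implies a larger absolute difference for projects with more defective commits at baseline.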
Software bugs usually fall under two broad categories: (1)
Domain Specific bug: specific to project function and do not
depend on the underlying programming language. (2) Generic bugs: more generic in nature, with less to do with project function, for example, type errors, concurrency errors, etc.
Consequently, it is reasonable to think that the interaction of application domain and language might impact the number of defects within a project. Since some languages are believed to excel at some tasks more so than others, for example, C for low-level work, or Java for user applications, making an inappropriate choice might lead to a greater number of defects. To study this we should ideally ignore the domain-specific bugs, as generic bugs are more likely to depend on the programming language featured. However, since a domain-specific bug may also arise due to a generic programming error, it is difficult to separate the two. A possible workaround is to study languages while controlling the domain. Statistically, however, with 17 languages across 7 domains, the large number of terms would be challenging to interpret given the sample size.

Table 7. Functional languages have a smaller relationship to defects than other language classes whereas procedural languages are greater than or similar to the average.

                                      Defective commits
(Intercept)                           −2.13 (0.10)***
Log commits                            0.96 (0.01)***
Log age                                0.07 (0.01)***
Log size                               0.05 (0.01)***
Log devs                               0.07 (0.01)***
Functional-Static-Explicit-Managed    −0.25 (0.04)***
Functional-Dynamic-Explicit-Managed   −0.17 (0.04)***
Proc-Static-Explicit-Managed          −0.06 (0.03)*
Script-Dynamic-Explicit-Managed        0.001 (0.03)
Script-Dynamic-Implicit-Managed        0.04 (0.02)*
Proc-Static-Implicit-Unmanaged         0.14 (0.02)***

Language classes coded with weighted effects codes (AIC = 10,419, Deviance = 1132, Num. obs. = 1067).
***p < 0.001, **p < 0.01, *p < 0.05.
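To make the phrase "weighted effects codes" concrete, the following is a minimal sketch that contrasts it with treatment (dummy) coding. The function names and the toy class labels are hypothetical and not taken from the study. The sketch assumes the standard textbook definition: rows belonging to the omitted level receive −n_level / n_omitted in each column, so every column sums to zero over the sample.

```python
from collections import Counter


def treatment_coding(levels, base):
    """Dummy coding: one 0/1 indicator column per non-base level."""
    cols = [lv for lv in sorted(set(levels)) if lv != base]
    rows = [[1.0 if obs == lv else 0.0 for lv in cols] for obs in levels]
    return cols, rows


def weighted_effects_coding(levels, omitted):
    """Weighted effects coding (textbook form, assumed here).

    Same indicator columns as treatment coding, except that rows of the
    omitted level get -n_level / n_omitted in every column.
    """
    counts = Counter(levels)
    cols = [lv for lv in sorted(counts) if lv != omitted]
    rows = []
    for obs in levels:
        if obs == omitted:
            rows.append([-counts[lv] / counts[omitted] for lv in cols])
        else:
            rows.append([1.0 if obs == lv else 0.0 for lv in cols])
    return cols, rows


# Hypothetical toy sample: one language-class label per project.
classes = ["Proc"] * 4 + ["Script"] * 2 + ["Func"] * 2

t_cols, t_rows = treatment_coding(classes, base="Proc")
w_cols, w_rows = weighted_effects_coding(classes, omitted="Proc")

# Each weighted-effects column sums to zero over the sample.
column_sums = [sum(row[j] for row in w_rows) for j in range(len(w_cols))]
print(w_cols, column_sums)
```

Under treatment coding a coefficient is read as a difference from the base level. Under weighted effects coding it is read as a difference from the sample-weighted average. This is why the same fitted model can be reported either way.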
Given this, we first consider testing for the dependence between domain and language usage within a project, using a Chi-square test of independence. Of 119 cells, 46, that is, 39%, are below the value of 5, which is too high. No more than 20% of the counts should be below 5 [14]. We include the value here for completeness[d]; however, the low strength of association of 0.191, as measured by Cramér's V, suggests that any relationship between domain and language is small and that inclusion of domain in regression models would not produce meaningful results.
One option to address this concern would be to remove languages or combine domains; however, our data here presents no clear choices. Alternatively, we could combine languages; this choice leads to a related but slightly different question.

RQ2. Which language properties relate to defects?
Rather than considering languages individually, we aggregate them by language class, as described in Section 2.2, and analyze the relationship to defects. Broadly, each of these properties divides languages along lines that are often discussed in the context of errors, drives user debate, or has been the subject of prior work. Since the individual properties are highly correlated, we create six model factors that combine all of the individual factors across all of the languages in our study. We then model the impact of the six different factors on the number of defects while controlling for the same basic covariates that we used in the model in RQ1.
As with language (earlier in Table 6), we are comparing language classes with the average behavior across all language classes. The model is presented in Table 7. It is clear that the Script-Dynamic-Explicit-Managed class has the smallest magnitude coefficient. The coefficient is insignificant, that is, the z-test for the coefficient cannot distinguish the coefficient from zero. Given the magnitude of the standard error, however, we can assume that the behavior of languages in this class is very close to the average across all languages. We confirm this by recoding the coefficient using Proc-Static-Implicit-Unmanaged as the base level and employing treatment, or dummy, coding that compares each language class with the base level. In this case, Script-Dynamic-Explicit-Managed is significantly different with p = 0.00044. We note here that while choosing different coding methods affects the coefficients and z-scores, the models are identical in all other respects. When we change the coding we are rescaling the coefficients to reflect the comparison that we wish to make [4]. Comparing the other language classes to the grand mean, Proc-Static-Implicit-Unmanaged languages are more likely to induce defects. This implies that either implicit type conversion or memory management issues contribute to greater defect proneness as compared with other procedural languages.
Among scripting languages we observe a similar relationship between languages that allow versus those that do not allow implicit type conversion, providing some evidence that implicit type conversion (vs. explicit) is responsible for this difference as opposed to memory management. We cannot state this conclusively given the correlation between factors. However, when compared to the average, as a group, languages that do not allow implicit type conversion are less error-prone while those that do are more error-prone. The contrast between static and dynamic typing is also visible in functional languages.
The functional languages as a group show a strong difference from the average. Statically typed languages have a substantially smaller coefficient yet both functional language classes have the same standard error. This is strong evidence that functional static languages are less error-prone than functional dynamic languages; however, the z-tests only test whether the coefficients are different from zero. In order to strengthen this assertion, we recode the model as above using treatment coding and observe that the Functional-Static-Explicit-Managed language class is significantly less defect-prone than the Functional-Dynamic-Explicit-Managed language class with p = 0.034.

              Df   Deviance    Resid. Df   Resid. Dev   Pr(>Chi)
NULL                           1066        32,995.23
Log commits   1    31,634.32   1065        1360.91      0
Log age       1    51.04       1064        1309.87      0
Log size      1    50.82       1063        1259.05      0
Log devs      1    31.11       1062        1227.94      0
Lang. class   5    95.54       1057        1132.40      0
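The deviance table above can be checked with simple arithmetic. The short sketch below uses only the numbers printed in the table. It recomputes each residual deviance by subtracting the deviance explained by each added term, then reports each term's share of the null deviance. The variable names are illustrative.

```python
# Values copied from the analysis-of-deviance table.
null_deviance = 32995.23
terms = [
    ("Log commits", 31634.32),
    ("Log age", 51.04),
    ("Log size", 50.82),
    ("Log devs", 31.11),
    ("Lang. class", 95.54),
]

residual = null_deviance
residuals = {}
shares = {}
for name, explained in terms:
    before = residual
    residual -= explained                      # residual deviance after adding the term
    residuals[name] = residual
    shares[name] = explained / null_deviance   # share of the null deviance
    print(f"{name:12s} resid. dev = {residual:9.2f}  "
          f"share of null = {shares[name]:.2%}  "
          f"share of remaining = {explained / before:.2%}")
```

Read this way, language class accounts for roughly 0.3% of the null deviance. Its share of the deviance left after the four control variables is larger, because log commits absorbs most of the total.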
d Chi-squared value of 243.6 with 96 df and p = 8.394e−15.
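As an illustration of the quantities quoted in this footnote and in the domain-versus-language test, here is a minimal sketch of a Chi-square test of independence with Cramér's V, written in plain Python. The function name and the toy table are hypothetical; the study's actual contingency table is not reproduced here. The sketch applies the small-count rule to expected counts, which is the usual form of the rule; the text may instead be counting observed cells.

```python
import math


def chi_square_independence(table):
    """Return (chi2, dof, cramers_v, share_low) for a 2-D list of counts.

    share_low is the fraction of cells whose expected count is below 5.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    low = 0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            if expected < 5:
                low += 1
    n_rows, n_cols = len(table), len(table[0])
    dof = (n_rows - 1) * (n_cols - 1)
    cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, cramers_v, low / (n_rows * n_cols)


# Hypothetical 3x2 table of counts (rows: languages, columns: domains).
toy = [[10, 20], [20, 10], [15, 15]]
chi2, dof, v, share_low = chi_square_independence(toy)
print(chi2, dof, v, share_low)

# Shape check against the text: 17 languages by 7 domains.
cells = 17 * 7
paper_dof = (17 - 1) * (7 - 1)
```

The shape check agrees with the text: 17 languages across 7 domains give 119 cells and 96 degrees of freedom.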
[Figure residue: heat map. Row labels list languages (Ruby, Php, Python, Perl, Clojure, Erlang, Haskell, Scala) grouped by language class; the column labels and cell values were not recoverable from the extraction.]
bug categories and languages, we use an NBR regression model for each category. For each model we use the same control factors as RQ1 as well as languages encoded with weighted effects to predict defect fixing commits.
The results along with the ANOVA value for language are shown in Table 8. The overall deviance for each model is substantially smaller and the proportion explained by language for a specific defect type is similar in magnitude for most of the categories. We interpret this relationship to mean that language has a greater impact on specific categories of bugs than it does on bugs overall. In the next section we expand on these results for the bug categories with significant bug counts as reported in Table 5. However, our conclusion generalizes for all categories.
Programming errors. Generic programming errors account for around 88.53% of all bug fix commits and occur in all the language classes. Consequently, the regression analysis draws a conclusion similar to that of RQ1 (see Table 6). All languages incur programming errors such as faulty error-handling, faulty definitions, typos, etc.
Memory errors. Memory errors account for 5.44% of all the bug fix commits. The heat map in Figure 2 shows a strong relationship between the Proc-Static-Implicit-Unmanaged class and memory errors. This is expected as languages with unmanaged memory are known for memory bugs. Table 8 confirms that such languages, for example, C, C++, and Objective-C, introduce more memory errors. Among the managed languages, Java induces more memory errors, although fewer than the unmanaged languages. Although Java has its own garbage collector, memory leaks are not surprising since unused object references often prevent the garbage collector from reclaiming memory [11]. In our data, 28.89% of all the memory errors in Java are the result of a memory leak. In terms of effect size, language has a larger impact on memory defects than all other cause categories.
Concurrency errors. 1.99% of the total bug fix commits are related to concurrency errors. The heat map shows that Proc-Static-Implicit-Unmanaged dominates this error type. C and C++ introduce 19.15% and 7.89% of the errors, and they are distributed across the projects.

            C       C++     C#      Java    Scala   Go      Erlang
Race        63.11   41.46   77.7    65.35   74.07   92.08   78.26
Deadlock    26.55   43.36   14.39   17.08   18.52   10.89   15.94
SHM         28.78   18.24   9.36    9.16    8.02    0       0
MPI         0       2.21    2.16    3.71    4.94    1.98    10.14

Both of the Static-Strong-Managed language classes are in the darker zone in the heat map, confirming that, in general, static languages produce more concurrency errors than others. Among the dynamic languages, only Erlang is more prone to concurrency errors, perhaps relating to the greater use of this language for concurrent applications. Likewise, the negative coefficients in Table 8 show that projects written
Table 8. While the impact of language on defects varies across defect category, language has a greater impact on specific categories than it
does on defects in general.
than hybrid. Similarly, we annotate Scala as functional and C# as procedural, although both support either design choice [19, 21]. We do not distinguish object-oriented languages (OOP) from procedural languages in this work as there is no clear distinction; the difference largely depends on programming style. We categorize C++ as allowing implicit type conversion because a memory region of a certain type can be treated differently using pointer manipulation [22]. We note that most C++ compilers can detect type errors at compile time.
Finally, we associate defect fixing commits to language properties, although they could reflect reporting style or other developer properties. Availability of external tools or libraries may also impact the extent of bugs associated with a language.

6. CONCLUSION
We have presented a large-scale study of language type and use as it relates to software quality. The GitHub data we used is characterized by its complexity and variance along multiple dimensions. Our sample size allows a mixed-methods study of the effects of language, and of the interactions of language, domain, and defect type while controlling for a number of confounds. The data indicates that functional languages are better than procedural languages; it suggests that disallowing implicit type conversion is better than allowing it; that static typing is better than dynamic; and that managed memory usage is better than unmanaged. Further, the defect proneness of languages in general is not associated with software domains. Additionally, languages are more related to individual bug categories than bugs overall.
On the other hand, even large datasets become small and insufficient when they are sliced and diced many ways simultaneously. Consequently, with an increasing number of dependent variables it is difficult to answer questions about a specific variable's effect, especially where variable interactions exist. Hence, we are unable to quantify the specific effects of language type on usage. Additional methods such as surveys could be helpful here. Addressing these challenges remains for future work.

Acknowledgments
This material is based upon work supported by the National Science Foundation under grant nos. 1445079, 1247280, 1414172, 1446683 and from AFOSR award FA955–11-1-0246.

References
1. Bhattacharya, P., Neamtiu, I. Assessing programming language impact on development and maintenance: A study on C and C++. In Proceedings of the 33rd International Conference on Software Engineering, ICSE'11 (New York, NY, USA, 2011). ACM, 171–180.
2. Bird, C., Nagappan, N., Murphy, B., Gall, H., Devanbu, P. Don't touch my code! Examining the effects of ownership on software quality. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering (2011). ACM, 4–14.
3. Blei, D.M. Probabilistic topic models. Commun. ACM 55, 4 (2012), 77–84.
4. Cohen, J. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum, 2003.
5. Easterbrook, S., Singer, J., Storey, M.-A., Damian, D. Selecting empirical methods for software engineering research. In Guide to Advanced Empirical Software Engineering (2008). Springer, 285–311.
6. El Emam, K., Benlarbi, S., Goel, N., Rai, S.N. The confounding effect of class size on the validity of object-oriented metrics. IEEE Trans. Softw. Eng. 27, 7 (2001), 630–650.
7. Hanenberg, S. An experiment about static and dynamic type systems: Doubts about the positive impact of static type systems on development time. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA'10 (New York, NY, USA, 2010). ACM, 22–35.
8. Harrison, R., Smaraweera, L., Dobie, M., Lewis, P. Comparing programming paradigms: An evaluation of functional and object-oriented programs. Softw. Eng. J. 11, 4 (1996), 247–254.
9. Harter, D.E., Krishnan, M.S., Slaughter, S.A. Effects of process maturity on quality, cycle time, and effort in software product development. Manage. Sci. 46, 4 (2000), 451–466.
10. Hindley, R. The principal type-scheme of an object in combinatory logic. Trans. Am. Math. Soc. (1969), 29–60.
11. Jump, M., McKinley, K.S. Cork: Dynamic memory leak detection for garbage-collected languages. In ACM SIGPLAN Notices, Volume 42 (2007). ACM, 31–38.
12. Kleinschmager, S., Hanenberg, S., Robbes, R., Tanter, É., Stefik, A. Do static type systems improve the maintainability of software systems? An empirical study. In 2012 IEEE 20th International Conference on Program Comprehension (ICPC) (2012). IEEE, 153–162.
13. Li, Z., Tan, L., Wang, X., Lu, S., Zhou, Y., Zhai, C. Have things changed now? An empirical study of bug characteristics in modern open source software. In ASID'06: Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability (October 2006).
14. Marques De Sá, J.P. Applied Statistics Using SPSS, Statistica and Matlab, 2003.
15. Mayer, C., Hanenberg, S., Robbes, R., Tanter, É., Stefik, A. An empirical study of the influence of static type systems on the usability of undocumented software. In ACM SIGPLAN Notices, Volume 47 (2012). ACM, 683–702.
16. Meyerovich, L.A., Rabkin, A.S. Empirical analysis of programming language adoption. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications (2013). ACM, 1–18.
17. Milner, R. A theory of type polymorphism in programming. J. Comput. Syst. Sci. 17, 3 (1978), 348–375.
18. Mockus, A., Votta, L.G. Identifying reasons for software changes using historic databases. In ICSM'00. Proceedings of the International Conference on Software Maintenance (2000). IEEE Computer Society, 120.
19. Odersky, M., Spoon, L., Venners, B. Programming in Scala. Artima Inc, 2008.
20. Pankratius, V., Schmidt, F., Garretón, G. Combining functional and imperative programming for multicore software: An empirical study evaluating scala and java. In Proceedings of the 2012 International Conference on Software Engineering (2012). IEEE Press, 123–133.
21. Petricek, T., Skeet, J. Real World Functional Programming: With Examples in F# and C#. Manning Publications Co., 2009.
22. Pierce, B.C. Types and Programming Languages. MIT Press, 2002.
23. Posnett, D., Bird, C., Devanbu, P. An empirical study on the influence of pattern roles on change-proneness. Emp. Softw. Eng. 16, 3 (2011), 396–423.
24. Tan, L., Liu, C., Li, Z., Wang, X., Zhou, Y., Zhai, C. Bug characteristics in open source software. Emp. Softw. Eng. (2013).

Baishakhi Ray ([email protected]), Department of Computer Science, University of Virginia, Charlottesville, VA.
Daryl Posnett, Premkumar Devanbu, and Vladimir Filkov ({dpposnett@, devanbu@cs., filkov@cs.}ucdavis.edu), Department of Computer Science, University of California, Davis, CA.

Copyright held by owners/authors. Publication rights licensed to ACM.
CAREERS
California Institute of Technology of the largest Christian colleges in North America, an official copy of most recent transcripts, and
Lecturer in Computing and Mathematical and was named the #1 regional college in the Mid- a diversity statement that describes how your
Sciences west for 2017 by U.S. News & World Report. teaching, scholarship, mentoring and/or service
For more information and application infor- might contribute to a liberal arts college commu-
The Department of Computing and Mathemati- mation, see: https://cs.calvin.edu/documents/ nity that includes a commitment to diversity as
cal Sciences (CMS) at the California Institute of Tenure_Track_Faculty_Position. one of its core values. Three letters of recommen-
Technology invites applications for the position dation should be sent separately. Review of appli-
of Lecturer in Computing and Mathematical Sci- cations will continue until the position is filled.
ences. This is a (non-tenure-track) career teaching Furman University To submit an application and letters of recom-
position, with full-time teaching responsibilities. Open-Rank Tenure Track Professor in mendation, please visit http://jobs.furman.edu.
We seek to fill this position this coming academic Computer Science
year and the initial term of appointment can be
up to three years. The Department of Computer Science at Furman Johns Hopkins University
The lecturer will teach introductory computer University invites applications for an open-rank Full-Time Teaching Position
science courses including data structures, algo- tenure track position to begin in the fall of 2018.
rithms and software engineering, and will work Candidates must have a Ph.D. in Computer Sci- The Department of Computer Science at Johns
closely with the CMS faculty on instructional ence or a closely related field, and all areas of spe- Hopkins University seeks applicants for a full-
matters. The ability to teach intermediate-level cialty will be considered. The position requires time teaching position. This is a career-oriented,
undergraduate courses in areas such as software teaching excellence, scholarly and professional renewable appointment that is responsible for
engineering, computing systems and/or com- activity involving undergraduates, effective insti- the development and delivery of courses pri-
pilers is desired. The lecturer may also assist in tutional service, and a willingness to work with marily to undergraduate students both within
other aspects of the undergraduate program, colleagues across disciplines. and outside the major. Teaching faculty are also
including curriculum development, academic The Department of Computer Science confers encouraged to engage in departmental and uni-
advising, and monitoring research projects. The degrees with majors in Computer Science (B.S.) versity service and may have advising responsi-
lecturer must have a track record of excellence in and Information Technology (B.S. and B.A.), an bilities. Opportunities to teach graduate level
teaching computer science to undergraduates. innovative, interdisciplinary program of study. courses may also be available, depending on the
In addition, the lecturer will have opportunities The Department values teaching and research candidate’s background. Extensive grading sup-
to participate in research projects in the depart- projects that bridge Computer Science with other port is given to all instructors. The university has
ment. An advanced degree in Computer Science disciplines, providing students with learning instituted a non-tenure track career path for full-
or related field is desired but not required. opportunities outside the classroom and in the time teaching faculty culminating in the rank of
Applications will be accepted on an ongo- community, and contributing to Furman’s uni- Teaching Professor.
ing basis until the position is filled. Please view versity-wide First Year Writing Seminar program. Johns Hopkins is a private university known
the application instructions and apply on-line at Furman Computer Science professors mentor for its commitment to academic excellence and
https://applications.caltech.edu/job/cmslect undergraduates both formally and informally, research. The Computer Science department is
The California Institute of Technology is an and work to build a welcoming student-faculty one of nine academic departments in the Whit-
Equal Opportunity/Affirmative Action Employer. community. ing School of Engineering. We are located in Bal-
Women, minorities, veterans, and disabled per- Furman University is a selective private liber- timore, MD in close proximity to Washington,
sons are encouraged to apply. al arts and sciences college committed to help- DC and Philadelphia, PA. See the department
ing students develop intellectually, personally, webpages at http://www.cs.jhu.edu for additional
and interpersonally and providing the practical information about the department, including
Calvin College skills necessary to succeed in a rapidly-changing undergraduate programs and current course de-
Tenure Track CS Faculty Position world. Our recently-launched strategic vision, scriptions.
The Furman Advantage, promises students an Applicants for the position must have a Ph.D.
The Department of Computer Science at Calvin individualized four-year pathway facilitated by in Computer Science or a closely related field,
College invites applications for a tenure track fac- team of mentors and infused with a rich and var- demonstrated excellence in and commitment to
ulty position to begin August 2018, pending ad- ied set of high-impact experiences outside the teaching, and excellent communication skills.
ministrative approval. Our department features classroom that include undergraduate research, Applicants should apply online at https://aca-
supportive colleagues, excellent facilities, (most- study away, internships, community-focused demicjobsonline.org/ajo/jobs/9571. Applications
ly) hardworking students, a dynamic colloquium learning, and opportunities to engage across will be evaluated on a rolling basis. Questions
series, and strong undergraduate programs in disciplines. should be directed to [email protected].
computer science, data science, digital commu- Furman is an Equal Opportunity Employer The Johns Hopkins University is commit-
nication, and information systems, including a committed to increasing the diversity of its fac- ted to active recruitment of a diverse faculty and
BCS major accredited by ABET (abet.org). Appli- ulty and staff. The University aspires to create a student body. The University is an Affirmative Ac-
cants should have a PhD (or be near completion) community of people representing a multiplic- tion/Equal Opportunity Employer of women, mi-
in computer science or a related area, or have a ity of identities including gender, race, religion, norities, protected veterans and individuals with
master’s degree and 5 years of related experience. spiritual belief, sexual orientation, geographic disabilities and encourages applications from
We are especially interested in expanding our ex- origin, socioeconomic background, ideology, these and other protected group members. Con-
pertise in the areas of 3D modeling & animation, world view, and varied abilities. Domestic part- sistent with the University’s goals of achieving
cybersecurity, data analytics & visualization, or ners of employees are eligible for comprehensive excellence in all areas, we will assess the compre-
machine learning, but individuals from all com- benefits. hensive qualifications of each applicant.
puting-related areas are encouraged to apply. Applicants should submit a curriculum vitae, The Whiting School of Engineering and the
Calvin is a Christian comprehensive liberal arts cover letter, statement of teaching philosophy Department of Computer Science are committed
college located in Grand Rapids, Michigan; it is one and experience, statement of research interests, to building a diverse educational environment.
Johns Hopkins University mation for at least three references. Applications a Ph.D. in computer science or a closely related
Tenure-Track Faculty Positions must be made on-line at https://academicjobson- discipline, teaching experience, and strong re-
line.org/ajo/jobs/9559. Review of applications search record or research potential.
The Johns Hopkins University’s Department of will begin in December 2017. While candidates Strong candidates in all areas of Computer Sci-
Computer Science seeks applicants for tenure- who complete their applications by December ence are encouraged to apply but we are especially
track faculty positions at all levels and across all 15, 2017 will receive full consideration, the de- interested in cyber security and networking, mo-
areas of computer science. Particular emphasis is partment will consider exceptional applicants bile development, database, or big data analytics.
at the junior level and in the areas of systems, dis- at any time. Questions should be directed to Applications should include a cover letter,
tributed systems, networks, system security, data [email protected]. curriculum vitae, research statement, a teaching
center scale computing, and big data infrastruc- The Johns Hopkins University is commit- statement, and a list of 3 references. Please send
ture. However, qualified applicants in all areas of ted to active recruitment of a diverse faculty and electronic copies of these documents to Alfred
computer science will be considered. student body. The University is an Affirmative Ac- McKinney [email protected]. Review of
The Department of Computer Science has tion/Equal Opportunity Employer of women, mi- applications will begin on September 30th, 2017,
30 full-time tenured and tenure-track faculty norities, protected veterans and individuals with and continue until the position is filled.
members, 8 research and 3 teaching faculty disabilities and encourages applications from Please, visit http://www.lsus.edu for more de-
members, 150 PhD students, 225 MSE/MSSI these and other protected group members. Con- tails about LSUS and http://www.lsus.edu/hr for
students, and over 300 undergraduate students. sistent with the University’s goals of achieving more information on this position and other em-
There are several affiliated research centers and excellence in all areas, we will assess the compre- ployment opportunities.
institutes including the Laboratory for Compu- hensive qualifications of each applicant. LSUS is committed to achieving excellence
tational Sensing and Robotics (LCSR), the Cen- The Whiting School of Engineering and the through cultural diversity. The University actively
ter for Language and Speech Processing (CLSP), Department of Computer Science are committed encourages applications and/or nominations of
the JHU Information Security Institute (JHUISI), to building a diverse educational environment. women, persons of color, veterans and persons
the Institute for Data Intensive Engineering and with disabilities. LSUS is an Affirmative Action,
Science (IDIES), the Malone Center for Engineer- Equal Opportunity Employer.
ing in Healthcare (MCEH), and other labs and Louisiana State University in
research groups. More information about the Shreveport
Department of Computer Science can be found Assistant or Associate Professor of Computer Louisiana State University in
at www.cs.jhu.edu and about the Whiting School Science Shreveport
of Engineering at https://engineering.jhu.edu. Chair of Computer Science
Qualifications and required materials can be The Department of Computer Science at Loui-
found at http://www.cs.jhu.edu/about/employ- siana State University in Shreveport invites ap- The Department of Computer Science at Louisi-
ment-opportunities/. plications for a tenure-track Assistant/Associate ana State University in Shreveport invites applica-
Applicants should submit a curriculum vitae, Professor position. The successful candidate will tions for the position of Chair of the Department
a research statement, a teaching statement, three teach 3 courses per semester (9 credit hours) be- of Computer Science in the College of Arts and
recent publications, and complete contact infor- ginning in the fall of 2018. Candidates must have Sciences to begin in the fall of 2018.
THE UNIVERSITY: California State University, East Bay is known RANK AND SALARY: Assistant Professor. Salary is dependent
for award-winning programs, expert instruction, a diverse student upon educational preparation and experience. Subject to budgetary
body, and a choice of more than 100 career-focused fields of study. authorization.
The ten major buildings of the Hayward Hills campus, on 342 acres,
contain over 150 classrooms and teaching laboratories, over 177 DATE OF APPOINTMENT: Fall Semester, 2018.
specialized instructional rooms, numerous computer labs and a library,
which contains a collection of over one million items. The University QUALIFICATIONS: Applicants must have a Ph.D. in Computer
also has campuses in Concord and Oakland, as well as Online. With Science by Fall Semester 2018. Applicants must be able to teach
an enrollment of over 15,000 students and approximately 900 faculty, undergraduate and master’s level courses in most of the standard
CSUEB is organized into four colleges: Letters, Arts, and Social Sciences; computer science core subjects. Candidates should demonstrate
Business and Economics; Education and Allied Studies; and Science. experience in teaching, mentoring, research, or community service
The University offers bachelor’s degrees in 50 fields, minors in 61 fields, that has prepared them to contribute to our commitment to diversity
master’s degrees in 37 fields, 9 credentials programs, 12 certificate and excellence. Additionally, applicants must demonstrate a record
options, and 1 doctoral degree program. http://www20.csueastbay.edu/ of scholarly activity. Candidate’s accomplishments should be
All California State University campuses, including CSUEB, will become commensurate with their professional level.
smoke and tobacco-free effective September 1, 2017.
This University is fully committed to the rights of students, staff and
THE DEPARTMENT: The Department of Computer Science has faculty with disabilities in accordance with applicable state and federal
12 full-time faculty members, with a wide range of backgrounds and interests. The faculty is committed to teaching its undergraduate and Master’s level students. In a typical quarter, the Department will offer over 30 undergraduate and about 20 graduate classes. Classes are offered both in day and evening. Classes are generally small, with many opportunities for faculty-student contact. The Department offers a variety of degrees: B.S. in Computer Science (with possible options in Networking and Data Communications, Software Engineering, or Computer Engineering), and M.S. in both Computer Science and Computer Networks. Currently, there are more than 500 undergraduate majors and over 250 students in the M.S. programs.
DUTIES OF THE POSITION (2 positions currently available): Teaching courses at B.S. and M.S. levels, curriculum development at both levels, and sustaining a research program. Please note that teaching assignments at California State University, East Bay include courses at the Hayward, Concord and Online campuses. In addition to teaching, all faculty have advising responsibilities, assist the department with administrative and/or committee work, and are expected to assume campus-wide committee responsibilities. The ideal candidate for this position is able to:
1. Teach a wide range of computer science courses including most or all of the core subject matter at both the undergraduate and graduate level.
2. Support offerings for undergraduate C.S. students including teaching courses, developing the undergraduate curriculum, and engaging undergraduate students in research.
3. Support offerings for graduate C.S. students including teaching courses, guiding M.S. theses, developing the graduate comprehensive examination, etc.
4. Advise Computer Science students.
5. Participate in departmental activities such as curriculum development, assessment, outreach, etc.
6. Develop and continue ongoing research activities, service and leadership.
laws. For more information about the University’s program supporting the rights of our students with disabilities see: http://www20.csueastbay.edu/af/departments/as/
APPLICATION DEADLINE: Review of applications will begin November 1, 2017. The position, however, will be considered open until filled. Please submit a letter of application, which addresses the qualifications noted in the position announcement; a complete and current vita; names and email addresses of three references, three letters of recommendation, a statement of teaching philosophy, and evidence of teaching and research abilities via http://apply.interfolio.com/42009. See instructions on how to apply below. Applicants are strongly encouraged to also submit a one-page diversity statement that addresses how you engage a diverse student population in your teaching, research, mentoring, and advising.
Note: California State University, East Bay hires only individuals lawfully authorized to work in the United States. All offers of employment are contingent upon presentation of documents demonstrating the appointee’s identity and eligibility to work in accordance with provisions of the Immigration Reform and Control Act. A background check (including a criminal records check and prior employment verification) must be completed and cleared prior to the start of employment.
For instructions on how to apply, please visit: https://help.interfolio.com/hc/en-us/articles/203701176-Job-Applicant-s-Guide-to-ByCommittee-Faculty-Search
As an Equal Opportunity Employer, CSUEB does not discriminate on the basis of any protected categories: age, ancestry, citizenship, color, disability, gender, immigration status, marital status, national origin, race, religion, sexual orientation, or veteran’s status. The University is committed to the principles of diversity in employment and to creating a stimulating learning environment for its diverse student body.
CAREERS
The Chair will teach 2 classes per semester (6 credit hours) and oversee Departmental operations, personnel, and resources. Additionally, the Chair will advance the Departmental vision of academic leadership and excellence and will represent the Department to academia, industry, and government for its continued vitality. The Chair will actively work with faculty in the Department and across the University to identify and pursue innovations in research, education, and service. Moreover, the Chair will be actively involved in strategic planning that would expand the influence of the Department in the broader academic community both within and outside the university, and will promote diversity within the Department and the institution.
The successful candidate must have earned a Ph.D. in Computer Science or a closely related field. The candidate should have a track record of research, classroom teaching, student mentorship, and funding from diverse sources. The candidate must provide evidence of scientific and organizational leadership, educational innovation, and have outstanding communication, interpersonal, and administrative skills.
Interested individuals should submit an application, including a cover letter summarizing qualifications and leadership approach, a vision statement for the Department, a research and student mentoring statement, a teaching statement, detailed curriculum vitae, and three letters of reference. Please send electronic copies of these documents to Dr. Alfred McKinney at [email protected].
Applications will be reviewed on a continuing basis starting on October 30th, 2017, until the position is filled and the posting removed from the LSUS HR Web site. We welcome questions related to this posting.
Please, visit http://www.lsus.edu for more details about LSUS and http://www.lsus.edu/hr for more information on this position and other employment opportunities.
LSUS is committed to achieving excellence through cultural diversity. The University actively encourages applications and/or nominations of women, persons of color, veterans and persons with disabilities. LSUS is an Affirmative Action, Equal Opportunity Employer.

Macalester College
Two Tenure-Track Assistant Professors of Computer Science
Macalester invites applications for two tenure-track positions at the assistant professor level to begin Fall 2018. Candidates who have, or are completing, a Ph.D. in Computer Science are preferred, but closely related fields may also be considered. We are especially interested in candidates who have a strong commitment to both teaching and research in an undergraduate liberal arts environment. This person will contribute to the teaching of our introductory, core and advanced courses, and mentor undergraduate research.
Macalester offers majors in Computer Science, Mathematics, and Applied Mathematics and Statistics, and minors in Computer Science, Mathematics, and Statistics, as well as a new minor in Data Science. Typical class sizes range from 15 to 32 students. We encourage innovative pedagogy and curriculum and emphasize computer science’s interdisciplinary connections. We have close relationships with several disciplines both within and beyond the sciences, and we are interested in candidates whose work spans disciplinary boundaries. Areas of highest priority include computer and data security and privacy, mobile and ubiquitous computing, computer networks and systems. For more information about our programs, see: http://macalester.edu/mscs
About Macalester
Macalester College is a highly selective, private liberal arts college in the vibrant Minneapolis-Saint Paul metropolitan area. The Twin Cities have a population of approximately three million, a rich arts community, strong local industries, an award-winning parks system, and are home to many colleges and universities, including the University of Minnesota. Macalester’s diverse student body comprises over 2000 undergraduates from 40 states and the District of Columbia and over 90 nations. The College maintains a longstanding commitment to academic excellence with a special emphasis on internationalism, multiculturalism, and service to society. We are especially interested in applicants dedicated to excellence in teaching and research/creative activity within a liberal arts college community. As an Equal Opportunity employer supportive of affirmative efforts to achieve diversity among its faculty, Macalester College strongly encourages applications from women and members of underrepresented minority groups.
Applying
To apply via Academic Jobs Online submit (1) curriculum vitae, (2) graduate transcripts, (3) three letters of recommendation (at least one of which discusses your potential as a teacher), (4) a cover letter that addresses why you are interested in Macalester, (5) a statement of teaching philosophy, and (6) a research statement. Please contact Shilad Sen at [email protected] with any questions about the position. Evaluation of appli-

Urbana-Champaign, IL
Head and Professor
Department of Computer Science
an area related to algorithms for the second position. Strong candidates with research interests in artificial intelligence and software aspect of data science will be considered as well. The successful candidates will demonstrate not only potential for excellent undergraduate teaching, but also promise in sustained research with opportunities to involve undergraduates, mentoring or recruiting underrepresented groups in computer science, and service to the department, College or University. Positions available starting in September 2018. Ph.D. or equivalent required by September 2018.
The closing date for applications is December 1, 2017 at 3 pm Pacific time. Undergraduate teaching only.
Santa Clara University, located in California’s Silicon Valley, is a comprehensive, Jesuit, Catholic university, and an AA/EEO employer.
For more information or to apply, visit https://jobs.scu.edu/postings/6211.

Stanford University, Graduate School of Business
Faculty Positions in Operations, Information and Technology
The Operations, Information and Technology (OIT) area at the Graduate School of Business, Stanford University, is seeking qualified applicants for full-time, tenure-track positions, starting September 1, 2018. All ranks and relevant disciplines will be considered. Applicants are considered in all areas of Operations, Information and Technology (OIT) that are broadly defined to include the analytical and empirical study of technological systems, in which technology, people, and markets interact. It thus includes operations, information systems/technology, and management of technology. Applicants are expected to have rigorous training in management science, engineering, computer science, economics, and/or statistical modeling methodologies. Candidates with strong empirical training in economics, behavioral science or computer science are encouraged to apply. The appointed will be expected to do innovative research in the OIT field, to participate in the school’s PhD program, and to teach both required and elective courses in the MBA program. Junior applicants should have or expect to complete a PhD by September 1, 2018.
Applicants should submit their applications electronically by visiting the web site http://www.gsb.stanford.edu/recruiting and uploading their curriculum vitae, research papers and publications, and teaching evaluations, if applicable, on that site. For an application to be considered complete, all applicants must submit a CV, a job market paper and arrange for three letters of recommendation to be submitted by November 15, 2017. For questions regarding the application process, please send an email to Faculty_Recruit-[email protected].
Stanford is an equal employment opportunity and affirmative action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected by law.

Tufts University
Dept. of Computer Science
Tenure-Track Assistant Professor in Computer Science
The Department of Computer Science at Tufts University invites applications for multiple tenure-track faculty positions to begin in September 2018. We are looking for engaged and engaging researchers and teachers with a strong vision who can build and maintain a high-quality research program at Tufts and whose research will both connect with some of our current faculty and extend into new areas. We are seeking candidates at the rank of Assistant Professor but exceptional candidates at the rank of Associate or full Professor will also be considered.
We are especially interested in candidates with research in Artificial Intelligence/Machine Learning/Robotics, Security, and Systems for Data Science, where these areas are broadly construed. Exceptional candidates in other areas will be considered as well.
Please submit your application online through Interfolio at https://apply.interfolio.com/43666. You can contact help@interfolio.com with questions.
Review of applications will begin December 15, 2017 and will continue until the position is filled. For more information about the department or the position, please visit our web page at http://engineering.tufts.edu/computer-science-positions. Inquiries should be emailed to [email protected].
modern computing era. The new Head will be uniquely positioned to build on the considerable strength and illustrious history of the department to lead the next chapter of the digital revolution. More details about the department can be found at http://cs.illinois.edu.
The Head is responsible for visioning, strategic planning, operations, finance, academic affairs, external relations, and advancement, and is a tenured Professor in the department. The successful candidate will be committed to enhancing the University’s education, research, and service missions and will possess the scholarly record, leadership skills, and strategic capacity to advance the department. Additional essential qualifications include successful administrative experience in a university, industry, or government environment, the ability to effectively engage a broad range of internal and external constituencies, and a commitment to diversity.
The Head position is a full-time, twelve-month administrative appointment accompanied by a full-time, tenured Professor appointment with full University benefits. The desired start date for this position is as soon as possible. Applicants may be interviewed before the closing date; however, no hiring decision will be made until after that date. To ensure full consideration, applications should be received by November 10, 2017, but applications will be accepted until the position is filled. Salary is negotiable and commensurate with skills and experience.
The University has retained Witt/Kieffer, a national executive search firm, to assist in this recruitment. Nominations and applications, including cover letter, curriculum vita, and the names/contact information for three references, should be submitted electronically to Witt/Kieffer consultants John K. Thornburgh and Brian Bloomfield at the email address [email protected]. The consultants can be reached by telephone, care of Donna Janulis, at 630-575-6131.
The University of Illinois conducts criminal background checks on all job candidates upon acceptance of a contingent offer.
The University of Illinois is an Equal Opportunity, Affirmative Action employer. Minorities, women, veterans and individuals with disabilities are encouraged to apply. For more information, visit http://go.illinois.edu/EEO. To learn more about the University’s commitment to diversity, please visit http://www.inclusiveillinois.illinois.edu.

University of Minnesota
Two Tenure-Track Positions
The Department of Industrial and Systems Engineering at the University of Minnesota invites applications for two tenure-track faculty positions starting in Fall 2018.
Applicants at all ranks will be considered. We seek candidates with a strong methodological foundation in Operations Research, Industrial Engineering and a demonstrated interest in applications including, but not limited to: business analytics, energy and the environment, healthcare and medical applications, transportation and logistics, supply chain management, and service systems. Applicants must hold a Ph.D. in Industrial Engineering, Operations Research, Operations Management or a closely related discipline and have demonstrated the potential to conduct a vigorous and significant research program as evidenced by their publication record and supporting letters from recognized leaders in the field. Senior applicants should have a particularly strong record of research and teaching accomplishments, scientific leadership and creativity.
The University of Minnesota is located in the heart of the vibrant Minneapolis-St. Paul metropolitan area, which is consistently rated as one of America’s best places to live and is home to many leading companies. The Department of Industrial and Systems Engineering is within the College of Science and Engineering at the University of Minnesota.
Applicants are encouraged to apply by November 1, 2017. Review of applications will begin immediately and will continue until the position is filled. Additional information and application instructions can be found at http://www.isye.umn.edu/news/Search_f2017.shtml. The University of Minnesota is an equal opportunity educator and employer.

University of Nevada, Reno
Assistant or Associate Professor
The Department of Computer Science and Engineering (CSE) at the University of Nevada, Reno (UNR) invites applications for a Tenure-track Faculty position starting July 1, 2018. The Department seeks highly qualified candidates in games, hardware, or in areas that extend or complement the Department’s existing strengths or fulfill Department needs. The position is at the Assistant or Associate Professor level. The new hire will work with existing faculty to strengthen research, attract research funding, teach courses, and enhance our graduate and undergraduate programs.
The rapidly expanding and dynamic Department of Computer Science and Engineering has added nine positions over the last five years and expects to add more. Several faculty have NSF CAREER awards and play lead roles in multiple statewide and national multi-million dollar NSF awards. In addition to federal support from DoD, DHS, DoE, and NASA, companies like Google, Microsoft, Ford, AT&T, Nokia, and Honda support our research.
In the last five years, the College of Engineering has witnessed unprecedented growth in student enrollment and number of faculty positions. The College is positioned to further enhance its growth of its students, faculty, staff, facilities as well as its research productivity and its graduate and undergraduate programs.
Thanks to this substantial growth in both student enrollment and tenure-track faculty positions, the College of Engineering has received funding to build a new engineering building, scheduled to be completed in 2020. The new engineering building provides both additional space critically needed by the College and the modern facilities capable of supporting advanced research and laboratory space. This building will allow the College to pursue its strategic vision, serve Nevada and the nation, and educate future generations of engineering professionals.
The University of Nevada, Reno recognizes that diversity promotes excellence in education and research. We are an inclusive and engaged community and recognize the added value that students, faculty, and staff from different backgrounds bring to the educational experience.
Interested candidates must apply online to www.unrsearch.com/postings/25237. Application process includes: a detailed letter of application, curriculum vitae, statement of teaching philosophy, state of research and plans, and contact information for three professional references.
Review of applications will begin on November 1, 2017 and continue until the search closes on December 31, 2017. Inquiries should be directed to Ms. Lisa Cody, [email protected].
EEO/AA Women and under-represented groups, individuals with disabilities, and veterans are encouraged to apply.

University of Oregon
Dept. of Computer and Information Science
Faculty Positions
The University of Oregon’s Computer and Information Science (CIS) Department invites applications for two tenure-track faculty positions at the rank of Assistant Professor, to begin in September 2018. We seek candidates specializing in high-performance computing or data science. We are especially interested in scholars who will enhance/complement the department’s existing strengths in these areas. Applicants whose research addresses security and/or privacy issues in these sub-disciplines are of particular interest.
CIS is a diverse and growing department with strengths in networking and distributed systems, data science, and high-performance computing. We offer a stimulating, friendly environment for collaborative research both within the department, which expects to grow substantially in the next few years, and with other units on campus; for example, the department plays a key role in the Knight Campus for Accelerating Scientific Impact. The department hosts two interdisciplinary research centers, the Center for Cyber Security and Privacy and the NeuroInformatics Center. Successful candidates have access to a new, state-of-the-art high-performance computing facility. CIS is part of the College of Arts and Sciences and is housed within the Lorry Lokey Science Complex. The department offers B.S., M.S. and Ph.D. degrees. More information about the department, its programs and faculty can be found at https://cs.uoregon.edu/.
Applicants must have a Ph.D. in computer science or closely related field, a demonstrated record of excellence in research, and a strong commitment to teaching. A successful candidate will be expected to conduct a vigorous research program and to teach at both the undergraduate and graduate levels. Additionally, successful candidates will support and enhance a diverse learning and working environment. Salary is competitive.
Candidates are asked to apply online at https://academicjobsonline.org/ajo/jobs/9499 by submitting a cover letter, a curriculum vitae, a research statement, a teaching statement, and the contact details for a minimum of three referees, by 15 December 2017, or until the post has been filled. If you are unable to use this online resource, please contact [email protected] to arrange alternate means of application submission.
The University of Oregon is dedicated to the goal of building a culturally diverse and pluralistic faculty committed to teaching and working in a multicultural environment and strongly encourages applications from minorities, women, and people with disabilities. Applicants are requested to include in their cover letter information about how they will further this goal. In particular, candidates should describe previous activities mentoring minorities, women, or members of other underrepresented groups.
solution is to minimize swaps.
Solution to challenge. Numbering
the rows from top to bottom and left to
Upstart Puzzles
Partitioned Peace
In a mythical land of rhetorically encouraged antagonism, different factions manage to co-exist, though poorly. Imagine a set of red and blue hill towns connected by a network of roads. People in the red towns deal well with one another. People in the blue towns deal well with one another. But when a person from a red town travels through a blue town or vice versa, things can get unpleasant. The leaders of the red and the blue towns get together and decide the best way to resolve their differences is to perform a series of swaps in which the inhabitants of k red towns swap towns with the inhabitants of k blue towns with the end result that a person from a blue town can visit any other blue town without passing through a red town and likewise for a person from a red town. We call such a desirable state “partitioned peace.” The goal is to make k as small as possible.
[Figure: towns numbered 1 through 7. Caption: What is the minimum number of towns you can exchange so red and blue travelers never need to cross the other color’s towns? Image by Andrij Borys Associates]
Can you create an algorithm that will perform a minimum number of swaps to achieve partitioned peace?
Warm-Up 1. Given the configuration in the figure here, what is the minimum number of swaps needed to achieve partitioned peace?
Solution to Warm-Up 1. Two swaps are sufficient: Red_1 with Blue_7 and Red_3 with Blue_4.
Because exchanging town populations is painful for the people who must move, the leaders seek other arrangements; they are willing to build, for example, a certain number of roads to reduce the number of swaps.
Warm-Up 2. Given the configuration in the figure, what is the minimum number of swaps needed to achieve partitioned peace if you were able to build a single new road?
Solution to Warm-Up 2. Build a road between Blue_7 and Red_1 and then swap Red_1 with Blue_4. The red town dwellers can travel to other red towns without crossing through blue towns. Likewise, the blue town dwellers can travel to other blue towns without having to cross through red towns.
Challenge: So far, we have considered only a very simple configuration of towns; now consider that the red and blue towns alternate like the squares of a four-by-four checkerboard. Every town is connected to all its vertical and horizontal neighbors. You may build eight new roads between diagonally neighboring towns.
Where should the roads go? And after the roads are built, which towns should swap populations to minimize the number of swaps needed to achieve partitioned peace, where the swaps are between towns directly connected by roads? The [CONTINUED ON P. 111]
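For readers who want to experiment, the definition of partitioned peace can be checked mechanically: the state holds when the red towns form one connected group using only red towns, and likewise for blue. The Python sketch below is one possible brute-force approach, not the column's own solution. The function names, the dictionary-based road map, and the four-town example are all assumptions made for illustration, and the sketch allows any red town to swap with any blue town (as in the warm-ups), ignoring the Challenge's extra rule that swapped towns be directly connected.

```python
from itertools import combinations


def connected(towns, roads):
    """Return True if every town in `towns` can reach every other
    while passing only through towns in `towns`."""
    towns = set(towns)
    if not towns:
        return True
    start = next(iter(towns))
    seen = {start}
    stack = [start]
    while stack:
        here = stack.pop()
        for neighbor in roads[here]:
            if neighbor in towns and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == towns


def is_partitioned_peace(colors, roads):
    """Check that red towns and blue towns are each internally connected."""
    reds = [t for t, c in colors.items() if c == "red"]
    blues = [t for t, c in colors.items() if c == "blue"]
    return connected(reds, roads) and connected(blues, roads)


def min_swaps(colors, roads):
    """Try k = 0, 1, 2, ... and return (k, red_towns, blue_towns) for the
    first set of k red/blue exchanges that yields partitioned peace.
    Returns None if no exchange works. Exponential time: small maps only."""
    reds = [t for t, c in colors.items() if c == "red"]
    blues = [t for t, c in colors.items() if c == "blue"]
    for k in range(min(len(reds), len(blues)) + 1):
        for red_pick in combinations(reds, k):
            for blue_pick in combinations(blues, k):
                trial = dict(colors)
                for t in red_pick:
                    trial[t] = "blue"   # blue inhabitants move in
                for t in blue_pick:
                    trial[t] = "red"    # red inhabitants move in
                if is_partitioned_peace(trial, roads):
                    return k, red_pick, blue_pick
    return None


# Hypothetical map (not the figure): four towns in a line, colors alternating.
example_roads = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
example_colors = {0: "red", 1: "blue", 2: "red", 3: "blue"}
result = min_swaps(example_colors, example_roads)
print(result[0])  # number of swaps found for the hypothetical map
```

On the hypothetical line of four towns, a single swap is enough. To explore the effect of new roads, add entries to the road map (in both directions) and rerun the search; because the search is exhaustive, it is practical only for small maps.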