Re-Examining Whether, Why, and How Human-AI Interaction Is Uniquely Difficult To Design
Figure 1: Mapping the human-AI interaction design challenges in the literature [58, 13, 26, 53] onto a user-centered design process (Double Diamond [10]).
two sources of AI's design complexities, and a framework that unravels their effects on design processes.

4. We demonstrate the usefulness of the framework. Specifically, its usefulness to human-AI interaction designers, to researchers of AI's HCI issues, and to AI design method innovators and tool makers.

This paper makes three contributions. First, it provides a synthesis of many human-AI interaction design challenges and emergent solutions in the literature. Second, the provocation questions offer an alternative lens for approaching the human-AI interaction design challenge. We draw attention to AI's design complexity rather than its technical complexity; we draw attention to how AI hinders the interaction design process rather than the end product. Finally, our framework gives structure to the currently fuzzy problem space of human-AI interaction design. This provides a first step towards systematically understanding how HCI research might best choose to approach these problems going forward.

RELATED WORK

Recent research has become increasingly interested in the opportunities and challenges AI brings to HCI and UX design. As researchers produced a wealth of valuable, novel designs, they also reported encountering many challenges in the process [2, 3, 18, 26]. Some research has investigated challenges faced by UX practitioners who do not specialize in AI but who desire to integrate it into their practice [13, 53]. Research has chosen a number of different frames for investigating these challenges, including human-AI interaction design, AI/machine learning as a design material, the design of intelligent systems, designing for/with data [6, 14, 37], and many more [33, 42, 43].

To better unpack what is known about the challenges HCI researchers and practitioners face when working with AI, we cataloged these challenges and their emergent solutions. To gain a new perspective on this work, we mapped the challenges and solutions to the familiar double diamond design process used to describe user-centered design (Figure 1) and to a diagram displaying a lean startup process with its focus on producing a minimum viable product (MVP) (Figure 2), a design approach becoming more popular with the growing use of agile software development.

UX Design Challenges of AI

Across HCI and UX communities, researchers and practitioners have reported challenges in working with AI at almost every step of a user-centered design process. From left to right on Figure 1, they reported:

• Challenges in understanding AI capabilities (first divergent thinking stage): Designers frequently report that it is difficult to grasp what AI can or cannot do. This hampers designers' brainstorming and sketching processes from the start [13, 26, 51, 56].

• Challenges in envisioning many novel, implementable AI things for a given UX problem (in both divergent thinking stages): AI-powered interactions can adapt to different users and use contexts, and they can evolve over time. Even when designers understand how AI works, they often found it difficult to ideate many possible new interactions and novel experiences with much fluidity [13, 58].

Figure 2: Mapping the UX design challenges of AI in prior research onto a technology-driven design innovation process [41, 5]
• Challenges in iterative prototyping and testing human-AI interaction (in both convergent thinking stages): One core practice of HCI design and innovation is rapid prototyping, assessing the human consequences of a design and iteratively improving on it. HCI practitioners cannot meaningfully do this when working with AI. As a result, AI's UX and societal consequences can seem impossible to fully anticipate. Its breakdowns can be especially harmful for under-served user populations, including people with disabilities [45].

HCI researchers have tried two approaches to addressing this challenge. One approach is to create Wizard of Oz systems or rule-based simulators as an early-stage interactive AI prototype (e.g. as in [11, 30, 40, 44]); a minimal sketch of such a simulator appears after this list. This approach enables HCI professionals to rapidly explore many design possibilities and probe user behaviors. However, it fails to address any of the UX issues that will come from AI inference errors because there is no way to simulate these errors [52]. The second approach is to create a functioning AI system and deploy it among real users for a period of time [53]. This time-consuming, field-trial prototyping process enables designers to fully understand AI's intended and unintended consequences. However, it loses the value that comes from rapid and iterative prototyping. This approach does not protect teams from over-investing in ideas that will not work. It does not allow them to fail early and often.

• Challenges in crafting thoughtful interactions (in the last convergent thinking stage): Designers struggled to set user expectations appropriately for AI's sometimes unpredictable outputs [4]. They also worried about the envisioned designs' ethics, fairness, and other societal consequences [13, 26].

• Challenges in collaborating with AI engineers (throughout the design process): For many UX design teams, AI technical experts can be a scarce resource [19, 53]. Some designers also found it challenging to effectively collaborate with AI engineers, because they lacked a shared workflow, boundary objects, or a common language for scaffolding the collaboration [19, 28, 52].
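To make the first approach concrete, the sketch below shows what a minimal rule-based simulator for Wizard of Oz style prototyping might look like. It is an illustration only, not a system from the cited work; the trigger words and canned replies are hypothetical stand-ins for a future AI model.

```python
# A minimal rule-based simulator for early-stage "AI" prototyping.
# Hypothetical keyword rules and canned replies stand in for a real model,
# which is also why this approach cannot reproduce real AI inference errors.

RULES = [
    ({"weather", "rain", "sunny"}, "Tomorrow looks clear with a high of 72."),
    ({"timer", "remind", "alarm"}, "OK, I set a reminder for you."),
    ({"route", "drive", "traffic"}, "The fastest route takes about 25 minutes."),
]
FALLBACK = "Sorry, I can't help with that yet."


def simulated_reply(user_utterance: str) -> str:
    """Return the canned reply whose trigger words appear in the utterance."""
    words = set(user_utterance.lower().split())
    for triggers, reply in RULES:
        if words & triggers:
            return reply
    return FALLBACK  # the prototype never makes a plausible-but-wrong guess


if __name__ == "__main__":
    for utterance in ["Will it rain tomorrow?", "Set a timer for pasta", "Tell me a joke"]:
        print(f"User: {utterance}\nPrototype: {simulated_reply(utterance)}\n")
```

Because every input maps to a hand-written rule, designers can quickly probe interaction flows, but the prototype can only fail in ways the designer already anticipated, which is exactly the limitation noted above.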
Propelled by these challenges, a few researchers speculated that, when working with AI, designers should start with an elaborate matching process that pairs existing datasets or AI systems with the users and situations that are most likely to benefit from the pairing [5, 51]. This approach deviates from more traditional user-centered design in that the target user or UX problem is less fixed. It is more similar to customer discovery in an agile development process that focuses on the creation and continual evaluation of a minimum viable product (MVP) [41]. In this light, we also mapped the human-AI interaction design challenges onto an MVP innovation process. However, it seems a similar set of design challenges that curbed user-centered design also thwarted technology-driven design innovations (Figure 2, from left to right), for example:

• Challenges in understanding AI capabilities;

• Challenges in mapping out the right user stories and use cases of a "minimum viable" AI system, or envisioning how it can be applied in less obvious ways [13];

• Challenges in collaborating with AI engineers.

We found no agreed-upon set of root causes or themes around which we can easily summarize these challenges. Some researchers suggested that AI systems' technical complexity causes the interaction design problems [9]. Some considered the unpredictable system behaviors as the cause [26]. Some argued that AI appeared to be difficult to design because AI is just "a new and difficult design material," suggesting that over time, known HCI methods will likely address these challenges [13]. Others argued that user-centered design needs to change in order to work for AI [19, 51]. These proposals rarely share key citations that indicate emerging agreements.
Facilitating Human-AI Interaction Design

HCI researchers have started to investigate how to make it easier to design and innovate human-AI interactions. We identify five broad themes in this body of work:

1. Improving designers' technical literacy. An emerging consensus holds that HCI and UX designers need some technical understanding of AI to productively work with it. Designer-facing AI education materials have become available to help (e.g. [8, 20, 22, 23]). However, substantial disagreement remains in what kinds of AI knowledge are relevant to UX design, and in how advanced a technical understanding is good enough for designers [9, 48, 54].

2. Facilitating design-oriented data exploration. This body of work encourages designers to investigate the lived-life of data and discover AI design opportunities [6, 7, 14]. For example, [37] investigated users' music app metadata as a material for designing contemplative music experiences; [24] explored the design opportunities around intimate somatic data. Notably, this body of work often used terms like data-driven or smart systems; it was not always clear when the authors specifically aimed at AI.

3. Enabling designers to more easily "play with" AI in support of design ideation, so as to gain a felt sense of what AI can do. This work created interactive machine learning (iML) tools and rule-based simulators as AI prototyping tools, for example, Wekinator for gesture-based interactions [1] and the Delft AI Toolkit for tangible interactions [49]. Currently, almost all iML tools are application-domain-specific. In order to make the systems accessible to designers and maximally automate data preprocessing and modeling, these systems had to limit the range of possible in/outputs and therefore focused on particular application domains [38, 39].

4. Aiding designers in evaluating AI outputs. In recent years, technology companies have proposed more than a dozen human-AI interaction principles and guidelines (see a review in [47]). These guidelines covered a comprehensive list of design considerations such as "make clear how well the system can do what it can do" [4] and "design graceful failure recovery" [21].

5. Creating AI-specific design processes. Some researchers have proposed that AI may require design processes less focused on one group of users, and instead on many user groups and stakeholders [15]; processes focused less on fast, iterative prototyping, and instead on existing datasets and functioning AI systems [51]; or processes focused less on one design as the final deliverable to engineers, and instead on closer, more frequent collaborations [19].

These themes demonstrated the remarkable heterogeneity of approaches researchers have taken to address the challenges around human-AI interaction design. Similar to most design methods published within HCI research, we found no empirical evaluations of the proposed design tools, guidelines, or workflows. It is difficult to control for and measure improvements in a design process to show that a method is producing better designs. Throwing AI into the mix only seems to increase this challenge.

THREE QUESTIONS FOR PROVOCATION

We wanted to articulate whether, why, and how AI is distinctly difficult to design. The preceding review of related work revealed a remarkable set of insights and approaches to this complex problem space. Now we step back and critically examine this rich body of research in order to more holistically understand AI's resistance to design innovation. What has research missed? Can we see gaps or emerging opportunities across this work? Our reflection on the related work led to three provocative questions. These questions served as a springboard for rethinking how we might further advance our understanding of AI's design challenges.

What is AI?

One critical question has been absent from the research discourse around human-AI interaction: What is AI? Or rather, what should count as AI as it relates to HCI and UX design? Prior literature has used a range of poorly-defined terms, such as machine learning systems, intelligent/smart systems, AI-infused systems, and more. The research discourse on understanding machine intelligence as a technological material is sometimes mixed with intelligence as an experiential factor.

Locating the elusive concept of AI is difficult. What is commonly referred to as AI encompasses many disconnected technologies (e.g., decision trees, Bayesian classifiers, computer vision, etc.). The technical boundary of AI, even in AI research communities, is disputed and continuously evolving [46, 50]. More importantly, an actionable, designerly understanding of AI is likely to be very different from a technical definition that guides algorithmic advances.

Yet discussing AI's design challenges without any bounding seems problematic. What makes a challenge distinctly AI and not a part of the many challenges designers regularly face in HCI and UX work? Current research does not always make this distinction. For example, Amershi et al. systematically evaluated 20 popular AI products and proposed a set of guidelines for designing human-AI interactions [4]. These guidelines include "make clear what the system can do" and "support efficient error correction". These seem important to AI systems, but they also seem to be issues that designers must address in systems with no AI. What is less clear is if AI requires specific considerations in these areas.

What Are AI's Capabilities and Limits?

Designers need to understand the capabilities and limitations of a technology in order to know the possibilities it offers for design [17]. Engineers create new technological capabilities; designers create new, valuable products and services with existing technological capabilities and limitations [34].

Interestingly, AI's capabilities and limitations have not been the focus of current research. Instead, most work has focused on getting designers to understand how AI functions (theme 1 above). This is quite different from the traditional ways of moving new technology from research to design practice,
which assume designers do not need to understand the technical specifics of the technology. In addition, research has produced many rule-based and Wizard of Oz simulators to help designers better understand AI's design opportunities (themes 2 and 3). Little is known about whether these systems can realistically sensitize designers to AI's limitations. This motivates the question: Can an articulation of AI's capabilities foster a more incisive examination of its design challenge?

Why Is Prototyping AI Difficult?

AI brings challenges to almost all stages of a typical design process. However, the proposed AI design methods and tools have mostly focused on the two ends of this creative process (Figures 1 and 2): either helping designers to understand what AI is and can do generally, or enhancing the evaluation of the final design. The central activities of an interaction design process, i.e. sketching and prototyping, are under-explored. Research through Design (RtD) projects are rare when it comes to designing and innovating human-AI interaction [51].

Sketching and prototyping may constitute a fruitful lens for understanding AI's design challenges. They are cornerstones of any design process. It is through sketching and prototyping that designers understand what the technology is and can do, engage in creative thinking, and assess and improve on their designs. Interrogating why it is difficult to abstract AI-powered interactions into sketches and prototypes may shed light on how the other tangled design challenges relate to each other.

In this work, AI refers to computational systems that interpret external data, learn from such data, and use those learnings to achieve specific goals and tasks through flexible adaptation. [27]

Importantly, we did not intend to draw a technical boundary of what counts as AI here. We also do not consider this definition as valuable for HCI practitioners in working with AI. Instead, we used this definition only to examine whether the systems that are technically considered as AI indeed require new HCI design methods. For example, this definition describes AI as "learning" from data, yet does not specify what counts as "learning." (It remains an issue of debate in technical AI communities.) Therefore, in our synthesis, we considered the challenges designers reported in working with a full range of data-driven systems, including machine learning, classic expert systems, crowdsourcing, etc. We then examined whether the challenges differ across the spectrum, from systems that we all agree "learned" from data to those that we all agree did not. This way, we started to separate the design challenges that are unique to AI from those HCI routinely copes with.

UX Design Processes as Data

Within this bounding, we curated a set of AI design processes from our own research, design, and teaching experience. All projects described below except teaching have been published at DIS and CHI. We re-analyzed the data collected across these projects for the purpose of this work. Below is a brief overview of these projects.
then provided students with a dataset and a demonstrational AI system. Students were asked to design new products/services with these materials for an enterprise client. 26 HCI Master's students from two universities attended the workshops. All of them had little to no technical AI background. Throughout the series, we experimented with different ways of introducing AI. We observed how students used the AI technical knowledge in their design, where and how they struggled, and which challenges they were able to resolve with known design methods.

We also taught a one-semester design studio course: Designing AI Products and Services. Approximately 40 undergraduate and master's students took the course. About half of them had a computer science or data science background. In comparison to the workshops, the course allowed us to observe students working with a more diverse set of AI systems and design tasks, e.g. designing with the crowd as a proxy for AI, designing simple UI adaptations, and designing natural language interactions.

Data Analysis

With this diverse set of design processes and observations, we synthesized a framework meant to give structure to the many challenges around human-AI interaction design. We started by proposing many themes that might summarize these challenges. We then analyzed the emergent themes via affinity diagramming, with a focus on the characteristics of AI that may scaffold a full range of design challenges. Specifically, we critiqued these frameworks based on three criteria:

• Analytical leverage: The framework should effectively scaffold a wide range of AI's design opportunities and challenges. It should help separate design challenges unique to AI from others;

• Explanatory power: The framework should help researchers articulate how a proposed design method/tool/workflow contributes to the challenges of human-AI interaction design, and the limits of its generalizability;

• Constructive potential: The framework should not only serve as a holder of AI's known challenges and solutions; it should also provide new insights for future research.

We proposed and discussed more than 50 thematic constructs and frameworks. The three authors, an external faculty member, and an industry researcher participated in this process. All have spent at least 5 years researching AI and HCI. We also presented this work to two research groups. One included about 40 HCI researchers. The other included 12 machine learning researchers. They provided additional valuable critiques and helped us refine the framework.

THE FRAMEWORK

Our synthesis identified two attributes of AI that are central to the struggles of human-AI interaction design: capability uncertainty (uncertainties surrounding what the system can do and how well it performs) and output complexity (complexity of the outputs that the system might generate). Both dimensions function along a continuum. Together they form a valuable framework for articulating the challenges of human-AI interaction. This section describes the framework. In the next section, we demonstrate its usefulness.

Two Sources of AI Design Complexity

Capability Uncertainty

When speaking of the capabilities of AI, we broadly refer to the functionality AI systems can afford (e.g. detect spam emails, rank news feeds, find optimal driving routes), how well the system performs, and the kinds of errors it produces. The capabilities of AI are highly uncertain. We illustrate this by walking through the lifetime of an AI system, moving from an emergent algorithmic capability in AI research labs to situated user experience in the wild (Figure 3, left to right).

AI's capability uncertainty is at its peak in the early design ideation stage, when designers work to understand what design possibilities AI can offer generally. This is not easy because there exists no catalog of available AI capabilities. What might seem like a blue-sky AI design idea may suddenly become possible because of a newly available dataset. The performance of a deployed AI system can constantly fluctuate and diverge when it gains new data to improve its learning. This great uncertainty in AI's capabilities makes it difficult for designers to evaluate the feasibility of their emergent ideas, thereby hindering their creative processes.

The range of AI's available capabilities includes more than the capabilities of existing AI systems. It includes any AI things that are technically feasible. When envisioning AI systems that do not yet exist, designers face additional capability uncertainties. For example, designers may choose to harvest their own dataset from users' traces of interaction. This approach gives designers a relatively high degree of control over the data they will eventually work with. However, it is often very difficult to estimate how long it might take to collect enough high-quality data and to achieve the intended functionality. Designers frequently worked with user-generated data in order to understand available AI capabilities. To understand AI's capabilities, to a great extent, is to understand this gap between what the data appear to promise and what the AI system built from that data can concretely achieve. As one expert designer we interviewed describes: To understand what AI can do is to conceptualize "a funnel of what (data and/or system) exists and what is possible." [53]

Alternatively, designers may choose to leverage existing AI libraries or pre-built models to address their design problem at hand. These systems free designers from the data troubles and allow them to get a felt experience of the AI interactions. Unfortunately, these toolkits represent a very narrow subset of the whole landscape of AI capabilities.

What AI can do for a UX problem at hand becomes clearer once a functioning AI system is built. For most systems trained on self-contained datasets, designers can measure their performance and error modes, and then make design choices accordingly. However, this performance is limited by any biases present in a dataset and should only be viewed as an initial estimate (system performance in Figure 3).

Some AI systems continue to learn from new data after deployment (labeled as "deployed system performance over time" in Figure 3). In the ideal case, the system will "grow," integrating new insights from new data and adapting flexibly to more varieties of users and use contexts. Unfortunately, the new data might also drive system performance in the wrong direction. Tay, the Twitter bot, provides an extreme example [36]. More typically, the system's performance improves for users and use contexts that have produced rich data. It performs worse for less frequent users and less typical situations. That the system capability can constantly evolve, fluctuate, and diversify is another part of AI's capability uncertainty.
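One concrete way to surface this kind of divergence is to slice a deployed model's error rate by user segment rather than reporting a single aggregate number. The sketch below is purely illustrative; the log format, segment labels, and flagging threshold are hypothetical.

```python
# Illustrative sketch: compare a model's error rate across user segments
# to see where an evolving system may be quietly getting worse.
from collections import defaultdict

# Hypothetical interaction log: (user_segment, model_was_correct)
interaction_log = [
    ("frequent_user", True), ("frequent_user", True), ("frequent_user", False),
    ("infrequent_user", False), ("infrequent_user", False), ("infrequent_user", True),
]


def error_rate_by_segment(log):
    """Group logged predictions by segment and compute each segment's error rate."""
    totals, errors = defaultdict(int), defaultdict(int)
    for segment, correct in log:
        totals[segment] += 1
        errors[segment] += 0 if correct else 1
    return {segment: errors[segment] / totals[segment] for segment in totals}


for segment, rate in error_rate_by_segment(interaction_log).items():
    flag = "  <- capability diverging for this group" if rate > 0.5 else ""
    print(f"{segment}: error rate {rate:.0%}{flag}")
```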
Finally, user profiles and use contexts could also impact an AI system's capability. Many context-aware and personalization systems fall into this category. Consider the social media news feed ranker, Amazon shopping recommendations, and ride-hailing apps' driver-rider matching as examples. It is not difficult to conceptualize what these systems can do in general (e.g. ranking news, recommending items); however, it is no trivial task to envision, for a particular user in a particular use context, what error the AI system might make, and how the user might perceive that error in-situ. Anticipating the situated, user-encountered capability of AI is difficult, yet it is fundamental to user experience design.

Output Complexity

The second source of human-AI interaction challenges concerns what an AI system produces as a possible output. While capability uncertainty is responsible for the HCI design challenges around understanding what AI can do, AI's output complexity affects how designers conceptualize the system's behaviors in order to choreograph its interactions.

Many valuable AI systems generate a small set of possible outputs. Designing interactions for these systems is similar to designing for non-AI systems that generate probabilistic outputs. A face detection tool, for example, outputs either "face" or "not face." To design its interactions, the designer considers four scenarios: when a face is correctly detected (true positive), when no face is detected (true negative), when there is no face and a face is mistakenly detected (false positive), and when the image contains a face but the system fails to detect it (false negative). Designers consider each condition and design accordingly.
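As a sketch of how such a bounded output space stays tractable, the snippet below simply enumerates the four conditions for a hypothetical face detector; the design notes are placeholder suggestions, not prescriptions from the literature.

```python
# The full output space of a binary face detector reduces to four design
# conditions. The interaction notes are hypothetical placeholders.
DESIGN_CONDITIONS = {
    # (face actually present, face detected): scenario the designer must cover
    (True, True): "true positive - e.g. offer to tag the detected face",
    (False, False): "true negative - e.g. show no face-related UI",
    (False, True): "false positive - e.g. make dismissing the suggestion easy",
    (True, False): "false negative - e.g. let the user add a tag manually",
}


def design_condition(face_present: bool, face_detected: bool) -> str:
    """Map one model outcome to the interaction scenario it triggers."""
    return DESIGN_CONDITIONS[(face_present, face_detected)]


for outcome, note in DESIGN_CONDITIONS.items():
    print(outcome, "->", note)
```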
When designing systems that produce many possible outputs, sketching and prototyping become more complex and cognitively demanding. Imagine designing the interactions of a driving route recommender. How many types of errors could the recommender possibly produce? How might a user encounter, experience, and interpret each kind of error, in various use contexts? How can interaction design help the user to recover from each error elegantly? Some simulation-based methods or iML tools can seem necessary for prototyping and accounting for the route recommender's virtually infinite variability of outputs. The route recommender exemplifies the many AI systems that produce open-ended, adaptive outputs. The traditional, manual sketching and prototyping methods struggle to fully capture the UX ramifications of such systems.

The system outputs that entail the most design complexities are those that cannot be simulated. Consider Siri as an example. Similar to route recommenders, Siri can generate infinite possibilities of outputs. Yet unlike route recommenders, the relationship between Siri's inputs and outputs follows complex patterns that cannot be concisely described. As a result, rule-based simulators cannot meaningfully simulate Siri's utterances; nor can a human wizard. We refer to such AI system outputs as "complex."

Notably, output complexity is not output unpredictability. While prior research often viewed AI systems' unpredictable errors as causing UX troubles, we argue that AI's output complexity is the root cause. Let us illustrate this by considering how designers might account for AI errors when designing two different conversational systems. One is Siri. The other is a system that always replies to user requests with a random word picked from a dictionary. While highly unpredictable, the interactions of the latter system can be easily simulated by a random word generator. Then, following a traditional prototyping process, designers can start to identify and mitigate the AI's costly errors. In contrast, Siri's outputs are only quasi-random and therefore resist abstraction or simulation. To date, it remains unclear how to systematically prototype the UX of such systems in order to account for their breakdowns.
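To make the contrast concrete, the random-word responder can be fully simulated in a few lines, whereas no comparably small program can stand in for Siri's quasi-random input-output mapping. The word list below is a hypothetical stand-in for a full dictionary.

```python
# A fully simulable "unpredictable" conversational system: every request
# receives a uniformly random dictionary word as its reply.
import random

DICTIONARY = ["apple", "quorum", "lantern", "seven", "mauve"]  # stand-in word list


def random_word_bot(user_request: str) -> str:
    """Reply to any request with a random word, ignoring the request itself."""
    return random.choice(DICTIONARY)


# A designer can run thousands of simulated turns and inspect every kind of
# breakdown this system could produce; Siri's outputs admit no such shortcut.
for request in ["What's the weather?", "Call mom", "Play some jazz"]:
    print(request, "->", random_word_bot(request))
```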
Figure 3: The conceptual pathway translating between AI's capabilities and thoughtful designs of human-AI interaction. AI's capability uncertainty and output complexity add additional steps (the colored segments) to a typical HCI pathway, making some systems distinctly difficult to design. Designers encounter these challenges from left to right when taking a technology-driven innovation approach, and right to left when following a user-centered design process.
design as usual and provide wireframes as a deliverable to engineers at the end of their design process.

Language toxicity detection is a complex technical problem at the frontier of AI research. However, because the system's capabilities are bounded and the outputs are simple, existing HCI design methods are sufficient in supporting designers in sketching, prototyping, and assessing its interactions. Language toxicity detection exemplifies level one systems; they are valuable, low-hanging fruits for HCI practitioners to integrate into today's products and services.

Level Four: Designing Evolving, Adaptive Systems

Level four systems learn from new data even after deployment. They also produce adaptive, open-ended outputs that resist abstraction. Search engines, newsfeed rankers, automated email replies, and a recommender system that suggests "items you might like" would all fit in this category. In designing such systems, designers can encounter a full range of human-AI interaction design challenges. Consider the face recognition system within a photos app. It learns from the photos the user uploaded, clusters similar faces across photos, and automatically tags each face with the name inferred from the user's previous manual tags.

• Challenges in understanding AI capabilities: The system's performance and error modes are likely to change as it learns from new images and tags. Therefore it is difficult to anticipate what the system can reliably do, and when and how it is likely to fail. This, in turn, makes it difficult to design appropriate interactions for these scenarios.

• Challenges in envisioning novel and technically feasible designs of the technology: Re-imagining many new uses of a face-recognition-and-tagging tool – beyond tagging people on photos – can be difficult. This is because its capabilities are highly evolved and specialized for its intended functionality and interactions.

• Challenges in iterative prototyping and testing: The system's capabilities evolve over time as users contribute more images and manual tags, challenging the very idea of rapid prototyping.

• Challenges in collaborating with engineers: The system requires a closer and more complex HCI-AI collaboration than in a traditional double-diamond process. Engineers and designers need to collaborate on understanding how the face-recognition performance will evolve with users' newly uploaded photos and tags, how to mitigate the AI's potential biases and errors, as well as how to detect AI errors from user interactions so as to improve system learning.

Face recognition and tagging are a relatively mature technology that many people use every day. However, because its capabilities are constantly evolving and the outputs are diverse, systematically sketching, prototyping, and assessing the UX of face tagging remains challenging. It exemplifies level four systems; these are opportune areas for HCI and RtD researchers to study human-AI interaction and design, without getting deeply involved in technological complexities.
Figure 6: An example of the framework in use. Using the framework, researchers can easily outline the problem space of a human-AI interaction issue of their interest, for example, the issue of AI fairness.
The Anatomy of AI's HCI Issue

For researchers who study specific human-AI interaction design issues (e.g. fairness, intelligibility, users' senses of control, etc.), the proposed framework gives a preliminary structure to these vast issues. Take as an example the challenges surrounding accounting for AI biases, a challenge that many critical AI systems face across application domains such as healthcare and criminal justice. Building a "fair" AI application is widely considered as difficult, due to the complexity both in defining fairness goals and in algorithmically achieving the defined goals. Prior research has been addressing these challenges by promoting interaction design guidelines [4, 35].

Our framework provides a more holistic structure to the problem space of "AI fairness" (Figure 6). It illustrates that the current work has mostly focused on building "a fair AI system pre-deployment"; that algorithmic fairness is only part of the whole "AI fairness" problem space. There is a real need for HCI and AI research in collaboratively translating fairness as an optimization problem into a feature of the AI socio-technical system (Figure 6, blue segment), and into a situated user experience of fairness (yellow segment). The framework suggests a tentative agenda for these important future research topics.

Implications for Design Methods and Tools

Finally, the proposed framework intends to allow for a more principled discussion on how to support human-AI interaction design practice. It can help identify the core challenges AI brings to HCI practice across application domains. It can help researchers to articulate the contribution of their emergent AI design methods/tools/workflows as well as their scope of generalizability. Finally, it can provide new insights into how to address the remaining challenges.

We consider UX prototyping methods of AI as an example.

1. Identifying root challenges. Current research typically attributes the difficulty of prototyping AI to AI's technical complexity or reliance on big data. However, HCI routinely grapples with complex, resource-intensive technologies using simple prototypes. What makes AI unique? Our framework suggests that the root challenges are that AI's capabilities are adaptive and its outputs can autonomously diverge at a massive scale. Such systems problematize the conventional HCI prototyping methods that treat technology's affordances as bounded and interactions as prescriptive. These methods can work when prototyping AI as an optimization system in the lab (level one). They could fail in fully addressing AI's ramifications over time as a real-world, sociotechnical system.

2. Articulating the contributions and limits of emergent design methods/tools/processes. To make prototyping human-AI interaction easier, researchers have created simple rule-based simulators [49, 7] as AI prototyping tools. Mapping the characteristics of rule-based interactions onto the AI design complexity map (Figure 5), it becomes evident that rule-based simulators are most effective in prototyping level 1-2 systems. They can be particularly valuable for systems that generate a broad set of outputs (level 2), where traditional, manual prototyping methods struggle. However, rule-based simulators cannot easily prototype systems that autonomously learn from user-generated data (levels 3-4). These are living, sociotechnical systems; the rules that map their inputs to outputs evolve in complex ways over time.

3. Providing new insights for future research. Framing level 3 and 4 AI systems as living, sociotechnical systems reveals new insights into how we might more effectively prototype their interactions. For example, CSCW research has investigated how to prototype workplace knowledge sharing systems whose affordances co-evolve with their users' behaviors, the interactions among their users, and the organizational contexts at large [31]. These, too, are living, sociotechnical systems with uncertain capabilities and complex outputs. This body of work, though not typically considered as related to AI, could offer a valuable starting place for considering how we might design and prototype human-AI interactions in the wild, over time. In this light, the proposed conceptual framework offers actionable insights for addressing the challenges of prototyping AI methodologically.

CONCLUSION AND ACKNOWLEDGEMENT

AI plays an increasingly important role in HCI and UX design. Today, designers face many challenges in working with AI. Prior research often attributed these challenges to AI's algorithmic complexity and unpredictable system behaviors. Our synthesis of these challenges provided an alternative view to this common assumption. We encourage fellow researchers to critique, evaluate, and improve on this proposed framework based on their respective design and research experiences.

The contents of this paper were partially developed under a grant from the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR grant number 90REGE0007). The first author was also supported by the Center for Machine Learning and Health (CMLH) Fellowships in Digital Health and the 2019 Microsoft Research Dissertation Grant. We thank Karey Helms, Saleema Amershi, and other contributing researchers for providing valuable inputs on the framework. We thank Eunki Chung and Nikola Banovic for their support of the Designing AI workshops.
[22] Patrick Hebron. 2016a. Machine learning for designers. O'Reilly Media.
[23] Patrick Hebron. 2016b. New York University Tisch School of the Arts Course: Learning Machines. (2016). http://www.patrickhebron.com/learning-machines/
[24] Karey Helms. 2019. Do You Have to Pee?: A Design Space for Intimate and Somatic Data. In Proceedings of the 2019 Designing Interactive Systems Conference (DIS '19). ACM, New York, NY, USA, 1209–1222. DOI: http://dx.doi.org/10.1145/3322276.3322290
[25] Douglas R Hofstadter and others. 1979. Gödel, Escher, Bach: An Eternal Golden Braid. Vol. 20. Basic Books, New York.
[26] Lars Erik Holmquist. 2017. Intelligence on tap: artificial intelligence as a new design material. Interactions 24, 4 (2017), 28–33.
[27] Andreas Kaplan and Michael Haenlein. 2019. Siri, Siri, in my hand: Who's the fairest in the land? On the interpretations, illustrations, and implications of artificial intelligence. Business Horizons 62, 1 (2019), 15–25.
[28] Claire Kayacik, Sherol Chen, Signe Noerly, Jess Holbrook, Adam Roberts, and Douglas Eck. 2019. Identifying the Intersections: User Experience + Research Scientist Collaboration in a Generative Machine Learning Interface. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA '19). ACM, New York, NY, USA, Article CS09, 8 pages. DOI: http://dx.doi.org/10.1145/3290607.3299059
[29] Scott R. Klemmer, Anoop K. Sinha, Jack Chen, James A. Landay, Nadeem Aboobaker, and Annie Wang. 2000a. Suede: A Wizard of Oz Prototyping Tool for Speech User Interfaces. In Proceedings of the 13th Annual ACM Symposium on User Interface Software and Technology (UIST '00). ACM, New York, NY, USA, 1–10. DOI: http://dx.doi.org/10.1145/354401.354406
[30] Scott R. Klemmer, Anoop K. Sinha, Jack Chen, James A. Landay, Nadeem Aboobaker, and Annie Wang. 2000b. Suede: A Wizard of Oz Prototyping Tool for Speech User Interfaces. In Proceedings of the 13th Annual ACM Symposium on User Interface Software and Technology (UIST '00). ACM, New York, NY, USA, 1–10. DOI: http://dx.doi.org/10.1145/354401.354406
[31] Esko Kurvinen, Ilpo Koskinen, and Katja Battarbee. 2008. Prototyping social interaction. Design Issues 24, 3 (2008), 46–57.
[32] Shane Legg, Marcus Hutter, and others. 2007. A collection of definitions of intelligence. (2007).
[33] Brian Y Lim, Anind K Dey, and Daniel Avrahami. 2009. Why and why not explanations improve the intelligibility of context-aware intelligent systems. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2119–2128.
[34] Panagiotis Louridas. 1999. Design as bricolage: anthropology meets design thinking. Design Studies 20, 6 (1999), 517–535.
[35] Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* '19). Association for Computing Machinery, New York, NY, USA, 220–229. DOI: http://dx.doi.org/10.1145/3287560.3287596
[36] Gina Neff and Peter Nagy. 2016. Talking to Bots: Symbiotic Agency and the Case of Tay. International Journal of Communication 10 (2016).
[37] William Odom and Tijs Duel. 2018. On the Design of OLO Radio: Investigating Metadata As a Design Material. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18). ACM, New York, NY, USA, Article 104, 9 pages. DOI: http://dx.doi.org/10.1145/3173574.3173678
[38] Kayur Patel, James Fogarty, James A. Landay, and Beverly Harrison. 2008. Examining difficulties software developers encounter in the adoption of statistical machine learning. In 23rd AAAI Conference on Artificial Intelligence and the 20th Innovative Applications of Artificial Intelligence Conference. Chicago, IL, United States, 1563–1566.
[39] Kayur Dushyant Patel. 2012. Lowering the Barrier to Applying Machine Learning. Ph.D. Dissertation. University of Washington.
[40] Laurel D. Riek. 2012. Wizard of Oz Studies in HRI: A Systematic Review and New Reporting Guidelines. J. Hum.-Robot Interact. 1, 1 (July 2012), 119–136. DOI: http://dx.doi.org/10.5898/JHRI.1.1.Riek
[41] Eric Ries. 2011. The lean startup: How today's entrepreneurs use continuous innovation to create radically successful businesses. Crown Books.
[42] Antonio Rizzo, Francesco Montefoschi, Maurizio Caporali, Antonio Gisondi, Giovanni Burresi, and Roberto Giorgi. 2017. Rapid Prototyping IoT Solutions Based on Machine Learning. In Proceedings of the European Conference on Cognitive Ergonomics 2017 (ECCE 2017). ACM, New York, NY, USA, 184–187. DOI: http://dx.doi.org/10.1145/3121283.3121291
[43] Albrecht Schmidt. 2000. Implicit human computer interaction through context. Personal Technologies 4, 2-3 (2000), 191–199.
[44] Lisa Stifelman, Adam Elman, and Anne Sullivan. 2013. Designing Natural Speech Interactions for the Living Room. In CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI EA '13). ACM, New York, NY, USA, 1215–1220. DOI: http://dx.doi.org/10.1145/2468356.2468574
[45] Maria Stone, Frank Bentley, Brooke White, and Mike Shebanek. 2016a. Embedding User Understanding in the Corporate Culture: UX Research and Accessibility at Yahoo. In Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems. ACM, 823–832.
[46] Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, and others. 2016b. Artificial intelligence and life in 2030. One Hundred Year Study on Artificial Intelligence: Report of the 2015-2016 Study Panel (2016).
[47] Jennifer Sukis. 2019. AI Design & Practices Guidelines (A Review). https://medium.com/design-ibm/ai-design-guidelines-e06f7e92d864. (2019).
[48] Mary Treseler. 2017. Designing with Data: Improving the User Experience with A/B Testing. O'Reilly Media, Chapter Designers as data scientists. http://radar.oreilly.com/2015/05/designers-as-data-scientists.html
[49] Philip van Allen. 2018. Prototyping Ways of Prototyping AI. Interactions 25, 6 (Oct. 2018), 46–51. DOI: http://dx.doi.org/10.1145/3274566
[50] Wikipedia contributors. 2019. Artificial intelligence — Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/wiki/Artificial_intelligence. (2019).
[51] Qian Yang, Nikola Banovic, and John Zimmerman. 2018. Mapping Machine Learning Advances from HCI Research to Reveal Starting Places for Design Research. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems - CHI '18 (CHI '18). ACM.
[52] Qian Yang, Justin Cranshaw, Saleema Amershi, Shamsi T. Iqbal, and Jaime Teevan. 2019. Sketching NLP: A Case Study of Exploring the Right Things To Design with Language Intelligence. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Article 185, 12 pages. DOI: http://dx.doi.org/10.1145/3290605.3300415
[53] Qian Yang, Alex Scuito, John Zimmerman, Jodi Forlizzi, and Aaron Steinfeld. 2018a. Investigating How Experienced UX Designers Effectively Work with Machine Learning. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS '18). ACM, New York, NY, USA, 585–596. DOI: http://dx.doi.org/10.1145/3196709.3196730
[54] Qian Yang, Alex Scuito, John Zimmerman, Jodi Forlizzi, and Aaron Steinfeld. 2018b. Investigating How Experienced UX Designers Effectively Work with Machine Learning. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS '18). ACM, New York, NY, USA, 585–596. DOI: http://dx.doi.org/10.1145/3196709.3196730
[55] Qian Yang, Aaron Steinfeld, and John Zimmerman. 2019. Unremarkable AI: Fitting Intelligent Decision Support into Critical, Clinical Decision-Making Processes. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). ACM, New York, NY, USA, Article 238, 11 pages. DOI: http://dx.doi.org/10.1145/3290605.3300468
[56] Qian Yang, Jina Suh, Nan-Chen Chen, and Gonzalo Ramos. 2018. Grounding Interactive Machine Learning Tool Design in How Non-Experts Actually Build Models. In Proceedings of the 2018 Designing Interactive Systems Conference (DIS '18). ACM, New York, NY, USA, 573–584. DOI: http://dx.doi.org/10.1145/3196709.3196729
[57] Qian Yang, John Zimmerman, Aaron Steinfeld, Lisa Carey, and James F Antaki. 2016b. Investigating the Heart Pump Implant Decision Process: Opportunities for Decision Support Tools to Help. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 4477–4488.
[58] Qian Yang, John Zimmerman, Aaron Steinfeld, and Anthony Tomasic. 2016a. Planning Adaptive Mobile Experiences When Wireframing. In Proceedings of the 2016 ACM Conference on Designing Interactive Systems - DIS '16. ACM Press, Brisbane, QLD, Australia, 565–576. DOI: http://dx.doi.org/10.1145/2901790.2901858
[59] John Zimmerman, Anthony Tomasic, Charles Garrod, Daisy Yoo, Chaya Hiruncharoenvate, Rafae Aziz, Nikhil Ravi Thiruvengadam, Yun Huang, and Aaron Steinfeld. 2011. Field Trial of Tiramisu: Crowd-Sourcing Bus Arrival Times to Spur Co-Design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '11). Association for Computing Machinery, New York, NY, USA, 1677–1686. DOI: http://dx.doi.org/10.1145/1978942.1979187
[60] John Zimmerman, Anthony Tomasic, Isaac Simmons, Ian Hargraves, Ken Mohnkern, Jason Cornwell, and Robert Martin McGuire. 2007. Vio: A Mixed-Initiative Approach to Learning and Automating Procedural Update Tasks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '07). Association for Computing Machinery, New York, NY, USA, 1445–1454. DOI: http://dx.doi.org/10.1145/1240624.1240843
[61] Lamia Zouhaier, Yousra Bendaly Hlaoui, and Leila Jemni Ben Ayed. 2013. Building adaptive accessible context-aware for user interface tailored to disable users. In 2013 IEEE 37th Annual Computer Software and Applications Conference Workshops. IEEE, 157–162.