Press On: Principles of Interaction Programming
Harold Thimbleby
The MIT Press (2007)
All rights reserved. No part of this book may be reproduced in any form by any
electronic or mechanical means (including photocopying, recording, or
information storage and retrieval) without permission in writing from the
publisher.
MIT Press books may be purchased at special quantity discounts for business or
sales promotional use. For information, please email [email protected]
or write to Special Sales Department, The MIT Press, 55 Hayward Street,
Cambridge, MA 02142.
This book was set in Palatino and Computer Modern Sans Serif by the author and
was printed and bound in the United States of America.
Thimbleby, Harold.
Press on : principles of interaction programming / Harold Thimbleby.
p. cm.
Includes index.
ISBN 978-0-262-20170-4 (hardcover : alk. paper)
1. Computer programming. 2. Human-computer interaction. I. Title.
QA76.6.T4493 2007
005.1—dc22
2007000519
Outline message
Part I Context
Interactive systems and devices do not fulfill their potential for economic,
social, psychological, and technical reasons.
Part II Principles
Computer science provides many practical creative ideas and theories that can
drive effective interaction programming—defined in box 0.1, “What is
interaction programming?” (p. 4).
Outline contents
Part I Context
1 Dodgy magic 11
2 Our perception 39
3 Reality 61
4 Transition to interaction programming 91
Part II Principles
5 Communication 119
6 States and actions 163
7 Statecharts 201
8 Graphs 225
9 A framework for design 273
10 Using the framework 325
11 More complex devices 367
Full contents
Outline message vi
Outline contents vii
Full contents viii
List of boxes xv
Acknowledgments xvii
0 Introduction 1
0.1 The Press On sandwich . . . . . . . . . . . . . . . . . . . . . . . . . 1
0.1.1 Is there really a problem? 1 – 0.1.2 How can we design better? 2 –
0.1.3 How do we scale up to large design problems? 2.
0.2 Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.3 How to teach interaction programming . . . . . . . . . . . . . . . 4
0.4 Devices, applications and product references . . . . . . . . . . . . 6
0.4.1 Some Press On principles 7.
Part I Context
1 Dodgy magic 11
1.1 The architecture analogy . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2 The conventional wisdom . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Any sufficiently advanced technology is magic . . . . . . . . . . . 15
1.4 We’ve been here before . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Computers as consumer products . . . . . . . . . . . . . . . . . . . 17
1.6 Externalizing use cost . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.7 The productivity paradox . . . . . . . . . . . . . . . . . . . . . . . 21
1.8 Moore’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.9 Obsolescence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.10 Reduce, reuse, recycle, rethink . . . . . . . . . . . . . . . . . . . . . 29
1.11 Metcalfe’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
1.12 The tragedy of the commons . . . . . . . . . . . . . . . . . . . . . . 33
1.13 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.13.1 Some Press On principles 35 – 1.13.2 Further reading 35.
2 Our perception 39
2.1 Pyramids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.2 The lottery effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 The media equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Drama . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.5 Selective perception of user surveys . . . . . . . . . . . . . . . . . 46
2.6 Software warranties . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.7 Feature interaction and combinatorial explosion . . . . . . . . . . 49
2.8 Demanding ease of use . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9 Cognitive dissonance . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.10 Persuasion and influence . . . . . . . . . . . . . . . . . . . . . . . . 54
2.11 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.11.1 Some Press On principles 58 – 2.11.2 Further reading 58.
3 Reality 61
3.1 Calculators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.2 Televisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.3 Cameras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.4 Personal video recorders . . . . . . . . . . . . . . . . . . . . . . . . 66
3.5 Microwave ovens . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.6 Mobile phones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.7 Home theater . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.7.1 The projector 68 – 3.7.2 The sound system 72.
3.8 Feature interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.8.1 Telephone conferencing 74 – 3.8.2 Calculators 74 – 3.8.3 Washing
machines 75 – 3.8.4 Mobile devices 76.
3.9 Ticket machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.10 Car radios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.11 And so on, and so on, and so on . . . . . . . . . . . . . . . . . . . . . 84
3.12 Not just difficult, but different . . . . . . . . . . . . . . . . . . . . . 84
3.13 User problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.14 Even “simple” design isn’t . . . . . . . . . . . . . . . . . . . . . . . 86
3.15 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.15.1 Some Press On principles 89 – 3.15.2 Further reading 89.
Part II Principles
5 Communication 119
5.1 The beginnings of modern communication . . . . . . . . . . . . . 119
5.1.1 Costs of improving designs 124 – 5.1.2 From Morse code to
Huffman codes 126 – 5.1.3 The structure of codes 126 –
5.1.4 Implementing Huffman codes 130 – 5.1.5 Technical details 132.
5.2 Redesigning a mobile phone handset . . . . . . . . . . . . . . . . . 134
5.2.1 Estimating probabilities from a design 138 – 5.2.2 Creating a
Huffman code for the phone 142.
5.3 Visualizing design improvements . . . . . . . . . . . . . . . . . . . 144
5.4 Make frequent things easy, unlikely things hard . . . . . . . . . . 145
5.5 Efficient but more usable techniques . . . . . . . . . . . . . . . . . 147
5.5.1 Modelessness 149 – 5.5.2 Adapting to the particular user 151 –
5.5.3 Context of use 151 – 5.5.4 Devices don’t always need to be easier
to use 152 – 5.5.5 Different techniques for different tasks 153.
5.6 Faxes and Huffman codes . . . . . . . . . . . . . . . . . . . . . . . 156
5.7 A computer science of interaction . . . . . . . . . . . . . . . . . . . 157
5.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
5.8.1 Some Press On principles 158 – 5.8.2 Further reading 159.
uniform help 189 – 6.3.9 Dynamic laws 189 – 6.3.10 Make generic
devices 189 – 6.3.11 Define your own laws 189.
6.4 A four-state worked example . . . . . . . . . . . . . . . . . . . . . 190
6.5 A larger worked example: an alarm clock . . . . . . . . . . . . . . 191
6.6 Doing your own drawings . . . . . . . . . . . . . . . . . . . . . . . 194
6.7 The public image problem . . . . . . . . . . . . . . . . . . . . . . . 195
6.8 The key design problem . . . . . . . . . . . . . . . . . . . . . . . . 195
6.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.9.1 Some Press On principles 198 – 6.9.2 Further reading 199.
7 Statecharts 201
7.1 Modes and state clusters . . . . . . . . . . . . . . . . . . . . . . . . 201
7.2 Statechart features . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
7.2.1 Clusters 202 – 7.2.2 And-states 203 – 7.2.3 History
connectors 205 – 7.2.4 Deep history connectors 206 – 7.2.5 Leaving
clusters 207 – 7.2.6 Delays and timeouts 207 – 7.2.7 Conditions 209 –
7.2.8 Joining, merging, and splitting connectors 210.
7.3 A worked example: a Sony TV . . . . . . . . . . . . . . . . . . . . 210
7.4 Statecharts for discussion . . . . . . . . . . . . . . . . . . . . . . . . 214
7.5 There is no right statechart . . . . . . . . . . . . . . . . . . . . . . . 215
7.6 Where did statecharts come from? . . . . . . . . . . . . . . . . . . 217
7.7 XML and the future of statecharts . . . . . . . . . . . . . . . . . . . 219
7.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
7.8.1 Some Press On principles 221 – 7.8.2 Further reading 221.
8 Graphs 225
8.1 Graphs and interaction . . . . . . . . . . . . . . . . . . . . . . . . . 227
8.1.1 The Chinese postman tour 228 – 8.1.2 The traveling salesman
problem 231 – 8.1.3 Coloring graphs 232 – 8.1.4 Coloring graphs to
highlight interesting features 234.
8.2 Mazes and getting lost in graphs . . . . . . . . . . . . . . . . . . . 235
8.3 Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
8.3.1 Spanning subgraphs 240.
8.4 User models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
8.5 The web analogy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
8.6 Graph properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
8.6.1 Connectivity 244 – 8.6.2 Complete graphs and cycles 246 –
8.6.3 Cliques and independent sets 247 – 8.6.4 Bridges and hinges 248.
8.7 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
8.7.1 Trees are for nesting 251 – 8.7.2 A city is not a tree 253 –
8.7.3 Getting around trees 254 – 8.7.4 Balanced and unbalanced
trees 256.
8.8 User manuals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.8.1 Spanning trees 260 – 8.8.2 Ordered trees 262.
8.9 Small worlds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
learn a system? 349 – 10.5.5 How do you crack a safe? 350.
10.6 Comparing designs . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
10.7 Putting analysis into products . . . . . . . . . . . . . . . . . . . . . 351
10.7.1 Building a manual into a device 352 – 10.7.2 “How to?”
questions 352 – 10.7.3 “Why not?” questions 354 – 10.7.4 Don’t be
idle 356 – 10.7.5 Weighting answers 358 – 10.7.6 Programming
wizards 359 – 10.7.7 Dynamic checks 361 – 10.7.8 General help
principles 362.
10.8 Complex devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
10.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
10.9.1 Some Press On principles 364 – 10.9.2 Further reading 365.
12.6.1 Make the design easy to change 428 – 12.6.2 Use simple
techniques as long as possible 429 – 12.6.3 Use sophisticated design
tools 431 – 12.6.4 Use proper compiler tools 432.
12.7 Make the device simple . . . . . . . . . . . . . . . . . . . . . . . . . 434
12.8 Know the user, and design accordingly . . . . . . . . . . . . . . . . 435
12.9 Exploit computer magic . . . . . . . . . . . . . . . . . . . . . . . . 436
12.10 If all else fails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
12.11 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
12.11.1 Some Press On principles 440 – 12.11.2 Further reading 441.
Index 491
List of boxes
0.1 What is interaction programming? . . . . . . . . . . . . . . . . . 4
1.1 The Code of Hammurabi . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Cargo cult computers . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 An obituary for a fax . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.4 Energy consumption . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1 Eliza meets Parry . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2 Personas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.3 Ralph Nader . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.1 Turning the Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.2 Design blindness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.1 Weapon salve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.1 Small screen interaction . . . . . . . . . . . . . . . . . . . . . . . . 124
5.2 The QWERTY effect . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.3 Information theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.4 Video plus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.5 Permissiveness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.6 Card sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.7 The principle of least effort . . . . . . . . . . . . . . . . . . . . . . 147
5.8 Dictionaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.1 Syringe pumps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.2 Timeouts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3 Mealy and Moore machines . . . . . . . . . . . . . . . . . . . . . 183
6.4 Bad user interfaces earn money . . . . . . . . . . . . . . . . . . . 191
7.1 Computer demonstrations at exhibitions . . . . . . . . . . . . . . 208
8.1 Chinese postman tools . . . . . . . . . . . . . . . . . . . . . . . . 229
8.2 Trees versus DAGs . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
8.3 Computers in 21 days . . . . . . . . . . . . . . . . . . . . . . . . . 261
8.4 Small world friendship graphs . . . . . . . . . . . . . . . . . . . . 266
8.5 Erdös numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
8.6 LaTeX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
9.1 Facilities layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
9.2 Button usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
10.1 Why personal video recorders? . . . . . . . . . . . . . . . . . . . 331
10.2 What “always” and “never” should mean . . . . . . . . . . . . . 337
Acknowledgments
Writing a book is a mammoth task, and as I’ve been working on this book many, many
people have made contributions—both encouraging and argumentative.
My wife, Prue, has supported me enthusiastically and has thought deeply and read and
commented on everything, as have my children Jemima, Isaac, Sam, and Will. All my
family proofread and gave fantastic comments on this book; I delight in them, and my
thanks to them goes way beyond what they have done for this book.
Thanks to my colleagues from around the world who have gone over a great deal too,
in writing and in conversation: including, especially, David Bainbridge, Tim Bell, George
Buchanan, Richard Butterworth, Brock Craft, Ian Crew, Penny Duquenoy, David Harel,
Michael Harrison, Lidia Oshlyansky, Jen Pearson, Simon Robinson, Yin Leng Theng, Chris
Whyley, and Ian Witten.
Some people have worked directly with me on implementation and getting things to
work, especially Paul Cairns, Paul Gillary, Jeremy Gow, Gary Marsden, Matt Jones, and
Will Thimbleby. I’m very grateful to Matthew Abbate and Robert Prior of MIT Press who
made this book finally happen. Sam Thimbleby drew the cartoons. Photographers are
acknowledged on individual photographs, except Prue Thimbleby and myself who did all
others.
My students over the years have suffered my teaching and greatly helped improve the
content. I thank my students at UCLIC, University College London Interaction Centre,
and my students at the Future Interaction Technology Laboratory in Swansea University,
www.fitlab.eu, who have helped me work out and refine this material over the years.
The nature of the material of this book makes it ideal for giving interactive lectures. As
Gresham Professor of Geometry, I was lucky to be able to give public lectures on some of the
material at Gresham College (an independent London institution, founded in 1597). Almost
all of this book has at one time or another, in one form or another, been presented to public
and school audiences: it’s young people who are going to change the world and make it a
better place—it’s also young people who are most likely to become fashion victims if they
do not notice or think about bad design. Of course, audiences like the constructive ways
you can identify and tackle bad design.
I have been supported in doing the underlying research by a Royal Society-Wolfson Re-
search Merit Award and various grants from the UK Engineering and Physical Sciences
Research Council, EPSRC. Gresham College kindly bought many gadgets to enhance my
lectures. All other gadgets I’ve either personally owned or used extensively, like the railway
ticket machine discussed in chapter 3, “Reality,” and chapter 12, “Grand design.”
This book was written in LaTeX, first using Textures, then TeXShop on the Apple Macintosh
with Dot, JavaScript, Lineform, Mathematica, MetaPost, and Perl—though the only lan-
guage you need to know is JavaScript (or C/C++/C#/Java). All the chapters discussing
programs used those programs to generate the diagrams and output attributed to them.
[Frontispiece photograph: Harold Thimbleby. Press on and coffee . . . ]
0
Introduction
Press On provides many ways of thinking clearly about the design of interactive
devices and provides many ways of thinking about interaction programming. It cov-
ers the computer science fundamentals of the design of interactive devices, which
are typically complex things with buttons, like mobile phones and photocopiers,
but they range from the very simple—like torches and guns—to the very complex,
like word processors and web browsers.
This book takes the view that we can build far better interactive devices than
we do and that much of the initiative for better design can come from clear engi-
neering creativity, knowledge, and perspectives based on sound computer science
principles. Some things are possible or not possible, and competent computer sci-
entists can work it out. Programmers can come up with solutions that would have
escaped users or designers with limited technical knowledge; programmers can
be more creative and more central in all areas of interaction design than anyone
suspected.
and just our ability to keep up to date. Things get out of hand quickly, and what
is surprising is not that things are awkward to use and can be improved, but that
they work at all.
0.2 Principles
Various principles in Press On are summarized at the end of each chapter. The final
chapter summarizes the book's main themes.
Good user interface design isn’t a matter of following a recipe you can get from a
cookbook: it’s an attitude, along with a bunch of principles, skills, and provocative
ideas.
Nothing can provide definitive solutions; ideas and theories have to be insight-
ful and provocative. Given real-world complexity, then, this book cannot be a
definitive solution to everything that can happen. No book could be. Instead, Press
On is provocative; through many examples, stories, and simple ideas and theories,
it tries to shift your values and create new vistas where your design problems be-
come new opportunities to do better.
Interaction programming is an endless subject. Like a magazine, Press On has
lots of asides and boxes to provide background and interesting ideas. There is
a lot of cross-linkage between topics; this is a visible sign of the intrinsic variety
and cross-disciplinary fertilization of design itself, but it also allows the book to
be read in any order, for sections cross-reference themselves to different contexts,
perspectives, and ideas in other parts of the book. For those who want background
material to go deeper, this book includes suggestions for further reading at the end
of each chapter. Like all good books, Press On has a web site,
mitpress.mit.edu/presson
Linkage looks like this—brief paragraphs offset with a triangle marker. Linkage
is a taken-for-granted part of the technology of books, along with tables of
contents, footnotes, indexes, section and page numbering, even bookmarks. All
these ideas make books easy to use and more useful. We’ll mention the history
of books and printing in section 4.2 (p. 96), because it is not only fascinating in
its own right, it’s a good example of using design to make things easier—it also
suggests a range of effective aids we can put into documents, such as user help
manuals, to make them more useful. Interaction programming is everywhere,
even in things that aren’t conventionally thought of as interactive.
If all the linkage in this book were represented as buttons a user could press to
jump around, the book itself would be interactive, much like a web site.
Figure 8.8 (p. 246) provides a visualization of all the linkage in this book.
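Chapter 8 makes this linkage idea precise by treating it as a graph. As a small
taste of what is to come, here is a minimal sketch in JavaScript, the language
this book uses for its programs; the section names and links below are invented
for illustration, not taken from the book's actual cross-reference data:

    // Cross-linkage as a graph: sections are nodes, and each section
    // lists the sections its linkage paragraphs point to.
    var linkage = {
      "0.2 Principles": ["4.2 Books and printing", "8 Graphs"],
      "4.2 Books and printing": ["0.2 Principles"],
      "8 Graphs": ["4.2 Books and printing"]
    };

    // "Pressing" a cross-reference follows an edge in the graph, just as
    // pressing a button takes an interactive device to another state.
    function follow(section, choice) {
      return linkage[section][choice];
    }

    console.log(follow("0.2 Principles", 1)); // lands on "8 Graphs"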
I hope you read Press On and are inspired that there are many ways to do a better
job of design—indeed, any criticism of this book you have is your opportunity to
do better. I want to give you an appreciation of how powerful simple computer
science is and how it can influence, for the better, anybody doing anything.
Box 10.1, “Why personal video recorders?” (p. 331) discusses the choice of
personal video recorders in more detail.
Getting a modern piece of technology to work at all, particularly in a commer-
cial environment, is a miracle in itself; keeping an innovative stream of products
rolling off the production line, and hence keeping the manufacturer in business
to live another day, is more important than “perfection.” On the other hand, I
hope this book will help designers see how they can make products better, and by
changing practices, make their products increasingly better over time.
0.4 Devices, applications and product references
A danger of referring to real products is that this book’s accuracy relies on how
good my understanding of the products is—how good is my reverse engineering?
The manufacturers of these devices did not provide formal specifications—I had
to reconstruct them. While this is a weakness, any reader can check my results
against a real device, its user manual (if adequate), or by corresponding with the
manufacturers themselves.
An alternative might have been for me to build new systems and discuss them
and the design insights they inspired. Here, there would be no doubt that the
systems were accurately specified—though the reader would not be able to obtain
them as physical devices to play with. But the problem would be that I might not
have implemented or thought about certain awkward features at all: I would be
misleadingly talking about, say, “mobile phones” when in fact I was only talking
about the simple mobile phone that I implemented for this book. It would be hard
to tell the difference between what I had done and what I should have done.
Although they get occasional mention as needed, this book does not specifically
cover safety-critical systems, the design of devices like medical equipment, aircraft
systems, and nuclear plant control panels, nor does it cover mission-critical sys-
tems, ones that, for instance, directly keep a business working. To cover these,
we would have to skirt the legal minefields, an unproductive diversion from our
focus. Nevertheless even small contributions to user interface design significantly
affect health and safety for many people—and some readers will take hold of the
ideas in this book and use them for significant applications, like aircraft flight
decks or nuclear power-station control panels.
What of the future of interactive technologies? The interaction programming is-
sues and problems of future devices are going to be essentially the same as the in-
teraction programming issues and problems for today’s devices (to examine them
in this book would take space to explain the unfamiliar domains). In the future,
we will have implants, devices embedded inside our bodies: surely we will be
able to control them, but we certainly do not want their design to drive us mad!
We will want our exoskeletons—external robotic body enhancers—to be both nat-
ural to use and safe. When we release millions of nanotechnology gadgets, we will
want to be sure they work as intended, especially when they interact with us and
their environment. Why else have them?
Any sufficiently advanced technology is
indistinguishable from magic.
— Arthur C. Clarke
Part I
Context
In part I, we review our culture and relationship with interactive devices: our love
of interactive devices, and what they really do for us. This context for interaction
programming forms the motivation for part II, where we will develop the tools to start
fixing things.
[Chapter opener photograph: Tyron Francis, www.tfrancis.co.uk]
1
Dodgy magic
Once upon a time, if doors opened as you walked toward them people would
think you were a magician. Today we don’t think twice about this magic, which
happens all around us! We are surrounded by gadgets and devices that we interact
with, from doorbells to mobile phones. Our houses are full of digital clocks—and
it’s a rare house that doesn’t have some 12 o’clock flashing somewhere. My kitchen
has complicated oven controls, washing machine controls, microwave controls,
dishwasher controls, fridge controls, and a dead clock. Another room (whether
we call it a living room, front room, lounge, drawing room, or bedroom, or even
the West Wing) has a TV, a DVD player, and a sound system. The hall has a mobile
phone left in it and some battery chargers. Then there is my car in the driveway,
with all of its computers and gadgets.
That’s today, in an ordinary modern, European household. Tomorrow we will
be wearing clothes with embedded devices, which might show our moods or help
us find our way when we are out driving.
When we meet one another, our clothes might shimmer and tell us whether
we have common things to talk about. Or they might twitch when we walk past
a music shop desperate to sell us something. Maybe our mobile phones won’t
work if they are stolen and taken too far away from our jackets—thieves won’t be
able to use them. Certainly our mobile phones and similar gadgets should behave
differently inside cars, and perhaps have many or all features disabled so we drive
more safely.
Maybe the government will require us to carry identification tags. Maybe these
tags will be implanted into us, just as they are today into dogs and cats—and in
cows, who have their feed automatically chosen for them. Maybe the tags we have
will change the behavior of our gadgets; maybe our children can only play harm-
less games, whereas if the games console knows there is an adult in the room it
could let adult games be played. The oven might forbid young children to switch
on the grill unsupervised. If we are old and feeble, our medicines may be orga-
nized automatically.
Our homes and personal lives may become far more complicated than our work
lives—at work at least somebody will have organized all the relevant interactive
devices we encounter, whereas at home and in our clothes, there will be a hodge-
podge of systems we bought and acquired over years.
We will interact with everything. What we choose to wear and own will, in
turn, interact with us and with other people and their environment.
If we have problems with our television remote control or problems setting the
timer on our oven today, just think what could go wrong in this future of more
complex devices that interact with one another in unpredictable ways! Today we
worry that our pacemakers will be messed up by the airport security scanners,
but that’s just the beginning of the sorts of interaction problems we may face in
the future.
The future might be exciting—that is the dream. The future might be out of
control—that is the nightmare.
As we build this new world, we are going to create our interactive environment.
We will create it by our choices as consumers. Some of us will create it as designers.
Some of us will create it and transform it daily, as we modify the devices we use.
We have preferences and loyalties, and our devices will learn from us, or we will
teach them or replace them. As we configure them, and they configure themselves,
we are designing an interactive world in which we are sorcerers and they are the
sorcerers’ apprentices.
1.1 The architecture analogy
Box 1.1 The Code of Hammurabi King Hammurabi was the sixth king of the first dynasty
of Babylon. He lived around 1728–1686 BC, and is remembered most for his famous code of
ethics. The Code of Hammurabi was engraved on a large black stone, eight feet tall, so that
all the people could see their rights. The usual “eye for an eye” codes are there as well as
some interesting rules about buildings.
There is an obvious analogy between building interactive devices and building houses,
an analogy worked out in section 1.1 (p. 12). Hammurabi's Law 229 warned that a
builder who builds a house that collapses and kills the owner of the house will themselves be
put to death.
Although today’s computer-based devices do almost as much for us as houses did in
ancient Babylon, they are not taken as seriously. Who would put to death a designer who
made a system that killed someone? Who would believe that deleting all the personal records
of someone whose program deleted theirs was ethically justified as an “eye for an eye,” as
opposed to being mere retribution? Rather than encourage responsible design, society has
done the opposite—most systems come with warranties that deny any responsibility for
anything untoward at all.
For further discussion of ethics, see section 13.4 (p. 467). Chapter 13, “Improving
things,” includes some historical discussion of codes of ethics, such as that of
Hippocrates, in box 13.4, “Hippocrates” (p. 470). Section 2.6 (p. 48) discusses
modern warranties.
generally, any substantial change would be. Computers couldn’t be more differ-
ent. Computers are flexible. They can be changed any time, and the bigger they
are, the more things there are that seem to need changing. Computer-based things
are always being upgraded.
Designers of computer-based systems face a temptation. In the worst case, since many
of their ideas are not fully visible to users, they are tempted never to finish—the
design can always be fixed or updated later. More features can always be added
to compensate for misconceptions. And users don’t know how to recognize unfin-
ished details, so they never complain.
Imagine starting to live in unfinished buildings. If they were like computers,
when you saw the house you were buying, you would have no idea that the
kitchen wasn’t finished. When you start living in the house and some of its design
limitations become obvious, stop complaining—buy an upgrade! That attitude
wouldn’t survive in the building trade; why does it survive in computers?
The final twist is that in some mad world where houses were designed like
this, where architects and builders got away with complicated, unusable designs,
we—the people who live in the houses—would love it. We’d want the flexibility!
We’d want to move the living room upstairs and have the garage extended. If
our neighbors got a jacuzzi, we’d want one too. We might change the garage into
a storeroom and add another garage. If our buildings were like computers, we
could extend them at will and do a lot of upgrading without moving. There’d be
so much to upgrade that we would lose track of all our great ideas, and we’d end
up with lots of unfinished projects around the house.
If house owners loved such flexibility—the flexibility made it easier to build
houses and extensions—then more and more houses would be badly designed.
Problems would be dismissed: they can be fixed later—or why not move house
if you don’t like the one you have? There would be a whole industry of selling
half-built houses pretending to be finished because we’d all come to expect that.
Anyway we’d all want to play with the new rooms and revel in the potential flex-
ibility. And the building trade would come to rely on the profits and continual
upgrading we were forced into.
This comparison may seem forced. But consider my home entertainment’s re-
mote control—see figure 3.3 (p. 73)—which has 55 gray rubber rectangular buttons
that all look and feel the same, especially in the dark when it is typically used.
Frequently used keys, like Play, are almost impossible to find. The remote has two
sets of volume controls, eight (!) separate sets of up/down arrows, two right ar-
rows but one left arrow, and two setup buttons . . . and this is just what it looks like;
using it is so hard I can’t explain it here. It’s like a house that has been built by the
worst sort of builders, who just added feature after feature without thinking about
how it would be used when it was finished.
Box 1.2 Cargo cult computers During World War II, the United States used the Melanesian
islands in the Pacific Ocean for airbases. The airbases brought in huge volumes of supplies
as cargo and apparently awed the islanders with the valuable goods.
When the war ended, the United States departed, leaving the islanders wanting to continue
the benefits of cargo. So they built runways, lit fires, and even made airplanes out of bamboo
and straw. They were doing everything they had seen done during the war—except that it
didn’t work. This was the essence of the cargo cult, practically a belief in a god of cargo.
Richard Feynman made good use of the confusion behind the cargo cult when he defined
what he called cargo cult science. Some scientists do what they think is science—they go
through the ritual—but they are not being really honest with themselves or other scientists.
They cut corners, not really caring how to be rigorous, and in a few cases being outright
frauds: writing up their fake science as if it had been done properly.
Human nature likes cargo cults. We’re all slightly superstitious. We want to be successful,
and we want to imitate the things that seem to make us or other people successful. When
we are lucky, we analyze what happened. Ah, we were wearing a bracelet. It must be a lucky
bracelet! We think that because we put on the bracelet before we were lucky it must have
caused the luck.
Computers are our modern cargo, and we have to be alert not to fall into the trap of
worshipping computers in the same naive way. Just because we hear a lot of the successes of
people and companies who use computers doesn’t mean that just using or owning a computer
will save us. We have to use it properly, and it has to be designed properly for what we want
to use it for.
kitchen has at least nine engines, not counting the vacuum cleaner in the cupboard.
Even my mobile phone has one in it. This is a world away from French’s vision, of
large engines that we needed to understand! We don’t have to understand engines
anymore—we just use them.
Instead of us having to understand engines, engines are now more reliable; they
are better designed. Wouldn’t it be nice if interactive devices changed like this?
We wouldn’t need to understand them because they were designed better.
A similar transition happened with cars. At first drivers had to know many
technical details to drive them, then cars became easier to use. The story of cars
is wrapped up with the birth of the consumer movement, triggered by Ralph
Nader’s agitation about car safety. Initially manufacturers argued that drivers
were responsible for safety; then they changed their minds and started to make
cars intrinsically safer.
See box 2.3, “Ralph Nader” (p. 51) for the story of Ralph Nader.
Now we are at the point of another transition, this time from badly designed
computer-based interactive devices that we, the users, are supposed to understand
to better devices that are designed so that we can use and enjoy them.
1.6 Externalizing use cost
Organizations have lots of meetings. In the old days, meetings were held at
the organization’s premises. Now they can be done by tele-working, say, using
email. Before, the organization met the direct costs of providing space, hospital-
ity, and even the toilets; now the externalized members of meetings provide their
own space and nourishment. Likewise, universities used to teach students in lec-
ture halls: the lecture halls cost money to maintain, and they hold only so many
students—the direct costs are obvious. By promoting what’s become called “dis-
tance learning,” educators now get students to learn from material on the web.
The universities have no space costs and have no physical limits on the numbers
of students who can be taught.
Technology has always done this. Books, for example, are another technology.
This book has saved me the effort of talking to you. If you are reading this on the
web, it’s even saved me (or my publishers) the effort of printing the book—you
have taken on that cost.
Once an activity costs nothing to run, or, better, when it can get a payback from
externalized costs (like getting a percentage of telephone costs, or charging stu-
dents for time online), then the business can set its ambitions globally. Increased
scale, especially on an international level, multiplies even small beneficial effects:
before shops were on the web, everybody had to travel to them: now a single truck
can deliver to many people. The supermarket has reduced its costs, expanded its
market, and reduced environmental damage.
Many activities are computerized to make the workers’ jobs easier. Typically a
computer can speed up the work enormously, and the company can lay off people,
do more work, or both. However, the customers now interface with a computer
rather than a human, and the computer imposes a stricter and often more myste-
rious way of working—this is true of a system on the world wide web, an optical
character recognition system, a telephone menu system, or whatever. Using it
takes customers longer. If users need to have passwords, this is an extra hurdle:
yet another password for them to remember and type in every time they want
something done.
Our university library has decided to automate filling in book loan forms. In
the good old days, I would go down to the library and fill in a little card when
I wanted to ask the library to get a book it did not possess. For me, this was
straightforward: I’m familiar with pen and paper. Now they have automated,
which presumably makes their side of processing loan cards easier (the university
has several libraries, and a computerized loan card makes centralizing the load
processing much easier, as no cards have to be put in the mail). From the library’s
point of view, the computerization helps enormously.
Yet I now have a quirky new program to learn how to use, and I have to re-
member a special password. It takes me a little longer (and a lot longer if I forget
the password). The library is deaf to my complaints, which really are quite trivial
in the scale of things—don’t you always have to learn programs and passwords?
In any case, I’m probably the only user complaining, so my problems are easily
dismissed as my problems, not representative of all users.
On the other hand, there are thousands of library users all on the university pay-
roll: all the “trivial” problems they have collectively ought to be considered as an
aggregate cost that the university is paying for. Since the library has externalized
this cost—and made their processes efficient—they aren’t interested.
Given the benefits to the librarians, the only people who can really influence
things for the better are the programmers who design the systems. We must urge
programmers to consider the whole process, not just the library’s side of the job.
1.8 Moore's Law
could make one chip that people could use in many different ways: instead of
manufacturers working on designing many special purpose chips, the consumers
of the chips had to learn how to program.
Chris Evans in his landmark book The Mighty Micro made the point that if cars
had developed according to Moore's Law, then by 1979, when he was writing, cars
would already have been small enough to put a dozen on a pinhead, have the
power of the QEII (the large ocean liner), and would cost £1.35 (for a Rolls Royce).
If cars were that small and that powerful, their impact would be inconceivable. Of
course, cars that size would be difficult to use as cars—Moore’s Law, even with the
freedom of imagination, doesn’t make things easy to use. Evans discussed the im-
plications presciently: what will happen when computers are so small that when
you drop them, they don’t hurt your feet as they would have done in the 1970s,
but instead stick to your shoe soles? We’re answering those questions now.
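Evans's extrapolation is easy to check with back-of-envelope arithmetic. Here is
a minimal sketch, assuming the conventional doubling period of about eighteen
months; the dates and the doubling period are illustrative assumptions, not
Evans's own figures:

    // Moore's Law: capability roughly doubles every 18 months, so over
    // y years it grows by a factor of 2 raised to the power y/1.5.
    function mooreFactor(years) {
      return Math.pow(2, years / 1.5);
    }

    console.log(mooreFactor(8));  // 1971 to 1979: a factor of about 40
    console.log(mooreFactor(36)); // 1971 to 2007: about 16.8 million

Even so rough a model shows why the speed-ups of computing dwarf the
hundredfold speed-up of the Industrial Revolution discussed next.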
Compare the improvements and speed-ups computers have undergone since
they became consumer goods with the modest speed-ups that occurred in trans-
port over the period of the Industrial Revolution. Before the Industrial Revolution
the fastest way to travel was by horse. With staging posts properly organized up
and down the country, somebody rich enough or with an important enough job
would have been able to travel as fast as a horse for as long as necessary. The
main difficulty in the system was that if you wanted to go fast you couldn’t take
much with you. If you wanted to carry goods for your business, it would require
a wagon, which would go much slower.
After the Industrial Revolution you could get on a train. Not only could trains
go faster, they could carry freight almost without limit, and at much the same speeds
as the fastest passenger trains. Sea transport became much more reliable when
steamers replaced sailing ships. The world became a smaller place. The British
Empire expanded due to the efficiency advantages Britain had in being first into
the Industrial Revolution. The world was transformed, by a speed-up factor of
around a hundred. A hundred is a lot smaller than the speed-ups of computers,
which already exceed millions over the few years we’ve had them. The difference
is that people need to travel, whereas nobody needs to perform microprocessor
instructions faster—what they want to do is get their work done faster. That every-
one thinks they want faster computers—more megahertz—owes more to market-
ing than to effectiveness.
Of course a faster computer does things faster, but there are reasons why the
speed-up in itself does not help most humans. First, computers already spend
most of their time waiting for something to happen: however fast my computer is,
I can’t type much faster, and the computer might do everything it needs to faster,
but that means it just spends longer waiting for the next keystroke from me. It is
ironic that computers compete with one another on megahertz: and what is their
CPU speed to us? Second, the way I interact with computers has not changed:
since windows were invented, computers have not become much easier to use.
If you can get hold of one, a ten-year-old word processor or spreadsheet will
work blindingly fast on a modern computer. Today’s word processors, instead of
getting faster, have added features. In 2000 you could put video clips and mul-
Figure 1.1: Performance of technology increases with time (curved line). As perfor-
mance increases, it will exceed any given technical performance requirement, such as
the line at height p. After the threshold time t, performance will exceed whatever is
required to do the work; thereafter, all extra performance can be diverted into mak-
ing the experience more pleasant, or the product more attractive in other ways. The
diagram is based on Christensen—see the further reading (p. 35).
timedia into word processed documents, whereas in 1980 doing so would have
been out of the question. All those features represent ideas that marketing people
can use to catch our imagination: more magic! Moreover, we are willing to get
the features even though this makes the programs we buy more complex and a bit
slower. Next month, or maybe next year, we will be able to buy a faster computer
and get back to speed again.
Certainly computers are getting better and have exponentially increasing per-
formance over time. We can represent this by a graph, as in figure 1.1 (this page),
using a curve of positive slope (it is not necessary to worry about the precise shape
of the line or exactly what “performance” is measuring). For any particular task
the user has, some minimal level of performance p will be required. The two
lines in the graph intersect at a crossover point, where performance equals p
at time t. Before the crossover time t, technology is
delivering inadequate performance; after time t, technology delivers more than
adequate performance.
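The crossover is easy to make concrete. A minimal sketch, assuming for
illustration that performance doubles once per time period and that the task
needs p = 100 units; both numbers are arbitrary, and the argument does not
depend on them:

    // An increasing performance curve: here, doubling every period.
    function performance(t) {
      return Math.pow(2, t);
    }

    // Step forward in time until performance first meets the
    // requirement p; that step is the crossover time t of figure 1.1.
    function crossover(p) {
      var t = 0;
      while (performance(t) < p) t += 0.01;
      return t;
    }

    console.log(crossover(100)); // about 6.65 periods

Past that time, further performance buys nothing the task itself needs.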
Before the crossover time t, manufacturers need only promise technical capa-
bility (which is easy, since technology is getting better all the time). After time
t, products get distinguished not by their performance, which is more than ade-
quate, but by more subtle—and harder to supply—properties like usability. For
technologies like wristwatches, we are long past the threshold, and they are now
fashion items, chosen on criteria mainly other than technological. But for many ev-
eryday things, like word processors, we should also be well beyond the crossover.
So why aren’t word processors much better?
Figure 1.2: The arms race with just two people upgrading products. I start with a
document written in version 1, which I email to you. To read my email you need to
get the software, which is now at version 2. You email me a version 2 document, so I
upgrade to the current version of software, which is now up to version 3. And so on.
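The ratchet of figure 1.2 can be written out as a tiny simulation. A minimal
sketch, assuming for illustration that the vendor releases a new version
between every exchange; the version numbers are invented:

    // Two users exchange documents. A document written with version v
    // needs software of version v or later to read it, and by the time
    // the recipient upgrades, the current version has moved on again.
    var current = 1;      // the version my first document is written in
    var me = 1, you = 0;  // the software versions each of us owns

    for (var exchange = 0; exchange < 4; exchange++) {
      current += 1;   // the software moved on while my email was in transit
      you = current;  // you upgrade to read my document
      current += 1;
      me = current;   // your reply forces me to upgrade too
    }
    console.log("me:", me, "you:", you); // versions ratchet up without limit

Neither of us wants the new versions for their own sake; each upgrade is
forced by the other's.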
The interaction among customers is crucial. When I drive my car, I don’t get
overwhelmed with a desire to buy a Rolls Royce when one overtakes me. I might
imagine myself liking a Rolls Royce, but I generally think more rationally: they
cost a lot, and their cost-benefit for my lifestyle would be negative. Even if I had
the cash to buy one, I would prefer to have a cheap car that does all I need and to
use the rest of the money more wisely. What is different is that cars do not interact
with each other. What car I drive is hardly influenced by what car you drive.
Granted, if everyone drove one sort of car, then owning a different make would
inevitably be more expensive: it would be harder to find experienced garages, and
so on. The car market in fact has a lot of diversity (although a lot of the apparent
diversity is artificially inflated by a few manufacturers using a wide range of brand
names), and for the time being, at least, we can and do choose cars for all sorts of
personal reasons.
Why don’t we do this with computers? I like vintage cars: even though they are
slower and draftier, they have more character. But nobody I know runs a vintage
computer. There’s no point—you don’t take computers out for a ride to show off
their character. What matters is whether they are up to date: computers interact.
Almost all computer use requires communication with other people: obviously
email and the web are there for communication, but so are CDs, memory sticks
and floppy disks. When you buy a computer, you are worried about compatibility,
that is, how much you can work with other people.
If you go down to your local gadget shop to buy something like a home theater,
it is bound to have several sockets on the back that need cables that are incompat-
ible with what you already have. Already it’s hard to buy video cassette players,
and DVDs come in several formats. Should you get optical connections, to be
"future proof" or should you have analog connections to be compatible with your
old equipment? It is a nightmare to upgrade any part of your home electronics
because of the continually shifting standards.
Box 1.3 An obituary for a fax My fax machine, model number DF200, died! It had a short
life: bought new 1994, died 1999. Its cause of death was a power cut; it died when mains
power was restored.
The postmortem revealed that the cause of death was multiple component failure on the
main PCB (printed circuit board—the card that the electronic components are fixed to) due
to the power surge. The separate power supply was still operational and would have been
made available for organ donation had there been any way to pass it on to power
supply failure victims.
Funeral arrangements. The DF200 was buried at a community landfill site in the normal
way. Will flowers grow on the landfill site? Donations can be made to Thames Water, the
agency that will deal with the toxic leachate and other pollution from the fax machine.
Bill Schilit, of Intel’s Usability Laboratory, says that manufacturers are not to
blame—consumers are. Consumers, he says, have unrealistic expectations. Schilit
blames “ever-increasing improvements” for introducing incompatibilities as de-
vices are upgraded.∗
Hmm. Another reason for the incompatibilities is product differentiation. Man-
ufacturers want to place their products at different price points, and obviously
they want the things that seem better to cost more, because consumers will be
more willing to pay for the extra features. Thus product differentiation tends to
encourage manufacturers to put different features on low-end and high-end de-
vices, even if inside they are practically the same. The cheaper products probably
have the same chip-set but don’t have the sockets to access the up-market features.
For more on home theater, see section 3.7 (p. 68).
1.9 Obsolescence
To take advantage of Moore’s Law, we have to dispose of the old stuff to make
way for the new. What happens to the old stuff?
The sad obituary in box 1.3, “An obituary for a fax” (this page) tells the final
story of a fax machine that was just five years old when it stopped working and
the light left its buttons. The obituary uses the demise of the DF200 as an example to raise wider questions about device design and marketing generally.
At the time of the fax machine’s untimely death, BT (a UK telecom company,
formerly called British Telecom) would repair the fax for a £100 engineer’s callout,
plus the cost of repair, which was estimated at £235. The fax was already running
on its third main board, due to failures during its warranty period; these repairs
had been free. However, the last repair wasn't too successful: for the last four
years of its life the fax was wrapped in tape, because the engineer had broken some of the weak casing fixings.
Given that equivalent new faxes cost around £140 (or at least they did when the
DF200 needed replacing) and come with one year’s warranty, it made no financial
sense to repair the DF200. The DF200 therefore joins the UK’s electronics landfill.
The pricing of engineer service calls is probably designed to cover costs (or to
discourage calls in the first place) rather than to build customer rela-
tions, preserve the environment, or even to collect life-cycle feedback on products
in use in the field.
Power outages are not an unanticipated occurrence, so the design should have allowed for them. When the (still functioning) switched-mode power supply is switched on, it produces no significant surge over its operating voltage (I checked with a high-speed oscilloscope). The fax used to overheat and get uncomfortably hot, probably indicating that the main board was under-rated. That the board was the third one the DF200 needed in its short life appears to confirm the poor quality of the design.
Interestingly, the board was made by Sagem, who also make missiles (including
the Exocet), so presumably they should know how to make reliable electronics.
Thus one is led to conclude that the DF200 was designed for a short life.
Most obituaries recall the best memories of the dead, but in the case of the DF200, a typical representative of the consumer electronics marketplace, to go so quickly from desirable product to polluting debris is obscene. Its death, in such
routine circumstances, was avoidable. But at least it was a broken, useless gadget:
each year, UK consumers discard about 15 million working mobile phones as they
upgrade to fancier models, contributing to the million tons of electronic waste dis-
posed of annually in the UK alone. That is a huge hidden cost of chasing Moore's Law, to say nothing of the hidden human time costs of learning how to use all those new mobile phones.
Clearly designers make trade-offs, for example, balancing design time, product
cost, reliability, servicing costs, and recall rates. But in these trade-offs, environ-
mental costs have been discounted or ignored. The DF200 contains six wired-in
NiCd batteries (NiCd, or nicad, batteries contain nickel and cadmium, which are
toxic metals) and has no instructions for their safe disposal. Cadmium is toxic to
humans, plants, crustaceans, and fish—pretty much everything. Long-term low-
dose exposure to cadmium leads to malfunctions of the liver, kidneys, and lungs.
Discarded NiCd batteries contribute about half of all the cadmium escaping into
the environment from landfill sites.
Try to avoid buying products that use NiCd batteries; or if you have to, make
sure the NiCds can be removed before you discard the device. NiCds are quite
sensitive about how they are charged, and if they are treated well they will last
longer. Charge them cold, and always fully charge them; never “top up” or par-
tially recharge them, as this reduces their capacity. Lithium ion (Li-ion) and nickel metal hydride (NiMH) batteries are less polluting. Similar rules apply for using and charging NiMH and Li-ion batteries; you should read the instructions.
By 2005 there were 150 million PCs buried in landfill sites in the United States,
and a similar number of the discarded PCs ended up as overseas waste—much of
it in Africa, a region least able to cope with it. Those PCs will be buried somewhere
(or partly reclaimed, burned or buried), complete with their NiCd batteries, lead
screens, lead solder, flame retardants, and so on. Some people are keen on export-
ing old PCs to developing countries where they can be used—for a time. This is
a short-term solution, merely using the rest of the world like a bigger landfill site,
and probably a landfill site with fewer pollution checks and controls. Instead, we
have to rethink how devices are designed.
The DF200 fax will reappear in section 5.5.5 (p. 153), where we explore details
of its interaction programming.
1.10 Reduce, reuse, recycle, rethink
Box 1.4, “Energy consumption” (facing page) describes the surprising energy
waste from DAB radios, because of a design decision to have external
always-on power supplies.
By choosing to waste energy, manufacturers have saved costs in mains cabling, electrical safety, and internationalization (the same radio works in any country, just with a different power supply—it doesn't need a new power cord and plug). In a different world, with different priorities, we could save megawatts and make a noticeable difference to our environmental impact. The future for DAB radios is supposed to be exciting, with new digital entertainment facilities being added all the time—put another way, the industry wants to keep selling DAB radios, even to people who've bought them. This imperative will contribute to electrical waste problems, adding to the pile of pre-DAB radios already disposed of.
Box 1.4 Energy consumption I bought the UK’s most popular digital audio broadcasting
(DAB) radio, the latest bit of interactive technology. As of 2006, three million DAB radios
had been sold in the UK (or about 500 million worldwide). I measured how many watts my
radio consumed when it was switched off: it consumes 1.7 watts. Therefore, the 3 million
sold up to 2006 in the UK collectively consume about 5 megawatts—though some use
batteries, a different sort of energy consumption. The DAB radio could have been designed
to consume a negligible amount of power when switched off; instead, it was convenient for
its designers to use an external power supply that is always switched on. An external power
supply makes it easier to make radios for different countries, it makes the electrical safety
issues easier to manage, and it makes the wiring cheaper. Cynically, it makes the radio in the
shop more appealing, lighter and smaller—unless you realize it needs another lump, which
won’t be on show.
I measured how much my house consumes with only “switched off” devices, and in the
summer it consumes 400 watts—from DVDs in standby, network hubs, burglar alarm, a
fridge, and so on. Across the UK, that amounts to about 4,000 megawatts from domestic
households (my sums allow for the various sizes of households)—the equivalent of the output of about four modern nuclear reactors. That's only domestic houses, not counting the standby waste of factories, shops, and offices. A 2006 UK Government Energy Review claimed that 8% of domestic electricity consumption goes to devices in standby.
In short, better design would reduce the need for energy considerably and have a dramatic
effect on the world. The UK government is now considering a law to control energy-wasting
standby features.
Presumably all the old non-DAB radios are heading for landfill. According to the industry,
DAB has many new exciting developments in store for consumers, thus ensuring that all those
500 million DAB radios bought up to 2006 will soon be obsolete—and also discarded.
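The box's arithmetic is easy to check. Here is a quick sketch in Python (mine, not the book's); the 25 million household figure is an assumption I have added, not a number from the text:

    # Standby arithmetic from box 1.4, using the figures given in the text.
    radios = 3_000_000                   # DAB radios sold in the UK by 2006
    standby_watts = 1.7                  # measured draw of one "switched off" radio
    print(radios * standby_watts / 1e6)  # -> 5.1, i.e. "about 5 megawatts"

    # The 4,000 MW domestic figure implies an average standby load per household.
    uk_households = 25_000_000           # assumed household count, not from the book
    print(4_000e6 / uk_households)       # -> 160.0 watts of standby per household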
Finally, design systems to break. Things will break anyway, and if they are not
designed to break, they will be impossible to mend—and they will ultimately
end up in landfill because users don’t know what else to do with them.
Once you have enough computers (faxes, mobile phones, or whatever), the social benefits outweigh the costs.
Putting this magic another way: once you can afford the entry cost to buy a
computer or some other sort of device, and provided enough other people have
them—and they are networked to you—then the computer is “worth it.” People
buy computers because other people buy computers . . . positive feedback.
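The threshold can be put in toy terms: a fixed entry cost is overtaken once enough other networked users each contribute a little value. The numbers in this sketch are made up for illustration:

    # Toy network-effect model: a device is "worth it" once the benefit of
    # being connected to n other users exceeds its fixed entry cost.
    value_per_contact = 0.50    # assumed benefit to me of each other networked user
    cost_of_device = 500.00     # assumed entry cost of buying the device

    def worth_it(n_other_users):
        return n_other_users * value_per_contact > cost_of_device

    print(worth_it(100))     # False: too few other users to justify the cost
    print(worth_it(10_000))  # True: now the positive feedback can take hold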
Skills learned in a common environment are more easily transferred to another job, and so on.
The popularity of the system makes it a better target for viruses—the more
common a product, the more it becomes a software “monoculture” that is
easier for virus writers to target. Again, what suits an individual has
undesirable consequences for the community.
Computers, and particularly the internet, allow people to work from home, to
tele-work. As a personal choice, many people would rather work from home.
If anybody decides to go into work, they may find the workplace practically deserted, with no colleagues to make the day social. Although socializing at work is important for sharing news, the tragedy is that everybody would rather work at home. When everybody does agree to come in on a particular day, they are all so busy that working at home seems even more attractive than the on-site workload!
It suits each individual and each company to ignore the environment. It suits
me to have the fastest, best devices so that I am happier and more competitive.
But to have the latest and fastest devices, I have to discard my older
models—and so I contribute to the damage done to the commons, the
environment.
1.13 Conclusions
Almost everyone makes a lot of money with computers, and interactive devices
more generally, and nobody is clearly warning about the problems. We all seem
to have a love affair with computers and don’t want to hear about their problems!
The market-driven solution is to spend more money getting replacements. For
everything we replace, we end up with another computer to discard. It might end
up in the attic. It will end up as waste—indeed as toxic waste. That’s one danger
sign. Another is how computers encourage us to make our society more and more
complex—in fact, our laws (tax being a good example) are so complicated that it
would be hard to stay in business without a computer to help. If the government
assumes every business has a computer, then it can impose regulations that only
a computer can cope with. Like the fishermen, we are totally entangled in the
process. The point of this book is to help us step a bit outside of the cycle, so
that we can think more rationally and make informed choices. As consumers,
we should become more critical of computers and demand higher standards. This
will put pressure on manufacturers to achieve higher quality. Gradually the world
will become a better place—perhaps even a more magical place. (As interaction
programmers, we should read on.)
Part of magic being successful is our manipulated perception, so we don’t notice
how the tricks work. The next chapter explores how technology and our percep-
tions interact.
Do an internet search for e-waste or for up-to-date WEEE (Waste Electrical and
Electronic Equipment) advice on disposing of electronic devices, including PCs
and monitors. The availability of recycling facilities will be different in different
countries and locations.
2
Our perception
Almost everything we know comes through our eyes and our ears: perception
is reality. It is easier for advertising agencies to modify our perception than to
change reality. If something looks nice, it usually is nice; thus, advertisers create
nice warm images to give us the perception that their products are nice. Since an
entire industry is based on this premise, there must be something to it.
One example of our perception being used to mislead us is the promotion of
lotteries. We all want to get lucky! Television and newspapers regularly report
people who win big, and usually we see pictures of happy winners, complete with
their grinning faces and an enormous check. Our perception of frequent winning fools us into thinking that in reality winning is probable.
If it wasn't for the media, our perception of frequency—how often something happens—would be a reasonable indicator of how probable that thing was. For instance, if in everyday life I see frequent murders and robberies outside my house, then it is very likely dangerous out there. But if I read in the newspapers about
lots of murders and robberies, then I am perceiving lots of dangers, but it would
be a mistake to conclude that everywhere was as dangerous as it seemed from
my armchair reading—the newspapers have selected their stories from across the
country. Unfortunately, most people do not work out what the statistics really are,
and the pressure to “be safe rather than sorry,” or to avoid litigation, leads schools,
for example, to tighten their security because they perceive dangers to be far more
likely than they really are.
Now let's return from the daily newspaper stories about everyday worries back to considering interaction programming and computers.
2.1 Pyramids
We are bombarded by advertising and media reports of the success of gadgets
and computers. For a while, we were told that dot.com businesses could not fail.
Individuals like Bill Gates, who amassed a fortune, were held up as examples of
the rewards available.
Pyramids have small tops and big bases. Bill Gates is at the top of one sort of
pyramid, and most of us are consumers, pretty near the bottom of it. Because we
each contribute a little, and the pyramid is so broad-based, a huge profit can be
realized further up. The huge wealth and prominence at the top is a consequence
of the base of the pyramid being so large (and, of course, that the business is basi-
cally making a profit; if its products were losing money, the losses would be huge
too).
The bigger the pyramids, the greater the benefits for people at the top. Once the
structure is in place, computers can provide a huge customer base. The organiza-
tion benefits, and individuals at the bottom each pay a small amount that they will perhaps put up with. If users complain, nobody is interested because the benefits of
the pyramid eclipse the minor inconveniences.
When criminals use the same techniques, it is called “salami slicing”: typically
they find a way of taking a tiny amount of money from many transactions, and no-
body notices the tiny amount going astray. With the help of computers, criminals manage to systematically collect these negligible amounts of money from millions of transactions. They end up with a huge income. When pyramid selling is done
blatantly it is illegal, because our society recognizes that the “small” cost multi-
plied by large numbers of users creates an immoral social exploitation.
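The arithmetic is what makes salami slicing work; the numbers below are invented purely for illustration:

    # A fraction of a cent per transaction, too small for anyone to notice,
    # multiplied over millions of transactions. Invented figures.
    slice_per_transaction = 0.002    # dollars skimmed from each transaction
    transactions_per_day = 5_000_000
    print(slice_per_transaction * transactions_per_day * 365)  # -> 3,650,000.0 a year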
Of course, the more people at the bottom of the pyramid, the fewer at the top.
Lots of people can own corner shops because the shops are small. Because they
have a small customer base, the owners cannot make astronomical profits. When
shops get bigger, merging into chains, they have more customers and proportionately fewer owners to rake in the profits. In other words, the more prominent and
profitable someone or some company is, the fewer of them there must be. This is
pretty obvious, but it is more than a tautology: as businesses become bigger, they
have more resources, become more efficient, and can price their competitors out
of the market.
What isn’t so obvious is that once people get to the top of a pyramid and start to
attract the attention of the media, they get lots more said and written about them.
This influences our perception: because we know so much about them, because
we know them so well, it’s plausible that we could be like them. What’s wrong
with ambition? So just as it becomes harder and harder for anyone to succeed, the media, conversely, gives the impression that success is frequent.
Of course the media isn’t going to go out of its way to report failures to put
the successes into perspective; moreover, we are very bad at thinking through
negative ideas. It’s positive to win, but for every winner there are countless losers,
and who wants to think about that? Hardly anyone buys lottery tickets thinking
about losing.
2.2 The lottery effect
If using computers makes you successful, then a dot-com can raise cap-
ital easily. Raising capital makes it easier to be successful, and it certainly makes
it easier to get media coverage. Shareholders are bombarded with success stories,
which in turn makes it easier to raise capital. The dot-com bubble has since burst, but it shows how easy it is to get carried away with lemming-like enthusiasm.
The lottery effect plays out in other ways too. If a web site asks for feedback
from its users, for example, it will only get feedback from users who understand
it well enough to get it to work. You won’t get any feedback from users who are
using incompatible browsers, because they won’t even get far enough to find out
how to send feedback. Worse, users who are in a hurry—which will include all
the frustrated users trying hard to work the site—will have less time to provide
feedback than the relaxed, successful users. The designers of the web site will
hear lots from winners and virtually nothing from the losers.
For handheld gadgets, the same effect obtains. I’ve written to some companies
about their designs because I feel strongly about either the designs or, when they
aren’t so good, about how to help the companies. But I’ve also had some disas-
trous gadgets, which were so nasty that I didn’t use them long, and I certainly
didn’t waste my time working out exactly what was wrong with them. I would
not have been able to write a useful complaint because I knew so little about the
devices!
In everyday life there is a rather unsettling example of the lottery effect. We
take risks when we cross the road. We base our behavior and risk taking on our
perception of how safe it is to cross roads. But, crucially, we have never killed ourselves—so we think that we are going to be more successful than a full survey of the facts would support. Another example: we have all heard of dolphins
pushing hapless swimmers back to the safety of the shore—but of course there
are never any survivors to tell us about dolphins pushing swimmers out to sea to
drown. We like to listen to the survivors’ stories.
The lottery effect emphasizes success. We change our perceptions about our
chances because we think success is more likely than it objectively is. The media
enhance this effect, since journalists publish success more than failure, themselves
biased by the lottery effect. Ironically, the media can reverse our assessment of
reality. For example, you are more likely to die of a stroke than of a mugging, but
because strokes are common they do not make good news. We believe what we
read and are more worried about mugging than stroke.
As designers, we concentrate too much on success and on the positive outcomes (these, after all, are the key selling points of our work). Particularly because of the lottery effect bias, we should put more conscious design effort into avoiding
problems and failure. We know we are bad at risk perception; the lottery effect
shows that we are doubly bad at risk balancing—and design is very much about
having a balanced view.
2.3 The media equation
People have different personalities, but we tend to fall somewhere between sub-
missive and dominant and between friendly and unfriendly. People tend to prefer
matching personalities, and when people of non-matching personality traits are
mixed up, they do not get along as well.
Interactive devices, too, can appear to have different sorts of personality. “You
must insert a disk,” or “Please insert a disk.” The phrases mean the same thing and
could be said by the same part of the program, except one comes across as dom-
inating and the other as submissive. You can guess that experiments show that
people of the corresponding personality type prefer one sort over the other. What
is more interesting is that a computer that mixes personality types—or, rather,
mixes expressions of different personality types—will be disliked by almost ev-
eryone, because it feels inconsistent.
Personality, particularly how it impacts design, is further discussed in
chapter 13, “Improving things.”
One of the classic, early examples of people imputing personality to computers was Jo Weizenbaum's program called Eliza, named after Eliza Doolittle, the Cockney flower girl in George Bernard Shaw's play Pygmalion.
In the play, Eliza is taught by Professor Henry Higgins to become refined. Eliza
the program was a simple program that recognized a few word patterns in what
you typed at it. So if you typed, “mumble mumble my dog mumble mumble,”
Eliza would respond by saying, “Tell me more about your dog,” even though it
(she?) hadn’t the faintest idea what your mumblings meant. Of course, when I
write “hadn’t the faintest idea” what I really mean is that there was no match-
ing pattern covering the rest of the string. Eliza can’t have ideas, let alone “faint
ideas,” but it is so much easier to talk about computers as if they were human and
have intentions, even when we want to say that they don’t have any such abilities.
Weizenbaum’s programming was quite clever. Suppose you typed “blah blah
blah”—something that Eliza could certainly not recognize. Eliza would reply,
“Earlier you said something about your dog. Can you tell me some more?”
Most of the time Eliza just recognized simple key words. If you said something
with a “not” anywhere, Eliza might reply that you were being negative. To dis-
guise Eliza’s simplicity, each pattern (like “my” and “not”) had several different
responses that Eliza would cycle through to avoid obvious repetition.
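The mechanism is easy to caricature. The sketch below is my reconstruction of the style of trick described above (keyword patterns, cycled responses, and a memory for when nothing matches); it is not Weizenbaum's code:

    import itertools

    # Each keyword has a cycle of canned responses, to disguise repetition.
    rules = {
        "my":  itertools.cycle(["Tell me more about your {}.",
                                "Why is your {} important to you?"]),
        "not": itertools.cycle(["Why are you being negative?",
                                "Are you sure you are not?"]),
    }
    memory = []  # topics remembered for when nothing matches

    def eliza(line):
        words = line.lower().split()
        if "my" in words and words.index("my") + 1 < len(words):
            topic = words[words.index("my") + 1]
            memory.append(topic)
            return next(rules["my"]).format(topic)
        if "not" in words:
            return next(rules["not"])
        if memory:  # nothing matched: fall back on a remembered topic
            return ("Earlier you said something about your %s. "
                    "Can you tell me some more?" % memory[-1])
        return "Please go on."

    print(eliza("mumble mumble my dog mumble mumble"))  # Tell me more about your dog.
    print(eliza("blah blah blah"))  # Earlier you said something about your dog. ...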
There are now much more powerful ways of understanding human language
than Eliza used, but in the 1960s Eliza was state-of-the-art research. What sur-
prised Weizenbaum was how some people got carried away, as if Eliza was a
genuine human confidant. Eliza’s success inspired other programs. Parry the
Paranoid tried to simulate a paranoid schizophrenic patient. Kenneth Colby, its
designer, tested psychotherapists with it to see whether they could tell the differ-
ence between Parry and real human paranoids. The initial success of his work led
to a heated debate about how soon computers would replace psychotherapists.
It’s perhaps ironic that these early experiments simulated—thoroughly enough to
fool many people—what were thought to be deeply personal, and deeply human, abilities.
Box 2.1 Eliza meets Parry Eliza and Parry once met each other. Their conversation went
like this:
...
PARRY In my opinion, you have a one-track mind.
ELIZA What makes you think I have a one-track mind?
PARRY I used to go to the track every week.
ELIZA What does that suggest to you?
PARRY I avoid racetracks now because the races are fixed.
ELIZA Is that the real reason?
PARRY I’d rather not discuss it.
ELIZA What else would you like to discuss?
PARRY That's something I don't want to go into.
...
Certainly Parry comes across as the paranoid!
2.4 Drama
When we watch television, we know that what we are seeing is just flickering
lights. But we soon forget this, and start enjoying it. Part of the enjoyment is
not worrying about the TV’s boundaries: we willfully suspend disbelief. When a
character goes off the edge of the screen and comes back moments later, we would
be wasting our time if we worried what they really did. That’s not the point. It’s
theater; the point is to be stimulated by and to enjoy the program.
Now imagine we are not watching television but are watching something on a
computer screen. The flickering lights are much the same, and we are entertained
now not by some imaginary story, but by software.
Imagine you are in a shop deciding which computer and software to buy. You
are looking longingly into a screen; the salesperson probably has some encourag-
ing chatter about how sensible you are. It is hard not to suspend disbelief and
be carried away with the demonstration. When you ask the salesperson a ques-
tion, do you actually see the answer fully on screen, or do you have to imagine it—like imagining what a character does off the TV screen? You don't actually see what happens, but the salesperson won't let that spoil the effect. Obviously
playing on our imagination is part of the power of sales: the salesperson wants
to create an image of how successful your purchase will be, even if some of that
success is imagination. You end up buying something that isn’t what you want.
This happens to everyone.
A programmer designs a system and shows it to the marketing manager. The
marketing manager is now in the same position that you were in the shop; the
demonstration is imagination, existing in the manager’s head, not in reality. It
looks good. The programmer might plead that it isn’t finished, but the marketing
department thinks it can be shipped already. When this happens, the products
shipped to the shops you’ve just been in owe more to imagination than to reality.
You are not in the shops long enough to find out: it's quite easy to make a program so complex that you cannot begin to see how most of it works. You buy programs on faith.
When we watch a film, the film is exciting because of all the things that could
happen. If we thought that the characters had no choices and had to behave as
they do, the story would not be half so interesting—it'd be without any dilemmas
or surprising twists. Sometimes we know things that the characters do not yet
know: they will open a door and be surprised, but we know what’s there. Yet in
reality they aren’t surprised: they’re actors; they’ve rehearsed! Of course, the film
is just a reel of plastic, and there is only one story. Nobody has any choices. If
you see the film twice, exactly the same things will happen again. To enjoy the
film, we automatically imagine a whole lot more is going on, or could go on. Since
we confuse media for reality, we confuse the narrow story-line for something with
much more potential.
With films, this is the whole point, and a good director creates a bigger picture
out of the narrative. With computers, it is a potential problem. If I show you something trivial on my computer, there is a good chance you will imagine some much richer tapestry. You will be most gullible if I take you through a demonstration sequence I have prepared beforehand: if I can show you something interesting where it doesn't crash, you will likely go away thinking it can do all sorts of interesting things. Whether it can or not depends on how good a programmer I am and how much effort I put into it.
Box 2.2 Personas We can always think of more features, and if we are working in a de-
sign group everybody will have their own ideas. Soon the ideas get a life of their own—
independent of reality.
A good design technique is to focus design questions on what particular users—personas—
would make of them. Your design group (or you, if you are working alone) should decide on
two or three—no more, otherwise it gets confusing and misses the point—“real” characters
or personas. You should draw up their characteristics and make them as real and as personal
as possible. Your personas should obviously be the sorts of people you expect to have using
your design.
Once you have gone through this exercise, design questions can be focused. Would
Jemima—use their name—like this? Would she need it?
Here's how we might draw up a persona called Jemima.
JEMIMA
Who she is: a teenager; interested in horses, fashion, and art; lives in London, a big city.
Her technology views: a regular train user; good with a mobile phone; uses SMS a lot, but with wobbly spelling.
Her goals: to get up early to see horses; to get to school and back again safely.
The photograph helps, as it makes the persona real. Was that last goal Jemima’s or her
parents'? What does she really want? What does the person who buys and uses the device
want? We may need two personas to work this out.
After you’ve debated who you want your personas to be, they become the real people you
are designing for. What might have been abstract design discussions can now be phrased
in personal terms. Would Jemima like this? This person has a real need to get up early, but does she want to get up at the same time every day of the week? Already you are asking interesting and focused design questions.
See also box 13.7, “Scenarios” (p. 480).
2.5 Selective perception of user surveys
People who themselves don’t get lost (like the frequent travelers, the experts
who design airports, and even the airport staff doing surveys) don’t understand
why anyone else would get lost. For them, it seems so easy. They tend to dis-
miss our problems as stupidity—which doesn’t help them take our complaints
seriously. Not many disoriented users complain, let alone complain coherently—
if you are lost, it’s hard to say clearly why you are lost; if you knew how you got
lost, you probably wouldn’t have got lost in the first place. So useful complaints
are infrequent compared to the positive feedback Heathrow gets from regular pas-
sengers who don’t have the problems.
For airport designers substitute interaction programmers. We tend to under-
estimate the difficulties of using our systems, and our evaluation of users tends to
reinforce the wrong ideas. For many products, dissatisfied customers rarely bother
to complain. (Worse, on web-based systems, dissatisfied customers typically can’t
complain, because the system is going so wrong for them.)
liability for their wanton destruction, and they take as their legal position the same
wording as “respectable” software companies. They might be right.
I like to compare this standard computer warranty for a word processor with
the warranty for my Cross pen. The Cross warranty is unconditional. Cross will
repair or replace my pen if it goes wrong.
It’s a different business model: Cross want to give the impression that they care
about their product’s quality. It’s also a matter of customer expectations: a pen that
came with a booklet of legalistic warranties, in lots of different languages just to
make sure you understood, would give the impression that the pen, just perhaps,
wasn't going to be reliable, or that perhaps you'd cut yourself on it or have some other accident that the manufacturers have anticipated but daren't tell you.
Yet somehow we all happily buy word processors under these awful warranty
conditions. They are supposed to be better than pens.
Actually, the legal story is far more complex than this neat irony makes out. The
warnings—not warranties!—that come with medicines are even more worrying.
With medicines, “side effects” are expected, and some medicines even warn of
coma and death as possible side effects of the “cure.”
So, compared with medicines, software warranties seem innocuous. Neverthe-
less, compared to warranties for complex devices, like cars or washing machines,
which include complex software, the warranties for software seem a little feeble.
Surely, we can design better stuff more responsibly. We should give ourselves
better incentives to design better stuff.
Compare these modern warranties with box 1.1, “The Code of Hammurabi”
(p. 13).
as it should. Then somebody will want to support Arabic, which is written right
to left: so that requires a whole new text processing engine . . . before we know it,
the program is enormous, and the various features combine in different ways in
different places, which will cause feature interaction.
Feature interaction is further discussed in section 3.8 (p. 74).
It may become obvious that a better approach is needed and that the program
should be refactored. But we've probably spent a lot of effort adding in all these
features, and to start again means throwing away most of that effort. Besides, the
delay to get all the features working together properly in a new design would be
longer than the small delay to add just a teeny little new feature to the current
mess. And so the arguments carry on: the program gets hideously complex, and
it never seems worth starting over.
It ends up with lots of sensible features that occasionally interact with one an-
other in peculiar ways. The number of ways in which features combine is explo-
sive: this is combinatorial explosion.
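The arithmetic behind the explosion is stark: n independent on/off features give 2 to the power n combinations, and n(n-1)/2 pairs that might interact. A quick illustration:

    # Combinations and potentially interacting pairs for n on/off features.
    for n in (10, 20, 30):
        print(n, 2**n, n * (n - 1) // 2)
    # 10 ->         1,024 combinations,  45 pairs
    # 20 ->     1,048,576 combinations, 190 pairs
    # 30 -> 1,073,741,824 combinations, 435 pairs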
Adding features, of course, helps sell products. The more features there are, the
harder anyone finds it to decide which product suits them: this manufacturer’s
or that manufacturer’s? If features were laid out and explained in an organized
way, then customers could make rational choices. Just possibly, few people would
want manufacturer X’s word processor if they knew exactly what it did. But if
they were confused a bit—but not so confused that they noticed—then X has as
good a chance of selling its wares as any other. One of the simplest ways for man-
ufacturers to use this confusion marketing to their advantage is to bombard the
customers with half a dozen attractive features—without saying how the product
deals with features that competitors are touting. Then the consumer has no easy
way of comparing products, except on their brand or their pricing.
Combinatorial explosion is not just a problem for designers; it affects users too.
Because of combinatorial problems, we often want to do things but it is not at all
obvious how to start—how do I combine the many features the system provides
to achieve what I want to do? I might want a picture in the top left of my page,
but I can’t get it to stay there. We know that the program underlying this feature
is quirky, but that doesn’t help! So we find an expert, who says we should use
anchors—or something else. That’s a really simple idea that has solved our prob-
lem, and in hindsight why didn't we know something so simple? In fact, what the expert has done is make us look like fools. We didn't know something that only now is obvious. Conversely, the expert got some positive feedback about
how helpful they were: they really like this software!
So the expert suggests that other people buy this software. And we will blame
ourselves rather than the bad design that doesn't let us find out how to do things, let alone bring our knowledge of other parts of the system to bear on new problems.
2.8 Demanding ease of use
Box 2.3 Ralph Nader Ralph Nader was a young lawyer and consumer advocate who wrote the famous exposé Unsafe at Any Speed, which stirred up the 1960s car industry and was a keystone
in founding the consumer movement. Cars in the 1960s were sold on style and appearance,
and Nader found that many were unsafe and that, in some cases, the industry was covering
up dangerous design.
Nader intercepted correspondence between insurers and a car company. Insurers were
asking how to stem the increase in claims for accidents caused by parked cars rolling down
hills. The memo said that bad parking was a driver problem, and that drivers should be
trained to park more safely—for instance by turning the wheels so that if the car rolls, it rolls
into the curb and stops. Another recommendation was for drivers to apply the foot brake
before applying the parking brake: the hydraulics and servos of the foot brake add force to
the parking brake, making it grip better. Another view, of course, is that if the parking brake
does not work very well, that’s due to an engineering problem, not a driver problem. The
problem could be solved either by training drivers better or by training engineers.
While the car industry hid behind the slogan “drivers have accidents,” it was clear that
blame would always be laid at the driver’s door. Moreover, if drivers happened to kill
themselves through "bad driving," they were an obvious scapegoat, one who couldn't defend
themselves. Some accidents are certainly caused by driver error (just as some aircraft acci-
dents are caused by pilot error), but the driver and the car work together, and failure or bad
engineering in the car can be just as effective a cause of accidents.
Unfortunately, if you identify car design as the cause of an accident, then all cars of the
same make and model will have the same problems. Recall time! This is a very expensive
business. In the long run, it’s much easier to get everyone to believe that most accidents, if
not all of them, are the driver’s responsibility.
Ralph Nader's campaigning changed attitudes in the industry, and now, thirty years later, car manufacturers are much more interested in safety. But compare historical car attitudes with today's computer attitudes. Today, it isn't drivers having accidents, but users having problems. The solution—we're being led to believe—is that users should learn how to use computers better. "Life-long learning" is now everyone's right. Why? So that we can all
learn how to use otherwise difficult to use gadgets! Wouldn’t it be better if the engineers
learned how to make things easier to use so the rest of us didn’t have to learn how to cope
with the complexity?
2.9 Cognitive dissonance
Figure 2.1: A finger points to the “chip” inside a typical interactive device. (The actual
chip is hidden underneath the protective circle of resin.) The technology is amazing,
it’s so small, so fast and so powerful!! The gadget does so many useful things!!
Rather than being uncritically enthusiastic, one should also say that the device isn’t
easy to use. However, the technology itself is not the cause of the usability problems;
for example, it would be trivial to make the device more powerful if doing so made it
easier to use—there's plenty of space to make it more capable. In fact, devices are getting more powerful all the time, but they aren't getting noticeably easier to use. Ironically, the power of technology encourages devices to become over-complex, and too complex even for the designers themselves to understand fully. As technological power increases, the design problems get worse, particularly as market competition on feature lists means "continual product development" is just piling on more features. By
the time usability problems are identified, often the design has become so obscure that
it is too hard to fix properly—and cognitive dissonance (see section 2.9: the design
was a lot of work but usability studies suggest that it still isn’t easy to use) leads
programmers to deny the problems and blame users for their ignorance. In general,
then, a root cause of usability problems and their persistence is designers’ relatively
limited ability or concern to manage the interaction programming well.
2.10 Persuasion and influence
If our ancestors had spent a week deciding which antelope to kill, they wouldn't have lived long
enough to have children and us their heirs.
So, like the media equation, satisficing is hard-wired into our approach to life,
for very good reasons.
People fall in love for the same reason. We’ll make one more use of
folk-evolutionary theories when we discuss beauty in section 12.3.1 (p. 416).
If schools teach people how to use difficult things, educated people should
understand them; or, tools and devices schools teach must be well-designed
→ section 3.1 (p. 61).
If I find something easy to use, it is easy to use → section 13.1 (p. 446).
But, if you follow up the discussions in the indicated sections of this book, you will find that these very reasonable-sounding heuristics are deeply flawed.
While using any heuristic makes us seem efficient and decisive, it also leaves
us open to mistakes and manipulation by others. In other words, we may make
quick rather than considered decisions.
Salespeople know how we think, and they often try to exploit our willingness to
rely on heuristics. Here I am looking at a car I am thinking of buying. Somebody
else comes over and says how much they like it and that they want to buy it—and
all of a sudden I really want to buy it. The other person looks fondly at the car
and mutters some sophisticated things about its emissions, its horsepower, and its
dynamic road handling. Wow! Now I have three heuristics triggered: social proof
(other people like this car); threat of scarcity (somebody else might buy it); and
authority (they know a lot, so they’re right). So the car’s value appears to shoot
up, and I am strongly tempted to buy.
If that other person appreciating the car was a random person, then my reason-
ing would be sound. But if that other person was a stooge deliberately placed there
by the salesperson, I am being manipulated. Being manipulated is not necessarily a bad thing, though: I may end up making the decision I should have made anyway, but faster. I may really like the car, and it doesn't matter whether this feeling of
liking it is down to cognitive dissonance—I still like it. If I need a car and I like the
one I’ve got, then I’m happy and the salesperson has done their job.
In a different situation, a doctor might want to persuade a patient to change
their lifestyle to live longer; it seems right for the doctor to use any and all tech-
niques of persuasion, however “manipulative,” to intervene more successfully.
The doctor might give the patient a computer game or simulation, an interactive
device, to help see the consequences of their lifestyle. Or the doctor might give
the patient a heart monitor—a simple interactive device in many ways, but one
whose impact on the user can, and should, be designed to be effective. Among
other things, the heart monitor could use wireless communication to get the pa-
tient to make social contact with others who are using their devices successfully,
thus using the heart monitor as a lever into social pressures and the deliberate
triggering of heuristics.
Translate these ideas and issues fully into interaction programming, and the
gates open to captology—a word grown from “computers as persuasive technolo-
gies." Just as visual illusions exploit our perceptual heuristics to make things look better than they are, captology exploits our decision heuristics to make interaction more successful.
Interactive devices expand the opportunities for persuasion. They can be more
persistent than human persuaders and they can be (or seem) anonymous, which is
important for persuading people regarding private matters—ironically, anonymity
can help overcome social forces. Interactive devices can also be everywhere and, like a mobile phone, accompany the user all day. They can hang around much closer, and far more persistently, than any salesperson!
The Baby Think It Over is an interactive device, a realistic baby doll, that inter-
acts like a real baby would—see www.btio.com. It cries at random times, and it
won’t stop crying unless it gets attention, for varying lengths of time. It has to be
carried everywhere. It is designed for teaching: a teenager might be lent the doll
to look after over a weekend, and at the end of the weekend the teacher can get
a summary from the doll’s computer of how well the teenager cared for it. After
using this device, over 95% of teenagers in one study decided they weren’t ready
for the responsibility of a real baby.
2.11 Conclusions
The Ponzo and Müller-Lyer illusions (p. 55) show us that things are not always
what they seem to be. Similar, unavoidable, cognitive illusions conspire to make
our perception of the world different from what it really is. To a great extent, this
difference does not matter, but technology developments have put another layer
between us and reality. Media, in particular, concentrates events and channels
them into our eyes, giving us an experience we could only have had, until recently,
if we had been there. Our perceptions are changed. Drama is the special use of
media specifically to create the perception of a world, a story, that perhaps was
never there in the first place.
How we respond to our perceptions determines how we feel and whether we
like things. Whether we like interactive devices depends on how and to what ex-
tent they stimulate our perceptions. Some devices will be specifically designed to
manipulate our feelings; sometimes, our feelings are an accidental consequence
of design. The telephone started out as an aid for the deaf, but it is now mar-
keted for the most part as a fashion accessory. A key part of successful interaction
programming is to understand and either overcome or exploit perceptual factors.
A second thread in this chapter is that we, both of us, I and you the reader, are
subject to the same forces. We live in a culture that has to balance the opposing
messages that technology is wonderful, and makes lots of money, and that we are
failures who at times don’t quite understand how to use it.
Some Press On principles
Software warranties reveal deep attitudes to usability → section 2.6 (p. 48).
Cognitive dissonance: the harder things are, the more users may justify
wanting them → section 2.9 (p. 52).
In an organization, ask ordinary users, not just experts, what to design—experts
typically know what should happen, not what really does happen → section 2.9
(p. 54).
Further reading
McBride, L. H., The Complete Idiot’s Guide to Natural Disasters, Alpha Books,
2000, is one of the many idiot’s guides. The idea started life with computers,
and now there’s even a spoof, Dolt, T., and Dullard, I., The Complete Idoit’s
Guide for Dumies, Ten Speed Press, 2000. The success of the series says a lot
about us as downtrodden consumers, feeling stupid when we can’t use things.
Piattelli-Palmarini, M., Inevitable Illusions—How Mistakes of Reason Rule Our Minds, John Wiley & Sons, 1994. The inevitable illusions of the title are the mental illusions that play tricks on our minds, as unavoidable as the optical illusions that play tricks on our vision.
Reeves, B., and Nass, C., The Media Equation, Cambridge University Press, 1996.
The subtitle of this book, “how people treat computers, television and new
media like real people and places,” describes it well. Media equals reality:
people respond (in interesting ways) to media like computers and television as if they were real. This fascinating observation provides ideas on how to make
gadgets more attractive.
Tavris, C., and Aronson, E., Mistakes Were Made (but not by me), Harcourt Inc., 2007. This is an authoritative and very engaging book about cognitive dissonance, our need to justify ourselves, and the consequences of doing so. It is an enormously useful book for its powerful insights about life generally; as you read it, when the book gives its examples—buying an expensive car, buying the wrong house, divorcing the person you married, the police arresting and interrogating an innocent person, terrorists attacking a country—imagine replacing those cases with the dissonant thoughts of being smart yet finding you've spent good money and wasted time on an interactive device that disappoints you. (Elliot Aronson was a graduate student of Festinger's.)
Weizenbaum, J., Computer Power and Human Reason, Penguin, 1993. Jo
Weizenbaum was surprised how people were fooled by his simple Eliza
program, and the experience made him write this classic polemic against the
excesses of artificial intelligence. This revised edition has a good foreword; it
was originally published in 1977.
Read any software warranty and compare it with the warranty that comes with
any other device, such as a washing machine or car. Your washing machine—and
certainly your car—will have far more complex software inside, yet they work
pretty well.
3
Reality
One of the most visible transformations computers have brought about is in cal-
culators. The early calculators were enormous. Thomas Watson, the founder of
IBM, supposedly said in 1943 that he couldn’t imagine the world needing more
than five computers, which in those days were hardly more powerful than to-
day’s cheap handheld calculators. Computers then needed air conditioning, spe-
cial floors, and specially trained operators, and they cost a lot.
Today, calculators are small enough to be put into wrist watches; they are cheap
enough to give away at gas stations. They are ubiquitous, and, it’s tempting to
think, if there are so many of them and even school children are taught how to use
them, they must be good. How they work is magic, and this is where perception
and reality part company.
3.1 Calculators
Casio is probably the market leader, in a market that includes Sharp, Hewlett
Packard, Canon, and numerous other companies that make cheap calculators. So
to be specific, let’s look at two of Casio’s models. Although I’m going to be critical
of Casio’s calculators, the same sorts of things could be said about the other makes.
Casio’s are easy to get hold of, and you can buy the calculators I am writing about
to check the accuracy of my descriptions.
The calculator has a percent key, so let’s use it to find 1 plus 100 percent. One
hundred percent means, at least to my mind, “all of it,” so 1 plus 100 percent
should be one plus one—two.
Let’s see what two similar-looking calculators do. Casio’s calculator model
number SL300LC gives 2 if we press the keys 1 + 1 0 0 % for calculating
1 + 100%. That is what we expected. Now imagine that we spill coffee over the
SL300LC and need to buy a replacement. We might buy the Casio MC100, which
looks pretty similar. Asking the MC100 to calculate 1 + 100% exactly the same way
gets a different result: E 0—there is an error, and presumably the displayed 0 is
the wrong answer.
If we try another sum, say, 1 + 10% the MC100 gives us 1.1111111, whereas the
SL300LC gives us 1.1 for the same sum. You might think that this was rounding
(1.1111111 rounds down to 1.1), but that isn’t what it is doing if you try other sums.
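One reading that fits the two results quoted above is that the SL300LC adds b percent of a, while the MC100 computes a markup, dividing by 1 - b/100, so that asking for 100% divides by zero, hence the E 0 display. This is conjecture from two data points, not Casio's documented semantics:

    # Two plausible semantics for the keystrokes  a + b %  (conjecture only).
    def percent_sl300lc(a, b):
        return a + a * b / 100     # "add b percent of a"

    def percent_mc100(a, b):
        return a / (1 - b / 100)   # markup: the answer less b percent of itself gives a

    print(percent_sl300lc(1, 100))   # 2.0
    print(percent_sl300lc(1, 10))    # 1.1
    print(percent_mc100(1, 10))      # 1.1111111..., as the MC100 displays
    try:
        print(percent_mc100(1, 100))
    except ZeroDivisionError:
        print("error")               # the MC100's E 0 display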
Figure 3.1: The Casio SL-300LC. Note the highly visible self-advertising memory keys
in the middle of the top row of buttons—imagine it being sold in a transparent bubble
pack from a shop display.
Evidently these two calculators are quite different in their approach to percentage
calculations.
I have used this example with audiences around the world. Well over a thou-
sand people have been surprised, and nobody could explain why Casio should do
this. And if there were good reasons to have different sorts of percentage features,
why doesn’t the calculator let you choose one?
Surprisingly few people actually use percentage, outside of shops, where they
are always adding on tax. Almost everyone I’ve asked says that they don’t un-
derstand percentage and they don’t use it. In other words, most people think the
problem is theirs, not the calculator’s, and they rearrange their lives so that they
need worry no more—they avoid using the quirky feature.
What is the percent key for, then? You can see on a calculator’s packaging when
you browse in a shop that there is a button there that does something. The %
key makes the calculator look a bit more sophisticated and more useful. “With
percent” can be emblazoned across the box to make it look better than the simpler
calculators next to it in the shop’s display. And we allow our perception of the
calculator to overwhelm our assessment of its true utility.
Actually, anyone who doubted that a calculator did what it claimed could spend
ages in the shop trying to assess it—you could test the problem I mentioned above,
but if you are buying a different calculator, what should you look for? Moreover,
if you are buying a calculator, chances are you are not very good at arithmetic,
which is why you need one in the first place. That would make checking even
harder!
Both of these calculators have a memory, which uses three buttons. The key
MRC recalls the memory, and if it is pressed twice it puts zero into the memory. The
key M+ adds the displayed number to memory, and M– subtracts the displayed
number from memory.
The question is, how do you store a number you’ve just worked out in the mem-
ory? Pressing M+ sounds obvious, but that adds to the memory, so it will only do
the intended thing if the memory already contains zero. To make the memory zero,
you’d have to press MRC twice, but the first time you pressed it you would lose
the number you wanted to remember because MRC recalls the memory number.
It is possible to store a number in memory. Here’s one way: press M– – MRC =
M+ , and finally press MRC to get the display back to what it started with. That’s
six steps, and it’s as fast as it can be done. (If you are cautious, you might worry
that the memory might overflow, which is a problem that is not easily avoided.)
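We can check the six-step claim with a toy simulation. The memory keys behave as described above; the rest of the calculator model is a minimal assumption of my own (for simplicity it supports only subtraction and ignores the detail that pressing MRC twice clears the memory):

    # Toy calculator: does  M- , - , MRC , = , M+ , MRC  store the display?
    class Calc:
        def __init__(self, display, memory):
            self.display, self.memory = display, memory
            self.operand = None

        def key(self, k):
            if k == "M+":    self.memory += self.display
            elif k == "M-":  self.memory -= self.display
            elif k == "MRC": self.display = self.memory
            elif k == "-":   self.operand = self.display   # start a subtraction
            elif k == "=":   self.display = self.operand - self.display

    c = Calc(display=9.0, memory=42.0)   # 9 on screen, junk already in memory
    for k in ["M-", "-", "MRC", "=", "M+", "MRC"]:
        c.key(k)
    print(c.display, c.memory)           # 9.0 9.0 : the 9 is stored, display restored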
The upshot is that it is often easier to use a piece of paper than to use the mem-
ory. Yet surely the whole point of the calculator’s memory feature is to make it
easier to remember numbers?
Somehow the perception that the calculator has a memory and that it must be
there to make the calculator more useful seems to overpower the reality. It is tricky
to do these calculations, and it would be practically impossible to figure out in a
shop, with pressure from the sales staff. You’d also give the impression that you
couldn’t even master a simple calculator!
Memory is intended to make the calculator easier and more reliable to use; if it
is confusing, as it seems to be, then the calculator might be better off without the
feature. Does the calculator need memory anyway? Consider working out a sum
like (4 + 5) × (3 + 4)—this is obviously 9 × 7 = 63 (we are using small numbers
only so the example is quite clear):
1. Press 4 + 5 .
2. If we pressed × now, we would start working out 9 × 3, but we want
9 × (3 + 4). We have to store the current number, 9 in this case, into
memory because the calculator doesn’t have brackets.
3. Now press 3 + 4 = , which works out the 7.
4. Now press × MRC = and we get the answer, 63.
So this calculation, or any like it, requires the memory features. If the calcula-
tor had brackets, we could have entered ( 4 + 5 ) × ( 3 + 4 ) = directly.
Brackets, if it had them, would be easier and more useful than the memory feature.
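Writing the keystroke plan out as straight-line code (a sketch of mine, not from the book) makes the memory's role explicit:

    # The memory-based plan for (4 + 5) x (3 + 4) on a bracketless calculator.
    display = 4 + 5               # step 1: work out 9
    memory = display              # step 2: store it (M+ with an empty memory)
    display = 3 + 4               # step 3: work out 7
    display = display * memory    # step 4: press "x MRC =" to get the answer
    print(display)                # -> 63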
Section 12.6.4 (p. 432) shows better ways to design calculators to avoid problems like these.
3.2 Televisions
I have a Sony KV-M1421U type television together with a remote control, the RM-
694. Sony is a leading manufacturer, and what I say about these devices is typical of TVs and remote controls generally.
[Statechart diagrams not reproduced: the states shown include Standby, Sound off, Sound on, Clock off, and a 90 minute timer.]
Figure 3.2: Statecharts for a Sony TV and its remote control. Chapter 7, “Statecharts,”
explains them fully; for now it doesn’t matter what a statechart is—just see how these
two statecharts for controlling the same TV are very different, and that means they
have to be used in very different ways.
The two diagrams in figure 3.2 (this page) show two statecharts specifying how users interact with the TV itself and with its remote control. We don't need to understand statecharts to see that the two interfaces are very different; even the corresponding buttons do different things. Some features can be done on the TV alone, some can be done on the remote control alone, and some features available on both are done differently on each device.
The television and the statechart diagrams will be fully covered later.
Section 7.3 (p. 210) explains them fully and has full-size statecharts, which are
more readable than the small ones shown in figure 3.2.
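The point that one product can present two unrelated interfaces can be made concrete with two transition tables over the same states. These tables are invented for illustration and are not read off Sony's statecharts:

    # One imaginary TV, two interfaces: the same button can do different
    # things, and some features exist on only one of the interfaces.
    tv_panel = {
        ("standby", "power"): "on",
        ("on", "power"): "standby",
        ("on", "volume"): "on",            # the panel has a volume button
    }
    remote = {
        ("standby", "power"): "on",
        ("on", "power"): "standby",
        ("on", "timer"): "on with timer",  # only the remote can set the timer
    }

    def press(table, state, button):
        return table.get((state, button), state)   # unknown buttons do nothing

    print(press(tv_panel, "on", "timer"))  # 'on' : the panel cannot do this
    print(press(remote, "on", "timer"))    # 'on with timer'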
Although there is only one application, namely, the TV, you have two designs to
understand, each with its own independent rules. If the television’s user manual
explained how to use the TV properly, then having two different interfaces for
it would require a user manual of twice the thickness. Moreover, if you become
skilled with one interface, then your skill is of little use with the other: there is little
scope for transfer of skills from one interface to the other. If you’re accustomed to
using the remote control, and then lose it, you will find the television hard to use.
One wonders if this is deliberate: people lose their remote controls, become ir-
ritated, and perhaps think it worthwhile buying replacements. Had the television
and remote control been similar in their design, there would be less incentive to
buy a new one. Remote controls are surprisingly costly: they are probably nice
little earners for manufacturers.
3.3 Cameras
Cameras are different from calculators and televisions because using them be-
comes a hobby and, for some people, a profession. It’s hard to think of anyone
(apart from me) playing with calculators as a hobby! Canon is a leading manufac-
turer of cameras, and the Canon EOS500 was a popular automatic single lens reflex
(SLR) film camera, the culmination of decades of development from earlier models.
Unlike the Casio calculators or the Sony TV, the EOS500 camera has a substantial
user manual: not only does the camera manual explain how to use the camera,
it also provides photographic hints and tips so you can get the best out of the
camera.
The EOS500 user manual warns users that leaving the camera switched on is a
problem. There is an explicit warning in the manual on page 10:
When the camera is not in use, please set the command dial to ‘L’. When the
camera is placed in a bag, this prevents the possibility of objects hitting the
shutter button, continually activating the shutter and draining the battery.
Since this problem is documented in Canon’s own literature, we can be sure that
Canon was aware of the usability problem but somehow failed to fix it before the
product was marketed.
Interestingly, three years later Canon released an updated version of the EOS500,
the EOS500N. The user manual for this camera phrases the same problem thus:
If the film is removed from the camera in mid-roll without being rewound
and then a new roll of film is loaded, the new roll (film leader) will only be
rewound into the cartridge. To prevent this, close the camera back and press
the shutter button completely before loading a new roll of film.
It seems that the manual writers discovered that as well as pressing the shut-
ter button the camera back must be shut too (it would probably be open if you
were changing a film). It doesn’t seem like the camera designers read the earlier
EOS500’s manual themselves; otherwise, they might have tried to fix the problems
some people—at least the manual writers—at Canon knew about.
Canon now makes digital SLR cameras, which are more popular than film SLRs. The
digital SLR corresponding to the old film EOS500 is the EOS300D (EOS350D,
EOS400D . . . ), and Canon has changed the on/off switch design considerably. It
is now a separate switch that cannot be moved accidentally, and the camera also
has an adjustable timeout—if it is left on for a long time without the user taking
any photographs, it switches itself off automatically (though it doesn’t move the
on/off switch into the off position as well).
Box 3.1 Turning the Pages Housed in an impressive building in the heart of London, the
new British Library must be one of the most civilized institutions in the world. It has the
deepest basements in London, and over 300 kilometers of bookshelves on site.
Tucked away in a corner is a computer-based exhibit, Turning the Pages. Touch screens
combined with photographic animations give people the sensation of touching and turning
over the pages of books. The books are lovely works of art, and the touch screens give an
impression much more real than using a mouse could: touch a page and drag it across the
screen, and see it turn over under your fingers, just like a real page. The success of the
system should not prevent us from being critical of its design. It is good but it could be
much better.
Here are two simple things the designers of Turning the Pages got wrong:
The books almost work like real books: you can turn over pages, and go forward
or backward through the book. However, every so often you turn over more than one
page—maybe some pages are blank or not very interesting. Yet when you turn over
the last page, and you don’t know which the last page is (because several pages may
turn over at once), the book turns over and shuts. At this point you cannot turn back;
instead, you have to start at the first page again! In an unpredictable situation, the
user cannot undo the last action.
For some reason (perhaps the developers could not program mouse drivers) pages turn
in a jittery way, and to be turned over successfully you have to move your fingers
precisely. Some people don’t and are mystified that the pages don’t turn over—and
once people have trouble, they start trying harder and are even less likely to move their
fingers from the “right” positions at the “right” speeds.
Both of these problems are breakdowns in affordance, a concept defined in
section 12.3 (p. 415).
unable to discover how to do it. There is no reason why all functions should not
be in the main menu, where a user could easily find them. If Nokia insists that it
is a good idea to have a different way of accessing the keypad lock, it can still be
done that way and be in the menu: there’s no harm having a useful function in
more than one place to make it easier to use. That’s a design suggestion, but when
you ask mobile phone users, you get the usual response. For users who know how
to set the keypad lock, it’s not a problem; users who don’t know how to set it
have decided to change their lives—to reduce cognitive dissonance (p. 52)—so
that they need neither the function nor the embarrassment of not knowing how to use
something they need. And then there are people who use a different make, and
aren’t interested in Nokia’s problems.
We can criticize Nokia’s keypad lock design, but sometimes there is no sat-
isfactory solution. In an emergency, you certainly want to be able to dial 999 (or
whatever the national emergency code is—or if you are British you might want
your mobile phone to treat 999 like the local national emergency number, wher-
ever you are, so you don’t need to know it), and fiddling with a keypad lock may
be the last thing you are able to do under pressure. It may not even be your own
mobile phone. Some mobiles therefore allow a user to dial 999 even when the key-
pad is locked. This makes the phone much easier to use in an emergency, which
is a good idea—if you are in a panic, you don’t want to have to work out how to
unlock the keypad as well, especially if your phone is unfamiliar.
Unfortunately, if a mobile phone is in a handbag or pocket all day with the key-
pad locked, the buttons will be pressed randomly and eventually it will succeed in
dialing 999, as the emergency services know to their cost. In fact, as I know to my
cost, a phone with the keypad locked will eventually unlock itself if it is banged
around in a pocket for long enough—unlocking a keypad usually takes only two
key presses, so it is likely to happen by accident before 999 is dialed by accident. Once
unlocked, your mobile phone can do anything, and probably does.
For more on mobile phones see chapter 5, “Communication.”
3.7 Home theater
The user manual for my TDP-T30 projector runs to 32 pages, repeated four times
in four different languages, but it does not seem to help with any of what follows.
You have to figure things out for yourself.
You switch the TDP-T30 on, and it projects a startup image on your wall or
screen. Probably, the startup image will be a trapezoid or other funny shape. Projectors
therefore have keystone correction, which can adjust the horizontal or vertical
shape of the image to turn a keystone (trapezoid) shape into a rectangular picture
(a trapezoid happens to be the shape you’d get if the projector were higher than the
center of the screen, as it would be if it were mounted on the ceiling). The projector
has features to project its image any way up, or even backward, for rear projection.
The TDP-T30 has buttons on the projector itself and a separate remote control.
None of the projector’s buttons seem to access the keystone facility, so you try the
remote control. None of these buttons help you either, though you notice that the
buttons are quite different from those on the projector itself, and there is a hinged
flap on the remote control hiding some of them. Some buttons have two labels,
like RETURN and ESC , and there are also two unlabeled buttons, one on the front
and one on the back.
If you pause to think for a few moments while using the remote control, as
you will, it reverts to being a computer mouse pointer. The remote control has a
30-second timeout and the projector has none—so you can be left looking at an
on-screen menu but unable to control or dismiss it!
It’s frustrating, because there is a substantial menu system. When you search the
menu system, though, you won’t find keystoning. You quickly give up on the
Setup button because it displays a big white X on a red background on the screen:
presumably it does nothing.
It turns out that keystone correction can only be done when there is a video
signal, say from your DVD player. Now the Setup button shows a menu and al-
lows you to change horizontal keystoning. The Menu button still works and still
gives you several menus—but it still does not mention keystoning. The setup
menu shows both horizontal and vertical keystoning as options, but I still haven’t
worked out how to get vertical keystone correction to work (and I suspect it isn’t
possible on this model—providing it as a menu choice must be a mistake). There
are several other features in various menus that do not seem to work at all, so it’s
not surprising to have some more inconsistencies.
Interestingly, as you correct keystoning when you have the video signal, the
inverse keystoning starts to affect the menu shapes! In other words, although the
technology can clearly adjust the menu shapes with keystoning, Toshiba chose to
only allow correction when there was a video signal. That makes it harder to find
out how to do it, since you might try (as I did) to correct the projection before
switching your DVD player on—I mean, you don’t want to watch your movie
before the shape is corrected, do you?
Why, at least, wasn’t that big but meaningless X explained in English, say,
“Please switch on a video source to correct keystone easily”? (The word “easily”
hints that you might even prefer to do it this way, rather than be irritated that there
is no other way.) Almost all the other menus are shown in English, so it wouldn’t
have been difficult to do so. Why is a separate button needed for keystone correc-
tion when there is a general menu system that could do everything? Indeed, the
range and variety of functions in the menu system fooled me into thinking that
everything was in it. Why not redesign the device so the warning (here, about
needing a video signal in order to correct the keystone shape) is not necessary?
So that’s just setting up the projector. Now you want to use it. In my case, I
screwed it to the ceiling, and that makes it just a bit too high to reach (which of
course is why you want it fixed on the ceiling—to get it out of the way). I can
switch the mains power on and off easily, but unfortunately this only switches the
projector into standby—it does not switch on the light. The projector has a feature
called “no signal power off” which I thought would mean that the projector could
switch itself off when there is no video signal, that is, when you switch off your
DVD player. But this feature is disabled: it’s there in the list of on-screen menu
options, but you can’t use it. The projector also has a feature called auto power
on, but even though the menu allows this to be switched on and off, I haven’t
found out what difference it makes—the menu seems to be misleading. The two
features together, if they worked, could have been perfect: I’d keep the projector
in standby, and it would switch on and off automatically with my use of the DVD
player.
If your projector is on the ceiling or otherwise out of reach, you have to use
the remote control—a design decision that, incidentally, forces you to keep buying
batteries (and disposing of the old ones) forever to keep it working. And given
that the remote control is small, you are bound to lose it. And guess who benefits
from you needing to buy a new one? Worse, I often find the projector still on after
somebody else in my family has been unable to switch it off.
Guess what? Projector light bulbs have a limited lifespan (for this model, only
3,000 hours). They are very expensive, about £170, which is nearly a quarter the
cost of a new projector; and when they burn out they may explode, spraying “envi-
ronmentally harmful” (those are the very words the owner’s manual uses) mercury
around your house—whatever happens, it’s not just a cost to the consumer but
pressure on the environment, whether the mercury is sprayed round your house or buried in
landfill, which is where it will end up from most houses. Despite those problems,
here we have a device clearly designed to burn out light bulbs as fast as it can, even
though it shows evidence of a lifespan-enhancing feature that has been deliberately
disabled for some reason.
See section 1.9 (p. 27) for more on waste and recycling issues.
I have no reason to think that Toshiba works like that, for if they knew what the
wider consequences of their interaction programming were or if they had done any
study of how home projectors are really used, surely they would know enough
about the design to get the user manual right? The device itself isn’t that compli-
cated, so there’s no plausible technical reason for the interaction programming to
have become so unusable, as it seems to have done. So I read everything on the
CD that comes with the projector, not just the printed owner’s manual.
Reading the CD, I discovered that you can set auto power off, only you don’t do
it the same way as setting other menu options. Instead of selecting the choices by
using left and right arrows, which I’d tried because that’s how you select all other
menu choices everywhere else, you are supposed to press R-CLICK on the remote
control or ← . For some reason, this is also called ENTER in the manual—probably
a symptom of their having lost track of what it was called too, or that they were
recycling another design.
On the CD, I read that the auto power on feature means that switching the mains
on switches the lamp on, not that a video signal wakes it from standby, which is
what I’d wanted and assumed.
Why did I have to get to page 33 of the advanced manual that is only provided
on the CD before I discovered all this? Would normal users persevere as I did?
It was only because I was writing this book and wanted it to be accurate and
honest that I double-checked and rechecked the manual several times . . . so I fi-
nally noticed that it said “For more details, please refer to the owner’s manual
of the CD-ROM version.” Ah. Even now I’m still not sure that I’ve finished my
discovery process.
So rather than having a projector that can switch from standby to on, or on to
standby automatically with a video signal, we have one that can only switch off.
To switch it on, we still need the remote control, or we have to use the mains power
to restart it—but this risks shortening the bulb life, as it also switches off the fan.
Furthermore, now that we have set the projector to be as automatic as it can be, we
need the remote control less than when it was manual, as it was configured out of
the box. If we use the remote control less, we are more likely to lose it.
In summary, I suspect that this device has technical specifications sufficient to
sell it—screen resolution, lightbulb brightness and so on—and that’s all it needs to
be a successful product. In the shop, I didn’t notice the tedious user interface, nor
did I realize how bad the user manuals were. I didn’t notice the rainbow stripe
effects of the projector technology either, but that’s a different story.
The moral of the story is that bad design—which includes bad manual design
and bad correspondence between device and manual—happens because there is
no incentive to do any better. But the manufacturers could tell users (and shops
or web sites) more about their products, which would allow users to make better-
informed choices, and even to try out simulations of the products on the web—
though of course to do this, the manufacturer would have to use a design method
that allows them to build accurate simulations and web sites for their products.
This book introduces a simple programming framework (not the only possible
one, but an example of one) that solves these problems. The framework is
discussed from chapter 9, “A framework for design,” onwards.
Getting users to make better choices would put market pressure on the manu-
facturers to do better and to be clearer to customers (and indeed to manual writ-
ers) about what things really do. Well, maybe the fear of consumers knowing
what they are buying keeps manufacturers from being clear about products be-
fore they’re sold.
As with all critiques here, I don’t write like that without giving the manufacturer
a chance to respond. I found their contact email on their web site, and sent off
a pretty clear email to them. I got a reply a few days later, asking me to phone
their Projector Helpline on 0870 444 0021—so a human had read the email, I think.
The phone number is an expensive UK pay number, costing me money to call. I did
call, and the phone was answered immediately with the words, “We’re sorry to
keep you waiting.” Twenty-five minutes later I gave up waiting to talk to a human.
Thus Toshiba makes money from their customers’ problems; perhaps it isn’t such
a bad approach to design after all.
For issues relating to call waiting and externalizing costs, see section 1.6 (p. 19).
Figure 3.3: The Sharp (left) and One For All (right) remote controls, shown to scale—
the originals are 17cm and 23cm (7 and 9 inches) long, respectively. The buttons on
the Sharp always look the same regardless of the mode of the device; the One For All
buttons have a back-lit screen that changes depending on what the remote control is
doing.
the infrared codes for almost any remote-controlled gadget. Figure 3.3 (this page)
shows the original Sharp and the One For All remote controls side-by-side. The
One For All is a larger, more substantial remote control, with back lighting so you
can see its buttons. Moreover, only relevant buttons are lit up in any mode—
whereas the Sharp remote control provides buttons for your Sharp TV (I have no
Sharp TV) that get in the way and make its interface more cluttered.
The One For All remote control is set up with a four-digit code so it can be told
what device it is supposed to be controlling. It has a thick manual that includes a
dictionary of companies and codes to try, as some companies (such as Sharp) have
made devices that use different infrared control codes. Unfortunately I couldn’t
find a code for the Sharp SD-AT50H in the user manual. One For All has a web site,
but you can’t search for model codes like SD-AT50H. They have an email service,
which provided me with more details, and—to cut the story short—I never found
out how to get it to work, so I returned it to the shop and got my money back. I’m
afraid that the process gives One For All no useful feedback, so the company will
find it hard to improve the product; there isn’t even a registration process with a
comments box.
To conclude these two sections on home theater, it should be said that the Toshiba
projector’s picture and the Sharp amplifier’s surround sound are both great; however,
their interactive features are terrible. The remote controls are physically awkward
and follow a different interaction logic from the main devices.
The designers seemingly didn’t consider how their devices would be used, even
though how they are supposed to be used is pretty obvious. The projector is on
the ceiling, and the amplifier is across the room, both out of reach from where one
sits. Yet neither device can be used fully using its remote control alone.
3.8.2 Calculators
Sometimes features that ought to interact fail to do so. Many handheld calculators
provide various features, such as, on the one hand, being able to handle fractions
and, on the other hand, being able to do advanced maths such as square roots,
sines, and cosines.
In fraction mode, you can add 1/9 to 1/2 and get the exact answer as a fraction
(it’s 11/18), whereas in the normal mode 1/9 ends up as a decimal and gets rounded.
When it’s added to 0.5 the final answer is only approximate, as there is no exact
(finite) decimal number for eighteenths.
In normal mode, you can find square roots of numbers. For example √16 is 4,
and √.25 is .5. That feature works fine, then.
Very few calculators can cope with both features at once. Trying to find the
square root of 1/9 typically gets 0.3333333 rather than 1/3, which is the correct an-
swer. Worse, if you now square it you’d get 0.11111, which is not even 1/9 to the
original accuracy! (I’m assuming 8 figures; the calculator I tried this on rounded
fractions to a lower precision.) Square root works fine, and fractions work fine; but
somebody forgot to get them to work together. They forgot that the two features
interact with each other.
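The interaction is easy to reproduce. Here is a minimal sketch in Python, assuming an illustrative calculator that rounds square roots to seven decimal places for an eight-figure display (the precise figures depend on the calculator):

    from fractions import Fraction
    from math import sqrt

    shown = round(sqrt(1 / 9), 7)    # what the display shows: 0.3333333
    print(shown)                      # 0.3333333
    print(round(shown ** 2, 8))       # 0.11111109 -- no longer 1/9 to 8 figures

    # If square root understood fractions, the exact value would survive:
    print(Fraction(1, 3) ** 2)        # 1/9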
A common feature interaction in so-called scientific calculators is that they pro-
vide complex numbers and all sorts of scientific features, such as trigonometry
functions. But they don’t work together. Indeed, few calculators that boast complex
numbers can even do √−1, which must be the quintessential complex number.
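By contrast, a system whose square root and complex-number features do work together gets this right; in Python, for instance, the ordinary math.sqrt refuses negative numbers, but the complex-aware cmath handles them:

    import cmath
    print(cmath.sqrt(-1))   # 1j -- square root and complex numbers interacting properly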
undoes the changes the user tried to make. To make matters worse, the machine
delays before it takes over resetting the cycle—so a user in a hurry may not notice.
If you do want to change the settings, you have to switch the machine off and on
again and wait until the “on” light flashes.
I’m never quite sure what to do because there is nothing on the machine to tell
you what to do. You don’t know whether the light flashing means reset or it’s
waiting for you to open the door. Here are Indesit’s actual instructions:
Easy Operation [. . . ] instructions to perform a simple re-set [. . . ] the
programme will continue when the power supply is resumed or you switch
the appliance back on.
If you do not wish to continue the programme or you want to interrupt the
cycle to set a new one, please re-set the appliance as follows:
1. Align the dial with one of the “•” positions, for at least 10 seconds with
the appliance power on.
2. Simply press the Reset button for at least 3 seconds.
3. Set the appliance to any new programme you wish.
4. If in doubt always re-set the appliance as described above.
There are a lot of emphatic “easy” and “simple” words and a note that after
you’ve finished, if you are still in doubt, you should start again! Notice that the
user may want to do a reset frequently (in my household we do one every time we
run the washing machine), yet there are timeout steps where you have to count up
to ten or to three—and the machine itself gives no feedback to help. Why doesn’t
it just have a reset button and be done with it? Why does it need a reset anyway—
surely it could tell the difference between a power cut and the user trying to do
something?
Box 6.2, “Timeouts” (p. 174) continues the discussion about timeouts.
In the good old electromechanical days, if there was a power cut, the timer would
stay where it was. With an electronic timer, keeping track of where the machine is
in the washing cycle is a problem, and Indesit’s design solves this problem. Unfor-
tunately, it solves the problem in a machine-oriented way and creates a completely
new user interface problem: a feature interaction between the “power cut” auto-
matic resetting and the “user resetting.”
Figure 3.4: The West Anglia and Great Northern (WAGN) Rail ticket machine at
Welwyn North railway station (just outside London). Note the label “Easy to Use,”
the ten unlabeled buttons that do nothing, and the payment slots with pictures of
coins or credit cards underneath them. The paper money slot you are supposed to use
is not marked.
that there is, indeed, a network card option available, but it is a choice given to
you after you try to buy the ticket without a discount—whereas almost all of the
other options are given to you before you buy a ticket. Moreover, the screen that
lets you use the network card discount does not have a “go back” button, so you
cannot change the type of ticket (you might want to do this after finding out how
much it costs; say, you might decide to buy a one-way rather than a round trip
ticket).
This experience highlights three design issues. First, the automatic ticket ma-
chine is inconsistent: sometimes there are “go back” options, and other times there
are not—for no good reason. Second, some options are provided before you select
to buy a ticket, and others are given after you try to buy; since you don’t know
where your options are before becoming experienced with the ticket machine, the
interface can seem mysterious. Third, experts (such as the person in the ticket of-
fice) are oblivious to complaints about how difficult the machine is to use, since
they already know how to use it.
The ticket machine does have a help button, labeled i , presumably meaning
“information.” This is not a soft button, so it is always visible. In any case, it
provides misleading help. For example, the main top-level help, which is shown
whenever the i button is pressed, is:
Using This Machine:
[. . . ] 2 Select the ticket type required [. . . ]
On the machine’s help screen there is no full stop after the 2 on the “Select the
ticket type required” line. I wonder why WAGN doesn’t use automatic methods
to help them—as I did using LaTeX to typeset the example text above, which
avoided this error. Every day lots of people use this machine, and a small effort to
make the design better would have made a lot of people happier. Once you find
a feature or detail that the tool can help with (like punctuation), it can automat-
ically help everywhere with it, and therefore save a lot of design work, as well as
improving quality throughout.
Chapter 11, “More complex devices,” gives many techniques for generating help
text like this automatically. See section 11.6 (p. 396) for further ideas about
building tools to help write user manuals.
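To make the idea concrete, here is a minimal sketch of help text generated from data, so that details like numbering and full stops are decided once, in one place (the steps shown are hypothetical, not WAGN’s actual wording):

    steps = ["Select your destination",
             "Select the ticket type required",
             "Insert coins, notes, or a credit card"]

    def help_screen(title, steps):
        lines = [title + ":"]
        for n, step in enumerate(steps, 1):
            lines.append(f"{n}. {step}.")   # numbering and punctuation added here, uniformly
        return "\n".join(lines)

    print(help_screen("Using This Machine", steps))

With this approach a missing full stop is impossible: either every line has one or none does, and fixing the generator fixes every screen at once.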
Anyway, this help screen isn’t helpful since some discounts must be selected be-
fore step 2 and some after step 2.
It’s more fun to press the i key when you are wondering how to use a network
card: this causes an error, as there is no help provided! The “help” says, “Help
not implemented for ContextID 0011.” Yet if somebody wrote this help text
into the program, they had anticipated this sort of error. If so, why didn’t they
check that it could never occur in the released product? There are only about one
hundred destinations, so there are not too many screens to check by hand,
though a design tool would have helped enormously in the checking.
Since every destination has about ten variations of pricing, and we know that
WAGN increases ticket prices regularly, surely there must be a computer database
somewhere that organizes everything? If there isn’t a database, there’s going to
be a big mess sooner or later. If there is a database, why wasn’t it used to build a
consistent user interface, and one where help was properly provided?
The ticket machine has another problem, which again shows how little effort its
designers put into it. Suppose you are part of a family group: you want to buy a
ticket for yourself and children. It takes about 13 key presses to buy your ticket,
and then you want to go back a screen and buy tickets for your children. Your kids’
tickets take 13 key presses each!
screen to change the ticket type (from adult to child), let alone simply repeat the
last purchase. Little wonder that the ticket machine is not used much, except when
the ticket office is shut.
I chatted to the woman who’d bought a ticket just before me. She blamed herself
for leaving her change behind. Her task was to buy a ticket. The ticket machine
gave her a ticket (and a frustrating time), so she picked up the ticket and turned
away. If the train she’d wanted had already been at the station, she’d immediately
have been running toward it, without even a chance to hear her change clunk
into the slot. She had completed her task, but the badly designed machine had not
finished its part of the task. The ticket machine could have been designed to drop
change before, or perhaps at the same time as, the ticket itself.
I then had a chat with a railway employee. He hasn’t been trained to use the
machine, let alone to explain it to customers. There is an operator’s manual, but he
hasn’t seen it. He has to empty the cash from the ticket machine from time to time,
and the first time he did this he set off alarms, since he didn’t know how to use the
buttons inside the machine. His insight was that it was hard enough for him,
but passengers have to learn how to use the ticket machine while they are trying
desperately to get a ticket before the next train comes.
3.10 Car radios
Figure 3.5: The Pioneer KEH-P7600R radio in a laboratory rig with its remote control
and a loudspeaker. Usability experiments can be done without involving any car driving.
The driver looks at the radio display to check which category is chosen
(SOCIAL, RELIGION, whatever), and keeps on changing category until a
music category is selected. The driver chooses music categories. This takes 15
seconds of full attention, plus the radio’s 15-second search.
If the driver’s search is unsuccessful (that is, because there is no JAZZ) then the
driver must repeat the process. This may require up to 11 attempts, as there are
that many different music categories. Total driver time is 165
seconds—assuming no errors and full attention to the radio. Total radio time is
165 seconds. Overall time is 330 seconds, or about 5 minutes.
To avoid looking at the radio, the driver can select each category in turn, then
scan for that category until it finds some music. The radio will spend 15
seconds on each search—15 seconds for SOCIAL, 15 seconds for RELIGION,
and so on. This requires less continual attention from the driver, but there are
more RDS choices to work through. There are 30 categories, and in the worst
case the one the driver wants will be the last one looked at. Although there are
several music categories, there is no general music category; even OTH MUS
means music not otherwise categorized (so, for example, it will not find JAZZ).
Total driver time is 30 × 10 = 300 seconds, and total radio time is 30 × 15 = 450
seconds. Total time is 750 seconds, over 12 minutes.
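The arithmetic is simple enough to set out explicitly. The sketch below (in Python; all figures are taken from the two scenarios above) works through both worst cases:

    look, search, select = 15, 15, 10   # seconds: driver looks, radio searches, driver selects

    music_categories = 11               # scenario 1: try each music category in turn
    print(music_categories * (look + search))      # 330 s -- about 5 minutes

    all_categories = 30                 # scenario 2: scan every RDS category by ear
    print(all_categories * (select + search))      # 750 s -- over 12 minutes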
These times, five to twelve minutes, are surprisingly long, especially considering
they are from ideal, stationary conditions without any of the demands of driving.
If the driver makes mistakes using the radio, the times (to say nothing of the frus-
tration) will increase, as will the amount of attention the driver needs to devote to
the radio. If unable to concentrate on the radio, the driver will have to spend more
time on the radio to recover, thus increasing the risk that the radio’s timeouts will
take over.
Having no timeouts would make it easier for the driver to pay attention to driv-
ing tasks without risk of losing their place in the interaction. Instead of timeouts,
the radio could have a reset button: the driver should decide when to start again—
not the radio.
To complete most tasks successfully with the radio requires looking at its dis-
play. Some indicators are merely informational (say, whether the subwoofer is
on or off); others are relevant to how the user wants the device to perform (say,
whether the radio broadcast may or may not be interrupted with traffic announce-
ments). Some of the textual indicators are only 1.5mm high.
In my car, the radio-to-eye angle is 70 degrees (which makes text 1.5mm high as
measured on the radio appear like 1.4mm), and my eyes are 80cm from the panel in
a normal driving position. The UK traffic regulations, The Highway Code, require
that a driver can, at a minimum, read a vehicle number plate at a distance of 20.5
meters in good daylight. Doing the sums, this translates to requiring 3.3mm for
the height of text on the radio. In other words, even for drivers who pass the legal
vision test, the radio displays are far too small to read.
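The sums are worth setting out. By similar triangles, text that is just legible at the number-plate test distance must be proportionally tall at the radio’s distance; the sketch below assumes number-plate characters of about 85mm (an assumption—the text above gives only the 20.5 meter distance and the 3.3mm result):

    plate_char_mm = 85          # assumed height of a number-plate character
    test_distance_m = 20.5      # the legal reading distance
    eye_to_radio_m = 0.8        # my normal driving position
    print(round(plate_char_mm * eye_to_radio_m / test_distance_m, 1))   # 3.3 (mm)

So text on the radio needs to be about 3.3mm high to match the legal vision standard—more than twice the 1.5mm actually used.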
Inevitably, to read the radio’s status lights, the driver has to move their head
closer and (in my car) lower: the driver would have to move their head at least
0.64 meters closer to the radio—very close. Doing so significantly reduces the
driver’s view. Special safe radios should have air bags in them: it’s likely that the
driver’s head will be close to them when an accident occurs!
Box 3.2 Design blindness There are all sorts of reasons why interactive products fail to be
as good as they should be. The simplest reason, and one this book tackles, is that nobody
really knows what the products actually are.
Sure it’s a big program. And it’s had features added to it so it does everything anybody
can think of. But how do these features work? Can they be used together in all combinations?
Is help provided for the user, and if it is, is it provided uniformly well? Who knows? Does
the user manual describe the real system, or a figment of somebody’s imagination that never
got implemented? Is it even worth writing a user manual, given that it probably won’t be very
accurate—and that it would have to be extremely long to cover all the features it should?
Sometimes it is too easy to worry more about the niche the product is intended for
than about the design itself. We can spend endless time surveying the users, their tasks,
their contexts, and how they run their lives with the devices we want to build. It is very
important to do this, and there are various helpful techniques, like scenarios.
Scenarios are discussed in box 13.7, “Scenarios” (p. 480).
The people who like developing the human story of the design are rarely the people with
the technical skills to build something well—somebody also has to specify the device so it
can be built. That does not happen automatically just because a lot is known about the
planned use of the system.
If you build a system using finite state machines, as this book discusses (from chapter 6,
“States and actions,” onward), you will know exactly what you have built. It will then be
easy to analyze, maintain, extend, and write manuals for it.
The examples of trying to use the radio, gone through in some detail above, do
not exhaust the design problems. Here are some more:
The radio has some buttons that do nothing for several seconds when pressed.
They have to be held continuously before they work, which is tricky when
driving. It also means that a driver may hold the wrong button down for a few
seconds before noticing they’ve made a mistake.
The radio has numerous features such as “FIE,” an abbreviation whose
meaning is lost on me.
The spacing of the radio’s front panel buttons is about 8mm (and many of them
are indistinguishable to the touch)—a closer spacing than any other car control.
Most cars are used by more than one driver, so the radio’s programmable keys,
which may mean different things for different drivers, will cause problems.
Some cars are rentals, and the driver will not have time to sit down and read
the manual before setting off. My radio has a 76-page user manual. It is not
easy or even legal to read when driving.
In summary, this car radio may be technically brilliant, but it wasn’t designed to
be used while driving, and it wasn’t designed to be easy to use. But it isn’t much
different from other car radios. The issue is: why are car radios so bad? Like other
gadgets, by the time you have a nice new radio installed in your car, and you
discover how difficult it is to use, it’s too late to do much about it, other than to
resign yourself to coping with it.
other company’s design. No doubt users of Motorola, Siemens, and the others
think the same way, and thus there is little incentive for standardization. Yet the
lack of a decent user interface is bizarre when you compare it with the design of
the radio part of the mobile phone. These phones can work all over the world in
all sorts of conditions. If a phone did not work on a South African network, you’d
want your money back, or the manufacturer would lose a country from its market.
Devices are not just difficult to use, they are different from one another, and the
differences in themselves make them harder to use. At a hospital, there will be a
range of infusion pumps made by different manufacturers—and they
all work in different ways. Even devices from the same manufacturer work in
bewilderingly different ways. Nurses and doctors cannot be expected to under-
stand the variety of user interfaces. A nurse might know how device A works,
but next be dealing with a patient using device B. That patient will be lucky
if the nurse does the right things with the actual device delivering drugs to them—
especially if there is an emergency and the nurse is under pressure.
The design differences have an interesting effect. The people who master a de-
vice form themselves into a community of like-minded people. Thus, I talk to
people who use Macintoshes; other people stick to PCs. I talk to people who use
Nokia phones, and I don’t understand people who use Samsung—they do what
I do, but they do it differently, so we talk different languages and form different
communities. People rarely bridge communities and see things differently. De-
signers have to try harder.
Interactive devices clearly fall along a spectrum. Mobile phones may be fash-
ion accessories, and it doesn’t matter that they are all different or “special.” At the
other end are safety-critical devices like medical equipment and interactive gad-
gets in aircraft cockpits. These have to be highly standardized for safety reasons.
The problem is that the magical success of the fashion end of devices leaks into
the other end; apparently successful design ideas we’re all familiar with end up
becoming the implicit standards at the safety-critical end. Indeed, any modern
aircraft is full of user interfaces that would make our PVR or camcorder look easy.
that seems to confirm it’s all our fault. Children are supposed to be very good at
using modern interactive gadgets, so—the myth goes—we must be too old if we
have problems.
This myth is debunked in section 11.1.2 (p. 374).
For a toggle switch, you need to know what your users will expect up and
down to do.
Figure 3.6: Three push switches on a wall (which look like rocker switches). Each
switch controls a different row of lights. How do they control the rows?
The order of the switches tells us nothing—the switches are mounted vertically
on a wall running east–west, and the rows of lights run horizontally across the ceiling,
north–south. You can probably guess the middle switch controls the middle row of lights,
but if you are not familiar with this room, you can’t easily distinguish the middle
row anyway.
In contrast, if the three switches had been toggle switches, we would know
whether rows of lights were on or off. If our task was again to switch the front
row of lights on, and two switches were in the on state and one off, it would be
obvious what to do, namely, change the single off switch to its on position.
What have we learned? A single push switch and light combination is arguably
easier to use than a toggle switch, and the advantages are more apparent when two
(or more) switches control a single light. However, when several push switches
control several lights, it inevitably becomes harder to use. On the other hand,
toggle switches make most multiple switch banks easier to use.
Design does not stop there. We could have push switches with indicator lights
on each switch. Now we have the advantage of a push switch (“just push”) with
the state-indicating clarity of a toggle switch, only shown by a light. An indicator
light makes the switch safer, since a user could then reliably turn lights off as a
precaution when replacing a faulty light bulb. Unfortunately, lights on switches
can have other, conflicting, purposes: they are useful to help locate a switch in the
dark, so they are usually on when the controlled light is off!
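The underlying point is about visible state, and it can be made precise with a toy model. In the sketch below (illustrative Python; the room is made up), the toggle panel reveals which row is off, while the push-switch panel reveals nothing at all:

    lights = {"front": True, "middle": True, "back": False}   # the actual, hidden state

    def toggle_panel(lights):
        # each toggle's position mirrors its row: the one "down" switch is obvious
        return {row: ("up" if on else "down") for row, on in lights.items()}

    def push_panel(lights):
        # every push switch looks the same whatever the state of its row
        return {row: "out" for row in lights}

    print(toggle_panel(lights))   # {'front': 'up', 'middle': 'up', 'back': 'down'}
    print(push_panel(lights))     # {'front': 'out', 'middle': 'out', 'back': 'out'}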
3.15 Conclusions
As this chapter shows, user interface design problems are everywhere. The stories
may be so familiar that the reaction is to sigh. We’ve heard it all before—that’s
just how it is. We’re in denial—several of the examples had “easy to use” labels
on them or similar slogans in their user manuals. It will take effort, driven by
consumer pressure, legislation, informed managers in the industry, and motivated,
skilled designers, to make a difference.
The implicit theme of this chapter is that we must understand problems with
user interfaces to be in a position to improve them.
Did you find this chapter interesting, or were the examples tedious and difficult
to follow? Ironically, the more tedious the examples, the more important it is to fix
the problems they describe. If it is hard to explain what is going wrong, even to a
designer, it must be even harder to use. One suspects that conventional
design processes often fail to uncover problems that are too difficult to put into words—
especially since many such processes rely on user involvement,
and users have not been trained to explain complex design issues.
I’m sorry this chapter has been long and tedious. The point the tedium made
is that we would be better off if we could avoid these sorts of problems, rather
than live in a world where it is too late to avoid them. Instead of being negative
and critical, we need to turn to positive ways to make systems better, which we
do in part II, after the transitional chapter 4, next, which takes us from critique to
framing a program of action.
4 Transition to interaction programming
“red shirt” was a narrative fiction so that I could write the previous paragraphs
and make a nice story.
Whichever one of these reasons you prefer—physical, chemical, philosophical,
emotional, economic, or fiction—they certainly don’t begin to exhaust the expla-
nations for my shirt being red.
Even the simplest of questions have many answers. Instead of shirts, it could
have been anything—this is the doctrine of multiple representations. Similarly,
there are very many ways to look at interaction programming. We interact with
devices. There are psychological stories about our experiences; there are choices
made by designers; the devices themselves work because of electronics; and so on.
The field of interaction programming allows and covers all of these explanations.
All are valid, and emphasize different features of interaction. Are we interested
in the user’s experience? Are we interested in the technicalities of the micropro-
cessor? Are we interested in the shape and physical features of the gadget? We
relish this wide range of points of view in interaction programming because of the
field’s history and because each is important.
During World War II, people became acutely aware of the crucial role of inter-
action between people and complex machines. Decoding the Enigma, the German
code machine, can claim the first explicit human factors insight: the Herivel tip,
that operators would tend to make predictable slips that gave away clues to their
secret codes—John Herivel’s insight revolutionized decryption in 1940.
pushed pilots to their limits, and it was obvious that some problems were caused
by bad design, which had ignored the user, the human factors. Specialists soon
tried to understand how people operate machines, and how to design machines to
make them easier and more reliable. Ergonomics, or Human Factors in the United
States, developed quickly.
During the 1950s it became increasingly obvious that the interaction of people
and computers, which were then a new sort of machine, posed an important area
of study, for the very practical reason that many problems and surprises again seemed to
originate in engineering decisions. The field of man-machine interaction, or man-
machine interface (MMI, conveniently, is the same abbreviation for both), grew
up in the 1970s. Even today, particularly in industries such as car manufactur-
ing, what we now call user interfaces are instead called the HMI, or the human-
machine interface—at least HMI leaves behind the sexism of the term MMI. In
any case, three scientific disciplines were involved: ergonomics (or human fac-
tors), computer science, and psychology.
As an example of the interplay between different disciplines, chapter 5,
“Communication,” discusses programming techniques to reduce key pressing.
This also helps avoid repetitive strain injuries, an important ergonomics issue.
There followed a period when the different disciplines staked out the area. Er-
gonomics moved into cognitive ergonomics, which concerned itself with cogni-
tive issues of interaction, not just the physical issues that pertained to the con-
ventional ergonomics of simpler interfaces. By the 1980s, however, the field had
settled down, rebranded as human-computer interaction, abbreviated HCI in the
4.1 What’s in a field?
Now interaction design is seen as rooted in the user, with the usability profes-
sional or human scientist best placed to interpret human-computer interaction.
This current state of the field encourages a merely supporting role for program-
ming and engineering, that of fulfilling requirements and correcting bugs that
have been uncovered by people who understand users. “User-centered design” is
the slogan. Center design on the user and then just tell the programmer what to do.
It’s the truth, but not the whole truth. Interaction design often, though understand-
ably, overemphasizes the user-centered issues. The overemphasis leads to think-
ing such as: you should not design a staircase without knowing what users feel
or how it will help them go to bed. Why, indeed, make a staircase that people
don’t like? Yet an engineer can build a staircase that is rigid or flexes; one that can
hold elephants, or one that looks fine but is fragile. A good engineer knows what
the staircase can do without any testing or human involvement. Sadly, too many
interactive devices have been built like nice-looking but flimsy staircases, and we
have all learned how to dance up the wonky steps.
This is an important point. Let’s use another example to emphasize it differently.
A physicist can say, based purely on theory and no experiments, that a passenger
in a car needs a safety belt to prevent being flung through the window if the car
crashes. This follows from Newton’s Laws of motion. We might engage psycholo-
gists in this scenario to find out why some people don’t like wearing safety belts or
how to make them more enjoyable to use. We might engage ergonomists to make
car seats more comfortable and safer or the steering wheel less of an obstruction in
crashes. No doubt, many experiments make sense. The mechanics of safety belts,
however, has to be grounded in a quite different sort of science and theory.
Many things are hard to use because we do not understand how bad, or badly
used, the engineering infrastructure is. We think computers and interactive de-
vices are like that. But they are bad because they have been used in ways that
their infrastructure does not support. It’s as if we are worried about a staircase
that wobbles and makes us sick, when the cause of our sickness is that it has been
stretched to bridge a wide gap that really needed a balcony or an arch; the engi-
neering is just wrong, and the result is something that is unusable and certainly not
pleasurable to cross. The solution is not to wonder how it could be made a nicer
experience for users crossing it; rather, the solution is to correct the engineering—
or, better, to get the engineering right to start with.
As there is a lot of pressure to solve the problems of poor interaction design or
interaction programming, most people look for the solutions outside of computer
science, which appears to have let us down badly.
Two particularly eloquent books express the general feeling well: Tom Lan-
dauer’s The Trouble with Computers and Alan Cooper’s The Inmates are Running the
Asylum (full references are given at the end of this chapter). Their message: programmers don’t
know how to design, so human factors experts, usability experts, or psychologists
must lead design. These conventional views miss a few points:
Box 4.1 Weapon salve In the sixteenth century, people injured by weapons got nasty,
infected cuts. Weapon salve was an ointment of choice, you might say the cutting edge
of medicine, and there was real evidence that it worked better than alternative ointments,
which were often ineffective at best.
Curiously, weapon salve was spread on the weapon that made the wound, not on the
injured person. Yet it worked! People recovered faster and wounds didn’t fester as much.
In fact all ointments in those days were pretty nasty stuff and would probably make any
wound worse through infection. Weapon salve was a technique to keep the germs away from
the wound, except nobody knew about germs, so the explanation they had was something to
do with the medicine working “through” the weapon. There was probably a lot of superstition
too—it had to be the right weapon or else it wouldn’t work. Whether an untreated wound
would have healed was chancy anyway. In short, the reason for the strange power of weapon
salve was that doctors had no theory to explain what was going on, so they relied too much
on empirical evidence.
The warning of this story is that we may have techniques that make designs better, and
they may seem to work. Almost all techniques for improving usability do not refer to the
device itself; they have no theory of how interaction with the device works. They may really
work just as weapon salve did, but in terms of really improving the “health” of the design
they are very weak.
4.2 From printed information to interaction
Existing manuals are generally organized around the technical features pro-
vided by the technology rather than by the tasks the user is expected to perform.
If a programmer were to design, say, a database for a computer, indexes would be
provided to support the expected tasks of the computer. In fact, if a programmer
wrote a program that did not explicitly address the tasks the computer was sup-
posed to do, the computer simply would not do them. Why aren’t manuals for
humans—which indeed serve the same purpose for humans as programs do for
computers—designed with the same forethought?
Tables of contents and indices for manuals provide users with access to the man-
ual’s contents “free of history.” That is, without a table of contents or an index,
users would have to read the manual as a narrative: they’d have to start at the be-
ginning and read through it (they might skip and skim, but that’s unreliable) until
they found what they were after. But with an index or table of contents, the man-
ual can be read in the order most appropriate to the users’ specific needs. With an
index you can go to what you want to read. You look up what you want to do and
then turn to the appropriate page.
The index and table of contents are remarkable inventions. The printing press
catalyzed the invention of many ideas that made reading books easier. Before the
printing press, it was so tedious just to write a book (and much more tedious to
copy one) that hardly anyone considered the needs of the book’s readers. There is
no point worrying about user interface design when getting the thing to work at
all consumes all the author’s energy!
Developing a table of contents is tedious, but developing a good index is a major
piece of work. In the days before the printing press, hardly anything even had
page numbers, so an index would have been quite useless. With numbered pages,
an index makes sense, but without a reliable copying process—that is, a printing
press—every copy of a book would have had different pagination because it was
handwritten.
Clearly, aids that help the reader, the book user, like the index, become feasible
once the technology for managing the written word becomes good enough. Once
the same information in any copy of a book could be found in the same place on
the same page, a teacher could say to their students, “look at page 42” and know
everybody was looking at the same thing. In turn, the improved usability of the
book changed the way teaching could be done. Tables of contents, indexes and so
on became practical propositions because their cost/benefit had been transformed:
the effort in working them out now only had to be done once, and the cost would
be recovered from the profit of many copies sold.
Johann Gutenberg’s printing press marked a turning point in civilization, dur-
ing the period 1440–1450. Though the Chinese invented printing much earlier—
woodblock printing by about 600 and movable type by the 1040s—they were less
successful, mainly because of their enormous character set. In addition to page
numbers and indexes, more major innovations occurred
in just the next half century:
The author wrote not for a generous patron but for the wider public. Printing
gave a single author many readers, and the author’s income came from many
readers unknown to himself. Latin books had an enormous international
market.
4.3 Computers as sleight-of-hand
Printing itself was spread with missionary zeal—but as can be imagined from
the enormous political impact (merely hinted at in the list above) many attempts
were made to suppress and control the new technology. Such observations are
reminiscent of the current assessment of computers and communications tech-
nologies, even down to the problems of copyright and piracy. Is the internet an
information revolution rather than just another way of packaging information?
Are we set to become mindless automatons?
Yet here this neat analogy is misleading, in two key ways. First, most interactive
devices are innovative designs. You would not trust a novel building design to
be built by a construction worker; you would use structural engineers to check
and develop the design first. So because interactive devices innovate, we need
more structural design skills. Second, just as you would not leave designing or
building a large bridge to architects because the design and engineering problems
are immense, most programming projects should not be left to users or human
factors people alone. A bricklayer can build a bridge a few meters long without
difficulty. Increase the span to a few tens of meters and serious engineering issues
arise (for instance, the weight of the bridge is now a significant factor); increase
the span to hundreds of meters and really difficult engineering problems need to
be overcome.
Computer “buildings”—whether software or hardware—are the largest in-
tellectual creations of humankind, ever. These are not small bridges, but vast
spans. Like vast bridges, they push science and engineering to the limits. Like
bridges, once the engineering framework is right, the finishing off can be com-
pleted by tradespeople—the painting, the nut-tightening, and so on.
However, computer programs are very easy to “patch together,” and it is sur-
prisingly easy to build enormous programs that miraculously still work—a little
like it is possible for bricklayers to build cities, however large, house by house, or
as it is (sometimes) possible to build a long bridge from lots of small arches. The
point is, it is possible to build enormously complex interactive devices using basic
principles. Just keep fixing the bugs. This commonplace fact has led to the demise
of computer science in the broader field of HCI, because all the interesting stuff
left seems to be at the level of polishing.
On the other hand, many large computer programs fail, because “patching together” does not work reliably. Many large government IT projects fail because they are assembled like bridges built by bolting girders together in the hope of
making a span long enough to cross any river. Obviously bolting girders end-to-
end will make something get longer and longer—just as obviously you can make
a program have lots of features, eventually all the features you need. But if one
girder can bridge ten meters, ten girders won’t bridge a hundred meters. Like-
wise, the engineering needed to design a big program is not a matter of scaling up
the skills needed for small programs.
Government projects and procurement are further discussed in section 13.2.4
(p. 456).
It is a rare skill to be able to build well. Equally, it is a real skill to be able to program on top of infrastructure; some infrastructures, such as CORBA, are very complex beasts in their own right. To use a framework, you need a lot of facts; to program without one, or to build a new infrastructure, you need analytic skills of a different type. I suspect the intrinsic skills are not rarer but simply needed less often because of the way the industry works—just as there are more bricklayers than architects.
The engineering infrastructure of cities—sewage, water supplies, telephone net-
works, and so on—has to be worked out carefully, but once worked out it can be
built on with ease; likewise, once the software frameworks are worked out, almost
anyone can build on them. It took clever people to work out how to sort, but now
a programmer just writes sort and the computer sorts anything, with pretty much
nothing else said.
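In JavaScript, the language introduced later in this chapter, the whole job is one library call; a minimal illustration (the comparison function tells sort to compare numerically rather than as strings):

var scores = [10, 9, 2, 1];
scores.sort(function(a, b) { return a-b; }); // ascending numeric order
document.write(scores); // writes 1,2,9,10

All the cleverness of the sorting algorithm stays hidden inside the library.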
No practical programmer worries about sorting because the libraries and APIs
do it better, just as a house builder does not worry about getting the gas boiler or
telephone sockets to work; you just buy ones somebody else designed and built.
No practical programmer worries about the hardware, because it just works—but
somebody had to worry. In fact, hardware engineers have done a fantastic job of
hiding their discipline from interaction more widely! Apart from on/off buttons
and charging batteries, we hardly ever think about electricity at all, thanks to their
success at hiding all the complex details of their work. The same sort of thing has
happened with programming and computer science.
Sorting is just one example of the computer science infrastructure that is easy to
take for granted. Window managers and user interface features, like mouse man-
agement and buttons, have generally been programmed already when people start
programming interactive systems. Web site design (information architecture), likewise, takes for granted that the infrastructure—HTML, cascading style sheets, and so on—works. Given that it works, there is quite enough for a designer to worry about at other levels. On the other hand, the infrastructure could have been different. To take a tiny example, why don’t domain names allow spaces? This was a technical decision but is now part of the unquestioned infrastructure.
If you use a web site design tool, it will be able to check automatically that any web site you design is connected, that there are no HTML links to pages that don’t exist, and that there are no pages that have no links to them. Here, a sophisticated piece of programming and computer science design thinking has been concealed behind something as trivial as a menu item in the tool’s user interface.
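As a sketch of the idea behind such a check (the page names and the representation of the site are invented for illustration), a web site can be treated as a graph and searched from its home page; any page the search never reaches has no links leading to it:

// each page maps to the array of pages it links to
var site = { home: ["about"], about: ["home"], news: [] };

function reachable(site, start)
{ var seen = {}, queue = [start];
  while( queue.length > 0 )
  { var page = queue.pop();
    if( !seen[page] )
    { seen[page] = true;
      for( var i = 0; i < site[page].length; i++ )
        queue.push(site[page][i]); // follow this page’s links
    }
  }
  return seen;
}

var seen = reachable(site, "home");
for( var p in site )
  if( !seen[p] )
    document.write("No links lead to "+p+"<br>"); // here: news

Graphs and this sort of analysis are the subject of chapter 8.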
Once you can take infrastructure for granted, it is then easy to assume that all
that matters is how people behave and feel about the buildings; the psychology is,
after all, all that matters to people. There is a common assumption that the social,
emotional, and other needs of users, not the technology, determine how people
interact. Yet people can’t interact without the technology—and the technology
could be different, as you must know if you’ve ever been angry with a computer or
other gadget because it was “designed the wrong way.” How people interact with
mobile phones is different from how they interact with land phones; technology
and technological choices clearly make a huge difference.
Unfortunately, we don’t know what the infrastructure corresponding to sort-
ing is for interactive devices. We need good programmers to build good systems,
but they are in short supply—and often concerned about more crucial issues (like
battery management, communications protocols, graphics) than gilding features
like usability. Most interactive systems, however interesting they may be to users,
are largely made by putting together computer science structures, like bonding
courses of bricks in a building. The infrastructure and the science of its develop-
ment have disappeared from sight.
Fortunately, for a very large class of interactive systems there is virtually no in-
frastructure. Push button gadgets are so “simple” that most developers, whether
Figure 4.1: Checking out HTML and JavaScript in a browser—however, browsers will
not render the page as shown until the alert box is dismissed.
<script>
function plural(n, word)
{ if( n == 1 ) return n+" "+word;
if( word == "press" ) return n+" presses";
return n+" "+word+"s";
}
</script>
Without that extra line in the function, it would have returned strings like “2 presss” instead of “2 presses.” The function comes in handy when you want to tell the user things like “it takes 2 presses to get from Off to Reset,” when you don’t know in advance how many presses it will take.
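For instance, here is the function at work; the comments show what each call writes:

document.write(plural(1, "press")); // 1 press
document.write(plural(2, "press")); // 2 presses
document.write(plural(2, "key"));   // 2 keys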
From now on in this book, to save space, we won’t keep showing the <script>
... </script> tags, which must be put around all JavaScript to get it to work.
A few more useful points:
If you prefer, you can keep all your JavaScript together in a file separate from
the main HTML file. To do this, write <script src="mycode.js"></script>
in the HTML file, and then you can keep all your JavaScript in the file
mycode.js. You can of course have several files organized like this, for
instance, the device specifications, your debugged framework code, and the
particular experiments you are doing.
Browsers allow you to use other scripting languages than JavaScript. It’s good
practice to write <script language="Javascript"> ... </script> just in
case your browser defaults to running a different scripting language.
There are many commercial and shareware programs to help you develop web
sites. Using these will make writing JavaScript and running it much easier.
Apart from document.write, and a brief mention of forms, I’ve deliberately
avoided the document object model, so the code in this book is widely
compatible across browsers.
which will reset the JavaScript parser and drop back into HTML to say where it’s
got to. If you never see “Got to here OK.” that means something has gone wrong
somewhere above where you put this HTML. This will help you narrow down
where the problem is.
Another useful technique within JavaScript is to place a call like alert("got here OK"); to see how the code is running. I find it useful to write calls to alert in functions to say what their parameters are, and when I’m happy the function is working well, I comment out the alert so it doesn’t affect the program any more.
Since you may well see this alert message come up lots of times, it pays to make
it as clear and neat as possible. You may well need to debug the function next week
when you have forgotten what it does, so it helps to say something more useful
than just the parameter values. That’s one reason why it helps to leave the alert
in your code, but commented out.
Often it’s better to introduce a global variable, say, debug, and to use this to
control debugging. Instead of commenting out alerts that you think work, you
make them conditional:
var debug = false;
...
function f(a)
{ if( debug )
alert("call f("+a+")");
...
}
Now you can switch debugging on and off, so that only the parts of your program that need debugging generate any output. If you want debugging, say debug = true; and when you want it off, say debug = false; This is of course very useful
useful if your function (here, one called f) is used in parts of your program that
work and parts that don’t. You can also use document.write(...) instead of
alert(...)—and it’s usually more convenient unless you are trying to debug gen-
erating HTML, in which case it will further mess up the problems that are already
worrying you.
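One way to tidy this up is to wrap the test and the output in a single function; a small sketch (the name debugMessage is my own invention):

function debugMessage(message)
{ if( debug )
    alert("Debug: "+message);
}

Then a function being debugged just says debugMessage("call f("+a+")"); and the single global debug switch controls all the debugging output in one place.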
Don’t forget that browsers are different, and some provide very nice debugging
facilities that you should explore.
Comments Any text from // to the end of a line is ignored. (JavaScript also has
other sorts of comment we don’t need for this book.)
Assignments To update the value of a variable x, say, write x = x+1; What looks
like an equals sign is the assignment sign.
Subscripts Array elements are accessed using subscripts written between square
brackets; so a[i] is element i of the array a, and a[i][j] is element j of array
a[i]—so effectively a[i][j] is the element at the jth column of the ith row
of a.
Subscripts start at 0, and the length of an array a is a.length.
Types JavaScript has a type operator, typeof, which gives a string indicating
what type an expression has. For example, typeof "woof" is "string".
Conditional statements are written in the form if( test ) truestuff; or with
an else part: if( test ) truestuff; else falsestuff;
While loops are written in the form while( test ) repeatthis; The code is
executed if the test evaluates to true, and then it is repeatedly executed while
the test remains true.
There is a second form of while loop: the code is executed once, regardless of the test, and is then repeated while the test is true. This form looks like do repeatthis while( test ); Warning: studies have shown this sort of loop often causes more conceptual problems than it elegantly solves.
For loops have the form for( var i = value; test; increment ) stuff;
meaning that the code stuff is repeated with i starting with value, while test
is true, and after each stuff is done, increment is done—almost always it will
be something like i++, to increase i by 1 (which means the same as i = i+1).
It is perhaps helpful to remember that for loops are just a special case of while
loops; the following two ways of saying the same thing are exactly equivalent:
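For instance, taking stuff to be a call doSomething(i) (a made-up name), these two fragments do exactly the same thing:

for( var i = 0; i < 10; i++ )
  doSomething(i);

var i = 0;
while( i < 10 )
{ doSomething(i);
  i++;
}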
Often we write for( var i = 0; i < 10; i++ ), declaring the controlled
variable i in the for loop. In JavaScript, this is pedantry, but it is good practice
in languages (like Java) that take declarations more seriously. It helps avoid
problems where the same name, here i, is used elsewhere in the program and
might otherwise be messed up by the for loop.
Functions are defined by writing function f(arguments) stuff and called by
writing f(values), which then runs stuff using the values.
Output The main form of JavaScript output we use in this book is writing HTML
straight into the web page that is being loaded by your browser. The function
document.write(string) writes string as HTML as soon as it is run. HTML
tags like <br> can also be written by document.write; it can even build them
up, as in document.write("<a href="+t+">")—recall that + means string
concatenation, not addition.
Objects If a variable is an object, it can contain different values in named fields,
so x.fred is the fred field of x; it can be used just like a variable in its own
right. Objects are initialized by writing, for instance, var x = {fred: 2};
which would make x.fred equal to 2.
Experimenting If you don’t understand something, don’t hesitate to use alert.
For example, try alert("Test: "+typeof 1); to see what happens.
4.6 Conclusions
This chapter moved us away from exploring the problems, to starting to see the
importance and central role of computer science: the foundations of building in-
teractive systems lie in computer science and in programming.
As well as introducing JavaScript as the programming language we shall use,
we explored the history of interaction programming and HCI, one of its aliases.
HCI brings together many disciplines to understand how to build better systems
for users.
Indeed, conventional building makes a good analogy for us. Most building
work is great fun; we can redecorate and replace kitchens with ease, and we all
have opinions about what suits us. Building work that involves the infrastructure,
the wiring, or the trusses—all of the preparation of a new building—requires great
expertise. The point of this book is to remind us that quite a lot of building work
requires good engineering; but once you have got the framework right, as we’ll
see, you can change other things pretty much at will—and you end up, because of
the sound engineering, with a much greater freedom to be creative than you could
ever have had in a rickety, even dangerous, building.
Badly engineered buildings fall down; well designed buildings are there to en-
joy, to customize and to live in.
4.7 Further reading

4.7.1 Programming
It’s impossible to give a review of the vast range of computer science resources—
though you can safely disregard almost everything, particularly anything you see
in an ordinary bookshop. There is an overwhelming amount of elementary hand-
holding material that at best only gives you recipes to follow and at worst locks
you into a cut-and-paste mentality. You can recognize such books by the frequent
screen shots and long stretches of program code.
Assuming you are new to computer science, I recommend the following books:
Aho, A. V., and Ullman, J. D., Foundations of Computer Science, W. H. Freeman
and Company, 1995. Available in various language editions, this is an excellent
all-round introduction to computer science, going to first-year university level.
Flanagan, D., JavaScript: The Definitive Guide, O’Reilly, 1996. While there are
plenty of introductory books on JavaScript to choose from (browse in any
bookstore or on-line), this book is a programmer’s guide to JavaScript and is
ideal for programmers who want to know the details of the language. There
are, of course, very many examples of JavaScript on the web that you can work
from or exploit as well.
JavaScript is only one of many web technologies; others are cascading style sheets
(CSS), XML, DOM, . . . and many other abbreviations! If you learn JavaScript, as
well as being able to implement everything in this book, you are well on the way
to being able to design and write far more interactive web sites (while keeping
them well-written and dependable)—this will take you far beyond just program-
ming interactive devices. If you want to follow a web development route, a good way to proceed is to learn Ajax, a powerful set of JavaScript techniques for building responsive web applications.
Crane, D., Pascarello, E., with James, D., Ajax in Action, Manning, 2006. This is
probably the best place to start on Ajax, its programming, patterns, and
philosophy.
Most products get more and more complicated as designers think of new features. Simplicity is the antidote; Jensen explains how to get a design group thinking along the lines of simplicity—and he has some really good ideas. This book, Press On, provides some technical tools to build simple systems. If you can’t build your systems with the ideas in this book, you aren’t going for simplicity.
Jones, M., and Marsden, G., Mobile Interaction Design, John Wiley and Sons Ltd.,
2005. If you think that small mobile devices are the exciting, key future
technology, this book shows how to channel your enthusiasm in a
human-centred way: it concentrates on use, usability and the complete
user-experience. In particular it raises the issues and needs of the billions of
future users in developing regions such as Africa. If you read Press On and the
Jones and Marsden book, you’d have covered everything you need to know for
the design, programming and evaluation of interactive devices worldwide.
Norman, D. A., The Psychology of Everyday Things, Basic Books 1988, also
published as The Design of Everyday Things in paperback, Penguin, 1990. This
book is the book of the field, and every interaction programmer should read it.
Don Norman has the same problems with everyday things that I do, only he’s
a psychologist rather than a computer scientist; the advantage is that he has
insights that you won’t automatically dismiss by thinking, “I know a device
that doesn’t have that problem, so it’s not an issue.”
Norman has written many stimulating books. In contrast to The Psychology of Everyday Things, his The Invisible Computer, MIT Press, 1998, expresses a vision about good design: that computers should be invisible (see p. 441). There’s a lot to agree with here—and a lot to disagree with! For more, see his essays at www.jnd.org/dn.pubs.html.
There are many books that take a more computer science view of interaction:
Dix, A. J., Formal Methods for Interactive Systems, Academic Press, 1991. Alan
Dix’s book is over a decade old, but it is still a provocative and excellent place
to start a PhD.
Raskin, J., The Humane Interface, Addison-Wesley, 2000. Jef Raskin worked on
the design of the Apple Macintosh user interface and this book has a great deal
to say about good design.
Thimbleby, H. W., User Interface Design, Addison-Wesley, 1990. I tried to get
down to the key issues and design principles in this book rather than get
bogged down in the technology (which goes out of date very fast). The book
came out just as everyone was moving away from desktop computers to
personal devices, like video recorders. All the work we had put into understanding how to design for graphical user interfaces and so on was lost on the new generation of interactive devices.
The study of human error is fascinating, and all designers should have some in-
sight into how things go wrong. If we don’t design for errors, errors will nev-
ertheless surely happen, and then we—and our users—are certainly heading for
disaster. At least read this classic:
Reason, J., Human Error, Cambridge University Press, 1990.
Human error pervades this book; box 11.3, “Generating manuals automatically” (p. 397), to take just one example, is a short summary of a railway accident that could have been avoided by better design.
There is a danger in thinking that users are the same as we are. All the books listed
above treat the users as pretty ordinary—people, just like most of us designers.
There are many books that cover different aspects of users, whether international
and cultural, age related, or disabled. A good place to start is to read about chil-
dren; we were all children once, and we still have some sort of resonance to that
different world of users.
Druin, A., ed., The Design of Children’s Technology, Morgan Kaufmann
Publishers, 1999. Allison Druin’s book collects a wide variety of interesting
and stimulating topics into one book.
These introductory textbooks cover so much material that very little is covered
in great depth—but an interaction programmer should know everything in them!
In particular, these books discuss more computer-oriented interaction techniques,
such as number pads, menus and pointing devices, which Press On does not cover
in any detail, even though many interactive devices need them. If you get hold
of these books, make sure you get the most recent editions, since they are being
revised continuously.
Design also requires us to be able to get things to work. We’ll have more to say
about programming later in this book, but you would do well to get some good
programming and algorithm books.
Most books on software engineering give far too little emphasis to user interface
design and programming interactive systems. The following book, however, is
an excellent overview of the technical issues in designing interactive systems—in
covering UML and other core topics:
Wieringa, R. J., Design Methods for Reactive Systems, Morgan Kaufmann, 2003.
The best resource by far is you. Don’t take my short reading list as definitive; at
least go to Amazon.com or some other good web bookshop and see what else the
authors write, or explore what other books the people who buy them buy: follow
the trails into new areas.
Don’t just read stuff or join groups; start research by doing design, start doing
experiments, write programs, get involved, build things, and make things happen
(you’ll find lots of ideas in this book). Find facts and principles to criticize other
people’s work—and find out how to do better. Then, of course, you talk about
and write up what you have done and throw it into the pit with all the other
contributions at conferences, and the community will help you improve, and you
will help the community improve.
I wouldn’t give a fig for the simplicity on the near
side of complexity; but I would give my right arm for
the simplicity on the far side of complexity.
Part II Principles
It’s hard to understand even simple interactive devices. In part II, we introduce and
develop the principles behind interaction programming: how devices work and how they
really should work.
5
Communication
As long ago as 1184 BC people wanted to communicate over long distances. A se-
ries of beacon fires lit on hills enabled Queen Clytemnestra to learn of the fall of
Troy when she was five hundred miles away. According to the Greek historian
Polybius, by 300BC, a more sophisticated method of signaling using a code was
in use. In this case, each letter of the Greek alphabet was written in a 5-by-5 grid,
and the code for each letter was its row number then its column number. This
code could be tapped out. Similar versions of the code have been reinvented and
used by prisoners to tap to each other through cell walls, certainly since medieval
times. In 1551 the mathematician Girolamo Cardano suggested using five torches
in five towers, an innovation that would mean the code could be used over greater
distances. From the early eighteenth century, people were starting to propose us-
ing electricity for long-distance communication, but the only methods known that
would work at all were based on static electricity. Static electricity is prone to leak-
age and isn’t reliable over any distance; it is useless outdoors, where wires can get
wet. An anonymous letter written in 1753 proposed using separate wires for com-
munication, one for each letter of the alphabet, and proposed watching swinging
pith balls attracted by static electricity to read the message.
To cut a long story short, by the mid-1800s all the technological and scientific
developments were in place to communicate over long distances—waiting only
for a practical user interface—and a means to stay in business. In fact, they had
much the same problems as any modern user interface designer.
The story of communication and how it developed over this lively period raises
almost all of the issues of modern user interface design, but without the complica-
tions. And it’s interesting.
5.1 The beginnings of modern communication
      A
     B D
    E F G
   H I K L
   M N O P
    R S T
     V W
      Y
Figure 5.1: Schematic of the display of an original 1837 Cooke and Wheatstone tele-
graph design. The five needles had brass stops, so that the operator could hear them
clicking as they swung into position. Here, the display is indicating the letter K.
Wheatstone didn’t invent the Wheatstone bridge, which was named after him, but he did invent the Playfair cipher, which was named after somebody else. Wheatstone’s and Morse’s approaches to communication were very different, and their story highlights ideas that are true for any sort of communication.
Wheatstone, and his colleague William Cooke, were first to develop a viable
communication system that was made available to the public. The system was
first demonstrated in July 1837 using wires strung along the railway track from
Euston to Camden Town (railway stations in north London), covering a distance
of about 1.5 miles. They successfully transmitted and received a message. The
impact of this must have been enormous; bear in mind that the Penny Post—snail
mail—was introduced three years later. A decent electric light bulb had to wait
until the 1870s with Sir Joseph Wilson Swan’s patent (granted before Edison’s, by
the way, and which explains the brand name Ediswan).
Not only were Wheatstone and Cooke working decades before the light bulb
had been invented, they had to invent the entire user interface themselves. They
had to invent keys and displays as well as a coding system that would work for
sending messages.
How did Wheatstone’s telegraph system work? The users of Wheatstone’s tele-
graph used (what we would now call) desktop-sized displays, which had five nee-
dles. Each needle could be swung to the left or the right by electromagnets, so it
would then point along a diagonal line of letters arranged in two triangular grids.
When not energized, the needles rested in a vertical position.
To transmit a letter, the user pressed two keyswitches, which caused the cor-
responding needles at the other end of the telegraph wire to move and together
point along two lines in the grid. The letter at their intersection was the chosen
letter. The Wheatstone grid allowed 20 letters: the letters C, J, Q, U, X, and Z had
to be omitted. If only one needle moved, it was taken to point at a digit, so num-
bers could also be sent easily. By 1838, Cooke had patents for mobile devices, a
premonition of today’s mobile communications technology?
Despite its shortcomings, the Wheatstone telegraph could be used by relatively
unskilled operators: using the grid was pretty self-explanatory and easily learned.
Of course Wheatstone had to overcome ignorance and attachment to traditional
methods—nobody knew what a telegraph was, or what advantages it would offer.
So it was a big advantage that minimal training was required.
One of Wheatstone’s early telegraphs helped catch a murderer, John Tawell, on
New Year’s Day 1845. The message read:
A MURDER HAS JUST BEEN COMMITTED AT SALT HILL AND THE
SUSPECTED MURDERER WAS SEEN TO TAKE A FIRST CLASS TICKET
FOR LONDON BY THE TRAIN WHICH LEFT SLOUGH AT 7H 42M PM HE
IS IN THE GARB OF A KWAKER WITH A GREAT COAT ON WHICH
REACHES NEARLY DOWN TO HIS FEET HE IS IN THE LAST
COMPARTMENT OF THE SECOND FIRST CLASS CARRIAGE
Crucially, the telegraph message travelled faster than the train Tawell was on.
Tawell was apprehended near Paddington railway station; he was later hanged
for his crime.
The five-needle telegraph required five wires (Wheatstone had an ingenious
way of avoiding a sixth wire for the common return that would normally have
been needed). Over miles, the cost of wire (and keeping it working) was not in-
significant. The original design of telegraph was soon replaced by a single-needle
instrument, to cut down on the number of wires required. Each letter of the al-
phabet was then given a code of several right and left needle movements, but this
now required skilled operators—they had to know what the code was, as it could
not be read off the front of the telegraph “display” as before. One operator was
needed to send the message, and two skilled operators were needed to receive it:
one to read the needle movements and one to write the message down. As events
proved, Wheatstone’s modified scheme, though cheaper, was still harder to use
than Morse code, which became dominant worldwide.
At almost the same time that Samuel Morse, working in the United States, was
developing his code, he was developing his own telegraph to use with it. His first
design was a machine that recorded the message on a moving paper tape, which
he felt important for record-keeping purposes. Recording made Morse’s telegraph
uniquely different from all previous distance-communication approaches, such as
smoke signals and semaphore, which leave no record of the message. Wheat-
stone’s system too made no records—any message had to be written down by
hand.
Morse was aware of the advantages of automatic recording, and thought it had
potential. He soon found, however, that it was easier and faster for operators
to listen to a buzzing sound rather than read the dots and dashes and transcribe
them—you can listen and write easily, but you can’t watch and write easily, as
Wheatstone had found out. After several false starts, Morse devised his epony-
mous code.
Morse code was not at all like what we recognize today. In fact, it isn’t totally
obvious that Morse himself invented it. There is a lively dispute about the contri-
bution of his assistant, Vail, and whether others helped. Regardless, Morse was a
great promoter of the code.
Originally, Morse had each letter or digit coded on a metal ruler—so the user
sending a message needed to find the ruler (but did not need to know the code
on it), and then put it through the transmitting device. This idea made sending a
message easy to learn, but slow.
One of Vail’s innovations was to use a tapping key, so operators could tap out
any code, an innovation with both advantages and drawbacks. The operators no
longer needed to hunt and find the right ruler, so they were faster, but they now
needed to learn the codes. Unlike the Wheatstone system, operators of the modi-
fied Morse system needed training to use it effectively. Even so, speeds of ten or
more words per minute were easily achieved.
The first US telegraph message sent over a decent distance, from the Supreme
Court room in the Capitol to Baltimore, was transmitted by Morse and recorded on
his paper tape. Morse gave credit to Annie Ellsworth, the daughter of a friend, for
suggesting the now famous message, “What hath God wrought?” from Numbers,
chapter 23, verse 23. It seems Morse was acutely aware of the potential for abuse
or misuse of the telegraph: “Be especially careful not to give a partisan character
to any information you may transmit,” he wrote to Vail. We could heed his advice
today.
Morse had some problems getting his ideas accepted in Europe, not only be-
cause of patent problems (Morse code was patented in 1840) but also because of
the continental European habit of using accents in their writing.
In the modern international Morse code, each letter and digit (and a few other signs) is represented by dots and dashes. For example, Q is — — • —, which is spoken as “dash dash dot dash.” Note that the dots and dashes must be separated by short silences; otherwise, we’d just hear a continuous buzz. The usual convention is that a dash is worth 3 dots, the gap between dots and dashes is 1 (silent) dot unit, and individual letters are separated by 3 (silent) dot units. To send Q therefore takes a total time of 16 dots: 3 + 3 + 1 + 3 = 10 units of signal, plus three 1-unit gaps inside the letter and the 3-unit gap that ends it. Words are separated by 6 silent dot-length units. If all letters were coded like Q, every letter would take 16 dot units to transmit.
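This arithmetic is easy to mechanize. Here is a small function to check it, using "." and "-" for dot and dash (an encoding chosen just for this sketch):

// time, in dot units, to send one letter of Morse code,
// including the 3-unit silence that ends the letter
function morseTime(code)
{ var t = 0;
  for( var i = 0; i < code.length; i++ )
  { if( i > 0 ) t = t+1; // 1-unit gap between dots and dashes
    t = t+(code.charAt(i) == "-"? 3: 1);
  }
  return t+3;
}
document.write(morseTime("--.-")); // Q: writes 16
document.write(morseTime("."));    // E: writes 4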
But it was Vail, apparently, who realized that if the more common letters are
given shorter codes, even at the expense of making other letters longer, the av-
erage length of a message could be shortened. The story is that Vail went to a newspaper printer’s and saw that they had more letter Es in lead type than any other letter, because a typical printed page needed more Es in stock than anything else.
Thus E, the most common English letter, is represented in Morse code by a single
dot.
Roughly, the Morse code length of each letter is inversely proportional to the
amount of lead needed for it at the printers. If this story is correct, then as the bar
chart in figure 5.2 (facing page) suggests, the printers must have had most Es, then
T, I, S, N, A, U, R, and so on, down to Y, Q and J being least used.
E •            L •—••
T —            K —•—
I ••           G ——•
S •••          F ••—•
N —•           B —•••
A •—           Z ——••
U ••—          X —••—
R •—•          P •——•
M ——           O ———
H ••••         C —•—•
D —••          Y —•——
W •——          Q ——•—
V •••—         J •———
Figure 5.2: Morse code, drawn like a bar chart sorted by increasing length of the codes
for each letter: E is the shortest code, just a dot, and Y, Q, and J are the equal longest.
Read the codes from left to right (for example, N is a dash followed by a dot).
E •            M ——
T —            F ••—•
A •—           P •——•
O ———          G ——•
I ••           W •——
N —•           Y —•——
S •••          B —•••
R •—•          V •••—
H ••••         K —•—
L •—••         X —••—
D —••          J •———
C —•—•         Q ——•—
U ••—          Z ——••
Figure 5.3: Morse code, now drawn with each letter in frequency order, E being the
most commonly used letter in English (at the beginning of the table), and Z the least
commonly used (at the end of the table). The length of the codes does not correspond
well to frequency of use.
Box 5.1 Small screen interaction User interfaces come in four main screen sizes: TV
sized—as on a typical computer; small—as on a mobile phone; tiny—as in 10 character LCD
displays; and none at all—just lights and fixed text printed on the box.
Obviously the bigger the display, the more that can be done with it, but at the cost of
greater size and expense. In the case of mobile devices, there is an issue of resolution: what
might be good enough for watching television may not be good enough for reading text.
But there is a surprising difference between TV and mobile-sized displays. The human visual
system is very good at working out certain sorts of visual relationships: for example, we
can see this object is in front of another one, because it partly occludes the other. These
automatic features of our vision are one of the important perceptual supports on which
windowing systems rely. We see windows as if they are real objects, and we can see quite
easily (although not always) which window is in front, which is behind, and so on.
On a small screen, these visual cues may not be available. Even if little display markers are used to indicate such relationships in a theoretically clear way, our automatic visual processing skills won’t be used.
This explains why WAP phones, which tried to duplicate the success of web browsers on
PCs, failed miserably.
Morse code can easily be improved. The techniques we will discuss in this historical setting work for modern interaction programming.
Box 5.2 The QWERTY effect The letters QWERTY are the first few from the first row
of letters on a typical keyboard. There are millions of QWERTY-style keyboards; and “the
QWERTY effect” is the belief that if there are so many of them, then they must be efficient.
History of course is a lot more complex. There were many styles of keyboards as the
earliest typewriters were being developed, with all sorts of keyboard arrangements. One of
the successful typewriters, made by Christopher Sholes and patented in 1868, happened to
set the pattern for the future. It’s important to note that the typewriter as a whole was a
successful commercial product, not necessarily the keyboard design considered alone.
The prototype designs used by Sholes often locked up as the keys jammed; he avoided
this problem by—depending on whom you believe—either arranging the keys to slow people
down, or arranging them so that common digrams were typed by alternating hands. One
complication to this story is that in those days, it was envisaged that people would type
with their index fingers, and not touchtype using all their fingers and thumbs. Another
complication to the story is that to start with there were very many typewriters and alternative
keyboard layouts literally fighting it out—by typing competitions and by all the usual fair
and foul means—and QWERTY was the winner.
It was late in the day, the 1930s, by the time August Dvorak did his experiments on
keyboard layout and patented his Dvorak Simplified Keyboard. The improvements in typing
speed are marginal, but for this new design to take over the world it would have to show a
better whole-life efficiency in a world where most typists are trained on QWERTY layouts—
and when small improvements in reducing carpal tunnel syndrome are discounted. Dvorak
isn’t that much better (evidently).
code, as they did, “improving” it is not just a technical game we can play in isolation from how people use it. We would have to retrain everyone who already uses Morse code. That human cost, in something as widely used as Morse code, overwhelms almost any other consideration.
There’s also a nice twist: because a scheme is dominant, it does not get changed much, for the social costs of changing dominant systems are overwhelming. The denial of the need to change, together with the daunting thought of the cost of change, is an obvious design paradox.
Some people think that if a design doesn’t get changed much, it must be the
best there is: see box 5.2, “The QWERTY effect” (this page).
Today Morse code is consigned to the pages of history, but the design lesson is
still pertinent: get the user interface as well designed as you possibly can before
you release your product—once it is released its very success in the market will
squeeze out competitors, including its own future usability improvements.
The history of the telegraph is far more complex than we can cover in this book,
but there is one final lesson. Weber and Gauss (the famous mathematician) in-
vented and demonstrated an earlier telegraph in Germany in 1833, still not the
first, but certainly predating both Morse and Wheatstone. Their telegraph failed
Box 5.3 Information theory When we tell somebody or something a message, we transmit
information. Information can be measured, and since Shannon’s work, it is customary to
measure information in units of bits. The word “bit” is short for binary digit, and the information a binary digit can hold is exactly one bit.
Once you know how much information a message has, you know how many bits it would
take to send it. It is not possible to do any better. More generally, once we know how much
information a device takes to use (or a problem takes to solve), then the number of bits
gives us a lower bound that no design can better. This is a very general result with wide
implications. For example, although there are many programs that claim to do it, it is not
possible to compress all files—because that would mean losing information. Instead, you
can try hard to compress likely files, and many programs (such as zip, and the GIF and PNG
graphics standards) are very good at compressing common sorts of files. Translated back
into user interface design, we can make some things the user does easier, but we cannot in
principle make everything easier. The trick is to find what is worth doing.
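As a concrete sketch of measuring information, here is Shannon’s formula, H = −Σ p log₂ p bits, in JavaScript (the function name is my own):

function entropy(probs)
{ var h = 0;
  for( var i = 0; i < probs.length; i++ )
    if( probs[i] > 0 )
      h = h-probs[i]*Math.log(probs[i])/Math.LN2; // log to base 2
  return h;
}
document.write(entropy([0.5, 0.5])); // a fair coin carries 1 bit
document.write(entropy([0.9, 0.1])); // a biased coin carries about 0.47 bits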
Usually nothing is perfect. In user interface design, we have to worry about users making
errors. In information theory, errors are called noise. You want messages to be resistant
to noise, just as you want user interfaces to devices to be usable despite the user making
errors. One of Shannon’s remarkable results (Shannon’s noisy-channel coding theorem) was
to show that you can work with noise and get arbitrarily low error rates; it’s not that more
noise (or more errors) decreases what you can do; rather, with judicious choice of codes
(or user interface design) you can still make the final effect as reliable as you like, with the
proviso that it may take longer to achieve.
The idea that you don’t want to destroy a day’s good work with a single slip is a special
case of the principle of commensurate effort. When users work, they create information;
the principle of commensurate effort says that the effort of deleting information should be
comparable—little things should be easy, hard things harder—with the effort of creating it.
property of a code is called being prefix-free: it means that no letter’s code is a prefix of another letter’s code.
The alternative to being prefix-free is to block the symbols, for instance, so that
every 8 symbols code for a letter. ASCII code, for example, does just this. The
disadvantage is that we make the code much longer; ASCII may be fine for com-
puters with lots of memory, but if you were tapping out ASCII using a tapping key,
avoiding repetitive strain injury and increasing the speed you can send a message
would be more important.
How is a Huffman code constructed? Consider the two least probable letters in
English, Q and Z. Because they are least probable, then they must have the longest
Huffman codes, but whatever their codes are, they must certainly be different.
Suppose the longest Huffman code we need is XXXXX; if we could tell whether
this was a Q or Z before we read to the end of it, the code could be shorter. Thus
Q and Z must differ in their last digit. They will have codes such as XXXX0 and
XXXX1, which differ in their last digit so that they can be distinguished.
In figure 5.4 (next page), we have chosen 000011001 and 000011000 for Q and
Z. All other codes must differ from Q and Z earlier than the last digit—they can’t
differ later, because all longer codes starting like Q or Z obviously share the same
Figure 5.4: A binary Huffman code for English, based on the letter frequencies in the
Brown Corpus. Thus E, being the most common letter, has the shortest Huffman code.
The letters are shown in decreasing frequency order, so the increasing length of their
Huffman codes can be seen easily.
prefix as Q or Z. So, for example, 00001101 is the next shortest different prefix; we
can either allocate this to a more frequent letter (say, S) or we can distinguish at
least two further codes after it, namely 000011010 and 000011011. In fact, this is
what we do for X and J.
We then allocate the next shortest code, 0000111, to K, and this differs from X,
J, Q, and Z in K’s last digit, which is a 1, whereas this digit is a 0 for all of X, J, Q,
and Z.
There are clearly lots of decisions to be made; Huffman worked out how to make
these choices optimally. There are many equivalent Huffman codes; for example,
if we simply swapped 0 and 1 in the table shown in figure 5.4 (this page) we would
get a different code, or we could just swap 0 and 1 in the third position . . . but these
changes just make new codes that are essentially the same as the one here.
As with the Morse code tables, the Huffman code is shown a bit like a bar chart;
the letters (E, T, A . . . ) are arranged in decreasing frequency order, and each code
is a “bar” the length of which shows how long it takes to transmit. It is easy to
see that frequent letters, like E and T, have short codes, and the lengths increase
continuously up to the least frequent letters (X, J, Q, Z), which have the longest
codes.
This binary code looks longer than Morse code, but Morse code has more sym-
bols to play with than our Huffman code’s 0 and 1—Morse has dot and dash and
the gaps between letters and words. In fact Morse has six symbols: dot, dash, and
four sorts of gap—between dots and dashes, between letters, between words and
between sentences. The main reason Morse looks much shorter is because it uses
more than two symbols; the Huffman table above isn’t hiding any secret symbols.
The codes for the letters E and T are dot and dash, so every Morse code letter
looks like it must start with either the code for an E or the code for a T. Never-
theless, Morse is prefix-free, even though it doesn’t look prefix-free, because we
have been ignoring the end-of-letter pauses—no letter other than E starts with
dot immediately followed by the end-of-letter pause. Because of the end-of-letter
pauses, Morse code has the advantage over a Huffman code that losing a letter
[Figure: the Huffman tree, drawn with its leaves at increasing depth below the root:
T E
H R S N I O A
P F M U C D L
V B Y W G
Z Q J X]
Figure 5.5: A Huffman tree for the alphabet, based on the Brown Corpus frequencies.
Frequent letters are near the top of the tree (like T and E) and infrequent letters, with
longer paths, near the bottom. Using this tree, the Huffman code for a letter is the
sequence of left/right turns in the path from the root down to the letter—take left as
0, right as 1, and you should get the table in figure 5.4.
or two (as could happen with a dodgy communication line) does not matter too
much; whereas, in the Huffman code above, if even just one digit is lost, the rest
of the message will be garbled because we have no idea where to get back in step
with the code.
As before, making a code more reliable makes it longer to send—unless we take
into account the time it takes to recover from errors. In Shannon’s theory, the error
rate is crucial. The speed/reliability tradeoff cannot be avoided and applies to
user interface design too.
I’ve drawn the table of Huffman codes above in what amounts to a user manual
for them:
If your task was to code N, say, then look N up in the table and find the
instructions for “doing” N, namely to press 1011, or to press the keys 1 0 1 1 .
(If it was a user manual, it’d be easier to use if it was in alphabetical order.)
The converse task is to decode something, such as 1011, and find out what it means. It’s easier to visualize the necessary decisions when the code is set out as a tree: start at the top of the tree and go down branches, turning left or right as you read 0 or 1; you will eventually end up at goals, like N.
It is much easier to follow the same rules in a diagram, such as those illustrated
in figure 5.5 (this page), which represents the same information as the table. For
example, if you were decoding 10011, start at the top of the tree, working your
way down and turning left or right as you read digits 0 or 1 in the code.
If you are good at programming, you can take it from here with no more help
from me; if not, the next section gives you some code you can try out.
Section 5.2 (p. 134) discusses applying the key principles in user interface
design.
original huffmanData table. These two entries are merged to make a new single
entry, which (since we no longer need it) will overwrite the old entry for Z by
{ prob: 0.002020191, data: [data[Z], data[Q]] }
The line of code above uses the specific letters Z and Q, but in general, any two
entries may be being combined. The code in the JavaScript program therefore uses
apos and bpos for the two locations in the huffmanData array. For the first iteration
with this data they will happen to be Z and Q (or, rather, they will be 25 and 24,
respectively, which are the indexes of Z and Q in the huffmanData array). Thus
on the first iteration in the program, huffmanData[Z] will be huffmanData[apos]
and huffmanData[Q] will be huffmanData[bpos].
Merging two elements reduces the number of items in the array by one. The eas-
iest way to handle this is to move the last element of the array down to overwrite
the other element; that’s done by the last line in the for-loop, huffmanData[bpos]
= huffmanData[n-1].
for( var n = huffmanData.length; n > 1; n-- )
{ var a = 1, b = 1, apos, bpos;
  // find smallest two items
  for( var i = 0; i < n; i++ )
  { if( huffmanData[i].prob < a )
    { b = a;
      bpos = apos;
      a = huffmanData[apos = i].prob;
    }
    else if( huffmanData[i].prob < b )
      b = huffmanData[bpos = i].prob;
  }
  // merge apos and bpos into apos
  huffmanData[apos] = {prob: a+b,
      data: [huffmanData[apos], huffmanData[bpos]]};
  // move last element of array down to overwrite bpos
  huffmanData[bpos] = huffmanData[n-1];
}
The process stops when there is just one entry left, which will be a Huffman tree
(with its root at the final merged entry, which is entry 0). Note that the algorithm
is destructive: we will have lost the original data in huffmanData.
Finally, we define a recursive algorithm to walk the Huffman tree and allocate
binary codes to each letter:
function walk(tree, string)
{ if( typeof tree.data == "string" )
    document.write(tree.data+" "+string+"<br>");
  else
  { walk(tree.data[0], string+"0");
    walk(tree.data[1], string+"1");
  }
}
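After the merging loop has finished, the whole tree is in huffmanData[0], so a call like walk(huffmanData[0], "") writes out each letter with its binary code.

Decoding is the same tree walked in the other direction, as described earlier: follow a branch for each digit until a letter is reached. A sketch (the function name decode is my own; it assumes the node structure built above, where a leaf has a string in its data field):

function decode(tree, bits)
{ var node = tree;
  for( var i = 0; i < bits.length; i++ )
  { node = node.data[bits.charAt(i) == "0"? 0: 1];
    if( typeof node.data == "string" )
    { document.write(node.data); // reached a letter
      node = tree;               // restart at the root
    }
  }
}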
The key point is that you can replace the letters (A, B, . . . ) in the data structure
with anything, such as function names for interactive devices, and the code will
work just as well. Instead of making Huffman codes for efficiently transmitting
letters, you’d be making Huffman codes for efficiently using devices.
Unfortunately, Huffman codes using more than two symbols (here, 0 and 1) are
a little trickier to work out. Every step of a binary Huffman code construction
combines two items into one; this reduces the number of remaining items by 1,
and the final step necessarily has exactly two items to combine. If we have an N
symbol code and S symbols to Huffman code, naively starting by combining the
N lowest probability symbols has a couple of problems: there may not be that
many symbols to start with (if S < N), and at the final step there usually won’t
be exactly N subtrees to combine, so the top level of the tree won’t be full. That
certainly would not make an optimal code.
The easiest way to solve this problem is to add enough dummy nodes with zero
probability to ensure the tree will be optimal. Once the tree is constructed, the zero
probability nodes can be ignored (because they occur with zero probability). You
will need to add enough zero nodes so that S mod ( N − 1) = 1.
It is even harder to generate a Huffman code when symbols can have different
lengths, as they do in Morse code (where dashes take longer to send than dots).
Fortunately for us, Morse code is officially obsolete, and all of the user interface
design we will do in the rest of this chapter with Huffman codes will use constant
length symbols—they’ll just be button presses.
A Huffman scheme that adapts to the particular message’s specific letter fre-
quencies would be better for that message: this is adaptive coding. Furthermore, as
a message proceeds, the relative frequency of letters changes: in some parts of a
message E may be rare and in other parts common. An adaptive scheme changes
the code as it proceeds.
Morse’s original message contains only 10 different letters, and we could design
a special code for them rather than for the whole alphabet—and do it more effi-
ciently, too, since we wouldn’t waste code space for letters that are never going to
be sent.
Huffman is a so-called symbol code; each symbol it codes, it codes as another
whole symbol. Each letter becomes a distinct symbol in the Huffman code of so-
many bits. If a letter has a theoretical minimum code length that is a fractional
number of bits, like 2.3, Huffman has to round this up to a whole number of bits
in the code: it cannot do better than 3 in this case. This inefficiency is pretty much
inevitable, as there is no reason why the letter frequencies should come out at
whole numbers of bits for the optimal code.
If we code more than one letter at a time, the fractional bits can be “shared”
between the Huffman codes. As it were, if we needed 2.3 bits followed by 2.4 bits,
we could combine them to 4.7 and hence use only 5 bits; whereas, sent separately,
they’d need 3 + 3 = 6 bits.
If we coded pairs of letters (digrams) rather than individual letters, then in most messages many pairs don’t occur at all (such as QX, QY, QZ, . . . ) and we don’t need to code them. In Morse’s message, we effectively reduce the alphabet to about 1.5% of the 26 × 26 = 676 possible pairs, an even better saving than getting from 26 letters down to 10 for the message-specific single-letter code. We can code letters with fractional numbers of bits too; and the more letters we combine into symbols, the more efficient the coding becomes.
There is a trade-off, of course. If you want to send a message, the message
is certainly going to be shorter, but you need to know more of the message before
you start; and if you make mistakes, you are going to make bigger errors, affecting
more of the message. Message sending will become less interactive.
We’ve covered all this ground to review efficient communication methods, and
to look at some of the design trade-offs, with the added interest of a historical set-
ting. Obviously efficient communication is important in many contexts, not just
for the special case of sending messages between humans.
So far this chapter has discussed communication as a technical process that connects humans, from beacon fires to radio communications. But communication
need not be human-to-human, it can also be between humans and interactive de-
vices. In one case, your brain wants to get a message to somebody else’s brain:
you might use Morse code or a Huffman code to send the message to the other
person efficiently. In the other case, your brain wants to get a message to a device
you are trying to use.
Box 5.4 Video plus To record a program on a DVD or PVR, the user must specify the date,
the start and end time of the program and the channel it will be broadcast on. This is quite
a lot of information, about 12 digits, and it is easy to make a mistake—and if you do make
a mistake, you probably won’t discover it until after you find out you failed to record the
program, when it’s far too late to do anything about it.
Timing information is a good candidate for compression, to make it easier to enter correctly. The simplest approach is to have the user interface remember the last details:
typically, you won’t change all of them, so to enter a new program time you only enter the
changes needed. For example, for a whole year, you could get away with not reentering the
four digits needed to specify the year. If you always record the same program every week,
then usually the only detail that needs changing is the day of the month—only two digits.
Gemstar’s Videoplus+ is another scheme some DVD and video recorders support for mak-
ing recording programs easier. With Videoplus+, TV listings in newspapers and magazines
additionally show a number, ranging from 1 to 8 digits, as well as the conventional channel
and timing information.
If you have a basic recorder, you’d have to enter the full details. If you have a Videoplus+
enabled recorder, you would enter the Videoplus+ number. This number is a compressed
code for program times, so it is much faster to enter—and it uses Huffman coding to generate
the code.
The disadvantage of Videoplus+ codes is that they are meaningless, and the same code will record different things if used at different times—for example, on any given date, all Videoplus+ codes for earlier dates are recycled to be used again for the future.
Huffman codes are discussed in section 5.1.3 (p. 126).
5.2 Redesigning a mobile phone handset
Now consider the following problem. I have two hands and ten fingers and I
want them to send “messages” to my mobile phone. The messages are going to
be coded instructions to get the phone to do useful things for me. For practical
reasons, namely, that I need one hand to hold the mobile phone, I am only going
to be able to use one hand for typing my message. Furthermore, the phone, being
a phone, already has 10 digit buttons on it. To ask for many more buttons would
make the phone bigger and heavier, which I don’t want.
As before, we will need a prefix-free code, for the obvious reason that if 123 is the code for redialing, we don’t want any confusion with another code such as 1234 that does something else: we want the code to specify redialing unambiguously the moment we’ve finished keying it. Of course, we might want to allocate redialing a different code than 123: for instance, a shorter code, because redialing is frequently used. But that is a different problem from being prefix-free; in fact, this choice of code is exactly like wishing to allocate E a shorter code in Morse because it is used more often than other letters.
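Checking a proposed set of key-sequence codes for this property is straightforward; a sketch (the function name prefixFree is made up):

// true if no code is a prefix of (or the same as) another code
function prefixFree(codes)
{ for( var i = 0; i < codes.length; i++ )
    for( var j = 0; j < codes.length; j++ )
      if( i != j && codes[j].indexOf(codes[i]) == 0 )
        return false;
  return true;
}
document.write(prefixFree(["123", "1234"])); // false: 123 is a prefix of 1234
document.write(prefixFree(["123", "124"]));  // true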
The best way (under broad assumptions) to allocate efficient codes is to use a
Huffman code. If I want to select functions on my phone quickly, I should be able
to use a Huffman code that uses ten digits (rather than the two binary digits we used in our comparison with Morse code) to select from the phone’s functions.
For concreteness, let’s take the Nokia 5110 mobile phone. We will be concerned
with the Nokia 5110 mobile handset’s menu functions, though there are a number
of essential functions that are not in this menu (such as its quick alert settings and
keypad lock), which we won’t deal with here.
It probably would have been better had Nokia included these few functions, like
quick alert, in the main menu; then users would be able to find them by searching
the menu system. As it is, a user has to know these functions by heart, or else they
cannot be used. There would be no harm in having them in both the menu (so they
are easy to find) and as quick functions that can be used faster than going through
the menu to find them. We will see later that the Nokia provides shortcuts for all
menu functions anyway; again, the handling of keypad lock (and the few other
functions) is inconsistent.
The general design principle that a function can be accessed in more than one
way is called permissiveness. The argument for permissiveness is that it allows
the same function to be accessed in several ways, so the user does not have to
know the “right way” to use it, as there are alternatives.
In total, there are 84 features or functions accessible through the main menu.
Menu systems on phones are all fairly similar. For the Nokia, a soft key, – (called
the navi key by Nokia) selects menu items (whether or not they are functions;
some menu items select submenus); the keys ∧ and ∨ move up and down within
menus. The correction key C takes the user up one level in the menu hierarchy.
The structure of the Nokia menu is illustrated in the summary table in figure 5.6
(p. 137). So, with reference to the figure, the function “Service Nos” can be ac-
cessed from standby by pressing – (the phone now shows “Phone book”), then
pressing – (phone now shows “Search”), then pressing ∨ (now shows “Service
Nos”) followed by a final press of – to get the desired function to work.
Box 5.5 Permissiveness As designers, we decide how a device should support some behavior
or user task. For example, a phone should be able to send text messages. A user might want
to keep an address book, so it should be possible to write a message and send it to a named
person—so the mobile phone will use the address book to supply the actual number needed
to send the text message. This is routine. Also routine is merely implementing this idea in
the device.
The device can now do what users say they want. Unfortunately, this design is impermissive.
What about the user who first thinks that they want to send a message to George? If
they select George from the address book, can they send him a message? On my Nokia 8310
the answer is yes, and it can be done the other way around: the user can write a message
first, then decide whom to send it to. Fine. Except that Nokia decided to allow messages to
be saved, so that they can be reused. Now, if I want to send a saved message to somebody,
I must first select the saved message and then decide whom to send it to.
Impermissively, I cannot decide whom I want to send a message to, then decide whether
to write a new one or use a previously saved message. In short, this design supports a task,
but only if it is done in the “right” order. Yet I am free—that is, permitted—to do it either
way if I want to write a new message.
Had the designers made a list of tasks the device supports (which can be done auto-
matically from its specification or program), then they could have listed the various ways
each task could be accomplished. There would need to be good reasons why some tasks
could be done only in one way. Each rule about how a device has to be used (you must do
it in this order) is a burden for the user to remember and a potential error for the future.
Permissiveness makes devices easier to use.
Permissiveness is not always a good thing, however. Sometimes permissiveness encourages
errors you would prefer to design out. Permissiveness is obviously bad when security is an
issue: you generally should not permit a user to do something before or after they have
entered their password—although you could work through the interesting design questions
for the unconventional choice of entering a password afterward.
See also sections 5.2 (p. 134), 8.7.2 (p. 253) and 12.3.1 (p. 417). For a “converse”
example, see section 11.4 (p. 387)—you usually don’t want a gun to be easy to use:
you want to be certain it will only work (that is, impermissively) when it is used
correctly, and by the right person.
All menu items have a numeric code (displayed on the Nokia’s LCD panel as
the user navigates the menu). For example, “Service Nos” can also be accessed by
pressing – 1 2 (no final press of – is required). The shortcut is an example of
permissiveness (shortcuts provide alternatives, so the value of shortcuts is another
indication that permissiveness is a good design principle).
There are some little complications in the shortcuts that I shall ignore in our thinking
about redesigning the Nokia, except to say that C does not work with shortcuts
(e.g., – 2 C 1 is equivalent to – 2 1 , as if C had not been pressed). It is important
to note that there is no fixed relation between shortcuts and a function’s posi-
tion in the menu, since some functions may not be supported (e.g., by particular
phone operators): if “Service Nos” is not available, pressing ∨ would move from
“Search” directly to “Add entry,” but the shortcut for “Add entry” would still be
– 1 3 and trying – 1 2 would get an error. Taking these provisos into consideration,
our model of the Nokia 5110 is still substantial: it has 188 menu items and
84 user-selectable functions.
Phone book – 1
    Search – 1 1
    Service nos – 1 2
    Add entry – 1 3
    Erase – 1 4
    Edit – 1 5
    Send entry – 1 6
    Options – 1 7
        Type of view – 1 7 1
        Memory status – 1 7 2
    Speed dials – 1 8
Messages – 2
    Inbox – 2 1
    Outbox – 2 2
    Write messages – 2 3
    Message settings – 2 4
        Set 1 – 2 4 1
        etc.
...
Figure 5.6: Extract from the Nokia mobile phone menu, showing the menu items and
their shortcut codes.
The table in figure 5.6 (this page) is a bit like the Huffman codes table. Apart
from the indentation, down the left we have what the user wants to do, but now it
is phone functions like “Inbox” rather than sending letters like N or Q. Down the
right, we have what the user needs to do, pressing buttons, where in the Huffman
table we had binary codes for letters. For the phone we can use more keys than the
Huffman binary code had, which used only two “keys,” 0 and 1 .
As with the Huffman tree, shown in figure 5.5 (p. 129), here we can visualize
how a user chooses functions by tracing routes through a diagram, which is shown
in figure 5.7 (next page). The Nokia tree is much bigger, and I have therefore not
cluttered it up with the names of menus and functions; the tree therefore gives an
accurate impression of the complexity of the user interface without, in fact, helping
anybody use it because there are no details shown to do so.
The black dots in figure 5.7 represent menu functions. “Standby” is at the
top of the diagram and each press of a button (∧ , ∨ , or – ) moves downward (or
diagonally downward)—if you make a mistake, not taking an optimal route to a
black dot, you go upward. The diagram has been carefully drawn so that this is
so. When you move through the diagram and land on a black dot, this is a menu
function.
[Figure: the menu tree, drawn with “Standby” at the top and rows ranked from 1 press down to 18 presses.]
Figure 5.7: The menu tree for the Nokia 5110 mobile phone. Phone functions are
represented by black dots, and all the white dots are intermediate menus. For clarity,
arrows going out of phone functions (e.g., many go back to “Standby”) have not
been shown. The diagram is a ranked embedding: drawn so that each row is the
same minimum number of presses from “Standby.” To use the device, start at the
top, and each correct key press will move you down a row toward the function you
want—assuming you know the optimal route and you make no errors.
As the diagram makes clear, some functions are easy to activate (they are near
the top), and some require quite a lot of pressing—these are the ones lower down
in the diagram. For clarity, I’ve deleted all arrows going out of black dots: they all
go back to “Standby” via the phone functions the black dots represent.
Figure 5.8: This is exactly the same menu structure as shown in figure 5.7 (facing page),
but shown here with a different layout. As before, standby is at the top, and each black
circle is a phone function; everything else is a menu item. Each wiggly column is an
individual layer of the menu structure—in this diagram (in principle) it’s easier to see
that a user can go backward and forward within a layer of the menu. As in figure 5.7,
the back arrows to standby are not shown, as they merely add a lot of clutter. The
arrowheads on the lines are too small to see at this scale, but graphs as complex as this
would normally be viewed with suitable visualization tools that can pan and zoom.
This diagram was drawn by Dot, the graph-drawing tool discussed in sec-
tion 9.5.1 (p. 291).
We could do some experiments, but let's suppose Nokia has done them already
and chose the original menu structure accordingly, to make it as easy to use
as possible. If so, then Nokia considers the function that takes 18 presses to reach
to be the least likely to be used (how often do you change your phone to use Suomi,
which is what this function is?), and Suomi is less likely to be used than either of
the two functions taking 17 presses.
For the sake of argument I will take the probabilities of functions to be propor-
tional to the reciprocal of the number of button presses the Nokia takes to achieve
them. So if a function takes 17 presses, then its probability is 1 in 17 compared to
the others. In particular, this means that the second most frequently used func-
tions are assumed to have a probability of half the most frequent. Obviously all
the probabilities should add up to 1.0, since the user is certain to do one of the
possibilities. Looking at the numbers in the table in figure 5.9 (next page), the
relative probabilities for the Nokia are 1/1 (“Search” is the most popular function),
1/2, 1/2, 1/2 (several functions come equal second) . . . 1/15, 1/16 (the hardest thing to do is
Figure 5.9: The functions of the Nokia phone and the least number of button presses
required to get to them from “Standby.” The table is ordered starting with the easiest
function first. The probabilities are proportional to the reciprocal of the function rank:
so, for example, we assume the user wants to do “Inbox” half as often as they want to
do a search.
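The reciprocal-rank weighting, and the average cost it implies, are straightforward to compute. A minimal sketch: the press counts below are invented stand-ins for the real data of figure 5.9, not the book's numbers.

    presses = [3, 4, 4, 4, 5, 6, 17, 17, 18]   # minimum presses per function (made up)
    raw = [1.0 / rank for rank in range(1, len(presses) + 1)]
    total = sum(raw)
    probs = [w / total for w in raw]            # normalized so they sum to 1.0
    average = sum(p * n for p, n in zip(probs, presses))
    print(f"expected presses per selection: {average:.2f}")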
Box 5.6 Card sorting For the mobile phone example, Huffman creates an efficient menu
tree grouped one way, and Nokia has a menu tree grouped another way. The text shows that
there is a way to combine the advantages of both, but if there are two different ways to do
something, why not consider more? There may be ways that are more appropriate to what
either the designers think or what the users think about the structure of the menu tree.
Card sorting is an easy technique that can be used to find tree structures that people
prefer for any set of concepts, such as menu items. Card sorting can also be used to help
design web sites—the items are then web pages.
First, write the names of the items on cards. Use a large font size so that the descriptions
can be read easily by participants in the card sorting when the cards are spread out on a
table.
Next get some participants, around six to fifteen people. It is advisable to use card sorting
with designers and users: you then know both what the structure ought to be, and what the
intended users think. Card sorting can be used to help design a company's web site—a
classic mistake is to think that the users are people in the company (the people who know
it best), when in fact the users are people outside the company.
Shuffle the cards, then simply ask the participants to group the cards in any way that
makes sense to them. If you get the participants to name groups of cards, you can then
replace those groups with new cards and thus create a tree hierarchy of the concepts.
If a group of people are card sorting, it’s likely that there will be disagreements on how
to sort the cards into groups. Usually renaming some of the items will help—card sorting
thus not only provides a structure but also better names for the items.
Behind the simplicity of card sorting lies the statistical method of cluster analysis, which
can be used when you have a big sorting problem or wish to combine data from many people.
There are many programs available, such as IBM’s EZSort, that can help in card sorting,
and do the cluster analysis for you.
With a well-designed device, there is no need for the structure to be fixed; for instance,
it could be redefined, either by the individual user’s preferences or by uploading different
firmware.
For a warning to designers not to be oversimplistic when card sorting, see the
discussion on cities and trees in section 8.7.2 (p. 253).
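The first step of the cluster analysis behind card sorting is simple enough to sketch: tally how often each pair of items is placed in the same group across participants; pairs that co-occur most often are candidates for the same menu. The sorts below are invented, and EZSort-style tools automate the rest.

    from itertools import combinations
    from collections import Counter

    sorts = [  # one list of groups per participant (illustrative data)
        [{"inbox", "outbox"}, {"search", "add entry"}],
        [{"inbox", "outbox", "search"}, {"add entry"}],
    ]
    together = Counter()
    for groups in sorts:
        for group in groups:
            for pair in combinations(sorted(group), 2):
                together[pair] += 1
    for pair, n in together.most_common():
        print(pair, n)   # ('inbox', 'outbox') is grouped together most often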
speakers to do just this, but mobile phone users have to use what they are given.
Thus it would be sensible for Nokia to have arranged the phone so that common
things are easy to do, so the design should follow a Zipf distribution. We are now
taking Zipf’s principle of least effort the other way around: assuming the phone is
efficient, then Zipf’s principle implies using the probabilities we have chosen.
When we design what we hope will be a better menu organization in the next
section, we will be comparing designs using Zipf distributions. However, we
could redesign the menu structure using any distribution we pleased; Zipf is just
one plausible choice. If we had better data, or a better theory of phone use, we
would be able to design even better systems—maybe even ending up with Nokia’s
own design, if they happened to use the same theory. Our approach to design does
not depend on the right theory; in order to be used it just needs some probabilities.
Code       Function
– 1 1      search
– 1 2      service nos
– 1 3      add entry
– 1 4      erase
– 1 5      edit
– 1 6      send entry
– 1 7      not allocated
– 1 7 1    type of view
– 1 7 2    memory status
– 1 8      speed dials
– 1 9      not allocated
– 2 1      inbox
...        ...
Figure 5.10: Examples of numeric shortcuts for accessing menu items. From “Standby,”
if you press – 1 6 , for instance, the phone would get to the menu item “send entry,”
which is faster than pressing – and then using the ∧ /∨ buttons to go down six entries
and pressing – again to get to the desired state.
same thing on the new phone as on the old phone. If so, they can’t just use a
Huffman code to allocate the codes; there are constraints. My guess is that an
overriding constraint is that it is easier to program the shortcut codes the way
Nokia did; I can’t think of any usability reason for them to be as they are.
Perhaps the Nokia shortcut codes are important—say, so that new phones are as
easy to use for users who have already learned the codes from older phones. If we
don’t want to lose the existing codes, we can exploit the fact that the Nokia short-
cut codes don’t use all the codes that are available: we can squeeze a Huffman code
into the remaining shortcuts that haven’t been used by Nokia. Not surprisingly,
a squeezed Huffman code does worse than an unrestricted Huffman code using
ten digits freely; but, surprisingly, at an average of 3.09 keystrokes to do anything,
it does better than Nokia's original shortcut codes.
We can now have a phone with Nokia’s original shortcuts, to preserve whatever
benefits they have, and we can have a faster Huffman code. The two codes can
permissively coexist. Users can use the Huffman code whenever they want to be
faster—which is presumably the point of shortcuts.
For some functions the original Nokia codes will be better than the squeezed-
in Huffman codes, and a sensible user will use the shorter of the two (where the
new Huffman code is worse, we don't need to say what it is: just provide the
Nokia shortcut alone for that function); the average then drops to a mere 2.69.
In other words, if we want to design a well-structured menu system (following
Nokia's original hierarchy), while also retaining Nokia's original shortcut codes,
we can still make improvements to the user interface for experienced users.
Furthermore, the design improvements can be determined automatically, merely
by providing the usage data.
Design                 Best case   Worst case   Average
Original Nokia             3           18        7.15
Huffman, 3 key             3            5        4.04
Nokia shortcuts            2            5        3.39
Unallocated Huffman        2            4        3.09
10 digit Huffman           2            4        2.98
Shortest                   2            3        2.69
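The “Shortest” row is mechanical to compute once the two code tables and the usage probabilities are given: take the shorter code per function, then average the lengths. The codes and probabilities below are invented placeholders, not the real Nokia or Huffman tables.

    nokia = {"search": "11", "inbox": "21", "suomi": "45312"}       # hypothetical
    huffman_codes = {"search": "1", "inbox": "22", "suomi": "904"}  # hypothetical
    probs = {"search": 0.5, "inbox": 0.3, "suomi": 0.2}

    best = {f: min(nokia[f], huffman_codes[f], key=len) for f in probs}
    average = sum(probs[f] * len(best[f]) for f in probs)
    print(best, f"average = {average:.2f} presses")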
We’ve got an improved user interface design for the Nokia mobile phone with-
out losing any of the (real or purported) advantages or features of the existing
design.
There remain two intrigues. First, why do the shortest codes have a worst case
(longest, hardest-to-do function) of 3 key presses, compared to 5 (for Nokia) or 4
(for Huffman)? Fortunately, the hardest functions to do are different in the two
code systems; as it happens, the hardest functions to activate using Nokia's own
shortcuts can be done more efficiently by the Huffman codes.
The second intrigue is more subtle. The combined shortest code, which is based
on Nokia and Huffman, does better than the unrestricted 10-key Huffman code.
The combined code has an average of 2.69, but the unrestricted Huffman code got
2.98. How can a combined code be better than the theoretically best code? The
answer lies in the fact that Nokia's shortcuts aren't prefix-free—they rely on the
user pausing between codes.
[Figure: graph; vertical axis “Knowledge %,” ticks 10–25.]
Figure 5.11: A cost-of-knowledge graph for the Nokia 5110 function menu. The dashed
line is based on Zipf weights; the solid line is based on uniform weights—thus the graph
shows that a user takes longer to learn this device if they press buttons equally often,
rather than following Zipf probabilities.
The pause amounts to using an extra PAUSE key, as it were—so the Nokia
shortcuts are really using 11 keys. It is not surprising they can be a bit more effi-
cient. A Huffman code with an extra key could do even better too.
Further tree comparisons are made in section 8.7.4 (p. 256), which compares
balanced and unbalanced trees.
5.4 Make frequent things easy, unlikely things hard
about behavior), or purely analytically, as we now do. The graph can be drawn
from any data; a designer would use the best available. The more realistic the
probabilities used, the more realistic the evaluations and new design insights that
can be drawn from them.
A cost-of-knowledge graph for the original Nokia menu system is shown in fig-
ure 5.11 (facing page). The solid line shows an unweighted cost of knowledge
graph, taking every user action as equally likely. Weighting (by the Zipf probabil-
ities) gives a more realistic measure of knowledge, which is shown by the higher
dashed line.
We can see that, in “average” use, achieving coverage of 25% of the device's
features takes 455 button presses. (This figure does not translate nicely into a
time, since the cost-of-knowledge model assumes the user acquires knowledge, and
thus pauses, in each new state.)
The dashed Zipf line is higher because it discounts features Nokia has made
hard to use. Since the user is presumably less interested in some functions than
others, the Zipf probabilities reflect this well. The differences between the two
lines make a case for getting good data from experiments; nevertheless, it is worth
noting that the shape of the two lines is similar, and a designer may want to gain
insights from the shape rather than the absolute values.
Just as the functions can be weighted by how likely it is that the user wants
them, the keys too can be weighted by how often they are used. For example, we
took the probability of pressing each key to be equal, so the probability of pressing
the C key is 0.25—possibly too high, since a user would only use it for
corrections. If we make using C less likely, then the graphs would show the
devices to be easier to use, because the less often users correct errors, the sooner
they can find out how the device works. On the other hand, if we took the probability of
pressing C to be zero, then the device would become impossible to explore fully, for
once users had got to a function, they wouldn't be able to back out of it and find another.
Figure 5.12 (next page) shows that the cost-of-knowledge graph for the Huffman
tree interface is considerably better (in terms of speed of access to the device’s
functions) than the original design; for example, it achieves 25% coverage after
only 168 presses (compared to 455 presses for the original Nokia). Even so, the
model still overestimates the costs because of the assumption of pressing the C
key with probability 0.25.
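One way to produce such a curve—under assumptions of my own, since the book computes it from the device model—is Monte Carlo simulation: walk the device graph pressing keys at random and record what fraction of the functions has been seen after each press. The three-state machine below is a toy, not the Nokia model.

    import random

    transitions = {  # state -> {key: next state}; a toy device
        "standby":   {"menu": "phonebook", "C": "standby"},
        "phonebook": {"menu": "search",    "C": "standby"},
        "search":    {"menu": "search",    "C": "phonebook"},
    }
    functions = {"search"}

    def coverage(presses, trials=2000):
        total = 0.0
        for _ in range(trials):
            state, seen = "standby", set()
            for _ in range(presses):
                key = random.choice(list(transitions[state]))
                state = transitions[state][key]
                seen |= {state} & functions
            total += len(seen) / len(functions)
        return total / trials

    print(coverage(10))  # average fraction of functions found in 10 random presses

Weighting the random choice of keys (or of target functions) by Zipf probabilities gives the dashed rather than the solid line.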
[Figure: graph; vertical axis “Knowledge %,” ticks 10–50.]
Figure 5.12: Cost-of-knowledge graph for the redesigned Huffman tree interface—as in
figure 5.11 (p. 144), the dashed line is Zipf weights; the solid line is uniform weights.
For comparison, the lower gray region represents the range of data from the original
Nokia as shown in figure 5.11. Evidently, for any number of button presses, the user
learns about twice as much as they could with the original design; also, the difference
between random pressing and efficient Zipf pressing has been reduced—the new design
is easier to learn even when you don't know how to use it efficiently.
looking it up in a user manual. Instead, then, we might make some codes longer
to make them more memorable or to make it harder to activate some things by, for
instance, a single accidental key press.
The designers of the JVC HRD580EK PVR made an infrequent and hard opera-
tion too easy to do. The remote control has a single button that advances the clock
by one hour—in many countries the time changes by one hour for summer saving.
This makes a normally hard job very easy to do. Conversely, press the button for
two seconds and the device's clock goes back an hour, for winter saving. These
operations need only be used once a year each; they are worth simplifying because
most users won't remember how to adjust the clock when they only do it every six
months. But the operations are too easy. The device is often
running several hours fast because the button can be tapped by accident. (Also,
for no reason I can understand, the clock advance function is the same button as
the transmit button, which is used frequently for other purposes, so if you press
Transmit when the remote control is not in the right mode, you adjust the time in-
stead.) On the other hand, some facilities that are used far more often than once
a year are designed to be much harder, and involve complex button sequences.
These functions are hard to do by accident, but they are unnecessarily hard to do
deliberately!
Box 5.7 The principle of least effort The least effort principle automatically solves a
problem of closure. Closure is the sense of finishing a task; typically someone persists in
doing something until they get closure. Problems arise when they get closure before they
have finished—these problems are often called post-completion errors, because the error is
caused after the person thinks they have completed their task.
If a user sets up a PVR to record a program at a later time, it is necessary to press the
Timer button to tell the PVR to enter timer mode; otherwise, the PVR remains under direct
control and will not record the program at the set time. This is a closure problem, since the
user thinks they have completed the task of programming on transmitting the time details to
the PVR (they have reached psychological closure). Thus, they may not think to press the
Timer button, and there is no cue to prompt them to do so. In least effort terms, the user
is more likely to want the PVR in timer mode after entering program recording times than
not. Thus the code for entering program times and timer mode should be shorter than for
entering program times and not entering timer mode. This directly leads to a design where
there is a button to be pressed to exit timer mode after transmitting timer instructions.
Post-completion errors are also discussed in section 11.4 (p. 387). Another example
is given in box 6.1, “Syringe pumps” (p. 168).
Summer saving isn’t the only problem. Every remote control has an Off button,
which the designers know won’t be needed as much as any other button (after all,
you only need to press it at most twice for anything you want to do), yet it is
as easy to press as anything else. On badly designed devices, once you switch the
device off, you've lost part of the setup and have to start again. To make
things worse, the On button is often made easy to find, which makes sense, but
most remote controls make on and off the same button—so the Off button ends up
being too easy to use.
On a mobile phone, at least in the UK, every number dialed is short (such as
999 or 111) or begins with a zero or plus sign (an abbreviation for the international
prefix). Why doesn't the phone automatically insert a 0 before any long-enough
number that the user hasn't already started with a 0 or +, and so make dialing
easier for them? This idea is another example of making frequent things easy. It's ironic how
easy it is to overlook ways of making things easier.
This PVR is also discussed in sections 3.4 (p. 66) and 10.3 (p. 330).
5.5 Efficient but more usable techniques
The digits on a mobile phone can be used for writing text messages. So, for
instance, the digit 2 has letters A, B, and C on it; the digit 3 has D, E, and F on
it, and so on. Text messages are sent by pressing digits the appropriate number
of times to choose letters. You could send the message ACE by pressing 2 once,
pausing, pressing 2 three times, then pressing 3 twice—a total of 6 presses and a
pause. This is an unambiguous but slow way of typing. (Note that this technique
is not prefix-free—you would not get ACE if you pressed 2 four times then 3
twice.)
There is a way to do better. The sequence of key presses 2 2 3 can only spell
out one of the following 27 three-letter sequences: AAD, AAE, AAF, ABD, ABE,
ABF, ACD, ACE, ACF, BAD, BAE, BAF, BBD, BBE, BBF, BCD, BCE, BCF, CAD,
CAE, CAF, CBD, CBE, CBF, CCD, CCE or CCF. Only three of these 27 sequences
are real English words, which are those that I’ve underlined.
If we are sending English messages, then 2 2 3 can only mean one of three
words: ACE, BAD, or CAD.
It should be easier to choose from three things than from 27. We saw above that
the obvious way to use the keys 2 2 3 to send ACE requires 6 presses. Instead,
we can press 2 2 3 then take at most two more presses of another button to choose
ACE from the choice of the three properly-spelled alternatives. If the choices were
in alphabetical order, this example would fortuitously work with no extra presses;
ACE is the first word to match 2 2 3 . In general we may not be so lucky, and
it would make more sense to order the words in terms of likelihood—again using
the principle of making the most frequent things easier to do.
Three versus six presses (and some pauses) is a small saving, but if we were
trying to type a longer word, the saving gets better. The longer the words, the
fewer properly spelled plausible words there are to choose from—and if there is
only one properly spelled match, then the user does not have to press anything.
That’s a nice saving.
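The disambiguation itself is a one-line filter over a dictionary. A sketch, with a toy dictionary standing in for a real one of thousands of words:

    keypad = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
              "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
    to_digit = {l: d for d, letters in keypad.items() for l in letters}

    def matches(digits, dictionary):
        return [w for w in dictionary if len(w) == len(digits)
                and all(to_digit[l] == d for l, d in zip(w, digits))]

    print(matches("223", ["ace", "bad", "cad", "bed", "cat"]))
    # -> ['ace', 'bad', 'cad']: three candidates, so at most two extra
    #    presses choose the word, instead of six presses of multi-tap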
There are various schemes for typing on mobile phones like this. Tegic’s T9 sys-
tem is probably most familiar (www.tegic.com). Seasoned text users devise their
own codes that are even more efficient than using conventional English. Words
like CUL8R (“see you later”) and abbreviations like IMHO (“in my humble opin-
ion”) and :-), a smiley face, convey large concepts in few keystrokes. A sensible
text system would allow users to add their own words (including names) to the
dictionary of possible matches.
Tegic’s T9 is used to speed up writing text messages, and we can use the same
idea to type not English words, but words from the list of functions the phone
supports. Instead of English, then, our dictionary will have words like “search,”
“incoming call,” “inbox,” and so on. Instead of typing words, we’ll be controlling
the phone.
Imagine we are using the Nokia 5110 again. We press – first—just as we would
to start a standard Nokia shortcut. Now, however, we spell out the function we
want using the letters on the digit keys. So if we want the inbox function, we type
4 6 2 6 9 ; in fact, inbox is found as the unique match well before we’ve needed
to type all these digits.
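Applied to function names instead of words, the same filter can stop as soon as the match is unique. A sketch—the name list is illustrative, and the keypad mapping repeats the previous sketch's:

    keypad = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
              "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
    to_digit = {l: d for d, letters in keypad.items() for l in letters}

    def presses_until_unique(name, names):
        """How many digits must be typed before name is the only match?"""
        live = set(names)
        for i, letter in enumerate(name):
            live = {n for n in live
                    if i < len(n) and to_digit[n[i]] == to_digit[letter]}
            if live == {name}:
                return i + 1
        return len(name)

    names = ["inbox", "outbox", "search", "erase", "edit"]
    print(presses_until_unique("inbox", names))  # -> 1: only name starting on key 4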
So the new design has a systematic way of searching through all function names,
and it provides this feature “for free” as part of the improved approach to selecting
specific functions.
5.5.1 Modelessness
When new features are added to a design, they often either introduce the need for
more buttons (which cost money and take up space), or they make the device
more complex by introducing more modes. In one mode, the device works as
before; in the other mode, the new feature is available. Since there are fewer
buttons, the device is cheaper to make and looks simpler. On the other hand, a
user may try to use the device when it is in the wrong mode for what they want
to do: modes make errors more likely—worse, what might have been a harmless
slip in the right mode may do something unwanted and unexpected in the wrong
mode.
Box 5.8 Dictionaries Samuel Johnson’s invention of the dictionary was a major design feat:
he had to invent alphabetical ordering, and use it consistently for every word in the dictionary.
The first edition was printed in 1755, following about ten years of work. Dictionaries are a
very good example of the need for appropriate structure in user interfaces.
How often have users complained that there are too many features in their mobile phone,
office photocopier, or other device? “If only it did what I wanted and no more, I wouldn’t
get confused! I’d be able to find what I want!”
A dictionary would be almost useless if it only had the words in it you wanted today.
Instead, dictionaries have millions of words in them—and, generally speaking, the more the
better—within limits of size, weight, and cost. Since words in a dictionary are in alphabetical
order, you can find (mostly) what you want and not accidentally look up the wrong word. In
contrast, the more “words” or features your photocopier has, the harder anything is to find.
What gadgets need, then, is structure, like alphabetical order.
Johnson's dictionary would still have been useless if its users didn't understand alphabet-
ical order, and of course, it would have been useless if he, its designer, didn't understand
order either. We need design structures that users are able to understand too.
And there’s another twist: there are some design structures that are easy to design with,
and easy to understand—trees being a good example—but they compartmentalize features
in ways that may not be what the user expects, so they are surprisingly hard to use. The
designer might think the interface is well-organized, which it is, except the user doesn’t know
what that organization is.
See the discussion on trees in section 8.7.2 (p. 253).
time. If we made mobile phones easier to use, it would make it easier to smoke.
There are good reasons not to make using mobile phones easier!
Mobile phones are often stolen. Why aren’t they sold in pairs with radio tags
so that they won’t work when they are more than a few meters away from their
owner—or so that they squeal loudly when they are stolen or left behind by acci-
dent? Apart from issues of theft, many devices need to be differentially easy: au-
thorized people should find them trivial, unauthorized people should find them
impossible.
Games would be boring if they were too easy: their difficulty has to closely
match the user’s competence. Finally, some devices are intrinsically dangerous, or
have features that are safety related and should not be used by accident—which
means they must not be too easy to use.
See section 6.2.1 (p. 176) for a case study of a gun.
makes no assumptions about the kind or frequency of users' tasks), it is clear that
certain sorts of task are better served by different sorts of user interface design.
The actual DF200 design is sub-optimal for almost any task.
The average cost is the cost to access any function averaged over all functions,
assuming the user makes no mistakes and knows the most efficient way of
doing everything.
The maximum cost is the worst key press cost (assuming the user knows the
most efficient way to do anything and makes no mistakes).
The complete search cost is the cost to look for a function that the user knows is
present but does not know how to access except by searching for it. Again, this
is assuming the user knows how to use the device and makes no
mistakes—they never get lost or go round in circles.
To know the complete search cost requires solving the traveling salesman problem
for the device, which is not easy to work out.
The traveling salesman problem is discussed in section 8.1.2 (p. 231); it is also
compared to the Chinese postman tour, its dual. Chapter 9, “A framework for
design,” explains in detail how to work out most of these numbers from a
specification of the device design.
If a measure like the cost of a traveling salesman problem is difficult to work out
in theory, it must be very difficult for a user to work out—users are unlikely to
know or be able to keep track of optimal algorithms. In fact, there is no reasonable
way users can solve the complete search problem without making mistakes on the
DF200, and hence real users would do much worse than the table suggests. Put
another way, if “complete search” is a likely task, then the user interface should
be structured in such a way that a solution—or good enough solution—is easy to
find. Unfortunately, this is certainly not the case with the DF200.
The different designs compared in the table are:
Linear search Linear search requires only a very simple three-button user
interface, ∧ , ∨ , and Doit . Pressing one button takes the user through all the
functions, one by one; the other direction command is needed in case the user
overshoots what they want; and of course Doit does the selected command for
the user. For finding a command the user knows about, a simple design like this
is, on average, harder to use than the original DF200, because there is no
particularly fast way to find any command. On the other hand, if the user
needs to search through all commands, it is much easier (49 presses against the
original DF200's 117), because all the user has to do is press ∧ (or ∨ ) until
they find what they are after. On average, a user would find a command halfway
through, so they'd take about 24 presses.
Texting the function Using the scheme introduced for the alternative Nokia
design, in section 5.5 (p. 147), the more keys the user has the faster they can
home in on their chosen function. The average cost is higher because the user
has to spell the function name, and the function names were not chosen to be
brief.
Binary search One way to search efficiently is binary search. With a binary
search, the DF200 would show the user a command in the middle of the list in
its display panel, the user would indicate whether the command they are after
is alphabetically before or after this command. It is possible to do away with a
Doit key, but having one allows the user to correct mistakes before selecting a
function to do. The user would carry on until getting to the desired command.
This is faster than linear search: in general, if there are n commands, a binary
search will reduce the search time from the n steps of a linear search down to
about log2 n steps—for the DF200's roughly 50 commands, about 6 presses
instead of up to 49. (A small sketch of this interaction follows the list below.)
There are good and bad binary searches; see section 8.7.4 (p. 256), which
introduces balanced trees—as opposed to the unbalanced, bad trees.
Direct access The DF200 is a desktop-sized device, and it could have a list of
functions printed on it, each one numbered. The user could read this list then
enter the corresponding number. As there are almost 50 functions, functions
can be accessed directly by pressing at most two digits. For this design, the
table's figure for the complete search cost is a bit of a cheat, because the user
has to search by eye through the printed list—though the number of key presses
(rather than the total time taken by the user) is still the same: 1.9. As
before, a Doit key would allow users to confirm choices or if necessary correct
mistakes, which for this design would be a good idea as the fax responds to
commands so fast (under 2 keystrokes on average) that it may already be
doing things before the user notices.
What we’ve called direct access here is the same as using shortcuts, a feature
the Nokia mobile phone uses, as discussed in section 5.2 (p. 134) earlier in this
chapter; shortcuts give direct access to the corresponding functions.
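Here is the binary-search interaction sketched as code—my own illustration, with invented command names, not the DF200's real function list. The answer function plays the part of the user indicating “before,” “after,” or pressing Doit:

    def binary_search_ui(commands, answer):
        """answer(cmd) returns '<', '>' or '=' relative to the wanted command."""
        lo, hi = 0, len(commands) - 1
        steps = 0
        while lo <= hi:
            mid = (lo + hi) // 2
            steps += 1
            reply = answer(commands[mid])
            if reply == "=":
                return commands[mid], steps
            lo, hi = (mid + 1, hi) if reply == ">" else (lo, mid - 1)
        return None, steps

    commands = sorted(["copy", "date", "fax", "help", "print", "redial", "scan"])
    target = "print"
    found, steps = binary_search_ui(
        commands, lambda c: "=" if c == target else (">" if target > c else "<"))
    print(found, "found after", steps, "answers")  # about log2(7) ≈ 3 steps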
From these five different user interface designs, we can say that direct access seems
best, though it requires a help sheet or a very skilled user who has memorized the
codes. Binary search seems next fastest, but its user interface is not very intuitive.
Linear search is very simple to use, but slow. Some of the approaches, such as lin-
ear search, are so simple and use so few keys they can be combined with other ap-
proaches, and thus achieve the advantages of both. Once you start thinking about
ways to design a user interface, there are lots of possibilities—Huffman codes, for
instance—and it isn’t difficult to compare them and see what tasks they are better
at for the user. Our quick survey of different approaches forces two questions:
where do new design ideas come from, and why didn’t the DF200 designers use a
better approach?
5.7 A computer science of interaction
If we added up all the time people waste on messing around with faxes, a few
moments' thought spent on the user interfaces would reap rich rewards for
humanity.
their problems. Designing user interfaces can also be seen as designing computer
hardware: what are the basic operations a user (or a computer) needs to run good
algorithms? And information theory (particularly Shannon’s information theory)
underlies everything.
Shannon’s information theory was briefly reviewed in box 5.3, “Information
theory” (p. 127). From chapter 6, “States and actions,” onwards we’ll start to
explore exactly what the user interface “structure” is, and how we can think
about it clearly to help make better designs.
5.8 Conclusions
Mobile phones bridge to the past when they ring with their distinctive ··· –– ···
sound, announcing the arrival of a Short Messaging System text message,
using Morse code to say SMS. This chapter, too, linked to the past with a brief
history of communication from the earliest times. We surveyed long-distance
communication over a period of three millennia, from hilltop fires to
digital mobile phones. The earliest example covered 500 miles; the last example
covered the inches between your hand and your mobile phone. The insight is that
the general theories of communication, which we illustrated with Morse and Huff-
man codes, apply equally to conventional communication problems and to
modern user interface design problems. We saw that the theoretically-motivated
design ideas did better than established market-leading commercial designs.
To conclude, it's worth underscoring two points:
User interface design can always be much improved by a little abstract,
programming-oriented thinking.
Success in design depends on usability—from the issues of the balance of training
versus usability we saw in the earliest telegraphs, to present-day mobile
phones, where the right design balance depends on the task the user is doing.
When users want to use a device, they need to find the device’s functions and
access them. From an interaction programming point of view, this is a search
problem: the user interface has to be searched until the user has found what they
want. We know from programming that there are many ways of searching—linear
search, hash codes, binary search, to name but a few. Many of these techniques can
be taken over into the user interface and assessed for their effectiveness for a user
and real tasks. We found, for instance, that a technique based on hashing made a
mobile phone’s user interface much easier to use.
There are very many books on algorithms, almost all of which include chapters on
searching.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to
Algorithms, MIT Press, 2nd ed, 2001, is an excellent undergraduate reference
covering almost every relevant topic.
Knuth, D. E., Sorting and Searching, Addison-Wesley, 2nd ed, 1998—which is
volume 3 of Knuth’s massive The Art of Computer Programming. Volume 1 of
The Art of Computer Programming gives excellent details of Huffman coding.
Knuth’s The Art of Computer Programming is the most detailed and encyclopedic
reference book in computer science.
MacKay, D. J. C., Information Theory, Inference, and Learning Algorithms,
Cambridge University Press, 2003, is one of the best and most wide-ranging
books on information theory I know.
Shannon, C. E., and Weaver, W., The Mathematical Theory of Communication,
University of Illinois Press, 1998. Claude Shannon’s epochal work is still in
print.
Skiena, S. S., The Algorithm Design Manual, Springer, 1998, is an excellent
resource giving details of a huge variety of algorithms, as well as their
implementation on a CD and on the web.
6 States and actions
Interactive devices and systems range from implants to wearable computers and
from clocks in our central heating systems to flight management systems in air-
craft. Behind all interaction programming is a simple, core idea struggling to get
out. Behind all unnecessarily complex interaction programming is a simpler idea
struggling to be heard.
This chapter introduces and explores the key concepts behind interaction pro-
gramming, which hinge on finite state machines. It turns out that finite state ma-
chines are easy to draw, and drawing them gives us more insight into design—
finite state machines are pretty good at showing what users do too; that is, they
can be used for more than design—they can be used for understanding.
6.1 Key concepts
Figure 6.1: A familiar example of being moded out by a dialog box, here on an Apple
Macintosh computer. To continue, the user has to choose OK. The program displaying
this dialog box will allow the user to do nothing else; the user has been moded out.
Here the user tried to change the name of a file, but they chose a name that was already
taken. The dialog box tells them so, but the user is unable to look at file names to find
out what the problem is; nor can the user try any other solutions, such as deleting or
renaming the other file.
typically offering them a simple choice (the clearest example is the intrusive
dialog box that says something . . . OK? and that’s the only choice, to click on
OK). When this happens, all other actions are inaccessible! Figure 6.1
(this page) gives an example. Fortunately this is becoming less of a problem on
computers, but it is still a real issue for interactive devices, like DVD players
and ticket machines.
Moding out is not always a problem. On a gun, to give a stark example, you
might want a safety catch, which modes you out of the possibilities of unsafe
or accidental use.
State In a given state, every action always works in exactly the same way (though
perhaps doing nothing). In contrast, modes need not be very specific. For
example, the “on mode” of a device only tells you what the Off key does—it
doesn't tell you exactly what anything else will do, whereas knowing the state
does.
Thus a mode tells you what one or perhaps some user actions will do, whereas a
state tells you everything and exactly what the device will do. A state tells you what
any button will do.
For example, all televisions have two very important modes: on and off. A TV
can be in the on mode or in the off mode, and the TV modes broadly define what
user actions will do. When a television is off, the only button that will work is
generally the On/off button, which should switch it back on. If the TV is off, most
buttons will do nothing, but when it is off, it can be switched on. When a TV is on,
the On/off button will switch it off, but that’s just about all we know about it. To
know what all the other buttons do when the TV is on, you need to know more—
you need to know more about the states or other modes within the on mode. The
on mode only tells us what On/off will do; it doesn’t tell us what any other buttons
will do.
Figure 6.2: As well as being modey, the dialog box in figure 6.1 is rude—rude in the
sense of section 6.3.4 (p. 185). The dialog box shown here precedes it, giving you a
choice that the computer will immediately forbid: if you choose “Use .html,” it will tell
you that you must choose another name, using the dialog box of figure 6.1. This dialog
box is giving you a choice that it won't let you take.
Box 6.1 Syringe pumps A syringe pump is a device designed to operate a medical syringe
automatically, to give a patient a controlled dose of a drug over a period of time. They
are often used in pain control, where regular doses of painkiller are needed. Some pumps are
portable and can be used and carried around by a patient.
In one case, a pump was set to deliver 0.1ml/hr (a tenth of a milliliter of drug per hour) to
a baby. However, the syringe pump had not been activated to start delivery of the drug. The
nurse had told the pump to deliver 0.1ml/hr and then completed her task. If you (pretending
to be a pump) had been told to deliver 0.1ml/hr, you’d be expected to do it. The pump
didn't, and wanted further confirmation from the user—causing a post-completion error.
After two minutes, the pump started beeping, to warn the user that it had been told to do
something that hadn't been confirmed. To stop the noise, the nurse reached over and pressed
a button to cancel the beeping. Unfortunately, the baby died shortly afterward.
The nurse had pressed not Cancel but 10 , which canceled the beeping, but unfortunately
also started the pump delivering at 10.1ml/hr—a hundred times faster. This was clearly a
design error, particularly as post-completion errors are well understood.
Post-completion errors are discussed in section 11.4 (p. 387). Another example is
discussed in box 5.7, “The principle of least effort” (p. 147).
Other errors with syringe pumps have included the following, all avoidable by better
interaction programming:
7ml/hr entered as 77ml/hr, by pressing 7 twice by mistake.
0.5ml/hr entered as 5ml/hr. The nurse here didn’t press · hard enough, and didn’t
realize the mistake.
10ml/hr entered as 200ml/hr. Here the cause was a partly-obscured display. Displays
should be legible, and sound feedback should be provided—using a synthesized voice
would be easy.
A pump defaults to 0.1mg/ml, but the syringe was filled with a concentration of
1mg/ml, so the patient got a dose ten times too high.
since no interesting effect happens until several actions have happened, the entire
sequence of actions seems a single event so far as the user is concerned. Whether
this matters or not is a question for the user, designer, and application. In the
long run, a user might be confused and then make innocent mistakes. Chunking
is also an issue for us, as interaction designers. Do we want to treat every state
as different, or do we want to gloss some differences? We might treat playing all
the tracks on a CD as different states, but do we also want to treat all the positions
within the tracks as different? Are we even worried about different tracks? On
the one hand, we don't want to be overwhelmed by detail; on the other, we
want to be precise and design good systems.
particularly useful form of finite state machine, or FSM. Finite state machines turn out
to be very useful.
Many things can be described using states and actions.
A web site has HTML pages (states) and links to other pages. Links are the
actions to go to other pages.
A flashlight has states, such as ON and OFF. It has actions, such as SWITCH
ON and SWITCH OFF.
An aircraft landing system has numerous states, including ON GROUND and
IN AIR. Its actions will include LAND and TAKE OFF.
A television set has states such as OFF, WATCHING CHANNEL 1,
WATCHING CHANNEL 5, and it has actions such as ON/OFF (a combined
switch on and switch off) and NEXT CHANNEL.
A dog has states such as sitting, walking to heel, lying down. It has actions,
usually voice commands: “sit!” “heel!” and so on. Many dogs have modes,
such as “obedient” and “deaf.”
A computer has lots of states, and the actions on it include mouse clicking as
well as switching it on and off.
Many systems, from mobile phones and televisions to car radios and desktop
computers, have menus. Each menu item is a state: the state where that menu
item is displayed (or said) to the user; each action is either selecting another
menu item, or activating the current selection.
Figure 6.3: Interactive devices and web sites are both finite state machines and are
thus essentially the same thing. Here, a trivial two-page web site behaves exactly like
a simple flashlight. The web site has a page for the flashlight being on, and a page for
the flashlight being off. The two pages are linked together by the form buttons, which
are behaving like the flashlight’s switch.
activities the computer is up to that have nothing directly to do with the present
task.
A gadget can be understood using states and actions, and so can a web site.
Can we get any mileage out of this similarity? Yes, we can. You could build
a web site that behaved exactly like a gadget: when you click on the web site
(using a browser) you would be taken to new pages that show you what would
have happened if you’d pressed the corresponding button on the gadget. The web
site might be just descriptive text, but it could be made up from lots of pictures
(perhaps image maps), and if you did it really well, the web site would look and
behave exactly like the device it described.
We will build web sites from devices in chapter 9, “A framework for design.”
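As a taste of that idea, here is a sketch that spins the two-page flashlight site of figure 6.3 out of a state machine. The file names and markup are my own choices; the book's framework does this differently.

    fsm = {  # state -> {button label: next state}
        "off": {"Switch on": "on"},
        "on":  {"Switch off": "off"},
    }
    for state, actions in fsm.items():
        links = "".join(f'<p><a href="{nxt}.html">{label}</a></p>'
                        for label, nxt in actions.items())
        with open(f"{state}.html", "w") as page:
            page.write(f"<html><body><h1>Flashlight is {state.upper()}</h1>"
                       f"{links}</body></html>")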
You could imagine having a web site that looked like the gadget and a corre-
sponding web site that told you what the gadget was supposed to do in each
state. In fact there could be the obvious links between the two sites, and from any
picture of the user interface you could immediately get the descriptive help for
that state. For any bit of the help text, you could immediately transfer over to the
simulation of the gadget, and it would be in the right state.
Of course it’s quite irrelevant really that these things are web sites. We could
have a real gadget that does exactly what it is supposed to do, and it could have
a help button on it that displayed the “web help” on the gadget’s own screen. If
it was something like a DVD player or video recorder, it could use the TV screen
that’s already there.
This easy way to slide between different representations is the power of finite
state machines. It doesn’t matter whether they are DVD players, TVs, web sites,
computers, or bits of string.
Turing Machines are not the only possibility. There are very many alternatives to
finite state machines, and although they are important topics in their own right, this
book doesn't cover pushdown automata (PDAs), Petri nets, and a variety of other
viable alternatives, such as using programming languages (including modeling
languages like CSP, Spin, SMV, and even regular expressions) that compile to finite
state machines.
continuous states without going into detail. For example, we can consider the
volume on a TV as either off or on, even though in the on mode there might
really be a continuum of volume levels. For the purposes of design (and use),
all those levels should work the same way—they are just one mode. In fact, it
would be a pretty peculiar design that had a continuous feature that somehow
behaved in a wide variety of ways and therefore needed lots of modes to
describe it.
Forbid If a device requires more than a dozen or so states, then it is going to be a
complicated device; if it needs thousands of states, then it is a very complicated
device. As designers, we can decide that we are not going to countenance
building overly complex devices. It is always best to begin with a device that is
a small simple discrete system, then expand it with features if necessary.
Strategize . . .
The fourth choice is to think at a strategic level—see part III and particularly
chapter 12, “Grand design.”
One of the main continuous issues we cannot avoid handling is time. On many
devices, how long a user presses a button or how long a user waits before doing
something is a key part of the design. How then can we handle time?
One approach is to say “time is bad” and refuse to design with any timing con-
straints. In fact, this is such a popular approach that systems like this have a special
name: a system that essentially takes no time at all to respond to actions applied
to it is called a reactive system, because it can be thought of as providing instant
reactions to the user.
Generally, pure reactive systems are easy to use because the user doesn’t have
to understand or worry about any timing issues. If a button is pressed, it does
one thing. It doesn’t matter how long or briefly the user presses. If you leave a
reactive device alone, it won’t forget what it was doing; the user can have a break
and return to exactly where they were.
Reactive systems are much easier to use because things mean the same however
slow (or distracted) or however fast the user is.
The problem is that reactive systems are an idealization; many common systems—
like mobile phones—do have delays and timing issues, and we must know how
to proceed. People don't answer their phones immediately, so there is a delay be-
tween calling somebody and their answering (if they do). Or, when you switch
a mobile phone on, it takes a moment to get working fully. If a user asks a DVD
player to rewind, it can't happen instantly. In particular, with a DVD player,
if a user does Rewind Stop they can expect something quite different to happen than
if they do Rewind wait Stop . In short, most things are not perfect reactive systems.
Timeouts are discussed in box 6.2, “Timeouts” (next page).
We can make time itself into an action. A clock can tick, and ticks can be
actions—the user doesn’t have to do all actions on a device. The device can be
acted on by its environment, and time is one of the things that can act indepen-
dently. If we wanted to be imaginative, we could have a button on a device called
Box 6.2 Timeouts Many systems use timeouts: when the user does nothing for a second or
two, or takes several seconds pressing a single button, different things happen.
The main case where timeouts are useful—but often misused—occurs in walk-up-and-use
devices that are intended to be used by a queue of people. The second person does not want
to start with the mess left by their predecessor, so a timeout is used to reset the system
for the next person. Actually this would be far better done by a heat or other sort of body
sensor than a timeout: what happens if the first person is too slow or the second too fast?
When trivial things like airport toilets use proximity detectors to flush themselves when you
move away from them, why can’t more sophisticated gadgets? Why can’t ticket machines
detect people who are present and using them, or when they walk away and stop using them?
Why guess with timeouts when you can do better?
My favorite example is my old JVC video recorder, which had a timeout that would reset
the device when you were setting up UHF channels. In the couple of minutes you took to
follow the manual—which you need for something complex like setting up UHF channels—the
video recorder would have timed out, guaranteeing you couldn’t follow the manual’s
instructions!
Section 3.8.3 (p. 75) described timeouts for a washing machine, and box 6.4, “Bad
user interfaces earn money” (p. 191) shows the problem with burglar alarms.
Ironically, the burglar alarm problem arises with a device that does have a sensor
that “knows” when the user is around!
Tick and we’d employ a little gnome to press this button every second (or every
millisecond, or whatever was necessary).
This hidden action idea is sufficient to treat timeouts in principle, but it is rather
messy because it creates an enormous number of states to keep track of how many
times the Tick action has occurred.
Another sensible approach, the one we’ll use, is to imagine clever gnomes inside the device. These gnomes have a stopwatch and decide, using their detailed and intimate knowledge of the device design, what actions the user is really doing. If the user presses the Off button, the gnomes start counting. If the user holds Off for three seconds, the gnomes say, “OK, that wasn’t an off action, it was a really-off action!” If the user presses Off Off in quick succession, the gnomes say, “OK, that was a double-off action, not two separate off actions.” Now the finite state machine (there aren’t really any gnomes) has actions like one-off, really-off, and double-off. In effect, the user’s complicated, time-dependent actions are translated into the device’s internal actions.
Henceforth, we will only talk about internal actions. We want to gloss over
gnomes; we will talk about internal actions as if they are user actions. It is much
easier, at least for most of our exposition in this book, to say things like the user
presses a button Off , and the action off occurs, or presses a button Double-Off , and
the action double-off occurs.
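To make the gnomes’ job concrete, here is a minimal sketch, in Java, of a layer that turns raw, timed button events into the internal actions off, really-off, and double-off. Everything here—the class and method names, and the three-second and 400-millisecond thresholds—is illustrative, not taken from any real device:

    enum InternalAction { OFF, REALLY_OFF, DOUBLE_OFF }

    class OffButtonGnome {
        static final long HOLD_MS = 3000;    // hold at least this long: really-off
        static final long DOUBLE_MS = 400;   // two presses this close: double-off
        private long lastRelease = -1;       // when Off was last released, or -1

        // Called when the user releases the Off button, with timestamps in
        // milliseconds; returns the internal action the state machine sees.
        InternalAction offReleased(long pressedAt, long releasedAt) {
            if (releasedAt - pressedAt >= HOLD_MS) {
                lastRelease = -1;
                return InternalAction.REALLY_OFF;    // a long hold
            }
            if (lastRelease >= 0 && pressedAt - lastRelease <= DOUBLE_MS) {
                lastRelease = -1;
                return InternalAction.DOUBLE_OFF;    // two quick presses
            }
            lastRelease = releasedAt;
            return InternalAction.OFF;               // an ordinary press
        }
    }

A real implementation would also need a timer to withhold the first off until the double-press window has closed; this sketch simply reports an ordinary off first—which is exactly the kind of timing subtlety the gnomes are there to hide.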
State diagrams of realistic devices are huge and practically impossible to draw (unless you use a computer), and certainly impossible to understand and follow. Fortunately this is a problem we can leave for later. For the time being, we want to draw and learn how to understand simple devices. Later we can worry about how to handle more complex devices, which we may not want to draw at all.
Chapter 7, “Statecharts,” describes a better way of drawing devices that
handles more complex designs; chapter 8, “Graphs,” describes the generic
theory that can handle anything; finally, several chapters—particularly
chapter 9, “A framework for design”—show how to write programs that can
explore and answer design questions.
We can put the states and actions together to make a complete state machine
diagram. We’ve bent the arrows to make a more pleasing diagram—the shape of
the arrows doesn’t matter, though.
[Diagram: two states, off and on; the On action leads from off to on, and the Off action leads back from on to off.]
The flashlight can be in two states, but which state is it in when you get it? In general, we need to indicate the starting, or default, state a device is in when we get it. By convention, we indicate the default state with a special sort of blobbed arrow, or default arrow, thus:
[Diagram: the same two-state machine, now with a blobbed default arrow pointing at the off state.]
176
6.2. Drawing state machines
In other words, this flashlight will start its life in the off state. The meaning of this
complete diagram can be spelled out:
When we get the machine (the flashlight), it will be in the off state. We show
the default state by the blobbed arrow.
If the machine is in the off state (left-hand circle), then doing On will take it to
the on state.
If the machine is in the on state (right-hand circle), then doing Off will
take it to the off state.
The diagram doesn’t show what happens when we do On in the On state, or
Off in the Off state. By convention, if there isn’t an appropriate arrow for an
action, the device does nothing—but this is a convention to stop diagrams
from getting visually cluttered, not a conceptual convention.
Often in this book, to avoid visual clutter, we won’t write down the labels on all
arrows.
[Diagram: a gun’s safe state, with every action’s arrow looping straight back to the safe state.]
From the diagram, it’s clear that every action anybody can do in the safe state
takes us along the arrow and back to the safe state—nothing ever happens. This
isn’t quite what we had in mind, since it isn’t possible, as it’s drawn, to get out
of the safe state and do anything else. Even releasing the safety catch, as drawn,
takes us back to the safe state.
Clearly the safe state needs an arrow coming out of it for when the user releases
the safety catch, putting the gun in the fire mode. When the safety catch is off, the
gun can shoot. The arrow coming out of it for the safety catch release action will
go to an unlocked state, where the gun can do more things.
In the next, more accurate, diagram the safe state has two arrows leaving it: one for the safety release action, the other for every other action (which still does nothing).
In the case of this shooting, somebody was trying to remove the gun’s ammuni-
tion clip, or magazine, so that the gun would have no bullets in it. Unfortunately,
the design of the Bryco requires that the safety catch must be off in order to re-
move the magazine. So to empty the gun, you first have to release the safety catch;
by design you must get the gun into the unlocked state to do this—and in the un-
locked state you can, perhaps by accident, pull the trigger and go to the shoot
state. From the shoot state, the gun fires and returns itself to the unlocked state
when you release the trigger. So, the arrow from shoot to unlocked is done by the
gun, rather than by the user, as all the other arrows are.
[Diagram: states safe, unlocked, shoot, and magazine removed; the safety release leads from safe to unlocked, pulling the trigger leads from unlocked to shoot, and the gun itself returns from shoot to unlocked.]
For most guns, there may still be a bullet left in the gun chamber when the magazine is removed. This is an intentional feature, so the gun can still shoot while the magazine is out being replaced or reloaded.
Here is a more accurate diagram, then, representing this possibility:
[Diagram: as before, but the magazine removed state now leads to a shoot once state, showing that the round left in the chamber can still be fired.]
There’s still an error in this diagram, as it incorrectly gives the impression that you
can continue shooting with the magazine removed. The shoot once state returns to
the magazine removed state, and the state diagram strictly does not say you can’t
go back to it and shoot again—it’s only the name that says it can only shoot once
in this state. In fact, even with the magazine in, you cannot shoot indefinitely—the
gun runs out of bullets. The diagrams are not drawn with sufficient detail to count
bullets; instead, they just show the “ability to shoot.”
A basic rule of gun safety is that you should always assume a gun has a bullet in it, and that’s the behavior the diagrams show too. In practice, you should assume the safety is off as well, but here the diagrams do not assume that, since they show the safe state.
The gun would have been safer if the magazine could have been removed with
the safety catch engaged; arguably, it was a design flaw that the safety had to be
released. The gun would have been even safer if the magazine could only have
been removed with the safety catch engaged. However, for a gun that might be
used in fighting, it is useful to be able to remove the magazine with the safety off,
so you can shoot the remaining bullet in the chamber while you are reloading.
The designers had to decide whether the value of being able to shoot once in such circumstances outweighs the value of the gun never shooting accidentally when the magazine is removed; or they could have considered the costs and benefits of another safety mechanism (which might have given the user the choice between “cannot shoot with magazine removed” or perhaps “cannot shoot at all” and “can shoot with magazine removed”).
Making a device safer may make it more complicated, which in turn may make
it harder to use and hence, ironically, less safe. Chapter 12, “Grand design,”
discusses how we can make more sophisticated devices without making the
design more complex.
For guns, which are mechanical, every feature has a direct cost, both in manufacturing and in the risk that the mechanism may wear and become faulty. In contrast, adding features to electronic devices has essentially no cost, except that the user interface becomes more complex—especially if a feature is added without adding further controls, because then existing buttons have to have more meanings to cope with the new feature. So the design tradeoffs are very different, but the underlying theory is the same.
Some people argue that the best safety device for a gun is the user—that is, a
sensible user would never aim a gun at somebody (unless they wanted to shoot
them), so the child would not have been shot accidentally. This is the “guns don’t
kill, people do” argument. It is an argument that encourages gun manufacturers
to blame the users and avoid improving their products; obviously to admit other-
wise would be to expose themselves to all sorts of liabilities. If the same argument
were applied to user interfaces more generally, then the user would be to blame
for all problems, and this would similarly create a culture where nobody wanted
to improve design for the sake of users. Of course, this is a politically charged
argument; but so is usability.
Car manufacturers took a similar position (“cars don’t kill, drivers do”), until
the exposé by Ralph Nader in the 1960s; see box 2.3, “Ralph Nader” (p. 51).
Figure 6.4: State diagrams do not have to describe devices and machines; they can describe what people do too. Here is a state diagram that says what you and a phone can do: its states are waiting, ringing, picked up, and talking, with actions such as “get fax” and “call friend.”
There are other ways of drawing state diagrams that will allow us to relax some of the rules when drawing. Behind our casual drawings there is always, in principle, a rigorous diagram drawn strictly according to these rules; the system—if not the actual diagram—must obey the rules.
There are nearly twenty rules. The rules are all pretty obvious once you think
about them. It is surprising how easy it is to build a device and accidentally forget
some detail that these rules would have caught. The rules make sense for drawing
and for using a device, but the rules are not visible in ordinary programs. It is
possible to write a program that makes a device work, but for it to have elementary
mistakes that go undetected.
1. Circles represent states. Sometimes rectangles are also used in this book.
5. States and actions are labeled, but if we are worried about the visual clutter
in the diagrams, we may omit some or all of the labels.
[Diagram: an arrow labeled with an action leading from one state circle to another.]
7. Every state has exactly as many arrows pointing from it as there are possible
actions, one arrow for each action. For example, if a device has b buttons,
every state will have b arrows pointing away from it. Arrows coming out of
a state have different labels.
8. Arrows can start and finish at the same state; these are called self-arrows.
For clarity, self-arrows are often omitted from a diagram, because they are
implied by the previous rule.
[Diagram: a state with a self-arrow.]
9. A state whose arrows all return to itself is called a terminal state; it is a
state where the system jams—which is undesirable for interactive systems.
Terminal states are (almost always) an error, for they represent a state
where no action can do anything and nothing further can be achieved.
Explosive devices, fire extinguishers, and a few other single-use devices,
typically ones that destroy themselves, have terminal states by design.
10. If we are trying to avoid visual clutter by not showing action labels, we may
merge arrows going between the same states.
12. The initial or default starting state (if we want to indicate it as such) is
pointed to with an arrow coming from a small black circle. Only one state is
indicated as the initial state.
[Diagram: a small black circle with an arrow pointing to the initial state.]
13. States with no arrows (and no default arrow) pointing toward them represent
states that can never be reached. They always represent an error (or possibly
that the designer has not yet finished the device).
14. Finally, with the exception of single-use and similar devices (see rule 9
above), it must be possible to follow a sequence of arrows from any state to
every other state. This important requirement is called strong connectivity.
If a state machine is not properly connected in this way, either there are
some states that cannot be reached by any sequence of actions, or there are
some states (more generally, sets of states) that a user cannot escape from
back to the rest of the device.
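Strong connectivity is exactly the kind of rule a design tool can check mechanically. Here is a minimal sketch, assuming the machine is given as a transition table next[s][a] (an assumed representation, not the book’s framework), that searches from every state and reports whether every other state is reachable:

    import java.util.*;

    class Connectivity {
        // next[s][a] is the state reached from state s by action a.
        static boolean stronglyConnected(int[][] next) {
            int n = next.length;
            for (int start = 0; start < n; start++) {
                boolean[] seen = new boolean[n];
                Deque<Integer> todo = new ArrayDeque<>();
                seen[start] = true;
                todo.push(start);
                while (!todo.isEmpty()) {            // depth-first search from start
                    int s = todo.pop();
                    for (int t : next[s])
                        if (!seen[t]) { seen[t] = true; todo.push(t); }
                }
                for (boolean reached : seen)
                    if (!reached) return false;      // some state is unreachable from start
            }
            return true;
        }
    }

For the flashlight below, with a removed state but no replace arrows, this check fails—which is how a tool would catch the problem before a user does.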
[Diagram: the flashlight with three states—off, on, and removed—where a Remove action from either off or on leads to the removed state.]
We can now remove the lightbulb, but as drawn above, we would get stuck
with the lightbulb removed and be unable to do anything else: there is no way out
of the removed state as the diagram is drawn. Let’s add replace arrows, as in the
diagram below. (I’ve removed the other arrow labels, as they’d make the diagram
too cluttered.)
Box 6.3 Mealy and Moore machines There are many variations of finite state machine.
The simplest kind merely recognizes a sequence of actions; such machines are sometimes
called finite state acceptors (FSAs). Finite state acceptors are familiar to programmers in the form
of regular expressions, which are a concise and common way of specifying string patterns:
regular expressions are compiled into finite state acceptors that then “accept” when the
regular expression has been matched.
Regular expressions are used in section 12.6.4 (p. 432).
Unless we are modeling a single-use device that does only one thing, like matching strings,
these acceptors are not very useful for us. In this book, we are concerned with finite state
machines (FSMs) that can provide output as a user interacts with them—an interactive
device is only useful if it does something. For example, a microwave oven must be able to
cook, and a mobile phone must be able to allow the user to manage a call.
The two main sorts of FSMs are Mealy machines and Moore machines, named after their
inventors. They differ in when they handle actions. In a Moore machine, the device does
things in states, so a Moore machine may have a cook state or a call state. In contrast,
a Mealy machine does things between states, on its transitions. When the user presses a
button a Mealy machine does its action as it goes to the next state. The action of a Moore
machine depends only on the current state, whereas the action of a Mealy machine depends
on the current state and the user’s action, because the effects happen on the transition.
Although the two approaches are equivalent, a Moore machine requires more states (that’s
how to remember which is which) because a Mealy machine can use different transitions—
even to and from the same states—to do different things, whereas a Moore machine must
have different states to do different things. This book uses Moore machines.
If we allow the number of states to be unlimited, then we have a labeled transition
system (LTS). Of course, FSMs are finite cases of LTSs, and some people use the terms
almost interchangeably. Another variation is to make FSMs nondeterministic, so that they
can effectively be in more than one state at a time; this makes them more concise and, you
would think, more powerful. However, nondeterministic FSMs (NDFSMs) are mathematically
equivalent to ordinary deterministic FSMs, because the states of a deterministic FSM can
represent the possible sets of states of a nondeterministic FSM.
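A Moore machine is easy to express in code: the output is a function of the current state alone, and the user’s actions only change the state. Here is a minimal sketch (the flashlight, the names, and the enums are illustrative, not the book’s framework):

    enum State { OFF, ON }
    enum Action { PRESS_ON, PRESS_OFF }

    class MooreFlashlight {
        private State state = State.OFF;              // the default state

        void act(Action a) {                          // the transition function
            switch (a) {
                case PRESS_ON:  state = State.ON;  break;
                case PRESS_OFF: state = State.OFF; break;
            }
        }

        // Moore output: it depends only on the state, not on how we got there.
        boolean lampLit() { return state == State.ON; }
    }

A Mealy version would instead attach the output to each transition—say, by having act return what the lamp does—so the same state could produce different effects depending on which arrow was taken to reach it.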
[Diagram: the flashlight’s off, on, and removed states, now with Replace arrows leading back out of the removed state.]
We’re breaking our rules: we need one arrow to say which state the flashlight will be in when the bulb was removed when it was off, and another to say which state it will be in when the bulb was removed when it was on. In short, when we replace the bulb, the flashlight might be on or off. Now we have a problem! When the user does
the action “replace,” the bulb might end up on or off, and we do not know which; that is, this device design is nondeterministic.
Once the problem is detected, we then have to decide what to do. Perhaps we
could make the state machine more realistic: actually, the bulb can be removed
when it is on or off, so the single state is actually two states—this might be fine
for a low-voltage device. Or perhaps we should modify the design, so that when
a bulb is removed the state of the device changes to off, even if it was on. This is a
nice solution: it removes the nondeterminism and makes the device safer as well,
particularly if the device uses high voltages (like an ordinary domestic light, which
of course has the same transition diagram). Or we could make the device have “no user serviceable parts” and simply make it impossible to remove or replace the bulb—this might be a good solution if the light bulb were an LED, which has a very long lifetime and is usually fixed in place; having no socket makes the device even cheaper.
This is an example of how a drawing rule is really a law about interaction programming. Indeed, once we start thinking of laws for drawing or designing interactive devices, it becomes obvious that there are many more that can be checked.
If a state has only one arrow leaving it, the next state is inevitable; all the user can do is say when. The only way they can stop the next state from happening is by walking away—and then someone else might do it.
Almost certainly a state with a single arrow coming out of it is a design mistake:
it gives the user no choice (other than to delay). If we wanted the user to confirm
a choice, then there would need to be at least two arrows: one for yes and one for
no (perhaps more). Certainly any state with a single arrow coming out of it can be
found automatically in the design process, and questions can be asked about what
it is doing: if there is no time delay issue, then the state should be removed.
Note that during use, arrows can disappear if states or actions become impossible or forbidden. In other words, a check like whether the number of arrows leaving a state is one (or fewer!) is a check that may need to be done dynamically, while the device is in use.
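This audit can also be automated. Here is a sketch, assuming each state’s currently enabled transitions are available as a map (an assumed representation, not the book’s framework), that flags states offering the user at most one real choice—counting distinct next states and ignoring self-arrows:

    import java.util.*;

    class ChoiceAudit {
        // enabled.get(s) holds the states reachable in one enabled action from s.
        static List<Integer> statesWithNoRealChoice(Map<Integer, Set<Integer>> enabled) {
            List<Integer> suspects = new ArrayList<>();
            for (Map.Entry<Integer, Set<Integer>> e : enabled.entrySet()) {
                Set<Integer> targets = new HashSet<>(e.getValue());
                targets.remove(e.getKey());          // a self-arrow is not a choice
                if (targets.size() <= 1)
                    suspects.add(e.getKey());        // one exit or none: question it
            }
            return suspects;
        }
    }

Run statically, it finds the structural cases; run while the device is in use, with the currently forbidden arrows left out of the map, it finds the dynamic ones too.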
All laws have exceptions. Although a state may allow the user only one choice—
apparently no choice at all—and therefore be unnecessary from a choice point of
view, there might be an opportunity to tell the user something, to help them un-
derstand what is about to happen, or there might be legal warnings.
An email program, to take one example, might want to pop up a dialog box
that tells the user that new mail has arrived; this might be a case where the only
response is OK—but why not OK and read it? It’s a design choice that we could
easily uncover automatically by checking the state machine design.
If there is only one exit from the help or legal state, the user cannot then change
their mind. So in what sense is the help helping? Surely, if a user knows or under-
stands something better, they might realize their choice was a mistake. So the state
should not be removed so much as provided with an alternative exit: say, cancel.
A device should not offer options that do not do anything. Many systems provide menus of features, with some features
available on the device, and some on the internet. Mobile phones (like the Sony Er-
icsson) have a menu hierarchy that is partly on the phone, partly on WAP (“wire-
less application protocol”—a sort-of simplified web service). Users cannot tell
which is which until they select an offered option and the phone then tries to con-
nect. The phone may be useless while it is connecting. The phone might not have
WAP enabled (because the user has decided not to pay for WAP services), so it is
never possible to get any WAP feature. Yet the phone will still try to connect.
There are two solutions. The simplest solution to all of these problems is for the
device to check that a user can complete all parts of a transaction before allowing
it to start. Perhaps the unavailable options should be grayed out; trying to select
them would bring up an explanation. Perhaps they would not be shown at all
(depending on whether this is a permanent or temporary state of affairs). Alter-
natively, users could be warned that they won’t be able to complete the activity
but the device can “take notes” so that when the service becomes available it can
be done automatically from the notes. This is like a more helpful waiter saying,
“Thanks for your order. We’re out of coffee right now, but I will go and get some
from a shop myself for you and be back as soon as possible.” Polite restaurants
like that become famous.
The “don’t be rude” rule is really a generalization of the drawing rule that every
arrow must start and finish at a state circle. A device should never trap the user in
a sequence of arrows they cannot get to the end of. Indeed, many useful interaction
programming laws come from checking routes through the system design, rather
than just checking states and arrows alone.
The “don’t be rude” (or “do be polite”) rule is bigger than just button-operated
devices running simple state machines. Figure 6.5 (facing page) shows how well-
designed menus in computer desktop applications are polite by using visual and
typographic conventions: they show what is possible before the user even starts
to try. Whether a menu item is active or grayed-out is worked out dynamically,
depending on the current state and what is reachable from it, just as it should be
in pushbutton-controlled devices like mobile phones.
Section 11.3.1 (p. 383) introduces some polite pushbutton-type device
techniques.
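Deciding whether to gray out a menu item can use the very same state-machine ideas: enable the item only if its effect is reachable from the current state. Here is a sketch, with an assumed successor-map representation rather than any particular toolkit’s API:

    import java.util.*;

    class MenuPoliteness {
        // successors.get(s) lists the states one enabled action away from s.
        static boolean shouldEnable(int current, int target,
                                    Map<Integer, List<Integer>> successors) {
            Set<Integer> seen = new HashSet<>();
            Deque<Integer> todo = new ArrayDeque<>();
            seen.add(current);
            todo.push(current);
            while (!todo.isEmpty()) {                 // search out from the current state
                int s = todo.pop();
                if (s == target) return true;         // reachable: show the item active
                for (int t : successors.getOrDefault(s, List.of()))
                    if (seen.add(t)) todo.push(t);
            }
            return false;                             // unreachable: gray the item out
        }
    }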
Figure 6.5: A desktop application menu (taken from Grab running on Apple’s OS X; in fact, Grab is the program that makes screen shots, like this one). Choosing a menu item is like pressing a button: it will usually make the application do something. Notice how good practice is to use visual conventions so that the user can see easily whether: (i) a menu item will work immediately, such as “About Grab” and “Hide Grab”—the latter has a keyboard shortcut as well; (ii) a menu item will bring up further menus to select actions from, as in “Services,” because of the triangle symbol; (iii) the menu will bring up a dialog before it does something, as in “Preferences. . . ”; and (iv) the menu item currently does nothing, as in “Show All,” which is grayed out.
I want to wash my hands. Although there is a knob, it doesn’t turn on the water; no water comes out whatever I do. In fact, the whole thing is loose—indicating that people before me have forced it one way or the other. I try the next knob, as there are five of them in a row. Same problem on all of them. Somebody else now tries to get one to work too, and they have the same problem as I do. So, I’m not stupid, at
least. Perhaps there is no water at all?
And then we notice a paper sign: hand-washing is automatic. If you put your hand under the outlet in just the right place, the water flows automatically. The knob is not one of the places that makes the taps come on. Now there’s water, and we soon find out that the knob controls the temperature of the water, not whether it is flowing or not.
Enough people have been misled by this design that there needs to be a notice, and indeed there is a sign above each knob. But who thinks of reading a sign—one that isn’t even where you are looking—to find out how to use an everyday object?
What is misleading here is that the knob allows the user to do an action—a conventional action that would normally get water. It has the affordance of getting water, in fact. But we are wrong, misled. Once misled, we form a different model of what is going on: the device is broken and/or there can’t be any water.
There’s more on affordance in section 12.3 (p. 415).
It’s clearly a bad design, because the owner of the place found it necessary to put up a sign to help users who were misled. We do not want to design interactive devices that mislead; at best, misleading users wastes a lot of time. But whether a user is misled
depends a lot on the user’s expectations, prior knowledge, and so on. Generally, you have to do experiments with real users to find out what they expect before you can be sure a design is all right.
Make things easy to do. Don’t make things too hard—which raises the question of how hard things are to do with the device being designed. Provide undo, so users can recover from mistakes—it’s surprising how many devices have no undo at all. Make it compatible with a previous device—so if a user doesn’t know about new features, it doesn’t matter.
We’ll give lots of examples of measuring how hard things are to do in later
chapters, particularly chapter 9, “A framework for design.” Chapter 5,
“Communication,” gave examples of automatic ways of making design tradeoffs
over how hard things are to do.
[Table: the flashlight’s states—on & OK, off & OK, on & dud, off & dud—showing, for each action, the next state or “not possible.”]
Here it is, drawn as a diagram. In all we need four states, and lots of arrows:
[Diagram: the four states—off & OK, on & OK, off & dud, on & dud—drawn with all their arrows.]
Box 6.4 Bad user interfaces earn money When you leave a house with a burglar alarm,
you enter your secret code and the alarm gives you a minute or so to get out of the building
before the alarm is activated. The delay is a time out and can cause problems.
Georgia Institute of Technology has an “aware house” (www.cc.gatech.edu/fce/ahri) full
of exciting gadgets—and it has a conventional burglar alarm, made by Brinks Home Security,
the sort that could be used in any domestic house. What’s unusual about this system is that
lots of people use the house.
Here’s the problem. When you arm the burglar alarm just before leaving the house, the
alarm warbles for ten seconds to tell you what you have done, then it gives you a minute’s
grace—the timeout—to get out before the alarm is armed. So you leave. But this is a busy
house, and someone else enters within that minute’s grace, without the alarm making any
noise—just as if the alarm wasn’t set. Maybe they settle down to read a magazine in the
living room . . . Meanwhile, the alarm gets itself fully activated. It makes no noise to signal
that it has changed state.
Time for a cup of coffee? Oops, when they get up to walk to the kitchen the alarm
goes off. They have to rush to the control box to key in the secret code. Hopefully they
can remember the code in their panic! If not, the security people who rush to the scene to
“catch the burglar” charge for their services.
Thus the company that supplies or supports the burglar alarm makes a profit out of the bad
user interface design. There is no incentive to improve a design when you can blame the
users for the problems and make money out of it!
See box 6.2, “Timeouts” (p. 174) and the discussion at the beginning of this
chapter for more on how to avoid timeouts.
The clock must keep track of the time of day, the time the alarm is set for (if it is set), and how long the alarm is supposed to ring for. Ringing is an effect, not a state.
Next, the alarm clock could have a button so the user can switch the alarm off
earlier. Hence there is another factor to add to the list above: whether the key
Snooze has been pressed.
We could carry on adding features to our clock; instead, we’ll start to analyze
what we have designed so far. Rather than analyze the full device, let’s first imag-
ine a “broken” alarm clock with a fixed alarm time, say 08:30. (Eventually we
will need lots of these clocks, all “broken” at different times, so we can eventually
change the alarm time by choosing the right clock!) This alarm clock must be able
to display all 1,440 times, from 00:00 to 23:59, so it needs 1,440 states:
Here we’ve shown a state diagram but only drawn four of the 1,440 states explicitly; the dashed line is supposed to indicate all the missing detail.
Each tick of the clock takes the clock to the next state, going around the circle,
and the display changes the time shown, say from 12:39 to 12:40, then to 12:41,
12:42, and so on.
When this alarm clock gets to the state corresponding to 08:30, the alarm is supposed to ring for that and the next three ticks: so there are four states (08:30 to 08:33) where the alarm rings. However, in each of these four states, the user might press the Snooze button and silence the alarm. So we need to add some more states, getting 1,440 + 4 = 1,444 states so far.
In the diagram, we’ve drawn the four snooze states outside the main circle. The
idea is that the gnome’s ticks on their own take the alarm clock around the inner
circle. At 08:30 we’re at the first state explicitly shown on the inner circle, and the
alarm starts ringing. If the user does nothing, the next action will be a tick, and
we go one state further around the inner circle—and the alarm continues to ring.
If the user presses Snooze , we would follow an arrow to the outer arc of states. In
these states, the alarm does not ring, but as each tick occurs we stay moving along
the outer arc, and thus avoid going to any state where the alarm rings. Eventually,
after 4 ticks in the snoozing states, the clock goes back into a state on the main
circle. Until we get back to 08:30 again the alarm will be silent.
Let’s add another feature so we can lock the alarm off, so that it never makes a noise. Thus the 08:30 alarm clock needs another 1,440 states, to show all the times when the alarm will never ring because it is off, in addition to the original 1,440 states when it can ring.
Why can’t it have on/off with just one or two more states? If the alarm clock had a single state for when the alarm is off, then it could not keep track of the time when the alarm was off. Remember, by our rules, we could only have one arrow coming out of this single state labeled ON , and it could only go to one time with the alarm on. So we need 1,440 states, all for alarm off in each of the 1,440 different times.
These 1,440 states are shown on the right of the diagram below. (As before, I have only shown a few of the 1,440 states explicitly.)
When the alarm is switched on, the clock goes from showing the time (perhaps
11:40) to showing the same time and the alarm being set. The double-headed
arrow at the bottom shows what the alarm on/off button does.
If we had been pedantic, there would have to be 1,440 such arrows, to allow the alarm to be switched on and off in each of the 1,440 pairs of states. Really, there are 2,884 arrows, since there is one facing each way plus 4 one-way arrows from the snooze states—these 4 arrows are one way, because when the alarm is off it can’t get back into snoozing, so the right-hand circle has 4 fewer states than the left-hand circle.
Below, we’ve added in just the arrows to and from the alarm ringing and snooz-
ing states to the alarm off states:
So an 08:30 alarm clock, with the possibility of switching the alarm off, and of having a snooze button that immediately silences the alarm, requires 2,884 states.
We’ve only sketched what it would look like.
In reality we need an alarm clock that lets us change the alarm time. In fact, we need an 00:00 alarm clock, an 00:01 alarm clock, and so on. That comes to 1,440 alarm clocks, all much like the 08:30 clock but different.
There will be lots more arrows. Imagine we are running the 08:30 alarm clock and we want to increase the alarm time to 08:31. That means we must have arrows for the “increase alarm time by one minute” action from each of the 2,884 states in the 08:30 alarm clock to corresponding states in the 08:31 alarm clock. In fact we need 2,884 arrows to do this.
It is likely that a real alarm clock would have more buttons to adjust the alarm
time, for example, in steps of 10 minutes and 1 hour, as well as just 1 minute.
The extra buttons do not require any more states, but each button requires many
more arrows of its own—each new button we add to the clock’s design in principle
requires one arrow from every state since the user can press the button whatever
state the alarm clock is in. That’s one of our original rules for drawing diagrams.
Finally, putting all these alarm clocks together into one box makes a giant machine with 1,440 × 2,884 = 4,152,960 states. This is a huge number, and clearly if we designed an alarm clock by drawing the states and actions, we’d get into a mess.
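The counting is easier to believe written out. A program would store the clock as a few small components, but the flat state machine those components generate multiplies out to millions of states (the field names here are illustrative, not any real design):

    class AlarmClockStates {
        // A compact program representation: time of day (0..1439), alarm time
        // (0..1439), whether the alarm is on, and the snooze/ringing countdown.
        int time, alarmTime, snooze;
        boolean alarmOn;

        public static void main(String[] args) {
            long perAlarmTime = 1440 + 4 + 1440;   // alarm on + snoozing + alarm off = 2,884
            long total = 1440L * perAlarmTime;     // one such machine per alarm setting
            System.out.println(total);             // prints 4152960
        }
    }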
To be realistic, you would also need actions to replace dud bulbs with new, working bulbs. Draw
in the extra arrows needed. How about doing a drawing for your car headlights,
including full beam and dipped states? How do your fog lights work? (In Europe,
you can have front and rear fog lights, separately controlled, but they can only be
switched on when the main lights are on.)
Because nobody can see all the states and actions inside, nobody bothers to make them straightforward. Adding a random action seems as good as any other action. With a large device, a missing action, or an action doing the wrong thing, is not noticed—until somebody needs it.
Although a designer might satisfactorily conceptualize a simple system reliably, devices can get big, and there are no general answers to the problem. People are still researching how to represent what’s “important” in design and use.
A confounding problem is that most devices are programmed in ways that ob-
scure things even further. In a typical programming language—like Java, C, C++,
or even Prolog—what the program says as a piece of text and what that program
does, how it interacts with the user, are very different things.
You cannot read a program and see how it will interact; instead, you have to
work out what it will do when it is interacting with a user. You have to simulate
how it will work, going through it step by step.
In fact, interaction is a side effect of all conventional programming. Just as
global variables and goto statements are generally frowned on because what they
do isn’t clear, interaction is never clear from a program.
Functional programming, which tries very hard to be clear, avoids doing any-
thing with side-effects, such as interaction, on principle because it makes programs
too messy to understand! Being event-driven or object-oriented does not change
the basic problem.
When a programmer changes a program, the effect on the user interface can be
arbitrary. In particular, improving a program (making it shorter or neater, say) has
an unrelated effect on its user interface. Conversely, improving a user interface can
have an arbitrary effect on the program code; worse, what might seem like a little
improvement to the user interface can turn into an insurmountable modification
of the program—so no programmer will want to do it.
Designers and users cannot see what they are interacting with: they just see
the changes. There is no “big picture.” Nor can programmers see what users are
interacting with, even looking at the program code—and they are in a slightly
worse position because they may think they can see everything as plain as the
program code before them. But the code is not the interaction programming.
If we insist on a simple representation for interactive systems, namely, one based
explicitly on states and actions, we can pretty much eliminate the problems.
Because the theory of finite state machines is simple, we can build all sorts of
things reliably on top of them—and we can confirm, in many ways, that we are
designing what we intend to design.
Interactive devices themselves are of course one of the main things we can build,
but we can also make user manuals and many other things, such as transition dia-
grams, that will either help users or designers better understand and gain insights
into the design of the device.
The more views of a device we have, the more able we are to get a good grip
on its workings—and the more people who can be involved productively in the
design process. Phrased more carefully, the more reliable views of a device we
have, the more able we are to understand it—since devices are complex we must
generate views and alternative representations automatically. As we will see, we
can automatically generate working prototypes cheaply, so we can get users in-
volved immediately, and we can generate draft user manuals quickly so we can
get technical authors involved immediately.
In contrast, in a conventional approach to programming interactive devices, all
of these things are very hard. To work out any sort of user manual requires a lot of
careful—and error-prone—thinking. Often, writing a user manual is therefore left
until the program is “finished,” since trying to write it any earlier only means that
it will need expensive revisions as well. Ideas a technical author gets while read-
ing or writing the manual—especially ways to simplify the design—are almost
impossible to implement at this late stage.
Well-written programs often simulate other styles of programming by building
virtual machines. Chapter 9, “A framework for design,” shows how
conventional programming can simulate clearer models of interactive programs.
Clearer models can then be used to create multiple views of interactive systems,
for interaction, analysis, manual writing, and so on.
With a little care, then, we can arrange for insights from users or technical au-
thors or other people to feed back into design. Since the state approach is simple
to program, changes that are needed, say, in the user manual are easy to feed back
into the core design. Thus we can have a single design, namely, a finite state ma-
chine, and all aspects of design work can follow from it.
There is only one thing to keep up to date, only one core thing to modify, rather
than many independent versions—prototypes, scenarios, manuals, actual devices,
and so on. We therefore end up with a more consistent, higher-quality design. This
is concurrent design: when properly organized, many parts of the design process
can proceed concurrently.
In contrast, in a normal sequential design process, you daren’t make contribu-
tions to the design at the wrong time because your effort may be wasted; it may be
calling for a revision to some earlier part of the process. In a sequential approach,
there is no point working on the user manual too soon, because the device design
might change. So, write the user manual after the device design is finalized. But
then, if you have (or anyone else has) any design insights from reading the draft
manual, you cannot afford to go back and revise the device design to help fix the
manual.
Typically, a sequential design process has to keep users out of all technical parts
of the design until practically the last moment, and then so much work has been
done on the design that nobody is prepared to make many changes. Concurrent
design is better, but it requires approaching design from a whole different set of
assumptions.
6.9 Conclusions
What have we learned? We’ve defined states, modes, indicators, and actions,
shown how to draw systems clearly, and how to get things done. Implicitly we
have learned something very important: the user of a device, even one as simple as an
alarm clock, hasn’t a hope of being able to check it out—these things are surpris-
ingly big when all interaction details are considered. When people buy a clock,
they will have to take it on faith that it works sensibly—there is no way a user
can explore and understand the thousands of state transitions. Even a paid user
that we might employ as a tester of alarm clock designs cannot work hard or long
enough to check a device reasonably well. Most gadgets are more complex than
alarm clocks, so the user’s problems are usually worse. Most device designs are
not thoroughly checked.
If users can’t understand devices in principle, then designers are obliged to be
all the more careful and honest in the design process. The only way to do that is
to use automatic design tools.
The cost of all these advantages is that state machines are simple, and some
interactive devices would be very hard to represent as state machines. The coun-
terargument to that is that anything that is too hard to represent in this way is
almost certainly too hard to use (it would certainly be too hard to write about in
this book)—that’s because we’ve made a very good stab at solving the key design
problem.
Finally, if we draw state machines, we won’t want to draw all of the states and
actions explicitly, and we could be tempted to take shortcuts (as we did throughout
this chapter) to make the diagrams possible to draw without being cluttered with
massive detail. We therefore need tools to support the design process; we aren’t
going to do it well enough by hand.
One of the nice things about state machines is that the ideas were introduced so
long ago that the research papers about them are now easy to read. These are some
classics I recommend:
Parnas, D. L., “On the Use of Transition Diagrams in the Design of a User
Interface for an Interactive Computer System,” in Proceedings of the 24th ACM
National Conference, pp379–385, 1969. Dave Parnas, now famous for his
software engineering, introduced state machines for designing interactive
computer systems.
Newman, W. M., “A System for Interactive Graphical Programming,” in
Proceedings of the 1968 Spring Joint Computer Conference, American Federation of
Information Processing Societies, pp47–54, 1969. William Newman went on to
coauthor one of the classic graphics textbooks, commonly called “Newman
and Sproull”: Newman, W. M., and Sproull, R. F., Principles of Interactive
Computer Graphics, 2nd ed., McGraw-Hill, 1979.
Wasserman, A. I., “Extending State Transition Diagrams for the Specification of
Human Computer Interaction,” IEEE Transactions on Software Engineering,
SE-11(8), pp699–713, 1985. Tony Wasserman extended state transition
diagrams to make them more useful for specifying interactive systems; he built
many UIMS—user interface management systems, which took state machines
as their working specifications. (There are lots of other extensions to finite state
machines.)
7
Statecharts
The last chapter drew diagrams for state machines, but as they got bigger they
quickly got messy. The more states and arrows there are, the worse it gets. A better
method, using statecharts, helps draw complicated state machines more easily and
more clearly.
A typical device will have an Off button and every state will need a transition
for that button to go to the off state. That requires as many arrows as there are
states, including the arrow to go from the off state to itself, as the off button usually
keeps a device off when it is already off. Yet Off is a simple enough idea. Can’t we
avoid all the visual clutter?
In contrast to a transition diagram, which shows every transition as an arrow, a
statechart is much easier to draw and visually much less cluttered. An Off button
would only need one or two arrows, depending on how the statechart is drawn,
regardless of how many states the device has. That’s a huge saving.
Statecharts have many other useful features too. Typical interactive devices be-
come much easier to draw and to understand—as the diagrams are much simpler,
and they show many important properties of a design much more clearly.
Statecharts are quite complicated things in their full glory, so we shall only introduce the basic ideas here. Statecharts are a part of the Unified Modeling Language (UML), which is widely used for program development.
[Diagram: the familiar two-state flashlight transition diagram, with On and Off arrows.]
In this book, I’ve drawn most rectangles with rounded corners, because I think
they look nicer—though some people use square and rounded corners to mean
different things.
We can define modes in terms of statecharts. If we can (correctly!) draw an ar-
row for an action from a cluster of states, then those states represent a single mode
for that action. In the diagram above, whatever happens inside the state cluster
called “on” represents a single mode for the Off action, because the statechart is
saying that whatever state inside On the device is in, Off always does the same
thing—in this case, it takes the device to the Off state from any state inside the
cluster called On.
This is why modes are a useful concept: they allow us to talk about a cluster
of states—possibly a huge cluster of states—ignoring every detail of what goes on
inside that cluster. What’s important for a mode is that an action (here, “off”) has
a consistent effect for all the states.
7.2.1 Clusters
Let’s consider details in the flashlight’s On state cluster. Inside this rectangle we
probably want to show how the bulb can go dud and how we are able to replace
the bulb:
[Statechart: an On cluster containing On/OK and On/dud states, with On and Off arrows crossing the cluster boundary.]
But this diagram breaks one of our original diagramming rules: the inside of the
cluster is a state machine diagram, but we have not marked the default state. So,
from this diagram we do not know what happens when we try the action On when
the flashlight is off. It could go to On/OK or to On/dud, but we have not shown
which.
There are two ways to solve the problem. We either make the On arrow point
directly to the On/OK state, or we use a default arrow inside the cluster.
Next, we can use clustering to provide further information, making a multilevel
statechart. Here we provide the default arrow and we indicate that the flashlight
can be on with a nonworking bulb in several different ways. For example, the bulb may be a dud or it may be missing.
[Statechart: the On cluster with a default arrow to On/OK, and the nonworking-bulb possibilities shown as a nested cluster.]
7.2.2 And-states
Actually, for our flashlight, the inside of the Off cluster is going to be exactly the
same as the inside of the On cluster. The bulb can be OK or dud, whether the
flashlight is switched on or off, and we can replace the bulb regardless of whether
it’s on or off. In fact, we can try to switch it on or off regardless of the state of the
bulb. Statecharts can represent this “repetition” much better by using and-states.
In all the state diagrams we have drawn so far, the machine (like the flashlight)
is in exactly one state on the diagram, and this is represented by it being in ex-
actly one circle (or cluster). This is rather like having a single coin that follows
arrows around the diagram—as the user performs actions. Where the coin is at
any moment indicates what state (or state cluster) the device is in.
A state cluster divided by a dotted line means that the actions on both sides of
the line can happen independently. It is in the states on one side of the line and in
the states on the other side; hence the name.
We can now use more than one coin, and each coin will be in different states on
different sides of the line. In theory, the machine will only be in one state since
we are just using the two or more coins as a convenient way to keep track of the
“real” single coin that we would have used if we had drawn all the states explicitly
without and-states. In reality, it will be easier to program the device with as many
“coins” as are necessary, and the program will be closer to a statechart than to a
conventional state diagram.
The flashlight can be switched on and be OK, on with a dud bulb, off and OK,
or off with a dud. That’s why we need and-states. If we kept track of each com-
bination in the usual way, we would need four states and one coin to be in any
one of them. Using and-states, we represent the components separately, each as
independent state machines. One state diagram therefore represents on or off; the
other, dud or OK. If we keep track of the overall state using coins in states, now
we would need two coins, one for each statechart on each side of the dashed line.
The statechart in figure 7.1 (next page) doesn’t really need the default arrow at
the top: there is, after all, only one cluster of states the device could possibly start
in, namely, the one the default arrow points to. I put the default arrow in just to
make the point that if there is only one default state, it does not need pointing out:
the point of statechart conventions is to represent state machines as simply and
clearly as possible without loss of meaning. Even so, the diagram is still simplified:
Figure 7.1: And-states for a flashlight: one side of the dashed line holds the on and off states, the other the OK and dud states. The flashlight can be in four states: on and OK, off and OK, on and dud, and off and dud. Its initial state is on and OK.
self-arrows are missing, the arrows that go from a state to itself, representing user
actions that don’t change the state.
Notice that we eliminated four arrows (although we’ve drawn two default arrows). Unfortunately, as a demonstration of how effective and-states are at simplifying statechart diagrams, we don’t seem to have achieved much: there are still four states, and now there is a dashed line. It hasn’t achieved much because, coincidentally, 2 × 2 = 4 and 2 + 2 = 4.
Represented normally, as an ordinary state transition diagram, we need another
circle in the state diagram for every state. But when it is represented as and-states
in a statechart, the drawing multiplies up to the required number of states, and
we don’t need to draw so much. As the number of states in the device goes up,
the difference between sums and products gets more significant and statecharts
become more and more convenient. For example, whereas 2 + 2 = 2 × 2, if we
needed three pairs of and-states rather than two, then 2 + 2 + 2 = 6, which is
smaller than 2 × 2 × 2 = 8. For three pairs, the statechart would be a clearer
drawing.
Furthermore, and-states make sense and are easy to use. If we wanted to draw a
diagram of a flashlight that could also buzz, we would need twice as many states
(every original state becomes two states, what it used to do without buzzing, and
what it used to do and buzzing); but with and-states, rather than doubling the size
of the diagram to cope with this extra feature, we need only add two more states
to the statechart drawing (to buzz or not to buzz).
Let’s add another feature to see how things get better and better as the device
gets more complex. What if the flashlight could be blinking or not? With and-
states, we need only add two more circles to the statechart diagram: to represent
blinking and not blinking. With a conventional approach, without and-states, we
would need to double the size of the diagram again. The comparison now is be-
tween 2 + 2 + 2 + 2 = 8 for statecharts versus 2 × 2 × 2 × 2 = 16 for conventional
diagrams. The advantage is even better when there are more states in each of the
and-states. In our example, we’ve only had two states in each component. If we’d
had three states (say, for off/dim/on rather than off/on, and so on), then the sums
would work out as 3 + 3 + 3 + 3 = 12 versus 3 × 3 × 3 × 3 = 81.
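The general pattern behind these sums and products can be stated once and for all. For n independent components of k states each (a simplifying assumption—components need not all be the same size):

    \[
    \underbrace{k + k + \cdots + k}_{n\ \text{components}} \;=\; nk
    \quad\text{circles to draw, versus}\quad
    k^{n}\ \text{states represented.}
    \]

The examples above are the cases n = 4, k = 2 (8 versus 16) and n = 4, k = 3 (12 versus 81); the gap widens explosively as n grows.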
In summary, and-states allow us to draw simpler diagrams, and the benefits of
using and-states get better and better as the devices get more complex. The only
problem is—for this book—that when we draw simple devices that we can easily
understand, the advantages don’t look very impressive.
[Statechart: the flashlight and-states of figure 7.1, now with a history arrow, marked H, on each side of the dashed line.]
The history arrows, rather like the default arrows but with little H annotations,
mean that next time the machine comes to these clusters of states (there are two
clusters in this diagram, one on each side of the dotted line), go to the last state in
each cluster the machine was in. In general, the history entrance is a very powerful
and common feature of statecharts.
Playing with coins allows us to explain history entrances another way. When
there is an action taking the machine out of a cluster of states, we might have
to pick some coins up. If a cluster of states has a history entrance, we need to
remember what state it was in when we came out of it. Simple: just leave the
coins where they were, so that next time we know what state it was in. Somehow,
though, we need to note that these coins are inactive: that is, they represent not the
state now but the state the system will be in when it returns to this cluster. Perhaps
we could indicate this by showing the coins as either heads or tails. The current
state of the machine can be shown by a coin showing heads (more than one coin
in and-states), and the historical states can be shown by coins showing tails.
Here is a statechart for a simple TV with four channels and an operate button:
[Statechart: an off state and a cluster containing channel states 1 to 4, with a history arrow into the cluster.]
The statechart seems to show a TV with two buttons: one button switches it on and off, and the other button changes the channel. The history connector tells us that the TV will remember which channel it was on when it is switched off; when it is switched on again, it returns to the last channel that was watched.
The problem is that although the history arrow correctly enters the cluster of states
“change channels,” it doesn’t say which channel inside the cluster should be selected.
The problem could be fixed by the change channels cluster having its own
history arrow, but that would mean we end up saying history twice. What we
want to do, rather, is say clearly that the history arrow reaches deeply into the
nested cluster.
A history arrow is just an H, but a deep history arrow is an H with a star on it:
[Statechart: the same TV, but with a deep history arrow, marked H*, reaching into the channel cluster.]
Box 7.1 Computer demonstrations at exhibitions A common wooden printing press, several
hundred years old, and a bookbinder’s workshop are part of a permanent exhibition in the
British Library. They are used for working demonstrations of the old arts of printing and
book making. The working book press is of a type more-or-less unchanged since Gutenberg’s
time. In the corner of the room, for the last year or so, sit some sad large computer screens.
A notice apologizes for the “temporary inconvenience” of the computers being out of order.
I don’t know what the computers would contribute to the exhibition, because I’ve never seen
them working. Isn’t this a good example of inappropriate technology, which is (particularly
now that it has broken down in some way) completely beyond the abilities of its users
to operate? Somebody installed the computers at great expense and forgot about their
unreliability and the need for ongoing support, which the British Library is evidently unable
or perhaps unwilling to provide.
When I checked while writing this box, there were two notices: “This interactive [sic]
is temporarily out of order,” and “The printer is temporarily out of order.” So things have
become worse!
For example, if an action occurs after doing nothing inside a cluster for 10 seconds, we might write this:
[Statechart fragment: an arrow leaving the cluster, labeled with a 10-second timeout.]
Timeouts should be thought through carefully as they can be confusing for the
user. You, as designer, may like a device to be “always” in a standby state ready
for the user, even if the user was recently partway through some sequence be-
fore getting distracted from finishing. Timeouts are one way to do this: if a user
does nothing, the system can reset itself somehow. Unfortunately, the user may be
doing something—the device can’t know.
Perhaps the user is reading the user manual, trying hard to understand what is
going on. Now, after a timeout, the user is even worse off, as the device isn’t even
doing what they thought it was doing! Or if it is a walk-up-and-use ticket machine,
maybe the user has selected their ticket, and is fumbling for suitable coins to pay.
Perhaps they have dropped a coin on the ground. Returning to the ticket machine
to buy the ticket, the user finds that it has reset—its designer has made it ready for
the next customer—the machine has lost all knowledge of the ticket the user was
about to buy. In most cases, a device ought to use a motion or proximity sensor,
rather than a timeout.
7.2.7 Conditions
Sometimes actions can only occur when certain conditions hold true. For example, if our flashlight had two batteries, perhaps we could only remove the second after removing the first. As drawn below, the and-states give the impression that we can do actions on either side of the dotted line, but we have noticed that we cannot remove battery 2 unless battery 1 is already out.
[Statechart: and-states for the two batteries; the Remove 2 arrow carries the condition (Battery 1 out).]
Of course, this particular state machine, with the battery removal order condi-
tions, could also have been drawn more directly as follows:
[Diagram: three states in sequence, linked by a Remove 1 arrow and then a Remove 2 arrow.]
In this last diagram, we don’t need the condition to be explicit: the Remove 2 ac-
tion is only possible if we have already reached the state after removing battery 1.
In fact, we can always get rid of conditions by adding more states explicitly, as in this last diagram. Usually, though, conditions stand for some far more complex arrangement of states, and expanding them away is not so easy.
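The equivalence is easy to see in code. In this sketch (Python; the state and action names are just those of the flashlight example), the first version keeps a guard function, and the second compiles the guard away into explicit states:

# Version with a condition: Remove 2 is allowed only when the guard holds.
def remove_2_allowed(batteries):
    return batteries["battery 1"] == "out"   # the (Battery 1 out) guard

# Version without conditions: the guard becomes extra explicit states.
transitions = {
    ("both in", "Remove 1"):       "battery 1 out",
    ("battery 1 out", "Remove 2"): "both out",
}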
A basic alarm clock was introduced in section 6.5 (p. 191), and we saw how
messy its ordinary state transition diagram became.
Here is one way to represent some of the important states of an alarm clock
without showing millions of states:
[Statechart figure: an alarm clock drawn as two clusters. A Time! arrow leads from the timekeeping cluster to a cluster containing Ringing and Silent states; Snooze, or waiting 5 ticks, silences the ringing; Tick arrows return into the timekeeping cluster through a deep history arrow H*.]
When the alarm is in any ringing state, the user can press the snooze button to silence it, or after five ticks the alarm silences itself.
The alarm clock has tick actions. The statechart does not say that the ticks oc-
cur every minute, but we can see that nothing inside the clock can ignore ticks:
whatever the clock is doing, when a tick happens, the tick action occurs and we
go back inside the cluster. Because we’ve used an H* arrow, this means that after
every tick the alarm goes back to the last state. But we want ticks to take us to the
next state—otherwise the clock won’t work. Oops.
It would be clearer not to show the tick action at all. Given that the statechart
represents millions of states and all their transitions hidden inside the two clusters,
it is not surprising that it cannot describe the alarm clock’s behavior completely
accurately. Nevertheless, the simple statechart does show some important features
of the alarm clock visually. It could be used as part of the design process of an
alarm clock. Certainly the diagram allows us to talk about what we want more
easily than almost any other visual representation. As we talk about the design, we
see ways to improve the design—or we see ways to make the statechart diagram
reflect more accurately what we thought we wanted.
Arrows can merge as well. The main use of this feature is to get rid of clutter
from statechart diagrams—sometimes a single state with lots of arrows going to it
looks very messy.
7.3 A worked example: a Sony TV
[Figure 7.2: A statechart for the Sony KV-M1421U TV. It shows Off, Standby, and an on cluster containing channel states 1 to 8 and Sound and Vision adjustment states, with a Timeout arrow leaving the adjustment states; some actions come from the TV itself and others from the remote control.]
[Statechart figure: Off, Standby, and an on cluster divided by a dashed line into a Watch region (channels 1 to 8) and a region for Sound and Vision adjustment.]
Figure 7.3: A more faithful statechart for the Sony KV-M1421U TV, using and-states.
Figure 7.2 shows a statechart for the TV, drawn ignoring obvious issues, like the button names, details of changing sound levels (which, at this level of detail, would just be self-arrows around the sound state), and so on.
The TV can be off, in standby, or switched on. There is a single button, On/Off, that switches the TV on or off, depending on which it currently is; thus arrows that we might have been tempted to label simply On or Off are more correctly labeled On/Off, since they are associated with that single button.
When the TV is on, you can watch a TV channel, adjust the volume, or adjust the vision properties (contrast, brightness, color). Although the TV can evidently be put in standby, it isn’t possible to get it into this state without using the remote control. However, if the TV is in standby, pressing a channel-change button switches it on, whether you use the TV itself or the remote control to change channel.
The numbers 1 to 8 in the statechart denote the selected TV channel. The TV shows the number on the screen for a few seconds: it would be distracting if 8, say, was shown indefinitely. We could add an and-state with a timeout to say when the TV channel number is actually showing.
An interesting user problem can be seen in the statechart: the TV has a class of
states shown in the cluster at the right of the statechart that are exited by doing
nothing until the timeout takes effect. If the user tries to get out—by pressing
buttons—this would reset the timer. So if you think you must press a button to get
back to the normal watching-TV mode, you will be stuck until you give up!
The statechart in figure 7.2 is not quite correct. When we use the TV, we find
that we can still watch a program while adjusting either the sound or vision set-
tings. In other words, watching a TV channel and adjusting are not mutually
exclusive states, as the statechart indicates. In fact, we should be using and-states,
because you can watch a channel and adjust settings, as shown more accurately in
figure 7.3.
Notice the mixed use of default arrows and history arrows. When we switch
the TV on or take it from standby, it will return to the last TV channel we were
watching—that’s the history arrow in the top part of the cluster doing its work; but
[Statechart figure: Off; Standby with deep history arrows H*; Normal viewing; Teletext; channel-number display states (one digit showing, large nn at the top of the screen); 30, 60, and 90 minute timers; Sound off and Sound on; and Clock off.]
Figure 7.4: A statechart for the RM-694 control of the Sony KV-M1421U TV.
when we switch it on, it will always be in the watching mode, not adjusting sound
or screen properties—that’s the default arrow in the bottom part of the cluster
doing its work.
The remote control for the Sony TV has a more complicated statechart, which is
shown in figure 7.4 (this page). The statechart is much more complex than the TV
itself—this was the main point made in section 3.2 (p. 63)—but let’s now consider
a few details from it.
From the statechart, we can see that the TV, as controlled by the remote, can be off, in standby, watching a TV channel, or using teletext. Because of the deep history on the standby state, we can see that switching the TV on will take it to whichever mode it was last in before it was switched off.
The clock is a simple subsystem we can look at first, with two state clusters, as
shown in figure 7.5 (next page). The clock begins by being off, and the same Clock
button is used to switch it on or off. We haven’t shown the 1,440 states inside the clock needed when it is on!
[Statechart figure: two states, Clock off and Clock on, with a Clock arrow in each direction between them.]
Figure 7.5: The simple clock subsystem, extracted from the remote control statechart.
Figure 7.6: Selecting channels by pressing digits. Another extract from the remote
control statechart, but rotated from figure 7.4 so that it takes up less space.
The details of the rest of the remote control statechart aren’t too interesting, un-
less you like comparing details of TVs. I have to admit I’ve lost my remote control,
but it allowed me to press digits to select a channel numerically—I wish I could
double-check the details. This is represented in another part of the statechart, as
redrawn in figure 7.6 (this page).
This is a different subsystem from the channels 1–8 and 9–59 cluster (shown in the bottom right), because that cluster uses + and − to change the range of channels, something that digits alone cannot do.
7.5 There is no right statechart
A car stereo, especially if it has a remote control, would be a good case study. How are you supposed to use the remote control under pressure while driving?
[Statechart figure: a first draft, with Off, a clockwise cycle of Channel 1 to 4 states, and Volume Up arrows.]
I’m sure somebody will comment positively that it is nice how all the arrows go
round clockwise. In fact, the state names and arrows were chosen to go clockwise
when they are increasing things—this is a nice consistency.
Since there isn’t a history connection, we can’t see from this statechart that when the TV is switched from standby or off back to being on (that is, to any state in the large cluster), it remembers its last sound level setting. Likewise, we’ve forgotten to give the channel an initial state: the statechart never specifies which channel the TV will show.
Somebody will point out that surely if the TV is in standby, you ought to be
able to switch it off directly. As it is drawn at the moment, if the TV is in standby,
to turn it off, you would have to switch it on first! There is no arrow going from
standby to off.
How many volume levels are there? The statechart doesn’t tell us. Can we
mute the sound? Well, not yet anyway. Surely we ought to be able to change from
channel 2 to channel 1? As the statechart is now, we can only cycle through the TV
channels, 1, 2, 3, 4 and back to 1. A user overshooting to 4 would have to go all the
way around again, rather than going back directly from 4.
Here’s an improved statechart. All the changes we’ve discussed can be drawn
into the original statechart with minimal alterations.
[Statechart figure: the improved design. It adds Standby, Volume Down arrows, a deep history arrow H* into the channel cluster, an arrow from Standby to Off, and a cluster for other features like mute.]
We’ve cheated a bit by not giving details of the mute feature, but at least the statechart now mentions it. We’ve fixed the default TV channel: now when the TV is switched on (whether from off or from standby), it will continue with the last channel used, thanks to the history arrow. We’ve added a new arrow from standby to off, so now the TV can be switched off in one step, say by pressing the On/Off button.
It’s misleading that the off state has two arrows going to it when they are both
the same user action. We should redraw the statechart so that only one arrow is
required for each action. This will make the statechart clearer but won’t alter its
meaning. The default entry for the on cluster ensures switching the TV on from off
still works the same way as before—when you switch the TV on it always comes
on, rather than going into standby.
[Statechart figure: the redrawn design. A single arrow now takes the TV to Off from a larger cluster containing both Standby and the on states; the deep history H*, the Channel 1 to 4 cycle, Volume Up and Down, and the cluster of other features like mute are as before.]
You can always make statecharts easier to read by removing detail from clus-
ters. Of course, you can draw the detail of the clusters somewhere else, on an-
other page, so the detail can still be seen, without the distraction of the rest of
the diagram. In the last statechart, above, we’re probably assuming that there is a
separate statechart giving more details for features like mute.
7.6 Where did statecharts come from?
Like ordinary state transition diagrams, statecharts have individual states and the arrows that connect them. Yet many states are conceptually related into sets, such as the set of “all states that are on.” Statecharts add sets to the notation.
Like hypergraphs, statecharts can have arrows between sets of states.
Like Venn diagrams, statecharts allow drawing sets of states inside each other
to show set inclusion. The set of all states that are on includes two sets: the set
of states where the device is on and, say, the CD is in, and the set of states
where the device is on and there is no CD in it. Venn diagrams allow these
relationships to be drawn naturally. Statecharts are tree structured, so unlike
full Venn diagrams, they do not allow overlapping sets of states—that’s why
the red/yellow/green light example given earlier was a problem.
Many reactive systems have independent subsystems; if these are represented as finite state machines, you need a big machine with as many states as the product of the subsystems’ state counts. (In fact, the combined system is the cross-product of the subsystems.) This can get out of hand very quickly! For example, three on/off
light switches require a finite state machine of 2 × 2 × 2 = 8 states. Statecharts
use dashed lines to separate independent subsystems and thus avoid the
multiplication of states. Three on/off light switches thus only take
2 + 2 + 2 = 6 states to visualize.
Statecharts have a variety of conventions to reduce the clutter of arrows.
Default arrows, for instance, mean that you don’t need long arrows reaching
long distances inside sets of states.
In summary, graphs and hypergraphs show relations (visualized by arrows)
among states and sets of states; Venn diagrams show structural relations between
collections of sets (visualized by overlap and containment of shapes, usually cir-
cles, overlapping or inside other shapes); and statecharts combine both represen-
tations.
Full statecharts add a few more features that we have not used in this book,
such as broadcast communication. In our use of statecharts, each action has been
written on an arrow, and it causes a transition along that arrow. We didn’t mention
it, but if the same action is on several arrows, all of them work. This is obvious; for example, the off transition might leave many sets of states in a cluster simultaneously. Generalizing this “broadcast” nature of actions, Harel allows any action to trigger more actions. In effect, an arrow can be labeled x/y, meaning: when x occurs, take this transition and then behave as if y occurs immediately afterward, that is, broadcast y to the rest of the statechart.
The actions in full statecharts can also have arbitrary conditions. For example, y[in X] means only do transition y when some other part of the statechart is in the set of states X.
It’s interesting that statecharts combine graphs and Venn diagrams, for both of
these ideas go back to one man, Leonhard Euler, the hero of the opening
stories of chapter 8, “Graphs.”
7.7 XML and the future of statecharts
In the future, then, statecharts will be used for drawing and thinking about parts
of interactive systems, but the full details probably won’t be shown, or certainly
they won’t be shown all the time. Actually seeing all of a device’s specification,
even using statecharts, can be overwhelming: given a written statechart (probably
in SCXML), we can extract simpler statecharts to visualize whatever bits of a de-
sign we are interested in and want to think about, but we won’t have to see all of
the design at once, unless we want to. We can still use statecharts for conceptual-
izing, building, and designing—but with XML and other tools, we will be able to
build, analyze and run interactive devices, and document them, all from exactly
the same XML text. Being able to do so much from one document will help im-
prove quality—because different people will be able to work from exactly the same
material. Any faults that are corrected are corrected for everybody. (In contrast, in a conventional design process, each aspect of design is independent and has its own life, so fixing problems, say in the user manual, rarely helps fix problems in the interactive device itself.)
You can get more details of SCXML and other variants of XML from
www.w3.org.
7.8 Conclusions
Most systems that we are interested in have many states. Unless we draw their di-
agrams carefully we get in a mess very quickly! Statecharts provide a clearer way
to draw state transition diagrams. They are a widely recognized way of drawing
interactive devices. For the most part, the ways statecharts simplify diagrams actually help us do what we want—they bring out many interaction programming
issues and make them easier to see. It would be unfortunate indeed if the gadget
designs we wanted were still nasty to draw, but by-and-large statechart features
correspond to useful features for designing interactive systems.
Statecharts are not perfect. For example, while they show states very well, they
do not show actions very clearly. If a device has a button X, a statechart gives no visual insight into how X is used, and whether, for instance, it has been used
everywhere it should have been. For many properties, we will have to rely on
automatic analysis to find out whether a design does the things we want it to, but
statecharts are very good for starting a “bigger picture” discussion about design
with designers.
See chapter 9, “A framework for design,” for many ideas of checks and
properties about designs that can be worked out automatically—so we don’t
need to draw diagrams to see their properties.
Sadly, few systems are designed using statecharts, let alone state diagrams.
Statecharts aren’t used because not enough people know about them, and ordi-
nary state diagrams aren’t used because they get too complicated too quickly. So
most devices are programmed without any careful analysis of their states and transitions, and all of the details that we took so much care to draw are never visualized by the designers. That’s a shame, because then we end up with quirky designs. If we were designing for safety-critical applications, say, a gadget to monitor a hospital patient, we would have to use a technique at least as good as statecharts to help make the design safe.
8 Graphs
All sorts of things can be represented by dots and arrows, from the world wide
web down to how the neurons in our brains are connected. Interactive devices
are finite state machines that can be represented by dots and arrows. Common to
all is graph theory, the underlying mathematical theory. Graph theory gives us a
powerful handle on what devices are doing, how easy users will find things—and
it gives us lots of ideas for improving designs.
In the eighteenth century folk in the German town of Königsberg entertained
themselves walking around the town and crossing the bridges over the River
Pregel, which empties into the Baltic Sea. There were seven bridges over the
Pregel, an island, and a fork in the river. Some wondered whether it was possible
to walk around the city center crossing each bridge exactly once.
Here is a schematic representation of the city of Königsberg, its river, and its bridges:
[Figure: a sketch map of Königsberg showing the north bank, the south bank, the island, and the seven bridges over the Pregel.]
You can imagine that somebody walking around Königsberg is in different states
depending on which bit of dry land they are on—north bank, island, south bank,
and so on. Crossing a bridge changes state.
We can redraw Königsberg without the river and draw circles around the states,
and we get something closer to a familiar state transition diagram:
[Figure: the corresponding graph, with circled vertices North bank, East bank, Island, and South bank, and an edge for each bridge.]
One frustrated person who tried walking across each Königsberg bridge exactly
once, perhaps on some romantic evening stroll, wrote to the local mathematician
Leonhard Euler about the problem. By 1736 Euler had published a paper that
showed conclusively that there was no way to walk around the city crossing each
bridge exactly once.
Euler’s 1736 paper started the field of graph theory, which thinks abstractly
about networks, like walks and bridges. In Euler’s honour, a walk that crosses
each bridge exactly once is called an Euler tour, and any town with bridges that can be walked around in this way is called Eulerian. Today, the Königsberg bridge problem makes a nice children’s game—there are a few mathematical
addicts who have built small-scale models of Königsberg for children to enjoy
running around them—and maybe learning a bit of mathematics as well. Another
bridge was built in Königsberg in 1875 and it then became possible to walk around
the city center crossing each bridge exactly once: so post-1875 Königsberg is Eule-
rian.
The point for us is that graph theory analyses a network, here the bridges of
Königsberg, and tells us what humans can do with it. In this chapter we will ex-
plore graph theory to see what it can say about interaction programming, and how
design choices influence what a user can do and how easily they can do things.
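Euler’s argument turns on a simple check: a connected, undirected graph has a closed walk crossing every edge exactly once if and only if every vertex has even degree. A sketch in Python, writing each bridge as a repeated neighbour in the adjacency lists:

def eulerian(adj):
    """A connected, undirected graph has a tour crossing every edge
    exactly once and returning home iff no vertex has odd degree."""
    return all(len(neighbours) % 2 == 0 for neighbours in adj.values())

# Koenigsberg in 1736: four land masses, seven bridges.
koenigsberg = {
    "north bank": ["island", "island", "east bank"],
    "south bank": ["island", "island", "east bank"],
    "east bank":  ["island", "north bank", "south bank"],
    "island":     ["north bank", "north bank", "south bank",
                   "south bank", "east bank"],
}
print(eulerian(koenigsberg))   # False: no such walk exists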
8.1 Graphs and interaction
Box 8.1 Chinese postman tools
If we ask users to test a device, effectively to check “does
this bit do what its manual says?” and “can you think of any improvements?” for every
function available—without further help—they are very unlikely to follow an optimal way
of testing the device. They are very unlikely to check every action (every button press in
every state): they will miss some. In the very unlikely event that they do manage to check
everything, they certainly won’t do it very efficiently. They will use up a lot more time and
effort than necessary. Trial users checking systems often have to use notebooks to keep track
of which parts they have tested, and I doubt they can be sufficiently systematic even then
not to waste time and miss testing some features.
In short, a program should be used to help users work through their evaluation. This
means finding a Chinese postman tour. Actually, rather than follow an accurate Chinese
postman tour, it’s more practical to follow a dynamically generated tour that suggests the
best actions to get to the next unchecked part of the device. This is much easier! For the
program, it only needs to keep track of which states have been visited and use shortest paths
to find efficient sequences of actions for the users. For the users, it’s much easier too: there
is no problem if they make a mistake following the instructions, and they can easily spread
the testing over as many days or weeks as necessary.
Section 9.6 (p. 297) gives code to find shortest paths.
Since a device design may change as it is developed, the flags associated with states
can be reset when the design changes if the change affects those states. Thus a designer
can incrementally check a device, even while it changes, perhaps making errors or missing
actions, and still know what needs evaluating and checking. Eventually the evaluation will
cover the entire functionality of the device.
State flags can be used in two further ways. During design, documents may be produced,
such as user manuals. A technical author may wish to flag that they have documented
certain parts of the device and therefore that they want to be notified if those parts of the
device ever change. This allows a technical author to start writing a user manual very early
in the design process. State flags can also be used by an auditor, who checks whether an
implementation conforms to its specification. The auditor can use the flags to assert that a
vertex (or arc) has been checked out and must not be changed gratuitously.
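A sketch of the suggestion-making heart of such a tool, in Python. Here adj maps each state to a dictionary from button to next state, and checked is the set of (state, button) pairs the test user has already ticked off; both are assumed data structures of mine, not the book’s framework:

from collections import deque

def next_unchecked(adj, state, checked):
    """Breadth-first search for the shortest run of button presses
    ending in an action that has not yet been checked."""
    queue, seen = deque([(state, [])]), {state}
    while queue:
        s, presses = queue.popleft()
        for button, t in adj[s].items():
            if (s, button) not in checked:
                return presses + [button]   # nearest unchecked action
            if t not in seen:
                seen.add(t)
                queue.append((t, presses + [button]))
    return None                             # everything has been checked

Calling this after every check gives the dynamically generated tour the box describes: the user can make slips, or stop for a week, and the suggestions stay correct.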
The Nokia 5110 phone that was used in chapter 5, “Communication,” has a Chinese postman tour of 3,914 button presses! And that’s only considering the function menu hierarchy, not the whole device. In other words: to test whether every button works in every state on this device takes a minimum of 3,914 presses—and that’s assuming error-free performance. If we made a slip, which we surely would, it would take longer. Obviously, human testing is not feasible.
Here is an extract from a Chinese postman tour of a medical device:
:
478 Check ON from "Off" goes to "On"
479 Check DOWN from "On" goes to "Value locked for On"
In state "Value locked for On", check unused these buttons do nothing:
DOWN, OFF, PURGE, UP, STOP, KEY, ON
487 Check ENTER from "Value locked for On" goes to "Continuous"
:
You can see how each check takes the user to a new state, where only one check can be made—you need a Chinese postman tour because each check necessarily takes you to a new state. In the example above, step 479 must
be the first time the state Value locked for On has been reached in the tour, and
the checking program is recommending that several buttons are checked in this
state to confirm that they do nothing: these buttons can be checked easily since
checking them shouldn’t change the state.
With the device framework we will develop in chapter 9, “A framework for
design,” it is easy to take the test user “by hand” through the best tour: the
device itself can keep track of what the user has and has not tested. This idea
is explained more fully in box 8.1, “Chinese postman tools” (previous page).
If the system can keep track of all this, why does a user have to work labori-
ously through the tour? The system can check that everything is connected, but
no system can know deeper facts like what the Pause button is supposed to do
when the device is recording. A user should get the device in the recording state
and try out the Pause button, see what it does, then tick a box (or whatever) to
“audit” the correct behavior of the device. Once a user has checked the Chinese
postman tour (or any other properties), then if the device is modified, only the bits that
have changed need to be rechecked—again, a good device framework can handle
this and greatly help in the design and evaluation of a device.
The longer the Chinese postman tour, the harder a device is to check properly:
it simply takes more work. A designer might therefore be interested not in the de-
tails of the Chinese postman tour (because the details are only needed for check-
ing) but in how long it is, that is, how many button presses it will take to do, since
that is a good measure of how hard the device is to understand. If a user was go-
ing to understand a device they would have to explore and learn at least as much
of the device as a postman would have to, so the length of the postman tour pro-
vides a way to assess how hard a device is to understand or to learn. Even if this
is not exactly correct (users may not have to know everything), the longer the tour
the harder the device (or the web site, or whatever) will be to check. So, when a
designer is contemplating adding a new feature or a new button, they might want
to measure the lengths of the Chinese postman tours with and without the feature
to see how the lengths change. Is the extra cost to the user in terms of complexity
worth it in terms of the value of the new feature?
Some special graphs have simple Chinese postman tours: a so-called randomly
Eulerian graph, for example, has a tour that can be found by pressing any button
that has not yet been pressed in the current state. It’s called randomly Eulerian
because when you have a choice, you can behave randomly and still follow an Eulerian tour of the system. Such devices are very simple. If you had a design like
this, its usability could be improved (if seeing all of it is the task—as it would be
with a task like visiting an art gallery and wanting to walk down every passage
to see everything) by having the system indicate which buttons have previously
been pressed.
Art galleries might like to ask design questions like these. Presumably they want visitors to see many exhibits, even—and perhaps especially—when the visitors do not already know what they are looking for.
Euler tour: travel along each arc (in the right direction) exactly once, and get back to the starting vertex. The graph must be Eulerian; if so, the tour checks every action is correct.
Chinese postman tour: perform an Euler tour or, when an Euler tour is not possible, travel along each arc at least once. Checks that every action (button press; link) is correct.
Hamiltonian tour: visit every vertex exactly once, and get back to the start. The graph must be Hamiltonian.
Traveling salesman tour: visit every vertex, using as few arcs as possible. Checks that every state (or web page) is correct.
Figure 8.1: Summary of four sorts of graph tours. The Chinese postman and traveling salesman tours have variants if arcs are weighted (have numbers associated with them): then they should minimize not the number of arcs used, but the total cost of the weights.
The cost-of-knowledge graph plots how long a user takes against how much of
a device they have explored. If the user follows a traveling salesman tour, they
will do the best possible. Section 5.3 (p. 144), specifically, figure 5.11 (p. 144),
draws some graphs, from random user behavior.
Think of coloring the states as if they were countries on a map. If two neighboring countries were given the same color, then as you crossed the border between them the color would not change. You might not notice that you were in a different country, and perhaps wouldn’t understand the local culture’s colorful nature.
Back in the context of user interfaces, users would probably want to see confir-
mation every time they changed the state of the device. In some sense, the “color”
of each state must be different from the “color” of adjacent states; if not, then some
state changes cannot be noticed, because there is no change.
What corresponds to colors in an interactive device? Most devices have displays
and indicator lights that help tell the user what state the device is in. For example,
there may be a light that says the device is on. This is rather like saying that in the
off state, the light is black, and in the on state (whichever state it is in when it is
on) the light is red. A CD player might have an indicator to say whether a CD is
in the device or not: another choice of two colors. The combinations of indicators
correspond to colors: in the table below, each combination of the two indicators
has been allocated a notional color.
State          Indicators   Color
Off            none         Red
On, CD out     on           Green
On, CD in      on, CD       Blue
Off, CD in     CD           Yellow
Every possible state change (like ejecting the CD) corresponds to a color change. In fact, for this simple abstract device, since every state is a different color, it’s inevitable that any state change the user can make results in a color change. In general,
what a designer should check is that at least one indicator changes for all possible
state changes; then the device is adequately colored, and in principle the user can
tell whenever there is a state change.
Section 10.3.2 (p. 334) shows how to work out tables like the one above automatically, so that they can be used to help design better systems.
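The check itself is short. A sketch in Python (not the book’s framework code): transitions is the set of pairs of states connected by some action, and indicators maps each state to its combination of indicator settings, as in the table above:

def adequately_colored(transitions, indicators):
    """True iff at least one indicator changes on every state change."""
    return all(indicators[a] != indicators[b]
               for (a, b) in transitions if a != b)

indicators = {
    "Off":        (),
    "On, CD out": ("on",),
    "On, CD in":  ("on", "CD"),
    "Off, CD in": ("CD",),
}
transitions = {("Off", "On, CD out"), ("On, CD out", "On, CD in"),
               ("On, CD in", "Off, CD in")}
print(adequately_colored(transitions, indicators))   # True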
The four color theorem only applies to flat, conventional maps. For more complex graphs—such as those we encounter with interactive devices—almost any
number of colors may be required. The worst case is a complete graph: a com-
plete graph has every vertex connected to every other, so that all vertices must be
different colors. If there are N states, N colors are required.
A graph’s chromatic number is the minimum number of colors it needs. The
chromatic number of any two-dimensional map is at most 4, unless it has tunnels
or other ways of cheating. The chromatic number of any map drawn on a sphere is 4 too, but the chromatic number of a map drawn on a torus (a shape like a sausage bent into a circle) is 7. An interactive device whose graph can be drawn on paper without any crossing lines will necessarily have a chromatic number of at most 4, but usually the chromatic number will be higher—most devices’ graphs cannot be drawn without crossing lines.
For any specific graph, it is possible to work out the chromatic number—the
programming is not too hard. If the chromatic number is c, then there must be
enough indicators to count from 1 to c (or, equivalently, from 0 to c − 1) in binary.
Imagine each state is given a color. The chromatic number tells us the least number
of colors we need so that when we go from one state to another, the color changes.
If our device had a single colored light with c colors (or c different phrases), that
would be sufficient to tell the user whenever they changed state. With fewer than
c colors, some state transitions could be missed using the indicators alone: the
light’s color wouldn’t change.
Now it is unusual to have a single indicator with lots of colors; instead, there are usually several indicators, red lights or whatever, that are either on or off, rather than showing different colors. In principle, the indicators can count as binary digits, bits. If you have n indicators, you effectively have n bits available, and you can count up to 2^n different things. This is how binary works. So you need 2^n to be at least as large as c to be able to distinguish all state changes. In other words, you need at least log2 c on/off indicators to distinguish every state change for a device with chromatic number c. Even then, the indicators may not be used properly (the device might have the right number of indicators but not use some of them or not use them very well), so that needs checking as well. That is, having enough indicators is necessary to distinguish all the states, but it is not a sufficient design criterion to ensure that the device is easy to use.
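For device-sized graphs, a brute-force program suffices. A sketch in Python, assuming (my choice of representation) that adj maps each state to the states it is joined to by some transition, in either direction:

from itertools import product
from math import ceil, log2

def chromatic_number(adj):
    """Smallest number of colors giving adjacent states different
    colors; exhaustive search, fine for small graphs."""
    states = sorted(adj)
    for c in range(1, len(states) + 1):
        for assignment in product(range(c), repeat=len(states)):
            coloring = dict(zip(states, assignment))
            if all(coloring[v] != coloring[w]
                   for v in states for w in adj[v]):
                return c
    return 0   # the empty graph

c = chromatic_number({"off": ["on"], "on": ["off"]})
print(c, ceil(log2(c)))   # 2 colors, so one on/off indicator suffices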
For a JVC video recorder I worked out the chromatic number, and it was higher
than the number of indicators could handle. In fact, the video recorder did not
say when it was in the states “preview” and “cue.” That’s a potential problem
identified by graph theory—though whether a user would be worried about it is
another matter. For a video recorder, a user would probably be looking at the
TV, not the device itself, and it would be obvious whether it was playing or fast
forwarding from the TV picture. Once or twice, though, I have found myself fast
forwarding the tape when I wanted to rewind it and not noticing the difference.
This JVC PVR is discussed in detail in section 10.3 (p. 330).
For a safety-critical device (say, a medical device to be used in an operating the-
ater), knowing when or whether it changes state is likely to be very important.
What happens if an anesthetist presses 8.0 to deliver 8.0ml of a drug but acciden-
tally does not press the decimal point hard enough? The patient would get 80ml of
the drug, which for many drugs would be quite enough to kill the patient. In such
cases, the device should tell the user, the anesthetist, every time it changes state—
so, in particular, not pressing the decimal point hard enough should be recognized
by the user.
8.2 Mazes and getting lost in graphs
Figure 8.2: A very simple maze (left), with intersections and end points marked, and its
corresponding graph (right). The state w represents being in the “rest of the world,”
and the states a to e are places to be inside the maze, where the explorer has to make a
choice or has come to a dead end. The graph representation, being more abstract than
the maze, does not distinguish between the inside and the outside of the maze—for
instance, it could be redrawn with c or b at the top of the diagram.
One way to simplify is to color the part of the graph we are interested in and leave everything else black. We now have a colored graph that is a lot simpler than the original. Hopefully, we will have colored in something that has an interesting
structure that we can critique, understand, and improve.
For example, later in this chapter we will talk a lot about a type of graph called
a tree. Many interactive systems considered as state machines are almost trees, but
they have details that make them more complex. So imagine that the tree part
of the graph is colored red. What we want for good interaction programming is a
good red tree. The other stuff, which we haven’t colored red, can wait, or perhaps it
can be handled automatically and generated by a computer—for instance, features
for error correction. Then we look at the red tree, ignoring all the other detail, and
try to improve the design of the red tree.
Trees are discussed in section 8.7 (p. 249).
What happens if the user takes a wrong turning? What happens if they fall through a trap door that only lets them go one way and not back?
The answer is that an interactive device should provide something analogous
to Theseus’s thread. Following the thread back will always get the user out. It’s
easy enough to do this. The thread is usually called “undo.” When users get lost,
they press Undo and get pulled back by the thread to where they just were.
There are a few complications, most notably when the user undoes back to a state where they have been twice (or more times) before. It isn’t then obvious what pressing Undo should do—it’s a state where the thread crosses over—go back to the previous state, or go back the way they first arrived there? There’s a sense in which, if the user is trying to recover from a mistake, simplifying the undo trail will help; on the other hand, any simplification is second-guessing the user and may contribute to further problems.
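In code, the thread is just a stack; the complication above is a design decision about what to pop. This sketch (Python; hypothetical, not any particular product) deliberately takes the simplest reading, always returning to the immediately preceding state:

class UndoThread:
    """Theseus's thread as a stack of visited states."""
    def __init__(self, start):
        self.trail = [start]

    def visit(self, state):
        self.trail.append(state)

    def undo(self):
        # No simplification of the trail: where the thread crosses
        # itself, we still just step back the way we came.
        if len(self.trail) > 1:
            self.trail.pop()
        return self.trail[-1]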
When Theseus got back to any doorway he’d been through twice before, he
would see threads going in three directions, and he’d run a small risk of going
around in a circle and taking longer to escape the maze. Even if he had been using
Ariadne’s special directional thread (so he knew which direction he was going in
when he unwound it), he would have two choices—or more choices in a popular
intersection. The moral for Theseus is that whenever he first gets to an intersection, he ought to leave a stone behind to mark the entrance by which he first entered it—then it will be easy to decide which way to get out when the Minotaur chases him and he hasn’t time to think hard about it.
The great thing about graph theory is that it shows these two problems—Theseus panicking about the Minotaur, and a modern ticket machine user (with a train coming to pick them up if they have managed to get a ticket out of the machine)—are precisely analogous, even though one is the stuff of myths and the other the stuff of modern stories and HCI textbooks. Thus a good ticket machine design will give the user different choices in different states, depending on
how to get out. A ticket machine has to make other decisions: as the user travels
around its “maze” they will be picking stuff up at different “intersections,” like
going to states where they can tell the ticket machine where they are going, how
many tickets they want, how many children are traveling with them, and so on.
When a user wants to “get out” not all of these details should be lost. Escaping
from the labyrinth is easier in comparison: Theseus merely wants to get out fast,
not get out with a valid ticket!
Another solution is to have more than one action available to the user. A key
Previous screen is potentially ambiguous, since there may be two or more previous
screens for the user—like having two or more threads coming to an intersection.
We could design the device to have Previous screen and Original way here, which (easily done if it was a soft key) would only appear when there was a difference. Really the
only way to tell what approach would help is to do some experiments with real
people, under the sorts of pressure they’ll be under when they need to buy tickets
before the train comes.
It is possible to escape from certain sorts of real mazes without needing a thread
or without needing an exceedingly good memory for places. In some types of
236
8.2. Mazes and getting lost in graphs
maze you can put your hand on the right hand wall or hedge, and keep on walking
with your hand following the right hand wall. This will take you in and out of cul-
de-sacs (dead ends), but eventually you will escape. If you happen to start trying
to get out when you are in an island within the maze, all you would achieve then
is to go around the island endlessly, never escaping. Hopefully, in real life, you’d
notice this repetition and at some point change hands to touch a wall you’d never
touched before. This would get you off that island, and perhaps closer to the exit.
In 1895 Gaston Tarry worked out a general technique for escaping from mazes.
His assumption was that the passages in the maze were undirected, so you could
walk along them in either direction. Of course, the actions in a state machine device are directed, so Tarry’s method does not work for a gadget unless we design it so that actions are, like passages, undirected. This would mean that there needs
to be an inverse for every action: if you can get to a state, you can always go back:
to use Tarry’s ideas, the device would need an Undo button.
Tarry had two ideas about goals. The simpler is that we want to find the center
of the maze, and his systematic way to do this is to obey the following rule:
Do not return along a passage that you took to an intersection for the first time
unless you have already returned along all the other paths.
This would be one way for a traveling salesman to work if they didn’t have a map
of the country (or a map of the maze): eventually they will get to every city. This
method of Tarry’s eventually visits everywhere, so it will certainly get to the center
of the maze. In fact, for a device, it is a technique for a user to get the device to
do (or be able to do) anything it can do—provided the device has a proper Undo
button.
His more complex idea was that from the center you would want to get out effi-
ciently. The first method, above, of finding the center takes you all over the place;
because you don’t know where the center is you need to try going everywhere. But
to escape faster, rather than reexplore everywhere, carry a bag of stones and mark
each passage you travel down—roughly speaking you use the stones as marking
the “ends” of a thread. If you are Theseus, you need a big reel of thread, at least
as long as the passages put end to end; if you are Tarry, you need a bag of stones,
big enough to leave stones at each junction you visit. Tarry’s stones are equivalent
to Theseus’s thread, except that no trail is left down the passages, only at their ends. If
you are a modern interactive device user, you just need the device to be designed
to keep track of the paths that would have been marked with threads and stones.
I haven’t given quite enough details, but Tarry’s approach is equivalent to the
following more abstract description. He effectively replaces each edge in the orig-
inal maze with a pair of directed, one-way edges going in each direction. He uses
the stones to keep track of which way you’ve been down any edge. His procedure
then efficiently finds an Eulerian tour around this directed graph (there always is
one as he’s added edges)—you walk down each maze passage twice, once in each
direction—so effectively along each virtual directed edge once.
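One way to render Tarry’s procedure as a program is sketched below (Python). The maze is given as undirected adjacency lists; the used set plays the part of the stones, recording each directed passage as it is walked:

def tarry(adj, start):
    """Walk so that every passage is used exactly once in each
    direction, ending back at the start (Tarry, 1895). Assumes at
    most one passage between any two junctions."""
    used = set()        # directed passages (here, there) already walked
    first_entry = {}    # junction -> neighbour we first arrived from
    walk, here = [start], start
    while True:
        exits = [n for n in adj[here] if (here, n) not in used]
        if not exits:
            break       # back at the start with every passage used
        # Tarry's rule: only retrace our first entry as a last resort.
        others = [n for n in exits if n != first_entry.get(here)]
        there = (others or exits)[0]
        used.add((here, there))
        if there != start and there not in first_entry:
            first_entry[there] = here
        walk.append(there)
        here = there
    return walk

For a triangle of junctions a, b, c, the walk visits all six directed passages and returns to a, just as the theory promises.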
What use are these graph theory ideas for interactive systems? It’s interesting
that web browsers typically change the color of links (hot text) when they have
been used. This is like putting stones down in passage entrances: the links are
237
Chapter 8 Graphs
the ends of passages (to further pages on the web site) and the change in color
marks the fact that the passage (or link) has been used before. Tarry himself suggested this useful idea, and no doubt usefulness is what prompted the feature
in browsers, though I don’t think the designers of web browsers got the idea from
systematically thinking through graph theory ideas.
Why don’t interactive devices have lights on their buttons? Maybe the color
of a button should change if it has been used before in this state? Maybe a button
should flicker tantalizingly if the user has never tried pressing it in this state? Such
a device would encourage and enable the user to explore it fully over a period of
time. In fact, this idea is a bit like putting the user testing features of a framework
(like the one we mentioned above for checking that the test user is following a
Chinese postman tour) into the device itself. What’s a good idea for the tester is probably a good idea for the general user too.
It is important that there is an algorithm, Tarry’s procedure, for getting around
a maze. The algorithm does not rely on there being a map or bird’s eye view or
indeed having any prior knowledge about where the center of the maze is. This is
rather like wanting to build a device that allows the user to succeed, but (even with
a map of the gadget) not knowing what the user’s tasks are. What state or states
would the user consider the center of their maze of tasks? We don’t know. What
we do know, though, is that it is possible to help users to systematically explore a
gadget so that eventually they can find anything they want in it (provided they can
recognize it when they get there).
In complete contrast, recreational mazes are deliberately designed to be a chal-
lenge. Mostly when we design user interfaces we want to help users rather than
challenge them, which suggests that rather than helping users cope with the cur-
rent design, we could try redesigning to make finding our way around easier.
Graph theory gives us several ideas for doing so: changing the structure of a device (so a particular way of navigating it will work), changing the actions or buttons available (so Undo works usefully), changing the indicators or lights at certain states (so users know they have been there), and so on.
8.3 Subgraphs
Most devices are more complicated than the conceptually simple sorts of graphs—
complete graphs, cycles, or trees—yet these are clearly relevant and useful design
concepts. How can we get the best of both worlds?
Inside a mobile phone, one would expect to find a cycle of menu items that the
user can choose and a Scroll button or a pair of buttons ∧ and ∨ that cycle through
them. Yet the device’s graph as a whole is not a cycle: there is more to it.
A subgraph is part of a graph, obtained by taking some (possibly all) of the
vertices from the original graph as well as some of (possibly all of) the arcs that
connect them. Thus we expect a mobile phone, for example, to include a subgraph
that is a cycle, namely, the cycle of menu items.
Subgraphs have other applications. From a usability point of view, if a user does
not know everything about a device, but what is known is correct, then they know
Figure 8.3: A graph (a) and various graphs derived from it. The subgraph (d) is just
some of the arcs and some of the vertices and, as it happens, is not connected. In
contrast, an induced subgraph (e) contains all relevant arcs. The spanning trees (b
and c) are subgraphs of (a).
a subgraph of the device’s full graph: in other words, the user knows some, but
not all, of the states and actions.
Probably the two most important sorts of subgraphs for designers are so-called spanning subgraphs, which have all the original vertices but not all the arcs, and induced subgraphs, which retain all of the relevant arcs but not all the vertices.
Figure 8.3 (this page) shows a simple graph, some subgraphs, two different
spanning trees, and an induced subgraph.
Trees and spanning trees are discussed at length in section 8.7 (p. 249) onward
later in this chapter.
The figure also shows the hinges (sometimes called articulation vertices) of the
original graph. If you delete a hinge, the graph becomes disconnected. If a user
does not know about a hinge, they cannot know about the component of the graph
on the other side of the hinge. It’s therefore crucial that users know about hinges—
or that a graph does not contain them, or contains as few as possible.
We’ll consider induced subgraphs first. If, from an entire graph for a mobile
phone, we pulled out just the vertices corresponding to a submenu of the phone
(say, all its language choices), we would expect to find that the induced subgraph
was indeed a cycle. In other words, if a user gets down to this language submenu
the graph of that menu should be a cycle; otherwise, the user would not be able to
get around it easily. In fact, the induced graph of any menu should be a cycle for
this sort of device.
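We can check this property mechanically. A sketch in Python, assuming a single scroll direction, so that the menu should induce one directed cycle (adj maps each state to the states its actions reach):

def induces_cycle(adj, subset):
    """Does the subgraph induced by subset form one directed cycle?"""
    if not subset:
        return False
    out = {v: [w for w in adj[v] if w in subset] for v in subset}
    if any(len(ws) != 1 for ws in out.values()):
        return False                    # each item needs one scroll arc out
    indeg = {v: 0 for v in subset}
    for v in subset:
        indeg[out[v][0]] += 1
    if any(d != 1 for d in indeg.values()):
        return False                    # and must be scrolled to exactly once
    v, visited = next(iter(subset)), set()
    while v not in visited:             # follow the arcs round
        visited.add(v)
        v = out[v][0]
    return visited == set(subset)       # one loop covering the whole menu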
Figure 8.4: An illustrative user model (a) is almost a subgraph of some device model
(b), except that the user thinks there is another state that is connected to the rest of
the state space of the device.
Figure 8.5: Equivalent terms from different fields. Is it ominous or obvious that the
same theory works equally well with web sites and mazes?
8.6 Graph properties
What happens when the device is used in its intended setting, rather than in a laboratory setting or in the designer’s office? Usability, they say, is far more than that.
To avoid the confusion, it’s important to distinguish between different sorts of
properties and how we use them; no single property on its own can claim to char-
acterize “usability” in its widest sense, but we can nevertheless make some very
insightful, and often very clear, assertions.
If a necessary property is absent, a device won’t work correctly. Two very
important necessary properties are liveness and safety.
A safety property says that something (usually dangerous) cannot happen.
A liveness property says that something (usually desirable) can happen.
We do not need to agree about specific properties. For example, if I am pointing a
gun at an aggressor, to me it may be a liveness property that I can fire my gun, but
to the other person, it may be a safety property that I cannot. Yet we’d both agree
that the property was important!
A metric, or measure, says in some sense how hard something is to do (of
course, it could measure other factors).
A metric may not be accurate; it may even be inaccurate in a systematic way. A
low bound is a measure that lets us be certain that the correct value is larger; a
high bound lets us be certain that the correct value is lower. Generally, our
measurements will be off by an unknown factor, for instance, because we don’t
know whether the user is an athlete and how fast they perform—and then we
can only compare measures made the same way relative to one another, rather
than make absolute judgments.
Metrics and properties are often mixed up. How well can we achieve a property?
Is it true in 65% or 95% of states, say? Or a property could be that a measure
achieves or doesn’t exceed a certain value. A cash machine might have a safety
property that it will never dispense more than a certain amount of money.
In contrast to all these hedges, a property or metric worked out for a graph is
quite precise. (Occasionally we deal with such large graphs that we can only guess at the property, but that’s not a problem for any properties or graphs that we will consider here.)
We might work out that there are exactly 23 arcs between this and that vertex.
Whether that 23 means that a user would take 23 seconds or 17 days to get from
one to the other, we cannot say. We can guess that if the number was reduced,
the user could be faster. We can guess that if the number was enormous, the user
might never make it. We can be certain, however, that if we increase the number,
even the fastest user would take longer.
All numbers, measures, and properties work like this; in graph theory terms we
can be certain of what we mean; in interaction programming terms and usability
terms, we need to interpret the properties carefully.
In design, we may find some properties are not as we wish; we then improve the
design. For example, we may want to revise a device design so that some things are easier to do.
Figure 8.6: The simple maze of figure 8.2 (p. 235) redrawn with an extra wall, cutting
off c, d, and e from the rest of the world, w. Building this wall in the maze creates two
new vertices, u and v. Representing the new wall in the graph means deleting the old
edge between a and e—the maze now has a disconnected graph.
8.6.1 Connectivity
To be able to use a device freely, users must be able to get from any state to any
other, and that means that there must be paths in the device’s graph from every
vertex to every other. Thus a simple usability question has been turned into a
simple but equivalent graph theory question.
A graph that is connected has paths from everywhere to everywhere; a graph
that is disconnected does not. Compare figure 8.2 (p. 235) with figure 8.6 (this page),
which is a disconnected maze. For such a simple maze, it is pretty clear that it is
impossible to do some tasks, like getting in from the outside world (represented
by the vertex w) to the center (c).
[Figure: three states, Unarmed, Armed, and Exploded, linked by Arm, Disarm, and Trigger arrows.]
Figure 8.7: A graph that is connected but not strongly connected. A user can go
between Unarmed and Armed states without restriction, and from Armed to Exploded,
but not back again to Armed (or Unarmed) once the Exploded state has been visited.
Compare this graph with figure 9.1 (p. 275), which is similar but has an extra arc and is strongly connected.
Most mazes allow you to walk in either direction down the passages; their
graphs are undirected. Interactive devices are better represented as directed graphs,
in which the notion of connectivity has to be stronger. States may be connected,
but are they connected in the right direction? Figure 8.7 (this page) shows a di-
rected graph that is connected, but not strongly connected. Every vertex in the
graph is connected by a path through edges to every other vertex—but not if you
try to follow the edges in the one-way directions indicated by the arrows.
This book is a graph, from which we can extract many interesting subgraphs.
For example, figure 8.8 (next page) shows the subgraph defined by the explicit
linkage between sections. For example, one arc in the figure represents the link
from this paragraph to the figure itself. The graph is clearly not strongly con-
nected. It is a set of components, each of which represents a related topic explored
in the book. Unlike a maze, this graph is directed: if users (readers of this book)
follow a link, they would have a hard time getting back—the links are one way.
If the book was converted to a web site so that the links were active links that the
readers could click on, the browser would rectify this directedness by making all
links reversible. This is a good example of how an abstract structure (in this case,
the book’s explicit linkage) can be made consistent and easier to use by a program
(in this case, a web browser).
It is easy to write a program to check whether a graph is strongly connected.
Every interactive device should be strongly connected, unless the model of the
device includes irreversible user actions—for example, a fire alarm has a piece
of glass protecting its button: once the user has smashed the glass, there is no
going back to the standby state. If the answer is that the graph is not strongly
connected, you’d want to know where the graph fails to be strongly connected; for
these indicate areas that may represent design faults or deliberate features you’d
want to check.
Section 9.6 (p. 297) gives one way to check strong connectivity.
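By way of illustration (this is not the book’s framework code), a Python sketch: a graph is strongly connected exactly when some state reaches every state and is reached by every state, which two searches establish, one on the graph and one on its reverse:

def strongly_connected(adj):
    """adj maps each state to the list of states its actions lead to."""
    def reachable(start, edges):
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in edges[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    states = list(adj)
    if not states:
        return True
    reverse = {s: [] for s in states}
    for v in states:
        for w in adj[v]:
            reverse[w].append(v)
    start = states[0]
    return (len(reachable(start, adj)) == len(states)
            and len(reachable(start, reverse)) == len(states))

# The fire-alarm-like graph of figure 8.7 fails the test:
print(strongly_connected({"Unarmed": ["Armed"],
                          "Armed": ["Unarmed", "Exploded"],
                          "Exploded": []}))   # False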
Figure 8.8: This is the graph of all the linkage cross-references in this book. Each
section in the book is a vertex, and if a link refers to a section, we draw an arrow
from one section to the other. For convenience, we do not show the section numbers
(it would make the visualization too detailed) and we do not show isolated vertices.
Connected components of this graph are clearly visible, and the small ones represent
coherent sub-themes within the book. Of course, if we allowed that each section is
connected to its successor, and added linkage like the table of contents and the index,
the graph would be more realistic in terms of representing how the user can navigate
the book—but harder to interpret! Put another way, if I make this book into a web
site, the graph of the web site would contain this graph, shown above, as a subgraph.
The properties we want a device (or a graph) to have depend on what sort of graph we have chosen to represent the design.
We could write a program to find all cycles in a device’s graph: typically we
would expect to find many cycles, for instance, because the device will have a
correction action so the user can undo unwanted menu selections. Often we expect
there to be many simple cycles such as menu↔submenu. If we are checking a
device design, then, we might first delete all arcs corresponding to correction or
undo actions and then check that the remaining cycles consistently use Up (or ∧ )
and Down (or ∨ ). The sorts of questions we can pose of a device are endless!
If the independence number is zero, the graph is complete and in principle easy
to use (if the buttons are labeled sensibly). The bigger the number, the more diffi-
cult the device. Since bigger graphs typically have bigger independence numbers,
a more realistic measure for usability is based on the probability that the user has
to think, rather than just read and interpret button labels. What we can call the
independence probability is an estimate of the probability that the user has the
device in an independent set and also wants to get the device to a state in the
same set. We could write programs to work this out for any particular graph—it’s
only an estimate because we don’t know precisely which states the user is likely
to be in.
Section 10.4.3 (p. 343) gives example program code to work out the
independence probability. Chapter 10, “Using the framework,” gives many
other examples.
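As one simple reading of the estimate (my assumption, not the book’s exact calculation): if every state is equally likely to be where the user is and where they want to be, the estimate is the fraction of ordered pairs of distinct states that are mutually non-adjacent, since exactly those pairs can lie together in an independent set. A sketch in Python:

def independence_probability(adj):
    """Chance that the current state and the goal state have no arc
    either way between them, so one button press cannot link them."""
    states = list(adj)
    pairs = [(s, t) for s in states for t in states if s != t]
    apart = [(s, t) for (s, t) in pairs
             if t not in adj[s] and s not in adj[t]]
    return len(apart) / len(pairs) if pairs else 0.0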
Section 9.6.5 (p. 311) assesses how the user can solve the task/action mapping
problem—the problem of getting from one state to another efficiently. We will
be especially interested in the designer assessing how hard this is to do over all
cases that interest the user. As we’ve seen, for any tasks that involve an
independent set, this is a nontrivial problem for the user.
One of the components created by deleting a hinge will include the standby or start state, which we’ll assume the user knows about.
get to except via the hinge? If these components contain states that are important,
then the user must know about the hinge.
It may be appropriate to make the hinge (if the graph has one hinge) into the
standby state (or the home page for a web site) or otherwise redesign the device
so that hinges are close to standby and therefore easier to find and more familiar
through use. Conversely, if a device has more than one hinge, as it can’t have
two standby states, this may well be a design problem the designer should know
about.
Section 10.1 (p. 325) introduces the farmer’s problem and uses it to give further
examples of strongly connected components and how they relate to design.
Some graphs have no hinges. The connectivity of a graph is defined as the small-
est number of vertices you’d have to delete to disconnect the graph. In particular,
if deleting just one vertex disconnects the graph, that vertex is a hinge. If you need
to delete two vertices to disconnect a graph, then the user has two routes into the
states on the other side of them.
If the connectivity of the graph is two, the user is safer; but if they do not know
about either of the two critical vertices, then again part of the graph will be
unreachable for them, except by accident. Can we design to ensure, as far as possible, that the
user knows about at least one of these vertices? As the connectivity of a graph
increases, it becomes less dependent on critical knowledge—the user has more
ways of doing anything.
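Finding hinges is easy to do by program. A brute-force sketch, assuming a connected graph held as an undirected adjacency list adj (which a design tool would build from the device's fsm): delete each vertex in turn and test whether everything else can still be reached.

function findHinges(adj)
{   var n = adj.length, hinges = []; // assumes n >= 2 and a connected graph
    function connectedWithout(skip)
    {   var seen = [], count = 1;
        for( var v = 0; v < n; v++ ) seen[v] = false;
        var stack = [skip == 0? 1 : 0]; // start anywhere except the deleted vertex
        seen[stack[0]] = true;
        while( stack.length > 0 )
        {   var v = stack.pop();
            for( var i = 0; i < adj[v].length; i++ )
            {   var w = adj[v][i];
                if( w != skip && !seen[w] )
                {   seen[w] = true; count++; stack.push(w); }
            }
        }
        return count == n - 1; // did we reach everything else?
    }
    for( var v = 0; v < n; v++ )
        if( !connectedWithout(v) ) hinges.push(v);
    return hinges;
}

Deleting each vertex in turn is not the fastest way to find cut vertices, but for device-sized graphs it is more than fast enough, and it is obviously correct.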
As a conscientious designer, you will want to (somehow) make sure the user
understands the importance of every hinge in a graph when they are using the
device. Perhaps the device should flash something, or beep. Or perhaps a button
should flash when the device is in a state where that button corresponds to a
bridge.
Graph theory gives us ways of evaluating interesting usability issues and giving
them precise measures, which can then be used to guide design modifications and
to improve designs against those measures. Graph theory also suggests ways of
finding things that can be improved without changing the structure of the device.
We could either eliminate bridges from a design, or we could highlight them so
a user is more aware of their importance. Or we could write more prominently
about them in the user manual.
8.7 Trees
Graphs might seem pretty trivial things on the grand intellectual landscape—just
vertices and arcs—but trees, which are even simpler and therefore easier to dismiss,
are in fact very useful.
Trees are of course familiar in the countryside, and sometimes help form the
walls of mazes. Ordinary trees can also be represented as graphs; in fact, a tree-
like graph is so common that tree is the technical term in graph theory. A tree is a
particular sort of graph that, well, looks like a tree.
For some reason, as shown in figure 8.9 (facing page), we usually draw trees
upside down, looking rather like the underground roots of a tree without the tree
itself. Indeed, we use the terms “above,” “beneath,” “top down,” and “bottom
up” assuming trees have their roots at the top.
A collection of trees is, unsurprisingly, called a forest. Some people call the arcs
(or edges) in trees their branches, and the ends of branches (that don’t go on to
other branches) are called leaves. The ends of branches, whether they are leaves
or not, are called nodes, just as an alternative to the word vertex.
The root of a tree is a special node with no arcs going to it. We can define leaves
similarly: a leaf is a node with exactly one arc going to it and none from it.
A tree is connected, but as there is only one branch connecting any two adjacent
nodes and there are no cycles (because a tree does not grow back on itself), a tree
cannot be strongly connected. Trees have the property that there is exactly one path
from the root to any other vertex; if the tree was undirected (so you could go back
on the arcs) then there’d be exactly one path between any two nodes. Put another
way, every vertex in a tree is a hinge and every arc a bridge.
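These defining properties are easy to check by program. Here is a minimal sketch that tests whether a directed graph is a tree, using exactly the conditions above; adj[v] is assumed to list the vertices that arcs from v point to.

function isTree(adj)
{   var n = adj.length, indegree = [];
    for( var v = 0; v < n; v++ ) indegree[v] = 0;
    for( var v = 0; v < n; v++ )
        for( var i = 0; i < adj[v].length; i++ )
            indegree[adj[v][i]]++;
    var root = -1;
    for( var v = 0; v < n; v++ )
    {   if( indegree[v] == 0 )
        {   if( root >= 0 ) return false; // two roots
            root = v;
        }
        else if( indegree[v] != 1 ) return false; // two paths lead to v
    }
    if( root < 0 ) return false; // no root, so the graph must have a cycle
    var seen = [], stack = [root], count = 0;
    while( stack.length > 0 ) // finally, is every vertex reachable from the root?
    {   var v = stack.pop();
        if( seen[v] ) continue;
        seen[v] = true; count++;
        for( var i = 0; i < adj[v].length; i++ ) stack.push(adj[v][i]);
    }
    return count == n;
}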
Trees are very useful information structuring tools and have all sorts of uses.
They are often used in family relationships and genealogy. Branches can represent
the relation “child of,” and inevitably leaves or nodes of a tree at the same level
immediately below a node are then called siblings. This sort of tree shows the
descendants of the person named at the root of the tree. Or a genealogical tree can
be used the other way round, so the arrows represent “parent of.” In this case,
you would be at the root, and your parents would be one level down from you,
and your oldest ancestors you know would be at the leaves of your tree.
Trees are very often used to represent web sites; the root is the home page and
the leaves are pages where the user does things (like buy a book, if each leaf is
a book description); all the other nodes are then basically navigational, but they
might also have incidental information on them.
In this book, we need directed trees, where each branch points away from the
root. Trees are ubiquitous in user interface design, because almost every device
presents the user with choices—menus being the obvious example. For a user
interface, the root of the tree would probably be “standby” or “off” and the other
nodes might be functions the device can do or menus of further choices. Users
might think that functions are “in” menus but in fact the choices are in the menus
and the functions are leaves one arc below the menus.
A problem for a user interface designer is that it is tempting (and, indeed, of-
ten sensible) to draw a tree to guide the design of the system. Trees are simple
and therefore good for conceptualizing a device. But because a tree is not strongly
connected, designers may forget about the importance of strong connectivity. One
may end up with a device that has a tree-like user interface that appears to be sim-
ple and elegant, but which has some weak spots that are not strongly connected.
More precisely, a designer trying to design a device as a tree should think of it as
a subgraph. If this subgraph is a tree that includes every state of the device, they
[Diagram: a directed tree; the root is a black disk, and its leaves are the letters Z, Q, J, and X.]
Figure 8.9: This is a subtree of figure 5.5 (p. 129), which was a Huffman code for the
complete 26-letter alphabet. The root of the (directed) tree is shown as a black disk;
the leaves are shown as letters; the branches are arrows. Either end of an arrow is a
node; if no arrow points to a node, it is a root; if no arrow points from a node, it is a
leaf. In a tree, there is exactly one path from the root to any leaf.
are designing a spanning tree. A spanning tree guarantees exactly one route from
its root to every other vertex. Users who know a spanning tree for the device, pro-
vided the device starts off at the root state, can do anything with the device even if
they don’t know any more about it. In general, the root will be some obvious state,
like standby. Typically we would then require every state to have some obvious
action that gets back to standby—this would not be part of the spanning tree, but
it would ensure that users can always get back to the root easily and thence to any
other state by following paths in the tree. In fact, in this case, users would need to
know the meaning of some key like Reset , or else they would not be able to navigate
the interface.
[Diagram: a menu tree rooted at Bus; its four submenus contain Old, New, %CH; Total, Part, %T; Cost, Price, M%C; and Cost, Price, M%P.]
Figure 8.10: Part of the tree-based menu structure for the Hewlett-Packard 17BII
handheld financial calculator. Cost and Price are both shared under two different
headings, so it is not a proper tree (it would be a semi-lattice). The terms are HP’s—
they appear exactly the same, but in block capitals on the calculator display: “Bus”
means business, “%Chg” means percent change, “MU%C” means markup as percentage
of cost, and so on.
This book is a graph that contains lots of trees. For example, the book’s index
is the root of a tree of ideas and people’s names; each page number in the index
is a branch leading to a page, which is the leaf. Or the table of contents is a root
of another tree, and each chapter the first node down from it, with sections being
children of the chapter nodes . . . down to paragraphs, words, and letters finally as
the leaves.
All these trees structure the book in different ways, but none of them are any
good for reading the book. For reading, there are two further sorts of arcs added:
one, implicit, is that as you read you read on to the next leaf without thinking—you
read the book linearly; the second sort of arc is the linkage and cross referencing
that encourage you to read nonsequentially.
For a brief history of the inventions that make reading easier, see section 4.2
(p. 96).
The ideal plan is not as easy as it seems. Figure 8.10 (this page) shows a sub-
graph of the structure of the Hewlett-Packard 17BII handheld financial business
calculator. The structure shown is a tree, but some items are actually shared be-
tween branches. Thus the simple-looking tree structure in figure 8.10 is mislead-
ing, since it looks like a tree only because the repetition of COST and PRICE is not
visually obvious. The design of the 17BII therefore makes the user’s choices for
COST and PRICE permissive.
The advantages of permissiveness are discussed in section 5.2 (p. 134).
Drawing the structure pedantically, so that each item had exactly one box, would
in general make lines cross, and the diagram would become cluttered and unhelp-
ful. A designer would be unlikely to want a diagram looking as bad as this, which
would unfortunately encourage them to make the design a strict tree.
Merely drawing trees as a visual aid in the design process tends to lead design-
ers into making user interfaces that are too strict. You almost always need to add
Box 8.2 Trees versus DAGs You can’t go round in loops in directed or undirected sorts of
tree, so they are said to be acyclic graphs. Acyclic graphs are useful—they guarantee you
can’t get into loops. You may not get where you wanted to go, but you will eventually get
somewhere! In contrast, a user (or a computer program) can get stuck going around a loop
in a cyclic graph forever.
Directed trees are a sort of directed acyclic graph, or DAG. But DAGs allow more than
one path between vertices, so they are more general than trees: all directed trees are
DAGs, but not all DAGs are trees. A DAG might have several roots—vertices that have
no others pointing to them—whereas a tree can only have one root.
root.
While you might design a device as a DAG or tree, very few interactive devices would be
satisfactory as DAGs, because they never allow the user to go back to previous states.
Typically, then, you may design a device as a DAG, but use a program to systematically add
more arcs to it to make it easier to use.
further arcs (and that’s best done systematically by program), and great care must
be taken that repeated states are exactly the same state, rather than being dupli-
cates that may not be exactly the same. For example, the state Cost in figure 8.10
may or may not be exactly the same state, though it has the same name, which
suggests to the user it is the same state.
[Diagram: a large circle labeled Bus containing four circles %Chg, %Tot, MU%C, and MU%P, which in turn contain %CH, %T, M%C, and M%P.]
Figure 8.11: Venn diagrams are nested nonoverlapping circles that represent trees.
A Venn diagram is used here to represent the command structure of the HP 17BII
calculator, which was also shown in figure 8.10 (p. 252) as a tree. As the text explains,
this drawing is subtly misleading—it looks simpler than the device really is.
Directed trees are not designed for going back. Or if a user wants to explore a tree
to see what it has in it, this again is not a simple get-from-root-to-leaf procedure.
In short, trees are not sufficient for all but the simplest user interfaces; even if
functions (leaves) are repeated to make a tree more permissive and easier to use,
a user will still sometimes want to navigate the structure in an essentially non-
treelike way.
Unfortunately, searching a tree efficiently is quite complex. The user has to go
into each submenu, search it, then come out of the submenu and search the next
submenu. It is very easy for a user to make mistakes, go around some submenus
more than once, and (worse) come out of a submenu prematurely and miss out
altogether a subsubmenu that might have what they are looking for somewhere
in it. Once a user makes a mistake, they may “panic” and lose track of where they
have got to—even if the search wasn't too hard before they panicked!
For a user who does not know how a device tree works, or what classification
scheme a designer used for the tree, a linear list would be far better. In fact, a cycle
(a list that joins up at the end to start again) would probably be better. You can
start on a cycle anywhere and eventually get everywhere, whereas with a linear
list, you have to know where the starting point is and you get stuck at the end.
Whichever you have, no time need be wasted learning how the list is organized;
and users who do learn how it is organized are guaranteed that a simple scroll
[Diagram: the seven commands A, B, C, D, E, F, G in a row, linked left to right.]
Figure 8.12: A linear list for seven commands. The user interface merely requires
buttons equivalent to Next , Previous and Select .
(say, repeatedly pressing ∨ ) will get through everything in the entire list. With a
list, a user will inevitably find whatever they want (provided it is available).
In short, we often want to design a device that is both a tree and a cycle (or,
more precisely, contains two subgraphs, a tree and a cycle, whose vertices are the
device functions). It is not difficult to build a design tool that allows the designer
to see the device as a tree but which systematically inserts all the cycle links in—to
help users in a consistent way, without distracting the designer with the added
complexity of extra permissiveness in the final design.
Section 5.5.5 (p. 153) compares different structures (including the
manufacturer’s original tree) for searching and access times for a fax machine.
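As a minimal sketch of the core step of such a tool: leave the designer's tree alone, and have the program thread a cycle through the function states using a reserved button. Here functionStates (the states the cycle should visit, in order) and nextButton (the button number reserved for Next) are assumptions for illustration.

function addCycleLinks(fsm, functionStates, nextButton)
{   for( var i = 0; i < functionStates.length; i++ )
    {   var from = functionStates[i];
        var to = functionStates[(i + 1) % functionStates.length]; // wrap around at the end
        fsm[from][nextButton] = to;
    }
}

A real tool would add Previous arcs in the same way, running the cycle in the opposite direction.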
[Diagram: the seven commands A to G arranged in a circle.]
Figure 8.13: A cycle for seven commands. Whether this is better than a linear list
depends on whether A and G are related or should be kept separate, and whether it
matters that the user may not notice they are going round in circles indefinitely—
particularly if the cycle is very big and the user doesn’t remember where it starts. The
buttons or commands for this user interface are the same as required for a linear list.
[Diagram: a balanced binary tree over the commands A to G, with root D, children B and F, and leaves A, C, E, and G.]
Figure 8.14: A balanced binary tree, with 7 nodes. Imagine the user wishes to select
one of the 7 lettered commands, A–G, and starts from the root of the tree.
If the display shows a command X, and if this is what the user wants, they should
stop. If not, they should press Before (or whatever it is called) if what they are after
is earlier in the alphabet than X; otherwise they should press Later .
Unfortunately, there are trees that, though they follow the same rules, are much
less efficient. One is shown in figure 8.15 (next page). Now the average cost to find
a function is (1 + 2 × 2 + 3 × 2 + 4 × 2)/7 = 19/7 = 2.7, which is a bit worse—
but the more functions there are, the worse it gets. For instance, if we had 127
functions, the average would be about 33 steps instead of about 6. The interesting
thing is that both trees work exactly the same way, and (apart from the time it
takes) a user cannot tell them apart.
Perhaps a designer can't either. That would be very worrying. There are expo-
nentially many trees, but only one balanced tree (if it's full). The probability that
a designer hits on the best tree by chance is negligible. Unless the designer uses
design aids or tools, they are very unlikely to make an efficient tree by designing
it by hand—it is too much work.
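Working out the average cost is exactly the sort of chore a design tool should do. A minimal sketch, assuming the tree is an adjacency list and, as in the text, that every node is equally likely to be what the user wants; the root counts as one step, since the user must read its display too.

function averageCost(adj, root)
{   var total = 0, count = 0;
    function walk(v, depth)
    {   total += depth; count++;
        for( var i = 0; i < adj[v].length; i++ ) walk(adj[v][i], depth + 1);
    }
    walk(root, 1);
    return total / count;
}

For the balanced tree of figure 8.14 this gives (1 + 2 × 2 + 3 × 4)/7 = 17/7 ≈ 2.4; for the unbalanced tree of figure 8.15 it gives 19/7 ≈ 2.7, as worked out above.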
[Diagram: an unbalanced binary tree over the same seven commands, with leaves as deep as the fourth level.]
Figure 8.15: An unbalanced binary tree, with 7 nodes. The user’s strategy for using this
inefficient tree is the same as for using the efficient balanced tree of figure 8.14 (previ-
ous page).
In these two examples of balanced and unbalanced trees, I’ve assumed that the
tree is a binary tree, with two choices at every node. If we increase the number
of choices, the user can get to where they want faster. The “hidden” cost is that
showing the user more choices requires a bigger screen, and as the screen gets
bigger, the user will of course take longer to read it—and perhaps miss the
occasional extra lines that are sometimes needed but don't fit in.
The moral is that when you use trees, balanced trees are much more efficient—
and for some applications, such as alphabetic searching, it makes no difference to
the user—other than being faster.
We worked out average costs assuming that everything (using any of the 7 com-
mands or functions of the device) was equally likely. If functions are not equally
likely for whatever the user is doing, then a Huffman tree will give better average
performance than a balanced tree; in fact, if all the probabilities are the same, the
optimal Huffman tree turns out to be a balanced tree. There is one proviso to this
comparison: when I introduced Huffman trees, the functionality was at the leaves,
and in our discussion here, we have functionality of the device both at leaves and
at nodes interior to the tree.
Huffman trees are discussed in section 5.1.3 (p. 126).
When we used Huffman trees for the functions of the Nokia handset, the func-
tions of the handset device were at the leaves of the tree, but submenus (like “Mes-
sages”) were at interior nodes. With this design, the user would never want to use
the device’s Messages function, because it isn’t a function—it is merely a menu
providing all the different sorts of message function.
8.8 User manuals
Box 8.3 Computers in 21 days Visit a book shop and you’ll be spoiled for choice with all
the books explaining computers for people like you. Isn’t it reassuring to discover that you
are not alone and have joined a great band of people who find computers fundamentally
difficult?
Using computers requires lots of trivial knowledge, like when to press the F1 key. That’s
why those self-help books are so thick. Certainly, nothing works unless you know the right
trick. When you do know something, it seems so simple. It is then easy to slip into thinking
that you must have been stupid for not knowing in the first place.
What if all those people who wrote books on how to use computers talked to the man-
ufacturers who made the difficulties? One of my hobbies is to go through manuals, to find
instructions to tell us how to cope with quirks that need not have been there in the first
place. When a manual says, “make sure such-and-such,” I ask why wasn’t the computer
designed to make sure for you? Almost all problems with computers are easily avoided by
proper design, by manufacturers doing a little thinking first. What are computers for if they
can’t do obvious things?
Take any other modern product. You can buy a car. You don’t get thick manuals telling
you how to stop the wheels from falling off. Wheels don’t fall off on their own. In comparison,
most computers are totally unsatisfactory: you get lots of “wheels” that fall off when you
are not watching.
Computers are badly designed because manufacturers can make money without trying
any harder. Yet most people say computers are truly wonderful! Before you know it, you
too will be buying upgrades, more RAM, and some training books and when you’ve spent
another thousand dollars you’ll agree that they are wonderful. It’s not you who benefits from
this, but consumerism that likes to keep you dependent, believing that you are responsible
for fixing the computer’s problems with more of your money.
Since there are so many spanning trees you can get from even modest graphs,
the designer and technical author need all the help they can get. If they choose a
user manual structure on their own—any spanning tree—there’s little chance that
it will be optimal or even nearly optimal.
Even a simple design tool would be a great help. Prim’s algorithm has the nice
advantage that it can be used to extend an existing or draft manual. I envisage
a designer writing a basic manual (which is of course a tree but may not be a
spanning tree), then Prim's algorithm will provide suggestions for extending the
tree toward a spanning tree, suggesting good extensions according to however we
are measuring “good.”
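Here is a sketch of that suggestion-making step, in the greedy style of Prim's algorithm. The cost matrix is a stand-in for however we are measuring "good": cost[s][t] says how costly it would be to explain state t straight after state s, with Infinity meaning there is no arc; inTree flags the states the draft manual already covers.

function extendManual(cost, inTree)
{   var n = cost.length, suggestions = [], covered = 0;
    for( var v = 0; v < n; v++ ) if( inTree[v] ) covered++;
    while( covered < n )
    {   var best = null; // cheapest arc from a covered state to an uncovered one
        for( var s = 0; s < n; s++ )
            if( inTree[s] )
                for( var t = 0; t < n; t++ )
                    if( !inTree[t] && cost[s][t] < Infinity )
                        if( best == null || cost[s][t] < cost[best.from][best.to] )
                            best = {from: s, to: t};
        if( best == null ) break; // some states cannot be reached at all
        inTree[best.to] = true; covered++;
        suggestions.push(best); // suggest explaining best.to just after best.from
    }
    return suggestions;
}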
A designer would like to be able to assess how easy a device is to use. Since
we can produce user manuals automatically (or semiautomatically), there are
now more usability measures we can get by measuring the manuals rather than
the devices directly. The longer a user manual is, for instance, certainly the longer
a user will take to read it. In fact, the longer a user manual has to be to correctly
describe a gadget, the harder the device must be to learn to use. We could auto-
matically generate a user manual, and measure its length. Then if we can modify
the design and reduce the length of the manual, we have in principle an easier-to-
learn device design. The depth of nesting in the user manual is another useful
measure of complexity: it corresponds to how hard a particular feature is to learn.
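Both measures are trivial to compute once manuals are generated mechanically. A sketch, assuming a hypothetical representation of a generated manual as nested sections, each with a text string and a list of subsections:

function manualLength(section)
{   var length = section.text.length;
    for( var i = 0; i < section.subsections.length; i++ )
        length += manualLength(section.subsections[i]);
    return length;
}
function manualDepth(section)
{   var deepest = 0;
    for( var i = 0; i < section.subsections.length; i++ )
    {   var d = manualDepth(section.subsections[i]);
        if( d > deepest ) deepest = d;
    }
    return 1 + deepest;
}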
To take advantage of these ideas we have to have design tools that can gener-
ate user manuals from device specifications. Otherwise, we would make a small
change and have to wait until the technical authors had rewritten the manual! A
designer needs to be able to experiment with the specification of a device and see
whether the manual gets longer or shorter. If this can be done fast enough, as it
can be by drafting rough manuals using automatic design tools, then designers
can very quickly optimize designs to make them easier to learn or use (depending
on what measures they are using).
Generally, better manuals are indeed shorter, but of course there are excep-
tions. A manual for a gun would have a lot of safety instructions, and blindly
removing the safety features so that a manual does not have to discuss them
(and would therefore be shorter) would be counterproductive. Different sorts of
features should be flagged as having different weights in the calculation of the
“length” of the manual. Possibly the more safety discussion, the better—if so,
for such a device, text on safety features would have a negative weight; thus the
more safety-related text the shorter the manual appears to be. Imaginative de-
signers will be able to come up with more sophisticated approaches that can be
programmed into tools.
A technical author can tidy up the automatically generated English. Provided
we have a way of keeping track of what the technical author has rewritten, text
never need be rewritten unless the automatically generated text for the affected
parts changes. Thus if we have automatic tools to help with manual generation
from a device specification, we can do something very useful: when the device
specification changes, (most) effort put into writing a clear manual need not be
lost. Furthermore, we can deliberately change the device specification to get a bet-
ter manual. If our user manual-generating program can quickly give an estimate
of the complexity score of the user manual, then it would be worthwhile for the
designer to experimentally mess around with various alternative designs to find
the one with the clearest, briefest explanation.
[Diagram: two trees side by side, each with root a and children b and c; the leaves d and e appear in opposite orders.]
Figure 8.16: One or two directed trees? If these two trees are unordered trees, they
are the same (rather, they are different representations of the same unordered tree); if
they are ordered trees, they are two different trees. As ordered trees, the trees differ in
the order of the two children d and e.
assuming it's not a degenerate spanning tree in which every node has exactly one
child, stretched out as a list, as in figure 8.12 (p. 256).
How do you order the tree in the order that suits the reader? The best orders
depend heavily on the application, but these three ideas can be used separately or
in combination:
The user or a technical author—any expert in reading—gives hints on ordering
the sections of the manual. This must be first; this must come before that; that
must come after this.
Flag every mention of a concept in the manual; flag definitions and uses of
concepts. Then order the manual so that definitions precede uses. That way the
user gets to read the section that defines what a dongle is before reading
sections that say what to do with them.
If, in addition to being a tree, the manual contains cross references (as this book
does), we minimize the total length of cross reference links. If we strongly
prefer cross references of length one, that is, ones which go to the previous or next
section, then we can remove them from the document, since the reader can
easily read the next section without being told to by a cross reference!
All of these rules (and no doubt others will occur to you) have corresponding
algorithms for implementation.
The first two can be combined; they require topological sorting, which is a stan-
dard and straightforward algorithm. Topological sorting may, on occasion, fail to
find an ordering that satisfies all the requirements.
For example, the standard dictionary joke that badminton is played with shut-
tlecocks and shuttlecocks are things used in the game of badminton is a case of two
sections both defining words the other uses. There is no ordering that puts defi-
nitions before uses. Topological sort will point this out; the designer or technical
author will then have to make a decision that one or other rule has to be relaxed—
or perhaps the document rewritten so that the problem does not arise. To resolve
the dictionary paradox, we need only rewrite the definition of badminton to in-
clude its own definition of shuttlecocks; we could then optionally simplify the
definition of shuttlecock to the terse “see badminton.”
Doubtless, dictionary writers either spend sleepless nights with worry, or they
use computers to help them spot problems automatically.
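Topological sorting is short enough to sketch here. The before matrix encodes the ordering constraints from the first two rules (before[i][j] true means section i must precede section j); the function returns an ordering, or null when, as with badminton and shuttlecocks, the constraints are cyclic.

function topologicalSort(before)
{   var n = before.length, order = [], used = [];
    for( var i = 0; i < n; i++ ) used[i] = false;
    for( var step = 0; step < n; step++ )
    {   var pick = -1;
        for( var j = 0; j < n; j++ ) // find a section with no unplaced predecessor
        {   if( used[j] ) continue;
            var free = true;
            for( var i = 0; i < n; i++ )
                if( !used[i] && before[i][j] ) free = false;
            if( free ) { pick = j; break; }
        }
        if( pick < 0 ) return null; // every remaining section is blocked: a cycle
        used[pick] = true;
        order.push(pick);
    }
    return order;
}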
The last of the three ideas for ordering trees is more obscure and requires the so-
called jump minimization algorithm, because the problem is the same as reducing
the length of jumps in a program with goto statements (which correspond, for the
program, to our cross references). A variation of jump minimization is bandwidth
minimization, which tries to minimize the maximum length of any cross reference,
rather than the total length (which of course might mean that some cross references
span great distances).
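The two objectives are easy to state in code, even though finding the best ordering is the hard part. A sketch of the scoring, where refs lists {from, to} section pairs for the cross references, and position[s] is where section s falls in a proposed order:

function jumpScores(refs, position)
{   var total = 0, worst = 0;
    for( var i = 0; i < refs.length; i++ )
    {   var length = Math.abs(position[refs[i].to] - position[refs[i].from]);
        total += length; // jump minimization wants this small
        if( length > worst ) worst = length; // bandwidth minimization wants this small
    }
    return {total: total, worst: worst};
}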
If the user manual was split into two volumes, an introductory overview leaflet
and a detailed in-depth reference manual, it might be wise to minimize the number
of cross references from the overview to the reference manual, or we might want
to make the reference manual self-contained, with no references to the overview.
There are many choices in ordering a tree. The point is that we can be precise
about the rules we wish to apply or enforce to make the user’s use of the manual
(or device) easier, and we can use algorithms to automatically generate the best
results—or to point out to us any problems or inconsistencies in our requirements.
There is a key advantage to doing this (which I repeat in many ways throughout
this book): if we decide to modify the design in any way, we automatically generate
new manuals without any further effort.
In contrast, if we worked conventionally, even a minor design change could do
untold damage to a carefully written user manual (or any other product of the de-
sign process)—and that potential problem would mean we don’t make improve-
ments because of the knock-on costs, or that we don’t have accurate manuals, or
that we write the manuals at the last possible minute and don’t thoroughly check
them. Very likely, when the manuals are written late, we would not read them
ourselves and get no further insights into how to improve the design (some de-
sign problems will stand out in awkwardly written sections of the manual).
8.9 Small worlds
much, because it would be too much trouble to redesign what is already known to
work.
Rather than being created out of nothing, a device graph grows over time. Itera-
tive design occurs for new models based on previous years’ models, but even new
devices will have had some design iteration: when design started, there would
have been some earlier sketch design, and that design then grew and was elabo-
rated over a period of time, mostly by being extended with new features as the
designers or users think of things to add.
When graphs grow, they tend to have special and very interesting properties. In
technical terms, these graphs are called small world or scale-free, and the popular,
well-connected vertices are called hubs. Here are two familiar examples:
The world wide web When I create a web page, I tend to link it up to sites I am
interested in and that are popular. Of course, I’m unlikely to link to unpopular
sites because I probably won’t know about them. In turn, if my page is
popular, lots of people will link to me. Again, as more people get to know
about and link their pages to my page, then my site becomes even more
popular, and even more people link to me. Popular web pages quickly become
very popular. Hopefully, my page will end up being a hub.
The airport network There is already a network of airports and rules about
where airplanes fly. If I want to build a new airport, I'm going to want routes
approved to somewhere like Los Angeles or Heathrow, because they are
popular airports, and if I can connect to one, I'll make my airport far more
popular. Heathrow and Los Angeles are hubs in the airport network.
If a graph is a small world, the average distance between its vertices will be sur-
prisingly low. In an ordinary random graph, most vertices have about the same
number of edges, and there are no shortcuts from anywhere to anywhere. In con-
trast, in a small world graph, the hubs tend to connect seemingly “distant” vertices
in at most two steps.
Of course, path lengths in a small world graph won’t be as low as in a complete
graph, because if a graph is complete, the user can get anywhere from anywhere
in exactly one step: the characteristic path length (the average shortest distance
between any two vertices) of a complete graph is exactly 1. At the other extreme, a
cycle connects every vertex with the longest path possible: if a graph of N vertices
is a cycle, its characteristic path length is N/2—on average a user has to go half
way around the cycle to find what they are after. In general, the characteristic path
length will fall somewhere between 1 and N/2, and the smaller it is, the faster
the device will be to use, other things being equal. (If a user doesn’t know how
to find short paths, the fact that there are theoretically short paths may not help.
However, the converse is always true: making paths longer makes a device harder
to use.)
On calculating the characteristic path length, see section 9.6.2 (p. 302). Other
uses of the characteristic path length are discussed in section 10.3.1 (p. 333).
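To make the definition concrete, here is the calculation in miniature (section 9.6.2 does it properly within the framework). It assumes a distance matrix initialized with 1 for each arc and Infinity otherwise; Floyd-Warshall then finds all the shortest paths, which are averaged over ordered pairs of distinct vertices.

function characteristicPathLength(dist)
{   var n = dist.length; // note: overwrites dist with shortest path lengths
    for( var k = 0; k < n; k++ )
        for( var i = 0; i < n; i++ )
            for( var j = 0; j < n; j++ )
                if( dist[i][k] + dist[k][j] < dist[i][j] )
                    dist[i][j] = dist[i][k] + dist[k][j];
    var total = 0;
    for( var i = 0; i < n; i++ )
        for( var j = 0; j < n; j++ )
            if( i != j ) total += dist[i][j]; // Infinity here flags a disconnected graph
    return total / (n * (n - 1));
}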
The characteristic path length is not the only unexpectedly low measure of a
Box 8.4 Small world friendship graphs Consider a graph of people, where everybody is a
vertex, and if two people are friends, then we say there is an edge between the corresponding
vertices. This graph, the so-called friendship graph, forms a popular example of small worlds.
The graph isn’t exactly random, because people tend to make friends depending on other
people’s popularity and the friends they know. That is, naturally, we tend to know people
who are popular, and we make friends either with them or with people they are friends with.
Although the graph is far from complete—nobody can possibly know everybody—it’s a small
world graph.
The graph is small world because friendly, popular people tend to be people other people
become friends with. Popular people become, in small world terms, hubs: bigger and bigger
friendship networks grow around them, reinforcing their hubness. If you know a hub, you
know at one-step-removed very many friends through that hub. Hubs are worth knowing
because they are well-connected.
It is said that everybody is separated by at most six people in this world-wide friendship
graph. This sounds surprising at first, given how many people there are in the world, and
how few of them each of us know. Indeed, it isn’t quite true.
The psychologist Stanley Milgram wanted to know how well connected we are, though
his definition of connectedness was more like “know of” rather than “friends with.” In the
1960s he asked 160 people in Omaha, Nebraska, to write a letter to somebody as close as
they knew to a target stockbroker Milgram had chosen in Sharon, Massachusetts. And when
that intermediate person got a letter, they were to repeat the process and forward the letter
to the next closest person they knew toward the target person.
You would expect the letters to go a long, round-about route—and you’d expect lots of
steps to be needed to link Omaha to a nondescript person in Sharon. Milgram found that
in fact only about 6 steps were needed, and about half of all letter chains passed through
three key people. These key people were hubs of this network.
If it is true that 6 (or thereabouts) is the maximum path length of the friendship graph,
then applying our shortest paths algorithm to it would create a matrix of numbers all
(roughly) 6 or less.
How to find and calculate shortest paths is discussed in section 9.6 (p. 297).
small world graph. The eccentricities and in particular the diameter are also
smaller. These properties are defined in section 9.6.3 (p. 303).
In a complete graph, every vertex has N − 1 arcs coming from it, and in a cycle
every vertex has one arc coming from it. These are the two extremes. As we add
more arcs to an intermediate graph, the characteristic path length will get smaller,
but generally we need to add a lot of arcs to get a low characteristic path length.
Or we can add arcs preferentially to popular hubs. Then the graph becomes a
small world, and the path length will be lower than we’d expect from its number
of arcs, at least compared to a random graph. In practical terms, this means that
each vertex has a low out-degree (a low number of out-going arcs) yet is closely
connected to every vertex.
In interactive device terms, this means that each state needs few buttons (out-
going arcs), yet is not too far from any other state. Making a device a small world
is a way to make every state easier to get to without greatly increasing the number
of buttons.
Box 8.5 Erdös numbers Another human-grown network is the network of people who write
articles together. Paul Erdös was a prolific mathematician who wrote a lot of articles about
graph theory, and he coauthored many of his vast output of 1,535 papers.
Anyone who wrote with Erdös is said to have an Erdös number of 1, Erdös himself hav-
ing an Erdös number of 0. Anyone who wrote an article with someone who has an Erdös
number of 1 will have an Erdös number of 2, and so on. I have an Erdös number of 4,
because I wrote a paper called “From Logic to Manuals,” with Peter Ladkin (it appeared in
the Software Engineering Journal, volume 11(6), pp. 347–354, 1997), who wrote a paper with
Roger Maddux, called “On Binary Constraint Problems,” in the Journal of the ACM, 41(3),
pp. 435–469, 1994, who wrote a paper with Alfred Tarski (himself a prolific mathematician),
who wrote several papers with Erdös, for instance in the Notices of the American Mathe-
matical Society. Tarski has an Erdös number of 1; Maddux, 2; Ladkin, 3; and I, 4. That’s
a low number—but all of us tend to write papers with people who like writing papers with
one another. We tend to write papers with authoring hubs, and hubs are well connected and
reduce the distances enormously.
Since “standby” is the main state of an interactive device, where every user action must
eventually return, perhaps we should be interested in standby numbers—which are a measure
of how hard a state is to reach from the standby state.
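Standby numbers are easy to compute: a breadth-first search from the standby state. A sketch over the fsm matrix representation of chapter 9, where standby is the standby state's number:

function standbyNumbers(fsm, standby)
{   var distance = [];
    for( var s = 0; s < fsm.length; s++ ) distance[s] = Infinity;
    distance[standby] = 0;
    var queue = [standby];
    while( queue.length > 0 )
    {   var s = queue.shift();
        for( var b = 0; b < fsm[s].length; b++ )
        {   var next = fsm[s][b];
            if( distance[next] == Infinity )
            {   distance[next] = distance[s] + 1;
                queue.push(next);
            }
        }
    }
    return distance; // any Infinity left flags a state unreachable from standby
}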
That’s a quick overview of small worlds. So what? All small world networks
have two very interesting but contrasting properties that are relevant to interac-
tion programming. Small world networks are robust, but they are susceptible to
attack.
In a random, non-small-world graph, if we delete a vertex, the average path
length will increase. In a small world graph, if we delete a random vertex (or a
random edge), average path lengths won’t increase much—unless we remove
Box 8.6 LATEX I used LATEX to write this book instead of a conventional wordprocessing
application like Microsoft Word—LATEX is a markup language, a little bit like HTML.
If I had used Word, when I wanted a special symbol, say the Greek letter α, I would
have to have moved the mouse, clicked on a menu, then muddled around to find the right
symbol, then clicked “insert.” However, if Word’s help system was complete, I could have
typed “alpha” into its search box, and found the Greek letter α directly, and perhaps (if I
was lucky) a button called Do it now —like my dialog box on p. 353.
In fact, LATEX already works like this: I just type \alpha, and I get what I want directly.
Typing \ is faster than moving the mouse to some “insert symbol” menu—and I can do it
by touchtyping without looking at the screen or menus.
In effect, the backslash character \ puts LATEX into a hub state. Although this makes
LATEX sound better, its disadvantage is that LATEX itself doesn’t provide the user with any
help; in Microsoft Word, a user searching for symbols can see menus of the things, whereas
in LATEX users have to know how to spell what they want.
Of course, there is no reason why the best of both worlds could not be combined, but
that’s a different story . . .
Features like Do it are discussed in section 10.7.2 (p. 352).
a hub. But there aren’t many hubs, so we are unlikely to pick a hub by chance.
Thus if a random vertex “goes down” in a small world network, very little of
the connectivity suffers. For example, if we pick a purely random airport, say,
Luton, few people (outside of Luton!) would notice if it wasn’t working.
However, if we deliberately kill off a hub, then the average path length will
increase a lot. For example, if a terrorist destroys Heathrow, the airport hub,
then very many routes would be disrupted. Hopefully, then, we work out
where the hubs are and defend them better against failure or against deliberate
attack.
For interactive devices, if what the user doesn't know is randomly distributed,
then they probably still have a good working model of the device. However, if the
user is missing critical information, namely, information about the hubs, then their
knowledge of the device will be very inefficient. So designers should work out where their device's
hubs are and make sure that users are aware of them, through the user manual
and by using screen design, layout, or other features to make hubs more salient to
the user.
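Finding candidate hubs is, again, easy by program. A rough sketch: count the arcs into each state over the fsm matrix, and flag states whose in-degree is well above average. The threshold of twice the average is arbitrary; a fuller analysis would also look at how many shortest paths run through each state.

function findHubs(fsm)
{   var indegree = [], hubs = [], total = 0;
    for( var s = 0; s < fsm.length; s++ ) indegree[s] = 0;
    for( var s = 0; s < fsm.length; s++ )
        for( var b = 0; b < fsm[s].length; b++ )
            if( fsm[s][b] != s ) indegree[fsm[s][b]]++; // ignore self-loops
    for( var s = 0; s < fsm.length; s++ ) total += indegree[s];
    var average = total / fsm.length;
    for( var s = 0; s < fsm.length; s++ )
        if( indegree[s] > 2 * average ) hubs.push(s);
    return hubs;
}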
Fortunately, there is an easy way to create hubs. An interactive help system is
a state cluster that could be connected to every significant state—if every entry in
the help system has a Do it button. The theory suggests that a user would find
the system easier to use because everything becomes easier to find. In fact, it
would not be too surprising if users started describing what they wanted in the
help system's search dialogue box, finding that easier than using the normal menu
structure.
8.10 Conclusions
Graph theory gives us a very clear way of thinking about interaction program-
ming, and as the bulk of this chapter showed, almost any question about usability
design can be rigorously interpreted as a question about graphs. In turn, any rig-
orous question about graphs can be answered by computer programs analyzing
the design.
A graph is a way of representing an interactive device, and once you think of a
device as a graph, you can ask all sorts of graph-type questions about its design
and figure out how those questions (and their answers) relate to usability and
what you want the device to do well.
Few interactive systems are really programmed as explicit graphs, and the pro-
grammers end up not having a clue what the interaction graph really is. They
then cannot change it reliably, and they certainly cannot verify that it conforms
to anything. The result is bad interactive systems, whose development cannot use
computers to help with the design process, for instance, to draft user manuals for
technical authors.
to answer without a lot of help from a computer. Since we’ve shown that lots of
graph theory questions are highly relevant to usability, then it follows that design-
ers ought to be using computer-supported design tools that can answer various
graph theory questions.
Sure, some graph theory questions take longer to answer, and a few questions
take far too long to answer in practice, even on smallish graphs (the traveling
salesman problem is a case in point). The chances are that any question that takes
a computer a long time to answer is not going to be very relevant for making the
user interface easier to use, since such questions won't have much meaning in the
user's head—users aren't going to have time to stay around and find out, if the
answers take too long to work out. Nevertheless, there's no reason not to use
graph theory more.
Bell, T., Witten, I. H., and Fellows, M., Computer Science Unplugged, 1999. This
book is full of activities that can be done by individuals or groups of people,
whether students or school children. This brilliant book is highly motivating
and goes against the boring standard ways of teaching graph theory, which all
too often make it look as if it is a complex and arcane subject. See
unplugged.canterbury.ac.nz to get a downloadable version.
Biggs, N. L., Lloyd, E. K., and Wilson, R. J., Graph Theory: 1736–1936,
Cambridge University Press, 1986. As its title suggests, this presents a
historical perspective on the development of graph theory, complete with
(translated) extracts from key papers. It is an easier read than most books on
graph theory because, as it works through history, the earlier stuff is inevitably
simpler. In contrast, ordinary books on graph theory plunge into definitions
and give you everything (I think) too quickly.
Buckley, F., and Lewinter, M., A Friendly Introduction to Graph Theory, Prentice
Hall, 2003. This really is a well-written, friendly book on graph theory, as its
title suggests, at a level equivalent to this book’s treatment, but obviously
emphasizing the mathematics and going more deeply into graph theory than
we could here.
Michalewicz, Z., and Fogel, D. B., How to Solve It: Modern Heuristics, Springer,
2000, is a wide-ranging discussion about problem-solving in programming,
with the added benefit that it puts the traveling salesman problem in the
context of other problems and techniques.
Milgram, S., “The Small World Problem,” Psychology Today, 2:60–67, 1967. The
original paper that established the “six degrees connectivity” of the human
race.
Watts, D., Six Degrees, Heinemann, 2003. This is a popular account of small
world networks. Note that most discussions of small worlds, including this
one, are not very interested in the names of edges—in device terms, the names
of buttons. For a user of a device, the names (or appearance) of buttons are
very important. For example, what might be a simple cyclic graph (considered
as an unlabeled graph) might need different button presses for each edge: the
apparent simplicity of the cycle would be lost on the bemused user.
9 A framework for design
This chapter describes a basic programming framework that lets us build, test and
analyze any interactive device. Once we have a framework for programming finite
state machines, we can do all sorts of things efficiently. The design framework
presented here is very simple, trimmed as it is for the purposes of the book. If you
like programming, please be inspired to do better with your own design projects.
If you don’t like programming, it is still worth reading through this and the next
few chapters, but be reassured that non-programming designers could use tools
that hide the programming details. However, the insights a designer can achieve
through this sort of programming are very important—so read this and the
next few chapters to see what can be achieved.
In the future, user interfaces will no doubt move beyond push buttons. Devices
may be controlled by speech or even by direct thought, through some sort of brain
implant. The framework will work just as well for such user interfaces, although
we’ve only developed it here with button pressing in mind. In fact, a really impor-
tant idea lies behind this: speech (or any other panacea) does not escape any of the
interaction programming issues our framework makes clear and easy to explore.
9.2 A state machine framework
[Diagram: the three states On, Dim, and Off, with a labeled arrow out of every state for each action, and a default arrow marking the initial Off state.]
Figure 9.1: The transition diagram of a simple three-state lightbulb, which can be in
any of three states: on, off, or dim. The initial state is Off, as shown by the default
arrow. All actions are shown, even where they do not change states, so there are always
three arrows out of all states.
Since every device has its own arrays, and we want to have a general approach
for any device, it helps to group all of the device details together. JavaScript (and
most other languages) allows data structures to be combined into single objects,
and that is what we will do here.
The approach shown below is a pretty full description of how a device—in this
case, our lightbulb—works.
var device = {
    notes: "simple lightbulb, with dim mode",
    modelType: "lightbulb",
    buttons: ["Off", "On", "Dim"],
    stateNames: ["dim", "off", "on"],
    fsm: [[1, 2, 0], [1, 2, 0], [1, 2, 0]], // see Section 9.2.2 (p. 277)
    startState: 1,
    state: 0,
    manual: ["dim", "dark", "bright"],
    action: ["press", "pressed", "pressing"],
    errors: "never",
    graphics: "bulbpicture.gif"
};
Notice that we have allowed for quite a few more features: we’ve specified
the initial state (the lightbulb will start at the state off); we’ve given the device a
name and picture; and we’ve chosen simple words and phrases to describe what
the device does. We can now concentrate on getting the simulation and analysis
to work, confident that we can easily change the device specification later if we
want to.
this simple device there is one we could have exploited. But we’re trying to build
a general framework for any device, not just for special cases.
These particular numbers are not very exciting, but in general the fsm matrix
tells us which state to go to next when a button is pressed in any given state. Every
row of this matrix of numbers is the same only because the lightbulb is so simple,
and its buttons always do the same things; in general, though, each row would be
different.
The matrix structure is easier to understand when it is drawn out more explicitly
as a table:
                       Go to this state when this button is pressed
When in this state     Off       On        Dim
0: dim                 1: off    2: on     0: dim
1: off                 1: off    2: on     0: dim
2: on                  1: off    2: on     0: dim
From this table we can read off what each button does in each state. This light-
bulb isn’t very interesting: the buttons always do the same things, whatever the
states.
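For example, reading the matrix directly in code:

device.fsm[1][2] // = 0: pressing button 2 (Dim) in state 1 (off) gives state 0 (dim)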
The same information can be presented in many other forms, and often one or
another form will be much easier to read for a particular device. State transition
tables (STTs) are a popular representation.
The full state transition table for the lightbulb is particularly simple:
Action       Current state   Next state
Press Off    on              off
             off             off
             dim             off
Press On     on              on
             off             on
             dim             on
Press Dim    on              dim
             off             dim
             dim             dim
Each row in the table specifies an action, a current state, and the next state the ac-
tion would get to from the current state. Other features may be added in further
columns, such as how the device responds to the actions, or the status of its indica-
tor lights. (We would also need corresponding entries in the device specification.)
State transition tables can usually be made much shorter and clearer by simpli-
fying special cases:
If an action takes all states to the same next state, only one row in the table for
the action is required (cutting it down from as many rows as there are states).
If an action does not change the state, the row for that state is not required
(with the proviso that if an action does nothing in any state, it needs one row to
say so).
If an action has the same next state as the row above, it need not repeat it.
Here is a state transition table for a simple microwave oven, illustrating use of all
these rules:
Action                 Current state    Next state
Press Clock            any              Clock
Press Quick defrost    any              Quick defrost
Press Time             Clock            Timer 1
                       Quick defrost
                       Timer 1          Timer 2
                       Timer 2          Timer 1
                       Power 1          Timer 2
                       Power 2          Timer 1
Press Clear            any              Clock
Press Power            Timer 1          Power 1
                       Timer 2          Power 2
This table was drawn automatically in JavaScript from the specification given
in section 9.4.6 (p. 286)—it was typeset using LATEX, though HTML would have
done as well. Unfortunately, and ironically because of the simplifying rules, the
JavaScript framework code to generate this table is longer than it’s worth writing
out. The full code to do it is on the book’s web site, mitpress.mit.edu/presson.
Our simple framework is written in JavaScript, because it has lots of advantages
for a book, but a proper development framework would be much more sophisti-
cated. It would allow designers to write state transition tables (and other forms
of device specification) directly—just as easily as if they were editing a table in
a word processor. Behind the scenes, a proper design tool constructs the finite
state machine data, which is then used to support all the features of the frame-
work. Ideally, the finite state machine would be reconstructed instantly when
even the smallest change was made to the table, and every feature the designer
was using—simulations, user manuals, analysis, whatever—would update itself
automatically.
To recapitulate, from the framework model we can generate representations of
the device—such as STTs and, later, user manuals, analysis, diagrams and help—
but we can also (with a little more effort) use those representations to either edit
9.3 Prototyping and concurrent design
ways of improving the design—say, when a user spots something—you can revise
the design and redo everything very efficiently.
The framework is surprisingly simple. Obviously what we are sketching is more
pedagogical than real, but the approach—the ideas and motivation behind the
approach—has many benefits. It helps make design concurrent: everything can
be done more-or-less at once, and no information need be lost between successive
phases of the design process.
When we use a design framework to design concurrently, the following advan-
tages become apparent:
Complex information is shared automatically among different parts of the
design process and with different people (programmers, authors, users)
engaged in the design process.
Design ideas and requirements formulated in the framework can be debugged
and quality-controlled once, yet used for many device designs.
Work need not get lost. The same design representation pervades everything
and does not need to be redone or “repurposed” for other aspects of the design.
Many problems for any part of the design can be identified immediately. For
example, technical authors can start to write user manuals immediately;
problems they face (say in explaining difficult concepts or in using draft
manuals in training sessions) are then known from the start, rather than when
it is too late.
Rather than waiting for later stages of a design process to confront problems, it
is possible to test how the “later” stages work almost immediately. The entire
design life cycle can start to be debugged from the earliest moment, and
feedback from draft final stages is available to improve the earliest conceptual
work.
It is possible to improve the design framework itself. Insights to improve the
framework that help particular design projects are programmed into the
framework and then are freely available to other projects.
These are grand claims for something so simple, but this is a different philosoph-
ical approach to design: you start with a simple framework and embellish it, by
extending the framework in different directions as need arises. Since the exten-
sions are automated, if you make any changes, the products are regenerated. The
alternative is to have a complex design environment, where each product—low-
fidelity, high-fidelity, whatever—is done from scratch, and has to be done again if
there are any changes. Put another way, since every interactive device is a state
machine, we build a framework to run and analyze state machines.
9.4 Prototyping as a web device
As described in section 4.5 (p. 105), where we defined the function plural, this
JavaScript would generate better English if we wrote ... "This device has
"+plural(device.stateNames.length, "state")+" and
"+plural(device.buttons.length, "button")+"." ...
You can either use + to join strings together (+ will also add numbers), or you
can just as easily use several calls to document.write on the strings separately. It’s
a matter of personal style.
for( var s = 0; s < device.fsm.length; s++ )
{   document.write("<hr><a name="+s+">In state <b>"+device.stateNames[s]
        +"</b> you can press:</a><ul>");
    for( var b = 0; b < device.buttons.length; b++ )
        document.write("<li><a href=#"+device.fsm[s][b]+">"
            +device.buttons[b]+"</a></li>");
    document.write("</ul>");
}
This will generate a hugely insightful web page! It’ll look like this:
In state dim you can press:
• Off
• On
• Dim
In state off you can press:
• Off
• On
• Dim
...
If you click on one of the buttons, shown in your browser as underlined hot text,
the web page will scroll up and down to get the current state at the top (depending
on your browser: you may need to make the window smaller so you can see it
scrolling—otherwise, your browser won’t seem to do anything if the target state is
already visible in the window without scrolling). In a sense, you’ve not so much
got an interactive simulation of an interactive device as a rather banal interactive
(hypertext) user manual for it.
It’s easy to do much better. Here’s one idea: give the user some hints about
what pressing a button will do:
device.html?2, and so on. This is practically the same thing, except that every-
thing can now be done with a single file, device.html, with the “search” part of
the URL (what comes after the question mark character) selecting which state the
device is in.
In the approach we used above, a for loop ran over all possible values of the
state number. Now each page only has to deal with one state, but it gets the state
number not from a for loop but from the search part of the URL, as follows:
var device = ...;                      // pick a device
var state = location.search.substr(1); // get the search string
document.write("<h1>"+device.modelType
    +" is "+device.stateNames[state]+"</h1>");
document.write("<ul>");
for( var b = 0; b < device.buttons.length; b++ )
    document.write("<li><a href=device.html?"
        +device.fsm[state][b]+">"+device.buttons[b]+"</a></li>");
document.write("</ul>");
Figure 9.2: What the lightbulb device simulation looks like after you press the Dim
button. The three buttons of the device are shown along the top row, and underneath
is the text field that displays the current state name, all generated by the framework
code.
First you need a basic form (written in standard HTML) for the buttons and a
text field for the state name:
<form>
    <input type='button' value='Off' onMouseup='press(0)'>
    <input type='button' value='On' onMouseup='press(1)'>
    <input type='button' value='Dim' onMouseup='press(2)'>
    <br>
    State = <input type='text' name='display' readonly><p>
</form>
Rather than calling specific button functions like off in the HTML code for the
form, we’ve used a more general approach by having a single function press
that can do any button; it just needs to be told which one. Thus press(0) means
do whatever pressing Off should do, press(1) means do whatever pressing On
should do, and so on.
The HTML form above can only handle a three-button device, and it is restricted
to fixed button names (Off, On, Dim) at that. It is better to use JavaScript to gen-
erate a form for any device specification. We only need a JavaScript for loop to
do it and to use document.write to generate the correct HTML. Here’s how to do
everything automatically:
function makeForm(d)
{   document.write("<form>");
    // generate one line for each button
    for( var s = 0; s < d.buttons.length; s++ )
    {   document.write("<input type='button' ");
        document.write("value='"+d.buttons[s]+"' ");
        document.write("onMouseup='press("+s+")'>");
    }
    // plus some more HTML to get a working display ...
    document.write("<br>State = <input type='text' name='display' readonly>");
    // then finish the form
    document.write("</form>");
}
Now, if we write <script>makeForm(device)</script> anywhere, this will
generate the HTML form code automatically. If the device is the lightbulb we de-
fined earlier in this chapter, then this code will automatically generate the example
form given above.
An initialization function starts the device in its start state and builds the user interface form:
function initialize(d)
{ d.state = d.startState; // start off the device in the initial state
makeForm(d);
... other details to be added ...
}
It’s convenient to have a function displayState to display the state name by
writing the name of the state to the form field:
function displayState(d)
{ document.forms[0]["display"].value = d.stateNames[d.state];
}
The initialization function will call displayState(d) to make sure that the ini-
tial state is correctly displayed. A more exciting simulation of a device would
show different images for each state, rather than just writing strings of text—the
state description—to a field, as we’ve done here.
Now all that is needed is the next state function, which we’ve called press.
Notice how press uses the fsm table to work out the next state from the current
state and the number of the button the user has pressed. It then calls displayState
to show the user the new state name.
function press(buttonNumber)
{ device.state = device.fsm[device.state][buttonNumber];
displayState(device);
}
It takes a little bit more tinkering to get it all perfect, but when all the details are
sorted out, we can run the lightbulb simulation as a self-contained web page.
startState: 0,
action: ["touch","touched","touching"],
manual: ["has the clock running","doing quick defrost",
"using Timer 1","using Timer 2",
"on Power level 1","on Power level 2"
],
errors: "never",
graphics: ""
};
9.5 Basic checking and analysis
If every program is written from scratch, the effort to do detailed checking will
rarely seem worthwhile: most of the useful checks are not particularly easy to
program (and it is always going to be easier to hope that the design is correct
than do hard work). Instead, in a framework, we only have to write the checking
program code once, and it is then always available for every device we want to use.
That is a very useful gain and a good reason to use a framework.
The first and simplest thing to check is that we have a valid finite state machine
with the right number of buttons. The properties to check are as follows:
The basic fields are defined in the device specification. (In a strongly typed
programming language like Java, this step would not be required.)
The number of buttons and the number of states conform to the sizes of the
various fsm, indicators, manual, and action fields. (This is a check that neither
Java nor JavaScript can do without some explicit programming.)
The startState and all entries in the fsm are valid state numbers. (Again, this
is a check that Java and JavaScript cannot do without programming.)
Report on whether any states have no actions in them. This is generally an
error, except for special devices.
Report on whether any states have only one action in them. This is an example
property of a device that may or may not be a design error; it may be deliberate
in some states for some devices, but generally it is worth highlighting for a
designer to take note.
Pointless states can be identified visually from transition diagrams; they raise a
potential design issue discussed in section 6.3.3 (p. 184).
In JavaScript almost anything goes, so we have to check explicitly that strings
are strings, numbers are numbers, and so on. It is easiest to define some handy
functions to do this and to report errors as necessary:
function isString(s, name)
{ if( typeof s != "string" )
alert(name + " should be defined as a string");
}
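Building on isString, a checking function along the following lines covers the first few properties listed above. This is a minimal sketch—the name checkDevice and the exact messages are illustrative, not the book's:

function checkDevice(d)
{ isString(d.modelType, "modelType");
  if( d.fsm.length != d.stateNames.length )
    alert("fsm and stateNames disagree about the number of states");
  if( d.startState < 0 || d.startState >= d.fsm.length )
    alert("startState is not a valid state number");
  for( var s = 0; s < d.fsm.length; s++ )
  { if( d.fsm[s].length != d.buttons.length )
      alert("state "+s+" does not have one fsm entry per button");
    for( var b = 0; b < d.fsm[s].length; b++ )
      if( d.fsm[s][b] < 0 || d.fsm[s][b] >= d.fsm.length )
        alert("state "+s+", button "+b+" goes to an invalid state number");
  }
}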
Rather than drawing transition diagrams by hand, we can get a program to do the work for us. When drawing is automated, if we change our device specification, the diagrams can be automatically updated—with speed and accuracy.
To draw diagrams, we will use Dot, an open source standard for drawing graphs.
Dot specifications can be read into many programs, which then do the details of
drawing, creating web pictures, PostScript, or whatever graphic formats we want.
If state a has a transition to state b, we write this in Dot by saying a -> b, and
if we want to say that this transition is caused by some action, such as pressing
a button On , then we tell Dot the transition has a named label by writing a -> b
[label="On"].
Here is a basic description of the microwave oven (generated automatically
from the last example in our framework, on p. 286) in Dot:
digraph "Microwave oven" { /* basic Dot description */
0->0 [label="[Clock]"]; 0->1 [label="[Quick defrost]"];
0->2 [label="[Time]"]; 0->0 [label="[Clear]"];
0->0 [label="[Power]"]; 1->0 [label="[Clock]"];
1->1 [label="[Quick defrost]"]; 1->2 [label="[Time]"];
1->0 [label="[Clear]"]; 1->1 [label="[Power]"];
2->0 [label="[Clock]"]; 2->1 [label="[Quick defrost]"];
2->3 [label="[Time]"]; 2->0 [label="[Clear]"];
2->4 [label="[Power]"]; 3->0 [label="[Clock]"];
3->1 [label="[Quick defrost]"]; 3->2 [label="[Time]"];
3->0 [label="[Clear]"]; 3->5 [label="[Power]"];
4->0 [label="[Clock]"]; 4->1 [label="[Quick defrost]"];
4->3 [label="[Time]"]; 4->0 [label="[Clear]"];
4->4 [label="[Power]"]; 5->0 [label="[Clock]"];
5->1 [label="[Quick defrost]"]; 5->2 [label="[Time]"];
5->0 [label="[Clear]"]; 5->5 [label="[Power]"];
}
The JavaScript to generate this would be easy—just a couple of for loops running around the definition of device.fsm (a minimal sketch follows the list below). However, if you run this through Dot, it is immediately obvious that there are lots of ways to improve it; we’ll make the following improvements in our JavaScript:
The states need names, and we could identify the start state specially.
If a state has a transition to itself (that is, the action does nothing)—for instance
like 0->0 and 4->4 above—we needn’t show the arrow for the transition. This
simplification will remove a lot of clutter from the drawing.
If the same action transitions between two states in both directions—for
instance like the pair 2->3 [label="[Time]"] and 3->2 [label="[Time]"]
above—then we can draw a single arrow but with arrowheads on both ends,
rather than two separate arrows.
If several actions do the same transition, we draw them as a single arrow but with a label made out of all the actions. Thus 1->1 [label="[Quick defrost]"] and 1->1 [label="[Power]"] do the same thing and should be combined.
All sorts of ideas will occur to you on further ways to improve the diagram. Dot is
a sophisticated language and can take lots of hints about what you want. You can
change the color and shape of the states, the arrowhead styles, and so on. What is
important for us is that Dot draws a good enough diagram with no effort on our
part. If we modify the finite state machine definition, the diagram will be updated
automatically. Designers need easy and efficient tools, and this is one of them.
Here’s the JavaScript code to achieve the improvements mentioned above. I
won’t describe the workings of the code in detail; if you copy it out (or copy it
from mitpress.mit.edu/presson), it will work and generate HTML, which you then
cut-and-paste to a Dot program to draw the graph.
function drawDot(d) // generate Dot code for any device d
{ document.write("digraph \""+d.modelType+"\" {\n");
document.write("size=\"4,4\";\n");
document.write("node [shape=ellipse,fontname=Helvetica,fontsize=10];\n")
document.write("edge [fontname=Helvetica,fontsize=10];\n");
document.write("start->"+d.startState+";\n");
document.write("start [label=\"\",style=filled,height=.1,");
document.write(" shape=circle,color=black];\n");
for( var s = 0; s < d.fsm.length; s++ ) // state names
document.write(s + " [label=\"" + d.stateNames[s] + "\"];\n"); // *
for( var s = 0; s < d.fsm.length; s++ ) // single arrows
for( var t = 0; t < d.fsm.length; t++ )
if( t != s ) // ignore self arrows
{ var u = true;
for( var b = 0; b < d.buttons.length; b++ )
if( d.fsm[s][b] == t && d.fsm[t][b] != s ) // single arrows only
{ document.write(u? s + "->" + t + " [label=\"": ",\\n");
u = false;
document.write(d.buttons[b]);
}
if( !u ) document.write("\"];\n");
}
for( var s = 0; s < d.fsm.length; s++ ) // double arrows
for( var t = s+1; t < d.fsm.length; t++ )
{ var u = true;
for( var b = 0; b < d.buttons.length; b++ )
if( d.fsm[s][b] == t && d.fsm[t][b] == s )
{ document.write(u? s + "->" + t + " [dir=both,label=\"": ",\\n");
u = false;
document.write(d.buttons[b]);
}
if( !u ) document.write("\"];\n");
}
document.write("}");
}
After the definition of this function, call drawDot(device) in the JavaScript. Here’s
the diagram it draws for the microwave oven—with no further touching up:
[Dot’s drawing of the microwave oven: the six states Clock, Quick defrost, Timer 1, Timer 2, Power 1, and Power 2, with arrows labeled by the [Clock], [Quick defrost], [Time], [Clear], and [Power] actions.]
If you are keen, you can tell Dot to associate a URL with each state and then you
can click on the transition diagram (in a web browser) and make the device work
by going to the state you’ve clicked on. To do so, add the string
"URL=\"javascript:displayState(state = "+s+")\""
into the line marked * above—the URL runs the JavaScript to simulate the device.
This transition diagram is a detailed technical diagram that might be of interest
to the designer but is too detailed to be of much use to a user. Instead, we can gen-
erate code for Dot (or whichever drawing program we are using) to make things
more helpful for users.
The following Dot code draws a four-state transition diagram but uses pho-
tographs of the device taken when it was in each of the states, using Dot’s param-
eter shapefile to use a picture file rather than a geometric shape. Here the device
is a Worcester Bosch Highflow-400 central heating system, and the transitions oc-
cur when the user presses the Select button. A diagram like figure 9.3 (next page)
might be useful in a user manual.
Below, we’ve only shown the Dot code, not the JavaScript that could generate it.
You would write some JavaScript that generated the Dot code for only part of the
system (that is, a subgraph), rather than the whole system, for instance, based on
lists of states that are needed to make each diagram to illustrate the user manual.
"Off" -> "Twice" -> "Once" -> "On" -> "Off";
Figure 9.3: A state transition diagram, drawn automatically but using photographs of
the device in each of the relevant states. This sort of diagram can be used directly
in user manuals. The device here is a domestic central heating boiler, and the states
are hot water off, on twice a day, on once a day, or on continuously. Note that the
photographs show both the state indicator and the heater-on light.
One approach to handle state encapsulation is to arrange each state to call its
own JavaScript function. The function can then do what it likes in that state. In
many cases, simple modifications to press will seem easiest:
function press(buttonNumber)
{ device.state = device.fsm[device.state][buttonNumber];
  displayState(device);
  switch( device.state )
  { case 1: dostate1(); break; // do something special in state 1
    case 5: dostate5(); break; // do something special in state 5
    default: // in this design, no other states do anything special
      // but the default lets other states work consistently
      // if we added code here for them
      ...
  }
}
Before long, this approach of overriding the definition of each state will get too
complex, especially if you are tempted to write all the code directly into press it-
self. Much better is to improve the way each state is defined centrally in the device
definition. One way to do it is to have a (potentially) different press function for
every state:
var device = {
notes: "Simple lightbulb, with dim mode",
...
stateNames: ["dim", "off", "on"],
stateFunctions: [stateLit, stateOff, stateLit],
...
};
and press does its basic work and then dispatches to the appropriate state function,
stateLit, stateOff, or whatever:
function press(buttonNumber)
{ device.state = device.fsm[device.state][buttonNumber];
displayState(device);
device.stateFunctions[device.state](); // do special action
}
It’s important that everything can be done from the same core representation
of devices. The JavaScript framework we’ve designed will work well even if we
change, perhaps radically, the definition of the device we are simulating or an-
alyzing. The interaction (as done here in JavaScript), the drawings, and all of the
analysis and manual writing are driven directly from the same definition. We can
easily change the design (by editing the device specification) and then rerun the
simulation or the analyses. If the ideas sketched out in this chapter were used in a
real design process, it would permit concurrent engineering—doing lots of parts
of the design at the same time. Some designers could work on getting the interac-
tion right, some could work on the analysis, some could work on implementation
(like writing better code than we did just now), and so on.
9.6 Finding shortest paths
This table, which is stored in device.fsm, tells us the next state to go to. Given
a button press (column) in any state (row), the entry tells us the next state to be in.
For example, the last line (which is state number 5) tells us that button 0 (which
is called Clock and is the first column) would take the device to state 0—this is the
number in the first column of that row.
To work out how many button presses it takes to get from any state to any other
state, we need a new table that has rows and columns for states; each entry will tell
us how many presses it takes to get from one state to the other.
By reading the state-button table above, we can work out the beginning of this
new table, which we will call matrix. If a button pressed in state i causes a tran-
sition to state j, then we want to have a 1 in entry matrix[i][j] to indicate that
we know we can get from state i to state j in one step. Also, we know as a special
case we can get from state i to state i in no steps at all; so every entry matrix[i][i]
should be zero. Any other entry in the matrix we’ll set to infinity because we think
(so far) that there is no way to get between these states.
Note that matrix has less information in it than device.fsm, since we have lost
track of buttons: the new matrix only says what next states are possible from a
given state, whereas device.fsm also says which button should be pressed to do it.
All the code that follows is written inside a function, shortestPaths(d), and
then shortestPaths(device) will work out the shortest paths for that device.
Throughout the following code, d will be the device we’re working on.
Inside the function, we first need a few lines to create a blank matrix of the right
size:
var fsm = d.fsm; // convenient abbreviation
var n = fsm.length; // number of states
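The initialization runs along these lines—a minimal sketch consistent with the description below, assuming infinity is simply a number larger than any possible path length:

var infinity = 1000000; // assumed: larger than any possible path length
var matrix = new Array(n);
for( var i = 0; i < n; i++ )
{ matrix[i] = new Array(n);
  for( var j = 0; j < n; j++ )
    matrix[i][j] = infinity; // so far, no known route from i to j
  for( var j = 0; j < fsm[i].length; j++ )
    matrix[i][fsm[i][j]] = 1; // one button press gets from i to fsm[i][j]
  matrix[i][i] = 0; // it takes no steps to stay in the same state
}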
The tricky line in the inner for loop, matrix[i][fsm[i][j]] = 1, says, “we’re
in state i, and we know button j gets us to state fsm[i][j], and therefore the
entry in the matrix for state i to state fsm[i][j] should be 1.”
Here is what the cost matrix looks like after running that initialization code:
                    To state
From state          0  1  2  3  4  5
0: Clock            0  1  1  ∞  ∞  ∞
1: Quick defrost    1  0  1  ∞  ∞  ∞
2: Timer 1          1  1  0  1  1  ∞
3: Timer 2          1  1  1  0  ∞  1
4: Power 1          1  1  ∞  1  0  ∞
5: Power 2          1  1  1  ∞  ∞  0
I printed this table directly from the actual values in matrix, by using a couple
of for loops:
for( var i = 0; i < n; i++ ) // display matrix
{ for( var j = 0; j < n; j++ )
if( matrix[i][j] == infinity )
document.write("∞ ");
else
document.write(matrix[i][j]+" ");
document.write("<br>");
}
It would not be hard to use an HTML <table> to lay the data out nicely.
The symbol ∞ in the table (&infin; in HTML) means that, for the time being, we
don’t know whether a state transition is possible—it might take an infinite number
of steps to get from one state to another. More precisely, the matrix entries mean:
it can take 0 steps to get from one state to another (if they are the same state); it can
take 1 step to get from one state to another (if there is a single button press that
does it directly); or it can take more, which we’ve represented as ∞.
In short, for any pair of states, say i and j, the matrix tells us whether we can get
from one to the other (in zero or more steps). We now use a standard algorithm, the
Floyd-Warshall algorithm, that considers every state k and determines whether we
can get from i to j via the state k more efficiently. If we can go i → k → j more effi-
ciently than going directly from i → j, we record it in the matrix. Thus we expect
those ∞ values to reduce down to smaller values. The direct cost is matrix[i][j]
and the indirect cost, via k, is matrix[i][k]+matrix[k][j]. Whenever the indi-
rect cost is better, we can improve the recorded cost. Here’s how to do it:
for( var k = 0; k < n; k++ )
for( var i = 0; i < n; i++ )
for( var j = 0; j < n; j++ )
// replace cost of ij with best of ij or ikj routes
{ var viak = matrix[i][k] + matrix[k][j];
if( viak < matrix[i][j] )
matrix[i][j] = viak;
}
At the end of running these three nested for-loops, the program has tried every
way of getting from anywhere to anywhere. The inner two loops find the best way
of getting from state i to j via the intermediate state k, but the outer loop ensures
that the routes via states 0 to k − 1 have already been tried first. Thus when the outer
loop is finished, we know the best ways of getting between any two states via any
states 0 . . . n − 1, which covers all possible cases. This is sufficient to know all of the
best routes through the finite state machine.
Here is the result of running this code on our cost matrix:
            To state
From state  0  1  1  2  2  3
            1  0  1  2  2  3
            1  1  0  1  1  2
            1  1  1  0  2  1
            1  1  2  1  0  2
            1  1  1  2  2  0
We can write the same information as a mathematical matrix (just put it between
round brackets), which is highly suggestive of things you might want to do if you
know matrix algebra; however, to go there is beyond the scope and immediate
interest of this book.
For more on matrices, see box 11.1, “Matrices and Markov models” (p. 382).
Note that there are no ∞ (infinity) symbols in this shortest path table; this means
that (for this device) it is possible to get from any state to any other in a finite num-
ber of steps—thus this device is strongly connected. In particular, each entry in the
matrix gives the least number of steps it will take between two states (specifically,
matrix[i][j] is the cheapest cost of getting from state i to state j by any means).
It goes without saying that our knowing the least number of steps to do any-
thing does not stop a user from getting lost or perhaps deliberately taking a longer
route. But we can be certain a user cannot do better than these figures.
In technical terms, the costs are a lower bound on the user interface complexity,
as measured by counting button presses. That is one reason why the costs are so
important; we have a conservative baseline for understanding the user’s performance—
or lack of it. Regardless of how good or sophisticated users are, they cannot do
better than these figures. If the costs turned out to be bad (or, more likely, some of
the costs were bad) there is nothing a user can do and nothing training can do; we
either have to live with a difficult-to-use feature (some things we may want to be
difficult), or we have to fix the design.
If we multiply the counts in the cost matrix by 0.1 second, we get an estimate
of the minimum time a fast user would take—and of course we could do some
experiments and get a more accurate figure than our guess of 0.1 second; then the
timings become a much more useful design measure.
Getting from Clock (state 0) to Power 2 (state 5) takes 3 button presses—that’s the number 3 in the top right, at
the end of the first line in the matrix above. Here’s the best route between these
two states:
1. Press Time to get from Clock to Timer 1
2. Press Time to get from Timer 1 to Timer 2
3. Press Power to get from Timer 2 to Power 2
In fact, a designer might be interested in all the hardest operations for a device;
for this device there are two, both taking at least 3 steps (the user will take longer
if they make a mistake; the analysis shows they cannot take less than 3 steps). The
other worst case is getting from Quick defrost to Power 2. If the worst cases take
∞ steps, then very likely there is something wrong with the design: some things
the device appears designed for are impossible to do.
Some devices—like fire extinguishers—may be designed to be used only once,
and then the designer will expect an infinite cost for getting back from the state
“extinguisher used” to “extinguisher not used.” But even then, it would help
the designer to have these states and problems pointed out automatically by the
framework. For instance, why not make the fire extinguisher refillable?
Finding the best route itself—the correct sequence of button presses, not just the
total cost to the user of following it—requires a few extensions to the
program.
We need to keep another table via[i][j] to record the first step along the best
route i → j. Every time we update the cost of the route, we’ve found a better step.
Here are the details:
In the original initialization of matrix we add an initial value to via:
In the easy case of going from a state to itself, the first thing to do is to go there.
matrix[i][i] = 0;
via[i][i] = i;
In the code that updates matrix, if it is better to get from i to j via k, we replace
the first step of i → j with the first step of i → k → j, which is already in
via[i][k]:
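A sketch of the updated loop, assuming via was also given its one-step values during initialization (via[i][fsm[i][b]] = b, recording the button pressed):

for( var k = 0; k < n; k++ )
  for( var i = 0; i < n; i++ )
    for( var j = 0; j < n; j++ )
    { var viak = matrix[i][k] + matrix[k][j];
      if( viak < matrix[i][j] )
      { matrix[i][j] = viak; // the route via k is better...
        via[i][j] = via[i][k]; // ...so the first step to k is now the first step to j
      }
    }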
Finally, after finding all the shortest paths, to get the best route between any
two states i → j, the following code will print out the route:
var start = i;
var limit = 0;
while( start != j )
{ var nextState = device.fsm[start][via[start][j]];
document.write("Press "+device.buttons[via[start][j]]
+" to get to "+device.stateNames[nextState]+"<br>");
start = nextState;
if( limit++ > n ) { document.write("?"); break; }
}
This code uses the variable limit to ensure that we don’t try to solve an
impossible task—in case it is called when the route has the impossible length
∞. Recall that no route in a finite state machine need take longer than the
number of states; otherwise, it is going around in circles (it must visit some
state twice) and therefore can’t be a shortest route to anywhere.
The code in the last step above prints out the sequence of user actions a user needs
to do to get a device from state i to state j. If we put that code inside a couple of
loops to try all values of i and j we can find out how often each button is used.
We could then design the device so that the most popular button (or buttons) are
large and in the middle. Or we might discover that the popularity of buttons is
surprising (some buttons may never be used on shortest paths, for instance), and
then we’d want to look closer at the device design. Here’s one way to do it:
var b = device.buttons.length, n = device.fsm.length; // sizes, if not already defined
var bcount = new Array(b); // an array with one element per button
for( var i = 0; i < b; i++ )
bcount[i] = 0;
for( var i = 0; i < n; i++ )
for( var j = 0; j < n; j++ )
{ var limit = 0, start = i;
while( start != j )
{ var nextState = device.fsm[start][via[start][j]];
bcount[via[start][j]]++; // count how often buttons are used
start = nextState;
if( limit++ > n ) break;
}
}
for( var i = 0; i < b; i++ )
document.write(device.buttons[i]+" rated "+bcount[i]+"<br>");
9.6.3 Eccentricity
The eccentricity of a state measures the cost of the worst possible thing you could
try in that state, that is, getting to the most distant state you could get to from that
state.
In graph theory terms, the eccentricity of a vertex is the maximum distance it
has to any other vertex. The diameter and radius of a graph are then defined as
the largest and smallest, respectively, of the vertex eccentricities. Here’s how to
find out all the information:
var radius = infinity, diameter = 0;
for( var i = 0; i < n; i++ )
{ var eccentricity = 0;
for( var j = 0; j < n; j++ )
if( matrix[i][j] > eccentricity )
eccentricity = matrix[i][j];
document.write("<b>"+d.stateNames[i]
+"</b> eccentricity is "+eccentricity+"<br>");
if( eccentricity > diameter )
diameter = eccentricity;
if( eccentricity < radius )
radius = eccentricity;
}
document.write("Diameter (worst eccentricity) = "+diameter+"<br>");
document.write(" Radius (least eccentricity) = "+radius+"<br>");
The designer will be especially interested in any states that have a very high or a
very low eccentricity; the extreme values might indicate an oversight, or, of course,
they may be intentional—sometimes a designer wants to make certain things hard
to do.
For the microwave oven, it turns out that the diameter is 3, and the most eccen-
tric states are Clock and Quick defrost, and in both cases the worst thing to try to
do is to change to the state Power 2.
Clock is a state used for setting the clock, so you’d expect to have to get out of
clock setting, start power setting, then set Power 2, so a shortest path to Power 2
of 3 doesn’t sound unreasonable. If a user is doing a Quick defrost, they are not
very likely to want to ramp up the power to its highest—it’s easy to burn stuff on
the outside but leave the middle frozen. So we have no reason to worry about
the two largest eccentricities.
Box 9.1 Facilities layout
Most problems have appeared under different but more-or-less
equivalent guises before. Improving the design of interactive devices by looking at shortest
paths and user data is “the same” as the facilities layout problem. You have some facilities,
like workstations in a factory, and you want to minimize the cost of people carrying stuff
around the factory.
How people can move stuff around the factory is a graph. You collect data on where
people go and how often, you work out the shortest paths, and you then move the facilities
around to reduce the overall costs of moving things. This problem is very well studied; it
is estimated that about 10% of a nation’s industrial effort goes not just into moving stuff
around, but into rearranging facilities or building new ones to move stuff around!
It’s the same problem as deciding where to put interesting states in a state machine, or
deciding where to put interesting pages in a web site, if you know shortest paths and have
some usage data. It turns out that finding a solution to the problem is NP-hard, which is
just a fancy way of saying nobody knows a good way of solving the problem.
It’s nice to know that re-designing an optimal web site, or re-designing an optimal device,
based on user data is NP-hard, for this is effectively saying it’s a very hard job, however you
do it. On the other hand, since the problem is so hard, if you do not use a systematic way
of re-designing—one of the standard algorithms for solving the problem—then your redesign
is very unlikely to be much of an improvement.
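The two laws referred to in the next paragraph are Hick’s law (the time to decide among choices) and Fitts Law (the time to move a finger to a button). As a hedged reconstruction of their usual forms, with the constants named as in the text:

time to decide = a + b log(n + 1), where n is the number of choices;
time to move = c + d log(D/W + 1), where D is the distance to the target button and W is its width.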
The constants a, b, c, d are usually found from experiment, and depend on the
users, their age, fitness, posture, and so on.∗ The total time to do anything is the
sum of these two laws for each user action required. Whatever the exact values,
we can immediately see that there are some generic things that can be done to
speed things up. We could:
Make buttons larger, so W is larger.
Make buttons closer, so D is smaller.
Reorganize buttons into a better layout, to reduce average distances.
On devices with touch-sensitive screens that show buttons, we can change the
size of buttons depending on what state the device is in. For example, making
the buttons that are used at the beginning of any sequence of user actions
larger will help the user notice them (as if they are lit up).
We could change the constants by “modifying” the user rather than the device:
we could train the user to be faster, or we could select users who were good at
our tasks.
We don’t always design by minimizing costs; we could try to increase costs
instead. If the user is using a gambling machine, we may want the users to
spend as long as possible losing money. Or, although we know that hitting the
big red EMERGENCY OFF button switches a device off immediately, it may not be
the best way of switching it off normally; we should make its cost higher, for
instance by putting it under a protective cover.
∗ Often these laws are given differently, using logarithms to base 2. Since the constants are experimentally determined, it does not matter what logs are used.
Figure 9.4: How a single-state (state 3, left) with two transitions to it (a and b) would
be replaced with four states (right), each “remembering” the original state’s different
ways of getting to it. Now the transitions “know” how the user’s finger has moved: for
example, transition c from state 3 : a required the user to move their finger from button
a to button c. In this example, two of the new states (3 : b, 3 : c) are unnecessary as
there is no way to enter them. (For clarity, the righthand diagram does not show all
transition labels, and states 3 : a and 3 : b will also have more transitions to them than
are shown.)
best times, but best measurements of the index of difficulty, depending on button
layout and the device interaction programming.
We can’t reuse the shortest paths algorithm directly because the time it takes a
user to get from one state to another depends on what the user was doing before
they got to the first state. How long it takes to get from state to state depends on
the distance the user’s finger moves. Pressing button A , say, got the device into
state 2, then the user moved their finger from A to press B to get the device to
the next state. The shortest path algorithm assumes each action (arc) has a fixed
cost—but to use Fitts law we need to know which button the user pressed to get
the device to the first state: for instance, the physical distance A to B , if A got us
to state 2, or the physical distance C to B , if C got us to state 2. Somehow the
states need to know which button the user pressed to get there.
Suppose the device has got B buttons and N states. We create a new “device”
with B states for each of the original device’s states. Give the new states names
like s : b, so that in the new device the state s : b0 represents entering the original
device’s state s by pressing button b0 —thus we have states that record how the
user got to them. If on the old device pressing button b1 went from state s to state
t, then on the new device state s : b0 will go to state t : b1 . This transition will take
time according to the Fitts Law to move the finger from the location of b0 to the
location of b1 , and we now have enough details to work the timings out.
To program this in JavaScript, we give the state we’ve called s : b the unique
number sB + b. The code starts by creating a table of all weights for all the new
state transitions, first initializing them to ∞:
var N = device.fsm.length, B = device.buttons.length, // sizes, if not already defined
    NB = N*B; // one new state for every (state, button) pair
var w = new Array(NB);
for( var i = 0; i < NB; i++ )
{ w[i] = new Array(NB);
for( var j = 0; j < NB; j++ )
w[i][j] = infinity; // default is no transition
}
Then we construct the new “device”:
for( var i = 0; i < N; i++ )
for( var b = 0; b < B; b++ )
{ var u = device.fsm[i][b]; // b takes us from i to u=fsm[i][b]
for( var c = 0; c < B; c++ )
// we’ve just pressed b, now we press c
// pressing c takes us from u to fsm[u][c]
w[u*B+b][device.fsm[u][c]*B+c] = Fitts(b, c);
}
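The function Fitts(b, c) used above returns the predicted finger-movement time from button b to button c; its definition is not shown here. A minimal sketch, assuming a hypothetical buttonPos array giving each button’s center coordinates and width, and experimentally determined constants:

var buttonPos = // hypothetical button layout: centers (x, y) and widths
  [ {x: 0, y: 0, w: 10}, {x: 30, y: 0, w: 10}, {x: 60, y: 0, w: 10} ];
var fc = 0.1, fd = 0.1; // constants found from experiment
function Fitts(from, to)
{ var dx = buttonPos[to].x - buttonPos[from].x;
  var dy = buttonPos[to].y - buttonPos[from].y;
  var D = Math.sqrt(dx*dx + dy*dy); // distance between button centers
  var W = buttonPos[to].w; // width of the target button
  return fc + fd*Math.log(D/W + 1); // time grows with log(D/W + 1)
}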
We can now use the familiar Floyd-Warshall algorithm to find fastest paths.
This part of the code works exactly as before—though for simplicity here we are
not recording the actual paths taken.
for( var k = 0; k < NB; k++ )
for( var i = 0; i < NB; i++ )
for( var j = 0; j < NB; j++ )
if( w[i][k]+w[k][j] < w[i][j] )
w[i][j] = w[i][k]+w[k][j];
Figure 9.5: A plot of Fitts Law timings (vertically, in seconds) against button press or
action counts (horizontally) for all possible state changes on the PVR used in section
10.3. Button counts are closely proportional to user times. Points close enough to
overlap are rotated—the more points at the same coordinates, the darker the blob. The
graph also shows a best-fit line 0.21 + 0.16x.
radio—section 3.10 (p. 80)—require the user to spend a significant time interpret-
ing the display, and Fitts Law does not consider that either. If your device takes
many button presses to change states, Fitts Law will become inaccurate as it in-
creasingly under-estimates times, ignoring the user checking the device feedback
and thinking.
In figure 9.5 (this page), I’ve used the program code and shown Fitts Law tim-
ings plotted against button counts to do the same tasks. As can be seen, the timings
and counts are closely related (at least for this device), so button press counts—
which the basic shortest paths algorithm gives directly—will be sufficiently accu-
rate for most interaction programming design issues (in fact, as accurate as you
could expect from either experiments or from Fitts Law alone). However, if you
wish to design button layouts, the plot also shows us that user timings can vary
by a factor of two or more simply because of the time taken for finger movement;
for button layout, then, Fitts Law can be very helpful.
Familiarity
Another useful cost we can explore is familiarity. There are many ways to repre-
sent familiarity, but suppose we keep a matrix inside the interactive device, ini-
tially with the one-step costs: 1 if a transition is possible, 0 if it is not possible.
Now, every time the user does something, the corresponding transition “cost” is
Box 9.2 Button usage
Finding the absolutely best layout of buttons is hard, especially if all possible button
positions are considered in all possible layouts (rectangles, circles, hexagons . . . ). It may
be just as useful to find out which are the most used buttons.
A model of the JVC HRD580EK PVR suggests that buttons are used very differently:
Button press or action    Relative use on shortest paths
Record                    67.36% (1876)
Stop/Eject                15.44% (430)
Pause                      8.58% (239)
Play                       5.03% (140)
Operate                    2.87% (80)
Forward                    1.94% (54)
Rewind                     1.94% (54)
insert tape                1.87% (52)
The Record button seems to be used excessively: it has too many uses in this design—surely, there
aren’t that many ways of recording? Thus, some of its uses might be reconsidered and
handed over to another, possibly new, button. Or, if we do not change the meaning of the
button (which would change the figures in the table), it should be physically positioned in
the centre of the buttons, to reduce hand or finger movement times.
Inserting a tape is used 1.87% of the time, which also seems high—surely there is only
one thing inserting a tape should do? As it happens, on the JVC PVR, inserting a tape also
switches the PVR on if it is off, so inserting a tape is faster than switching on, then inserting
a tape. Again, analysis questions a user action possibly having too many uses.
The JVC HRD580EK PVR is used in section 10.3 (p. 330). The path length data is
illustrated in a bar chart in figure 11.4 (p. 381).
incremented. The matrix counts how often a user does something; we can assume
counts indicate how familiar the user is with each action. Now maximum cost
paths indicate paths that the user would tend to take themselves. We can find
solutions that are not fastest but—we hope—are most familiar. Alternatively, we
might want to train the user to be proficient in all ways of using a device; then we
would want to recommend they learn and do things they are not familiar with.
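A minimal sketch of such a familiarity matrix, set up as described and incremented on each press (the variable names are illustrative, not the book’s):

var n = device.fsm.length;
var familiarity = new Array(n);
for( var i = 0; i < n; i++ )
{ familiarity[i] = new Array(n);
  for( var j = 0; j < n; j++ )
    familiarity[i][j] = 0; // impossible transitions stay at 0
  for( var b = 0; b < device.buttons.length; b++ )
    familiarity[i][device.fsm[i][b]] = 1; // possible transitions start at 1
}
// then, inside press, after working out the next state:
// familiarity[oldState][newState]++; // count what the user actually does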
If the user was a pilot, nurse, or fireman, then they might be required to be fa-
miliar with every operation of the device; the ideas here then allow the framework
to tell them what they have missed or not yet done enough of. Perhaps their pro-
fessional license requires them to do a certain sort of procedure every year or so;
if so, we can check, and tell them what they have missed.
These ideas give advice depending on what users do. Instead, we could fill the
cost matrix with how experts think the device should be used. In this case, the
shortest paths will automatically give expert advice.
More uses of costs to use in products (rather than just for the designer to
analyze) are discussed in section 10.7 (p. 351), particularly 10.7.5 (p. 358),
and 11.5 (p. 392).
9.7 Professional programming
User manuals can be generated with automatic help; see section 11.5 (p. 392).
The fourth point of view is the most interesting: how hard is it for us to work
out how to write a program to solve the user’s task/action mapping problems?
The harder it is for us to solve the problem, the harder it will certainly be
for users (who know less about the device than we do). We may find that our
programs need hints; if so, so will the users.
In some cases, we may find that the task/action mapping problem is very hard
to solve (it might take exponential time to solve, or worse); then we really need
to redesign the device—because if a problem is theoretically that hard, the user
must find it hard too.
The key importance of the designer getting insight into the difficulty (or ease)
of the user solving problems is part of the computer science analogy, which we
introduced in section 5.7 (p. 157).
Rather than specifying the device with separate parallel lists, like this:
var device = {
...
stateNames: ["dim", "off", "on"],
fsm: [[1, 2, 0], [1, 2, 0], [1, 2, 0]],
...
manual: ["dim", "dark", "bright"],
...
};
you would write more like this:
var device = {
...
states:
[{name: "dim", fsm: [1, 2, 0], manual: "dim"},
{name: "off", fsm: [1, 2, 0], manual: "dark"},
{name: "on", fsm: [1, 2, 0], manual: "bright"}],
...
};
This way of coding brings everything in each state much closer together and
therefore helps you get the correspondences right. If you know JavaScript, you
can use object constructors rather than repeatedly writing out name, fsm, and
manual.
The finite state machine field, fsm, is defined using state numbers, and it is
sometimes too easy to mix up state numbers. Instead, every reference to a state
should be to its name. If you want to do this, you will probably want to
preprocess the device to build an internal specification that uses numbers
(because they are much more efficient when the device is running).
Section 9.5 (p. 288) gives other ideas for preprocessing the device specification
to improve its quality.
Strings are readable and make a device specification very clear, which is why
we’re using them a lot in this chapter. But, what happens if you mistype a
string? The device would misbehave. If you are using a typed language like
Java, there are many better approaches. You could have a state constructor, say,
a State class, with each state made by something like State sOff = new State("off").
The advantage is that Java will only let you use sOff where you need a state,
and you can only use states in those places. You could not get a button and a
state confused. (Here you’d need to add actions to each state separately, say,
sOff.addAction(button, nextstate).)
If you spend another half hour or so programming, you will be able to
generate code to work on hardware (or Java, or whatever) from the JavaScript
framework we’re using, and you will then be able to get the real device to
work straight from the framework.
It is easy to add special-purpose features to the framework, but we won’t dwell
on them in this book. For example, the framework can be extended to be a
pushdown automaton in which each user action stores the last state in a stack
(in JavaScript, it would be an array, using the methods push and pop). The
Undo button is then programmed to pop the stack and restore the last state; a minimal sketch follows at the end of this list.
Section 10.8 (p. 362) suggests how to implement consistent user interfaces by
using program code to define interaction features.
Our framework has finite state machines as explicit data structures. It’s tedious
writing down finite state machines like this—every bit as tedious as drawing
state transition diagrams. Just as statecharts improve the visualization of state
machines, there are many programming language approaches (including
SCXML, the XML statechart notation) that improve the clarity of finite state
machine specifications.
You can use JavaScript as a specification language to build the finite state
machine directly; we show how in section 9.8 (p. 316). We discuss more general
approaches to design tools in section 12.6 (p. 428). For other ideas extending
the framework see section 9.5.2 (p. 295).
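Here is the minimal sketch of the pushdown idea promised above—an Undo button via a stack of states (the names history and pressWithUndo are illustrative, not the book’s):

var history = []; // stack of previous states
function pressWithUndo(buttonNumber)
{ history.push(device.state); // remember where we were
  device.state = device.fsm[device.state][buttonNumber];
  displayState(device);
}
function undo()
{ if( history.length > 0 )
  { device.state = history.pop(); // restore the last state
    displayState(device);
  }
}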
Figure 9.6: A Phidget, connected to a USB port on a PC, simulating a device display.
Phidgets use USB interfaces and are well supported. So, for example, you could
use JSON to get our framework into ActionScript, C or Java, then use the Phidget
libraries provided for these languages to run the hardware. Figure 9.6 (this page)
shows a simple user interface built out of Phidgets, running a syringe pump user
interface written using this book’s framework.
You can get more details from www.phidgets.com
9.8 Generating device specifications
Figure 9.7: An unusual device to consider as an interactive device, but a finite state
machine nevertheless, with 360 states—more if we distinguish the motor running at
various speeds. The device can be specified using a statechart or program—see fig-
ure 9.8 (p. 319) for the statechart and section 9.8 for the program.
For example, you can only change direction from clockwise to anti-clockwise if the
motor is not running.
The best way to define this device in the framework is to write a program that
provides the details. It would be far too tedious and error-prone to write out ev-
erything by hand. Every physical constraint will appear in the program as an
explicit test.
Interestingly, the user manual for the drill (a DeWalt DC925) only mentions one
of the constraints—it says you must not change gear while the motor is running.
Trying to specify this device as a FSM therefore highlights a possible design issue:
should the gears be improved so that nothing need be said in the user manual?
Would it be a better tool if the gears were modified? It would certainly be easier to
use, and with a reduced risk to the user of wrecking the gearbox, which the manual
currently warns about. So, even without doing any analysis with the framework,
the discipline of being explicit about the interaction design of the device has raised
a design issue.
What follows is one way to program it in JavaScript. First we need state num-
bers for every possible state the drill can be in. The drill allows the user to set the
gear, the clutch, and so on, to various settings:
var clutchSteps = 24, directionSteps = 3,
gearSteps = 3, triggerSteps = 2;
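The helper functions used below—conceptualState, every, stateName, and the test function—are not reproduced here; the following is a minimal sketch consistent with how they are used (the exact numbering scheme and name format are assumptions):

function conceptualState(t, c, g, d) // a unique number for each combination
{ return ((t*clutchSteps + c)*gearSteps + g)*directionSteps + d;
}
function every(f) // call f on every combination of settings (2*24*3*3 = 432)
{ for( var t = 0; t < triggerSteps; t++ )
    for( var c = 0; c < clutchSteps; c++ )
      for( var g = 0; g < gearSteps; g++ )
        for( var d = 0; d < directionSteps; d++ )
          f(t, c, g, d);
}
function stateName(t, c, g, d) // build a readable name from the settings
{ return (t == 0? "off": "running")+" "
    +(d == 0? "reverse": d == 1? "locked": "forwards")
    +", gear "+(g+1)+", collar "+(c+1);
}
function test(t, c, g, d) { document.write(stateName(t, c, g, d)+"<br>"); }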
every(test);
[Statechart diagram: the trigger is off or running, the direction slider is forwards, reverse, or locked, there are three gears, and the 24-position collar (settings 1–22 plus drill and hammer) selects screwdriver, drill, or hammer modes.]
Figure 9.8: A statechart description of the DC925 drill shown in figure 9.7 (p. 317).
Unusually, this device has no self-loops—if the user can physically do an action, the
device changes state. In contrast, pushbuttons on most devices can be pressed even
when they do nothing, which creates self-loops in their FSM model.
Amongst the mass of output this generates, it will print “running locked” a
few times, which is a combination that ought to be impossible! You can’t press
the trigger in to get it to run when the direction slider is in the central, locked,
position. We obviously still have more programming to do.
Next, for all possible user actions in any state, we need to work out what tran-
sitions are possible:
function action(t, c, g, d)
{ transition(t, c, g, d, "trigger in", t+1, c, g, d);
transition(t, c, g, d, "trigger out", t-1, c, g, d);
transition(t, c, g, d, "clutch increase", t, c+1, g, d);
transition(t, c, g, d, "clutch decrease", t, c-1, g, d);
transition(t, c, g, d, "gear up", t, c, g+1, d);
transition(t, c, g, d, "gear down", t, c, g-1, d);
transition(t, c, g, d, "direction +", t, c, g, d+1);
transition(t, c, g, d, "direction -", t, c, g, d-1);
}
This function is saying, for all of the features on the drill, things like “if the cur-
rent state is t, c, g, d then we could increase the clutch setting by 1, and if we did,
it would be in state t, c + 1, g, d.” If we wrote every(action) this would generate
calls to the function transition for every possible thing that could be done in every
possible state. Unfortunately, not all the states and not all actions are actually pos-
sible. For example, if the drill is locked, the trigger cannot be pressed in to make
it run; and if the drill is set to gear 2, we can’t increase the gear to 3, because there
are only three gears (gears are numbered 0, 1, 2, even though the drill itself calls
them 1, 2, 3). We need to program a check on the drill’s constraints:
function allow(t, c, g, d)
{ if( t < 0 || t >= triggerSteps ) return false;
if( c < 0 || c >= clutchSteps ) return false;
if( g < 0 || g >= gearSteps ) return false;
if( d < 0 || d >= directionSteps ) return false;
if( d == 1 && t != 0 ) return false;
return true;
}
The important point is that this function and transition, discussed below, cap-
ture all the device’s constraints in one place. The complex constraints are captured
in a clear programmatic way. For example, the last test in the code above, if( d
== 1 && t != 0 ) return false effectively says, “if the drill is locked, then the
trigger must be out and the motor off.”
Writing the code prompted me to think more about the drill’s constraints: have
I written accurate code for this book? It turns out that you can stall the drill
if you try hammer drilling with the motor in reverse and simultaneously apply
some pressure. This could happen if you are drilling, for instance, reinforced con-
crete and wish to free a drill bit that has got stuck—if you reverse the drill but
leave it in hammer mode, the drill itself may get jammed. It must be an over-
sight that this case is not mentioned in the user manual. The extra code needed
to express this constraint in the function allow is if( d == 0 && t == 1 && c
== clutchSteps-1 ) return false, or in words, “if in reverse, and the trigger is
pushed in, and the clutch is set to hammer, then disallow this state.”
Our framework requires consecutive state numbers 0, 1, 2, 3 . . . with no gaps, so
if some states are not allowed we need a way of mapping conceptual state num-
bers to real state numbers, skipping the states that are not allowed. The easiest
approach is to construct a map as a JavaScript array:
var map = new Array(triggerSteps*clutchSteps*gearSteps*directionSteps);
var n = 0;
function makeMap(t, c, g, d)
{ if( allow(t, c, g, d) )
map[conceptualState(t, c, g, d)] = n++;
}
every(makeMap);
After running this code, map[s] gives us a number 0 to 359 of allowed state
numbers, provided that s is an allowed state from the original 432 conceptual
states; otherwise map is undefined.
Recall that the framework requires each state to be named. Here’s how the map
can be used to name the real states: if a conceptual state is allowed, map the state
number to a real state number, then give it its state name. As before, we generate
the state name from the combination of gears, clutch settings and so on, using the
function stateName we’ve already defined. Notice we use every to conveniently
run over all possible states.
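A minimal sketch of that naming function, using stateName (as sketched earlier) and the map just built:

drill.stateNames = new Array(n); // one name per allowed state
function nameEachState(t, c, g, d)
{ if( allow(t, c, g, d) )
    drill.stateNames[map[conceptualState(t, c, g, d)]] = stateName(t, c, g, d);
}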
every(nameEachState);
To complete the framework FSM, we must make sure all the user’s state tran-
sitions are allowed; that means both the state we are coming from t0... and
the state we are going to t1... are allowed. For the drill, if both states are al-
lowed, the transition between them is always allowed. More complex devices
would need further programming to allow for more complex conditions on what
transitions are allowed—for example, although the drill allows us to change gear
when it is running, the manual warns this is a bad idea because the gears may
grind, and this constraint could be handled by writing ... && (t0 == 0 || g0
== g1)—meaning “trigger out (motor not running) or the gears aren’t changed.”
function transition(t0, c0, g0, d0, button, t1, c1, g1, d1)
{ if( allow(t0, c0, g0, d0) && allow(t1, c1, g1, d1) )
drill.fsm[map[conceptualState(t0, c0, g0, d0)]][lookup(button)] =
map[conceptualState(t1, c1, g1, d1)];
}
The details we haven’t yet provided are for initializing the FSM and defining
the function lookup, needed as a way of getting a button number from the button
name.
drill.fsm = new Array(n);
for( var i = 0; i < n; i++ )
{ drill.fsm[i] = new Array(drill.buttons.length);
// if you try an action, by default stay in the same state
for( b = 0; b < drill.buttons.length; b++ )
drill.fsm[i][b] = i;
}
function lookup(button)
{ for( var b = 0; b < drill.buttons.length; b++ )
if( button == drill.buttons[b] )
return b;
alert("Button "+button+" isn’t recognized!");
}
The alert in the function lookup will happen if we misspell a button name
anywhere; it’s a useful check.
Now we’ve finished the programming, calling every(action) will create the
FSM we wanted. The FSM will have hundreds of rows like [346, 195, 352,
340, 346, 344, 346, 346], but we need never look at them. We should use the
framework to analyze the drill’s properties rather than looking directly at the raw
data.
9.9 Conclusions
This chapter has introduced and explored the benefits of programming interaction
frameworks: general purpose programs that can run, simulate, or analyze any
device. The advantage of a framework is that it can check and measure all sorts
of useful properties—and it becomes worth doing so because we can compare
and contrast different designs very easily. Without a framework, each device is
a completely different programming problem, and it probably won’t seem worth
going to the trouble of writing high-quality program code to evaluate it.
In particular, this chapter developed a framework in JavaScript that readily al-
lows a device to be specified, simulated, and analyzed. Professional programmers
will perhaps want to redesign the framework for their favorite languages, and
there is much to be gained by doing so.
We could develop the simple framework here into a full-blown design tool.
There are many possibilities . . . it’s astounding that the user interfaces of most
devices are so routine—even a simple framework helps designers become more
certain in their processes, and in turn become more confidently creative.
10 Using the framework
Although our main interest is designing good interactive devices, we don’t have
to work with conventional interactive gadgets; things can be very different—from
lightbulbs and microwave ovens to guns and fire extinguishers. Our first example
is an old puzzle—we hope that user interfaces to gadgets won’t be puzzles, unless
they are supposed to be!
across the river. You could imagine making an interactive game that simulated the
farmer’s problem, and then the farmer’s problem would be an interactive device.
To solve the problem requires two things: that the goat is never left alone with
the cabbage, and that the wolf is never left alone with the goat—in either case,
something will get eaten! If the farmer is silly enough to leave the goat, wolf and
cabbage alone on the same side of the river, then the wolf will wait until the goat
has eaten the cabbage, thus becoming a fat goat, before eating it.
We first show this problem as a finite state diagram, with the states in a circle in
no particular order:
[State diagram of the farmer’s problem; one state is labeled Start and another End.]
It’s clearly got a lot more states than a lightbulb! In this diagram, the problem
has effectively been changed into finding a route following arrows from one circle
to another, starting at the circle labeled “Start” and going on to the finish of the
puzzle at the state “End.”
I wrote a program to generate the farmer’s graph, much like the program to
construct the drill in section 9.8 (p. 316). We don’t need it here, but the
program is available on the book’s web site, mitpress.mit.edu/presson.
Some, but not all, of the arrowed lines are one way: if you make a mistake, you
cannot get back. This corresponds to something irreversible happening, like the
wolf eating the goat if they are left alone with the farmer on the other bank of the
river.
If the farmer canoes across the river (following an arrow in the diagram), that
may create an opportunity for the goat to eat the cabbage or for the wolf to eat the
goat. If so, it is a one-way trip. Although the farmer can canoe back to the other
river bank, they can’t get back to the state they left.
10.2 Strong connectivity
You could imagine a gadget with eight labeled lights on it, representing all com-
binations of the presence or absence of cabbage, goat, wolf, and farmer on either
side of the river, and some buttons to choose what to take in the canoe. Thought
of like this, the farmer’s problem represents quite a hard-to-use interactive device.
Thinking about the farmer’s problem will help get us into useful ideas for interac-
tion programming.
The farmer’s problem is just like the problem users have with devices: users
want to get devices into certain states that achieve their goals. The farmer wants
to get their stuff to the other side of the river with nothing missing. For both users
and farmers, the problem is to get from one state to another, usually in the fastest
possible way. This is merely a matter of finding the right path through a finite state
machine.
can do anything because they can always get back to whatever they were doing
within that component.
It’s possible that a device has no strongly connected components. A very simple
example would be a sequence of states like a → b → c → d. Here, we can get from
a to b, and from b to c, and so on, but we can never get back to a, or indeed never
back to any state once we’ve left it. This simple device is connected—if we start
in the right places, we can get anywhere—but it isn’t strongly connected—we can’t
get from anywhere to anywhere.
It would be very surprising if an interactive device was not connected—it would
mean that the designer had thought of and specified states that the user could
never get to—but there are times when it is useful for a device to be connected
but not strongly connected (otherwise known as weakly connected), and this usu-
ally happens when the world in which the device operates can change. In the
farmer’s problem, cabbages can get eaten and that is one sort of change—because
the farmer’s problem is a game, the changes are merely “interesting,” but we
might want to study devices like a missile launching controller, or a burglar alarm
(so far as a burglar is concerned, the device is connected—it can go from silent to
alarmed—but it is not strongly connected—the burglar cannot get it back to silent
once it has been triggered).
Now for some important points:
Strong connectivity and strongly connected components are important for
usability.
There are standard algorithms to find connected components, though we won’t
use them in this book as it is unusual for an interactive device not to be
strongly connected.
Neither users nor testing with users can establish such properties, because it is
generally far too hard to do humanly.
Once the cabbage and the goat have been eaten, the farmer can still canoe the
wolf across the river, or cross alone, but from any of these states it isn’t possible
to go back to any state where the eaten cabbage or the eaten goat exists again.
Below, we’ve summarized the four states of this strongly connected component,
one state per line, showing who is on each bank:

Left bank        Right bank
farmer           wolf
farmer, wolf     (nobody)
wolf             farmer
(nobody)         farmer, wolf
In this strongly connected component, two of the states are, in our technical
terms, pointless, since there is only one thing that can be done: the farmer rowing
across the river. When there is only one thing to do, we should consider
designing pointless states out of the device—in fact, we could redesign this part of
the device down to two states, depending on which side of the river we want the
wolf. And then in each of those two states, there’s now only one thing that can be
done (in each state, we eliminated one of the two choices because it was pointless),
so why do we still need them? And so on. In other words, identifying pointless
states is an iterative process; the designer stops when a “pointless” state actually
has some purpose for the user—or, as happens here, it becomes a terminal state.
In our usage, “pointless” is a technical term, which helps us critique design.
But we should talk to the user—in this case the farmer—to see whether what we
think is pointless is in fact so for them. Here, the farmer might like admiring the
wolf from either side of the river; if so, we would need all the states and they
wouldn’t be pointless—that is, provided the farmer has some reason to need the
states where there is only one choice. Indeed, once we start talking to the farmer,
we might discover that really there are two choices in each state here: the farmer
can stay or row across—we failed to consider the “choice” represented by the self-
arrows.
Pointless states are discussed in section 6.3.3 (p. 184).
Figure 10.1: A subgraph of the farmer’s problem, namely the strongly connected com-
ponent containing the start and end states—all the possible states where nothing gets
eaten. As usual, we have not shown the self-arrows.
Box 10.1 Why personal video recorders? Why do I keep mentioning video recorders when
everybody today uses DVDs or computers to watch and record their TV? I’m not just talking
about video cassette recorders (VCRs), which were the original mass-market personal video
recorders (PVRs). Or why does this book keep mentioning PVRs when the world is going
to fill up with robots and implants?
Video recorders suddenly got more complicated in the late 1980s when their mechanical
controls were replaced by embedded computer-based control. Suddenly the user interface
became cheap, little rubber buttons, and the computers could “do anything.” The designer
lost all affordance that the mechanical constraints had imposed.
As anything could happen, anything did. The PVRs became more complex and more
arbitrary from the user’s point of view. The user population had a lot of conceptual catching
up to do as the complexity initially overwhelmed them. Today not many people have problems
(or at least problems they admit to) using video recorders: we all use MP3 players,
camera phones, and PCs, and we encounter complicated web sites every day. That’s how it is.
Even though my first personal video recorder (the JVC model discussed in several places
in this book) has fewer than 30 main states, it still baffles me as a user. But it baffles
me more that its designers were unable to structure its few states in a way that made
it easier to use.
Today I don’t think there is any immediate panacea: “this is how all PVRs (or whatever)
should be designed.” I think that analyzing their designs in the way this book promotes will
give designers the tools to build what they really want to build; they’ll be able to see how
to optimize and modify their designs to better support users’ tasks. And insights here are
going to help design all those robots and implants—they’ll still have states their users are
baffled by, and probably far more than 30.
The VCR may now be obsolete, but I’ve used this example for very good reasons: see box 10.1, “Why
personal video recorders?” (this page). A VCR is just a gadget that can record
and play back media; it could be an MP3 player, a DVD recorder, a computer, or
anything yet to be invented—we are interested in the principles, not the specific
product (but it helps to have a specific product in mind). The preferred term is
PVR, short for personal video recorder—since we need an abbreviation that does
not commit us to a particular technology.
We mentioned this PVR in section 3.4 (p. 66), and section 5.4 (p. 145).
We first show this PVR state machine drawn as a circular graph. The advantage
of using a circular drawing is that no two lines between different states can ever
be drawn on top of each other (all lines must be at different angles), so we can
be certain that we are looking at everything (unless in some states more than one
button does the same thing, in which case their lines would coincide). It helps to write
the names of the states too, but there are so many that the drawing gets quite
messy!
Even though the circular diagram in figure 10.2 (next page) is messy—and we
haven’t put the states in a useful order around the perimeter of the circle—you can
see little design facts, such as how the two states “on with tape in” and “off with tape in”
seem very easy to get to from almost anywhere—you can clearly see the cluster of
arrow heads hitting each of these states from almost every other state.
Figure 10.2: Transition diagram for the JVC PVR drawn as a circle. Circular embeddings
have the advantage that no lines coincide—they are unambiguous unless there are
several lines between the same pairs of states. Tools like Dot can be asked to try to
minimize line crossings, to draw neater diagrams.
An alternative way of drawing the same graph is to rank the states, using the
technique used in figure 10.1 (p. 330) for the farmer’s problem. Figure 10.3 (fac-
ing page) shows a ranked transition diagram for the JVC PVR. To draw it, we
chose the initial state, which for this machine is off with the tape
out, and drew that state at the far left. Then each column of states drawn is the
same distance, that is, the same minimum number of button presses, from the ini-
tial state. Thus the further right you go in this diagram, the harder things are to do
starting from the initial state. One could draw the ranked graph taking any state
as the initial state; it would then show the user effort getting to any state from
that chosen state. Indeed, we could use the shortest paths matrix to find the most
eccentric state, and hence draw (what for most users would be) the worst-ranked
graph.
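A sketch of finding that most eccentric state from the shortest paths matrix, namely the initial state from which the hardest-to-reach state costs the most presses:

function mostEccentricState(apsp)
{ var worstState = 0, worstCost = -1;
  for( var i = 0; i < apsp.length; i++ ) // candidate initial state
  { var cost = 0;
    for( var j = 0; j < apsp.length; j++ ) // hardest state to reach from i
      if( apsp[i][j] > cost ) cost = apsp[i][j];
    if( cost > worstCost ) { worstCost = cost; worstState = i; }
  }
  return worstState; // rank the graph from here for the worst-case picture
}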
The long tail of states, each increasingly hard to reach, makes this JVC device
hard to use—or, rather, it makes some things hard to do: the states far out to the
right are unusually hard to get to.
This “stretching” of the device design may or may not make sense for a PVR,
depending on what those tail states are and how they relate to one another.
Figure 10.3: A ranked embedding for the JVC PVR. Off with the tape out is at the far
left. State names and button names are not shown to avoid visual clutter. To stay
uncluttered yet more informative, make the diagram interactive (e.g., use an HTML
imagemap) and show the state names as the cursor is moved over the picture.
Certainly, if this device were an airplane, you’d want the state “on the ground but
with the wheels retracted” to be very difficult to get to, but I can’t see why a PVR
needs to be so lopsided! Probably the designers never drew a ranked embedding and
therefore never noticed any such issues.
Figure 5.7 (p. 138) shows a ranked embedding for a Nokia mobile phone, like
figure 10.1 (p. 330), but rotated to go top down, rather than right to left.
Consider the state “pause recording, but stop in 240 minutes.” It is the state at the
extreme right of the last diagram we drew, figure 10.3 (previous page): it’s not only
a long way from off, it’s as far away as it could be from many states!
From this state → To this state
fast forward → pause recording, but stop in 240 minutes
off, with tape in → pause recording, but stop in 240 minutes
off, with tape out → pause recording, but stop in 240 minutes
on, with no tape → pause recording, but stop in 240 minutes
play a tape fast forward → pause recording, but stop in 240 minutes
pause playing a tape → pause recording, but stop in 240 minutes
play a tape fast backward → pause recording, but stop in 240 minutes
play a tape → pause recording, but stop in 240 minutes
rewind a tape → pause recording, but stop in 240 minutes
We can measure the average number of button presses to get from one state to
another; it’s 3.9. Given that one state seems to be out on a limb of length 11, it’s
interesting to work out the best we could possibly do.
The PVR has 8 buttons and 28 states. One state (namely, the one you start from)
can be reached with 0 button presses, because you are already there; 8 more states
can be reached with 1 press, since we have 8 buttons available to get to different
states. In theory, in each of those 8 states, we could reach another 8 states—a total
of 64 states—in just one more press. Having accounted for 9 states in 0 or 1 presses,
that leaves only the 19 remaining states the PVR needs; these could therefore all be reached in 2
presses.
The average cost is the total divided by the number of states, (0 × 1 + 1 × 8 +
2 × 19)/28 = 1.643. (A quick way to estimate this value is to calculate log8 of 28,
which is 1.602.) However we look at it, this is a lot less than the 3.9 the device
achieves.
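The same best-case figure can be computed for any device. Here is a sketch that fills successive shells of states: 1 state in 0 presses, at most b in 1 press, at most b×b in 2 presses, and so on, until all n states are accounted for:

function bestAverageCost(n, b)
{ var total = 0, accounted = 1, shell = 1, presses = 0;
  while( accounted < n )
  { presses++;
    shell = shell*b; // at most b^presses states first reached now
    var reached = Math.min(shell, n-accounted);
    total = total+presses*reached;
    accounted = accounted+reached;
  }
  return total/n;
}
// bestAverageCost(28, 8) = (0*1 + 1*8 + 2*19)/28 = 1.643, as computed above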
We can conclude the device was not designed to minimize button presses to do
things. One would have expected some other advantage for the JVC design deci-
sions, such as the buttons more often meaning the same things, like Play always
meaning play.
10.3.2 Indicators
We can extend a device description to include indicators: descriptions of what
lights, icons, or words the device shows the user in each state. For example, the
“on” light comes on in all states when the PVR is switched on, but the “tape in”
indicator is only shown when there is a tape in—and there can be a tape in
whether the device is on or off. Naturally, in its two off states the on
indicator will be off.
Indicators were introduced in chapter 8, “Graphs,” section 8.1.3 (p. 232), in the
context of coloring graphs.
Handling indicators allows us to do further sorts of insightful analysis. I didn’t
introduce indicators earlier because for simple devices, like lightbulbs and drills,
states and indicators are pretty much the same thing.
334
10.3. A recording device
When buttons are pressed, a device should give feedback that something has
happened. Typically, pressing a button not only changes the state, but it also
changes some indicator.
We can define an ambiguous transition as one in which no indicators change.
The user cannot be certain that they pressed the button hard enough—or, perhaps
worse, they cannot be certain that they understand the device, as it appears to
have done nothing when they thought it would do something.
The ambiguous transitions for the JVC PVR are shown below. For instance, if
the JVC model is on with a tape in and you make it go fast forward, it won’t tell
you anything has happened—but this is only one of several ambiguities.
fast forward → on, with tape in
on, with tape in → fast forward
on, with tape in → rewind a tape
play a tape fast forward → on, with tape in
play a tape fast backward → on, with tape in
rewind a tape → on, with tape in
To get these results, I added a new field indicators to the framework specifi-
cation. Each state can now have various indicators, using descriptive JavaScript
strings such as "on" or "tape in".
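Given that field, finding the ambiguous transitions is a simple scan. This is a sketch; it assumes each state’s indicator list is stored in a consistent order, so two lists can be compared directly:

function ambiguousTransitions(device)
{ var result = [];
  for( var i = 0; i < device.fsm.length; i++ )
    for( var j = 0; j < device.fsm[i].length; j++ )
    { var next = device.fsm[i][j];
      if( next != i // the state changed...
          && device.indicators[i].join() == device.indicators[next].join() )
        result.push([i, j, next]); // ...but no indicator did
    }
  return result;
}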
If a device gives no feedback when a button is pressed—as in the six cases
above, or even in very simple situations, as discussed in section 3.14 (p. 86)—the
user may easily make an overrun error, a type of error discussed in
section 10.4.1 (p. 341).
There is no need for the indicators to be real indicators on the device. The PVR
has no special indicators for fast forward and rewind states, but you can hear the
PVR whirring madly as it does the fast rewind or fast forward actions, so it has
indicators of a sort. Whether you treat fast forward whirring as an indicator is
moot; we can tell from the table above that the basic device needs such indicators.
More generally, indicators could be anything the designer is interested in. The
PVR has no indicator to say that it has a tape in that can be recorded on—the
user has to experiment to find out, but we could make this a conceptual indicator
in our device specification. Then if we find that users rely on this indicator, this
would suggest a good way of improving the design, namely, make the conceptual
indicator a device indicator.
The device specification needs this extra information, so each state has a list of indicators
specified:
indicators: [["on", "tape"],
["tape"],
[],
["on", "tape"],
...
]
We can look at the way the device uses its indicators to find out how likely but-
tons are to change them. Do buttons consistently affect indicators? Some buttons
have apparently helpful names like Operate and Play ; presumably they switch the
On and Play indicators on and off . . . well, let’s see.
A simple program running on the specification generates the following text:
For device JVC HR-D540EK PVR
Play , when it does anything, it always ensures: on, tape
and 7.69% of the time it ensures: record
Operate , when it does anything, 3.57% of the time it ensures: tape
Forward , when it does anything, it always ensures: on, tape
Rewind , when it does anything, it always ensures: on, tape
Pause , when it does anything, it always ensures: on, pause, tape
and 8.33% of the time it ensures: record
Record , when it does anything, it always ensures: on, tape
and 5.26% of the time it ensures: record
Stop/Eject , when it does anything, 3.85% of the time it ensures: on
tape in , when it does anything, it always ensures: on, tape
Analyses like these are easier to read when we use the descriptive fields already
in the framework; the first line says these are results for the JVC HR-D540EK PVR,
which is just the device.modelType text.
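The program behind this output is little more than counting. A sketch follows, where buttonName stands for whatever descriptive button-name field the specification provides (an assumption here, not a field shown above):

function reportButton(device, j, buttonName)
{ var counts = {}, active = 0;
  for( var i = 0; i < device.fsm.length; i++ )
  { var next = device.fsm[i][j];
    if( next == i ) continue; // the button does nothing in this state
    active++;
    for( var k = 0; k < device.indicators[next].length; k++ )
    { var name = device.indicators[next][k];
      counts[name] = (counts[name] || 0)+1;
    }
  }
  var always = [], sometimes = [];
  for( var name in counts )
    if( counts[name] == active ) always.push(name);
    else sometimes.push((100*counts[name]/active).toFixed(2)
      +"% of the time it ensures: "+name);
  var text = buttonName+", when it does anything, it always ensures: "
    +always.join(", ");
  if( sometimes.length > 0 ) text = text+"; and "+sometimes.join("; ");
  document.write(text);
}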
The words “record,” “on” and so on here are indicator lights (or words that
light up in the LED screen) on the actual device. Alternatively, we can do the same
analysis but use our own conceptual indicators. For example, although the PVR
does not say so, we know which states make the tape rewind; so we can invent an
indicator to mark those sets of states. Here is the sort of result we can get:
For device JVC HR-D540EK PVR (conceptual indicators)
Play , when it does anything, it always ensures: on, tape
and 15.38% of the time it ensures: auto off record
Operate , when it does anything, 3.57% of the time it ensures: tape
Forward , when it does anything, it always ensures: fast forward, on, tape
and 33.33% of the time it ensures: play
Rewind , when it does anything, it always ensures: on, rewind, tape
and 33.33% of the time it ensures: play
Pause , when it does anything, it always ensures: on, pause, tape
and 8.33% of the time it ensures: record
Record , when it does anything, it always ensures: on, tape
and 5.26% of the time it ensures: record
Stop/Eject , when it does anything, 3.85% of the time it ensures: on
tape in , when it does anything, it always ensures: on, tape
The table shows that when the Play button does something, it will leave the
PVR with the on and tape indicators on—we have the condition “when it does
anything” since if the device is off, it won’t do anything at all, and that usually
isn’t worth reporting! What Play does is not very surprising, but some of the other
button meanings are.
Box 10.2 What “always” and “never” should mean “You can always press Home to get to
your home page”—that’s an instruction seen on Microsoft’s webTV, a domestic interactive
TV system; but it isn’t true that it always works. I needed to use the TV’s other remote
control as well to get to the home page!
It’s fine to write “always” in manuals and user help, and in principle if something is always
true, the device will be simpler. But make sure that you really do mean always. If so, it
simplifies the manuals and reassures the user; it makes everything easier. If something isn’t
“always” but only “almost always,” the device will seem especially unreliable.
We can see that the Operate button switches the JVC device on only 44% of the
time—other times, indeed most of the time, Operate makes the device
inoperative!
Why does pressing Rewind sometimes cause the device to play? It’s because if
the device is already playing when Rewind is pressed, the device starts to play
backward (so-called review).
If Rewind sometimes leaves the device playing, why does Pause sometimes leave
it recording?
The Record button seems to make the device record hardly at all, only 5.26% of
the time it is used.
A designer should read a list of results like this carefully. Perhaps the designer
should also annotate the results table with the rationale for the interesting features
(such as the ones we picked out for comment above). A routine extension of the
framework could store the designer’s annotations and later report if a percent-
age changes—such a change would indicate the design had been altered and the
original rationale needs reviewing.
One reason why Record doesn’t consistently make the device record is shown
in an answer to a user’s question, shown on p. 353. If a user wants to stop the
recording in 120 minutes, they should press the Record button 5 times. In none of
those presses, covering at least 5 states, does the Record button start recording.
Box 9.2, “Button usage” (p. 310) explains the Record button usage further.
If we get some experimental data, our information for the design can be much
more useful. When we said “44% of the time” (or whatever), we really meant in
44% of the states. That these states are used for the same amount of time as other
states is, of course, unlikely, though it is a good first approximation. A device might
be off for most of its life, but a user will never use a device when it is off! Thus we
ought to collect some timings from real use so that our percentages give us data
about real use rather than guesses. Nevertheless, when we are designing a new
device, informed guesses are better than nothing.
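A sketch of such weighting, where weights[i] is the measured (or guessed) proportion of use time spent in state i, summing to 1:

function weightedChanceButtonDoesNothing(device, weights)
{ var b = device.fsm[0].length, p = 0;
  for( var i = 0; i < device.fsm.length; i++ )
  { var selfEdges = 0;
    for( var j = 0; j < b; j++ )
      if( device.fsm[i][j] == i ) selfEdges++; // a press does nothing here
    p = p+weights[i]*selfEdges/b;
  }
  return p; // equal weights of 1/n recover the unweighted figure
}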
More uses for weighting with experimental data are given below, in
section 10.7.5 (p. 358); in particular, we suggest comparing expert designers’
data with ordinary users’ data to see whether we can help users become more
proficient.
Beguiling behavior can be a way for manufacturers to make money. For ex-
ample, a user may be lulled into thinking some behavior of the device is normal,
but very rarely it might imply some avoidable cost for the user. My car, a Land
Rover 90, is a case in point. It has a parking brake (hand brake) with a warning in-
dicator. The warning light, which is a symbol of the parking brake mechanism, always
comes on when the parking brake is on and always goes off when the parking
brake is released. The user does not want to drive the car when the parking brake
is on or partially on, so this indicator light is a helpful warning. The light worked
like that for years, and it beguiled me into thinking that was exactly what it did.
Then, one day, it stayed on. Yet the parking brake still worked perfectly, so I be-
lieved it had to be an electrical fault. Indeed, we have had wiring faults before, so
we pretty much ignored it.
When we took the car for its service, it needed a new set of disk brakes (both
disks and pads)—a very costly repair. We then learned that what we thought was
the “parking brake” indicator is in fact a “general brake” warning light. In other
words, around 99.99% of the time it means nothing unusual but around 0.01%
of the time it means you have a serious problem that needs immediate attention.
Why doesn’t Land Rover use another indicator for this rare problem? Or why not
have some simple electronics (there’s plenty there already) to make the light flash
and maybe a noise too, so it clearly is a serious warning? Or, as there is a sepa-
rate anti-lock brake system (ABS) warning light that is always a brake malfunction
warning light, why not use that for all brake failures? Why not have both lights
come on together? Why not an LCD text display? Or, thinking differently, why not
have the original indicator come on when any brake is used: then the user is also
trained that the indicator refers to the main brakes as well, and the percentage 99.99%
changes dramatically—the issue of beguiling behavior disappears. Whatever so-
lution is chosen, and there are many, it needs to be different from the light the user
has learned from long experience means something else.∗
That last phrase, “the user has learned,” needs rewriting: it should say, “what
the user has been trained by the design”—it’s a design issue, not a user issue. Here,
∗ By design, Land Rover parking brakes are completely separate from the main brakes, so their fail-
ure modes are independent—they work on the prop shaft, not on the wheels. So why use the same
indicator?
the car manufacturer saving the cost of installing a clear fault indicator ensures
that from time to time they will sell brake parts at significant profit. The way the
device has been designed, brake failure has—conveniently for the manufacturer—
become the user’s fault for not understanding a warning light the user manual
(but not the warning itself) explains.
The example in box 6.4, “Bad user interfaces earn money” (p. 191) is another
case of rare behavior—but behavior the designers of the device surely know
about—leading to surprising costs for the user.
2. Number of states: 6. How many things can be done with this device?
3. Number of buttons: 5. How many buttons (or other actions) are available to access all the
states?
5. Number of self-edges: 7. A self-edge goes back to the same state; it corresponds to buttons
or actions that do nothing.
7. Probability a button does nothing: 0.23. Chance a random button press in a random state
does nothing. The larger this figure (to a maximum of 1), the safer—or more
frustrating!—the device will be to use.
8. This device is strongly connected. If the device is strongly connected, we can always get
from anywhere to anywhere; if not, then there are some traps or irreversibilities that the user
cannot get out of.
9. Average cost to get somewhere from anywhere else: 1.47. How many button presses,
on average, does it take to get anywhere?
10. Average cost to get somewhere from anywhere, including the same place: 1.22. If
you include trying to get to the same place (which takes no presses at all), of course the
average cost is less.
11. Worst case cost: 3. The most difficult case of getting from anywhere to anywhere. In a
complete device, this worst case would be 1.
12. Average cost to recover from 1 button press error: 1.3. If in a random state a button is
pressed at random, how many button presses on average does it take to get back? Compare
this cost with the mean cost; if it is higher, most button presses are “one way.”
13. Worst case cost to recover from 1 button press error: 3. If in a random state a button is
pressed at random, what’s the worst number of button presses it takes to get back? If the
device has an undo key, this figure would be 1. Put another way, if you are in “the right
place” but accidentally press a button, this is the worst cost of getting back.
14. Average cost to get anywhere after 1 random button press: 1.36. A random press can
give you a bit more information, but has it made your task (whatever it was) harder?
Compare this figure with the average cost between states; typically it will be higher, because
a random button press will tend to take you away from where you want to go.
15. Percentage of single-press errors that can be undone directly: 33.33%. If a button is
pressed by mistake, how often can you get back (undo the error) in just one step? If the
device has an undo key, this figure would be 100%.
16. Average cost of an overrun error: 0.2. If the correct button is accidentally pressed twice
(not once), how hard is it to get back (undo the overrun error)? If the device has an undo key,
this figure would be less than 1; if the device were idempotent (when a button gets the device
to a state, it keeps you there), the figure would be 0.
17. Worst case overrun error cost: 1. If an overrun error occurs, what is the worst cost of
recovering?
18. Average cost of a restart recovery for overrun error: 1.4. If the correct button is
accidentally pressed twice (not once), how hard is it to get back (undo the overrun error) if
the user switches the device off first to restart it?
19. Worst case restart overrun error cost: 4. If an overrun error occurs, what is the worst cost
of recovering by restarting?
20. Independence probability: 0.33. The probability that a user requires more than one button
press to do anything. The smaller this number, the “easier” or more direct the device is to
use.
All the above text was generated automatically, using the device specification
from our framework. The complete JavaScript code to do it is on the book’s web
site—for reasons of space (and boredom), we won’t give the full details of gener-
ating all this text here.
Many of the measures can be fine-tuned depending on exactly what a designer
wants. For example, the list above gives the percentage of single-press errors that
can be undone directly; this means that if the user presses a button by accident,
a third of the time they can recover from this error in one more press. But this does
not count accidental presses of buttons that do nothing, since those cause no error
that needs recovering from. We could count those cases as well, if we wanted to,
and then change the average accordingly.
Box 10.3 Handling infinity properly If a device is not strongly connected, some values in
the shortest paths matrix will be ∞. We have to be careful working out averages when this
is a possibility, because most programming languages don’t handle infinity correctly. In our
JavaScript, we used any value larger than n to represent ∞, so strictly the program code
given above needs tests to see whether apsp[i][j] > n, and if so, to drop out of the loop
and report an infinite result. Perhaps easier is to have a single check to determine whether
a device is strongly connected and, if it isn’t, to only report properties that make sense
(however, average costs of overrun errors do make sense even if a device is not strongly
connected).
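The overrun-costing loop that the next paragraph walks through might look like the following sketch (reconstructed to be consistent with the restart-recovery code shown later; the test against n follows the box’s advice):

cost = 0;
worst = 0;
for( var i = 0; i < n; i++ )
  for( var j = 0; j < b; j++ )
  { var newState = device.fsm[i][j]; // button j pressed in state i
    var overrun = device.fsm[newState][j]; // ...and accidentally pressed again
    var c = apsp[overrun][newState]; // cost of getting back
    if( c > n ) continue; // pseudo-infinity: handle as the box advises
    cost = cost+c;
    if( c > worst ) worst = c;
  }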
Our definition of fsm gives us the next state when button j is pressed in state i,
so device.fsm[i][j] in the loop above gives us the new state we get to if we press
button j. We record this new state in the variable newState. But if we press button j
again—that’s the overrun—we would end up in state device.fsm[newState][j].
This is the state the user overruns to, and we record it in the variable overrun.
Then apsp[overrun][newState] will be the cost of getting back, which we add to
cost, so that in the end dividing cost by n*b will give the average.
After the two nested loops, we print out the answers:
document.write("Average cost of an overrun error: "+(cost/(n*b)));
document.write("Worst case cost of an overrun error: "+worst);
To find the cost of getting back to where we wanted to be is merely a case of
looking up the cost of the shortest route from overrun to newState: this is the
value in apsp[overrun][newState], which we can conveniently store in the
variable c. We then use c to add to the running cost and to keep the variable worst
tracking the worst value we’ve seen so far.
All the other code is similar. For example, to do an undo costing, we get the
costs of getting back from a button press. The inner loop would start with code
like this:
var newState = device.fsm[i][j];
var c = apsp[newState][i]; // cost of getting back to state i
This simply gets the costs of getting back to state i if button j has been pressed. The
two further lines we haven’t shown simply add the costs and find the maximum,
as before.
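For completeness, the whole undo-costing loop, with those two further lines included, might read:

cost = 0;
worst = 0;
for( var i = 0; i < n; i++ )
  for( var j = 0; j < b; j++ )
  { var newState = device.fsm[i][j];
    var c = apsp[newState][i]; // cost of getting back to state i
    cost = cost+c; // add the costs...
    if( c > worst ) worst = c; // ...and find the maximum, as before
  }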
In the overrun code, above, we simply took the shortest path, wherever it went. Now, if we want
to assess the cost to the user of recovering from an error by switching off and on,
we use the shortest path from the error to Off, and then from Off to where we
wanted to be. Here’s how to do it:
cost = 0;
worst = 0;
var offState = d.startState; // we could choose any state to be ‘off’
for( var i = 0; i < n; i++ )
for( var j = 0; j < b; j++ )
{ var newState = d.fsm[i][j]; // button j in state i gets to new state
var overrun = d.fsm[newState][j]; // the state after an overrun
var restartCost = apsp[overrun][offState]+apsp[offState][newState];
cost = cost+restartCost;
if( restartCost > worst ) worst = restartCost;
}
The code is exactly the same as before, except apsp[overrun][newState] is re-
placed by apsp[overrun][offState]+apsp[offState][newState], which is the
cost of going from the overrun state to newState, where you wanted to be, via the
state offState.
For the running example, the average restart cost is 5.75 and the worst case is
13 (and it would be much larger in practice, since this assumes that the user makes
no mistakes and knows the best way to do it); this is all very much worse (as we
expected) than trying to recover from an overrun by going “straight back” rather
than via Off. So it gives an indication of the extreme cost to a user who doesn’t
know the more direct route.
The JVC PVR has some curious properties. If we have an overrun error (for
instance, we want to play a tape, but we press Play once too often, perhaps because
we didn’t notice when it had started to play—perhaps it is too slow or doesn’t
provide decent feedback), then it takes 2.3 presses on average to get back to where
we wanted to be (or 3.3 including the error). Yet to get from anywhere to anywhere
takes on average 3.9 presses: an overrun error on this JVC PVR is practically the
same as getting completely lost—an overrun error puts you about as far away on
average from where you want to be as you can be. Moreover, a completely random
button press only takes 1.8 presses to recover (on average)—or 2.8, including the
error. But this is easier than an overrun error! There are three main reasons for this:
(i) some random presses do nothing and therefore cost nothing to recover from;
(ii) most random presses don’t get you as far away as an overrun; (iii) if a button
worked to get you to this state, it is likely to work to get you away from it (in other
words, overrun errors are likely).
The remote control for this PVR is completely different from the unit itself. We
haven’t space to show it here, but it’s very obvious from any drawing of the transi-
tion diagram. Making it different doubles the learning the user has to do to make
good use of the device—and almost doubles the size of the user manual.
For a similar point on remote controls of televisions, see section 3.2 (p. 63).
It’s worse! On the JVC you’re better off not playing with the buttons “to see
what they do.” But you’re only better off if you know what it is doing, and that
would require some indicator lights to tell you what it is doing.
On the JVC, then, the user is in a quandary: you can’t always tell what state it
is in, and experimenting to find out makes any task harder. Of course, to be fair,
once you’ve experimented and found out where you are, you can now use the JVC
properly, which you can’t do when you don’t know what it is doing.
Program code to do the random pressing is given in section 11.1.5 (p. 376).
Figure 10.4: Compare this simplified PVR diagram with the corresponding diagram,
figure 10.2 (p. 332), for the original design.
Note something very important here: we can prototype a device (say, the first
JVC PVR), evaluate it, and see some design issues (such as the outlying states);
then we can improve it automatically on the basis of our analysis (here, collapsing
all those states).
The result, shown in figure 10.4 (this page), looks like an interesting improve-
ment. You can also see from the diagram that the device is “tighter” than the orig-
inal, yet it provides the same functionality but without all the timeouts. Whether
it really is an improvement for users we should leave to experiments.
For all I know, there are enough users who like delayed pauses—and few users
who get frustrated by their device not doing anything for 240 minutes, and then
surprising them. The point is, we’ve got two automatically-related designs, and
we can test them out. Moreover, if during tests we discover some new idea that
really has to be implemented, we can still have two designs: we can continue
regenerating the automatic variations. In a normal design process, without the
automatic support, as soon as you get a new idea, you have got a lot of work to do
to keep all the versions of the device consistent. In our approach, that consistency
comes for free.
Here is part of the same sort of automatically generated summary, this time for the farmer’s problem:
Number of buttons: 4
Number of edges (excluding duplicates): 92, which is 12.17% complete
Number of self-edges: 46
Number of duplicate edges (excluding self-edges): 0
Probability a button does nothing: 0.41
This device isn’t strongly connected.
Percentage of single-press errors that can be undone directly: 48.21%
Average cost of an overrun error: 0.59
Average cost of non-trivial overrun errors: 1
Worst case overrun error cost: 1
Worst case cost for restart after an overrun error: ∞
Independence probability: 0.88
There are lots of things to notice here. The framework program generating the text
was designed for describing ordinary interactive devices, so “button” is hardly
the right word to use. “Action” would be preferable or, better, the action string
should be used from the framework specification. The number of self-edges looks
very high, but most of them are actions that do nothing: for example, if there is
no cabbage left because the goat ate it, then the action “take the cabbage” does
nothing, and it is counted as a self-edge. Because this “device” is not strongly
connected, many of the average and
maximum costs are infinity, and they are automatically not shown in the program-
generated summary of properties.
The probability (0.41 or 41% of the time) that a button does nothing means,
in effect, that if you shut your eyes and wished, “I want to take the goat across
the river,” then part of the time you couldn’t do it, for instance, because the goat
was on the other side, or was eaten—that something had gone wrong you hadn’t
noticed with your eyes shut. It seems a high probability, but the 41% assumes
you are in any state chosen with equal probability; if you are trying to solve the
problem, you are unlikely to be in a random state—ideally, you should be in a
state in, or close to, the strongly connected component that contains the solution.
If you are in that strongly connected component, nothing has been eaten, so the
probability a “button” does nothing would be lower than 41%. In other words, the
moral of this insight is that the probabilities (or percentages) would make better
sense if the states are weighted by how likely the user is to be in them. For this
“device,” as it happens, we can estimate those probabilities better than for most
devices.
You might have expected the worst overrun error to be infinite, because once
something has gone wrong (like the goat being eaten) there is nothing the farmer
can do about it. Infinities seriously affect usability, but here the worst-case overrun
cost is only 1 and the average is only 0.59! These unexpected results come about
because the farmer’s problem has an interesting structure:
If the farmer’s action stays within the strongly connected component shown in
figure 10.1 (p. 330), then every action is reversible. Any overrun takes the
farmer back to the previous bank, and the overrun can be corrected simply by
doing the action again. In fact, this is true for any action within any strongly
connected component.
If an action takes the system out of the component, the action is immediately
irreversible. For example, the cabbage is eaten. Doing the same action again, which
would be an overrun, returns the farmer to the same bank as before the
action—but without the cabbage. The overrun can be corrected by repeating
the action, as this takes the farmer back. Again, for this unusual “device,” this
happens to be true for any action leaving a strongly connected component.
Once the cabbage or the goat is eaten, some actions, such as the farmer carrying
the cabbage in the canoe, will have no effect. These actions in these states are
self-loops. A self-loop overrun does nothing, so these overruns can be
corrected by doing nothing (0 actions); thus the average cost of an overrun error
correction is less than 1.
The moral of this story is that analytic measurements, such as the cost of correcting
an overrun error, have to be interpreted carefully. Measurements that sound sensi-
ble are not sensible for all possible devices and user tasks. Moreover, devices that
are not strongly connected generally have special problems that need examining
very carefully.
If learning a system means knowing every feature, then the traveling
salesman tour gives the least number of actions a user must take to get to know
every feature.
If learning a system means knowing what to do in every state, then the
Chinese postman tour gives the least number of actions a user must take to try
every action.
We want user manuals to be well written and easy to understand. If we
generate user manuals automatically from a device specification, then the
length of the manual is a good measure of how much a user needs to
know—and the time the user takes to learn a system will be proportional to
this.
The raw measures of the salesman or postman tours visit every feature, so typi-
cally both of these answers will be embarrassingly large numbers. A more realistic
measure can probably be obtained by first deciding what features the user needs to
know and only measuring those. Indeed, if the device is well structured, then the
user does not need to know it all; instead, they need to know the rules its design
is based on.
expected if there was no overlapping allowed (or 13.5, if we hijack one of the four
buttons to be the Reset button).
De Bruijn sequences are special cases of Chinese postman tours on
special graphs that have Euler cycles. Chapter 8, “Graphs,” defines these terms.
Figure 10.5: More properties and figures, here helping compare two devices. As always,
all are generated by program from the device specifications. When you design systems,
you should work out what analyses you are interested in to help the device’s users do
(and enjoy doing) what they want to do. Obviously the tables you generate would be
filled with the numbers and comparisons that are relevant to your users and tasks.
playing a tape—better than I had ever tried. I thought I understood the device re-
ally well, as after all I had reverse-engineered it to get the specification! To get the
device off with the tape out, which I had done many times, I had always pressed
Stop/eject to stop the tape playing and then a second time to eject the tape, then
finally pressed Operate to switch off.
But my program’s two-press how-to answer was, “You are playing a tape. How
can you off with tape out? Press Operate . Press Stop/eject .” Easily generated and
ghastly English, maybe, but still insightful (we’ll see how to do it in a minute). If
the JVC is off when you eject the tape, it stays off, and switching the device off also
stops the tape playing. So you can do in two presses what I had always done in
three.
There are several ways to ask and answer how-to questions. We could require
an exact and detailed question, such as: “how do I get the PVR to auto-stop record-
ing in 210 minutes?” Here, we would imagine the user selects the desired goal
state from a long list (a menu) of possible states. This sort of question does rather
beg another: if the user has to specify exactly what they want to do, why doesn’t
the device just do it, rather than working out what the user has to do? I suppose
it could be useful for training users who are supposed to know how to use the
device, who want to use it faster than scrolling through a potentially long menu
of states, or who (like a car driver) cannot take their eyes off what they
are doing to read a menu—they want to learn what buttons to press for next time.
Another way is to break the question down into “tasks.” This enables the user
to ask how to change part of the system’s state, such as: “how do I get the PVR
to pause (leaving whatever else it is doing as unchanged as possible)?” In this
example, the user would only select pause from the menu’s shorter list of choices.
The subtasks correspond to the indicators we introduced earlier. For a PVR, there
will be a pause indicator, and it will either be on or off. The user can now ask
the device, as it were, “How can I change the pause indicator from its current off
status to being on?”
This then begs another question: why not redesign the device so that its
indicators are buttons? If the indicators tell the user what they want to know, why
not also make them controls so the user can do what they want to directly? Thus
the Pause button could have a light in it; pressing the button (generally) switches
the light on or off. Well, that’s a possibility we won’t explore here—but it’s a good
example of how working out how to help the user results in better, or at least
stimulating, design ideas.
In general, we won’t know whether the user wants to learn how to use a device
or really just wants to use the device. What does the user want when they ask a
how-to question? Do they really want to know how to do something—maybe so
that they can learn the answer and get more proficient at using the device—or do
they just want to get a job done?
Here is one solution, which works on a screen-based computer simulation (or a
web site): we display a dialog box that provides both options. Note how we’ve
summarized the question the user posed: it would just be compounding things if
the user made a mistake asking the question but didn’t notice they were given the
right answer to the wrong question!
Figure 10.6: The user recently visited states 1–5, but they are really trying to get to an
intended goal state, shown top right in the diagram, as efficiently as possible. The user
wonders why they aren’t at the desired goal state already. What did they do wrong?
(For clarity, we aren’t showing all state transitions, except those along the user’s path.)
Figure 10.7: The shortest path from state 5 to the user’s goal is shown schematically
by the wiggly arrow—the shortest path may go via many other states. The user could
follow this arrow to get to their goal efficiently. (For clarity, we aren’t showing the other
states.)
If the user says that they do want to know how to do it now, we run the how-to
question answering program for them—because if they want to get to this state,
the how-to answer from the current state is the best way of doing it. Also, the user
is likely to be familiar with the how-to answers and the style of the dialog boxes
that they generate.
The idea works on the farmer’s problem—in fact, it’ll work on any problem.
Suppose the user has made a mistake trying to get everything to the righthand
bank, but the goat has been eaten by the wolf. The user asks why they haven’t
succeeded; here’s a possible answer:
You asked: “Why aren’t they all safely on right bank?”
Instead of taking “cabbage,” you should have taken “goat.”
You cannot recover from this error without resetting.
Do you want to reset now or continue?
Reset Continue
Here we can see that one of the possible complications of answering questions is that
there may be no shortest path, and this needs to be clearly explained to the user.
Since the question and answer above is only running on a simulation of the real
problem, it is possible to provide a reset option, but that might not be possible on
some real devices.
While how-to questions are based on the shortest path from the current state to
the desired goal state—what’s the best way to get from one state to the other—
why-not questions are answered by going back over the recent path the user took to
the current state. The four figures 10.6–10.9 help explain the key points.
Figure 10.6 (facing page) schematically illustrates how the user tries to reach the
goal they have in mind. In the figure, the user has most recently visited five states,
numbered 1–5, but they were trying to get to the goal state. Why aren’t they where
they want to be?
Finding the best answer to the why-not question means going back through ear-
lier and earlier states along the user’s recent path, until the best answer is found.
There are many potential answers to a why-not question. Figures 10.7–10.9 show
schematically some of the different possibilities.
Figure 10.8: Perhaps going to state 5 took the user further from the goal? A shorter
path to the goal was from state 4. This would certainly be the case if the last user
action had done nothing—simply taking the user back to state 4.
Of course, it isn’t obvious what “best” means for the user—generally, the best
answer won’t simply be the shortest path in terms of counting user actions. An-
swers could be weighted by how long the user took over each transition; the longer
the user takes, the “harder,” presumably, the transition was for them, and so the
path through that transition should be considered longer. For example, an
answer that says, “instead of doing x you should have done y” isn’t very helpful if
x happened a year ago! To stop explanations reaching too far back into the distant
past, we can adjust weights so older actions count less in the scoring. This would
stop the user feeling reprimanded for making a mistake ages ago.
There is a big difference between answering the question and the user knowing
what to do next. The answer to a basic why-not question tells the user something
they should learn, or be aware of in the future. But knowing they should have
done something differently probably doesn’t help them enough to get to their
goal now. In the figures, I’ve shown the reverse steps along the user’s path as
if the user’s actions can be undone easily, but this isn’t always so. For example, if
it is hard to get back from state 5 to state 4, even if the shortest paths were from
earlier states, the best thing to do now is take the path from 5. Again, such consid-
erations can be handled automatically by giving reverse paths different weights,
depending on what sort of answers the user wants or is expected to want.
A why-not answer for the user therefore includes the possibility that the user
might have been better off to have done something different earlier, rather than to
progress from where they are now in the current state.
There are many choices for weighting. I found by experimenting that weighting
the last action 1 and all earlier ones infinity gave adequate answers. At least these
weights ensured that the answers to why-not questions were brief.
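A sketch of this scheme: walk back along the trace of recently visited states (newest last), charge a weight for each step back, and report the state where diverging toward the goal would have scored best. The function backWeight is an illustrative parameter of this sketch, not part of the framework:

function bestDivergence(apsp, trace, goal, backWeight)
{ var best = trace.length-1, bestScore = Infinity;
  for( var k = trace.length-1; k >= 0; k-- )
  { var age = trace.length-1-k; // how many actions back this state is
    var score = backWeight(age)+apsp[trace[k]][goal];
    if( score < bestScore ) { bestScore = score; best = k; }
  }
  return trace[best]; // where the user should have done something different
}
// the weighting found adequate above: the last action costs 1 to revisit,
// and anything older is effectively never blamed
function adequateWeight(age)
{ return age == 0? 0: age == 1? 1: Infinity; }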
Figure 10.9: Continuing figures 10.7–10.8, perhaps going to states 4 then 5 was a
diversion away from getting to the goal? Answering the why-not question goes back
over the user’s path, here finding that the shortest path to the goal was from state 3.
Figure 10.10: An apparently-crashed Samsung NV3 camera just after a user switched it
on, saying “Turn off and on again.” The camera is telling the user how to recover from
some unexplained internal error. Why doesn’t the camera do a soft restart itself and
save the user the trouble? The camera is behaving as if it is idle. Certainly, the camera
seems idly designed: the message shows that the problem was known early enough in
the development cycle to have an error message written for it, yet the problem was not
fixed—other than by asking the user to do the work.
An error message is better than nothing. Even better is for the device to do itself
what it wants the user to do. It is idle to tell the user to do things that the device
itself can do, and it’s generally bad design—it’s certainly bad design if you haven’t
thought through the trade-offs.
optimize the user’s performance for certain tasks: for learning or for action, refer-
ence, emergency response, fault diagnosis, or for machine-related criteria (such as
its long life)—depending on what cost factors the designer wants to build in.
It would even be possible for the program to work out several answers using
different weightings and, if they are different, to say things like, “The fastest way
to do that is to . . . , but you’re more familiar with doing . . . , which will also do
it, but not so quickly.” Or perhaps, “Although x is the preferred way, if it’s an
emergency do y, which is quicker.”
Another idea is for expert users to train the device (say by solving typical prob-
lems). The advice generated would teach the user solutions that are preferred
by experts. It would be interesting to do both: then the user could be told that,
whereas they have often solved the problem this way, an expert would have done
something else.
In fact, rather than wait for the user to ask an explicit question of the device, we
could wait until they had used it for a long period (all the while adjusting weights)
and then automatically look at every possible thing that can be done on the device.
We would then report back to the user (or their teacher) cases where the user has
taken non-expert paths.
The idea is that when the user presses the Wiz button, a choice of the states in the
wizardNames array will appear. If the state the device is in is one of the wizard
states, then it ought to be identified in the list, rather like a “you are here” marker.
(If the user selected this goal, the wizard is only going to sigh; we can’t automat-
ically assume that the user knows they are there already—but we can be pretty
certain that a user would find it tedious to be given the current state as an option,
select it, and then be told they are already there.)
It does not matter whether we think of the wizard as a feature for the designer
or as a feature for the user. A user will obviously find a wizard useful for getting
through a sequence of steps to achieve goals; a designer will find a wizard useful
for learning how a user would have to use a device to achieve goals. A technical
author writing a user manual might like a wizard’s help for writing optimal in-
structions. One way gives direct help, another gives insight into the design, and
the third way gives insight into writing help.
Once a wizard goal is selected (possibly with a confirmation if we were worried
that some goals the user might choose are “dangerous”—this needs another array
like dangerousNames), the wizard works out all the shortest paths from the current
state to each of the goal states. In the simple lightbulb example, the wizard would
work out the paths to dim and on states if the user wanted to make the lightbulb
come on.
In general, there will be several paths from the current state to each of the wiz-
ard’s goals. In the lightbulb example, if the device is off, then there are two paths
from the current state (off) to the lightbulb being lit: it could be either dim or
on. The wizard automatically follows all paths while they are the same, changing
the device state as it goes. When it comes to a choice, it asks the user which option
they prefer.
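One way to program that stepping rule, as a sketch: from the current state, find for each goal a button that starts a shortest path to it; if all the goals agree on one button, the wizard presses it; otherwise (or if a goal is unreachable, or the device is already there) it stops and asks. Taking the first qualifying button per goal is a simplification:

function wizardNextButton(device, apsp, current, goals)
{ var agreed = null;
  for( var g = 0; g < goals.length; g++ )
  { var choice = null;
    for( var j = 0; j < device.fsm[current].length; j++ )
    { var next = device.fsm[current][j];
      if( 1+apsp[next][goals[g]] == apsp[current][goals[g]] )
      { choice = j; break; } // button j makes progress toward this goal
    }
    if( choice == null ) return null; // already there, or unreachable: ask
    if( agreed == null ) agreed = choice;
    else if( agreed != choice ) return null; // shortest paths diverge: ask
  }
  return agreed; // every goal's shortest path starts with this button
}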
The lightbulb is simple enough to spell out all of the possibilities (without taking
up too much space) to see how the wizard would work:
Current state   User asks   Wizard’s response
Off             off?        Says it’s already off
                lit?        Do you want it on or dim?
Dim             off?        (does it)
                lit?        Says it’s already lit
On              off?        (does it)
                lit?        Says it’s already lit
Whether you want the wizard to say it’s already lit when it is, or to change (or
offer to change) to the other lit state, depends on what you want the device to do.
This simple example may or may not be doing what you think a wizard should
be doing, but it paints a persuasive picture. If you are building an interactive
system—no doubt far more complex that a three-state lightbulb—work out a sys-
tematic way of getting a wizard to work. Even if you don’t need a wizard in the
final, delivered product, you, as the designer, can use the wizard’s possible an-
swers to help you check that the design of the device is sound.
You could easily write a bit of program that asks the wizard every possible ques-
tion in every possible state. How long are its answers? Are there some exceedingly
long answers? If so, the device might be redesigned by adding some shortcuts.
The wizard-checking program might work out the average and standard deviation
of its reply lengths and then present the designer with all questions that got a reply
more than, say, one standard deviation longer than the average. All the answers
that are so long (however they are classified) indicate issues the designer should
reconsider.
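A sketch of such a checking program, using the shortest paths matrix as a stand-in for the length of the wizard’s answers:

function auditWizard(apsp, goals)
{ var lengths = [];
  for( var i = 0; i < apsp.length; i++ )
    for( var g = 0; g < goals.length; g++ )
      lengths.push(apsp[i][goals[g]]);
  var mean = 0;
  for( var k = 0; k < lengths.length; k++ ) mean = mean+lengths[k];
  mean = mean/lengths.length;
  var variance = 0;
  for( var k = 0; k < lengths.length; k++ )
    variance = variance+(lengths[k]-mean)*(lengths[k]-mean);
  var sd = Math.sqrt(variance/lengths.length);
  var flagged = [];
  for( var i = 0; i < apsp.length; i++ )
    for( var g = 0; g < goals.length; g++ )
      if( apsp[i][goals[g]] > mean+sd ) // unusually long answer
        flagged.push([i, goals[g]]);
  return flagged; // state/goal pairs the designer should reconsider
}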
How many “dangerous” states does a user risk wading through? Should the
designer reduce the dangers or make the dangerous states more remote (so wizard
answers don’t take shortest paths via these states)?
Or get the wizard to mark all states where users have choices to make in achiev-
ing their tasks: are these reasonably designed? Are the choices, and their impact,
clear to users from the device’s indicators?
A real advantage of building a wizard into a device framework is that once
you have programmed a wizard, it should work however a design is modified
or updated. Since wizards encapsulate the notion of useful tasks, a wizard can
help summarize the difference between its answers on the previous iteration of the
design and its new answers for an improved design. That is, run the wizard on all
possible user questions on Monday, and the wizard saves its answers. On Tuesday,
after having revised and improved the design in some way, run the wizard again;
it reports back to you the changes in its answers. This will be a focused summary
of how your revisions are affecting the tasks you expect the user to want to do.
Wizards are usually thought of as palliatives, features to help users cope with
complex systems. But they are helpful, probably more helpful, for getting design-
ers to consider everything a user has to walk through with a design; they can help
compare the effectiveness of new versions of a design; and they can help the de-
signer identify tedious or dangerous sequences of actions that the user has to do
to achieve their goals. Wizards as design aids are underrated!
There is certainly no excuse for a design to be released to users with a wizard
that can’t solve all problems that will be asked of it. There is no excuse, in fact, for
designers not to develop their own wizards to benefit from the wizards’ design
insights, whether or not any of those wizards are released in the device the users
finally get.
On dynamic checks as variations of drawing rules for finite state machines, see
section 6.3.9 (p. 189).
Figure 10.11: Programming complex devices takes careful planning. For this digital
multimeter, instead of having thousands of separate images, one for each possible state,
it is better to cut the image into slices and treat each slice as a separate indicator. The
location of the slice for the knob is shown by the square in the middle image above.
This slice shows different knob positions by displaying one of the seven different images
of the knob. (The knob is both an input device and a physical indicator: indicators do
not need to be lights—for this device they are a mixture of physical objects and the
icons on its LCD.) Ignoring the digits, which would anyway be best programmed as
graphics, the multimeter also needs the LCD to be cut into 15 slices that are either
blank or show an icon.
Many features are consistent across the device and do not need to be
represented explicitly. For example, if every state has a transition to Off, why
not write some program code so that this part of the design is constructed
automatically? A simple for loop can do this (running over each state); it will
simply set the OFF transition automatically and hence correctly (see the sketch
at the end of this list).
The DeWalt DC925 cordless drill and the farmer’s problem are examples of
devices constructed entirely automatically, from JavaScript programs that
construct the finite state machine descriptions. The farmer’s problem has some
pointless states that we had to decide how to handle. A real farmer can see
separate states, such as leaving the goat and cabbage alone before returning,
when it will change to just a goat; but in the models we used in the book, the
goat eats the cabbage immediately, and the wolf eats the goat immediately, so
there are no separate states. I didn’t spot this “design decision” in the model
until I wrote the program to model the problem. Not that understanding the
farmer’s problem is going to help solve any real design problems, but the
insight illustrates the power of creating and analyzing device specifications
automatically.
The farmer’s problem was discussed in section 10.1 (p. 325) onwards; the
DeWalt DC925 drill was discussed in section 9.8 (p. 316).
Many features are consistent over parts of the device. For example, a menu tree
will probably have up, down, left, and right commands for each state, and
again a simple piece of program code can add these features onto a simple tree.
Then all the designer does is specify the tree and its generic features; the
generic features are added automatically to each state in the tree.
The representation of finite state machines is all detail, and the wealth of detail
gets more obscure the more states are needed. In fact, this is the problem that
statecharts addressed: all we need to do is introduce program features to
build statecharts. These functions build the underlying finite state machine.
The designer then is dealing with program code that calls functions like
cluster to cluster states together—supporting all the features discussed in
chapter 7, “Statecharts,” and perhaps others that are useful for the application,
such as special forms of history.
Statecharts are only one possible way of representing more complex state
machines. They have the advantage of being very visual, but they are not the
only approach to simplifying handling large systems. Regular expressions
could be used, for instance. The programming languages CSP, Spin, and SMV
are examples of textual languages that basically define state machines. For
some applications, these languages will be much more convenient to use than
the more direct approach used in this book. Furthermore, these languages have
sophisticated features for specifying and checking properties of designs.
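For the first point in the list above, a minimal sketch of the automatic Off
transitions; offButton and offState are assumed indices, not the framework's
own names:
// give every state a transition to Off, automatically and hence correctly
for( var s = 0; s < device.fsm.length; s++ )
    device.fsm[s][offButton] = offState;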
10.9 Conclusions
The previous chapter introduced a JavaScript programming framework to simu-
late and analyze devices; this chapter showed how the framework could be used
to measure how effective designs were, and to compare designs. We gave many
examples of interesting design properties that are easily measured.
The properties used in this chapter were trivial (from a programming point of
view) and very easy to work out. Had designers thought about states and actions,
and bothered to write decent programs that checked for sensible device proper-
ties, a lot of needlessly awkward gadgets would never have gone to market. The
point is that for any purpose we have in mind we can come up with relevant
properties—there are many you can think of for your problems that this book does
not have space to explore—and then very easily work them out and see how a
modified device design can improve on the measurements.
A particularly important idea was to take some of the design insights to provide
advice to the user. We showed two simple ways to do this: to answer questions
(such as the why-not questions) and to run wizards for the user. All approaches
work in a dual way: they provide immediate insight for designers and they can be
put into practical aids for users.
The languages CSP, Spin, and SMV are described well in the following: Hoare,
C. A. R., Communicating Sequential Processes, Prentice-Hall, 1985; Holzmann, G. J.,
The Spin Model Checker, Addison-Wesley, 2004; and Clarke, E. M., Model Checking,
MIT Press, 1999. The Spin and SMV books include key techniques, algorithms,
and tools for model checking; they can be used as introductions to the subject or
as reference books for their prize-winning systems, Promela (the language behind
Spin) and SMV. Alloy is a completely different sort of system; its book gives a
very readable survey of alternative systems and the rationales behind them: see
Jackson, D., Software Abstractions, MIT Press, 2006.
All of these systems can be downloaded and tried out.
11 More complex devices
It all started in 1991. We spent the weekend in a house in the Canadian Rockies—
Ian Witten, a computer science professor friend of mine, joined us, with all our
hungry kids. The electricity had been switched off, and our first job when we arrived
was to get power switched on—and to get the microwave oven working so we
could cook supper. The microwave was a model aptly called the Genius, Pana-
sonic model number NN-9807. Our task looked easy enough.
It took us about 45 minutes to get it switched on. Why did we find it difficult
to switch a microwave oven on? What general lessons are there, and what can be
done?
At first, I suspected that Ian and I might have been idiosyncratically bad at
using microwave ovens; neither he nor I were expert microwave users. I spent the
weekend trying to understand exactly how the Genius worked, and the next week
I built a careful, accurate simulation of the Genius. I then did some experiments
on other people—asking them to try to get my simulation of the microwave oven
working, to try to do what Ian and I had failed to do.
The conclusion was that our difficulty was not unusual. The only advantage we
had by being professors was that we didn’t immediately blame ourselves for the
problem, as it couldn’t be our fault!
The tendency of people to blame themselves for problems they experience with
bad design is discussed in section 2.9 (p. 52), where we explore cognitive
dissonance—a plausible explanation for the “it’s my fault” syndrome.
Basically, the Genius would not start cooking. We presumed this was because
it needed to know the time: it had some time buttons and was not displaying
the time. When we pressed some of the minute and hour buttons, the Genius
allowed us to get any number between 00:00 and 99:99 shown on its display. So
we naturally thought that the clock accepted 24-hour times. Since it was late in the
evening when we arrived at our friend’s house, about 22:00 hours, we tried setting
that time. The microwave seized up—you could say it really froze, a strange thing
for a microwave oven to do.
We had to unplug it to reset it and carry on. Then we tried 22:05, then 22:15, . . .
and we tried later and later times as time went by, until finally we set the time to
1:00. (We had noticed that we were wasting our time telling the clock the correct
time; we would pretend that it was 1:00, since the oven surely didn’t really care.)
The clock and the microwave then worked!
Having found one way of setting it, we soon realized that we had been misled
by the clock. It was secretly a 12-hour clock, willing only to work when set to a
time between 1:00 and 12:59, even though the display could be set to any number,
including all the misleading 24-hour clock times we had been trying.
We then had a wager about what the user manual would say: would it say how
to set the clock or wouldn’t it? When we eventually found and read the manual,
we agreed that we were both wrong: the manual did warn about the 12-hour clock
problem but relegated the warning to a footnote!
We expected the clock to work one way but it didn’t. Indeed, it gave us mis-
leading clues that, as it could count so high, it obviously had to be a 24-hour clock.
As long as we wrongly assumed that we knew how it worked, we would never be
able to set the clock. Part of the problem was that we didn’t think, “We’re assum-
ing how it works—let’s check,” because it seemed so obvious how it was working
that we didn't think we had any assumptions to check.
There were at least two things wrong with the design. First, the clock should
not have seized up when set to an “invalid” time (even one outside of 24 hours,
like 66:66). Second, the user manual should have been a bit more helpful, if not
more visible (perhaps there should be a label fixed on the side of the oven?).
Anybody who writes a footnote in a user manual to give the user important
information ought to tell the designer to fix the design—and make the footnote
unnecessary! See section 3.3 (p. 65) where this idea is stated as a design
principle.
Given that both of those design faults had been made, the user interface—the
front panel of the microwave oven—should have been clearer. Or maybe the mi-
crowave should not need setting to any particular time before it works. You don’t
need to know that it is 10 o’clock in the evening to do 3 minutes of high-power
cooking, so why does the microwave?
11.1 Starting to avoid design problems
bigger writing that it was a 12-hour clock. Clearly, even getting feedback from user
problems isn’t enough to help fix a bad design once it has gone into production.
Somehow designers need to evaluate their designs before they are committed
to production. Unfortunately, it is rather too easy for designers to be biased when
they try to anticipate how users will work with their designs. When designing
the Genius digital clock, the designers probably assumed that everybody uses the
12-hour clock, and they probably tested the microwave on people who shared that
assumption without thinking about it. It was an unspoken assumption. There
would have been no design problem to solve because nobody would ever enter
10 p.m. as 22:00, and nobody in the design team or evaluation team would notice
this as a potential flaw—until it was too late, that is, as the user manual’s footnote
makes clear. Noticing the design problem came late in the process.
Some design errors cannot be found by using people, whether designers or test
users. Yet you can be sure that there are users out there who will eventually stum-
ble onto design problems. With the Genius my experiments suggested that about
half the normal population of potential users would have had trouble in the af-
ternoons; that’s pretty bad, but not being able to use a microwave oven is hardly
a disaster. With more safety-critical devices, say medical equipment designed for
nurses to use, the chances of a user problem are lower (the nurses will be trained to
use the devices), but the consequences of a design fault are much higher. Somehow
we have to avoid preconceptions about use, and, in particular, preconceptions we
don’t even think about!
One approach to evaluating system designs is to carefully model users and to
try to make realistic models of how they behave. Of course, this is very difficult.
In the case in question how would we notice that we’d accidentally designed all
the tasks to use 12-hour clocks? If, as designers, we are not aware that the 12-hour
clock is a design issue, why should we build user models for testing purposes that
(always or sometimes) use 24-hour clocks? The 12/24-hour question may seem
pretty obvious in hindsight, but what of the design problems that we don’t know
about? What if we are designing a new product and nobody has any idea what the
key issues are?
Although user models can be run relentlessly without a break and can there-
fore examine a large part of a user interface design, user models are fraught with
difficulties. They may have systematic biases, so, for example, parts of a design
never get exercised. They may be incorrectly configured, so that timings and other
measures are inaccurate and possibly misleading. In short, it seems easier to use
real people directly to test a device, rather than to use them to build the models
of what they would do. But real people in actual tests are expensive and
slow, and it is very tedious to record what they do with a proposed design.
See section 11.7 (p. 401) for ideas on spotting oversights in user testing.
Real people suffer from the same problems that user models do: we can’t recruit
the whole planet to do our system evaluation, so we inevitably miss out on some
crucial behavior that somebody has. It would be easy to recruit ten people and for
all of them to think the same way about clocks (particularly if they are our friends
and relations, from the same office, or from the same country). If so, any study
based on ten people would not be very insightful.
If we are designing our device specifically for young people, or the old, or the
ill, then we should have proper concern for treating them ethically and legally—
getting ten sick people to help us try out our design requires informed consent;
getting children to help requires parental consent; and so on. Getting users who
represent our target audience is tricky. If we are designing a device to give medical
advice, and we are using people to test it, then there is a chance it will give mis-
leading advice—perhaps because we haven’t finished it, or because there is a fault
in the user interface (that’s what we expect—it’s why we’re doing the tests!) and
the user simply gets the wrong advice. There are many complications to consider
if we are serious about using people for real device design!
It might seem counterintuitive, but a safer approach is to assume nothing about
users and to get users who know nothing. Ignorant users might do absolutely
anything, and if they behave randomly then they might assume that the clock was
24-hour; they might even assume it was 100-hour or 7-hour—and their problems
would help designers discover new issues that nobody has yet thought of or had
the patience to unearth. It could be useful to redesign to avoid the problems you
discover with ignorant users. At least you should know the problems exist, so that
palliatives can be worked out, say, by writing warnings in the user manual.
Once we see people making mistakes with 100-hour clocks, we can decide how
to design for real humans. Maybe nobody would do this deliberately (maybe they
would), but the design has to cope if anybody does it by mistake. We need to
know the consequences of any user behavior, deliberate, ignorant, erroneous, or
insightful.
The question is, where do we get such users to work with?
Figure 11.1: A bar chart showing how many gnomes managed to get the original
Genius microwave oven to work in a given number of button presses. Some of the
10,000 gnomes used took more than 200 presses, and their data is not shown (which
is why the 50% line seems too far to the right).
User models typically assume that the user (or perhaps a simulated user) knows what
they are doing or that they make certain sorts of errors—in any case, generally that
the user is doing some specific thing. A gnome approach makes no such assumptions:
gnomes model the user doing absolutely anything at all—and it is easy to have lots of gnomes
working together or separately helping evaluate the device. In short, gnomes tell
you, the designer, everything about a design, and in the end, when you add up all
the figures, their performance gives a statistical overview of the design. Gnomes
are also a lot faster than human users, so we get far more general results much
sooner.
Let’s sit a gnome down and get them to try to get the Genius to work. We’ll
count how many steps the gnome takes. Obviously sometimes gnomes will get
lucky, and sometimes they will take ages. So we really need to hire lots of gnomes
and average the results. I hired 10,000 gnomes and sat them down with a com-
puter simulation of the Genius. They worked away, and I drew a bar chart of the
results, shown in figure 11.1 (this page).
Section 11.1.5 (p. 376) explains how to hire gnomes and how to put them to
work.
Almost half the gnomes managed to get the microwave working in 100 but-
ton presses or less (in fact, the median is 112), some took over 200 presses to get
it working, and one even took 1,560 presses! (Few humans would have had that
much patience; most would just have unplugged it and given up.) We are counting plugging the
microwave in as one “press” because to get it to work after it has frozen up, it
needs unplugging and plugging back in.
The gnomes seem to find this “simple” job really hard work. Yet if we asked the
designer of the clock how many button presses it takes to set it, they might reply
just four steps! That is, to get the clock to work after the oven’s been plugged in,
press Clock to enter the clock-setting mode, then press 1-hour , so it shows a valid
time (namely, 01:00 o’clock), then press Clock again to start the clock running with
that time. Easy—if you know how.
But my hired gnomes took about 160 presses on average, a lot more than the designer's
guess of 4. This huge discrepancy suggests that the design could be improved,
or at least that the designer’s optimism is unrealistic—or that we could train the
gnomes better.
Let’s look seriously at why the gnomes take so long compared to the designer’s
ideas. The Genius locks up when it is set to a time outside of the 1:00–12:59 win-
dow. There is absolutely no design reason in principle for a lockup. Removing
the lockup (by redesigning the oven) dramatically helps the gnomes to be faster.
Now, half the gnomes succeed in 77 or fewer presses, with an average time of 108
presses.
If we also changed the design so that impossible times, like 27:78, cannot be set,
the gnomes get even faster, taking on average about 50 button presses to get the
microwave going. Half of them will have got it working in 35 or fewer button
presses. The tail of the graph, the gnomes who take more than 100 presses, now
has a factor of five fewer gnomes in it. That's a huge improvement.
Figure 11.2 (facing page) shows the bar chart based on the improved design.
Naturally we expect the unlucky gnomes to take longer on average than the
designer’s ideal or a typical human user, because, after all, gnomes don’t know
what they are doing. Nevertheless, our gnomes have helped us find a faster and
easier-to-use design. The following table summarizes the results:
Design                    Average   Median
Original Genius              161      112
Debugged not to freeze       108       77
Sensible design               49       35
So a little thought—motivated by random testing—lets us achieve a design
that’s on average about three to four times easier to use (at least for gnomes, if
not for humans).
Now, when we humans use gadgets, much of the time we don’t know how they
work or what we are supposed to do; we’re in much the same league of insight as
the mindless gnomes—and a modified design, such as the one proposed here, that
helps gnomes would also help us.
The improved design still supports all the original functionality (like cooking
chickens), it just removes some design problems. Indeed, with human users, the
faster design has the additional advantage of not allowing a user to display a mis-
leading 24-hour time (like 22:02).
A random exploration of a design presupposes no specific knowledge of the
user (or gnome). This has two advantages. First, a good designer ought to con-
sider the possible wrong ways in which a design might be used. But there are
infinitely many ways of being wrong, and a designer can only think of some of
Figure 11.2: Bar chart showing how many gnomes managed to get the improved design
of the Genius microwave oven to work in a given number of button presses. Compare
with figure 11.1 (p. 371), which shows their performance on the original Genius. Now
half the gnomes succeed in fewer than 34 presses, and only about 10% of the gnomes
are taking more than 100 presses to succeed, whereas before half the gnomes took more
than 112 presses. The improvement is clear from the way this graph tails off so quickly.
them. A random process, like our gnomes, however, embodies all possible wrong
ways of using a system. Randomness is a remarkably effective way of testing out
designs. After all, human users could only test according to their own few and
fixed preconceptions. Moreover, if their preconceptions were the same as the de-
signers, very little would be discovered about the design that the designer didn’t
already think they knew. Quite likely the original Genius design was made by
a designer who didn't think in 24-hour time and so never thought to test for
it.
So, although a gnomic “random user” is less efficient than a real human user,
it cannot be tricked into guessing the designer’s tacit assumptions. Gnomes are
also a lot cheaper and faster than humans: being cheaper is good ecognomics, and
you get them to work faster by using a metrognome (although I used a computer).
This ease of testing with gnomes is their second advantage.
It is very interesting that a random gnome can set the microwave clock on aver-
age in 50 button presses, whereas Ian and I took far more. Our human intelligence
was not helping us! We would have worked out what to do faster if we had sim-
ply tossed a coin to decide what to do next, because then we would have been
working like gnomes.
If you are past playing with gadgets, then their odd design is frustrating rather than
fun. The frustration itself makes it harder to enjoy using a device and persevering
to find a solution! When you find yourself in this position, toss a coin or roll a
dice—use some technique to make yourself approach the device like a gnome or a
child. This will help you break out of whatever fixation you have become trapped
by; with luck you’ll be able to get on with your life quickly.
As always, good advice to a user can be rephrased as advice for a designer.
Why not add a button to a device so that the device itself “rolls the dice” and
presses a random button? Better still, why not bias the dice so that it only rolls to
choose buttons the user hasn’t tried pressing recently in the state? Then, if the user
presses the PLAY! button (that is a less threatening button name than HELP ), you
could have a nice flashing display of all the currently untried options, and then
the flashing slows down, cycling through the buttons, until—yes!—it’s settled on
suggesting Off ! Well, maybe if the user switches the device off and on again, they
will be more successful next time.
Of course, we can do better than suggest the user presses a random button.
See section 11.3.2 (p. 385).
Pressing a button at random, or, to save the user the effort, having a button
that makes a random transition (so the user doesn't have to worry about working
out a random action), can help the user; but sometimes a random action might be
exactly what the user wants to do. They want to do something surprising, for fun,
and what better than a random action? The Apple iPod Shuffle does just this: it can
be set to play random tracks of music. Here, the “gnome” is a bit more intelligent
than our gnomes—the iPod doesn’t replay tracks it has chosen immediately. In
fact, the idea of shuffle is that it first randomly shuffles the tracks, then plays from
that list of tracks; otherwise, it would run the risk of repeating a lucky track too
often or too soon. When it gets to the end of the list of chosen tracks, it reshuffles
and starts again.
It is absolutely trivial to get the programming right so that “times” outside the 1:00 to
12:59 window simply cannot be set. A modified design would then work like an
analog wrist watch, where you simply can’t set the time to something impossible.
Rather than using a human (and wondering what sort of human, how familiar they are
with microwaves, and so on), let's use a gnome again. Gnomes are cheap and they don't mind prodding
things all day.
First we need a function that tells us which button the gnome should press. It
looks worse than it is:
function randomButton(d)
{ // choose a button number uniformly at random; the loop guards against
  // a random number generator returning exactly 1, which would give an
  // out-of-range button number
  var r = Math.random();
  while( Math.floor(d.buttons.length*r) == d.buttons.length )
      r = Math.random();
  return Math.floor(d.buttons.length*r);
}
Generating random numbers is fraught with difficulties. It is highly recommended
that you verify that your random number generator (which may have to be wrapped
up in code completely different from my randomButton JavaScript code above) is
working well—otherwise all your experiments will be suspect. My first random
number generator in JavaScript failed the tests; it is a salutary experience to write
program code to check your own program code.
A basic way to check that your random button pressing works properly is to do
a few thousand trials and see how often each button would be pressed:
var testLimit = 100000;
var check = new Array(device.buttons.length);
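A minimal sketch of how the tally might continue, using the randomButton
function and device object from above:
for( var b = 0; b < check.length; b++ )
    check[b] = 0;
for( var t = 0; t < testLimit; t++ )
    check[randomButton(device)]++;
// every check[b] should be close to testLimit/device.buttons.length;
// a button chosen much more or less often than that suggests a
// biased random number generator
for( var b = 0; b < check.length; b++ )
    document.write(device.buttons[b]+" chosen "+check[b]+" times<br>");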
Pressing buttons exactly like a human will mean that JavaScript will be updat-
ing the device display every time something happens—that’s how press was de-
fined in the framework; this is a waste of time (gnomes can’t read), so we can
speed up using gnomes by writing a “blind” press function:
function gnomePress(buttonNumber)
{ // follow the transition but skip updating the display: gnomes can't read
  device.state = device.fsm[device.state][buttonNumber];
}
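The trial run itself can be a simple loop. Here is a sketch of a trial function,
consistent with the trial2 variation shown in section 11.3 below; toStateNumber
is assumed from the framework, and d is assumed to be the global device that
gnomePress acts on:
function trial(d, start, finish)
{ d.state = toStateNumber(d, start);
  var f = toStateNumber(d, finish);
  var count = 0;
  while( d.state != f ) // keep pressing until the finish state is reached
  { gnomePress(randomButton(d));
    count++;
  }
  document.write("Gnome takes "+count+" steps to get from "+start+" to "+finish);
}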
As noted in section 4.5 (p. 105), where we defined the function plural, the
last function would generate better English if we wrote ... "Gnome takes
"+plural(count, "step")+" to get from ...
We start the gnome-simulating program in state power 1 and see how long it
takes to get to power 2.
trial(device, "Power 1", "Power 2");
When I tried it, the gnome took 169 steps. Was this gnome lucky, or was the
gnome a bad one who gave us an answer seemingly too hard to be reasonable? Or
is the design bad? Until we do a lot more experiments, we can’t tell whether we
have learned more about the gnome or about the design.
One argument that the gnome was lucky is that we asked it to play with a de-
vice that happened to be strongly connected; if the device had not been strongly
Figure 11.3: Ten gnomes testing a microwave oven to change its power settings. Over
time, the gnomes converge on taking about 120 button presses.
connected, the gnome would have run the risk of getting stuck somewhere (just as
any human user would have risked getting stuck). The gnome wouldn’t mind; it
would just keep on pressing buttons—the program simulating the gnome would
never give up, because the test d.state != f might always be true and it would
go around the loop again and again! If the shortest path from start to finish states
is ∞, no gnome will ever find a way of doing it; if the shortest path is finite but the
device is not strongly connected, the gnome may or may not get stuck. If the de-
vice is strongly connected a gnome will eventually be able to get to the finish state
(provided your random number generator is fair). Either you should check the
device is strongly connected, or you should set a maximum limit on the number
of times around the loop.
Whether the gnome was lucky or not, one useful thing we have learnt is that our
design can cope with mad and relentless user testing for 169 presses. We should
do some more testing, to check that the design is robust. (This sort of mindless
but essential checking is something gnomes should be left to do, rather than using
humans.)
We should try more gnomes and at least average the results to get better statis-
tics. So we’ll try some more serious experiments, hiring 10 gnomes at a time; we’ll
run 500 trials with each gnome and plot the results to see what we can learn: see
figure 11.3 for the results. The graphs show that once we have 5,000 runs (ten
gnomes each doing 500 trials), employing more gnomes won't add much to what
we already know. But for a more complex device than our microwave oven, this
survey might be way too small. It's always worth drawing a graph to see how the
numbers are behaving; it's more informative (and a lot easier) than using statistics.
In section 11.3 (p. 381) we work out what an infinite number of gnomes would
average.
11.3 Markov models
Figure 11.4: A simple bar chart showing the distribution of path lengths for the JVC HR-
D540EK PVR. With more programming, you can get axes and other useful information
included—our simple JavaScript hasn’t shown that the horizontal axis is the cost of
getting from one state to another (a number, for this device, ranging from 0 to 11)
or that the vertical axis is how often any particular cost appears. For this device, the
most common path length (the mode) is 2, occurring 192 times. Without the axes, it
isn’t obvious that the leftmost bar shows how often a zero path length occurs: for any
device it’s going to be the number of states, since you can stay in any state by doing
nothing. Other uses for path data are shown in box 9.2 (p. 310).
Having collected the information, a simple loop draws the graph using the
image-scaling trick:
for( var i = 0; i < chart.length; i++ )
document.write("<img src=blob.gif width=5 height="+(1+chart[i])+">");
The 1+ in the height expression ensures we get at least a 1 pixel-high line for the
graph axis—a 0 pixel-height line would not show anything. Figure 11.4 (this page)
shows the sort of result you might get.
Box 11.1 Matrices and Markov models The stochastic matrix of section 11.3 (previ-
ous page) is just the same as the cost matrix of section 9.6 (p. 297), except that the costs
(0, 1 or ∞) have been replaced with probabilities. Instead of it being the case that a button
press takes one step to get to another state in the finite state machine, the stochastic matrix
expresses the fact that there is a probability the user will press the button.
Since each row represents all the states a user can end up in, whatever they do, each row
adds to 1.
Suppose the stochastic matrix is called S and v is a vector representing the current state
the device is in; it’s called the state vector. Then vS will represent the state the device is in
after the user has done one thing. More precisely, vS will give a distribution; it will show the
probability that the device is in each state after the user’s action. The beauty of this is that
we can see further ahead easily: vSⁿ tells us the state distribution after exactly n button
presses.
With not much more effort, from v we can work out the expected number of states the
user will have visited after n presses; this is how we worked out the cost-of-knowledge graph
in section 5.3 (p. 144), graphed in figure 5.11 (p. 144).
Standard things to do with Markov models are to find out the expected number of actions
to get to a state, which in user interface terms represents how hard doing something is on
average, or to find out the long-term behavior of a device.
Usually the transition matrices are considered for the whole device, but we can also take
the matrix for each button, considered as its own finite state machine. This generates the
button matrices. If we have buttons On and Off , we can easily find the button matrices
On and Off that represent the transitions that these buttons can achieve.
If the device is in state v, then vOn is the state it is in after On is pressed. Matrices easily
allow us to work out theories of the device behavior; in this example (even with such little
information) it is likely that On times the matrix Off is equal to the matrix Off—in words,
pressing Off after pressing On has the same effect as pressing Off directly. The matrix
equation says that this is true in every state. Since matrices are very efficient to calculate
with, this is a good way of discovering many deep properties of a user interface.
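As a concrete illustration of the box's notation, a minimal sketch (not the
book's framework) of one Markov step in JavaScript, multiplying a state vector v
by a stochastic matrix S represented as an array of rows:
function step(v, S)
{ // w = vS, the distribution over states after one more user action
  var w = new Array(v.length);
  for( var j = 0; j < v.length; j++ )
  { w[j] = 0;
    for( var i = 0; i < v.length; i++ )
        w[j] += v[i]*S[i][j];
  }
  return w;
}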
We need not do the math ourselves (someone who understands it can
write a program for us); we can use the results. Fortunately, if you have Markov
modeling added to your design framework, you won't need to understand tech-
nical details.
To do a Markov analysis, first we must convert the device’s definition into a so-
called stochastic matrix. Here it is displayed in traditional mathematical notation:
⎛ 3/5  1/5  1/5   0    0    0  ⎞
⎜ 2/5  2/5  1/5   0    0    0  ⎟
⎜ 2/5  1/5   0   1/5  1/5   0  ⎟
⎜ 2/5  1/5  1/5   0    0   1/5 ⎟
⎜ 2/5  1/5   0   1/5  1/5   0  ⎟
⎝ 2/5  1/5  1/5   0    0   1/5 ⎠
Box 11.1, “Matrices and Markov models” (facing page) explains more about
how the matrices work.
Each row gives the probability that the user—or gnome!—will change the state
of the microwave oven; thus, if the device is in state 1, the gnome will change it to
state 2 with probability 1/5 (i.e., first row, second column). There are five buttons
for the gnome to choose from, and with probability 1/5 it chooses the button that
changes the state to 2. Sometimes there are two buttons that change the current
state to the same state, hence the 2/5 probabilities. For now, the assumption is that
each button on the device is pressed with equal probability (there are five buttons,
so all the probabilities are so-many fifths); the user interface simulation can give
empirically-based probabilities, which we will use later. This probability matrix
can be fed into our analysis.
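One way of doing the analysis is sketched below, using the step function from
the sketch in box 11.1: make the target state absorbing, then sum the probability
of not yet having arrived after each press. The sum converges to the expected
number of presses; the function names are assumptions, not the framework's own:
function expectedPresses(S, start, target)
{ var n = S.length;
  // T is S with the target state made absorbing, so probability
  // mass that reaches the target stays there
  var T = new Array(n);
  for( var i = 0; i < n; i++ )
  { T[i] = new Array(n);
    for( var j = 0; j < n; j++ )
        T[i][j] = (i == target)? (j == target? 1: 0): S[i][j];
  }
  var v = new Array(n);
  for( var i = 0; i < n; i++ ) v[i] = (i == start)? 1: 0;
  var expected = 0;
  // assumes the device is strongly connected, so the loop terminates
  while( 1-v[target] > 1e-10 )
  { expected += 1-v[target]; // probability the gnome is still pressing
    v = step(v, T);
  }
  return expected;
}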
The math on this matrix gives a result of 120. This is what thousands of gnomes
averaged at, but it was faster to work out using a Markov model (even though we
needed to be good at math; for many people it’s easier to program the gnomes
than to work out the Markov models).
It is interesting to choose the probabilities differently. Rather than assuming that
each action or button press is equally likely, as above, we can work out the best
way of achieving the task and set those button presses to have probability 1. Now
our gnome is behaving like one that knows what it’s doing.
If we rerun a Markov analysis on this matrix of “perfect knowledge,” we should
get back the shortest ways of doing anything. Indeed, the answer here is 2, which
emphasizes just how bad 120 is.
Evidently, the more knowledge a user has of the device design, the easier the
device is to use. Difficulty of use can be plotted against knowledge in a graph
that combines the complete randomness of a gnome with increasing amounts of design
knowledge. The graph should obviously improve from 120 (the result of complete
ignorance) to 2 (attained with perfect design knowledge). Indeed, this is what we see
in figure 11.5 (next page).
Figure 11.5: The more you know, the easier it gets. When you know 100%, you are as
good as an expert, here taking only 2 presses to change power settings on the microwave
oven. If you know nothing, you may take a very long time, but on average you will do
it in 120 presses.
In human terms, we could imagine that we redesign the microwave oven so that
its buttons light up when they are active, or equivalently that they are dark when
they are not going to change the current state. In a sense, this makes the device a
polite device: it tells you when it can do things for you. A polite device tells the
user that a button will do something or that a button won’t do something before
the user wastes time finding out.
The opposite of politeness, rudeness, is discussed in section 6.3.4 (p. 185).
We could either write a slightly more sophisticated gnome-simulating program
(press buttons at random, but out of the ones known to change the current state) or
run a Markov model on a revised matrix of probabilities. When we do, the answer
for the power-changing task drops from 120 to 71.
We can modify the original trial code to work out the times the gnomes take
with the original design and with a modified design where they avoid pressing
buttons they know will do nothing.
function trial2(d, start, finish)
{ d.state = toStateNumber(d, start);
  var f = toStateNumber(d, finish);
  var count = 0, newcount = 0;
  while( d.state != f )
  { var oldstate = d.state;
    gnomePress(randomButton(d));
    count++;
    // newcount ignores presses that left the state unchanged: what a
    // gnome that avoided dead buttons would have taken
    if( d.state != oldstate ) newcount++;
  }
  // one plausible way to report the two counts (the format is assumed)
  document.write("Gnome: "+count+" presses; "+newcount+" changed the state");
}
Section 11.1.2 (p. 374) suggests playful buttons, and the last section suggested
smart buttons that light up if they might be worth pressing. Section 9.6.4
(p. 304) suggests lighting buttons to speed up the user according to the
Hick-Hyman law.
There are very good reasons why both of these design ideas can help users. Can
we do any better? The two small innovations we have suggested are based on
knowing what the user wants to do:
The user is stuck. Show them something random; it may help.
The user wants to do something. Show them which buttons do something.
If we really know what the user wants to do, then the device may as well do it.
This idea leads on to invisible devices and incidental interaction; see
chapter 12, “Grand design.”
So far in our simulations, the gnomes have been pressing buttons equally often
on average. But if one button stands out from the others, we’d want our gnomes to
prefer pressing it, and it should be pressed more often—that is what human users
would tend to do. We might get a few real people to use a prototype of a device,
to get real button press probabilities, and we could then be precise about the prob-
abilities to use—generally, without doing experiments with users, we won’t know
what anyone is trying to do with any reliability or accuracy.
If we assume we have already got the probabilities for each button in an array
device.probpress, it’s easy to write a revised random button function:
function distributedRandomButton(d)
{ do
  { var r = Math.random(), cp = 0;
    for( var b = 0; b < d.buttons.length; b++ )
    { cp = cp+d.probpress[b]; // running total of button probabilities
      if( cp > r ) return b;
    }
  } while( b >= d.buttons.length );
}
As before, this code loops if the random number generator incorrectly returns
exactly 1—it goes around and tries another number. The code can be made faster
by pre-computing an array of cumulative probabilities rather than repeatedly cal-
culating cp each time the function is called. Of course, the code won’t work cor-
rectly unless the button probabilities probpress add to 1.0, as probabilities should.
More details of algorithms with random numbers can be found in Knuth’s Art of
Computer Programming—see this chapter’s further reading, p. 404.
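A minimal sketch of the precomputed version; the names makeCumulative,
fastRandomButton, and cumprob are assumptions rather than the framework's own:
function makeCumulative(d)
{ // build the running totals once, when the probabilities are set up
  d.cumprob = new Array(d.probpress.length);
  var cp = 0;
  for( var b = 0; b < d.probpress.length; b++ )
  { cp = cp+d.probpress[b];
    d.cumprob[b] = cp;
  }
}
function fastRandomButton(d)
{ var r = Math.random();
  for( var b = 0; b < d.cumprob.length-1; b++ )
      if( d.cumprob[b] > r ) return b;
  return d.cumprob.length-1; // also copes with a generator returning 1
}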
Section 11.3.1 (p. 383) explores concrete ideas for improving button pressing
probabilities. Making some things too easy may not be a good idea—see
section 5.4 (p. 145).
A user is looking at a device with fifteen buttons that could achieve something.
Either the user knows what to do, or, in the most general terms, this user will have
to search for the right result.
Computer scientists know a lot about how to make searching more efficient. Ob-
vious suggestions are to put the possible target states of the search into a sequen-
tial list, to put them into an alphabetically sorted list, or to use more interesting
techniques, such as binary search, hashing, trie search, or tree search.
The impact of alternative kinds of search techniques (for a fax) is explored in
section 5.5.5 (p. 153).
We can only usefully put states into order if we know what the user is searching
for. If it’s a command name, then putting the commands into order will help. But
in general, we really have no idea what the user is searching for—and the user
may not know until encountering it. If users don't know what a command is
called, then putting commands into alphabetical order in itself doesn't help much;
it's no better than a linear list in any order.
Even if we have no order (such as the alphabet) that can be used, we can still
help the user search systematically. There are two important and very general
approaches to searching:
Breadth-first search Try to keep as close as possible to the starting state, and
then move out to new states, increasing the distance from the start as slowly as
possible. This approach is good if we have no idea where the goal is.
11.4 Human errors
That is, if you train your sophisticated model to reflect real user behavior, unfor-
tunately you run into exactly the same problems that all user testing has: it takes
a very long time to get sufficient, and sufficiently broad, data to be useful. Of
course, you can quickly get an impression, but almost by their very nature, errors
are surprising—they are unlikely—so you need even more data to model errors
well.
Rather than trying to model anything a user might do, it’s more productive to
model specific things based on how we know people behave.
Task/action mappings, discussed in section 9.6.5 (p. 311), help us explore how
a user does a task successfully.
One way to look at user errors is to consider their effect on degrading from
“perfect” performance. We can easily find a shortest path (using task/action map-
pings) and treat this as the model of error-free behavior. Then what might a
user do?
They may make slips and randomly fall off the path anywhere. We have
already modeled this sort of error very well.
They may do things in the wrong order.
They may miss steps on the path but carry on doing the right things (though
probably now on the wrong path).
They may forget some initial steps, missing out the “preparation,” but then
follow the rest of the path correctly.
They may not get to the end of the path, stopping short of performing all the
necessary actions. That is, they behave as if they have finished, when there is
yet more to do. These are completion errors.
They may follow a correct path for a different task. These are transfer errors.
The user has transferred the wrong actions (which were right in a different
context) to the intended task. The different task may or may not be one from
the same device—it might be a transfer error like trying to use a Nokia device
like a Samsung.
If two paths start the same way (or, put formally, if they have a common prefix)
then the user may continue down the preferred path—often, the more
frequently used one—even if this is not the path they intended to go down to
start with. More generally, if multiple tasks share parts of their paths (they
have common subsequences), it’s possible that users will start off doing what
they intended, including the shared part of the path, but then follow the wrong
branch after the shared part. This is a capture error. The more familiar, more
frequently performed actions have “captured” the user.
Users may not stop when they have otherwise completed the task. This is an
overrun error.
Box 11.2 Ways to avoid human error Humans make errors for all sorts of reasons, and psy-
chologists distinguish different sorts of error: slips, lapses, mistakes, and variations thereof.
The suggestions for further reading give pointers to more details, but what we want to do is
avoid errors, however they are classified, and then, given that we can't always avoid errors,
recover from them or reduce their consequences when they do occur.
Don Norman suggests six important strategies for reducing error. Exactly how these
principles are applied depends on the context; the important point is to think through the
options carefully.
Make information visible, so the user can see what to do next. We’ve talked a lot
about using indicators as one way to do this; see section 11.3.2 (p. 385) for example.
Simplify the design to reduce reliance on the user’s memory.
Use affordances (see section 12.3, p. 415)—use “natural” mappings, simple
relationships between controls and effects.
Use “forcing functions” to guide users. Constraints make it hard to do the wrong
things; forcing functions make it impossible. For example, the button that fires the
rocket has a cover; the user is forced to lift the cover before launching the rocket.
Assume that errors will occur and design accordingly. Provide techniques for error
recovery.
Finally, standardize actions, layouts, displays, symbols, and so on; make systems
consistent.
See the boxes 6.1, “Syringe pumps” (p. 168) and 11.3, “Generating manuals
automatically” (p. 397) for example errors. Errors are also discussed in section 11.4
(p. 387), and box 5.7, “The principle of least effort” (p. 147).
Sometimes feedback to the user is ambiguous: users won’t know whether they
have done anything, let alone the right thing! See section 10.3.2 (p. 334).
Sections 11.5 (p. 392) and 11.6 (p. 396) provide interaction programming
techniques to provide accurate training material, so users know how to interpret
feedback. Section 10.7.5 (p. 358) showed how to ensure that the user gets wide
enough experience of a system in their training.
Error consequences can be reduced by adding undo, for instance. Robustness
can be improved by introducing more permissiveness into the design. We can also
improve things by making the state of the device clearer—then users may notice
problems sooner, before they are beyond undo or other recovery. Making the state
clearer will help the user do their work with fewer errors.
Being more specific about reducing consequences requires us to know more
about the costs of errors for users. For example, not switching your television
on—you’ve sat down, but it’s still off—is not as costly an issue as leaving your
money behind in a cash dispenser (sometimes called an automatic teller machine,
or ATM), and that, in turn, is not as costly an error as leaving your cash card behind
(which might allow a thief to remove any amount of cash from your account).
That brief analysis suggests that given a choice, the more costly errors for the
user should be arranged, by redesigning the device, to come earlier in the se-
quence. Indeed, this is what we find on cash machines: if they are well designed,
users have to remove their cash card before taking their money. With this design,
users are very unlikely to leave their card behind—but they may still leave their
money behind.
This redesign works in harmony with a quite different explanation of comple-
tion errors. A user went to the cash machine with the task “get cash” uppermost in
mind. With the better design, the user can only complete the intentional task after
having already removed the card—the device has forced the user to do something
that otherwise might be overlooked if cash was dispensed first.
We can invent ways of measuring robustness. One simple measure is the num-
ber of ways a user can achieve a task. If there is one way to do a task, then any
error makes the task fail; the more ways, the less likely any error (of any of the
sorts listed above or any other sort of error) will make the user fail.
We can measure this simple robustness by counting the number of different
paths through the device that achieve the same objectives. For the sake of argument,
we consider all tasks possible on a device to be equivalent to considering all pairs
of states: every task then has a start state and an end state. We count how many
paths there are between all pairs of states, and how long they are, and take averages
to get a measure of robustness for the “average task.” (Counting paths can be done
using a depth-first search.)
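A sketch of such a count (the names are assumptions): a depth-bounded
depth-first search that tallies, for each path length up to maxLen, how many ways
there are of getting from state s to state t. Staying put counts as a path of
length zero, matching the leftmost bar of figure 11.4:
function countPaths(d, s, t, maxLen)
{ var counts = new Array(maxLen+1); // counts[k] = ways of length k
  for( var k = 0; k <= maxLen; k++ ) counts[k] = 0;
  function dfs(state, len)
  { if( state == t ) counts[len]++;
    if( len == maxLen ) return;
    for( var b = 0; b < d.buttons.length; b++ )
        dfs(d.fsm[state][b], len+1);
  }
  dfs(s, 0);
  return counts;
}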
For our microwave oven, we get the graph shown in figure 11.6 (facing page).
The figure shows quite clearly that for this device any task ending in a power
setting—that is, actually cooking—is a lot less robust (in the sense we are mea-
suring robustness!) than either quick defrosting or using the clock. Of course, we
would get more profound results if we weighted averages according to our knowl-
edge of the device. For example, since the user cannot leave a microwave oven
Figure 11.6: One visualization of robustness; here, measured as the number of possible
ways of reaching a given state, plotted against the number of button presses needed
to reach it. The numbers are averaged over all possible starting states. The larger the
number, the more permissive or robust the target state.
cooking forever, most tasks the user will do will not start from any cooking states.
For simplicity, we assumed all starting states were equally likely—intriguingly,
when we redo the analysis assuming the user never starts in a power state, the
graph for this device looks the same: the numbers differ, but the ordering of the
results does not change.
Robustness against error is not the same as safety. Cars have numerous safety
features, including the following: you cannot change from park to drive without
pressing the brake pedal; you cannot remove the key unless you are in park; in
some manual cars, you cannot start the engine unless the clutch is depressed—to
avoid risking starting the car when it is in gear, and, incidentally, to make it easier
to start (the starter motor won’t be turning over the gear box’s cold oil).
A car is not robust in the sense that there are lots of ways for the users to achieve
their goals; the point is to ensure that users can only achieve goals safely. Some
possible states and transitions are designed out; the user simply cannot do them.
A “robust” car might allow you to start the engine when the car is in drive without
your foot on the brake pedal, but a safe car would prohibit this error. Some errors
you want to forgive the user, some (like starting a car when it is in gear) you want
to block, before something worse happens—though occasionally it is good practice
to start in gear, for instance on steep hills. Designers have to make tricky tradeoffs.
A measure of robustness is the size of strongly connected components; we used
the farmer’s problem as a motivating example to redesign devices in
section 10.2.1 (p. 329).
Figure 11.7: User error is defined by design error—illustrated here by the JVC UX-
NB7DAB CD/radio. Evidently, the manufacturers are aware of the design defect as the
CD tray door has a warning label “CAUTION DO NOT TRY TO FORCEFULLY OPEN
AND CLOSE THE DOOR BY HAND TO PROTECT FROM MALFUNCTION.” The
label has been made removable because it isn't visually aesthetic. However, if the label
were removed, the design would not become aesthetic anyway: it would then have a
hidden interaction defect and be worse. The photograph shows what happens if you
drop a CD when attempting to change one: the door may close with the CD trapped,
covering the very button that is supposed to be used to open and close the door. Now
you have to fiddle with the trapped CD—or force the door!
Finally, people can also make sophisticated errors involving reasoning. They
may have, in some sense, quite the wrong idea of the tasks they are trying to
achieve. In controlling a complex system like a nuclear power plant, indicators
tell the user all sorts of things about the overall situation. The user then has to
reason about what to do to avoid a catastrophe. Many things can go wrong—not
least being that the power station itself overwhelms clear thinking with far too
many alarms.
11.5 Automatic, correct help
Although our device definition is very basic, it can be used to generate useful
help for the user or for technical authors (technical authors can at least start from
an accurate draft of the manual).
We now define a function help that explains the shortest path (the least num-
ber of button presses) to get from any state to any state. The definitions given
below can be adapted straightforwardly to provide clearer help if “buttons” aren’t
actually pressed (maybe, for example, they are knobs that have to be twisted).
The device might have an interactive feature, so pressing a button gives help—
perhaps showing it in a display panel. If so, it might be defined partly as follows,
making use of the current state: here is a small part of the microwave oven’s man-
ual:
To get from the device Power 1 to Power 2:
Press Time
Press Power
Program code to generate manual entries like this is based on finding shortest
paths, discussed in section 9.6 (p. 297).
We’ll need this fact later. Note that the best way of getting from Power 1 to Power 2
takes two button presses, as we realized in the previous section.
Ideally one would write more sophisticated manual-generating programs to
generate better natural language. In particular, straightforward parametrization
of the program would allow equivalent manuals to be generated in any appropri-
ate language.
If we developed a typographical style for user manuals, then all devices pro-
cessed in the framework would be able to use that style. Also, one could generate
interactive HTML manuals for the web, and then the user could also follow hy-
pertext links to learn the exact workings of the device.
We can print an entire manual just by finding the best way to get from each state
to every other state. It’s still a bit long and boring, but it starts off like this:
To get from the device Clock to Quick defrost:
Press Quick defrost
To get from the device Clock to Timer 1:
Press Time
To get from the device Clock to Timer 2:
Press Time Press Time
To get from the device Clock to Power 1:
Press Time Press Power
To get from the device Clock to Power 2:
Press Time Press Time Press Power
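A sketch of a generator for this whole manual; shortestPath is an assumed name
for a function, built from the path machinery of section 9.6 (p. 297), that
returns the button presses as an array of button numbers:
function printManual(d)
{ for( var s = 0; s < d.stateNames.length; s++ )
    for( var t = 0; t < d.stateNames.length; t++ )
      if( s != t )
      { document.write("To get from the device "+d.stateNames[s]
            +" to "+d.stateNames[t]+":<br>");
        var path = shortestPath(d, s, t);
        for( var i = 0; i < path.length; i++ )
            document.write("Press "+d.buttons[path[i]]+"<br>");
      }
}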
A manual explaining how to do everything at least checks that the user can do anything they want to.
We should certainly write a program to generate this manual and run it on any
proposed device; if the program “gets stuck” then the device has problems.
We might prefer to typeset the user manual in a different format. Here is an
extract of one generated for the JVC PVR that starred in the last chapter:
...
If you are playing a tape, but have paused it, you can:
Press Play to play a tape
Press Operate to off, with tape in
Press Forward to fast forward
Press Rewind to rewind a tape
Press Stop/Eject to on, with tape in
If you are playing a tape fast forward, you can:
Press Play to play a tape
Press Operate to off, with tape in
Press Pause to pause playing a tape
Press Stop/Eject to on, with tape in
If you are playing a tape fast backward, you can:
Press Play to play a tape
Press Operate to off, with tape in
Press Pause to pause playing a tape
Press Stop/Eject to on, with tape in
...
A user can press buttons in the manual itself—it’s just a web page, viewed in
a browser—and it would work just like the device it described. You can get nice-
looking buttons relatively easily by using cascading style sheets. Better still, you
could use images for the buttons.
For many devices, whether a PVR or microwave oven, a user’s tasks won’t be
to get from a known state to another state, but simply to get to the desired state,
regardless of the initial state. We can generate a manual for this sort of use.
To represent a device in an unknown state, we represent its possible states as
a set, and we define a function to find out what set of states the device will be in
after a given sequence of button presses; it involves some fun programming. A
breadth-first search can then be used to look for unique states. Then, by defining
some routines to explain things in (for instance) English, we can print out the
sequences of button presses to get to each state. We now have the user manual
that tells a user how to do anything regardless of what the device is doing to start
with. Notice how short it is. Perhaps because of its brevity, we can get some
interesting design insights straight from it.
Whatever the device is doing, you can always get it to
Clock by pressing Clock .
Quick defrost by pressing Quick defrost .
Timer 1 by pressing Clock , then Time .
Timer 2 by pressing Clock , Time , then Time .
Power 1 by pressing Clock , Time , then Power .
Power 2 by pressing Clock , Time , Time , then Power .
This time I had some fun making the English a bit nicer. It says, “press a, b, then
c”; the data for this microwave oven never needed the fancy “press twice” feature.
But making a manual look nice helps—and is more fun to program.
These instructions suggest that the clock button ought to have been called Reset . If so, note that you can still get to the quick defrost state by pressing it (that is, pressing the original button Clock ) first, then the Quick defrost button.
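The search that produces this sort of manual can be sketched along the same lines as before (again the spec object is an illustrative assumption; a set of possible states is kept as a sorted list, so that joining it with commas gives a key for the visited table):

function step(spec, states, button)
{  // The set of states the device could be in after pressing button,
   // given that it could have been in any state in the set states.
   var seen = {}, result = [];
   for( var i = 0; i < states.length; i++ )
      seen[spec.next(states[i], button)] = true;
   for( var s in seen ) result.push(s);
   return result.sort();
}

function printResetManual(spec)
{  // Breadth-first search from complete ignorance (any state possible)
   // toward certainty (exactly one state possible).
   var start = spec.states.slice().sort();
   var queue = [ { states: start, presses: [] } ], visited = {};
   visited[start.join(",")] = true;
   while( queue.length > 0 )
   {  var here = queue.shift();
      if( here.states.length == 1 )
         document.write("Whatever the device is doing, pressing "+
            here.presses.join(", ")+" gets it to "+here.states[0]+"<br>");
      for( var i = 0; i < spec.buttons.length; i++ )
      {  var s = step(spec, here.states, spec.buttons[i]);
         if( !visited[s.join(",")] )
         {  visited[s.join(",")] = true;
            queue.push({ states: s, presses: here.presses.concat([spec.buttons[i]]) });
         }
      }
   }
}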
If such a manual is "good," we might ask what a device would look like for which this manual was the complete explanation. To find out, all we need do is change the English-printing routine to one that goes back to the device specification and sees which parts of it are used and which are not. Here's the result, summarized as a table:
                             Buttons
  States           Clock   Quick defrost   Time   Clear   Power
  Clock                          ✓           ✓
  Quick defrost      ✓
  Timer 1            ✓                       ✓              ✓
  Timer 2            ✓                                      ✓
  Power 1            ✓
  Power 2            ✓
Compare this table with the state transition table used for the same device in
section 9.2.2 (p. 277).
We could draw this table in many different formats, which might be a good idea
to try for much bigger systems—when size and readability become more serious
issues.
Look at the entries in this table: these are the only parts of the microwave
oven’s state machine specification that the user manual required. The Clear button
doesn’t seem to be helping much—there are no ticks in its column! Our generating
a manual and then automatically going back to the specification has exposed po-
tentially bad design elements. If this sort of manual is a good idea, then the Clear
button as presently defined is a design feature that needs better justification.
Here is an extract from a manual generated for a more complex system, the JVC
PVR from earlier chapters:
In the following states (play a tape fast forward; pause playing a tape; play a
tape fast backward) you can press Play to play a tape.
If you are playing a tape, but have paused it, additionally you may:
—Press Forward to fast forward.
—Press Rewind to rewind a tape.
In the following states (play a tape fast forward; play a tape fast backward)
you can press Pause to pause playing a tape.
—If you are playing a tape fast forward, you cannot do anything else.
—If you are playing a tape fast backward, you cannot do anything else.
This chunk of the manual would be “inside” a section explaining how the buttons
Operate (the badly named switch on/switch off button) and Stop/Eject work, since
they can be used at any time.
Many other sorts of manuals can be generated too, and by creating them sys-
tematically with programs we can guarantee their correctness. We can also use the
technique of going back from a good manual to reappraise the specification. After
all, if we have a good user manual, then the bits in the specification that aren’t
apparently needed are immediately suspicious features.
11.6 Tools to support user manuals
The shorter a user manual (while remaining faithful to the design), the simpler the system must be to understand, if not to use.
We can easily measure the length of basic manuals that have been generated
automatically. Indeed, you don’t actually need to generate the manual to know
how long it is; it’s easy to write programs to estimate the length. The estimate
may be awry for all sorts of reasons, but as the device design is changed,
changes in the manual length can be very informative for the designer. A new
feature that doubles the length of the basic user manual should be immediately
suspect (unless it is very useful)!
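For example, using the shortestPath sketch from section 11.5, one crude measure (among many possible) is the total number of presses the basic manual would have to describe:

function estimateManualLength(spec)
{  // Sum, over every ordered pair of distinct states, the number of
   // presses the basic manual would describe for that pair.
   var presses = 0;
   for( var i = 0; i < spec.states.length; i++ )
      for( var j = 0; j < spec.states.length; j++ )
         if( i != j )
         {  var p = shortestPath(spec, spec.states[i], spec.states[j]);
            if( p != null ) presses += p.length;
         }
   return presses;
}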
Experiments show that when user manuals are rewritten to be shorter, they are
more effective, even if the device design does not change. One of the few
outcomes of direct commercial value from interaction research is the
minimalist technique of developing manuals: it is cheap and easy to do, and it
effectively helps users perform better.
Typically, modified designs that result in shorter manuals will be easier to use.
Why not, then, set up an automatic process to generate good-enough user
manuals (if we are only interested in their length, they don’t need to have a lot
of detail), then modify designs so that the automatically generated manuals get
shorter?
There are different sorts of manual. Do not confuse manuals and technical docu-
ments written to help designers or to help the design process with manuals written
to help the user. What helps a user decide to buy one product or another, say, by
looking up functionality they want, is very different from what helps a user do
what they want to do with the product.
Section 8.8 (p. 259) discusses spanning trees and the algorithms for finding
them as a way of generating manuals.
Users typically want to focus on their actual tasks and activities. The approach for
designing manuals for this purpose is called minimalist instruction.
Instead of telling a user how to use all the functions of a product, tell them how to do useful tasks, and identify what these tasks or activities are. Get rid of clutter about what the device is for; assume that users already have domain knowledge—knowledge of what they got the device to do in the first place.
When users do use a device, they make mistakes. Therefore an important part of
a minimal manual (especially a minimal manual, which by its nature doesn’t cover
everything) is how it supports the user in successfully recognizing and recovering
from errors.
Rarely do users read manuals from start to finish. Manuals should therefore be
designed so that users can jump around to what they want to do, from moment to
moment. In fact, manuals should be hypertext, or web-based documents.
Section 11.5 (p. 393) shows how to generate correct text like this using the framework.
Let's suppose the technical author wants to rephrase this sort of instruction. There are many ways of doing it; for simplicity, we'll assume the author is writing in HTML.
First, the technical author can write something simple like this:
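For instance, using the microwave oven's state names (a sketch; only the function getFrom itself is assumed from the framework, and the surrounding wording is the author's own):

<p>To cook at full power from the clock display,
<script type="text/javascript">
document.write(getFrom("Clock", "Power 1")); // writes the correct presses
</script>.
</p>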
In this method, the JavaScript inserts some accurate and reasonably written English, for instance, press <span class=key>Time</span> then press <span class=key>Power</span>. (Note that the author has written a period after the
closing </script>.) If you are keen, the function getFrom could have some
more parameters to refine its style, so that the technical author has more
flexibility in phrasing.
There is no need to use the framework just to say how to use the device, as it
can provide all sorts of other information. Suppose there is a more basic
function than getFrom, called dataGetFrom that returns a list or array of button
presses. With this function, the technical author could write arbitrary
JavaScript to format the information derived from the device specification. The
author could write things like this:
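For example (a sketch: dataGetFrom is the framework function just described, but all the phrasing around it is the author's):

<p>Getting to full power from the clock display takes
<script type="text/javascript">
var keys = dataGetFrom("Clock", "Power 1"); // e.g., ["Time", "Power"]
var text = keys.length+" presses: ";        // writes "1 presses" if only one!
for( var i = 0; i < keys.length; i++ )
{  if( i > 0 ) text = text+" then ";
   text = text+"<span class=key>"+keys[i]+"</span>";
}
document.write(text);
</script>.
</p>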
The bad spelling and grammar can be fixed easily; see the discussion of the
plural function in section 4.5 (p. 105).
If the boilerplate approach seems too restrictive, the author could write something like this:
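For example (a sketch, assuming a function check that compares the author's hand-written sequence against the shortest sequence in the specification and warns if they disagree):

<p>To cook at full power from the clock display, press
<span class=key>Time</span> then <span class=key>Power</span>.
<script type="text/javascript">
check("Clock", "Power 1", ["Time", "Power"]); // warns the author if this is wrong
</script>
</p>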
. . . now the JavaScript is being used to check that what the author wrote is
correct, but it is not generating the text itself.
If you are keen, you could make the function check a lot more sophisticated
and flexible; for instance, it could take regular expressions instead of, as here, a
strict sequence that has to be exactly right or be flagged as wrong.
All the examples above warned the author if the designer had changed the design specification. The same ideas can give the designer warnings too. Here's one suggestion:
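One possibility (a sketch only, building on the assumed check function above): every call of check records what it relied on, so that when the designer changes a state the affected manual entries can be listed.

var checked = []; // everything the manual has checked so far

function check(from, to, keys)
{  checked.push({ from: from, to: to, keys: keys });
   // ...then compare keys against the specification, as before...
}

function warnDesigner(changedState)
{  // Called when the designer edits a state's behavior.
   for( var i = 0; i < checked.length; i++ )
      if( checked[i].from == changedState || checked[i].to == changedState )
         alert("The manual entry from "+checked[i].from+" to "+checked[i].to+
               " depends on the state being changed");
}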
These examples used cascading style sheets to format how the keys or key presses
look. Here’s a very simple style sheet, to show how it might be done so that all
keys get a simple but consistent format:
<style type="text/css">
span.key { font-family: sans-serif; border-style: outset;
color: white; background-color: silver;
}
</style>
This will give you consistent, light gray buttons with white sans serif text (text
like this) and a simple 3D effect that makes each button stand out. You put this
style sheet in the HTML head element, that is, anywhere between <head> and
</head>. If you make any changes to it, every key described in the manual will
change its format. As usual for cascading style sheets, this is a very useful sepa-
ration of style from content—the author can decide in one place what every key
will look like—and using the framework has also helped you separate the devel-
opment of the good English from the checking of its accuracy and correspondence
with the device specification.
11.7 Using real people to help design
With a bit more programming we can answer questions like, “What transitions
did the users try that the device isn’t designed to support?”
Clock was pressed in state Clock but did nothing
Quick defrost was pressed in state Quick defrost but did nothing
Clear was pressed in state Clock but did nothing
Power was pressed in state Clock but did nothing
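A sketch of how such a log might be gathered (the names press, state, and spec are illustrative): route every simulated button press through one function, and record the presses the specification ignores.

var didNothing = {}; // counts of presses the design ignored

function press(button)
{  // spec and state are the simulation's device specification and current state.
   var next = spec.next(state, button);
   if( next == state )
   {  var what = button+" was pressed in state "+state+" but did nothing";
      didNothing[what] = (didNothing[what] || 0) + 1;
   }
   state = next;
}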
We give suggestions for exploring design changes based on the Fitts and
Hick-Hyman laws in section 9.6.4 (p. 304), where we also define the function
Fitts we use here.
11.8 Conclusions
This chapter presented two key messages: use random techniques to help solve or
provide insight into difficult design problems, and use tools to help generate and
maintain good information about a design.
Microwave ovens were used as a running example. They are simple and easy to
understand, but they were only a driving example—with a simple system, we can
see the design issues crisply. More complex devices have worse design problems
that are even harder to manage well (and they are harder to write clearly about):
the need for good design methods is greater than the impression this chapter may
have given. The consequences of bad design are often much worse than not being
able to cook supper.
Now, at the end of part II, we’ve got both theoretical and practical ideas and
techniques for designing better systems, but there’s still a long way to go. Many
interactive devices are very complex. In part III, next, we change the approach to
take a higher-level view: how can we approach real, messy, interaction program-
ming and still be successful?
Write programs to check your programs that help you design → section 11.1
(p. 377).
Use automatic testing, not prolonged human user testing → section 11.1
(p. 379).
Provide feedback about the future behavior of the device, not just its present
state → section 11.3 (p. 385).
Arrange a device so that costly errors come earlier—avoid completion errors
→ section 11.4 (p. 390).
User error is defined by design error → figure 11.7 (p. 392).
Automatically generate sketch manuals → section 11.6 (p. 396).
The significant challenges we face cannot be solved
at the same level of thinking we were at when we
created them.
— Albert Einstein
Part III
Press on
You know how to design better from part II. Now part III, the final part of the book,
inspires you to put your knowledge into action.
Photograph © Basel Action Network, WWW.BAN.ORG
Woman in Guiyu, China, about to smash the cathode ray tube (CRT) from a computer
monitor in order to remove the copper-laden yoke. The immediate hazard is breathing
the phosphor screen coating, which is toxic. Monitor glass is later dumped in irrigation
canals and along the river where it leaches lead and phosphor into groundwater. The
groundwater is so contaminated that fresh water is being trucked in for drinking.
12 Grand design
Design is hard, and the real world, with its rush-to-market forces, won't wait long for a principle-driven designer. This chapter discusses how we can keep the advantages of the simplicity of finite state machines while changing our approach so that successful real-world, complex designs can be undertaken too.
There is no escaping that real systems are complex. How do we keep the benefits of a finite state machine approach—and all the checking and automatic features, such as user manual generation—while handling the unavoidable complexities and urgencies of real-world requirements?
Make the device behave like the real world: If it does, the user will
intuitively understand the device, and the designer can copy the real world’s
solution → section 12.3 (p. 415).
Change the world to fit the device: Think about the big systems picture, not
the device in isolation → section 12.4 (p. 426).
Make the device like its predecessor, only better: Then we can avoid
designing everything from scratch → section 12.5 (p. 427).
Make the design easy to change: We will get it wrong, but at least we ought
to be in a position to improve it → section 12.6.1 (p. 428).
Use simple techniques as long as possible: Then we can manage with ease
as much complexity as possible → section 12.6.2 (p. 429).
Use sophisticated design tools: This book covers finite state machine
approaches, but there are more powerful—and harder to use—techniques
available. Build your own design tools if necessary → section 12.6.3 (p. 431).
Make the device simple: To try to make it more complex than we can handle
is to betray the user. If the device is a medical or other safety-critical device,
this is a wise decision → section 12.7 (p. 434).
Know the user, and design accordingly: Find out what users want and keep
them happy; this may be far easier than putting lots of design effort into issues
users don’t worry about → section 12.8 (p. 435).
Why make devices like anything that has gone before? Be imaginative;
use computers properly. Think bigger—think outside of the box → section 12.9
(p. 436).
If all else fails: Take care to handle errors and get feedback from users, so at
least future designs can be improved → section 12.10 (p. 439).
Finally . . . The next chapter, chapter 13, “Improving things,” discusses the
essential attitudes and awareness that go with successful use of these and all
other design principles.
12.2 Make design disappear
Box 12.1 Problem solving: We need to solve problems all the time, but so often solving
the problem is so important that it focuses our energies—we need the solution!—and thus
we rarely step back to consider other approaches. Problem solving is a well-studied area
and provides many ways of making progress, often by going in different directions than our
natural focus would lead us in.
Design is a problem for designers, and use is a problem for users; so from both sides, it
is worth looking at problem solving, because it can help both designers and users.
Edward de Bono has collected and expressed many problem-solving principles. Often his
work has come under criticism because, perhaps, his ideas seem so obvious once they are
expressed clearly. His book Teach Your Child to Think has the advantage of collecting most
of his effective ideas into a single book. He argues that we should develop thinking tools, and
he gives some examples: just as a carpenter can cut wood and screw it together, we should
have thinking tools to split problems and join them together in new ways. His “six thinking
hats” is a tool that helps us compartmentalize our thinking styles; if we don’t clearly separate
our emotions, our judgments, our fact gathering, and so on, then we run the risk of being
driven down one route without considering important alternatives. Similar ideas are also
developed, with a much more direct relevance to user interface design, by Ben Shneiderman
in his Leonardo’s Laptop.
George Pólya was a mathematician who studied problem solving deeply, and his classic
How to Solve It should be read by everyone. Often when we try to solve a problem, we get
stuck. Pólya suggests that we should imagine the problem solved, and see if we can work
backward to where we are at the moment. Another deep idea is to generalize: sometimes a
specific problem has distracting details—can we work out what the general problem is and
see if we can solve that? Often it’s easier.
Terry Winograd and Fernando Flores’s ground-breaking Understanding Computers and
Cognition suggests (among many other ideas) that we should not always strive to solve
problems—sometimes we can dissolve them. That idea is developed in section 12.2 (fac-
ing page), where we suggest that design should be invisible.
Fridges switch their lights on when you open their doors; you don’t have to
switch the light on. Cars are another good example of this invisible or incidental
interaction. When you open a car door, the inside light comes on. On some cars, if you engage reverse when the windscreen wipers are on, the rear wiper comes on automatically: the car can work out that if you need the front wipers on to go forward, you will need the rear wiper on to go backward.
In some cars, the instrument panel is dimmed for night driving, but when you
reach your hand toward, say, the radio, it lights up brighter so you can see it better.
For car thieves, a quite different sort of user, the burglar alarm is another exam-
ple of interaction that happens as a side effect of interacting. Something happens
automatically, without direct interaction from the user—though in this case, the
interaction is done on behalf of the owner, not the user. With alarms, unlike car wipers, it's essential that the interaction, and the control of the interaction, are invisible; otherwise, the burglar would override them.
Cars don’t just interact with their users, they are also part of the bigger transport
system. Road tolls can work automatically from sensors—the driver just drives
past the sensor and is billed automatically for using the road. In the past, this
interaction would have required either a human operator or a machine to take
cash. Now the interaction has disappeared.
Interaction also becomes invisible in less friendly ways. Traffic lights and speed
cameras interact with cars; the driver has to do nothing other than be there. Traffic
lights can stay on red indefinitely if they “think” nobody is waiting for them. As
usual, it's wise to have timeouts in case there is an unusual situation beyond the sensors' awareness, such as a horse rider or cyclist the sensors cannot detect; you shouldn't keep low-tech users waiting forever!
Household energy-saving devices work in the same sort of way. If nobody
moves in a room, the lights go out to save energy—this is hopeless for a library,
where you would expect people to remain still as they read. Given that the lights
go out on a timeout, which can be frustrating as it will take people by surprise,
the timeout should be adjustable (perhaps up to infinity—that is, never—for peo-
ple who really want to override the feature), and the lights should come on again
when there is movement. A person plunged into darkness shouldn't then have to find the light switch; the light should come on again automatically! Even better would be for
the energy-saving system to slowly dim the lights; this at least would not plunge
anyone into complete darkness suddenly.
Timeouts seem a necessary evil for walk-up-and-use devices. Because users
may walk away from a device half way through an interaction, it is important
that the device reset itself to its standby “helpful” state so it is ready when a new
user walks up to it. Many walk-up-and-use devices work in many languages, and the first interaction they require is to find out what language the user wants; a new user may not be able to read anything on the screen left behind by the previous user. But timeouts are a crude way of doing this; much better is to use a proximity
sensor—a sensor that tells the device if there is a body or movement nearby. The
standard technology, such as passive infrared sensors (PIRs) or ultrasound motion
sensors, from burglar alarms can be used.
Sensors no doubt have a hard time telling the difference between people and
dogs or pigeons. A more reliable approach is to rely on the people carrying some
form of electronically readable identification, such as a key fob (which merely re-
quires proximity), or a swipe card—either magnetic strip or optical bar code—that
requires some basic interaction. If the user wears the swipe card on a cord, the
card can be kept in the device while the user is interacting. When leaving the user
has to take the card with them, and the device realizes they’ve left. Of course, if
the card is merely swiped through the device at the start of interaction, the device
doesn’t know whether the user is still there or not, which is the original problem
we were trying to solve.
There are many other technologies. Car keys now often have simple radio trans-
mitters in them, so a driver can unlock the car when nearby—the driver doesn’t
have to put the key directly into the lock. Most of these car keys are also passive
security devices: the car won’t start at all unless the key fob is close to the ignition
switch. The car “knows” the right user is present.
Although many of these devices are proprietary, it has often surprised me that
there is so little reuse of the technologies. For example, my key fob lets me drive
my car, but if my house had a sensor that could learn the code my key fob uses,
then I could get into my house or set the burglar alarm with the key fob. Instead,
I need to have two different devices to do this. The same comment applies to my
workplace: I need more keys and cards to get around the building I work in. Yet
it could easily have learned my car key’s fob codes and been authorized to open
doors with that.
I guess manufacturers who make locks get a useful income from selling swipe
cards or their own key fobs, and they have little incentive to make things easier
to use in this way. This reluctance, if it is reluctance, will have to change as more
people use implants for the same purpose.
In prestige and high-security applications, people are willing to have little tags
injected into them. The tags are little cylinders, about 2mm (a tenth of an inch) in
diameter, and can easily be forced under your skin through a large-diameter hypo-
dermic needle. Once under your skin, you may have automatic access to secure
areas, or have your bar and hotel bills paid—or rather, billed—directly.
VeriChip is one such technology, made by Applied Digital Solutions—and its
commercial success suggests that this is a design area we must take seriously.
VeriChip uses a passive implant (so it uses the power of the external sensor, need-
ing no power of its own) that responds with a unique number. The sensor then
uses this number to retrieve whatever information is relevant about the user. For
example, for access to a club, the scanner could bring up a photograph of the cus-
tomer.
From the railway company’s point of view, the disappearing ticket machine also
gets rid of passengers who travel without tickets. Most railway cars have weight
sensors on them; if the weight and number of wireless-enabled passengers does
not tally, a conductor can be warned to do a manual check. Or perhaps the train
could wait while the angry delayed passengers sort the problem out for them-
selves: people might get off one at a time until the train leaves without them,
because they don’t have any means of paying automatically.
We’ve completely eliminated the conventional interaction programming prob-
lems, but we’ve replaced them with new problems—we’ve replaced conventional
interaction programming issues with ethical and business issues:
While we have made the device interaction very smooth, some people won’t
have the resources to be wirelessly connected. For these people, the user
interface has become much worse.
The old device could take cash and was essentially anonymous. The new
approach still needs a way of billing the user, but now it needs to do it with
something like a credit card account. The railway company therefore knows
who you are every time you travel—as well as where you live, and your credit
rating, and so on.
The railway could invent its own credit card, which a traveler buys (they could
be wireless-enabled as well, so they do the complete job: providing passenger
location and paying). The credit on the card allows customers to travel, but
because they have paid up front, perhaps with cash, their anonymity need not
be compromised.
There are many possible “compromise” designs that make the interaction half
invisible. Why not, for instance, make the credit card idea low tech: it could be a
piece of paper. The piece of paper, once bought, allows 10 or 100 trips. The user
could punch it on the train . . . and we’ve just invented continental-style railway
ticketing.
When considering the tradeoffs between these alternatives, remember that most
walk-up-and-use devices are not meant to be perfect. Automatic ticket machines
are called “queue busters” because their purpose is to reduce queue lengths at
railway stations or at airports. If they only handle two-thirds of passengers, the
station or airport can employ fewer staff; for the (mostly regular) users, for whom
the ticket machines are successful, the experience will be good. Users with complex requirements for their travel or tickets would still need to talk to humans anyway.
12.3 Design like the real world
Figure 12.1: The letter X has four mirror symmetries and four rotational symmetries
(not counting turning it around in space, off the paper)—unless you look very closely
at the varying thickness of the strokes and at the details at their ends, the serifs. The
careful design of the symmetry-breaking details is what makes good typography—and
good interaction programming, once we have ways of seeing symmetry in programs.
problems. Mostly we find this practical problem solving so easy that we rarely
think about it.
One of the simplest and yet most profound symmetries is called translation,
otherwise known as moving. The laws of physics don’t change when something
is moved. You change something’s position—this is the change you intend—and
now you find everything seems the same. Movement is a sort of symmetry, just
like reflecting in a mirror. In mundane terms, my mug of tea works like a mug of
tea—and if I move it to another room it still behaves like a mug of tea. As I said,
this is so obvious that we don’t think about it.
Imagine that you generously give me a coin, which I place in my left hand. I
now put the coin in my right hand and open my right hand for you to see. The
coin has gone! You gasp! I am a magician. I have surprised you because I broke
a fundamental law of physics: merely moving a coin from one hand to another does not make it disappear. It should have been unchanged, for we know that
translation is a symmetry, a very basic fact about the way the universe works. If I
can break translation symmetry, I must be a very good magician indeed.
It’s even more fun doing magic with children, because they are still learning the
rules about how the world works. Even putting a hand over your face to make it
disappear briefly can get a young enough child excited about whether you’ll ever
reappear.
While breaking symmetry is fun when it is done for entertainment, it is devas-
tating when it happens for other reasons. You put your coin in your pocket, and
next time you look, it has gone! You know the universe doesn’t behave like this, so
you blame a pickpocket. You would be lucky if it was only a coin that was stolen.
In short, breaking a symmetry can be upsetting.
We like symmetry; it is deeply tied up with our aesthetic sensibilities. A wiggly
line is just a wiggly line, but a line that has a repeated pattern is a frieze and can
be used for decoration. Poetry mixes translation (movement, as you move your
eye down the verse) and some rhythm and rhyme that is unchanged (or changes
at a different rate) as you read—a partial symmetry that is artistically balanced between being overly repetitive and dull, and being unstructured.
Computer graphics experiments done with people's faces show that symmetric
faces are often perceived as more attractive than asymmetric ones. Symmetric
things have “balance” and seem more elegant; we like symmetry.
In contrast, broken things are often asymmetric. If you damage a ball, it will
have fewer symmetries and, for instance, will stop rolling freely (which requires
spherical symmetry). Damage, of course, often does random destructive things:
it would be hard to do damage symmetrically to something. When people or an-
imals are diseased or damaged in any way, the marks of the disease are very un-
likely to be symmetrical. When we go looking for mates, we tend to prefer people
who are symmetric, since this is a good heuristic that they are undamaged and are
more likely to be undamaged genetically. In evolutionary terms, the people who
mated with asymmetric partners probably had fewer healthy children. We, the survivors of generations of people making mating decisions, tend to like symmetry, since it has served our ancestors well.
Symmetry is also intensely practical. I’ll mention four advantages of it, before
we move on to applying the idea in design.
When we try to solve a problem but get stuck, we need to find another way of
looking at the problem, so that if we can solve it from this other view, we’ve
solved the problem. This is exactly what symmetry is: we want to transform a
problem but leave the essential bits unchanged. We sometimes even say we
want to “turn it over” in our minds.
Symmetry compresses information. If something has a symmetry, we can
throw some of it away and yet be able to reconstruct it later just from knowing
the symmetry. Here is a very simple example: a table of numbers and their
squares:
Number   Square
   1        1
   2        4
   3        9
   4       16
Now if I tell you that squaring is symmetric—that is, the square of a number is
unchanged even if you change the number to be negative—I have doubled the
amount you know from reading the table, without making the table any bigger.
Because squaring is symmetric, I need only show you half the table—you
know, for instance, that −4 squared is 16. Symmetry has effectively allowed
the table to be compressed without loss of any information.
Symmetry makes things easier to use. More succinctly, we called this principle
permissiveness: if something can be used in more than one way, it is
permissive. Again, this is symmetry. The transformation is to use the device a
different way with the end result—what the device does—unchanged.
A physical example of symmetry and the principle of permissiveness is that a
plug that looks symmetric ought to be symmetric. A USB plug ends in a little
metal rectangle; it ought to go in its socket either way up, then. But it turns out
that the appearance of symmetry is deceptive. A USB plug is not symmetrical,
though it looks like it is unless you look closely. USB plugs are therefore much harder to put in than they need be; either they should have been really symmetrical (so it didn't matter which way they went in), or they should not have looked symmetrical (so we would know which way is up).
See box 5.5, “Permissiveness” (p. 136), for further examples and more
cross-references.
Symmetry makes things easier to debug for programmers. From a user’s point
of view, symmetry makes things easier to learn. If we have good reason to
think something is symmetrical, then we can learn how it works (or check that
it works) by only trying it out partially. If we think the A and B parts of
something work the same way, that is, that they are symmetrical, then we need
only try A to know that B works the same way. It follows from this observation
that user manuals should explain when parts of a device behave the same way
as other parts; this will let users learn and understand faster.
To summarize what is a very big and wide-ranging topic: we learn rules about the
way the world works, and some very deep rules are called symmetries. Symme-
tries make things much easier to understand, they make problems easier to solve,
and they are aesthetically very satisfying, but occasionally they are broken. Under
the right circumstances, it may be an exciting game to break rules, but in more
prosaic circumstances it is awful. The world seems to have let you down.
Translating these ideas into interaction programming, a programmer should
identify the symmetries in the planned design, or in what the planned device is in-
tended to do, and make them obvious to the user—and make them reliable, rather
than half-hearted. And avoid appearing to have symmetries that actually don’t
work, like the USB plug.
Half-hearted, beguiling, design rules are discussed in section 10.3.3 (p. 337).
Box 12.2 Design guidelines: If we know we want to provide undo in a device, we can use
simple programming to check that the design allows all user actions to be undone, or we can
write programs to create undo actions for all user actions in the original device specification.
Either way, we have a device that has undo—which is what we wanted.
But what features should designers want in good devices? An idea like “provide undo” is
a typical design guideline, and there are books and manuals of guidelines that are a source
of inspiration—or even legal protection—for designers. Guidelines are particularly useful if
a group of designers are working together in a team, so that they have a coherent view of
the design work, or for a creating a range of devices that all work in ways the users see as
consistent. Often manufacturers will have proprietary guidelines that impose their distinctive
style, to ensure their devices have a suitable corporate feel about them.
Although there has been no systematic study of what makes for good guidelines, re-
searchers have been filtering them for years, trying to identify core guidelines that are few in
number, easy to use, and effective in design. Here's a sample, which I've organized to start with guidelines that a framework can readily help support or test, ending with ones that are more inspirational:
1 Match the user's task sequence
2 Provide undo, or provide easy reversal of actions
3 Remove modes
4 Provide informative feedback, and provide it immediately
5 Be consistent; utilize symmetry
6 Provide clearly marked exits
7 Provide shortcuts
8 Allow users to create shortcuts
9 Design for errors
10 Provide help
11 Provide a sense of progress; give users a sense of achievement
12 Information should appear in a natural order
13 Minimize the user's memory load
14 Speak the user's language
This book has raised all these ideas—and more—in discussion, and indeed a proper list of
guidelines would expand and explain the ideas in careful detail, to suit the types of device
being designed and the types of users and activities for which they are being designed.
After the generalities of typical guidelines, it’s important to have more specific guidelines
for users with impaired vision, hearing, or movement—especially if you are designing for
older people, children, or users working in harsh environments—say, where they have to
wear gloves or eye protection. Remember that you can often design for both “normal” and
“impaired” users by suitable techniques: for instance, you should always use more than just
color to distinguish buttons and indicators on a device—use different high-contrast symbols
as well—otherwise some users may not be able to tell them apart reliably.
to know so much about the world. Every chair you have ever seen behaves like a chair; you have not had to learn how each chair works. Your mind would be full if it had to hold all that information; instead, you've learned very general principles about how objects work.
More to the point, you’ve also learned how things like buttons work. If you
press a button, it goes in, and something happens. The affordance is that some-
thing that looks like a button is a pressable thing. If you are a designer, and you
want your user to press things, then make them look like buttons. Buttons have the
affordance of pressing. If you have to use a computer's flat screen, use shading and shadow effects to make the button look like it sticks out, so it invites pressing.
Figure 12.2: A wall light switch and its transition diagram: flicking the switch down takes the light from off to on, and flicking it up takes it from on to off. This is a UK switch; in the US, it would be the other way up.
The design rule is to find out the affordances and then represent them in the
device you are designing. Then the user will “automatically” know what to do—
they don’t need to know specifically how your device works, because it works like
other things they are already familiar with.
Other sorts of on/off switches are knobs that you rotate; again, they work just
as well when they are turned over. You could of course design an on/off switch
not to be symmetric, but this would be unusual—and result in something harder to
use. A real physical switch is easier to use if it is designed to reflect the symmetries
of the state transitions it manipulates.
[Figure: statechart of a single digit, the states 0 to 9 arranged in a circle, each button press stepping to the next digit and 9 wrapping back to 0.]
If we imagine that each of the four digits has a button underneath it, each would make a sort of domino. Here is our basic physical building block: a domino-shaped unit with space for a single digit and, immediately below it, the button that controls it. One of these dominos displaying the digit 3 would look like a single cell showing 3 with its button underneath.
Pressing the button under a digit increases the digit by one, taking 0 to 1, 1 to 2
and so on, and 9 back to 0. This is what the statechart diagram says.
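In code, a domino's behavior is a one-liner (a sketch):

// One domino: each press steps its digit 0, 1, ..., 9, then wraps back to 0.
function pressDigit(digit) { return (digit + 1) % 10; }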
Now, take a domino and move it right three times to get a row of four dominos.
A row of buttons and digits showing 1215 would look something like this:
[Figure: a row of four dominos displaying 1 2 1 5, each with its button underneath.]
No doubt, if we were designing a real product, it wouldn’t have all those lines
on it, but would have a nice smooth surface. With or without lines, it has a simple
horizontal symmetry: a symmetry of translation—if you move left or right, each
domino looks and behaves the same.
The statechart of the four dominos combined has a corresponding symmetry as
well:
[Figure: the combined statechart, four copies of the ten-state digit cycle side by side, one per domino.]
This is a really deep and fascinating outcome: the physical device and the state-
chart have corresponding symmetries. But before we get too excited, we must
acknowledge that there is an awful problem, brought about by the very symmetry
we’ve been claiming is a good thing. By convention, times go from 00:00 (mid-
night) to 23:59 (a minute before midnight) and then back to 00:00. But, thanks to
the consistent statechart, our clock’s display goes from 0000 to 9999; worse, strictly
it goes back to 0999, 9099, 9909, or 9990 with one user action—it takes four sepa-
rate user actions to go from 9999 to 0000. The buttons in the dominos treat each
digit as the display of an independent state machine; the device—so far—has the
sense neither to handle multi-digit numbers nor times.
The righthand pair of digits shouldn’t be able to get to a number higher than 59
minutes, and the left hand pair of digits shouldn’t be able to get higher than 23 (or
12 if we are using a 12-hour clock rather than a 24-hour clock). In short, it doesn’t
behave at all like we would want a clock to!
The opening story of chapter 11, “More complex devices,” exposes some of the
problems of clocks that can count from 0000 to 9999.
We can either try to change the world, and have 10,000 "minutes" in a day to make our lives easier, or we can spend a moment trying to reconcile our design with convention, which requires clocks to go from 12:59 to 1:00 and from 23:59 to 0:00. Affordance may tell us the "best" way of doing it, but convention tells us what users will like! (If we were designing a digital display for some purpose other than telling the time, say for an aircraft altimeter, we wouldn't have this conflict.) I blame the Babylonians for the mess, though the French tried a decimal system during the Revolution and, more recently, Swatch introduced an "internet time" dividing the day into 1,000 ".beats" instead of 1,440 minutes. It would be easier to make a .beats clock easy to use.
One way to satisfy social convention is to remove the first and third buttons and
combine digits into hours or minutes:
[Figure: two wider dominos, one for the hours and one for the minutes, each with a single button underneath.]
What we have done is to enlarge our dominos to display two digits, so one
domino goes from 0 to 12 (or 23), and the other from 0 to 59. Now the buttons
increase the units of the numbers, whether hours or minutes. Repeatedly pressing
a button cannot cause an invalid time to be displayed.
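In the corresponding sketch, the modulus now encodes the convention directly, so no sequence of presses can produce an invalid time:

// Two-digit dominos for a 24-hour clock: invalid times are unreachable.
function pressHours(hours) { return (hours + 1) % 24; }
function pressMinutes(minutes) { return (minutes + 1) % 60; }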
The disadvantage of combining digits into pairs is that it now takes much longer
to set a time. The righthand button might need pressing 59 times (say, to get it
from displaying 00 to displaying 59), whereas with two buttons for the same task
it would only need pressing 5 (for the left digit) plus 9 (for the right digit) times—a
total of only 14 presses, about four times faster.
The usual way of reducing the increased button-pressing effort is to make the
buttons automatically repeat, so the user can change the time from 0 to 59 with
only one press. To speed up this method, holding the button down for longer
makes the numbers change faster: when you want to slow down, stop pressing
the button, and start again. When we try to make things simpler, we often make
them more complex in other ways. We have to use our skill and judgment to find
the right balance.
Figure 12.3: A typical DVD menu design, schematically depicting a select scene menu
from a DVD. There are two distinct menus: the four scenes (as shown, scene 18 is
selected), and the main menu at the bottom, giving the user a total of 11 choices.
Similar issues arise in many DVD scene selection menus.
move in a consistent way. For example, predicting what moving down will do
from some selection on the bottom menu is not easy.
Explaining the DVD design clearly is a problem. The next few paragraphs are,
while accurate, tedious to read. This in itself is an indication that the design should
be improved. When designs are not easy to explain, there is less motivation for
thinking about them; you probably take shortcuts to avoid tedious explanation!
Please persevere for the next few paragraphs with that thought in mind—how
would you have designed the DVD interaction to be easier to describe, and easier
to understand, use, and improve?
The geometric layout is obvious from the appearance of the menus, as drawn in
figure 12.3 (this page), but the menus are also structured hierarchically, although
the screen layout does not make this very clear:
The menu item “main menu” goes up a level in the hierarchy if it is selected.
The “group scene selection” menu items on the bottom row (1-4, 5-8, . . . 21-22)
move across levels in the hierarchy and bring up a screen similar to the one in
the figure, which could have been displayed by choosing 17-20 from this menu.
Once a group scene has been selected, the selection moves to the lowest level in
the hierarchy, to a specific one of four scenes.
The keys ← and → go to numerically previous and next scenes, in the obvious
order they are seen in the film. Thus, given that scene 18 is selected in the figure,
→ would go to the next scene, selecting 19, then (if pressed again) it would go to
20, then 21 (redrawing the menu with scenes 21 and 22), then 22, then back to 1.
Similarly, ← would select 17 then 16, and eventually it would go back to 1, and
recycle round to 22.
Vertical movement, however, is deceptive. With scene 18 selected, ↓ moves to
scene 20, and then ↑ will go back and select scene 18 again. So we think we may have a rule: ↓ ↑ leaves you where you started, just as pressing ← → would
always leave you where you started.
Inconsistently, though, ↓ pressed in either lower scene (here, 19 or 20) moves
down to the main menu row, as does ↑ in either upper scene (here, 17 or 18).
Furthermore, ↓ from the left column (here, 19) moves to “main menu,” but from
the right column it moves to 21-22, because this is the next set of scenes. The emphasis is needed, since ↓ from the left column always goes to main menu, but from the
right column it goes to the next set of scene choices in the bottom menu, which is
different every time.
Now let’s summarize the main features of this design. Mostly, but not always,
the arrow keys move the selection point in the corresponding direction on the
screen. Left and right arrows also cycle, whether on the bottom menu, or in the
scene region.
Inconsistencies are almost entirely due to the multiple uses of ↑ and ↓ ; since there is no great call for jumping forward and backward two scenes, their use in the scene selection menu is questionable.
Whether this menu design causes real problems for users is an open question.
Whether improving the geometric affordances would reduce user problems is an-
other question. Both questions are worth exploring experimentally, though of
course DVD menus as such are not a major issue for usability—they are for en-
tertainment. Do the extra seconds a user takes, or the additional errors they make,
matter? Indeed, it is possible that the extra seconds enhance the user’s enjoyment,
for instance, by prolonging anticipation. Or is the graphic design (the static im-
pression, rather than the interaction) more important for marketing?
Many DVD menu screen designs are much worse, suggesting that ease of use is not a design priority. For example, many DVDs organize menu items in circles (or arcs), meaning that even within a menu there is no consistent direction for the arrow keys to follow.
12.6 Use design tools wisely
intended for the final design. The point is that every stage of the development
process is checked out well before you get to really needing it. If the spike uncov-
ers some problem, start fixing it now before it creates an actual delay in the real
development process.
An extremely powerful way to make designs easier to change is to build your
own design tools. Building tools forces you to think at a higher level—about the
properties of designs in general rather than your specific design. You then solve
general problems, and the solutions serve you well when you make changes to
your design as your ideas for it change. Then your design tools can ensure or enforce whatever properties your devices need.
Box 12.3 Defending finite state machines: This box reviews some of the defenses of FSMs
against the popular criticisms.
FSMs are large: FSMs for typical devices often have thousands or more states. The size
of a FSM is not a real concern once it is programmed, though obviously it would be
a serious problem if one wanted to draw the corresponding transition diagram.
FSMs are unstructured: FSMs are indeed unstructured. However, they can be "nested"
so that large classes of states are treated as one big state—this makes what you are
thinking about simpler, even though the underlying FSM is just as big. If an FSM
has been designed to be like this, it can be drawn easily using statecharts, which are
a very good way of drawing large FSMs.
FSMs are finite: FSMs are finite and therefore formally less powerful than infinite
computational models such as pushdown automata (PDAs) or Turing Machines. For
example, a handheld calculator using brackets is readily modeled as a PDA, and
therefore one might think it is not an FSM. However, all physically realizable digital
devices are FSMs, whether or not it is convenient to model them explicitly as such.
FSMs are not relevant to users: FSMs are mathematical or program structures, and
they do not exist in any useful concrete form for users except in the very simplest of
cases—where they are hardly necessary to help the users! Users should not be
expected to reason about the behavior of devices by using FSMs: they are typically
far too big—but this argument does not mean that designers should not reason using
FSMs, particularly if they have programs to help them do so.
Few systems are implemented as FSMs: Most systems are implemented in ad hoc
ways, and determining any model from them is hard, if not impossible. In this sense,
FSMs suffer from problems no different from any other formal approach. Better, one
would start with the formal model and derive (preferably automatically) the
implementation.
FSM models are impossible to determine: On the contrary, if systems are developed
rigorously, it is not hard to determine finite models of user interfaces from them.
FSMs are bad for design: The rigor of FSMs encourages interaction programmers to
make simpler devices that they understand, so they can analyze and build reliable
interactive systems. FSMs have a clear theory, and we can measure all sorts of
important design properties with simple programming. We can also generate help,
user manuals, and other important material from FSM specifications.
and this high-level specification can be translated by a tool into a finite state machine. The designer's tool can then generate (if it wishes) enormous finite state machines that would have been far beyond comprehension had they not been expressed in the high-level way. Thus the size of the finite state machine has been hidden from the designer, but all the advantages of the basically simple FSM approach have been retained; all of the measurements and other advantages our framework provides now come for free, with no extra effort.
Box 12.4 Reductionism: Reductionism says that we can make progress by reducing things
to bare principles and then reason successfully about useful properties of systems from those
underlying principles. We can discuss the large-scale behavior of planets in the solar system,
for example, without worrying about their weather systems; we can just study their grav-
itational influences on one another. (Reductionism is the reason science became so successful after the Renaissance.) Abstraction is the computer science term: abstraction
reduces the number of concepts that one needs to think about all at once.
Reductionism as a philosophy, ontological reductionism, asserts that not only is it con-
venient to think like this, but reality is really like this, that is, made out of simpler things.
In human-computer interaction it’s pretty hard to go along with this form of reductionism
because interaction has many deeply-interacting features: the user’s motivation, lighting
conditions, whether users are part of a social group, and so on. Many things that have to
do with user context influence the success of a design and cannot be reduced.
Yet if we fail to be reductionist at the right time, we miss out on useful insights. A
reductionist programmer would build really simple programs and get them to work—perhaps
using finite state machines—and then build on necessary embellishments. Ideally the result
of a reductionist approach would be a human-computer system that was understood and
was easy to modify, and so on.
Instead, there is a temptation to resist reducing design at any stage to make it simpler.
This is a recipe for despair and results in large ad hoc system designs that nobody—whether
designer or user—understands.
Thinking about states and actions and finite state machines is unashamedly reductionist.
They allow designers to think clearly about simple things. They don’t help designers think
about everything they need to or should! But where they help, they help enormously.
The drill and the farmer's problem—see sections 9.8 (p. 316) and 10.1 (p. 325)—were generated by program; writing the program was much easier than working out by hand the correct relations between all the states.
The program lex (and all its derivatives) is a well-known example: it reads a list
of regular expressions and builds a finite state machine from them. The regular
expressions are very much easier to read than the details of the finite state machine
that is generated by this process.
Lex is explained more fully below, in section 12.6.4 (next page).
Many concurrent programming notations, such as CSP, Spin, and SMV, can
compile into finite state machines; if these notations are easier for a designer to
use, then they can be exploited and all of the advantages we promoted in part II
can either be derived directly or indirectly by compiling into finite state machines.
(Unfortunately, you don’t always get finite machines for all cases, but that is a tech-
nical issue beyond the scope of this book; LTSA is a dialect of CSP that guarantees
finite machines.)
431
Chapter 12 Grand design
We rather constrained our use of JavaScript because the entire rules of interac-
tion were captured by our finite state machine specification. That is, we used no
serious power of JavaScript to decide what an interactive device would do; ev-
erything was determined by a finite state machine. In general, if we choose, we
could program any interactive behavior in JavaScript, but then our ability to ana-
lyze and understand what we had designed would be very severely curtailed. In-
deed, as most interactive programs are written in general-purpose programming
languages, there is very little hope of ever generating user manuals or drawing
transition diagrams from interactive programs written conventionally—or of get-
ting any of the other advantages we covered in part II of this book.
For many purposes, specifying systems as finite state machines will seem overly
restricted. Fortunately, there are several very well designed programming lan-
guages that are much more powerful and yet have all the advantages of being
tractable and easy to analyze. Foremost among these languages are SMV and Spin,
though there are several approaches that are less like programming and more like
specifying in algebra, such as Alloy and CSP. See the further reading at the end of
this chapter (p. 441).
In contrast, rapid application development (RAD) tools like Flash let you build
interactive systems very quickly, which is a benefit in itself, but there are few I would consider "sophisticated design tools," as they provide no analysis of what you are
building. Typically, they cannot even answer simple questions, like whether the
design is strongly connected.
. . . the simulations we created in part II were faithful to our finite state machine specifications. In fact, lex and yacc have been around a very long time, since the late 1970s,
and there are now many more advanced and flexible systems.
Here is a very condensed overview of lex, in sufficient detail to inspire you to
either find similar tools (on the web) or to build your own! In giving this overview,
I have translated lex’s compiler-compiler terminology into our terminology of in-
teraction programming.
A lex specification starts with some declarations: these declarations may in-
clude definitions of the buttons the user can press. After a %% is a list of regular
expressions and program code (written in C in the original lex) that is run when
the regular expressions match the user’s actions. The regular expressions are usu-
ally set up to recognize things like <=, comments, strings, numbers, and variable
names (the lexical tokens of the language—hence lex’s name). Here’s a very simple
example, written more in the style of lex than being a complete working example,
based on some of the main buttons of a simple handheld calculator:
// a simple calculator, with memory, display and error flag.
DIGIT [0-9] // define decimal digits
%%
{AC} { // clear display
if( !error ) d = 0;
}
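The rule that the + in the next paragraph refers to might look something like this (a sketch; m is an assumed name for the memory register mentioned in the opening comment):

{AC}{AC}+ { // two or more presses of AC clear everything
    d = m = error = 0;
}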
the calculator's registers, and this also happens if the user does AC AC AC AC . . . any number of times, because of the +.
One of the advantages of lex is that it makes a number of checks on the regular
expressions. For example, it will warn you if there are two regular expressions
that both match the same sequence of actions in an undefined way.
Yacc works in a similar way, but it uses a grammar rather than regular expres-
sions. If you were implementing a calculator, the grammar would be used for
handling expressions like 3 + 4 × 5/(2 + 2.3), and it could also do the AC AC sort
of things we used lex for above. Again, yacc can find many errors in grammars—
itself a good reason for using compiler-compilers, whether or not you want to use
the compilers they generate.
It’s hard to imagine many of the calculator problems discussed in section 3.1
(p. 61) arising if compiler-compilers had been used. Section 3.8 (p. 74)
discussed feature interaction on calculators, another problem that would be much
less likely if compiler-compilers were used.
Mock up a prototype using your favorite user interface design tool. Get it ap-
proved because it looks so nice. Start programming it in a decent programming
language like Java, because you can then do anything, and fix any problems . . .
what could be easier?
Unfortunately, we might accidentally make the actual device a bit more complex
than we intended, without noticing. Programs are notorious for looking easy but
camouflaging mysterious bugs—many of which are not noticed immediately, if
ever. Programs for user interfaces have the added disadvantage that their clarity
and correctness have nothing to do with how clear or sensible the user interface
is. Programmers can easily write code that is hard to use—and for which it’s hard
to tell that it’s hard to use.
Rather than use general-purpose techniques, it is better to keep the device sim-
ple because to try to make it more complex than we can handle (whether or not we
know what we can reliably handle) is to betray the user. If the device is a medical
or other safety-critical device or if the device is going to earn or handle money,
simplicity is a wise decision. Get it right first, by keeping it simple, then make it
better.
On environmental reasons for simple designs, see section 1.10 (p. 29).
Rather than see simple design tools as limiting hindrances, you can easily ex-
tend them. For example, if a design tool presents a device, say, as an overly simple
tree, this makes things easier for the designer (but harder for the user). The simpli-
fying view of the device needs extra tool support to make the final device easier to
use, but it still lets the design benefit from the designer’s better grasp of the device
goals. For example, a tree does not have “undo” and “go-back” functions (because
these introduce cycles), yet both such useful functions would be easy for a design
tool to add systematically to a tree.
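Here is a minimal sketch in C—the names are illustrative, not from any real tool—of what adding “go back” systematically to a tree means: every transition down the tree pushes the state it left onto a stack, and the added back button simply pops it.

#include <stdio.h>

#define MAX_DEPTH 32

typedef struct State
{   const char *name;
    struct State *child[2];       /* a tiny two-button menu tree */
} State;

static State settings = { "Settings", { 0, 0 } };
static State play     = { "Play",     { 0, 0 } };
static State top      = { "Top menu", { &play, &settings } };

static State *stack[MAX_DEPTH];
static int depth = 0;

/* follow the tree downward, remembering where we came from */
static State *press(State *current, int button)
{   State *next = current->child[button];
    if (next == 0) return current;          /* no such child: ignore the press */
    if (depth < MAX_DEPTH) stack[depth++] = current;
    return next;
}

/* the systematically added "go back" action */
static State *back(State *current)
{   return depth > 0 ? stack[--depth] : current;
}

int main(void)
{   State *s = &top;
    s = press(s, 1);
    printf("now at: %s\n", s->name);        /* Settings */
    s = back(s);
    printf("after back: %s\n", s->name);    /* Top menu */
    return 0;
}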
Section 8.7.2 (p. 253) raises Chris Alexander’s insight that a city is not a tree.
Section 8.7.3 (p. 254) discusses how to add undo and other functions to trees.
12.8 Know the user, and design accordingly
Knowing your users makes design decisions better informed and the design process more
productive. The process of getting to know users includes and involves them in
the design process, which helps ensure they will be more supportive of the systems
that, otherwise, they might have felt had been imposed on them.
If users are not involved in design processes, they become disenfranchised. And
“users” here can mean not just the people who end up using the systems being
designed, whether mobile or desktop, but also their managers and other people
throughout the organization and its supply chains. To give just one example, the
British National Health Service is spending approximately £3 billion a year (in
2005) on a major IT project; one of the main causes of the spiraling costs has
been the way frontline staff have become disengaged from the design process.
Participative design is crucial.
Problems of using people for even simple systems were outlined in section 6.5
(p. 191), where we discussed a very simple alarm clock. The issues were also
discussed in section 11.1 (p. 369). The merits of human testers compared to
gnomes are discussed in section 11.7 (p. 401). The present discussion continues
in the next chapter, in section 13.2 (p. 450), where we compare and contrast
user-centered design and designer-centered design.
12.9 Exploit computer magic
Figure 12.4: A PDA simulating a calculator. Instead of pressing buttons with your
finger, you use a stylus on the touch-sensitive screen. Compare this with figure 3.1 (p. 62).
The first picture, 1, below shows the correct equation 3 × 6 = 18 with the user
just starting to write “divided by 5” by hand. What the user has written—if we
ignore the 6 correction the magic paper put in a moment ago—is 3 × /5 = 18, with
the /5 newly written in. Now, the correction is 30, not 6, so the calculator changes
the last correction, 6, to 30. All the changes are done with nice morphing, fluid
animation, and scaling. This is shown in picture 2 below; the smooth change
6-to-30 happens as the handwritten 5 is converted to a nicely typeset 5.
Essentially, at this point, the user has solved the equation 3 × x/5 = 18 for x, and
found that x = 30, without having to use any variable names or to rearrange the
calculation, as they would have to do on an ordinary calculator.
Pictures 1 to 4: successive snapshots of the handwritten equation as it is edited.
Next, in picture 3, the user has drawn a ring around the 3 × , and they are in the
process of dragging it to the bottom of the fraction. Picture 4 shows what happens
when the user lets go—the 3 × has moved to the bottom, making 3 × 5, and the
magic paper slides it into the exact place from where the user dropped it, at the
same time showing the correction 270 that is needed on the top.
This very brief description hardly gives a full impression of how nice the cal-
culator is to use. The examples above seem contrived, so that the features can be
explained—they give a rather stilted picture of how it works. It feels much more
fluid in practice. For example, you can write 2³ and it will show 2³ = 8 immedi-
ately; if you then drag the 3 down below the 2, it will show 3² = 9 immediately.
It is very satisfying to play with.

Figure 12.5: Writing on the new calculator using an interactive whiteboard. The
whiteboard has a back-projection so the user does not cast a shadow on it, and it
detects where they are writing by using cameras. (Photograph: Phil Boorman.)

Try working out a calculation like, “if I get 15% discount, how much can I buy
for $12?” and you’ll soon appreciate that the flexible moving around is very useful.
Also, being able to edit a calculation helps give
confidence; if you don’t really believe the answer, you can change the 15 to 5, say,
and check the answer changes in the way you expect.
Will did a small experiment asking computer science students to come with
their own ordinary calculators and do high school math exam problems. Will then
repeated the questions, but asked them to use the new calculator. No user got the
wrong answer for any question with the new calculator, though they did when
they used their own calculators.
This is an extraordinary result, suggesting that there is a lot of potential for this
new design, especially for critical applications, like calculating drug dosages. That
people enjoy using it—they even laugh as they see equations solved!—is very ex-
citing. The calculator works wonderfully in schools, with interactive whiteboards
where your writing can involve whole arm movement.
The moral is that we simply decided to think out of the box; we wanted to make
a magic calculator, not one like all of the others.
12.10 If all else fails
Don’t try to be perfect. Often a device designed to be only 80% successful will
be better than one that was designed to be perfect—see section 12.2.2 (p. 414).
This list of ways to anticipate errors—and to turn them to advantage—is not ex-
haustive!
12.11 Conclusions
Finite state machines provide a theory and practical framework for interaction
programming, but they get tricky to use for large design projects, particularly
when working under real-world pressures. This chapter provided a wide range
of heuristics for design that can either be used in conjunction with finite state ma-
chines, or independently. For a summary, revisit the list of principles at the start
of this chapter, on p. 409.
Good design is not just a matter of knowing (and putting into action) theories
and principles. It’s also about understanding people (whether users or designers)
and having the right attitudes. The next chapter examines these more personal
and ethical aspects of design.
12.11.1 Some Press On principles
Change the world, rather than design from bad requirements → section 12.4
(p. 426).
Use “spikes” to check and debug the entire proposed design and processes →
section 12.6 (p. 428).
12.11.2 Further reading
Thimbleby, H. W., Blandford, A. E., Cairns, P., Curzon, P., and Jones, M., “User
Interface Design as Systems Design,” Proceedings People and Computers, XVI,
pp281–301, edited by Faulkner, X., Finlay, J. and Détienne, F., Springer, 2002.
The sections about the ticket machine in this chapter are based on this
conference paper I wrote with my colleagues. The paper makes a few
additional points—it also has a longer reference list if you want to follow up
ideas.
Thimbleby, H. W., “Reflections on Symmetry,” Proceedings of the Advanced
Visual Interfaces Conference, AVI2002, pp28–33, 2002.
Thimbleby, W. J., “A Novel Pen-based Calculator,” Proceedings of the Third
Nordic Conference on Human-Computer Interaction, ACM NordiCHI, pp445–448,
2004. The calculator has advanced considerably since that early paper; you can
get more information and download the current version from
www.cs.swan.ac.uk/calculators.
There are many books on problem solving; here are just a few that are particularly
relevant to interaction programming:
de Bono, E., Teach Your Child to Think, Penguin, 1993. de Bono is the inventor of
lateral thinking. Each chapter of this book succinctly covers a different creative
approach to thinking.
Michalewicz, Z., and Fogel, D. B., How to Solve It: Modern Heuristics, Springer,
2000, is a wide-ranging discussion about problem-solving in programming. It
was mentioned in chapter 8, “Graphs,” for its coverage of the traveling
salesman problem.
Pólya, G., How to Solve It, Princeton University Press, 1945, reprinted by
Penguin, 1990. This is a classic and easy-to-read book—and popular enough to
sell over a million copies. If you like George Pólya’s work, a more
mathematically detailed book is Pólya, G. Mathematical Discovery, Combined
edition, John Wiley & Sons, 1981.
Shneiderman, B., Leonardo’s Laptop, MIT Press, 2002. This is a very inspiring
book, both about thinking and about thinking about computer design.
Winograd, T., and Flores, F., Understanding Computers and Cognition, Ablex,
1986. “Winograd and Flores” has gone down as one of the classic turning
points in the field; their book lifted everyone’s ideas about philosophy
(specifically phenomenology) and showed how to think in new ways about
interacting with computers. The book ties finite state machines into speech
acts, and presents interesting ideas about email/CSCW systems.
Compiling technology has moved on considerably since the early days of lex and
yacc. Of course, you will be able to find up-to-date details using a search engine.
Aho, A. V., Sethi, R., and Ullman, J. D., Compilers: Principles, Techniques and
Tools, Addison-Wesley, 1985. This is the classic “dragon book,” with a very
good introduction to the use and principles of lex and yacc.
Appel, A. W., with Palsberg, J., Modern Compiler Implementation in Java,
Cambridge University Press, 2nd edition, 2002. This is a more modern book,
available in different editions for different programming languages such as
Java.
Parr, T. J., and Quong, R. W., “ANTLR: A Predicated-LL(k) Parser Generator,”
Software—Practice & Experience, 25(7), pp789–810, 1995. ANTLR is an
object-oriented compiler-compiler that is very versatile and easier to use.
13 Improving things
All these example value systems could be refined; they illustrate a common as-
sumption, though, that the value system is obvious from the context. However,
the value system of choice is influenced by the designer, by the market, and by other
implicit factors. Given a design problem, it is hard to assess every possible is-
sue carefully; we’d simply be overwhelmed. It is very hard to combine making a
mobile phone small, light, fast, easy to use, technologically up to date (and show-
ing off the fact), cheap, international, trendy, high-resolution, . . . Instead, we have
preferences and heuristics for finding design solutions by focusing on what few
values we really want to promote. Rather than spending a lot of time searching
for a really good idea, we might just polish the first or second reasonable design
idea we think of.
Efficiently finding good-enough solutions, rather than taking time to find the
best solution, is called satisficing—see section 2.10 (p. 54).
We have higher priorities. After all, if we take the time to develop the “perfect”
device, it may never be finished and reach the market at all. If we do not deliver
on time, at cost, and with something working, the whole exercise is pointless. We
are under constraints, and within those constraints we need to do as good a job as
possible. As Voltaire put it, the best is the enemy of the good.
In other words, a designer identifies the value system a device should represent,
and that value system then helps focus the design process, so the designer doesn’t
have to consider everything possible. In this book, we’ve often assumed that the
design values include efficiency of use, and, given those values, several examples
show how problems can be turned into opportunities for fresh design. If the prob-
lem is that the user does not know which buttons do anything, why not put lights
in them so that they light up when they actually do something?
Chapter 11, “More complex devices,” showed that using lights may double the
speed of a user. In chapter 8, “Graphs,” we used graph theory to identify
problems, but we could also make the “identification” itself a feature of the
user interface. For example, when a user comes to a bridge (a graph theoretic
bridge, that is), the relevant button lights up and attracts their attention. This
feature is easier to add than changing the device specification, and it might be
a lot more effective in improving the user’s performance or willingness to use
the device. It would certainly make the device stand out in a shop or product
review.
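Here is a minimal sketch in C of how a design tool could find such bridges, assuming the undirected view of the device graph taken in chapter 8. A depth-first search gives every state a visit time and the lowest visit time it can reach without reusing the edge it arrived by; a tree edge is a bridge exactly when the state below it cannot reach back above it.

#include <stdio.h>

#define STATES 5

/* adjacency matrix: a triangle 0-1-2, plus edges 2-3 and 3-4 (two bridges) */
static int adj[STATES][STATES] = {
    { 0, 1, 1, 0, 0 },
    { 1, 0, 1, 0, 0 },
    { 1, 1, 0, 1, 0 },
    { 0, 0, 1, 0, 1 },
    { 0, 0, 0, 1, 0 }
};

static int visited[STATES], low[STATES], timer = 1;

static void dfs(int s, int parent)
{   int t;
    visited[s] = low[s] = timer++;
    for (t = 0; t < STATES; t++)
    {   if (!adj[s][t] || t == parent) continue;
        if (visited[t] == 0)
        {   dfs(t, s);
            if (low[t] < low[s]) low[s] = low[t];
            if (low[t] > visited[s])        /* t cannot get back above s */
                printf("bridge between states %d and %d\n", s, t);
        }
        else if (visited[t] < low[s]) low[s] = visited[t];
    }
}

int main(void)
{   dfs(0, -1);       /* prints the bridges 2-3 and 3-4 */
    return 0;
}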
13.1 People are different
We now turn to Lesson 4 to refine what is meant by “different” and then to explore what can be
done about it.
Your users are different from you: they have different skills, likes, dislikes, and
attitudes—they have different personalities. What makes sense to you as a de-
signer or programmer is unlikely to make sense to a typical user. It is often said
that a designer should know their users: find out who they are and what sort of
systems suit them. Do they prefer, and perform better with, graphical interfaces
or textual ones? Do they need training and online help, or are they already highly
trained? Are there special issues, such as the work environment: is theirs noisy,
badly lit, or hazardous? You might think, “Who needs to make products for badly
lit contexts?” But every home has a badly lit living room where people sit down
to watch television in the evening: how do they use the DVD player, the sound
system, the remote control in the dark? Every car is driven sooner or later at night:
how can the controls be well lit, but not bright enough to be a distraction? (Per-
haps the controls should have gesture sensors, so a gadget like the radio only lights
up when a hand moves toward it.)
How will you balance security issues (say, if the product is stolen) against mak-
ing it easy to use—you don’t want a thief to find it too easy to use?
Are children going to use your gadgets? Children work under different social
pressures than adults. They have strong views about work and fun, and while
their reading skills may be lower, they have a huge amount of curiosity and a high
threshold for complexity. Are you designing systems for teaching? If so, some
students are more motivated by the experience than by the resulting grades and
competition. Designs for one sort of learning preference will differ from those for
another.
Are you designing for an international market, with people with different lan-
guages and cultures using devices? Arabic is written right to left, and the order-
ing of things, whether text or button sequences, might need to be different than
for an English left-to-right culture. Color preferences will be different. The way
people respond to you studying how they are different will itself be different! In
some cultures, it would be natural for a user to think you know best and should be
pleased. If so, it will be difficult to find problems with your design. In the 1920s, the
Hawthorne Works in Illinois wanted to see how illumination affected workers—
just as today we might be experimenting to see how design affects users. The
now-famous experiments have gone down in history as the Hawthorne Effect:
users will react in interesting ways to your doing experiments with them.
There are four important alternatives to knowing the user.
Get somebody else to know the user. There are plenty of usability companies,
psychologists, ergonomists and others that can find out about your users. But
remember that knowing your user is an ongoing process: users will change as
they use your devices (they’re supposed to be useful, right!) and although an
early market survey will find out useful information about users, the people
surveyed may not fully understand what the planned product will do. If you
are designing for a specialist area, such as medicine or diving, you will need
expert help. Retain a usability expert or company to repeatedly know the user
as the product takes shape.
447
Chapter 13 Improving things
You don’t need to know the user if they can find your products—get the user
to know you. Since everybody is different and the world is a big place, there
must be somebody who likes your product! If you are designing a product that
ends up only suiting 10% of people, that’s still a huge market—even if you
don’t know which people it suits. What’s more, a focused product is more likely
to be distinctive, so the people who like it will really like it. In short, you don’t
need to know the user if the user knows you—and there are enough happy
users to keep you in business.
Make devices customizable. If you don’t know what users want, at least allow
them to change the design so that they can make the design suit themselves.
Provide good ways for users to give you feedback on the design (for example,
provide a supporting web site with a good mechanism for getting and replying
to product comments from users), so that you can continue to improve it.
Know yourself first. The better you understand your unique makeup, skills,
and personality, the more effective you can be. You will better understand how
you differ from and complement other people.
13.2 Conventional design
Box 13.1 The ISO 9241 standard ISO is the International Organization for Standardization. Its stan-
dards cover all sorts of activities; many have become legal requirements. The ISO 9000 series
of standards covers quality and quality processes and will be familiar to many organizations.
ISO 9241, Ergonomics of Human System Interaction, is a standard applicable to all aspects
of the material in this book: it covers usability, including hardware, software, and usability
processes. You could consult ISO 9241 to design a gadget, evaluate a display, set usability
metrics, evaluate a graphical user interface, or test a keyboard.
ISO 9241 contains checklists to run a usability evaluation and examples of how to oper-
ationalize and measure usability. (It also contains extensive bibliographies.) The following
table gives an idea of the coverage of ISO 9241. Note that most of it is about mea-
suring user satisfaction or performance, something that cannot be done until a product is
completed.
Each design objective can be measured for effectiveness, efficiency, and satisfaction.
Any business not familiar with ISO 9241 (and the related standards, such as ISO 13407)
is failing to use an important resource in user interface design; a business should at least
know the standard, whether it then uses it or knowingly decides not to.
Conventional wisdom brings users into the design process to understand the users, collect empirical data on users’ behav-
ior and performance, and iterate designs to improve them. Some say that close
involvement with users is the only way to produce a good design. By studying
how users really perform, usability of systems can be significantly improved, and
engaging users in the process, as a political act, builds their commitment to the
design.
The lottery effect explains why interactive systems don’t work as well as ex-
pected. We are surrounded by good interactive systems, so we naturally expect
new systems to be good too. But we overlook the fact that there are many bad
interactive systems too, probably far more bad than good ones, but bad systems
don’t survive. If we are in the process of designing a new interactive system, we
want to make one that survives, of course, but the statistics are against us. I sus-
pect that the lottery effect has influenced much of the accepted wisdom on what
best design practices are supposed to be.
The lottery effect is discussed in section 2.2 (p. 40).
There are all sorts of ways of designing, some better than others. User-centered
design (UCD) in particular emphasizes that when building interactive products,
the whole point is users and the overall context of what they are doing and expe-
riencing. UCD starts with a strong emphasis on finding out what users want by
defining what “success” means for them.
The conventional approach to UCD has four phases: analysis, design, imple-
mentation, and deployment. The sequential nature seems to make obvious sense:
work out what you want (which for UCD means finding out what users want or
need), design it, make it work, and then get it out to users. Following this sequen-
tial structure, here are some typical activities for each phase:
Analysis phase Meet with key stakeholders, typically the final users, to define
the design vision. Distinguish between the users who will get their hands on
your device, and those who are your clients—the ones helping to specify the
device and its function (and probably paying for it).
Design phase Work out the principles that will guide future decisions. Design
may be broadly split into low- and high-fidelity design:
Normally, you would begin design with paper and pencil, drawing
screens or panels, button layouts or whatever. Create a low-fidelity
prototype, out of cardboard or wood, so that you can begin to conduct
usability testing on something—you can give users the block of wood,
and see how it will work out. Is it too big or too heavy? These are crucial
design questions and easily answered through a little prototyping. Often
low-fidelity prototypes encourage users to play and be creative in how
things should work, because there isn’t really a “correct” way for them to
be used that limits their thinking. Get users to imagine scenarios of how
the devices will be used.
Then create a high-fidelity detailed design. Repeat the usability testing
again now that you have something that really works.
Deployment phase Finish the system and productize it. Confirm that all or
enough of the original objectives have been achieved. Start shipping.
While each step makes perfect sense and is useful, there are several problems with
this conventional UCD sequence. “Begin at the beginning,” the King of Hearts
said gravely to Alice, “and go on till you come to the end: then stop.” A sequential
process seems sensible, but the world is more complicated; unexpected things can
happen, and it’s rare to be able to assemble all the stakeholders at the beginning
and not need to keep going back to them. What about feature interactions that get
spotted too late?
What if the programmers have new insights into the design in the implementa-
tion phase? Or, after the deployment of a new product, it will probably be found
that it does not work as well as hoped: the design needs revising. Thus design
should be seen as a cycle, supporting iterative design. That, of course, means that
the design process should assume that every iteration starts with an earlier design
to build on or modify. Indeed, in product design, it is rare to build a completely
new device; most products are “merely” upgrades and developments of previous
models. If a design is going to be iterated—as it will—then it is very important
that each cycle through the design process not have to start again from scratch.
Design tools should be used so that designs—and user manuals—can be incrementally
modified, rather than having to be started again or, worse, put off indefinitely by
the overwhelming effort of starting again.
The sequence isolates the skills of the delivery team from the design phase. In
fact, it turns the programmers into mere system builders rather than architects. Un-
derstanding why UCD has come about will help us to understand it better.
The original idea of iterative design was simple: build a
system, get some experience of it in use, and then modify the design to make it
better. Key to this process is to reliably evaluate how the system is being used so
that useful insights can be fed into the next iteration of the design.
It is counterproductive to impose a prototype design on everybody in an orga-
nization. If it doesn’t work very well, it can cause a business disaster. Instead,
iterative design is usually done with a few selected users—maybe around five
people. They help identify problems and issues, which are then fixed, iterated
again, and—when the design is good enough—imposed on everyone else in the
organization.
We’ve mostly moved on from organizations imposing ideas on their workers.
Today workers more often use commercial off-the-shelf software (COTS) and con-
sumer devices like mobile phones. The way these products are designed is quite
different. For a start, the organization does not define what it wants and procure
a new system; instead, it selects from a choice of products already on the market.
Whether or not the chosen products work well, the design emphasis has shifted
away from the organization to the manufacturer. The design problem is now to
make a range of products that people want—rather than products for what people
say (or their managers say) they need. Simply, if a device manufacturer is not mak-
ing products people want, it goes out of business. Unfortunately, it has to have the
products available before people will buy them.
In short, the original idea of iterative design has to change with the times. Now
products are released after comparatively little evaluation and design, at least very
little compared to the size of the (anticipated) market. If the product succeeds in
the market, make more of it; if it fails, bring out the next model. The manufacturer
goes through the iterative design process, and as soon as something works “well
enough” it is put on the market. If it sells well, they’re in business. If it doesn’t
sell well enough, a new model is in the pipeline anyway. To cope with failure,
typically a manufacturer makes a range of products catering to different needs. If
one product fails or doesn’t sell too well, the broader portfolio keeps the company
afloat.
Originally, iterative design was a good approach—when workers were forced to
use systems designed for them. Now, users are increasingly customers, not work-
ers. Iterative design still works, but since a manufacturer puts a lot of effort into a
design, it may as well put it on the market as soon as possible and see if somebody
likes the product. For many products there will be somebody who likes it; if there
are enough people, the product succeeds. This newer approach allows compa-
nies to be more innovative—they don’t need to know exactly what tasks and work
users do; they merely need an idea that catches customers’ imaginations. And if
they catch the imagination of the procurement department of a large organization,
then many sales follow automatically.
Web design is different again. Unlike physical devices, which have tooling-up
costs, a web site can be iteratively developed almost continuously. A problem now
is that if you are successful, you still want to improve, but you don’t want to lose
customers already used to features that probably should be fixed or improved.
On the other hand, if the design is not good enough, revenue can stop almost
immediately, and—crucially—the web site designers get very little feedback from
the users who fail to use the site effectively. If it doesn’t work for them, they are
unlikely to provide feedback to help improve things. Thus there is even more need
for design done with representative users, for instance, users chosen at random
from the public, perhaps paid to participate in evaluations.
Iterative design often goes wrong when a product is prematurely released to the
public—the product may not be optimal or as well designed as it might have been,
but at least the company making the device is (hopefully) still in business.
We are still arguing about the merits of different sorts of keyboard; see box 5.2,
“The QWERTY effect” (p. 125). Proper iterative design might have postponed
the adoption of the keyboard; instead, it was better to have a worse design
available sooner.
While the historical progression has naturally seen an increasing emphasis on
the user, from knowing their task to knowing what they like, there has been a
corresponding decrease in emphasis on the designer and implementor. Design is
very hard. Most programmers, probably happy enough to have built a system that
works at all, have no spare energy to make it, additionally, work well for users—
indeed, usually programmers are not motivated by involvement in the creative
design process. Moreover, UCD trivializes programming into a passive following
of requirements: users drive design, and programmers try hard to make users’
ideas reality. These two forces together create a culture where the designers’ and
programmers’ potential insights into device usability never get nurtured.
Suppose, for example, that you write out a list of requirements for a camera and have it specially built.
There may be conflicts between your requirements that are not at all obvious until
you are holding what you said you wanted.
In hindsight, asking for a low-weight camera is an obvious requirement, but
now you’ve wasted a year and don’t have quite what you wanted—though you
do have exactly what you asked for. To summarize: in ordinary design, you spend
a lot of time specifying what you want, you wait a long time, and you don’t get
quite what you really wanted. In bad cases, your developers will hand over the
product and say it meets your requirements; they aren’t interested in making it
better.
How would iterative design work with cameras? After you’d made your list of
basic requirements, you go to a camera shop and describe what you want. The
shop assistant spreads half a dozen cameras in front of you and says they are
pretty much what you asked for. Try holding this one? Perhaps it doesn’t have a
battery in it and doesn’t even work—but you get an idea of what it feels like. You
want one that feels more solidly made? Then you should try this one . . . This is
satisficing, and you’ve got yourself a camera that is good enough and, because of
its market size, costs a fraction of what you thought you really wanted.
In a sense, the shop assistant has shown you some rapid prototypes of what
you wanted. In fact, they only took seconds to show you—the camera shop is
not building prototypes—but the idea is the same. You can get a good idea of
what you want from quick mock-ups, or (as here) alternative products. You can
now work out with the shop assistant what you really want. Instead of specifying
exactly what you want, you are instead working with things that really work, and
you are saying to the shop assistant how you want to change what you are being
shown for something better. It’s easier and faster to say how to change things than
to specify a complete design.
A good camera shop will let you test or rent a camera: try before you buy. When
you get your nice digital SLR home, you’ll probably discover that you wanted a
more convenient shape. Or if you take a compact home, perhaps you’ll discover
that you want better control. Who knows? But you wouldn’t have anticipated any
of these changes you want if you’d tried to write out all your requirements before
starting.
In iterative design, you get something close to what you want very quickly, and
you then work with the designers to make it better. Both of you are talking the
same language. You can try the prototypes out, and better imagine how further
developments will work. You can see how the device will change your life and
make a better list of requirements and changes. In short, you end up with some-
thing closer to what you want, and sooner.
Box 13.2 Paper prototyping Most of this book has been about how to get a design to work,
to know exactly what it is, to analyze it so you know how to tweak its design to get the best
performance, and to use tools so that you can build it reliably and can tell the user exactly
what it does. More or less. That list misses out the important bit of how to get going.
Where do design ideas come from in the first place, and how do you get the design ideas
into your head so that something appropriate can be implemented?
There are three simple rules:
Start with paper prototypes. These only take a few minutes to make, and you can
work with users in their environment and context to see how they get on with the
prototypes. Paper prototypes have the huge advantage of not inhibiting people from
making suggestions. They are easy to scribble on. They are easy to update during a
session with users as new ideas get sorted out. A block of wood, half a dozen clipped
on pieces of paper, and a packet of colored pens can simulate all sorts of complex
devices.
It can take years to work out what people want and then implement it. When you
finally deliver the finished product, users will have drifted and changed their
minds—and in any case finally using your device will change how users want to work.
Instead, implement parts of your design ideas quickly, and try them out quickly with
your users. This way, users get more involved in product development too. If you are
lucky, you need never deliver 100% of the original planned product; it’s much easier to
deliver 60% and stop while everybody is still happy! This idea is a form of agile
development.
Whatever you do won’t be ideal; it will have flaws—users will have moved on, the
market will have found another fashion, your suppliers won’t have some chip in
stock . . . Design devices to break. If you don’t design them to break, they will
anyway, but then you won’t know what to do. Instead, design them to break, so you
have a planned way of fixing them. One way to help is to design a framework like this
book suggests, so that the framework is a firm, well-debugged, foundation, but you
can very easily fix broken design details.
User manuals are usually written last, after the design is finished—too late. Instead, why not start writing the user manual as early as possible—indeed,
at the first meetings with stakeholders or while brainstorming with users—and
then many design issues can be fixed sooner. By definition, the user manual is
something users can understand, so for them it is a helpful thing to see as soon
as possible so they can provide feedback. Of course, if you want the user man-
ual developed concurrently, the design process needs to be supported by tools, a
framework, that ensures that working on the manual is not effort lost as the design
is revised.
The benefits of concurrent design have been touched on in several places in this
book; they were introduced in section 6.8 (p. 195), which also defined the key
problem in interaction programming.
One reason why government computerization projects so often fail is that they
specify requirements and expect a system to work to the requirements on a go-
live date. After all, that is what they have procured. Rarely do systems work as
expected, or even at all at first. And systems that involve users are even less likely
to work as expected.
The real world is complex, and there are no general solutions to the problem of
procuring interactive systems. Here, however, are some thoughts that can help:
Require processes. If suppliers are using best methods, such as iterative design,
the products they deliver will be better in the end than if they work to the
procurer’s original functional requirements in a strict way.
Require prototypes. Arrange to get users’ hands and eyes on prototypes as
quickly as possible so that things can be learned and the designs modified as
necessary. Sixty percent of a prototype today is going to be more use than a
more complete prototype in a few months’ time—the more complete prototype
will be harder to modify, and you will have lost months of opportunities.
Early prototypes should include all factors of a design (if only as mock-ups):
screen layouts, user manuals, interactive help, web backup, . . . to show that
they can be integrated.
Require systems that are open, or at least extensible. Whatever system is finally
delivered will almost certainly need modifying as soon as you get experience
using it. Make sure that modifying the system is easy. What features does the
system have to help maintain its own help and user manuals, for instance, so
that modifications can be tracked and made consistently?
Try to procure something that is like an existing product that already works and
can be tried out. New “solutions” will have completely new problems. As with
the rationale for iterative design, it’s easier to start with something that works
and make it better than to specify something from scratch.
13.3 Varieties of evaluation
Running formal experiments may please your
professor, who may be more interested in whether you can do statistics than in
your insight into how to improve a bad user interface!
Having said that, it’s important for this book to give you a taste of the usability
evaluation methods (UEMs) that are available. Evaluation is a key
part of design; if you don’t know how well you are doing against your goals, it
will be hard to improve.
Devices we design won’t usually be perfect; in fact, initial designs will probably
have numerous problems that are easy to identify and well worth fixing. We will
want to improve them. But to improve things, except by pure luck, we need to
know how our designs are performing now, compared with what we want them
to achieve. And, the world being what it is, we will often find that we change our
minds about what we want the design to achieve—users certainly change their
requirements as soon as they start using a new system that is any use to them!
Devices won’t be perfect. The usual solution is to recruit users, either directly
or indirectly, to help improve the design. In fact, this is the easy approach. Things
are badly designed mainly because the designers didn’t think enough first. Users
should have been consulted to understand the domain, certainly, but then clear
thinking should help enormously to get a good user interface to do what it should.
The problem is (in my mind) that historically we haven’t thought long and hard
enough about design, and the only way to fix the mess that results was to empha-
size getting users involved to help find the problems. The right way, though, is to
design better (which is much harder) and make the evaluator’s job easier.
Evaluation is pointless unless it is done within a framework that can interpret
and make use of any insights. Central to evaluation, then, are three factors:
Requirements What do we want to achieve? One of the things we want to
achieve is a device that is easier to evaluate—so program the device so that
evaluation (recording user data, user logs, and so on) is built in; see the sketch
after this list.
Evaluation How well are we achieving it?
Iteration Can we easily change the system to improve it?
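Here is a minimal sketch in C of what building evaluation in from the start could look like—the names and the log format are made up for illustration. The point is architectural: every user action goes through one central transition function, so no action can escape being recorded for later analysis.

#include <stdio.h>
#include <time.h>

static FILE *logFile;

/* append one timestamped record per user action */
static void logAction(const char *button, int fromState, int toState)
{   if (logFile == 0) logFile = fopen("user.log", "a");
    if (logFile != 0)
        fprintf(logFile, "%ld %s %d -> %d\n",
                (long)time(0), button, fromState, toState);
}

/* the device's one central transition function */
static int pressButton(int state, const char *button)
{   int next = state;   /* a real device would look up its transition table here */
    logAction(button, state, next);
    return next;
}

int main(void)
{   int s = 0;
    s = pressButton(s, "AC");   /* every press is now evaluation data */
    return 0;
}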
There are different sorts of problems that may be uncovered, ranging from minor
problems to killer problems, and there are fixable and unfixable problems. The
goal is to find significant, fixable problems cost effectively and not to create worse
problems as others are fixed! So-called discount methods are stripped-down eval-
uation methods that are supposed to give good results with less effort.
There are three main approaches to evaluation, typically called inspection, in-
quiry, and testing methods.
Inspection First, we can make measurements of a design without involving users
at all. Most of this book, especially part II, provides ideas for this analytic or
structural analysis. These are things that can be measured or determined from
the design itself, usually using experts—expert programmers, expert
designers, expert usability people, or expert users.
Inquiry We can make measurements from a working (or mocked-up) system,
getting data about how people use it in the field. This can provide a huge
amount of data, but this method has two major problems: people who have
problems with the design may not do enough to be proportionately
represented in the data (if they can’t get the system to work at all, you have no
data whatsoever!); worse, you don’t hear at all from people who choose not to
use it. You may never find out how to make your market larger.
Testing Finally, testing takes specific design questions or trade-offs and sets up
tests to see which options are best. Typically, testing is done in a laboratory.
Each of these basic approaches can be used in early stages of design, during proto-
typing or once a design has been released. Most people forget the rich information
that can be got from real users—particularly if they can register (or even use) the
system over the web. If users are involved, it’s important to get the right users:
for example, the people buying a system may be managers and not actually the
people who will be using it. Most user surveys collect lots of information about
demographics—like, what sort of users are we selling to?—but very little informa-
tion that will help design better products.
More guidelines are given in box 12.2, “Design guidelines” (p. 419).
In a heuristic evaluation, the design is assessed against such principles, prefer-
ably by several experts working independently. These queries could be expanded
indefinitely; that’s why usability experts are used to bring their experience and
knowledge to the evaluation. Typically, the expert evaluators use checkbox lists
and record their comments on areas for improvement. For instance, each principle
might be scored on a scale from 0 to 4: 0, not a usability issue at all; 1, cosmetic
problem; 2, low priority; 3, major problem; 4, a catastrophe that must be fixed.
The point of heuristic evaluation is to assess critical design issues that are be-
lieved to be generic. If a simple usability study done with a few users encoun-
tered some problems, you’d have no idea whether this result was significant—is
it a quirk of the user or a flaw in the design? With heuristic evaluation, one tries
to identify the systematic problems that are likely to be significant. Obviously,
heuristic evaluation will be much stronger if it can draw on real usability studies
done with real users. For example, the expert evaluators may notice that the de-
vice does not have undo; what they can’t tell is whether this matters given the way
real users would use it.
Cognitive dimensions (CDs) give a vocabulary for talking about trade-offs like these in a design. Here are a few of the dimensions:
Viscosity A viscous system is one that requires many actions for a user to achieve
goals. There are two sorts of viscosity: repetition viscosity—concerning the
difficulty of doing the same thing repeatedly, and knock-on viscosity—the sort
of viscosity that arises when a “simple” change requires many steps.
Progressive evaluation Can a user find out as they work how well they are doing?
Does a device give immediate feedback?
Abstraction Abstraction allows a user to change and generally extend their user
interface in ways they want. For example, on a mobile phone a user may add
speed dialing codes: in one or two button presses, they can now dial long
numbers. In effect, + 3 (or whatever the convention is) becomes an abstraction,
hiding the detail of what it means, for instance dialing +441792521189.
Provisionality Can a user do “what if” operations? Can they test parts of an
interaction, for instance to check they have understood things or have got their
data correct, before committing to doing things? Can I write a text message,
choose who to send it to, but not actually send it yet (not on my phone, you
can’t)?
Cognitive dimensions can be traded against each other. For example, the more
abstraction a device provides, generally the harder provisionality becomes: it is
now harder to check parts of the abstracted actions. Before the abstraction of speed
dialing, an incorrect number was fairly easy to change, say correcting digit-by-
digit, but with speed dialing the phone has probably already dialed the number
before it can be corrected.
Of course, the idea of CDs is that the designer thinks: is it possible to have
abstraction and provisionality? For speed dialing, it is. The speed dialing could
behave exactly like the user entering the phone number; if so, a user could check
the number is right, and correct it as any other number they enter. Whether this is
better (more provisionality) or not (reduces the speed-up of abstraction) requires
further thinking about what the device and user are trying to do.
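Here is a minimal sketch in C of the provisional version of speed dialing—the names are made up for illustration. The speed-dial key only expands the stored number into the display, where it can be checked and edited like any number entered by hand; nothing is dialed until the user separately commits.

#include <stdio.h>
#include <string.h>

static char display[32];
static const char *speedDial[10];      /* stored numbers, indexed by key */

/* abstraction with provisionality: expand the number, but do not dial */
static void pressSpeedDial(int key)
{   if (speedDial[key] != 0)
        strcpy(display, speedDial[key]);
}

/* dialing happens only on an explicit, separate commitment */
static void pressCall(void)
{   printf("dialing %s\n", display);
}

int main(void)
{   speedDial[3] = "+441792521189";    /* the number from the example above */
    pressSpeedDial(3);
    /* the user can now check or edit the display before committing */
    pressCall();
    return 0;
}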
In a think aloud evaluation, users work with a device while saying aloud what they
are doing, what they expect to happen, and what features they think the device
should have but doesn’t. You need to give your users some basic instructions: for
example, they don’t need to analyze how they are thinking (that’s your job).
Often, especially with early prototype evaluation, you have to help users
through impasses when they otherwise can’t work out what to do. Of course,
these impasses are things that must be fixed in the design, and finding them will
be very useful. Unless you are a psychologist, once you’ve found an impasse in
the design, it probably isn’t very interesting to let the think aloud session continue
without helping the user out of the predicament that the next version of the device
will avoid.
Isn’t it more realistic to say that design is complex and that humans—whether
users, customers, or designers—are not capable of doing it perfectly? Some informa-
tion from users is invaluable, of course, and designers should heed it; likewise—
and hardly ever emphasized by the conventional wisdom—designers’ intuitions
too are valuable and surprisingly powerful. Designers have more experience, a
systemwide view of design, training in handling complexity, and so forth. One
should also draw on their expertise, hence designer-centered design, DCD.
An analogy will make the virtue of DCD apparent. Suppose we use conven-
tional empirical usability to test the quality of a wooden floor. Most of the time,
most users will not get their feet caught in the one raised splinter, so conventional
evaluation is unlikely to spot the problem that needs fixing. Therefore the floor
is accepted with a flaw. On the other hand, with designer-centered design, any
competent designer would think about splinters and devise a process to ensure
that there were none. Or you could, somewhat pushing the analogy, imagine a
salesman writing a brochure claiming that there are no splinters. If that claim is
written early enough (concurrent design again) and the designer reads it, they will
ensure that there are no splinters by devising processes to prepare the floor to meet
its specifications. Similarly, with interactive product design: users can easily miss
flaws that are hard to find. Any empirical quality control is therefore a chancy
operation. On the other hand, if designers are involved, almost any usability crite-
rion can be ensured at early stages in the design process rather than checked later,
when things can be tested on users.
Design processes that involve users—what they think and what they do using
mock-ups, prototypes, manuals, or field systems—can only be informed by what
is encountered, not by the principles and aims of the design. In contrast, design-
ers have implicit knowledge about design, which can be made explicit and checked by
writing programs or by generating visualizations, user manuals, and so forth.
Because designers can create multiple perspectives of a design, they are in a cen-
tral position to reflect about the design. Different forms of the design—working
models, user manuals, help texts, transition diagrams—highlight different sorts
of design issues. What we want is a reflective designer or, better, a designer who
writes or talks aloud about the design and makes insights explicit. Once explicit,
they have to be justified, and if the designer’s ideas are indeed sensible, they have
to be ensured so that the final product actually works in the intended way.
Design aloud is
not a walkthrough to find out what the user thinks (although this is valuable as
another design tool) but a technique to facilitate the designer’s thinking. Some of
the designer’s explanations as they think aloud will be convoluted; such expla-
nations identify potential areas of change to the design that would enable clearer
explanations. Elegant and satisfying parts of the design, on the other hand, can be
generalized so that the same qualities are available more uniformly or consistently
across the design.
Imagine that the designer explains to a potential user how the device works and
how it can achieve various tasks. The designer must explain its principles. This should
not be a demonstration—certainly not a demonstration that follows a script, as
demonstrations are too easily turned into drama that encourages people to imagine
more there than there is, and an exercise of imagination makes it very hard for the
designer to be completely honest. We need a design approach that recognizes the
difference between a real system and a show—and works with a real system. A
central purpose of designer-centered design is to uncover wishes and hence refine
designs so that unfulfilled wishes become realities.
In design aloud it is more productive not to use a mock-up system (such as ones
made out of paper, storyboards, and other simulations) for the same reasons. A
running prototype should be used, but it should be treated as the real system.
The advantage of working with computers is that they have higher standards of
accuracy than humans (that’s why they crash so often!), and you cannot get away
with any hand waving. You can say of a piece of cardboard that it should work in
such a fashion, and your users may believe you, but a computer program shows
whether what you are saying about the design is fact. You cannot fake a high-
fidelity prototype, and thus many more detailed design issues will be uncovered.
We want the designer to be able to say, “I used the wrong design,” “the same
design feature works over here too,” or “the design is more general than that”—
these sorts of insight about the design cannot emerge from a storyboard that has
no specific design! Design aloud generates insights into details of system design
rather than into those superficial (and more easily changed) aspects such as appear-
ance. Typically a design aloud session generates interesting programming ideas.
Thus, design aloud is used when a system is nearly ready for delivery, but before
its design is fixed.
The design aloud rules require that the designer honestly explain how the sys-
tem actually works and what, if any, its limitations are. Designers can also express
intentions, describing features as they wish them to be—but they must be strictly
honest in distinguishing actual and wished-for features. Obviously, design aloud
can be recorded for future reference and can be done with one or more designers
working as a team; exactly how you put the idea into operation will depend on
circumstances.
Here’s a simple example: The designer explains that the user’s name must be
typed in a box on the screen. The designer (being honest) says that the name must
be no more than 100 characters. That is an obvious usability bug, but in design
aloud no user has had to discover it. For the sake of brevity, these are trivial ex-
amples. Ideally, the designer in a design aloud session would think of new ideas
and improvements as they explain the design.
Box 13.3 Design as literature Since hearing it expressed in 1982, I have been impressed
with Don Knuth’s insight that good program design is literature. Programming is design,
and design is literature too. It is but a short step from this idea to combine them and make
design worth talking about. And a design that is really worth talking about becomes a better
design if you can start talking about it soon enough, while the design is in progress. Knuth
invented “literate programming” as a means of combining the best of both programming
and literature. When a program is written to be read as literature (as well as a computer
program), it gets better, because the programmer gets involved in more than one narrative—
the program itself and its explanation to people who read the literature about it. Seeing it
twice, and combined automatically in a literate programming tool, encourages and enables
the programmer to program better.
Literate programming is similar to the ideas in chapter 9, “A framework for design,”
where we developed a programming framework that enables an interactive device specification
to be converted into a user manual. Seeing the device itself, as a working prototype, its
manual, and the program itself, together gives the designer multiple views of the same
thing—and helps generate design insights. Moreover, the framework significantly reduces
the work required for it all to work together in a coordinated fashion. When you have a
design insight, you want to be able to put it into action easily!
Knuth is probably the most prolific programmer ever. His programs (notably TeX, which
is used to produce most of the world’s scientific literature in physics and mathematics—and
was also used for typesetting this book) have changed typography forever, and his Art of
Computer Programming is among the most significant books written on computer science—
it was named, along with Einstein’s The Meaning of Relativity, as one of the best twelve
scientific monographs of the 20th century by the magazine American Scientist. We should
therefore take very seriously two bits of Knuth’s advice:
The designer of a new kind of system must participate fully in the implementation.
and
The designer should also write the first user manual.
These ideas work for interaction programmers too. The dual perspective of being both a
designer and an expositor enables you to see all sorts of inconsistencies and shortcomings
that are not visible from any one point of view of the design. If you have the design tools
that facilitate this, so much the better.
13.4 Ethics and interaction programming
Deontology Good is defined by rules. Society is defined by rules, and we all get
along when we follow them.
Following the rules—for a doctor, the laws and professional codes—is what makes a
good doctor, at least for a deontologist. Even if doctors who disagree with the law
use their own rules, they are still deontological in their approach to ethical issues.
In design, deontology corresponds to the idea that good user interfaces are ones
that are defined following guidelines. In systems where users are highly trained,
from driving railway trains to flying space shuttles, it is crucial that user interfaces
(often made by many different companies) adhere to the training and expectations
of the users.
Situation ethics (or ethical particularism) A situation ethicist would say that
ethical issues are specific to given situations. Thus whether a doctor should
perform an abortion depends on a whole complex of issues, such as whether
the conception happened because the parents were trying to obtain transplant
material for a sibling, or whether the sex of the child will be desirable in the
culture. Each situation is complex in its own way, and it would be
unreasonable to have rules to make automatic decisions.
In design, situation ethics starts from being clear about what the user’s task
and context are. What is the user doing with the system? The situation the user is
in may not be the general one we planned; we should study specific users, specific
tasks.
Utilitarianism Often we have to make ethical decisions for groups of people. A
utilitarian will do a cost/benefit analysis to maximize the overall benefits.
A utilitarian would say that in design we should persuade people with the eco-
nomic benefits of new devices; cost effectiveness is all. Or reduced keystroke count
is all.
Web sites are perhaps the easiest way to see the connection between “good de-
sign” and utilitarian outcomes. Web sites make money. The “better” the site,
the more money it makes—this is a utilitarian approach to goodness. Better here
means more profit.
We have to make decisions for ourselves and, for a utilitarian, any decision
has multiple consequences; a utilitarian balances ethical consequences almost me-
chanically.
Consequentialism What are the ethical consequences? For a consequentialist, the
ends justify the means.
A consequentialist would say that we can’t know how good a user interface is
without evaluating it under real use—what are the consequences of using this
system? Good device design must be based on empirical evaluations, and partic-
ularly ones that are “ecologically valid,” seeking to evaluate the effectiveness of
the user interface in real conditions. We should collect usability metrics, data like
how effective users are in performing their tasks.
Virtue ethics It is hard to follow an ethical code of behavior of any sort. Instead,
you as an agent in the world can try to be as good as possible: develop your
virtue, your integrity, skills, patience, and so on, so that when you are
thrown into an ethical situation your whole character and maturity guide you.
Virtue ethics is almost the complete opposite of consequentialism: if you are virtu-
ous, the causes are good, so the results must be good. The virtuous designer might
take this approach: “I am a good and experienced designer, so the systems I make
are good.”
Hedonism A hedonist says, simply, “Enjoy yourself!” If people are having fun
and are happy, all other issues are distractions.
Empower the user! The user experience is paramount. Hedonism is obviously im-
portant for the games industry, but is also a key issue in driving customer choice.
If you don’t know how to make a good mobile phone—so the customer cannot tell
the difference between your phone and a competitor’s—then promote hedonism.
A better phone will be one users enjoy and know they will enjoy.
Professional ethics All the ethical stances discussed above could be about
individual preferences. In contrast, professional ethics is concerned with how
people work professionally, in organizations, as members of professional
bodies, or as consultants.
Professional ethics is often taught to students doing vocational courses—we want
students to get jobs and to stay on the right side of the law! Capitalism embod-
ies a particular ethics—“the market is good,” sometimes deteriorating into “the
market is right.” When that is bolstered with “work within the law,” one has a
professional ethics—special concern for contractual behavior, handling conflicts of
interest, respecting privacy, working with children, and so on. Professional ethics
is often reduced to a (deontological) list of guidelines that professionals should
follow; if so, it is then called a code of professional responsibility and thus avoids
any hint of the interesting philosophical issues ethical codes raise.
Ethics is of course much bigger than this short list can begin to cover. There are
feminist ethics, Marxist ethics, and a whole range of ethical approaches that not
everybody would even begin to agree on, such as sadism and various religious
ethics. And so on. There is also more to design: game playing, learning
environments, persuasive applications (“captology”), and secure systems (such
as voting systems) highlight further, complementary ethical issues that
cannot be explored here.
The various approaches can be classified in many ways. Ethical monism is the view that
there is one principle on which we should make moral judgments: so hedonism, say, is
monist, taking pleasure as the single measure of good. Ethical pluralism, in contrast,
recognizes several principles at once.
Box 13.4 Hippocrates Hippocrates had suggestions for practicing doctors that were innova-
tive in the fifth century BC: doctors should take an oath that defined them professionally.
It required that they have an ethical position and be reliable. He also suggested that to do
medicine well doctors should listen to their patients—they should know their users.
Ancient Greek medicine has virtually nothing to do with modern interactive system design,
but parallels can be drawn. We have essentially no theories of effective design; instead, we
must emphasize practical approaches to building better systems. Central to designing well
is the attitude of listening to users and having a worked-out ethical stance—a topic that is
discussed at more length in section 13.4 (p. 467).
There is a deeper parallel: medicine is about making people healthy; interaction program-
ming is about making people’s lives better. Even without considering interactive implants
and their direct effects on health, we should take improving lives as seriously as physicians do.
The fields of usability divide fairly cleanly along their ethical sig-
natures: this is not surprising, since usability is about making computers better,
and ethics is about what better itself means. The discussion above considered eth-
ical positions as design issues; we can go the other way around and look at ap-
proaches to HCI in ethical terms.
Design guidelines approach “Follow this style guide and get a better user
interface” corresponds to deontology—what is right is defined by rules. There
are rules for adhering to the look and feel of most platforms, and there is a very
strong sense in which following the style guides makes any system better,
because users will be familiar with the rules.
Box 12.2, “Design guidelines” (p. 419) summarizes some guidelines; see also the
last chapter’s further reading (p. 441).
Fit for purpose Some claim that what is paramount is making systems “fit for
purpose,” best helping users achieve their goals. (This naturally assumes that some
effort has been put into finding out what users’ tasks and requirements are.) Being fit
is, ethically, a utilitarian approach. If we emphasize being user-centered (to
determine exactly what particular users want), then the corresponding ethics is
situational.
Formal methods Another HCI approach comes from computer science’s formal
methods: there is, in principle, a right way to design systems. This is an
absolutist or rational ethics. Almost diametrically opposed to this approach is
typical industrial design practice: a designer makes a striking design that is
accepted as good because the designer is good—this is virtue ethics.
Empirical studies of usability The empirical studies that inform usability raise
well-known ethical issues. Does logging user activity infringe on users’ privacy?
Does timing users put them under unhealthy stress? Such ethical issues are
about how one studies usability problems, rather than about how one applies
the discoveries to make better systems.
People have been arguing over the various ethical points of view for centuries, so it
is not surprising that the diverse HCI approaches, given that they can be put into
correspondence with ethical views, are not readily reconciled either. People ar-
gue! Notwithstanding experimental methods, explicit discussion of ethical issues
has been largely avoided in mainstream HCI. If making a computer system more
usable allows fewer users to achieve higher levels of productivity, then usability
puts people out of work. Is this for their good?
People are complicated, users are complicated, and human-computer systems
reflect human diversity. Central to all approaches to usability is that systems fail
unless the needs of users are explicitly considered, hence the term “user-centered
design.” This is essentially the prime ethical judgment—how one attempts to
bring it about influences the secondary ethics. Nevertheless, you can’t be user-
centered if the devices don’t work; you also need to be technology-centered—
hence this book, of course. To keep the balance, it helps to see HCI as a
balancing act between “externalist” and “internalist” perspectives: one consider-
ing the user, task, and environment external to the device, the other considering
the programming and the insides of the device. Balanced HCI is both.
All these approaches want systems to be good in some sense for their users. But apart
from making the point that ethics and design are diverse, we are really more concerned
with going from the analogy to some practical way to improve designs, or to improve
ourselves as designers. We can make the connection from one to the other through justice.
Aristotle defined justice as doing good for other people. Designing a system to
be usable is about making the system good for the people who use it, so promoting
usability is promoting justice. Aristotle’s definition still begs the question: what is
doing good? Usability may be justice, but this view does not seem to help make
things better or more usable; it doesn’t begin to give us any insight into improving
user interface design. Moreover, it doesn’t help us design differently, or better,
even if we want to promote justice.
In the 1970s John Rawls published a landmark, operational definition of justice.
He claimed that his approach avoids introducing controversial ethical elements: it
avoids the possible conflicts of ethical position inherent in usability.
According to Rawls, a just system is one that has been designed under a veil of
ignorance. Rather than define a just system (which begs the question of what we
mean by good), Rawls defines a process that results in a just system. The problem
he solves is that one person’s good system is not necessarily another’s. I might
design the system to suit myself: I might, for instance, ensure the system has subtle
bugs so that my company has to retain me to keep fixing them. Clearly, while this
strategy might benefit me, it is not to many other people’s advantage. Instead,
if I design a system without knowing what role I shall have when the system is
delivered, then I will try to make the system as good as possible for all possible
users. This is the idea of the veil of ignorance: just in case I might be a telephone
support operator, I had better design the system to be easy to explain; just in case
I might be a colorblind user, I had better make the colors reconfigurable. Design
a system knowing that you will play some part in its future but without knowing
what part you will play.
Rawls was concerned with building just societies and he wanted a way to avoid,
for instance, the problem that dictators (those with the power to define the soci-
ety) do not often build fair societies. Why should any society designed by one
person be any good? Typically, powerless people are marginalized. But if the so-
ciety’s designer designed a social system not knowing whether, or in what way, they
would be powerless, then they are unlikely to design in a way that marginalizes
the powerless.
There is a direct correspondence between Rawls’s imagined social engineering
and the actual social engineering caused by computer systems. Rawls imagined
that someone would sit down and envisage a social structure that later they would
have to live in. He argued that to do this successfully, that is, justly, the
architect would have to work under a veil of ignorance.
With computer technology, we routinely do the former, and never the latter.
Every program we write creates a part of a world where the users of that program
have to obey our rules. Have we created a just system? Rawls would say not, since
we built it knowing that we would remain programmers separate from the world
where it was used.
Of course systems vary enormously in how easy to use they are, how “just”
they are. In support of the Rawlsian view we might argue that some systems are
good for users by chance (sometimes despite the designer’s plans!) whereas other
systems are good because designers also use them, and so forth. Mobile phones are
an example of a technology certainly used by their designers; telephone answering
services are an example of technology rarely used by their designers.
The Rawls approach straightforwardly supports the central tenets of good us-
ability practice. If you do not know the user, you cannot design a good system.
For Rawls this is a moral consequence; for the designer it is an empirical fact.
Rawls suggests a quite specific procedure, which we can translate into a design
process. You may be a user of the system you are building, but you do not know
which sort of user. Therefore, find out who the users are, and appreciate their
individual differences, the details of their tasks, and so forth. It is rare for designers
to use the systems they build, but Rawls claims that if a system is built without
working through the veil of ignorance, there will not be a just outcome.
Pursuing justice is consistent with established good usability practice; never-
theless, associating design with ethics, and more specifically with justice, achieves very little
unless it is generative. We still need to do a lot better than have a philosophical
language to disguise our hindsight.
A fair way for two people to share a chocolate bar is “I cut, you choose”: the cutter
does not know which piece they will end up with, so they have every reason to cut
fairly. The chocolate bar is trivial, but the same technique has been generalized for
decision making in far more complex circumstances. Imagine that several busi-
nesses wish to divide some piece of intellectual property: some wish to acquire
rights, and some wish to acquire profit. There may be several companies, and
many conflicting factors. A computer program can help by organizing the parties
to make choices that leave the future undetermined (so they impose a veil of igno-
rance) yet that also tend toward a final resolution, with the parties agreeing that
they have fair shares.
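The two-party case is easy to sketch. In the following illustration (JavaScript again; the patents and the valuation functions are wholly invented), the cutter divides a set of items into two bundles it values as nearly equally as it can, so it is indifferent about which bundle it will receive; the chooser then takes whichever bundle it prefers:

// "I cut, you choose," generalized to a list of items. The cutter
// partitions the items into two bundles as evenly as it can by its own
// valuation, so it is indifferent (behind a small veil of ignorance)
// about which bundle it ends up with; the chooser then takes whichever
// bundle it prefers by its own valuation.
function cut(items, value) {
  // Greedy balancing: take items largest first, always adding to the
  // currently lighter bundle.
  var sorted = items.slice().sort(function (a, b) {
    return value(b) - value(a);
  });
  var bundles = [[], []], totals = [0, 0];
  for (var i = 0; i < sorted.length; i++) {
    var k = totals[0] <= totals[1] ? 0 : 1;
    bundles[k].push(sorted[i]);
    totals[k] += value(sorted[i]);
  }
  return bundles;
}

function choose(bundles, value) {
  function total(b) {
    return b.reduce(function (t, x) { return t + value(x); }, 0);
  }
  return total(bundles[0]) >= total(bundles[1]) ? 0 : 1;
}

// Hypothetical example: two companies dividing rights to four patents,
// which they value differently.
var items = ["patentA", "patentB", "patentC", "patentD"];
function cutterValue(x)  { return { patentA: 4, patentB: 3, patentC: 2, patentD: 1 }[x]; }
function chooserValue(x) { return { patentA: 1, patentB: 2, patentC: 3, patentD: 4 }[x]; }

var bundles = cut(items, cutterValue);
var picked = choose(bundles, chooserValue);
console.log("chooser takes", bundles[picked], "; cutter keeps", bundles[1 - picked]);

Neither party has grounds to complain: the cutter was indifferent, and the chooser chose.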
So there is at least one interesting application of Rawls’s ideas in programming,
though in an application area rather than in the design process itself. It is not
surprising that specific programming techniques are required to make a program help
people reach consensus. In a similar vein, we might wonder whether designing
programs for ballistic missiles promotes justice because (under the veil of igno-
rance) we could end up living in the target city. We would not willingly design
missiles under those circumstances, and therefore we can conclude that program-
ming weaponry is unjust (in the Rawlsian sense). Thinking like this, we could pick
off examples one by one and conclude whether or not they were just.
Here are some practical ideas:
Writing user manuals can be seen in a similar light: under the veil of ignorance,
we might be the future users. We should write user manuals while we know what
the system is about, before we become those future users!
For many projects, technical authors, not developers, write the user manuals.
But under the veil of ignorance, in the future we could be the technical authors,
trying to write the user manual but struggling to find out what the system is
supposed to do. Design the system (or the framework the system is built with)
so that the future manual is better and so that the process of writing the
manual is easier.
As designers we can do a lot better than worry about individual future needs:
we can build tools, like design frameworks, to make us (whoever “we” turn
out to be) more effective. We no longer merely design a system; rather, we
design a framework where a range of systems can be developed. For there is
no such thing as the design; every design has multiple versions, and you had
better have a framework that allows you to explore the variations. It might be
that in one design project we really do come up with some insights too late to
do anything with them, but these insights can be fed into the framework we
are using, and then they become resources for all future designs. You only
need to have hindsight once for it to become foresight.
Many programs are impossible to use unless you know exactly how to use
them, even though, as is often the case, the knowledge needed seems quite
trivial. Sometimes what the user wants is achieved by holding down the
control key while performing another, similar action: yet nothing about the basic
action reveals the variation. We often underestimate the barriers these sorts of
features create: the solution is so easy that the problem looks trivial. The veil of
ignorance asks, as it were, if you forgot that simple fact, could you work out
how to proceed?
13.5 Designing “the right way”
Box 13.6 Computers were invented too soon Society in the 1950s and ’60s was creaking
under the growing complexity of all its regulations. Tax rules were becoming incomprehensi-
ble. Social security baffled everybody. Financial regulation was impenetrable. If computers
had not been brought in to help the government manage this complexity, there’s a good
chance that instead the government would have started simplifying the complex systems so
that humans could understand them. As Joseph Weizenbaum said, computers were invented too
soon. Instead of a simple society, we have a complex society beset with complex regulations
that only computers can handle. Computers just seem to absorb complexity. When a new
regulation is invented, computer programs are extended, and nobody who writes the rules
has to worry about the growing burden of managing the rules, since the computers adapt so
readily.
The people who have to obey the rules have an increasingly difficult time. The rules affect
people’s lives: they are no longer just abstract instructions inside government computers,
to be modified and added to at will; whether people can comply with the rules depends on
whether those rules are humanly understandable. It’s quite clear that even experts no longer
understand the rules we all have to live by.
Science works out the rules by which the natural world works; interaction programmers
create artificial worlds, and users have to work out how those worlds work. What are
the rules for understanding a mobile phone, say? If science
has rules for doing good work, it follows that artificial scientists have an obliga-
tion to build systems in which such rules are more likely to work, and more likely
to work regardless of the users—this goes back to the veil of ignorance—hence all
those design principles, such as consistency, permissiveness, modelessness, and
affordance. These are principles that create worlds that are easier to understand.
You could make these ideas more concrete by thinking through some interaction
programming project in terms of Perry’s scheme of intellectual development:
initially, you think there is a right way to design the de-
vice; then you notice some users disagree—but they are wrong because they don’t
know what’s right; then you notice that, actually, nobody knows what’s right, and
maybe you need to do some experiments to find out what’s right. If you get as far
as doing experiments to find out what’s right, you are into Perry’s third level.
Next we move from a simple right/wrong dualism to a more considered posi-
tion. Anyone has a right to their own opinions, or we may say, “This is how they
want us to think.” Diversity of opinion creates niches with which we may per-
sonally identify, for instance, by following heroes. As we progress through these
stages of development, we start to realize that some orientation or personal com-
mitment is required. We’ve accepted that the experiments didn’t give us facts that
answer design questions so much as more ideas.
There is a distinct step between realizing a commitment is logically required and
actually making one. Once we do make a commitment, we then have
to start to work through its consequences. Perhaps our commitment is to
profit, or to enhancing the lives of our device’s users. Finally we realize that our
commitments themselves are dynamic and our choices part of our self-identity.
Our rapid tour through Perry’s ideas creates a misleading impression of
straightforward progression. For a student working full-time, it would take many
years to progress up this chain of intellectual sophistication. Indeed, some stu-
dents do not make it (one of Perry’s contributions was to explore the ways of
failing to make progress).
In everything we learn, although we may be taught otherwise initially, the key
point is that the distinction of absolute “right” and “wrong” is temporary. When
we start learning a new skill or subject (such as programming), initially we give no
thought to right and wrong: it is all part of the given structure we are learning, and
it is taken for granted that what we are learning is good. Later, if at all, we come
to acknowledge that our personal choices have ethical consequences that need to
be worked out.
To summarize, even as we learn a subject as technical as interaction program-
ming, we develop through levels of increasing ethical awareness. Conversely, peo-
ple get stuck at unsophisticated levels when their ethical awareness does not de-
velop. There is initially a right way to use object-oriented patterns, say, but later
as we gain in expertise and skill, we see that “the right way” is a larger question—
something we have to work out dynamically for ourselves. It is gratifying that
programming is a powerful enough discipline to accommodate our growing eth-
ical commitments. We can comment better, we can make programs more usable,
we can make them more maintainable . . . , and there is no end to the tools we can
develop to enhance our and others’ experience with our programs.
Perry’s levels are personal levels of understanding and development. They can
be compared to maturity models, which are used to pin down how well an organi-
zation is following some method, such as a quality control method. The Capability
Maturity Model, which grew out of Phil Crosby’s quality management maturity grid,
has five levels: initial, repeatable, defined, managed, and optimizing.
Box 13.7 Scenarios Narrativium (to use Terry Pratchett’s term) is the imaginary element of
storytelling. Heroes have to travel and do great things. You can’t write a tragedy without
everyone knowing things will go wrong. This is the narrative impulse.
The alternative reality of Discworld, the popular science fiction fantasy milieu created by
Pratchett, takes narrativium to its limits. Discworld is magical, and the narrative power of
magic actually works: wizards make spells and things happen. In contrast, the real world we
live in is not driven by narrativium, so when we try to do magic, it is only wishful thinking.
There is no built-in reason for seventh sons of seventh sons to become kings. In a Discworld
story, they have to, even if they are daughters.
The analogy here is not purely made in jest! Pratchett creates an alternative world, much
like user interface designs create new worlds for users—indeed, worlds that envisioned only
a few years ago would have seemed to be science fiction too. Worlds are held together by
their narrativium, the narrative sense they make to readers—or in interaction programming,
the sense they make to users.
Narrativium has been regularized for use in design. To do good design, designers write
scenarios of how real users will achieve real goals. John is a postman delivering
letters, and the device will use RFIDs to indicate when he passes the right door
for a delivery. We already think a bit about John, and our imagination develops the narrative—
would this idea really work? What else does John want to do? What’s going to happen,
and how as designers can we draw out the narrativium for the best outcome?
You use your personas—see box 2.2, “Personas” (p. 46)—to construct scenarios, and you
write the scenario, or several, as if it were written for your chosen persona.
Scenario: Writing a story
Persona: Jemima
Background: Using a word processor for homework.
The scene: A message has come up saying the word processor is short of memory and windows must be closed. Should she save the file she is working on first, or close other windows? Which ones?
The story would be filled out in a lot more detail . . .
The scenario must be written from the persona’s point of view. It’s no use writing a story that
says “Jemima rebooted, reinstalled the software, and everything worked again. The End.”
Technologies that connect people (such as the web) are certainly a great part of our hope for humanity. In this perspective,
design plays on the field of world ethics.
Hans Küng has put it like this: “An answer in negative terms can hardly be
enough if ethics is not to degenerate into a technique for repairing defects and
weaknesses. So we must take the trouble to give a positive answer to the question
of a world ethic.” There are echoes of our comments on privatives, but we do not
need to paraphrase Küng’s full program here; instead, we pick up just one of his
specific proposals.
In any situation it is difficult to weigh up benefits and to balance individual,
institutional, and professional ethics. Küng paraphrases some rules of “present-
day ethics” drawn from his international and cross-cultural survey. Küng was thinking of
progress in general (most people are worried about things like gene manipulation
and nuclear power), so I add a design gloss to each one:
A rule for solving problems This is practically our wake-up call! There must be
no progress that brings greater problems. There must be no device that brings
about greater problems in its use.
A rule for burden of proof Anyone who presents a new system has to
demonstrate that what is embarked on achieves what is claimed, for instance,
that it does not cause human or ecological damage. On what grounds does a
designer justify an innovation? We need the empirical data, or other rules that
we agree can be taken as proof. If there isn’t a burden of proof, in what sense
can a designer be said to have succeeded?
A rule for common good Küng’s example for this principle is to prefer
preventive medicine to remedial medicine. In design, we should prefer
easier-to-use systems to, say, more intelligent help systems.
A rule of urgency The more urgent value (such as survival) has priority over
higher values (such as privacy). Thus concern for a user’s RSI (repetitive strain
injury, which goes under various names: carpal tunnel syndrome, computer-related
syndrome, and so on), which can easily arise from overuse of push-button
devices, or for upper-limb disorders (which arise more easily with desktop
computer use, through bad posture), could take priority over the privacy
considerations raised by monitoring workload. Or the safe use of a car
radio button should take priority over aesthetics or having lots of features.
An ecological rule This rule is a special case of the previous one: survival of the
ecosystem, which cannot be replaced, has priority over social systems. In
design, we should consider the wastage that disposing of obsolete user
interfaces causes: we are usually so excited about the improvements in
technology that we forget that to benefit from them we have to discard old
systems.
There is no limit to the good ideas you can put into programs that will make a
company’s products sell and that help users do what they want to do.
Your creativity and understanding of the underlying design issues can make the
world a better place. This book is a sandwich: the bottom slice of bread told you
what a mess we’re in and that it needs fixing; the layer of meat told you that you
could start fixing things; and the final top slice of bread preached that you should
make a commitment to changing things for the better.
Now that you’ve nearly finished this book, don’t waste any more time! Look for
ways to improve user interface design by doing better programming, and by tying
more of the separate stages of product development together with programs
that share the key specifications of the interactive devices: everything from the
formal specifications right through to the technical manuals and online help can
be tied together. Computers are fast enough, and you should be using their
power to simplify and streamline rather than to add layers of complexity.
By interpreting and adapting ethics, designers can find a suitable foundation on
which to continue to grow knowledge and to do good work.
Don’t forget that every chapter has its own list of references, and that chapter 4,
“Transition to interaction programming,” has an especially long list, covering the
entire field.
14 Four key principles
Interactive devices are badly designed, for many reasons. The market permits
it—we like these things, and we’re the market that buys them. It is very difficult
to design well, partly because poor design is accepted, but fundamentally because
the user’s understanding of a device, what they want to do with it, how it is pro-
grammed, and how it interacts are different points of view that are hard to recon-
cile with one another.
This book is organized into three parts as follows:
Part I Interactive systems and devices do not fulfill their potential, for all sorts of
reasons: economic, social, psychological, and technical.
Part II Computer science provides many practical creative ideas and theories
that can drive effective interaction programming.
Part III Knowing the science is fundamental, but it is also essential to have the
right attitudes and approaches to managing the complexity of designing
systems for people to use.
But exactly how does computer science provide the ideas and theories that can
drive effective interaction programming? Part II is the “meat” of the sandwich
and fleshes out four key interaction programming principles.
The device specification, the user manual, the demonstrator, the visualizations
. . . all aspects of the design can come out of the same specification.
With proper use of tools, each aspect of the design can be cross-referenced, and
the design can be developed concurrently. For example, the user manual can be
developed from the start, and insights gained in writing it clearly can feed into other parts
of the design.
When the different products of a design—the device specification itself, its
images, its user manual, and so on—are integrated, the advantage is that many
people can work concurrently. The disadvantage, too, is that many people can work
concurrently! The framework must therefore provide features to help people syn-
chronize their work, to identify parts of a design that somebody requires to be
fixed, and to notify people when any aspect of their work has become
“obsolete” as a result of other design activity. For example, an auditor may wish
to fix a safety-critical aspect of a design; nobody else can then change it with-
out permission. Or a marketing person may have a “needs this” feature that must
be filled in before the design can be completed. A framework that interrelates all
interaction programming should track these various correspondences.
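As a sketch of the bookkeeping this implies (the artifact names, the locking rule, and the little API are invented for illustration, not taken from any real tool):

// A toy dependency tracker for design artifacts. Changing one artifact
// flags everything derived from it as "obsolete"; a locked artifact
// (say, one fixed by a safety auditor) refuses changes outright.
function Tracker() {
  this.deps = {};     // artifact -> list of artifacts derived from it
  this.obsolete = {}; // artifacts flagged as needing attention
  this.locked = {};   // artifacts that must not be changed
}
Tracker.prototype.derive = function (from, to) {
  (this.deps[from] = this.deps[from] || []).push(to);
};
Tracker.prototype.lock = function (artifact) {
  this.locked[artifact] = true;
};
Tracker.prototype.change = function (artifact) {
  if (this.locked[artifact]) {
    console.log(artifact + " is locked; change refused");
    return;
  }
  var todo = (this.deps[artifact] || []).slice();
  while (todo.length > 0) {
    var d = todo.pop();
    if (!this.obsolete[d]) {
      this.obsolete[d] = true; // flag it, and cascade to its dependents
      todo = todo.concat(this.deps[d] || []);
    }
  }
};

var t = new Tracker();
t.derive("specification", "user manual");
t.derive("specification", "prototype");
t.derive("user manual", "online help");
t.change("specification");
console.log(Object.keys(t.obsolete)); // manual, prototype, and help all flagged

A production tool would add permissions, notification, and version history, but the core idea is just this: the framework knows what depends on what, so nobody's work goes stale silently.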
Integrating user manuals, of all forms, into the design process is a recurring
theme in Press On. See especially sections 11.5 (p. 392), and 11.6.1 (p. 397).
Visualizations of a device also include technical diagrams such as statecharts,
which were used throughout chapter 7, “Statecharts”; graphs such as
figure 11.1 (p. 371); and pictures for user manuals such as figure 9.3 (p. 296).
Visualizations can also be numerical; we gave lots of examples in
chapter 10, “Using the framework.” In particular, section 10.7 (p. 351)
suggested many ways of embedding such ideas into products to enhance their
value to users.
If you can’t think of a property that can be programmed directly, try using a
gnome approach—simulate the user and measure how well they would perform.
Then modify the design, and see how the measurements change.
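Here, for instance, is a crude gnome, reusing the toy flashlight sketched earlier (the goal and the numbers are invented for illustration): a simulated user who presses buttons at random until the task is done.

// A crude "gnome": a simulated user who presses buttons at random until
// reaching a goal state. Assumes the goal is reachable from the start
// state. Averaged over many runs, this gives a rough (and pessimistic)
// measure of how hard the device makes the task.
function gnome(transitions, buttons, start, goal, runs) {
  var total = 0;
  for (var r = 0; r < runs; r++) {
    var state = start, presses = 0;
    while (state !== goal) {
      var b = buttons[Math.floor(Math.random() * buttons.length)];
      state = transitions[state][b];
      presses++;
    }
    total += presses;
  }
  return total / runs; // average presses from start to goal
}

// With the toy flashlight from earlier: how hard is it to blunder from
// off to flashing by pressing buttons at random?
var flashlight = {
  off:      { power: "on",  mode: "off" },
  on:       { power: "off", mode: "flashing" },
  flashing: { power: "off", mode: "on" }
};
console.log(gnome(flashlight, ["power", "mode"], "off", "flashing", 10000));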
A special case of a property is a beguiling or partial property: a property
that is almost true of a device. The user learns the property is true—indeed, most
of the time it is true—but from time to time the device will behave
in unexpected ways because the property is not quite true. It is possible to find
beguiling properties automatically; for instance, they can be identified by finding any
property that is, say, 95% true of a device. You should examine the exceptions to
partial properties particularly carefully, and ideally keep an automatic record of
the ones that are acceptable or that cannot be fixed.
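Checking for such properties mechanically is straightforward once a property is expressed as a predicate over states; here is a minimal sketch (the 95% threshold and the synthetic twenty-state device are assumptions for illustration):

// Classify a property (a predicate over states) as true, false, partial,
// or "beguiling": true often enough that users will learn to rely on it,
// but not always. The 95% threshold is arbitrary, for illustration.
function classifyProperty(states, property) {
  var holding = states.filter(property);
  var fraction = holding.length / states.length;
  if (fraction === 1) return { verdict: "true", exceptions: [] };
  if (fraction >= 0.95)
    return {
      verdict: "beguiling",
      // the exceptions are exactly the cases to examine carefully
      exceptions: states.filter(function (s) { return !property(s); })
    };
  return { verdict: fraction === 0 ? "false" : "partial", exceptions: [] };
}

// A synthetic device with twenty states in which the property fails in
// exactly one state: a classic beguiling situation.
var states = [];
for (var i = 0; i < 20; i++) states.push("s" + i);
console.log(classifyProperty(states, function (s) { return s !== "s13"; }));
// -> { verdict: "beguiling", exceptions: ["s13"] }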
Section 6.3 (p. 182) introduced a wide range of simple interaction programming
rules. Chapter 10, “Using the framework” covered properties that required
programming to explore.
In section 6.8 (p. 195), we said that the key problem in interaction programming is
that we cannot see states—the stuff that makes interactive systems work. Nei-
ther users nor designers can see states, modes, or actions, except in very spe-
cial, very simple cases. We have to use good design practices to overcome this
blindness—and when we do interaction programming properly, there are many
additional advantages. If we do not do interaction programming properly, then
interactive devices get just “thrown together” and—magically—they just work,
sort of. Programmers and marketing people get overconfident, based on years
of slipped standards. Most interactive devices are indeed thrown together, and
many have grown arbitrarily over years. We end up with the widespread interac-
tion problems we belabored in part I.
It’s time to change. And fortunately, interaction programming, driven by
sound computer science principles, is fun and worthwhile.
What to do now
You can do many things now: study and critique particular devices in depth; learn
more about the underlying ideas and background science of Press On; research
more about the ideas and develop them further; or you could transform practice—
for instance, by building tools—so that interactive devices improve and the world
becomes a better place. Here, then, are some starting points:
Study some interesting device carefully, then define and explore it using Press
On’s framework.
Build a library of specifications for lots of devices, so the material in Press On
can be tried out on more devices.
After you’ve built a library of examples, compare them. For instance, draw
graphs of one property against another, and explore what patterns emerge.
Build a design tool using all of the features we’ve covered—and add your own
ideas. JavaScript was fine for Press On’s pedagogic framework, but a
professional tool requires more flexibility and sophistication: for instance, start
with the open platform Eclipse (see www.eclipse.org) and build a development
tool, or build a web 2.0 system, perhaps as a wiki that combines device
specifications, user manuals, user comments, means of both user and
framework-based evaluation, and interactive animation.
Develop standards for interaction programming—particularly for medical
devices (which are simple enough to develop standards for, and which need
them because they are all horribly different from one another). Remember that
standards can include programming languages and environments for
interactive systems.
Combine the analytical principles in this book with user-centered development
techniques, for instance by providing annotation to manage usability
experiment protocols along with device specifications.
Continue the ideas into mobile, context-aware, multiuser, and implanted devices,
and into different areas, like games, education, and sports.
Research and explore small world models on larger devices.
Fix any interaction programming mess—and let the world know.