Human Computer Interaction: IV/II Semester
NOTES
For
IV/II SEMESTER
Prepared by:
Suresh.V,
M.Tech
Introduction to HCI
HCI (human-computer interaction) is the study of how people interact with computers
and to what extent computers are or are not developed for successful interaction with
human beings. A significant number of major corporations and academic institutions now
study HCI. As its name implies, HCI consists of three parts: the user, the computer
itself, and the ways they work together.
User
By "user", we may mean an individual user or a group of users working together. An
appreciation of the way people's sensory systems (sight, hearing, touch) relay information
is vital. Also, different users form different conceptions or mental models about their
interactions, and have different ways of learning and retaining knowledge. In addition,
cultural and national differences play a part.
Computer
When we talk about the computer, we're referring to any technology ranging from
desktop computers, to large scale computer systems. For example, if we were discussing
the design of a Website, then the Website itself would be referred to as "the computer".
Devices such as mobile phones or VCRs can also be considered to be “computers”.
Interaction
There are obvious differences between humans and machines. In spite of these, HCI
attempts to ensure that they both get on with each other and interact successfully. In order
to achieve a usable system, you need to apply what you know about humans and
computers, and consult with likely users throughout the design process. In real systems,
the schedule and the budget are important, and it is vital to find a balance between what
would be ideal for the users and what is feasible in reality.
Underlying the whole theme of HCI is the belief that people using a computer system
should come first. Their needs, capabilities and preferences for conducting various tasks
should direct developers in the way that they design systems. People should not have to
change the way that they use a system in order to fit in with it. Instead, the system should
be designed to match their requirements.
USABILITY
Usability is one of the key concepts in HCI. It is concerned with making systems easy to
learn and use. A usable system is:
Easy to learn
Easy to remember how to use
Effective to use
Efficient to use
Safe to use
Enjoyable to use
Imagine that you just put your document into the photocopier and set the photocopier to
make 15 copies, sorted and stapled. Then you push the big button with the "C" to start
making your copies.
What do you think will happen?
(a) The photocopier makes the copies correctly.
(b) The photocopier settings are cleared and no copies are made.
If you selected (b) you are right! The "C" stands for clear, not copy. The copy button is
actually the button on the left with the "line in a diamond" symbol. This symbol is widely
used on photocopiers, but is of little help to someone who is unfamiliar with it.
• Visibility – there is little visual mapping between controls and the users’ goals, and
controls can have multiple functions.
• The affordance of an object is the sort of operations and manipulations that can be
done to it. A door affords opening, a chair affords support. The important factor for
design is perceived affordance – what a person thinks can be done with an object. For
example, does the design of a door suggest that it should be pushed or pulled open?
Factors in HCI
There are a large number of factors which should be considered in the analysis and
design of a system using HCI principles. Many of these factors interact with each other,
making the analysis even more complex. The main factors are listed below:

Task factors – easy, complex, novel, task allocation, monitoring, skills
Constraints – cost, timescales, budgets, staff, equipment, buildings
System functionality – hardware, software, application
Productivity factors – increase output, increase quality, decrease costs, decrease errors, increase innovation
EXERCISE
1. Suggest some ways in which the design of the copier buttons described earlier could be
improved.
2. For the following scenarios, map out what you do (USER INPUT) with the way the
system seems to operate (SYSTEM FEEDBACK)
• Buying the books “Human Computer Interaction (J. Preece)” and “Shaping Web
Usability (A. Badre)” on the internet
• Sending a text message on a mobile phone
3. Consider factors involved in the design of a new library catalogue system using HCI
principles
4. Use the internet to find information on the work of Donald Norman and Jakob Nielsen.
Chapter 2
Human Cognition
Cognitive Psychology
The science of psychology has been very influential in Human Computer Interaction. In
this course we will look at some of the main developments and theories in cognitive
psychology (the study of human perception, attention, memory and knowledge), and
the ways in which these have been applied in the design of computer interfaces.
The Model Human Processor (MHP) describes cognition in terms of three interacting processors:
• Perceptual processor – outputs into audio storage and visual storage
• Cognitive processor – outputs into working memory, and has access to both working memory and long-term memory
• Motor processor – carries out actions
The MHP model was used as the basis for the GOMS family of techniques proposed by
Card, Moran, and Newell (1983), for quantitatively modeling and describing human task
performance. GOMS stands for Goals, Operators, Methods, and Selection Rules.
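As an illustration of the quantitative flavour of GOMS, its simplest member, the Keystroke-Level Model (KLM), predicts expert execution time by summing fixed per-operator times. The sketch below is a minimal Python rendering using commonly cited average operator values from Card, Moran and Newell; the example task sequence is invented for illustration:

```python
# Sketch of a Keystroke-Level Model (KLM) estimate, the simplest member of
# the GOMS family. The operator times are the commonly cited averages;
# real analyses calibrate them for the actual users and devices.
OPERATOR_TIME = {
    "K": 0.20,  # press a key or button (average skilled typist)
    "P": 1.10,  # point at a target with a mouse
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation for an action
}

def klm_estimate(operators: str) -> float:
    """Total predicted execution time (seconds) for a sequence of KLM operators."""
    return sum(OPERATOR_TIME[op] for op in operators)

# Example: mentally prepare, point at an icon, click it, then home to the
# keyboard and press Delete.
print(round(klm_estimate("MPKHK"), 2))
```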
A later shift in emphasis moved from human factors to human actors – a change in
focus on humans from being passive and depersonalized to active and controlling.
The person is considered an autonomous agent, able to coordinate and regulate
behaviour, not a passive element in a human-machine system.
Distributed Cognition
Distributed cognition is a framework proposed by Hutchins (1991). Its basis is that to
explain human behavior you have to look beyond the individual human and the
individual task. The functional system is a collection of actors, technology and setting,
and their interrelations to one another. Examples of functional systems which have been
studied include:
• Ship navigation
• Air traffic control
• Computer programming teams
The technique is used to analyze coordination of components in the functional system. It
looks at:
• Information, and how it propagates through the system
• How it transforms between the different representational states found in the functional system

One property of distributed cognition that is often discovered through analysis is
situation awareness (Norman, 1993): the silent and inter-subjective communication that is
shared among a group. When a team is working closely together, the members will
monitor each other to keep abreast of what each member is doing. This monitoring is not
explicit – rather, the team members monitor each other through glancing and inadvertent
overhearing. The two main concerns of distributed cognition are:
• To map out how the different representational states are coordinated across time, location and objects
• To analyze and explain breakdowns
Example:
An electricity power plant was redesigned so that the old system consisting of a single
large display screen which could be seen by all of a team of three operators was replaced
by individual workstation screens for operators. This worked well until there was a
problem which resulted in dangerous gases being released. The team of operators had
great difficulty in finding the source of the problem and deciding what to do.
Because they no longer had access to all the information, they had to spend time
explicitly coordinating their understanding of the situation by talking to each other.
Under the old system, the knowledge would be shared – one operator would know what
was happening in another’s area of the plant without explicit communication. Although
the team’s individual responsibilities would still have been clearly divided, the
knowledge of the system would be shared.
How could the new system of individual workstations be modified to make better use of
distributed cognition?
Chapter 3
Perception
An understanding of the way humans perceive visual information is important in the
design of visual displays in computer systems. Several competing theories have been
proposed to explain the way we see. These can be split into two classes: constructivist
and ecological.
Constructivist theorists believe that seeing is an active process in which our view is
constructed from both information in the environment and from previously stored
knowledge.
Perception involves the intervention of representations and memories. What we see is not
a replica or copy; rather a model that is constructed by the visual system through
transforming, enhancing, distorting and discarding information.
The Gestalt principles of perceptual organisation describe how visual elements are
grouped, for example:
• Similarity – items that are similar in some way (in shape, colour, etc.) tend to be grouped together.
• Closure – items are grouped together if they tend to complete a pattern.
Affordances (Ecological)
The ecological approach argues that perception is a direct process, in which information
is simply detected rather than being constructed (Gibson, 1979).
A central concept of the ecological approach is the idea of affordance (Norman, 1988).
The possible behaviour of a system is the behaviour afforded by the system. A door
affords opening, for example. A vertical scrollbar in a graphical user interface affords
movement up or down. The affordance is a visual clue that suggests that an action is
possible.
When the affordance of an object is perceptually obvious, it is easy to know how to
interact with it.
Norman's first and ongoing example is that of a door. With some doors it is difficult to
see whether they should be pushed or pulled; with others it is obvious. The same is true
of ring controls on a cooker: how do you turn on the right rear ring?
"When simple things need labels or instructions, the design is bad."
Affordances in Software
Look at these two possible designs for a vertical scroll bar. Both scrollbars afford
movement up or down.
What visual clues in the design on the right make this affordance obvious?
Representation
A graphical user interface must represent information visually in a way which is
meaningful to the user. The representations may be highly sophisticated, for example
3-dimensional simulated ‘walkthroughs’. To represent 3D objects on a 2D surface,
perceptual depth cues are used:
• Size
• Interposition
• Contrast, clarity and brightness
• Shadow
• Texture
• Motion parallax
Graphical Coding
Visual representations can also be used as a form of coding of information at the user
interface.
System processes, data objects and other features can be represented by different forms of
graphical coding. Some codings are abstract, where there is no relation between the code
and what it represents other than established convention, for example:
• Abstract codes to represent files
• Abstract shapes to represent different objects
• Reverse video to represent status
• Colour to represent different options
Other codings are more direct, for example:
• Different icon sizes to reflect different file sizes
• Different line widths to represent different tool widths in a drawing package
• Bar charts to show trends in numerical data
The most direct codings are icons which represent the objects they portray, for example:
• The wastebin icon
• A paper file to represent a file.
Colour theory
Coloured screens are the primary sensory stimulus that software produces, and poor
colour choices can significantly reduce the usability of GUI applications or web sites.
Colour can affect readability and recognition as described above, and it can also affect
the user’s overall impression of an interface. An application which uses clashing or
discordant colours will often provoke a negative reaction in users, who will not enjoy
using it. Good use of colour can be powerful in any application, but is particularly
important in web pages.
Choice of harmonious colours can be helped by a basic understanding of colour theory.
The main tool for working with colours is the simple colour wheel shown here. (You are
best to look at these notes online to see them in colour!)
The black triangle in the centre points out the primary colours. If you mix two primary
colours, you will get the secondary colour that's pointed out by the lighter grey triangle.
When you mix a primary with either of its closest secondary colours, you get a tertiary
colour; these are located between the points of the black and grey triangles.
A harmonious set of colours for an interface is known as a colour scheme. Colour
schemes are based on the colour wheel. There are three main types of colour scheme:
analogous, complementary, and monochromatic. These are illustrated using an
application called ColorWheel Pro, which is designed to allow colour schemes to be
created and previewed. Each scheme is illustrated by a colour wheel showing the range
of selected colours, and by the scheme applied to a logo.
Analogous
Analogous colours are those that are adjacent to each other on the colour wheel. If you
pick any range of colours between two points of either triangle on our colour wheel (i.e.
yellow to red, orange to violet, red to blue, etc.), you will have an analogous colour
scheme.
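Hue-based schemes like this can be approximated programmatically by rotating a colour's hue around the wheel. A rough sketch using Python's standard colorsys module (note that colorsys uses the red-green-blue wheel rather than the painter's red-yellow-blue wheel of these notes, so the results are indicative only):

```python
# Generating colour schemes by rotating hue, using only the standard
# colorsys module. Colours are (r, g, b) triples in the range 0..1.
import colorsys

def rotate_hue(rgb, degrees):
    """Return rgb with its hue rotated by the given angle on the colour wheel."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + degrees / 360.0) % 1.0, s, v)

def analogous(rgb, spread=30):
    """The base colour plus its neighbours `spread` degrees either side."""
    return [rotate_hue(rgb, -spread), rgb, rotate_hue(rgb, spread)]

red = (1.0, 0.0, 0.0)
print(analogous(red))        # red and its adjacent hues
print(rotate_hue(red, 180))  # opposite side of the wheel: cyan (0.0, 1.0, 1.0)
```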
Complementary
Complementary colours are those directly opposite each other on the colour wheel. When
complementary colours are placed next to each other, a phenomenon known as
simultaneous contrast occurs, wherein each colour makes the other look more vibrant.
Monochromatic
If you mix white with a pure colour, you produce tints of that colour. If you mix black
with a pure colour, you get shades of that colour. If you create an image using only the
tints and shades of one colour you have a monochromatic colour scheme.
Example - http://www.yakima.com
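Mixing with white or black in this way is simply linear interpolation between colours; a small sketch (colours as RGB triples in the range 0..1):

```python
# Tints and shades as linear mixes of a pure colour with white and black.
def mix(c1, c2, amount):
    """Linear interpolation between two colours (amount=0 gives c1, amount=1 gives c2)."""
    return tuple((1 - amount) * a + amount * b for a, b in zip(c1, c2))

WHITE, BLACK = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)

def tint(colour, amount):
    """Mix the colour with white."""
    return mix(colour, WHITE, amount)

def shade(colour, amount):
    """Mix the colour with black."""
    return mix(colour, BLACK, amount)

# A five-step monochromatic scheme built from a single pure blue:
blue = (0.0, 0.0, 1.0)
scheme = [shade(blue, 0.5), shade(blue, 0.25), blue, tint(blue, 0.25), tint(blue, 0.5)]
print(scheme)
```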
Chapter 4
Attention and Memory
Attention
The human brain is limited in capacity. It is important to design user interfaces which
take into account the attention and memory constraints of the users. This means that we
should design meaningful and memorable interfaces. Interfaces should be structured to be
attention-grabbing and require minimal effort to learn and remember. The user should be
able to deal with information and not get overloaded.
Our ability to attend to one event from what seems like a mass of competing stimuli has
been described psychologically as focused attention. The "cocktail party effect" – the
ability to focus one's listening attention on a single talker among a cacophony of
conversations and background noise – has been recognized for some time. We know from
psychology that attention can be focused on one stream of information (e.g. what
someone is saying) or divided (e.g. focused both on what someone is saying and what
someone else is doing). We also know that attention can be voluntary (we are in an
attentive state already) or involuntary (attention is grabbed). Careful consideration of
these different states of attention can help designers to identify situations where a user’s
attention may be overstretched, and therefore needs extra prompts or error protection, and
to devise appropriate attention attracting techniques. Sensory processes, vision in
particular, are disproportionately sensitive to change and movement in the environment.
Interface designers can exploit this by, say, relying on animation of an otherwise
unobtrusive icon to indicate an attention-worthy event.
In a work environment using computers, people are often subject to being interrupted, for
example by a message or email arriving. In addition, it is common for people to be
multitasking – carrying out a number of tasks during the same period of time by
alternating between them. This is much more common than performing and completing
tasks one after another.
In complex environments, users may be performing one primary task, which is the most
important at that time, and also one or more less important secondary tasks. For
example, a pilot’s tasks include attending to air traffic control communications,
monitoring flight instruments, dealing with system malfunctions which may arise, and so
on. At any time, one of these will be the primary task, which is said to be foregrounded,
while the other activities are temporarily suspended.
People are in general good at multitasking but are often prone to distraction. On
returning to an activity, they may have forgotten where they left off. People often develop
their own strategies, to help them remember what actions they need to perform when they
return to an activity.
Such external representations, or cognitive aids (Norman, 1992), may include writing
lists or notes, or even tying a knot in a handkerchief.
Cognitive aids have applications in HCI, where the system can be designed to provide
them:
• The system should inform users of where they are in a task
• The system should remind users of common tasks
For example, Amazon’s checkout procedure displays a list of the steps involved in the
process, and indicates which step has been reached.
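A cognitive aid of this kind can be sketched as a simple progress indicator that marks the current step; the step names below are invented for illustration:

```python
# Hypothetical sketch of a progress indicator: a cognitive aid that tells
# users where they are in a multi-step task, like a checkout procedure.
def progress_line(steps, current):
    """Render the step list, highlighting the current step in brackets."""
    return " > ".join(f"[{s}]" if i == current else s for i, s in enumerate(steps))

steps = ["Basket", "Delivery", "Payment", "Confirm"]
print(progress_line(steps, 1))  # Basket > [Delivery] > Payment > Confirm
```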
Automatic Processing
Many activities are repeated so often that they become automatic – we do them without
any need to think. Examples include riding a bike, writing, typing, and so on. Automatic
cognitive processes are:
• fast
• demanding minimal attention
• unavailable to consciousness
The classic example used to illustrate the nature of an automatic operation is the Stroop
effect.
To experiment with this, look at the colour sheet at the end of this chapter.
This experiment demonstrates interference. The interference between the different
information (what the words say and the colour of the words) your brain receives causes
a problem. There are two theories that may explain the Stroop effect:
• Speed of Processing Theory: the interference occurs because words are read
faster than colors are named.
• Selective Attention Theory: the interference occurs because naming colors
requires more attention than reading words.
If a process is not automatic, it is known as a controlled process.
Automatic processes
Are not affected by limited capacity of brain
Do not require attention
Are difficult to change once they have been learned
Controlled Processes
Are non-automatic processes
Have limited capacity
Require attention and conscious control (Shiffrin & Schneider, 1977)
Memory Constraints
The human memory system is very versatile, but it is by no means infallible. We find
some things easy to remember, while other things can be difficult to remember. The same
is true when we try to remember how to interact with a computer system. Some
operations are simple to remember while others take a long time to learn and are quickly
forgotten.
An understanding of human memory can be helpful in designing interfaces that people
will find easy to remember how to use.
Levels of Processing Theory
The extent to which things can be remembered depends on their meaningfulness. In
psychology, the levels of processing theory (Craik and Lockhart, 1972) has been
developed to account for this. This says that information can be processed at different
levels, from a shallow analysis of a stimulus (for example the sound of a word) to a deep
or semantic analysis. The meaningfulness of an item determines the depth of the
processing – the more meaningful an item the deeper the level of processing and the more
likely it is to be remembered.
Meaningful Interfaces
This suggests that computer interfaces should be designed to be meaningful. This applies
both to interfaces which use commands and interfaces which use icons or graphical
representations for actions. In either case, the factors which determine the
meaningfulness are:
Context in which the command or icon is used
The task it is being used for
The form of the representation
The underlying concept
Meaningfulness of Commands
The following guidelines are examples taken from a larger set which was compiled to
suggest how to ensure that commands are meaningful (Booth, 1994; Helander, 1988):
• Syntax and commands should be kept simple and natural
• The number of commands in a system should be limited, and in a limited format
• Consider the user context and knowledge when choosing command names
• Choose meaningful command names – words familiar to the user
• The system should recognize synonymous and alternative forms of command syntax
• Allow users to create their own names for commands
Sometimes a command name may be a word familiar to the user in a different context.
For example, the word ‘CUT’ to a computer novice will mean to sever with a sharp
instrument, rather than to remove from a document and store for future use. This can
make the CUT command initially confusing.
Meaningfulness of Icons
Icons can be used for a wide range of functions in interfaces, for example:
• The road sign for "falling rocks" presents a clear resemblance of the roadside hazard.
• A knife and fork used in a public information sign to represent "restaurant services" –
the image shows the most basic attribute of what is done in a restaurant, i.e. eating.
• The Microsoft Outlook icon – the clock and letter are examples of the tasks this
application does (calendar and email tasks).
• The picture of a wine glass with a fracture conveys the concept of fragility.
• A globe icon representing a connection to the internet – the globe conveys the concept
of the internet.
• Arbitrary icons bear no relation to the underlying concept.
Combination Icons
Icons are often favoured as an alternative to commands. It is common for users who use a
system infrequently to forget commands, while they are less likely to forget icons once
learnt. However, the meaning of icons can sometimes be confusing, and it is now quite
common to use a redundant form of representation where the icons are displayed together
with the command names.
The disadvantage of this approach is that it takes up more screen space. This can be
reduced by using pop-up tool tips to provide the text.
There is often isolated and specific use of icons and graphical representations for links.
The following examples are from amazon.co.uk.
• Buttons to submit forms, e.g. search boxes:
• Images of items - the user can click to get more information on the item:
Icon use in web pages is sparing for a number of reasons, for example:
• Pages often convey information and branding graphically, so it would be difficult to
focus attention on icons among other graphical content.
• Graphical links are often banners to focus attention to a small number of specific items
• The web browser has its own set of icons
Interference: The Stroop Effect Don't read the words below—just say the colours
they're printed in, and do this aloud as fast as you can.
If you're like most people, your first inclination was to read the words, 'red, yellow,
green...,' rather than the colours they're printed in, 'blue, green, red...' You've just
experienced interference.
When you look at one of the words, you see both its colour and its meaning. If those two
pieces of evidence are in conflict, you have to make a choice. Because experience has
taught you that word meaning is more important than ink colour, interference occurs
when you try to pay attention only to the ink colour.
The interference effect suggests you're not always in complete control of what you pay
attention to.
Chapter 5
Knowledge and Mental Models
Introduction
By discovering what users know about systems and how they reason about how the
systems function, it may be possible to predict learning time, likely errors and the relative
ease with which users can perform their tasks. We can also design interfaces which
support the acquisition of appropriate user mental models. "In interacting with the
environment, with others, and with the artefacts of technology, people form internal,
mental models of themselves and of the things with which they are interacting. These
models provide predictive and explanatory power for understanding the interaction."
- Donald Norman (1993)
Mental models are representations in the mind of real or imaginary situations.
Conceptually, the mind constructs a small scale model of reality and uses it to reason, to
underlie explanations and to anticipate events.
Knowledge Representation
Knowledge is represented in memory as:
• Analogical representations: picture-like images, e.g. a person’s face
• Propositional representations: language-like statements, e.g. a car has four wheels
Connectionist theorists believe that analogical and propositional representations are
complementary, and that we use networks of nodes where the knowledge is contained in
the connections between the nodes.
A connectionist network for storing information about people is shown below:
One of the main characteristics of knowledge is that it is highly organised. We can often
retrieve information very rapidly. For example, see how quickly you can retrieve the
answers to the following queries:
What is the capital of Italy?
Name a model of car manufactured by Ford.
How many days are there in a year?
You (probably) answered these very quickly, which suggests that the knowledge is
organised in some way.
The connectionist network is one theory for how this organisation happens. Another
theory is that knowledge consists of numerous schemata. A schema is a network of
general knowledge based on previous experience. It enables us to behave appropriately in
different situations.
For example, suppose you overheard the following conversation between two friends:
A: “Did you order it?”
B: “Yeah, it will be here in about 45 minutes.”
A: “Oh... Well, I've got to leave before then. But save me a couple of slices, okay?
And a beer or two to wash them down with?”
Do you know what they are talking about? You’re probably pretty sure they are
discussing a pizza they have ordered. But how can you know this? You've never heard
this exact conversation, so you're not recalling it from memory. And none of the defining
qualities of pizza are represented here, except that it is usually served in slices, which is
also true of many other things.
Schema theory would suggest that we understand this because we have activated our
schema for pizza (or perhaps our schema for "ordering pizza for delivery") and used that
schema to comprehend this scenario.
A script is a special subcase of a schema which describes a characteristic scenario of
behaviour in a particular setting. Schank and Abelson (1977) described the following
script for eating at a restaurant:
Leaving:
• Waiter prints bill
• Waiter delivers bill to customer
• Customer examines bill
• Customer calculates tip
• Customer leaves tip
• Customer gathers belongings
• Customer pays bill
• Customer leaves restaurant
People develop a script by repeatedly carrying out a set of actions for a given setting.
Specific actions are slotted into knowledge components.
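The idea of a script with slots can be sketched as a data structure in the spirit of Schank and Abelson's scripts: a sequence of generic role-action triples whose roles are filled in on a particular occasion. The slot and filler names here are invented for illustration:

```python
# A minimal sketch of a script: an ordered list of (role, action, object)
# slots. The "Leaving" scene of the restaurant script, roughly as above.
leaving_scene = [
    ("waiter", "delivers bill to", "customer"),
    ("customer", "examines", "bill"),
    ("customer", "leaves", "tip"),
    ("customer", "pays", "bill"),
    ("customer", "leaves", "restaurant"),
]

def instantiate(script, bindings):
    """Fill the script's generic roles with the specifics of one occasion."""
    return [(bindings.get(a, a), act, bindings.get(o, o)) for a, act, o in script]

# One particular visit: the generic roles are bound to specific people.
print(instantiate(leaving_scene, {"waiter": "Maria", "customer": "Alex"}))
```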
Schemata can guide users’ behaviour when interacting with computers. As they learn to
use a system they may develop scripts for specific tasks, such as creating documents,
saving documents, and so on. These scripts may be useful in guiding behaviour when
faced with new systems.
In older computer systems, as illustrated in the figure below, there were few universally
accepted conventions for performing common tasks. Printing a file in one word processor
tended to require a different set of commands or menu options from another word
processor. There is much more commonality between different applications now – to
save a file in pretty much any Windows application you can go to the File menu which
will be the first one in the menu bar, and select
Save or Save as….
However, it is still important for interface designers to concentrate on ensuring that, as
far as possible, their systems make use of the users’ “How to use a computer” or “How to use
a web site” schemata.
The WordStar word processor – how do you save a file with this?
Mental Models
A major criticism of schema-based theories is their inflexibility. In practice, people are
good at coping with situations where their scripts are inappropriate; we can adapt to
predict states and comprehend situations we have never personally experienced.
The theory of mental models has been developed to account for these dynamic aspects of
cognitive activity. Mental models are related to schemata – models are assumed to be
dynamically constructed by activating stored schemata. We construct a mental model at
the time to deal with a particular situation.
The term “mental model” was developed in the 1940s by Craik, who said:
“If the organism carries a ‘small-scale model’ of external reality and of its own possible
actions within its head, it is able to try out various alternatives, conclude which is the best
of them, react to future situations before they arise, utilise the knowledge of past events
in dealing with the present and future, and in every way react to it in a much fuller, safer
and more competent manner to emergencies which face it.”
A key phrase here is to “try out various alternatives”. When an architect is designing a
building, architect’s models allow alternative design ideas to be tested without actually
constructing a real building. Similarly, a mental model allows a person to test a
behaviour or reaction to a situation before taking any action. This is called running the
mental model.
The danger in using mental models is that a person’s model of a situation may not be
accurate.
Example: You want to heat up a room as quickly as possible, so you turn up the
thermostat. Will this work?
Users will develop mental models of a computer system. It is important for interface
designers to ensure that their systems encourage users to develop an appropriate mental
model. The system image (the view of the system seen by the user) should guide the
user’s mental model towards the designer’s conceptual model (the way the designer
views the system).
The system image is the system as implemented by the designer and as observed by the
user.
If the users’ mental model of how the system works is not accurate, then they may find
the system difficult to learn, or ‘unfriendly’.
Example
A Windows user is exposed to a Unix environment for the first time. He has to type a
document in the Emacs editor rather than in Word. The user makes a typo and, without
hesitating, presses Control and Z, since these are the keys he has always used as the
keyboard shortcut for the UNDO command. The user is frustrated when the Emacs editor
completely disappears from the screen and he is returned to the Unix prompt with no
error message.
Working on Windows has built the user a mental model of the UNDO command in
almost all Windows programs, associating it with the action of pressing CTRL-Z; he
does not know that these keystrokes cause a completely different action in the Unix
environment.
Example:
Consider changing gear in a car. Think about how you do it and how you decide which
gear to select. Think about constructing a structural model to capture the features of how
it works. Then do the same with a functional model. Which model do you find more
difficult to construct?
Interaction between the human and the computer can be seen as a cycle: the human
formulates goals and actions and generates output at the interface; the system updates
itself and the display; the human then evaluates and interprets the display.
The Gulf of Evaluation is the amount of effort a user must exert to interpret the physical
state of the system and how well their expectations and intentions have been met.
• Users can bridge this gulf by changing their interpretation of the system image, or
changing their mental model of the system.
• Designers can bridge this gulf by changing the system image.
The Gulf of Execution is the difference between the user’s goals and what the system
allows them to do – it describes how directly their actions can be accomplished.
• Users can bridge this gulf by changing the way they think and carry out the task toward
the way the system requires it to be done
• Designers can bridge this gulf by designing the input characteristics to match the users’
psychological capabilities.
Design considerations:
Systems should be designed to help users form the correct productive mental models.
Common design methods include the following factors:
• Affordance: Clues provided by an object's properties about how the object can be used
and manipulated.
• Simplicity: Frequently accessed functions should be easily accessible. The interface
should be simple and transparent enough for the user to concentrate on the actual task at
hand.
• Familiarity: As mental models are built upon prior knowledge, it is important to use this
fact in designing a system. Relying on the user's familiarity with an old, frequently used
system gains the user's trust and helps them accomplish a large number of tasks.
Metaphors in user interface design are an example of applying the familiarity factor
within the system.
• Availability: Since recognition is easier than recall, an efficient interface should
always provide cues and visual elements that relieve the user of the memory load
needed to recall the functionality of the system.
• Flexibility: The user should be able to use any object, in any sequence, at any time.
• Feedback: The system should provide complete and continuous feedback throughout the
course of the user's actions. Fast feedback helps the user assess the correctness of a
sequence of actions.
We will look in more detail later on at techniques for interaction design taking these
factors into consideration.
Chapter 6
Interface Metaphors
Metaphors convey an abstract concept in a more familiar and accessible form. A
metaphor is a figure of speech in which an expression is used to refer to something that it
does not literally denote, in order to suggest a similarity. A widely quoted example can be
found in Shakespeare's As You Like It: "All the world's a stage..."
Metaphors are widely used to make use of users’ existing knowledge when learning new
computer systems.
Verbal Metaphors
Verbal metaphors are useful tools to help users to understand a new system. For example,
people using a word processor for the first time consider it similar to a typewriter. This
perceived similarity activates the user’s ‘typewriter’ schema, allowing them to interpret
and predict the behaviour of the word processor, for example:
These links provide a basic foundation from which users develop their mental models.
Knowledge about a familiar domain (typewriter) in terms of elements and their relation to
each other is mapped onto elements and their relations in the unfamiliar domain
(computer).
• Elements: keyboard, spacebar, return key
• Relations: hit only one character key at a time; hitting a character key results in a
letter being displayed on a visible medium.
By drawing on prior knowledge, a learner can develop an understanding of the new
domain more readily.
A typewriter has a QWERTY keyboard; the computer also has a QWERTY keyboard; so
the keys should have the same effect as they do on a typewriter. However, dissimilarities
can cause problems for learners. For example, the backspace key on a typewriter moves
the carriage back, while on a word processor it usually deletes a character. As users
become aware of the discrepancies, they can develop a new mental model.
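The backspace discrepancy can be made concrete. In this hypothetical sketch, the typewriter's backspace only moves the carriage position, while the word processor's deletes the character before the cursor:

```python
def typewriter_backspace(text: str, position: int) -> tuple:
    """On a typewriter, backspace moves the carriage back one space;
    the text already on the page is unchanged."""
    return text, max(0, position - 1)

def word_processor_backspace(text: str, position: int) -> tuple:
    """On a word processor, backspace deletes the character before the cursor."""
    if position == 0:
        return text, 0
    return text[:position - 1] + text[position:], position - 1

# Same user action, different outcomes:
print(typewriter_backspace("helo", 4))      # ('helo', 3)
print(word_processor_backspace("helo", 4))  # ('hel', 3)
```

A user whose mental model comes from the typewriter will predict the first behaviour and be surprised by the second.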
Advance organizers
Verbal metaphors provided in advance can aid learning. For example, Foss (1982)
studied the effect of describing a metaphor for a system to new users before they start
learning. This is called an advance organiser. In this case file creation and storage were
explained in terms of a filing cabinet metaphor, and the result was that these users
performed better when they actually used the system.
[Figure: the physical filing cabinet mapped onto electronic file storage.]
Composite metaphors
A problem with the ‘metaphor as model’ approach is the difficulty of introducing new
functionality which does not fit into the interface metaphor. Designers have got round
this by introducing composite metaphors. These allow the desktop metaphor to include
objects which do not exist in the physical office, for example:
• Menus
• Windows
• scroll bars (these make use of the concept of unrolling a scroll, or rolled-up document)
It might be assumed that users would have difficulty with composite metaphors. In general
it has been found that people deal with them rather well and can develop multiple
mental models. Some composite mental models can cause confusion, however. For example,
on a Macintosh, you can eject a disk by dragging it to the trash – you retrieve it by
apparently throwing it away.
The Web
Metaphor: Travel
Familiar knowledge: going from place to place
Online shopping
Metaphor: Shopping cart
Familiar knowledge: adding items, checking out
Graphics packages
Metaphor: Toolbox
Familiar knowledge: paint, brushes, pencils, rubbers
Media Players
Metaphor: Tape/CD player
Familiar knowledge: play, stop, fast-forward/rewind buttons
Multimedia environments and web sites
Metaphor: Rooms (each associated with a different task)
Familiar knowledge: interiors of buildings
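One reason the shopping cart metaphor above transfers so well is that it maps almost directly onto a simple data structure. A minimal sketch (the class and method names are illustrative, not taken from any real shop system):

```python
class ShoppingCart:
    """Online shopping cart: users reuse what they know about real carts."""

    def __init__(self):
        self.items = []  # items 'placed in the cart'

    def add(self, name: str, price: float):
        """Put an item in the cart."""
        self.items.append((name, price))

    def remove(self, name: str):
        """Take an item back out of the cart."""
        self.items = [i for i in self.items if i[0] != name]

    def check_out(self) -> float:
        """'Going to the till': total the contents and empty the cart."""
        total = sum(price for _, price in self.items)
        self.items = []
        return total

cart = ShoppingCart()
cart.add("book", 12.50)
cart.add("pen", 1.50)
print(cart.check_out())  # 14.0
```

Each metaphor operation (add, remove, check out) corresponds to familiar knowledge the user already holds.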
Some researchers have experimented with virtual reality environments to interact with
applications such as databases. This figure below shows a database query represented as
opening a drawer in a room where data on a specific topic is stored. Do you think this
type of metaphor is helpful?
Similarly, some early web sites used elaborate metaphors to make their web sites closely
resemble the real world. These were not generally considered to be successful. Compare
the Southwest Airlines home page from 1997 and its replacement from 1999. Why do
you think the original metaphor was abandoned?
Progress Bars
Metaphor: progress is to the right (orientation metaphor). Based on direction of reading
– highly cultural, as some cultures read from right to left.
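The orientation metaphor is easy to see in a text progress bar. This hypothetical sketch renders the same progress left-to-right or right-to-left, matching the reading direction:

```python
def progress_bar(fraction: float, width: int = 10, left_to_right: bool = True) -> str:
    """Render progress as filled/unfilled cells; the direction in which the
    bar fills is a cultural convention, not a technical necessity."""
    filled = round(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    return bar if left_to_right else bar[::-1]

print(progress_bar(0.7))                       # '#######---'
print(progress_bar(0.7, left_to_right=False))  # '---#######'
```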
Icons
Symbolic icons use metaphors to convey their meaning – e.g. a globe to represent the
World Wide Web, or the magnifying glass icon in a photo manipulation program to
represent zooming in on an image.
Pervasive computing
Pervasive computing, also known as ubiquitous computing is the trend towards
increasingly ubiquitous connected computing devices in the environment, a trend being
brought about by a convergence of advanced electronic - and particularly, wireless -
technologies and the Internet. Pervasive computing devices are not personal computers as
we tend to think of them, but very tiny - even invisible - devices, either mobile or
embedded in almost any type of object imaginable, including cars, tools, appliances,
clothing and various consumer goods - all communicating through increasingly
interconnected networks. In these devices, the computer interface moves away from the
desktop and into the objects of the surrounding environment.
Chapter 7
Input
Input Devices
Input is concerned with:
recording and entering data into a computer system
issuing instructions to the computer
An input device is "a device that, together with appropriate software, transforms
information from the user into data that a computer application can process".
The choice of input device for a computer system should contribute positively to the
usability of the system. In general, the most appropriate device is one which:
• Matches the users
Physiological characteristics
Psychological characteristics
Training and expertise
• Is appropriate for the tasks
e.g. continuous movement, discrete movement
• Is suitable for work environment
e.g. speech input is not appropriate for noisy workplace
Many systems use two or more complementary input devices together, such as a
keyboard and a mouse.
There should also be appropriate system feedback to guide, reassure and, if necessary,
correct users’ errors, for example:
On screen - text appearing, cursor moves across screen
Auditory – alarm, sound of mouse button clicking
Tactile – feel of button being pressed, change in pressure
Keyboards
A keyboard is:
A group of on-off push-buttons
A discrete entry device
Issues:
• Physical design of keys
size
feedback
robustness
• Grouping and layout of keys
QWERTY typewriter layout is most common but others are possible
Types of keyboard
QWERTY
Standard alphanumeric keyboard designed for typewriters. The key arrangement was
chosen to reduce the incidence of keys jamming in mechanical typewriters.
Dvorak
The Dvorak keyboard is a typewriter key arrangement that was designed to be easier to
learn and use than the standard QWERTY keyboard. The Dvorak keyboard was designed
from the typist's point-of-view - with the most common consonants on one side of the
middle or home row and the vowels on the other side so that typing tends to alternate key
strokes back and forth between hands. The Dvorak approach is said to lead to faster
typing. It was named after its inventor, Dr. August Dvorak. Dr. Dvorak also invented
systems for people with only one hand.
Both Windows and Macintosh operating systems provide ways for the user to tell the
system that they are using a Dvorak keyboard. Although the QWERTY system seems too
entrenched to be replaced by the Dvorak system, some keyboard users will prefer the
more ergonomic arrangement of the Dvorak system.
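Operating-system support for Dvorak amounts to remapping physical key positions to different characters. A fragment of that mapping, sketched for the home row only (the full layouts cover every key and are found in the OS keyboard settings):

```python
# Physical key (QWERTY label) -> character produced under the Dvorak layout.
# Home row only, for illustration.
QWERTY_TO_DVORAK = {
    "a": "a", "s": "o", "d": "e", "f": "u", "g": "i",
    "h": "d", "j": "h", "k": "t", "l": "n", ";": "s",
}

def retype(keys: str) -> str:
    """What appears on screen when home-row keystrokes (named by their
    QWERTY labels) are interpreted under the Dvorak layout."""
    return "".join(QWERTY_TO_DVORAK.get(k, k) for k in keys)

print(retype("asdf"))  # 'aoeu' -- the Dvorak left-hand home keys are vowels
```

The table makes the design rationale visible: all the vowels sit under one hand, so typing alternates between hands.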
Chord Keyboards
Chord keyboards are smaller and have fewer keys, typically one for each finger and
possibly the thumbs. Instead of the usual sequential, one-at-a-time key presses, chording
requires simultaneous key presses for each character typed, similar to playing a musical
chord on a piano.
The primary advantage of the chording keyboard is that it requires far fewer keys than a
conventional keyboard. For example, with five keys there are 31 chord combinations that
may represent letters, numbers, words, commands, or other strings. With fewer keys,
finger travel is minimized because the fingers always remain on the same keys. In
addition, the user is free to place the keyboard wherever it is convenient and may avoid
the unnatural keying posture associated with a conventional keyboard.
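The figure of 31 comes from simple counting: five keys, each either pressed or not, give 2^5 = 32 patterns, minus the one pattern in which no key is pressed at all. The enumeration can be checked directly:

```python
from itertools import combinations

KEYS = ["thumb", "index", "middle", "ring", "little"]

# Every non-empty subset of the five keys is a distinct chord.
chords = [c for r in range(1, len(KEYS) + 1) for c in combinations(KEYS, r)]
print(len(chords))  # 31, i.e. 2**5 - 1
```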
The most significant disadvantage of the chording keyboard is that it cannot be used by
an untrained person. At least 15 hours of training and practice are necessary to learn the
chord patterns that represent individual letters and numbers. A second disadvantage of
the chording keyboard is that data entry rates (characters per unit of time) are actually
slower than data entry rates for conventional keyboards. Due to the increased learning
time and slower performance, chording keyboards have not become commercially viable
except for specialized applications.
Dedicated buttons
Some computer systems have custom-designed interfaces with dedicated keys or buttons
for specific tasks. These can be useful when there is a very limited range of possible
inputs to the system and where the environment is not suitable for an ordinary keyboard.
In-car satellite navigation systems and gamepads for computer games are good examples.
Pointing Devices
• These can be used to specify a point or a path.
• Pointing devices are usually continuous entry devices.
Cursor controls
Two dimensional devices which can move a cursor and drag objects on the screen.
Mice
Can move around on a flat surface. Mice are not convenient in limited spaces.
Presentation mice
Handheld devices, usually wireless, do same job as an ordinary mouse but do not need a
surface.
Trackballs
Ball rotates in fixed socket. Some people find this easier to use than a mouse.
Touchpads
Usually found on laptop computers, but can also be used as separate devices. Work like
trackballs but without moving parts.
Joysticks
Used when user needs to input direction and speed. Other devices are used to indicate
position. To see the difference, consider playing a flight simulation game with a mouse
as your input device. Why would this be difficult?
Cursor Keys
Cursor keys can be used to move a cursor, but it is difficult to accomplish dragging.
Using keys can provide precise control of movement by moving in discrete steps, for
example when moving a selected object in a drawing program. Some handheld computers
have a single cursor button which can be pressed in any of four directions.
Touch screens
Touch displays allow the user to input information into the computer simply by touching
an appropriate part of the screen. This kind of screen is bi-directional – it both receives
input and outputs information.
Advantages:
Easy to learn – ideal for an environment where use by a particular user may only
occur once or twice
Require no extra workspace
No moving parts (durable)
Provide very direct interaction
Disadvantages:
Lack of precision
High error rates
Arm fatigue
Screen smudging
Touch screens are used mainly for:
Kiosk devices in public places, for example for tourist information
Handheld computers
Pen Input
Touchscreens designed to work with pen devices rather than fingers have become very
common in recent years. Pen input allows more precise control, and, with handwriting
recognition software, also allows text to be input. Handwriting recognition can work
with ordinary handwriting or with purpose designed alphabets such as Graffiti.
Pen input is used in handheld computers (PDAs) and specialised devices, and more
recently in tablet PCs, which are similar to notebook computers, running a full version of
the Windows operating system, but with a pen-sensitive screen and with operating system
and applications modified to take advantage of the pen input.
Pen input is also used in graphics tablets, which are designed to provide precise control
for computer artists and graphic designers.
A tablet PC
3D input devices
All the pointing devices described above allow input and manipulation in two
dimensions. Some applications require input in three dimensions, and specialised input
devices have been developed for these.
3D trackers
3D trackers are often used to interact with Virtual Reality environments.
Stationary Controllers (Small range of motion)
Best for precise 3D element manipulation
Motion Trackers (Large range of motion)
Best for 3D region pointing or head tracking
Virtual Reality Gloves (Datagloves)
Hand gestures
Head Mounted Displays - HMDs (Tracker+Displays)
Best for 3D scene navigation/exploration
The Spacemouse
Speech input
Speech or voice recognition is the ability of a machine or program to recognize and carry
out voice commands or take dictation. In general, speech recognition involves the ability
to match a voice pattern against a provided or acquired vocabulary. Usually, a limited
vocabulary is provided with a product and the user can record additional words. More
sophisticated software has the ability to accept natural speech (meaning speech as we
usually speak it rather than carefully spoken speech).
There are three basic uses of speech recognition:
1. Command & Control
give commands to the system that it will then execute (e.g., "exit application" or
"take airplane 1000 feet higher")
usually speaker independent
2. Dictation
dictate to a system, which will transcribe your speech into written text
usually speaker-dependent
3. Speaker Verification
your voice can be used as a biometric (i.e., to identify you uniquely)
Speech input is useful in applications where the use of hands is difficult, either due to the
environment or to a user’s disability. It is not appropriate in environments where noise is
an issue.
Much progress has been made, but we are still a long way from the image we see in
science fiction of humans conversing naturally with computers.
Choosing Devices
The preceding pages have described a wide range of input devices. We will now look
briefly at some examples which illustrate the issues which need to be taken into account
when selecting devices for a system: matching devices with:
the work
the users
the environment
Matching devices with work
Example: Application – panning a small window over a large graphical surface (such as
a layout diagram of a processor chip) which is too large to be viewed in detail as a
whole. The choice of device is between a trackball and a joystick. This scenario was
suggested by Buxton (1986).
• Panning: the trackball is better suited, as the motion of the ball is mapped directly to
motion over the surface, while the motion of the joystick is mapped to the speed of
motion over the surface.
• Zooming: no obvious advantage to either device.
• Combined zooming and panning: possible with the joystick (just displace and twist)
but virtually impossible with the trackball, so the joystick is better suited.
• Locating a point accurately and keeping it stationary: difficult with the joystick, but
easily done with the trackball – when the ball is stopped, motion stops, and the bezel
around the ball can be rotated. The trackball is therefore better suited.
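The underlying distinction in the panning case is between position control (trackball: ball displacement maps directly to cursor displacement) and rate control (joystick: stick displacement maps to cursor velocity). A hypothetical sketch of the two mappings, with invented gain and speed parameters:

```python
def trackball_update(cursor: float, ball_motion: float, gain: float = 1.0) -> float:
    """Position control: the cursor moves by the amount the ball moved."""
    return cursor + gain * ball_motion

def joystick_update(cursor: float, stick_displacement: float,
                    dt: float = 0.1, speed: float = 5.0) -> float:
    """Rate control: stick displacement sets a velocity; the cursor keeps
    moving for as long as the stick is held off-centre."""
    return cursor + speed * stick_displacement * dt

# Releasing the trackball stops the cursor immediately;
# a held joystick keeps the cursor drifting every time step.
c = trackball_update(0.0, 2.0)   # one ball movement, then the cursor stays put
j = 0.0
for _ in range(4):               # stick held at full displacement for 4 steps
    j = joystick_update(j, 1.0)
print(c, j)
```

This is why the trackball wins for precise positioning and the joystick for sustained, steerable motion.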
Matching Devices with Users
Chapter 8
Output
Output Devices
Output devices convert information coming from an internal representation in a computer
system to a form understandable by humans. Most computer output is visual and two-
dimensional. Screens and hard copy from a printer are the most common. These devices
have developed greatly in the past two decades, giving greater opportunities for HCI
designers:
Screens
Larger display areas, more colors and higher resolutions, allowing interfaces to
become more graphical and to present more information.
Higher performance graphics adapters allowing detailed 3D visualisations to be
displayed
High quality flat panel screens which can fit in laptop and pocket computers. Flat
panel screens can also save space in non-mobile systems, which can be highly
significant in environments where large amounts of information must be
displayed in a limited physical space, e.g. a stock exchange.
Touch screens, using finger or pen as for input – allow input and output devices to
be combined
Printers
Speed and quality – the laser printer allowed computers to output high quality text
quickly
Colour – inkjet printers made it possible for a wide range of users to produce
colour hard copy
Cost – the reduction in cost of printers allows much more flexibility in their use.
In order to increase bandwidth for information reaching the user, it is an important goal to
use more channels in addition to visual output. One commonly used supplement for
visual information is sound, but its true potential is often not recognized. Audible
feedback can make interaction substantially more comfortable for the user, providing
unambiguous information about the system state and success or failure of interaction (e.
g., a button press), without putting still more load onto the visual channel. Tactile
feedback can serve a similar purpose, for example, input devices physically "reacting" to
input. This form of output is often used in gaming (force feedback joysticks, steering
wheels, etc.) as well as other specialized applications.
Purposes of Output
Output has two primary purposes:
Presenting information to the user
Providing the user with feedback on actions
Visual Output
Visual display of text or data is the most common form of output. The key design
considerations are that the information displayed be legible and easy to locate and
process. These issues relate to aspects of human cognition described in previous chapters.
Visual output can be affected by poor lighting, eye fatigue, screen flicker and quality of
text characters.
Visual Feedback
Users need to know what is happening on the computer’s side of an interaction. The
system should provide responses which keep users well informed and feeling in control.
This includes providing information about both normal processes and abnormal situations
(errors).
Dynamic visualizations
Dynamic visualizations are computer controlled visualizations of information. They can
allow users to interact with information in a visual way, often allowing complex
relationships within the information to be discovered which would be difficult to observe
in textual form. The figure below shows a visualization of a ‘social network’ –
information about messages passed among a group of students and replies received. The
information is shown as a 3D wire frame model, and the user can rotate the model or
zoom in and out at will to explore the model.
This is an example of visualisation of external data which comes from a process beyond
the computer’s control. Model-based simulation displays information based on a model
or simulation under the computer’s control, such as a visualization of a mathematical
function.
All visualizations require a mapping between the information or model and the way it is
displayed. The mapping should be chosen to make the meaning we wish to extract
perceptually prominent. For example, in the social network visualisation, the position
of the student in the model indicates the number of replies received to messages.
Computer-based visualisations have advantages over other forms of visualisation, such as
video recordings, for communicating information:
Can be controlled interactively by the user
Can change the mappings between the model and the way it is displayed, for
example by changing colour coding
Once mappings have been established, can easily produce visualisations of new
situations
Sound Output
Sound is of particular value where the eyes are engaged in some other task or where the
complete situation of interest cannot be visually scanned at one time, for example:
Applications where eyes and attention are required away from the screen – e.g.
flight decks, medical applications, industrial machinery
Applications involving process control in which alarms must be dealt with
Applications addressing the needs of blind or partially sighted users.
Data sonification – situations where data can be explored by listening, given an
appropriate mapping between data and sound. The Geiger counter is a well-known
example of sonification.
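A sonification needs only a mapping from data values to sound parameters. In this hypothetical sketch, readings are mapped linearly onto a pitch range, so higher readings sound as higher pitches (actual audio playback is omitted):

```python
def value_to_pitch(value: float, v_min: float, v_max: float,
                   f_min: float = 220.0, f_max: float = 880.0) -> float:
    """Map a data value linearly onto a frequency range in Hz."""
    fraction = (value - v_min) / (v_max - v_min)
    return f_min + fraction * (f_max - f_min)

readings = [0.0, 25.0, 50.0, 100.0]
pitches = [value_to_pitch(v, 0.0, 100.0) for v in readings]
print(pitches)  # [220.0, 385.0, 550.0, 880.0]
```

Choosing a good mapping is the design problem, exactly as with visual output.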
Natural sounds
Gaver (1989) developed the idea of auditory icons. These are natural, everyday sounds
which are used to represent actions and objects within an interface. Gaver suggests that
sounds are good for providing information on background processes or inner workings
without disrupting visual attention. At that time, most computer systems were capable
only of very simple beeps. Developments in sound generation in computers have made it
possible to play back high quality sampled or synthesised sounds, and a wide variety of
natural sounds are used in applications.
Speech
Speech output is one of the most obvious means of using sound for providing users with
feedback. Successful speech output requires a method of synthesising spoken words
which can be understood by the user. Speech can be synthesised using one of two basic
methods:
Concatenation
Digital recordings of real human speech are stored. These can be words, word segments,
phrases or sentences. These can be played back later under computer control. This can be
very successful when a very limited range of output is required – for example
announcements on trains which state the name of the next station.
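Concatenation synthesis is essentially a lookup-and-join over a fixed set of recordings. This sketch simulates it with a dictionary of 'clips', using strings to stand in for audio data; the clip names and station names are invented for illustration:

```python
# Recorded clips, simulated here as strings; a real system stores audio data.
CLIPS = {
    "next_station": "The next station is",
    "kings_cross": "King's Cross",
    "euston": "Euston",
}

def announce(*clip_ids: str) -> str:
    """Concatenate pre-recorded clips in the order given."""
    return " ".join(CLIPS[c] for c in clip_ids)

print(announce("next_station", "euston"))  # The next station is Euston
```

The approach works well precisely because the range of output is closed: every utterance is assembled from clips recorded in advance.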
Synthesis-by-rule
Does not use recordings. Instead, speech is completely synthesized by the computer using
phonemes as building blocks. A phoneme is the smallest contrastive unit in the sound
system of a language. This can be useful when large vocabularies are required. It also
allows variation in pitch and tone. The W3C is currently working on a specification for
Speech Synthesis Markup Language (SSML), an XML-based markup language for
speech applications.
Speech output is not always successful – the infamous 1983 Austin Maestro had digital
instruments and a speaking computer. These innovations were not widely copied…
Chapter 9
Requirements for support
Users have different requirements for support at different times. User support should be
• Available but unobtrusive
• Accurate and robust
• Consistent and flexible
User support comes in a number of styles:
• Command-based methods
• Context-sensitive help
• Tutorial help
• On-line documentation (integrated with application)
• Written documentation (manuals or notices)
• Web-based documentation
Design of user support must take account of
• Presentation
• Implementation
Online Help
Many programs come with the instruction manual, or a portion of the manual, integrated
into the program. If you encounter a problem or forget a command while running the
program, you can summon the documentation by pressing a designated Help key or
entering a HELP command. Once you summon the Help system, the program often
displays a menu of Help topics. You can choose the appropriate topic for whatever
problem you are currently encountering. The program will then display a help screen that
contains the desired documentation.
Some programs are more sophisticated, displaying different Help messages depending on
where you are in the program. Such systems are said to be context sensitive.
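Context sensitivity can be as simple as keying the help lookup on the program's current state. A minimal hypothetical sketch (the contexts and help strings are invented):

```python
# Help text keyed by the part of the program the user is currently in.
HELP_TOPICS = {
    "editor":  "Type to insert text; CTRL-S saves the document.",
    "print":   "Choose a printer and page range, then press Print.",
    "default": "Press the Help key in any screen for help on that screen.",
}

def context_help(current_context: str) -> str:
    """Return help relevant to where the user is in the program,
    falling back to a general message for unknown contexts."""
    return HELP_TOPICS.get(current_context, HELP_TOPICS["default"])

print(context_help("print"))    # help about the print dialog
print(context_help("unknown"))  # falls back to the general message
```

Real help systems are far richer, but the principle is the same: the system, not the user, supplies the relevant context.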
There has been a large body of research done to try to understand how users interact with
online help. One aspect which has been studied is the kind of questions which prompt use
of online help (O’Malley 1986, Sellen & Nicol 1990). Typical questions appear to focus
on:
• Goal exploration – what can I do with this system?
• Definition and description – what is this? What is it for?
• Task achievement – how do I do this?
• Diagnostic – how did that happen?
• State identification – where am I?
Most desktop applications and operating systems now have comprehensive online help
systems, and a range of tools is available to make the process of creating help systems
relatively easy. However, the system designer must take account of the above question
types to make sure that the system actually addresses users' needs.
BSc Applied Computing Human Computer Interaction: 9. User Support