
User Evaluation:

Usability Testing
Human Computer Interaction
Fulvio Corno, Luigi De Russis
Academic Year 2019/2020
Evaluation Goal (recap)
§ «Evaluation tests the usability, functionality, and acceptability of an
interactive system»
o According to the design stage (sketch, prototype, … final)
o According to the initial goals
o Alongside different dimensions
o Using a range of different techniques
§ Very wide (and a little bit vague) definition
§ The idea is to identify and correct problems as soon as possible

2
Human Computer Interaction
Evaluation Approaches (recap)

§ Evaluation may take place:
o In the laboratory
o In the field

§ Involving users:
o Experimental methods
o Observational methods
o Query methods
o Formal or semi-formal or informal

§ Based on expert evaluation:
o Analytic methods
o Review methods
o Model-based methods
o Heuristics

§ Automated:
o Simulation and software measures
o Formal evaluation with models and formulas
o Especially for low-level issues
3
Human Computer Interaction
Lab vs. Field

Evaluation in the Lab

§ Advantages
o specialist equipment available
o uninterrupted environment

§ Disadvantages
o lack of context
o difficult to observe several users cooperating

§ Appropriate
o if the system location is dangerous or impractical
o for constrained single-user systems, to allow controlled manipulation of use

Evaluation in the Field

§ Advantages
o natural environment
o context retained (although observation may alter it)
o longitudinal studies possible

§ Disadvantages
o distractions
o noise

§ Appropriate
o where context is crucial
o for longitudinal studies

4
Human Computer Interaction
Involving Users: Experimental Methods

Usability/User Testing

§ "Let's find someone to use our app, so that we will get some feedback on how to improve it."
§ anecdotal, mostly
§ observation-driven

Controlled Experiments

§ "We want to verify if users of our app perform task X faster/…/with fewer errors than our competitor's app."
§ scientific
§ hypothesis-driven
5
Human Computer Interaction
Usability Testing
§ Usability testing speeds up many projects and produces cost savings in
system development
§ Participants should represent the intended user communities, with attention
to:
o background in computing and experience with the task
o motivation, education, and ability with the natural language used in the
interface
§ The movement towards usability testing stimulated the building of ad-hoc
usability labs

7
Human Computer Interaction
Usability Testing Labs
§ The usability lab usually consists of two areas
o the testing room
o the observation room
§ The testing room is typically smaller
and accommodates a small number of people
§ The observation room looks into the testing room, typically through a one-way
mirror
o it is larger and can hold the facilitators with ample room to bring in others,
such as the developers of the product being tested

8
Human Computer Interaction
Usability Testing: 3 Steps
1. Plan
o who are your participants? what are you going to test, where, and how?
2. Run
o one participant at a time, over multiple sessions
o collect data about the interactive system/interface
3. Analyze
o extract information from the collected data, both qualitative and
quantitative

9
Human Computer Interaction
Plan
Usability Testing

10
Human Computer Interaction
Usability Testing: Plan
§ Choose who you will involve in the test
o who are your (target) users?
§ How many participants do you need?
o 5!
o https://www.nngroup.com/articles/how-many-test-users/
§ Decide which roles you and your team are going to "play"
o you need at least a facilitator for the session
o another 1-2 people may serve as note-takers and observers
o N.B. the developers, designers, creators, … of the interactive system under
evaluation must not serve as facilitators!

11
Human Computer Interaction
Usability Testing: Plan
§ Choose which task(s) you are going to ask your participants to perform
o tasks may be introduced with a scenario
o they must be concrete and have a clear goal
o plan for between 5 and 10 tasks

§ Choose the methodology (if any) you intend to apply
o think-aloud, cooperative evaluation, …, none
• more details in a few slides
o and for which tasks you are going to use it

§ Define detailed success/failure criteria for each task (one possible way to write tasks and criteria down is sketched below)
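
As an illustration only (not part of the original slides), concrete tasks with their success/failure criteria could be written down in a small structure like the Python sketch below; all field names (scenario, goal, success_criterion, time_limit_s) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TestTask:
    """One task of the usability test, as written in the test protocol."""
    task_id: str
    scenario: str            # short story that introduces the task to the participant
    goal: str                # concrete, observable goal the participant must reach
    success_criterion: str   # how observers decide between success and failure
    time_limit_s: Optional[int] = None  # optional threshold for the time-on-task metric

tasks = [
    TestTask(
        task_id="T1",
        scenario="You have just installed the app and want to book a table for two tonight.",
        goal="Complete a reservation for two people at 8 pm today.",
        success_criterion="A confirmation screen with the correct date, time, and party size is shown.",
        time_limit_s=180,
    ),
]
```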

12
Human Computer Interaction
Usability Testing: Plan
§ Decide whether you need or want to ask for any additional information
o before and/or after the test
o before and/or after each task
o before and/or after a meaningful group of tasks

§ Select which equipment you will need


o also with respect to the criteria and methodology you define

§ Prepare an informed consent form for participants to fill in and sign

13
Human Computer Interaction
Usability Testing: Plan
§ Decide whether to have a debriefing session at the end of the test
o for each participant
o observers and note-takers can ask general and specific questions, to better
understand some pathways or comments
§ Develop a written test protocol ("script") for consistency among sessions
o step-by-step instructions with all the needed questions and forms
o often down to the exact words that the facilitator will say
o the appendix may contain a table with all tasks and their metrics
§ Practice your script with friends or colleagues
o to fix obvious bugs, so that you do not waste your or the users' time

14
Human Computer Interaction
Informed Consent Form
§ Professional ethics practice is to ask all participants to read, understand, and
sign a statement which says:
o I have freely volunteered to participate in this experiment
o I have been informed in advance what my task(s) will be and what
procedures will be followed
o I have been given the opportunity to ask questions and have had my
questions answered to my satisfaction
o I am aware that I have the right to withdraw consent and to discontinue
participation at any time, without prejudice to my future treatment
o My signature below may be taken as affirmation of all the above
statements; it was given prior to my participation in this study

15
Human Computer Interaction
Metrics
§ For success/failure criteria and additional information
§ Subjective metrics, i.e., questions you ask participants:
o prior to the session, e.g., background info
o after each task scenario is completed, such as ease and satisfaction
questions about the task
o overall ease of use, satisfaction, and likelihood to use/recommend at the
end
§ Quantitative metrics
o what you will be measuring in your test, e.g., successful completion rates,
error rates, time on task

16
Human Computer Interaction
Metrics
§ Successful Task Completion [Boolean value, 0-100 scale, …]
o a task is successfully completed when the participant indicates they have found the answer or completed the task goal

§ Critical Errors [absolute or relative number]
o deviations at completion from the targets of the task, such that the participant cannot finish the task; the participant may or may not be aware that the task goal is incorrect or incomplete

§ Non-Critical Errors [absolute or relative number; they may also affect "successful task completion"]
o errors that the participant recovers from and that do not prevent them from successfully completing the task; these errors make the task less efficient to complete

§ Error-Free Rate [relative number]
o the percentage of participants who complete the task without any errors

17
Human Computer Interaction
Metrics

§ Time On Task [time]
o the amount of time it takes the participant to complete the task

§ Subjective Measures [Likert scale]
o self-reported participant ratings for satisfaction, ease of use, ease of finding information, etc.

§ Likes, Dislikes, and Recommendations [free text]
o what participants liked the most about the system, what they liked least, any recommendations for improving it, etc.; typically at the end of the session or of a meaningful part of it

Reliable and validated questionnaires exist for subjective measures and open questions
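
Purely as an illustration (not from the original slides), the per-participant, per-task observations behind these metrics could be recorded in a structure like the following Python sketch; all field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TaskObservation:
    """What the note-taker / logging setup records for one participant on one task."""
    participant_id: str
    task_id: str
    completed: bool           # successful task completion
    critical_errors: int      # errors that prevented reaching the task goal
    non_critical_errors: int  # recovered errors that only reduced efficiency
    time_on_task_s: float     # seconds from task start to completion or abandonment
    ease_rating: int          # post-task subjective rating on a 7-point Likert scale

obs = TaskObservation("P03", "T1", completed=True, critical_errors=0,
                      non_critical_errors=2, time_on_task_s=141.0, ease_rating=5)
```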

18
Human Computer Interaction
Methodology: Think-Aloud
§ While the participant performs a task, she is asked to describe what she is
doing and why, what she thinks is happening, etc.
§ Advantages
o simple, it requires little expertise
o can provide useful insight
o can show how the system is actually used
§ Disadvantages
o subjective
o selective
o the act of describing may alter task performance (e.g., time-on-task metric)
19
Human Computer Interaction
Methodology: Cooperative Evaluation
§ Variation of the think-aloud
§ The participant and the facilitator collaborate during the evaluation
o both can ask each other questions throughout

§ Additional advantages
o less constrained and easier to use
o user is encouraged to criticize system
o clarification possible

20
Human Computer Interaction
Equipment
§ Any of these can work for effective usability testing:
o Laboratory with two or three connected rooms outfitted with audio-visual
equipment
o Room with portable recording equipment
o Room with no recording equipment, as long as someone is observing the
participant and taking notes
o Remotely, with the participant in a different location (either moderated or
unmoderated)

21
Human Computer Interaction
Equipment: Some Material
§ Paper and pencil
o cheap, but limited by writing speed

§ Audio
o good for think-aloud

§ Video
o accurate and realistic
o needs special equipment
o may be obtrusive

§ Computer logging
o automatic and unobtrusive
o large amounts of data may be difficult to analyze

§ Eye-tracking
o to track and record eye movements

§ Mixed use in practice
o audio/video transcription is difficult and requires skill
o some automatic support tools are available
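
To make computer logging concrete, here is a minimal sketch (not from the slides) of an unobtrusive event logger that appends timestamped interaction events to a CSV file; the event names and file path are just examples.

```python
import csv
import time
from pathlib import Path

LOG_FILE = Path("session_P03.csv")  # hypothetical per-participant log file

def log_event(task_id: str, event: str, detail: str = "") -> None:
    """Append one timestamped interaction event (e.g., task_start, click, error)."""
    new_file = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["timestamp", "task_id", "event", "detail"])
        writer.writerow([time.time(), task_id, event, detail])

# Example usage during a session
log_event("T1", "task_start")
log_event("T1", "click", "button:confirm_booking")
log_event("T1", "task_end", "success")
```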
22
Human Computer Interaction
Post-Task Questionnaire: SEQ
§ Single Ease Question (SEQ)
§ Post-task questionnaires need to be short (1–3 questions) to interfere as little
as possible with the flow of using the system in a session
§ SEQ exemplifies this concept in a useful and simple manner
o experimentally validated
o reliable, valid, and sensitive
§ It asks the user to rate the difficulty of the activity they just completed, from
Very Easy to Very Difficult on a 7-point Likert scale
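
A trivial sketch (not from the slides) of collecting the SEQ answer right after a task; the prompt wording and the 1-7 coding of the scale endpoints are assumptions.

```python
def ask_seq() -> int:
    """Collect the Single Ease Question answer on a 7-point scale
    (coded here as 1 = Very Difficult ... 7 = Very Easy)."""
    while True:
        answer = input("Overall, how difficult or easy was the task to complete? (1-7): ")
        if answer.isdigit() and 1 <= int(answer) <= 7:
            return int(answer)
        print("Please enter a whole number between 1 and 7.")
```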

23
Human Computer Interaction
Post-Test Questionnaire: SUS
§ System Usability Scale (SUS)
o a "quick and dirt" (but trustable) usability scale
o invented by John Brooke in 1986
§ It measures the perceived usability of a system
§ A 10-item Likert-scale questionnaire
o each question has 5 response options
§ It produces a score from 0 to 100
o not equivalent to a percentage score!
§ A SUS score above 68 is considered above average

24
Human Computer Interaction
SUS: Questions
1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.

25
Human Computer Interaction
SUS: Scoring
To calculate the SUS score of your system:
1. Each answer is a value X from 1 to 5
2. For every odd-numbered question, subtract 1 from the answer (X-1)
o e.g., the answer to question 1 is 4, so its score is 4-1 = 3
3. For every even-numbered question, subtract the answer from 5 (5-X)
o e.g., the answer to question 2 is 4, so its score is 5-4 = 1
4. Sum the scores of all ten questions
5. Multiply the total by 2.5 (a worked example in code follows)
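
A minimal Python sketch of this scoring procedure (not part of the original slides); responses is assumed to be the list of the ten answers, each between 1 and 5, in questionnaire order.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) from ten answers, each on a 1-5 scale,
    given in questionnaire order (item 1 first)."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs exactly 10 answers, each between 1 and 5")
    total = 0
    for i, x in enumerate(responses, start=1):
        # odd-numbered items contribute (x - 1), even-numbered items contribute (5 - x)
        total += (x - 1) if i % 2 == 1 else (5 - x)
    return total * 2.5

# The answers 4 and 4 for items 1 and 2 contribute 3 and 1, as in the example above
print(sus_score([4, 4, 5, 2, 4, 1, 5, 2, 4, 2]))  # 77.5
```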

26
Human Computer Interaction
SUS: Advantages and Disadvantages
§ Advantages
o score reliability has been evaluated over the decades and is on par with more complex and costly methods
o free, quick, and simple
o widely used in industry
o applicable to a wide range of technologies, systems, and products

§ Disadvantages
o it is a subjective measure of perceived usability
• it should not be your only method
o it gives no clues about how to improve the score
• it is not diagnostic
o it is not possible to make systematic comparisons between two systems and their functionality using SUS

27
Human Computer Interaction
Post-Test Questionnaire: NASA-TLX
§ NASA Task Load indeX (NASA-TLX)
o emerged in the 1980s
o the result of NASA efforts to develop an instrument
for measuring the perceived workload required by
the complex, highly technical tasks of aerospace
crew members
§ Useful for studying complex products and tasks in high-
consequence environments
o e.g., healthcare, aerospace, military, etc.

28
Human Computer Interaction
NASA-TLX: Questions
§ 6 questions on an unlabeled 21-point scale
o ranging from Very Low to Very High
§ Each question addresses one dimension of the perceived workload:
o mental demand
o physical demand
o time pressure
o perceived success with the task
o overall effort level
o frustration level
§ Respondents also weight each of the six dimensions, to indicate which of them
mattered most to the task they were doing
29
Human Computer Interaction
NASA-TLX: Score
§ A complex instrument to score (a sketch of one common weighted scoring procedure is given below)
§ NASA shares a paper and pencil version
o with instructions
o https://humansystems.arc.nasa.gov/groups/tlx/tlxpaperpencil.php
§ and a free iOS app to compute the score
o https://itunes.apple.com/us/app/nasa-tlx/id1168110608
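
A sketch (not from the slides) of the weighted scoring commonly described for NASA-TLX, under the assumption that the 21-point responses are mapped to a 0-100 scale and that each dimension's weight is the number of the 15 pairwise comparisons it won.

```python
# Weighted NASA-TLX sketch: six ratings (assumed 0-100) and six weights that
# come from 15 pairwise comparisons between dimensions (weights sum to 15).
DIMENSIONS = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def tlx_score(ratings: dict, weights: dict) -> float:
    """Overall workload = sum(rating * weight) / 15, i.e., a weighted average."""
    if sum(weights.values()) != 15:
        raise ValueError("Weights from the pairwise comparisons must sum to 15")
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / 15

ratings = {"mental": 70, "physical": 10, "temporal": 55,
           "performance": 40, "effort": 60, "frustration": 35}
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}
print(tlx_score(ratings, weights))  # 58.0
```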

30
Human Computer Interaction
Sample Scripts and Some Tips
§ Sample usability testing scripts (mostly without the tasks described in them):
o https://www.sensible.com/downloads/test-script.pdf
o http://www.lse.ac.uk/intranet/staff/webSupport/guides/archivedWebeditorsHandbook/pdf/script.pdf
§ How to create good tasks?
o https://www.nngroup.com/articles/task-scenarios-usability-testing/

31
Human Computer Interaction
Run and Analyze
Usability Testing

32
Human Computer Interaction
Usability Testing: Run
§ Get informed consent
o better in written format
§ One person acts as the facilitator, and the rest of the team act as observers
o at least one of the observers must take notes
§ Tell each participant:
o "we are testing our app, not you! Any mistakes are app’s fault, not yours."
o IMPORTANT!

33
Human Computer Interaction
Usability Testing: Run
§ The facilitator should always follow the script, remain neutral, not help the
participants, and provide clear instructions
o tasks can be given in written form, one at a time, … or verbally
§ The facilitator must explain the chosen methodologies and encourage
participants to adopt them at the right moment
o e.g., how think-aloud works and for which tasks to use it
§ Note-takers record the participant's behavior, comments, errors, and
completion (success or failure) of each task
§ The setup must be ready to measure all of the defined criteria

34
Human Computer Interaction
Usability Testing: Analyze
§ Analyze collected data to find UI failures and ways to improve
o e.g., written notes, audio, video, usage logs, …
§ Do not forget to consider the collected metrics
o per task and overall
§ Quantitative data can be summarized as, e.g., success rates, task times, error
rates, and satisfaction questionnaire ratings (see the aggregation sketch below)
§ Look for trends and keep a count of problems that occurred across
participants
o e.g., observations about pathways participants took,
comments/recommendations, answers to open-ended questions
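
As a small illustration (not from the slides), quantitative observations such as the ones recorded per task could be aggregated across participants as follows; the record format is hypothetical.

```python
from statistics import mean

# Hypothetical per-participant, per-task records collected during the sessions
observations = [
    {"participant": "P01", "task": "T1", "completed": True,  "errors": 1, "time_s": 95.0},
    {"participant": "P02", "task": "T1", "completed": False, "errors": 3, "time_s": 240.0},
    {"participant": "P03", "task": "T1", "completed": True,  "errors": 0, "time_s": 110.0},
]

def summarize(task_id: str, obs: list) -> dict:
    """Per-task completion rate, error-free rate, and mean time on task."""
    rows = [o for o in obs if o["task"] == task_id]
    return {
        "task": task_id,
        "participants": len(rows),
        "completion_rate": mean(o["completed"] for o in rows),
        "error_free_rate": mean(o["errors"] == 0 for o in rows),
        "mean_time_s": mean(o["time_s"] for o in rows),
    }

print(summarize("T1", observations))
# approximately: completion_rate 0.67, error_free_rate 0.33, mean_time_s 148.3
```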

35
Human Computer Interaction
References
§ Alan Dix, Janet Finlay, Gregory Abowd, Russell Beale, Human Computer
Interaction, 3rd Edition
o Chapter 9: Evaluation Techniques
§ Ben Shneiderman, Catherine Plaisant, Maxine S. Cohen, Steven M. Jacobs, and
Niklas Elmqvist, Designing the User Interface: Strategies for Effective Human-
Computer Interaction
o Chapter 5: Evaluating Interface Design
§ usability.gov - Improving the User Experience
o https://www.usability.gov

36
Human Computer Interaction
References
§ Beyond the NPS: Measuring Perceived Usability with the SUS, NASA-TLX, and
the Single Ease Question After Tasks and Usability Tests
o https://www.nngroup.com/articles/measuring-perceived-usability/
§ John Brooke, SUS - A quick and dirty usability scale, 1986
o https://hell.meiert.org/core/pdf/sus.pdf
§ The Pros and Cons of the System Usability Scale (SUS)
o https://research-collective.com/blog/sus/

37
Human Computer Interaction
License
§ These slides are distributed under a Creative Commons license “Attribution-NonCommercial-ShareAlike 4.0
International (CC BY-NC-SA 4.0)”
§ You are free to:
o Share — copy and redistribute the material in any medium or format
o Adapt — remix, transform, and build upon the material
o The licensor cannot revoke these freedoms as long as you follow the license terms.
§ Under the following terms:
o Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were
made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses
you or your use.
o NonCommercial — You may not use the material for commercial purposes.
o ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions
under the same license as the original.
o No additional restrictions — You may not apply legal terms or technological measures that legally restrict
others from doing anything the license permits.
§ https://creativecommons.org/licenses/by-nc-sa/4.0/

38
Human Computer Interaction
