
CHAPTER 8 - SUMMATIVE EVALUATION

This approach contrasts with a formative evaluation, which is used to find and eliminate problems
during the design and development process rather than to judge a completed product against specific
goals.
RECALL:
Formative evaluation aims to test the functionality and usability of the design and to identify
and rectify any problems.
Summative evaluations describe how well a design performs, often compared to a benchmark such as
a prior version of the design or a competitor. Unlike formative evaluations, whose goal is to
inform the design process, summative evaluations involve getting the big picture and assessing the
overall experience of a finished product. Summative evaluations occur less frequently than
formative evaluations, usually right before or right after a redesign.

Summative evaluation example

Suppose we have shipped a new mobile app; it is time to run a study and see how the app stands in
comparison to its previous version. We can gather time on task and success rates for the core app
functionalities. Then we can compare these metrics against those obtained with the previous version
of the app to see whether there was any improvement. We will also save the results of this study to
evaluate subsequent major versions of the app. This type of study is a summative evaluation, since
it assesses the shipped product with the goal of tracking performance over time and ultimately
calculating our return on investment. However, during this study we might uncover some usability
issues. We should make note of those issues and address them during our next design iteration.
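
A minimal sketch (not from the chapter) of the comparison described above: the snippet computes mean time on task and success rates for two hypothetical versions of one core task. All measurements and names are illustrative assumptions.

```python
# Hypothetical per-participant measurements for one core task.
from statistics import mean

old_times = [48.2, 61.5, 55.0, 70.3, 52.8]   # time on task (s), previous version
new_times = [39.1, 44.7, 50.2, 41.9, 47.5]   # time on task (s), new version

old_success = [True, False, True, True, False]  # task completed?
new_success = [True, True, True, False, True]

def success_rate(outcomes):
    """Fraction of participants who completed the task."""
    return sum(outcomes) / len(outcomes)

print(f"Time on task: {mean(old_times):.1f}s -> {mean(new_times):.1f}s")
print(f"Success rate: {success_rate(old_success):.0%} -> {success_rate(new_success):.0%}")
```

Saving these numbers gives the reference point the chapter mentions: the same script can be rerun against each subsequent major version.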

Alternatively, another type of summative evaluation could compare our results with those obtained
with one or more competitor apps or with known industry-wide data.

All summative evaluations paint an overall picture of the usability of a system. They are intended to
serve as reference points so that you can determine whether you’re improving your own designs over
time or beating out a competitor.

Although summative evaluations are often quantitative, they can be qualitative studies, too. For
example, you might like to know where your product stands compared with your competition. You
could hire a user experience (UX) expert to do an expert review of your interface and a competitor’s.
The expert review would use the 10 usability heuristics as well as the reviewer’s knowledge of UI and
human behavior to produce a list of strengths and weaknesses for both your interface and your
competitor’s. The study is summative because the overall interface is being evaluated with the goal of
understanding whether the user experience (UX) of your product stands up to the competition and
whether a major redesign is warranted.

Expert Reviews
 Expert reviews are best used when you can’t conduct a usability test or in conjunction with
insights collected from observing even just a handful of users attempting realistic tasks on a
website or application.
 Formal expert reviews have proven to be effective.
 Experts may be available as staff or as consultants
 Expert reviews may take one-half day to one week, but lengthy training may be required to explain
the task domain or operational procedures
 There are a variety of expert review methods to choose from
 Expert reviews can be scheduled at several points in the development process when experts are
available and the design team is ready for feedback
 Different experts tend to find different problems in an interface, so 3-5 expert reviewers can be
highly productive, as can complementary usability testing.
 Expert reviews can be conducted on short notice and with little time commitment
 Can occur early or late in the design phase
 Deliverable can be a formal report with problems identified and recommendations
 Deliverable can also be an informal presentation with the development team and managers
 Expert reviews may require training on the task domain
Five steps for conducting an effective expert review
i. Understand the Method & Human Behavior
An expert review is not just your opinion of likes and dislikes. While you do need to use your
judgment, that judgment should be guided by principles of how humans interact with computers—
which will ideally be backed by research.
ii. Have Some Idea of Common Tasks Users Will Perform
Focusing an expert review on the top tasks a user might likely perform will help improve the number
and relevance of problems uncovered. It may also reduce the chances the problems you find
are false positives.
A common way to apply the expert review is to use a modified cognitive walkthrough. Decompose
each task as the user would attempt it, and try to think like the users. This means you should have
some data on both the type of tasks and the type of domain knowledge (e.g. terms) the users are
likely to have when they’re using the interface.
iii. Conduct the Review Methodically and Independently
Conduct the review. You can go low tech or high tech. Use a Word document, spreadsheet,
PowerPoint deck, paper, or web form to record your observations. Think globally and locally about
the experience: look for issues that span multiple screens (like navigation) and issues that are more
particular (like content or actions on a specific page). Record screenshots with clearly articulated
examples of what you’ve identified as a problem, its possible impact, and suggestions for
improvement (when appropriate).
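
One low-tech way to structure such a record is sketched below; the field names and the sample finding are illustrative assumptions, not part of the chapter.

```python
# A minimal sketch of one recorded observation, with the fields the step above
# recommends. The field names and the sample finding are illustrative.
from dataclasses import dataclass

@dataclass
class Finding:
    screen: str            # where the issue was seen (e.g. a screenshot reference)
    scope: str             # "global" (spans screens) or "local" (a specific page)
    description: str       # clearly articulated problem
    impact: str            # possible impact on users
    suggestion: str = ""   # suggested improvement, when appropriate

findings = [
    Finding(
        screen="checkout_step2.png",
        scope="local",
        description="'Submit' button sits below the fold on small screens",
        impact="Users may not realize the form can be submitted",
        suggestion="Keep the primary action visible without scrolling",
    ),
]
print(findings[0].description)
```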
iv. Have another Expert Perform an Independent Review
The best expert reviews are those that involve multiple evaluators working independently. Have at
least one other person conduct the review; ideally 3-5 is a good number. Even less experienced
evaluators can help provide a fresh perspective to more seasoned experts. This redundancy helps
find more issues AND minimizes the perception that this exercise is only one person’s biased
opinion.
v. Categorize, Reconcile Differences, and Add Severity
Aggregate your results and report on which issues multiple evaluators identified. This will help
corroborate the findings (a measure of validity). Reconcile the problems that only some (or one) of
the evaluators found and see whether these are unique problems or just different instances of
another problem already identified. Expect to find a lot of unique problems, but remember that just
because only one evaluator identified it, doesn’t mean it’s not a legitimate problem. Consider using
some form of simple severity rating (e.g. minor, moderate, severe) or find a way to prioritize the
issue list if there are a lot of problems uncovered.
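
A minimal sketch of this aggregation step, assuming each evaluator returns a list of (issue, severity) pairs; the issue labels and severities below are illustrative, not from the chapter.

```python
# Count how many evaluators independently flagged each issue (a rough validity
# check) and keep the worst severity any evaluator assigned to it.
from collections import Counter

SEVERITIES = ["minor", "moderate", "severe"]  # simple severity scale

reviews = {
    "evaluator_a": [("unlabeled search icon", "minor"),
                    ("checkout error message unclear", "severe")],
    "evaluator_b": [("checkout error message unclear", "moderate"),
                    ("inconsistent button styles", "minor")],
    "evaluator_c": [("checkout error message unclear", "severe")],
}

counts = Counter(issue for findings in reviews.values() for issue, _ in findings)

worst = {}
for findings in reviews.values():
    for issue, severity in findings:
        current = worst.get(issue)
        if current is None or SEVERITIES.index(severity) > SEVERITIES.index(current):
            worst[issue] = severity

for issue, n in counts.most_common():
    print(f"{issue}: flagged by {n} evaluator(s), worst severity: {worst[issue]}")
```

Issues flagged by several evaluators are corroborated; the ones flagged only once still deserve a look before being dismissed, as the step above notes.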

Expert Review Methods


1. Heuristic evaluation
2. Guidelines review
3. Consistency inspection
4. Cognitive walkthrough
5. Formal usability inspection

i. Heuristic evaluation
– An expert reviews the interface, identifying where it doesn’t follow design heuristics (rules of
thumb), e.g. Shneiderman’s 8 Golden Rules or Jakob Nielsen’s heuristics.
– Obviously, the expert has to be an expert: familiar with the rules and how to apply them.

Shneiderman’s 8 Golden rules


To improve the usability of an application it is important to have a well-designed interface.
Shneiderman's "Eight Golden Rules of Interface Design" are a guide to good interaction
design.

1 Strive for consistency.


Consistent sequences of actions should be required in similar situations; identical terminology should
be used in prompts, menus, and help screens; and consistent commands should be employed
throughout.

 Consistent sequence of actions should be used in similar situations


 Identical terminology should be used in prompts, menus, help screens etc.
 Consistent color, layout, capitalization, fonts etc. should be employed throughout

2 Enable frequent users to use shortcuts.


As the frequency of use increases, so do the user's desires to reduce the number of
interactions and to increase the pace of interaction. Abbreviations, function keys, hidden
commands, and macro facilities are very helpful to an expert user.

3 Offer informative feedback.


For every operator action, there should be some system feedback. For frequent and minor
actions, the response can be modest, while for infrequent and major actions, the response
should be more substantial.

4 Design dialog to yield closure.


Sequences of actions should be organized into groups with a beginning, middle, and end.
The informative feedback at the completion of a group of actions gives the operators the
satisfaction of accomplishment, a sense of relief, the signal to drop contingency plans and
options from their minds, and an indication that the way is clear to prepare for the next
group of actions.

5 Offer simple error handling.


As much as possible, design the system so the user cannot make a serious error. If an error
is made, the system should be able to detect the error and offer simple, comprehensible
mechanisms for handling the error.

6 Permit easy reversal of actions.


This feature relieves anxiety, since the user knows that errors can be undone; it thus
encourages exploration of unfamiliar options. The units of reversibility may be a single
action, a data entry, or a complete group of actions.
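
As an illustration of reversal of actions (not from the chapter), the sketch below implements a bare-bones undo stack in the style of the command pattern; the text-editing actions are hypothetical.

```python
# A minimal undo stack: each executed action remembers how to reverse itself.
class UndoStack:
    def __init__(self):
        self._undo_fns = []

    def execute(self, do_fn, undo_fn):
        """Run an action and remember how to reverse it."""
        do_fn()
        self._undo_fns.append(undo_fn)

    def undo(self):
        """Reverse the most recent action, if any."""
        if self._undo_fns:
            self._undo_fns.pop()()

text = []
stack = UndoStack()
stack.execute(lambda: text.append("hello"), lambda: text.pop())
stack.execute(lambda: text.append("world"), lambda: text.pop())
stack.undo()   # a single action is one unit of reversibility
print(text)    # ['hello']
```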
7 Support internal locus of control.
Experienced operators strongly desire the sense that they are in charge of the system and
that the system responds to their actions. Design the system to make users the initiators of
actions rather than the responders.

8 Reduce short-term memory load.


The limitation of human information processing in short-term memory requires that
displays be kept simple, multiple page displays be consolidated, window-motion frequency
be reduced, and sufficient training time be allotted for codes, mnemonics, and sequences of
actions.

Jakob Nielsen’s heuristics


 Nielsen’s Heuristics include:

 Visibility of system status: The system should always keep users informed about what is going
on, through appropriate feedback within reasonable time.
 Match between system and the real world: The system should speak the users' language, with
words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-
world conventions, making information appear in a natural and logical order.
 User control and freedom: Users often choose system functions by mistake and will need a
clearly marked "emergency exit" to leave the unwanted state without having to go through an
extended dialogue. Support undo and redo.
 Consistency and standards: Users should not have to wonder whether different words, situations,
or actions mean the same thing. Follow platform conventions.
 Error prevention: Even better than good error messages is a careful design which prevents a
problem from occurring in the first place. Either eliminate error-prone conditions or check for
them and present users with a confirmation option before they commit to the action.
 Recognition rather than recall: Minimize the user's memory load by making objects, actions, and
options visible. The user should not have to remember information from one part of the dialogue to
another. Instructions for use of the system should be visible or easily retrievable whenever
appropriate.
 Flexibility and efficiency of use: Accelerators—unseen by the novice user—may often speed up
the interaction for the expert user such that the system can cater to both inexperienced and
experienced users. Allow users to tailor frequent actions.
 Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant or
rarely needed. Every extra unit of information in a dialogue competes with the relevant units of
information and diminishes their relative visibility.
 Help users recognize, diagnose, and recover from errors: Error messages should be expressed
in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.
 Help and documentation: Even though it is better if the system can be used without
documentation, it may be necessary to provide help and documentation. Any such information
should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be
too large.

Advantages and Disadvantages of Heuristics


A heuristic evaluation should not replace usability testing. Although the heuristics relate to criteria that
affect your site’s usability, the issues identified in a heuristic evaluation are different from those found
in a usability test.

Advantages
 It can provide some quick and relatively inexpensive feedback to designers.
 You can obtain feedback early in the design process.
 Assigning the correct heuristic can help suggest the best corrective measures to designers.
 You can use it together with other usability testing methodologies.
 You can conduct usability testing to further examine potential issues.

Disadvantages
 It requires knowledge and experience to apply the heuristics effectively.
 Trained usability experts are sometimes hard to find and can be expensive.
 You should use multiple experts and aggregate their results.
 The evaluation may identify more minor issues and fewer major issues.

ii. Guidelines review


A guideline review involves having an evaluator compare an interface against a detailed set of
guidelines. Guidelines can be used for creating an interface (typically used by designers and
developers) or evaluating it for compliance (typically performed by usability evaluators).
Guideline reviews predate the web but became more popular with the increase in graphical user
interfaces (GUIs). 

Example 1: Usability of web sites

Userfocus usability evaluation workbook

Authors: David Travis (userfocus.co.uk), July 6, 2009

Available both as web content and as free Excel workbook. Translations exist for several languages.

The web site and workbook are organized into 9 sections, each containing between 13 and 44
guidelines. An evaluator can then rate a site +1/0/-1 for each guideline and add a comment. The
worksheet will then compute percentages with respect to all filled-in ratings (some guidelines may not
be relevant for your own web site); a sketch of this scoring appears after the section list. The 9 sections
cover the following issues:

1. Home page usability: 20 guidelines to evaluate the usability of home pages.


2. Task orientation: 44 guidelines to evaluate how well a web site supports the users tasks.
3. Navigation and IA: 29 guidelines to evaluate navigation and information architecture.
4. Forms and data entry: 23 guidelines to evaluate forms and data entry.
5. Trust and credibility: 13 guidelines to evaluate trust and credibility.
6. Writing and content quality: 23 guidelines to evaluate writing and content quality.
7. Page layout and visual design: 38 guidelines to evaluate page layout and visual design.
8. Search usability: 20 guidelines to evaluate search.
9. Help, feedback and error tolerance: 37 guidelines to evaluate help, feedback and error tolerance.

The workbook is locked to prevent mistakes, but you can unlock it and add/remove criteria.
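
A minimal sketch of the +1/0/-1 scoring described above; the workbook’s exact formula may differ. Here the percentage is simply the mean rating over all filled-in guidelines, rescaled from [-1, +1] to [0%, 100%]. Section names, ratings, and the use of None for "not rated" are illustrative assumptions.

```python
def section_score(ratings):
    """Percentage score for one section, ignoring unrated (None) guidelines."""
    rated = [r for r in ratings if r is not None]
    if not rated:
        return None
    return (sum(rated) / len(rated) + 1) / 2 * 100  # mean in [-1,1] -> [0,100]%

# Illustrative ratings: +1 = pass, 0 = partial, -1 = fail, None = not relevant.
sections = {
    "Home page usability":   [1, 1, 0, -1, 1, None, 0],
    "Trust and credibility": [1, 0, 1, 1, None],
}

for name, ratings in sections.items():
    print(f"{name}: {section_score(ratings):.0f}%")
```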

Example 2: Nielsen's Top Ten Guidelines for Homepage Usability

Author: Jakob Nielsen (2002)

Available: Top Ten Guidelines for Homepage Usability at useit.com. 

iii. Consistency inspection

Consistency Inspections are aimed at evaluating consistency across a family of products (Nielsen &
Mack 1994). Interfaces are evaluated by designers from multiple projects to ensure consistency
within their own designs.
 For all products covered in the inspection, choose members from their respective development
team to form an inspection team.

 Members chosen must have the power to vote for or against design elements

 Members must have the power to change their product’s design

 Have usability personnel document interface differences between products

 For each element documented, have the team come to agreement on how it should look and
work

 All members should agree on each change

 For each change that cannot be agreed on by all members, the change will be kept for a more
focused discussion in another meeting.
Consistency inspection is used during the early stages of development, when development work has
not yet reached the stage where the various products would require extensive changes to ensure
consistency.
STRENGTHS

Consistency inspections allow straightforward comparisons between products during reviews for
consistency.
They also promote consistency across groups of different products, so product managers will be aware
of consistency issues during the later stages of development.
WEAKNESS

There is, however, a weakness: it is difficult to make consistency judgements, which makes
consistency inspection a complex task.

iv. Cognitive Walkthrough


 The cognitive walkthrough is a usability evaluation method in which one or more evaluators
work through a series of tasks and ask a set of questions from the perspective of the user.
 Experts simulate users using the interface to do typical (or important) tasks (“walking though”
them).
 High-frequency tasks are a starting point, but rare critical tasks should also be walked through.
 During a walkthrough, the expert checks (one way to record the answers is sketched after this list):
 Will the users know what to do,
 see how to do it, and
 understand from the feedback whether the action was correct?
 Then there is a group meeting to discuss the issues that arise (with the expert, users, designers,
and possibly managers).
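
A minimal sketch of recording the three walkthrough questions as yes/no answers per task step; the task, steps, and answers are illustrative assumptions, not from the chapter.

```python
# Record the three walkthrough questions for each step of one task.
from dataclasses import dataclass

@dataclass
class StepResult:
    step: str
    knows_what_to_do: bool      # Will the users know what to do?
    sees_how_to_do_it: bool     # Will they see how to do it?
    understands_feedback: bool  # Will feedback show the action was correct?

task = "Transfer money between accounts"
results = [
    StepResult("Open the transfers screen", True, True, True),
    StepResult("Pick the destination account", True, False, True),
    StepResult("Confirm the transfer", True, True, False),
]

# Any 'no' answer marks a step worth raising in the group meeting.
problems = [r.step for r in results
            if not (r.knows_what_to_do and r.sees_how_to_do_it
                    and r.understands_feedback)]
print(f"{task}: steps with potential issues -> {problems}")
```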

v. Formal usability inspection


 Formal usability inspections are structured activities with defined steps and trained inspectors.
This method is most appropriate for more complex software where product teams want to track
usability defects and establish a process to detect and eliminate major usability bugs.
 The experts hold a courtroom-style meeting, with a moderator or judge, to present the interface
and to discuss its merits and weaknesses.
 Challenge: Design-team members may rebut (deny) the evidence about problems in an
adversarial (confrontational) format.
 Can be educational experiences for novice designers and managers, but they may take longer to
prepare for.
 Rarely used compared to other expert review methods

When Each Type of Evaluation Is Used

Formative and summative evaluations align with your place in the design process. Formative
evaluations go with prototype and testing iterations throughout a redesign project, while summative
evaluations are best for right before or right after a major redesign.

Great researchers begin their study by determining what question they’re trying to answer.
Essentially, your research question determines the type of evaluation: each research question you
might have maps to a corresponding evaluation, but any such mapping is descriptive, not prescriptive.
