
01076568 Human Computer Interaction
Chapter 8 : Evaluation techniques

Dr. Chompoonut Jinjakam
[[email protected]]
Department of Computer Engineering, Faculty of Engineering
King Mongkut's Institute of Technology Ladkrabang (KMITL)

Outline

• What is evaluation?
• Goals of evaluation
• Evaluation through expert analysis
• Evaluation through user participation
• Choosing an evaluation method
• Consent form
• Summary

What is evaluation?

• Evaluation's role is to assess designs and test systems to ensure that they actually behave as we expect and meet user requirements.
• Ideally, evaluation should occur throughout the design life cycle, with the results of the evaluation feeding back into modifications to the design.

Goal of evaluation

• Evaluation has three main goals:
– To assess the extent and accessibility of the system's functionality
– To assess users' experience of the interaction
– And to identify any specific problems with the system.


• The system's functionality is important: it must accord with the user's requirements.
– Evaluation at this level may involve measuring the user's performance with the system to assess the effectiveness of the system in supporting the task.

• Users' experience of the interaction
– How easy the system is to learn, its usability, and the user's satisfaction with it.
– It may include the user's enjoyment and emotional response, particularly for systems aimed at entertainment.

• Identify specific problems with the design
– These may be aspects of the design which, when used in their intended context, cause unexpected results or confusion amongst users.

• We will consider evaluation techniques under two broad headings: expert analysis and user participation.

Evaluation through expert analysis

• Cognitive walkthrough
– Originally proposed by Polson and colleagues as an attempt to introduce psychological theory into the informal and subjective walkthrough technique.
– The main focus is to establish how easy a system is to learn (by hands-on use, not by training or the user's manual).

To do a walkthrough, you need four things:

1. A specification or prototype of the system.
2. A description of the task the user is to perform on the system.
3. A complete, written list of the actions needed to complete the task with the proposed system.
4. An indication of who the users are and what kind of experience and knowledge the evaluators can assume about them.


The evaluators try to answer the following four questions:

1. Is the effect of the action the same as the user's goal at that point?
2. Will users see that the action is available?
3. Once users have found the correct action, will they know it is the one they need?
4. After the action is taken, will users understand the feedback they get?

• Heuristic evaluation
– A heuristic is a guideline, general principle or rule of thumb that can guide a design decision or be used to critique a decision that has already been made. 3-5 evaluators is sufficient.
– Each problem found is given a severity rating on a scale of 0-4 (least to most severe):
0 = I don't agree that this is a usability problem at all
1 = Cosmetic problem only: need not be fixed unless extra time is available on the project
2 = Minor usability problem: fixing this should be given low priority
3 = Major usability problem: important to fix, so should be given high priority
4 = Usability catastrophe: imperative to fix this before the product can be released (Nielsen)
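Aggregating the individual evaluators' ratings makes it easier to prioritize fixes. A minimal sketch of that bookkeeping in Python; the problem names and scores are invented for illustration:

```python
from statistics import mean, median

# Severity ratings (0-4 on Nielsen's scale) given by several evaluators
# to each usability problem found. All names and scores are hypothetical.
ratings = {
    "No undo on delete": [4, 3, 4],
    "Low-contrast labels": [1, 2, 1],
    "Inconsistent button order": [3, 2, 3],
}

# Aggregate per problem and sort so the most severe are fixed first.
for problem, scores in sorted(ratings.items(),
                              key=lambda kv: mean(kv[1]), reverse=True):
    print(f"{problem}: mean={mean(scores):.1f}, median={median(scores)}")
```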


Nielsen's ten heuristics are:

1. Visibility of system status
2. Match between system and the real world
3. User control and freedom
4. Consistency and standards
5. Error prevention
6. Recognition rather than recall
7. Flexibility and efficiency of use
8. Aesthetic and minimalist design
9. Help users recognize, diagnose and recover from errors
10. Help and documentation

• Model-based evaluation
– Dialog models can be used to evaluate dialog sequences for problems, e.g. unreachable states, circular dialogs and complexity.

• Using previous studies in evaluation
– Ex. usability of different menu types, the recall of command names, and the choice of icons.


Evaluation through user participation

• Styles of evaluation
– Laboratory studies: users take part in controlled tests.
– Field studies: the evaluator goes into the user's work environment in order to observe the system in action.

• Empirical methods: experimental evaluation
– Participants should be chosen to match the expected user population as closely as possible, and the sample size must be large enough to be representative of the population.
– Variables:
• Independent variable (the condition manipulated in the experiment, set up in order to test the hypothesis)
• Dependent variable (the effect produced by the independent variable; the variable that is measured and recorded)
• Controlled variables (variables that could skew the results, so they must be kept the same across conditions)
– A hypothesis is the prediction of the outcome of an experiment.
• It states that a variation in the independent variable will cause a difference in the dependent variable.
• It is confirmed by disproving the null hypothesis, which states that there is no difference in the dependent variable between the levels of the independent variable.
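For two levels of the independent variable, the null hypothesis can be tested with a two-sample t-test. A minimal sketch using SciPy; the selection times below are invented for illustration:

```python
from scipy import stats

# Hypothetical selection times (ms) under two levels of the
# independent variable: menus of five vs. seven items.
five_items = [410, 398, 455, 430, 402, 441, 419, 388, 426, 437]
seven_items = [472, 509, 468, 495, 520, 488, 477, 502, 465, 491]

# Null hypothesis: no difference in mean selection time.
t, p = stats.ttest_ind(five_items, seven_items)
print(f"t = {t:.2f}, p = {p:.4f}")
if p < 0.05:
    print("Reject the null hypothesis: menu size affects selection time.")
```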


• Empirical methods: experimental evaluation
– Experimental design
• How many participants are available, and are they representative of the user group?
• Experimental method:
– Between-subjects => each participant is assigned to a different condition (experimental and control conditions).
[Adv.: no learning effect is carried over between conditions; Disadv.: requires a greater number of participants.]
– Within-subjects => each user performs under each different condition.
[This design can suffer from transfer-of-learning effects; tackle this by counterbalancing the order, e.g. group A takes the conditions in order (1-2) and group B in order (2-1), as in the sketch below.]
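A minimal sketch of the counterbalanced assignment just described; the participant IDs and condition names are placeholders:

```python
import random

participants = [f"P{i:02d}" for i in range(1, 11)]  # hypothetical IDs
conditions = ["natural icons", "abstract icons"]

random.shuffle(participants)  # random allocation to the two groups
half = len(participants) // 2

# Group A takes the conditions in order (1-2), group B in order (2-1),
# so any practice effect is balanced across the two conditions.
for i, p in enumerate(participants):
    order = conditions if i < half else conditions[::-1]
    print(p, "->", " then ".join(order))
```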



– Statistical measures => first look at the data (to spot freak events), and save the raw data.
• Types of data:
– Discrete = a fixed set of levels. Ex. screen color as red, green or blue.
– Continuous = any value. Ex. a person's height, or the time taken to complete a task.
• Types of analysis:
– Parametric: assume the data have come from a probability distribution.
– Non-parametric: fewer assumptions. Ex. movie rank order (1-4 stars).
– Categorical (true or false): Ex. it's raining or it isn't, the button is red, there are 3 menus.

An extensive and accurate analysis might ask about the data:
• Is there a difference?
Ex. using hypothesis testing. Answers are not yes/no but of the form 'we are 99% certain that selection from menus of five items is faster than from menus of seven items'.
• How big is the difference?
Ex. 'Selection from five items is 260 ms faster than from seven items'.
• How accurate is the estimate?
Ex. 'Selection is faster by 260 ± 30 ms', or 'we are 95% certain that the difference in response time is between 230 and 290 ms'.
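The last two questions call for a point estimate plus a confidence interval rather than a bare p-value. A minimal sketch, assuming two independent samples (the same invented timing data as in the t-test sketch above):

```python
import numpy as np
from scipy import stats

five_items = np.array([410, 398, 455, 430, 402, 441, 419, 388, 426, 437])
seven_items = np.array([472, 509, 468, 495, 520, 488, 477, 502, 465, 491])

# How big is the difference? A point estimate of the mean difference.
diff = seven_items.mean() - five_items.mean()

# How accurate is the estimate? A 95% confidence interval, using the
# standard error of the difference between two independent means.
se = np.sqrt(five_items.var(ddof=1) / len(five_items)
             + seven_items.var(ddof=1) / len(seven_items))
dof = len(five_items) + len(seven_items) - 2
margin = stats.t.ppf(0.975, dof) * se

print(f"Selection from five items is {diff:.0f} ms faster "
      f"(95% CI: {diff - margin:.0f} to {diff + margin:.0f} ms)")
```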


An example: evaluating icon design

• Hypothesis: users will remember the natural icons more easily than the abstract ones.
• Null hypothesis: there is no difference between recall of the two icon types.



An example: evaluating icon design (continued)

• Independent variable: the style of icon (natural and abstract).
• Dependent variables: the number of mistakes in selection and the time taken to select an icon.
• Controlled variables: the experimental equipment (computer, mouse, keyboard), operating system, lighting, size of icons, color tone of icons, etc.
• A between-subjects experiment would remove any learning effect for individual participants, but it would be more difficult to control for variation in learning style between participants.
• => A within-subjects design is preferred, with order of presentation controlled.
• The user is presented with a task (say 'delete a document') and is required to select the appropriate icon.
• To avoid learning effects from icon position, the placing of icons in the block can be randomly varied on each presentation (see the sketch below). Users are divided into two groups, with each group taking a different starting condition.
• Measure the time taken to complete the task and the number of errors made.
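A minimal sketch of randomizing icon placement on each presentation; the icon names and the 3x3 grid size are invented for illustration:

```python
import random

# Hypothetical 3x3 block of icons shown to the participant.
icons = ["delete", "copy", "paste", "print", "save",
         "open", "close", "undo", "help"]

def present_block(icons):
    """Return the icons in a fresh random layout for one presentation."""
    layout = icons[:]          # copy so the master list stays fixed
    random.shuffle(layout)     # new position for every icon each trial
    return [layout[i:i + 3] for i in range(0, 9, 3)]  # 3 rows of 3

for row in present_block(icons):
    print(row)
```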


An example: evaluating icon design (analysis)

[Worked figures from the slide: times of < 5 mins and ≈ 20 mins; mean difference = 52 s; standard error = s.d./√10.]
• Since we used a within-subjects design, there is another independent variable, the participant, so we should use analysis of variance (ANOVA); a sketch follows below.

• Empirical methods: experimental evaluation
– Studies of groups of users
• The participant groups: Ex. 3 experiments × 10 participants; group studies take longer than single-user studies.
• The experimental task: the task also depends on the nature of the groupware system.
• Data gathering
• Analysis
• Field studies with groups
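Returning to the ANOVA point above: a minimal sketch of a repeated-measures ANOVA with the participant as the subject factor, using statsmodels' AnovaRM on invented long-format data:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: one selection time (s) per
# participant per icon type, from a within-subjects design.
data = pd.DataFrame({
    "participant": [f"P{i}" for i in range(1, 11)] * 2,
    "icon_type": ["natural"] * 10 + ["abstract"] * 10,
    "time": [32, 29, 35, 31, 28, 33, 30, 34, 29, 31,
             88, 79, 85, 90, 77, 83, 86, 92, 81, 84],
})

# Repeated-measures ANOVA: participant is the subject factor,
# icon type is the within-subjects factor.
result = AnovaRM(data, depvar="time", subject="participant",
                 within=["icon_type"]).fit()
print(result)
```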



• Observational techniques
– Think aloud and cooperative evaluation
• Observation where the user is asked to talk through what he is doing as he is being observed, e.g. describing what he believes is happening, why he takes an action, what he is trying to do.
• The evaluator can clarify points of confusion.
– Protocol analysis
• Paper and pencil
• Audio recording
• Video recording
• Computer logging (Ex. recording user actions at the keystroke level; see the sketch after this list)
• User notebooks (Ex. participants are asked to keep logs of activity/problems)
– Automatic protocol analysis tools
– Post-task walkthroughs
• Reflect the participants' actions back to them after the event, by asking for comments or questioning them directly.
• This is the only way to obtain a subjective viewpoint on the user's behavior.
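A minimal sketch of computer logging at the keystroke level, using Python's standard tkinter library; the window title and log file name are arbitrary:

```python
import time
import tkinter as tk

LOG_FILE = "keystroke_log.txt"  # arbitrary file name

def log_key(event):
    # Append one timestamped record per keystroke.
    with open(LOG_FILE, "a") as f:
        f.write(f"{time.time():.3f}\t{event.keysym}\n")

root = tk.Tk()
root.title("Logged session")      # arbitrary title
root.bind("<Key>", log_key)       # capture every key press in the window
root.mainloop()
```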

• Query techniques
– Interviews
• The level of questioning can be varied to suit the context, and the evaluator can probe the user more deeply on interesting issues as they arise.
– Questionnaires
• Less flexible than the interview technique, since questions are fixed in advance, but can reach a wider participant group and take less time.
• Question styles:
– General (Ex. age, sex, occupation, experience, etc.)
– Open-ended (Ex. can you suggest any improvements to the interface?)
– Scalar (judge a specific statement on a numeric scale)
– Multi-choice
– Ranked
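Scalar responses are straightforward to summarize numerically. A minimal sketch with invented 1-5 ratings for two statements:

```python
from statistics import mean, stdev

# Hypothetical scalar (1 = strongly disagree .. 5 = strongly agree)
# responses to two questionnaire statements.
responses = {
    "The system was easy to learn": [4, 5, 3, 4, 4, 5, 2, 4],
    "The error messages were helpful": [2, 3, 2, 1, 3, 2, 2, 3],
}

for statement, scores in responses.items():
    print(f"{statement}: mean={mean(scores):.2f}, sd={stdev(scores):.2f}")
```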
• Evaluation through monitoring physiological responses
– Eye tracking for usability evaluation (see the sketch after this list)
• Number of fixations
• Fixation duration
• Scan path
– Physiological measurements
• Heart activity: blood pressure, volume and pulse.
• Activity of the sweat glands: galvanic skin response (GSR)
• Electrical activity in muscle: electromyogram (EMG)
• Electrical activity in the brain: electroencephalogram (EEG)
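A minimal sketch of deriving the three eye-tracking measures listed above, assuming fixations have already been detected from the raw gaze stream; the fixation data (x, y in pixels, duration in ms) is invented:

```python
from math import hypot

# Hypothetical fixations already extracted from raw gaze samples:
# (x, y) position in pixels and duration in ms.
fixations = [(120, 80, 240), (300, 95, 180), (310, 240, 410),
             (150, 260, 220), (305, 98, 150)]

num_fixations = len(fixations)
mean_duration = sum(d for _, _, d in fixations) / num_fixations

# Scan path length: summed distance between consecutive fixations.
scan_path = sum(hypot(x2 - x1, y2 - y1)
                for (x1, y1, _), (x2, y2, _) in zip(fixations, fixations[1:]))

print(f"fixations: {num_fixations}")
print(f"mean fixation duration: {mean_duration:.0f} ms")
print(f"scan path length: {scan_path:.0f} px")
```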


Choosing an evaluation method

• Factors distinguishing evaluation techniques
– Design vs. implementation
• The main distinction: when evaluating an implementation, a physical artifact exists, so something concrete can be tested.
– Laboratory vs. field studies
• Field studies retain the naturalness of the user's environment but do not allow control over user activity.
• Ideally the design process should include both styles of evaluation.
– Subjective vs. objective
• Bias in subjective techniques should be recognized and avoided by using more than one evaluator.
• Objective techniques should produce repeatable results, which are not dependent on the persuasion of the particular evaluator.
– Qualitative vs. quantitative measures
• Quantitative measurement is usually numeric and can be easily analyzed using statistical techniques.
• Qualitative measurement is the opposite, but can provide important detail that cannot be determined from numbers.



• Factors distinguishing evaluation techniques (continued)
– Information provided
• Low-level information (Ex. which font is most readable) => an experiment can be designed to measure a particular aspect of the interface.
• Higher-level information (Ex. is the system usable?) => can be gathered using questionnaire and interview techniques, which provide a more general impression of the user's view of the system.
– Immediacy of response
• Ex. think aloud records the user's behavior at the time of the interaction itself, whereas a post-task walkthrough relies on recollection after the event.
– Intrusiveness
• Most immediate evaluation techniques run the risk of influencing the way the user behaves.
– Resources
• Resources to consider include equipment, time, money, participants, expertise of the evaluator, and context.
• Some decisions are forced by resource limitations. Ex. it is not possible to produce a video protocol without access to a video camera.


• A classification of evaluation techniques
[Tables classifying each of the techniques above along the distinguishing factors just listed.]

Consent form

• Depending on the agency or institution overseeing the research, participants are usually required to sign a consent form prior to testing.
• The goal is to ensure participants know
– that their participation is voluntary,
– that they will incur no physical or psychological harm,
– that they can withdraw at any time, and
– that their privacy, anonymity, and confidentiality will be protected.

Running the experiment

• The experiment begins.
• The experimenter greets each participant, introduces the experiment, and usually asks the participants to sign consent forms.
• Often, a brief questionnaire is administered to gather demographic data and information on the participants' related experience.
• This should take just a few minutes.
• The apparatus is revealed, the task explained and demonstrated.
• Practice trials are allowed, as appropriate.

Summary

• The aim of evaluation is to test the functionality and usability of the design and to identify and rectify any problems.
• A design can be evaluated before any implementation work has started, to minimize the cost of early design errors.
• Query techniques provide subjective information from the user. For objective information, physiological monitoring can capture the user's physical responses to the system.
• The choice of evaluation method is largely dependent on what is required of the evaluation.