
Page 1:

©2011 www.id-book.com

Introducing Evaluation

Chapter 12

adapted by Wan C. Yoon - 2012

Page 2:

The aims

• Explain the key concepts used in evaluation.
• Introduce different evaluation methods.
• Show how different methods are used for different purposes at different stages of the design process and in different contexts.
• Show how evaluators mix and modify methods.
• Discuss the practical challenges involved in doing evaluation.
• Illustrate how methods discussed in Chapters 7 and 8 are used in evaluation, and describe some methods that are specific to evaluation.

Page 3:

Why, what, where and when to evaluate

• Iterative design & evaluation is a continuous process that examines:
• Why: to check that users’ requirements are met and that users can use the product and like it.
• What: a conceptual model, early prototypes of a new system, and later, more complete prototypes.
• Where: in natural and laboratory settings.
• When: throughout design; finished products can be evaluated to collect information to inform new products.

Page 4:

Bruce Tognazzini tells you why you need to evaluate

• “Iterative design, with its repeating cycle of design and testing, is the only validated methodology in existence that will consistently produce successful results. If you don’t have user-testing as an integral part of your design process you are going to throw buckets of money down the drain.”

• See AskTog.com for topical discussions about design and evaluation.

Page 5:

Types of evaluation

• Controlled settings involving users, e.g., usability testing & experiments in laboratories and living labs.
• Natural settings involving users, e.g., field studies to see how the product is used in the real world.
• Any settings not involving users, e.g., consultants’ critiques; predicting, analyzing & modeling aspects of the interface; analytics.

Page 6:

Usability lab

http://iat.ubalt.edu/usability_lab/

Page 7:

Living labs

• People’s use of technology in their everyday lives can be evaluated in living labs.
• Such evaluations are too difficult to do in a usability lab.
• E.g., the Aware Home was embedded with a complex network of sensors and audio/video recording devices (Abowd et al., 2000).

Page 8:

Usability testing & field studies can complement each other

Page 9:

Evaluation methods

Method           Controlled settings   Natural settings   Without users
Observing                x                    x
Asking users             x                    x
Asking experts           x                                       x
Testing                  x
Modeling                                                         x

Page 10:

The language of evaluation

• Analytics
• Analytical evaluation
• Controlled experiment
• Expert review or critique
• Field study
• Formative evaluation
• Heuristic evaluation
• In-the-wild evaluation
• Living laboratory
• Predictive evaluation
• Summative evaluation
• Usability laboratory
• User studies
• Usability testing
• Users or participants

Page 11:

Key points

• Evaluation & design are closely integrated in user-centered design.
• Some of the same techniques are used in evaluation as for establishing requirements, but they are used differently (e.g., observation, interviews & questionnaires).
• Three types of evaluation: laboratory-based with users, in the field with users, and studies that do not involve users.
• The main methods are: observing, asking users, asking experts, user testing, inspection, modeling users’ task performance, and analytics.
• Dealing with constraints is an important skill for evaluators to develop.

Page 12:

Evaluation studies: From controlled to natural settings

Chapter 14

augmented and annotated by Wan C. Yoon - 2012

Page 13:

The aims:

• Explain how to do usability testing.
• Outline the basics of experimental design.
• Describe how to do field studies.

Page 14:

Usability testing

• Involves recording the performance of typical users doing typical tasks.
• Controlled settings.
• Users are observed and timed.
• Data is recorded on video & key presses are logged.
• The data is used to calculate performance times, and to identify & explain errors.
• User satisfaction is evaluated using questionnaires & interviews.
• Field observations may be used to provide contextual understanding.

Page 15:

Experiments & usability testing

• Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more things, i.e., variables.
• Usability testing is applied experimentation.
• Developers check that the system is usable by the intended user population for their tasks.
• Experiments may also be done in usability testing.

Page 16:

Usability testing & research

Usability testing:
• Improve products
• Few participants
• Results inform design
• Usually not completely replicable
• Conditions controlled as much as possible
• Procedure planned
• Results reported to developers

Experiments for research:
• Discover knowledge
• Many participants
• Results validated statistically
• Must be replicable
• Strongly controlled conditions
• Experimental design
• Scientific report to scientific community

Page 17:

Usability testing

• Goals & questions focus on how well users perform tasks with the product.
• Comparison of products or prototypes is common.
• Focus is on time to complete a task & the number & type of errors.
• Data collected by video & interaction logging.
• Testing is central.
• User satisfaction questionnaires & interviews provide data about users’ opinions.

Page 18:

Usability lab with observers watching a user & assistant

Page 19:

Portable equipment for use in the field

Page 20:

Page 21:

Testing conditions

• Usability lab or other controlled space.
• Emphasis on:
  – selecting representative users;
  – developing representative tasks.
• 5-10 users typically selected.
• Tasks usually last no more than 30 minutes.
• The test conditions should be the same for every participant.
• An informed consent form explains procedures and deals with ethical issues.

Page 22:

Some types of data

• Time to complete a task.
• Time to complete a task after a specified time away from the product.
• Number and type of errors per task.
• Number of errors per unit of time.
• Number of navigations to online help or manuals.
• Number of users making a particular error.
• Number of users completing a task successfully.
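Measures like these fall out directly from logged test data. A minimal sketch, assuming a hypothetical session log of (participant, completed?, task time, error count); the fields and numbers are invented for illustration:

```python
from statistics import mean

# Hypothetical session log: (participant, completed?, time in s, errors)
sessions = [
    ("P1", True, 95.0, 2),
    ("P2", True, 120.5, 0),
    ("P3", False, 180.0, 5),
    ("P4", True, 88.2, 1),
    ("P5", True, 132.9, 3),
]

# Number of users completing the task successfully, as a rate
completion_rate = sum(1 for _, done, _, _ in sessions if done) / len(sessions)

# Time to complete a task (successful attempts only)
mean_time = mean(t for _, done, t, _ in sessions if done)

# Number of errors per unit of time (here, errors per minute)
errors_per_minute = [e / (t / 60) for _, _, t, e in sessions]

print(f"completion rate: {completion_rate:.0%}")
print(f"mean completion time: {mean_time:.1f} s")
```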

Page 23:

Product and process measures

• Product measures
  – Task success/failure
  – Task completion time
  – Number (and type) of errors (per task, per unit time, etc.)
  – Subjective assessment (difficulty, uncertainty, satisfaction, etc.)
• Process measures
  – Number of incorrect responses (by type)
  – Information-gathering activities
  – Missed observations or hypotheses
  – Hesitations
  – Undos
  – Conditional assessment of decisions

Page 24:

Usability engineering orientation

• Aim is improvement with each version.
• Current level of performance.
• Minimum acceptable level of performance.
• Target level of performance.

Page 25:

How many participants is enough for user testing?

• The number is a practical issue.
• Depends on:
  – schedule for testing;
  – availability of participants;
  – cost of running tests.
• Typically 5-10 participants.
• Some experts argue that testing should continue until no new insights are gained.

Page 26:

Experiments

• Predict the relationship between two or more variables.
• The independent variable is manipulated by the researcher.
• The dependent variable depends on the independent variable.
• Typical experimental designs have one or two independent variables.
• Validated statistically & replicable.

Page 27:

Different, same, and matched participant designs

• Different participants (between-subjects)
  – Advantages: no order effects.
  – Disadvantages: many subjects needed & individual differences are a problem.
• Same participants (within-subjects)
  – Advantages: few individuals, no individual differences.
  – Disadvantages: counterbalancing needed because of ordering effects.
• Matched participants (paired or matched-subjects)
  – Advantages: same as different participants, but individual differences reduced.
  – Disadvantages: cannot be sure of perfect matching on all differences.
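The counterbalancing required for within-subjects designs is commonly done by rotating condition orders in a Latin square, so each condition appears once in every serial position. A minimal sketch (a simple rotation square; a Williams design would additionally balance immediate carryover effects):

```python
def latin_square(conditions):
    """Rows are condition orders; each condition occurs exactly once
    in each row and once in each serial position across rows."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

# One order per group of participants
orders = latin_square(["A", "B", "C", "D"])
for row in orders:
    print(row)
```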

Page 28:

Hypothetical models

• Mean difference
• Linear regression and variable transformation
• ANOVA
  – Within- and between-subject models
  – Mixed models
  – The assumptions:
    • Independence
    • Normally distributed
    • Homogeneity of variances
    • Additivity and sphericity
• ANCOVA
  – When a parallel pair of models can be assumed.
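As a worked illustration of the one-way between-subjects ANOVA listed above (data invented for the example, not taken from the slides), the F statistic is the between-groups mean square divided by the within-groups mean square:

```python
from statistics import mean

# Hypothetical task times (s) for three between-subjects conditions
groups = [
    [12.1, 11.4, 13.0, 12.7],
    [14.2, 15.1, 13.8, 14.9],
    [11.9, 12.3, 11.5, 12.8],
]

grand = mean(x for g in groups for x in g)
k = len(groups)                   # number of conditions
n = sum(len(g) for g in groups)   # total observations

# Sums of squares between groups and within groups
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

# F = mean square between / mean square within
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```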

Page 29:

Field studies

• Field studies are done in natural settings.
• “In the wild” is a term for prototypes being used freely in natural settings.
• Aim to understand what users do naturally and how technology impacts them.
• Field studies are used in product design to:
  – identify opportunities for new technology;
  – determine design requirements;
  – decide how best to introduce new technology;
  – evaluate technology in use.

Page 30:

Data collection & analysis

• Observation & interviews
  – Notes, pictures, recordings
  – Video
  – Logging
• Analysis
  – Data are categorized
  – Categories can be provided by theory:
    • Grounded theory
    • Activity theory

Page 31:

Data presentation

• The aim is to show how the products are being appropriated and integrated into their surroundings.

• Typical presentation forms include: vignettes, excerpts, critical incidents, patterns, and narratives.

Page 32:

Key points

• Usability testing is done in controlled conditions.
• Usability testing is an adapted form of experimentation.
• Experiments aim to test hypotheses by manipulating certain variables while keeping others constant.
• The experimenter controls the independent variable(s) but not the dependent variable(s).
• There are three types of experimental design: different-participants, same-participants, & matched-participants.
• Field studies are done in natural environments.
• “In the wild” is a recent term for studies in which a prototype is freely used in a natural setting.
• Typically observation and interviews are used to collect field study data.
• Data is usually presented as anecdotes, excerpts, critical incidents, patterns and narratives.

Page 33:

Analytical evaluation

Chapter 15

adapted by Wan C. Yoon - 2012

Page 34:

Aims:

• Describe the key concepts associated with inspection methods.
• Explain how to do heuristic evaluation and walkthroughs.
• Explain the role of analytics in evaluation.
• Describe how to perform two types of predictive methods, GOMS and Fitts’ Law.

Page 35:

Inspections

• Several kinds.
• Experts use their knowledge of users & technology to review software usability.
• Expert critiques (crits) can be formal or informal reports.
• Heuristic evaluation is a review guided by a set of heuristics.
• Walkthroughs involve stepping through a pre-planned scenario noting potential problems.

Page 36:

Heuristic evaluation

• Developed by Jakob Nielsen in the early 1990s.
• Based on heuristics distilled from an empirical analysis of 249 usability problems.
• These heuristics have been revised for current technology.
• Heuristics are being developed for mobile devices, wearables, virtual worlds, etc.
• Design guidelines form a basis for developing heuristics.

Page 37:

Nielsen’s original heuristics

• Visibility of system status.
• Match between system and the real world.
• User control and freedom.
• Consistency and standards.
• Error prevention.
• Recognition rather than recall.
• Flexibility and efficiency of use.
• Aesthetic and minimalist design.
• Help users recognize, diagnose, and recover from errors.
• Help and documentation.

Page 38:

Discount evaluation

• Heuristic evaluation is referred to as discount evaluation when 5 evaluators are used.
• Empirical evidence suggests that on average 5 evaluators identify 75-80% of usability problems.
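That 75-80% figure reflects a curve often modeled (Nielsen & Landauer, 1993) as found(i) = 1 - (1 - L)^i, where L is the probability that a single evaluator finds a given problem. A sketch; L = 0.26 here is an illustrative value, not a fixed constant:

```python
L = 0.26  # illustrative per-evaluator detection rate; varies by study

def proportion_found(i, l=L):
    """Expected proportion of usability problems found by i evaluators."""
    return 1 - (1 - l) ** i

for i in (1, 3, 5, 10):
    print(f"{i:2d} evaluators -> {proportion_found(i):.0%}")
```

With these numbers, five evaluators find roughly three quarters of the problems, and the curve flattens quickly beyond that.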

Page 39:

No. of evaluators & problems

Page 40:

3 stages for doing heuristic evaluation

• Briefing session to tell experts what to do.
• Evaluation period of 1-2 hours in which:
  – each expert works separately;
  – experts take one pass to get a feel for the product;
  – and a second pass to focus on specific features.
• Debriefing session in which experts work together to prioritize problems.

Page 41:

Advantages and problems

• Few ethical & practical issues to consider because users are not involved.
• Can be difficult & expensive to find experts.
• The best experts have knowledge of both the application domain & the users.
• Biggest problems:
  – important problems may get missed;
  – many trivial problems are often identified;
  – experts have biases.

Page 42:

Heuristics for websites focus on key criteria (Budd, 2007)

• Clarity
• Minimize unnecessary complexity & cognitive load
• Provide users with context
• Promote a positive & pleasurable user experience

Page 43:

Cognitive walkthroughs

• Focus on ease of learning.
• The designer presents an aspect of the design & usage scenarios.
• The expert is told the assumptions about the user population, context of use, and task details.
• One or more experts walk through the design prototype with the scenario.
• Experts are guided by 3 questions.

Page 44:

The 3 questions

• Will the correct action be sufficiently evident to the user?

• Will the user notice that the correct action is available?

• Will the user associate and interpret the response from the action correctly?

As the experts work through the scenario they note problems.

Page 45:

Pluralistic walkthrough

• A variation on the cognitive walkthrough theme.
• Performed by a carefully managed team.
• The panel of experts begins by working separately.
• Then there is a managed discussion that leads to agreed decisions.
• The approach lends itself well to participatory design.

Page 46:

A project for you …

• http://www.id-book.com/catherb/ provides heuristics and a template so that you can evaluate different kinds of systems.
• More information about this is provided in the interactivities section of the id-book.com website.

Page 47:

Analytics

• A method for evaluating user traffic through a system or part of a system.
• Many examples, including Google Analytics and Visistat.
• Captures data such as times of day & visitor IP addresses.
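A minimal sketch of the kind of aggregation such tools perform, assuming a hypothetical log of (timestamp, visitor IP) pairs; real analytics products parse server logs or tracking-script beacons:

```python
from collections import Counter

# Hypothetical visit log: (ISO timestamp, visitor IP)
log = [
    ("2012-03-01T09:15:00", "10.0.0.1"),
    ("2012-03-01T09:40:00", "10.0.0.2"),
    ("2012-03-01T14:05:00", "10.0.0.1"),
    ("2012-03-01T14:55:00", "10.0.0.3"),
    ("2012-03-01T14:58:00", "10.0.0.2"),
]

# Visits per hour of day (characters 11-12 of the ISO timestamp)
visits_by_hour = Counter(ts[11:13] for ts, _ in log)
unique_visitors = {ip for _, ip in log}

print(visits_by_hour.most_common())
print(len(unique_visitors), "unique visitor IPs")
```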

Page 48:

Social action analysis (Perer & Shneiderman, 2008)

Page 49:

Predictive models

• Provide a way of evaluating products or designs without directly involving users.
• Less expensive than user testing.
• Usefulness limited to systems with predictable tasks, e.g., telephone answering systems, mobile/cell phones, etc.
• Based on expert error-free behavior.

Page 50:

GOMS

• Goals: what the user wants to achieve, e.g., find a website.
• Operators: the cognitive processes & physical actions needed to attain goals, e.g., decide which search engine to use.
• Methods: the procedures to accomplish the goals, e.g., drag the mouse over a field, type in keywords, press the ‘go’ button.
• Selection rules: decide which method to select when there is more than one.

Page 51:

Keystroke level model

• GOMS has also been developed to provide a quantitative model: the keystroke level model (KLM).
• The keystroke level model allows predictions to be made about how long it takes an expert user to perform a task.

Page 52:

Response times for keystroke level operators (Card et al., 1983)

Operator  Description                                               Time (sec)
K         Pressing a single key or button:
            average skilled typist (55 wpm)                         0.22
            average non-skilled typist (40 wpm)                     0.28
            pressing shift or control key                           0.08
            typist unfamiliar with the keyboard                     1.20
P         Pointing with a mouse or other device on a display
          to select an object (value derived from Fitts’ Law,
          discussed below)                                          0.40
P1        Clicking the mouse or similar device                      0.20
H         Bringing ‘home’ hands on the keyboard or other device     0.40
M         Mentally prepare/respond                                  1.35
R(t)      Response time, counted only if it causes the user
          to wait                                                   t
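A KLM prediction is just the sum of the operator times in the table. A sketch for a hypothetical task — mentally prepare, point at a field, click it, bring hands home to the keyboard, then type four characters as a skilled typist:

```python
# Operator times (s) from Card et al. (1983), as tabled above
KLM = {
    "K": 0.22,   # keystroke, average skilled typist (55 wpm)
    "P": 0.40,   # point with mouse (from Fitts' Law)
    "P1": 0.20,  # click the mouse button
    "H": 0.40,   # bring hands 'home' to keyboard or mouse
    "M": 1.35,   # mentally prepare/respond
}

task = ["M", "P", "P1", "H", "K", "K", "K", "K"]
predicted = sum(KLM[op] for op in task)
print(f"predicted expert, error-free time: {predicted:.2f} s")
```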

Page 53:

Summing together

Page 54:

Using KLM to calculate time to change gaze (Holleis et al., 2007)

Page 55:

Fitts’ Law (Fitts, 1954)

• Fitts’ Law predicts that the time to point at an object using a device is a function of the distance from the target object & the object’s size.
• The further away & the smaller the object, the longer the time to locate and point to it.
• Fitts’ Law is useful for evaluating systems for which the time to locate an object is important, e.g., a cell phone or handheld device.
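The relationship is commonly written in the Shannon formulation MT = a + b·log2(D/W + 1), where D is the distance to the target and W is its width; a and b are device-specific constants fitted from data, so the values below are illustrative only:

```python
from math import log2

def movement_time(distance, width, a=0.1, b=0.15):
    """Predicted pointing time (s); a and b are illustrative constants."""
    return a + b * log2(distance / width + 1)

# Farther and smaller targets take longer, as the slide states
near_big = movement_time(distance=100, width=50)
far_small = movement_time(distance=400, width=10)
print(f"near/big: {near_big:.2f} s, far/small: {far_small:.2f} s")
```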

Page 56:

A project for you …

• Use the web & other resources to research claims that heuristic evaluation often identifies problems that are not serious & may not even be problems.
• Decide whether you agree or disagree.
• Write a brief statement arguing your position.
• Provide practical evidence & evidence from the literature to support your position.

Page 57:

A project for you … Fitts’ Law

• Visit Tog’s website and do Tog’s quiz, designed to give you fitts!
• http://www.asktog.com/columns/022DesignedToGiveFitts.html

Page 58:

Key points

• Inspections can be used to evaluate requirements, mockups, functional prototypes, or systems.
• User testing & heuristic evaluation may reveal different usability problems.
• Walkthroughs are focused, so they are suitable for evaluating small parts of a product.
• Analytics involves collecting data about users’ activity on a website or product.
• The GOMS and KLM models and Fitts’ Law can be used to predict expert, error-free performance for certain kinds of tasks.