Ch 8: Usability Testing - 1
Yonglei Tao
Usability
- Learnability
- Efficiency
- Memorability: user retention over time
- Error rate: error frequency and severity
- Subjective satisfaction
ISO Usability Standard 9241
- Effectiveness: can you achieve what you want to?
- Efficiency: can you do it without wasting effort?
- Satisfaction: do you enjoy the process?
Why Usability Testing
- Developers and users possess different models
- Developers' intuitions are not always correct
- There is no average user
- It is impossible to predict usability from appearance
- Problems found late in the development process are more difficult and expensive to fix
Common Usability Problems Reported by IBM Usability Experts
- Ambiguous menus and icons
- Single-direction movement through a system
- Lack of white space
- Annoying distractions
- Unclear step sequences
- Input and direct manipulation limits
- More steps to manage the interface than to perform tasks
- Lack of system anticipation and intelligence
- Inadequate feedback and confirmation
- Inadequate error messages, help, tutorials, and documentation
Common Activities in the Process
- Capture: collecting usability data such as task completion time, errors, guideline violations, and subjective ratings
- Analysis: interpreting data to identify usability problems in the interface
- Critique: suggesting solutions or improvements
Questions to Answer
- When to perform evaluation: early, middle, or late
- What to achieve: concerns and goals, tasks and scenarios, measurements
- Which technique to use
- Who is involved and where the evaluation takes place
- How to analyze data and identify problems
When to Evaluate?
- Formative evaluation: assessment that occurs during the formation of a design
- Summative evaluation: an evaluation of the final product (implementation)
What to Evaluate
- Features
- Screen layout
- Navigational structure
- Reliability and performance
- User support
All relative to end-user tasks and skills
Which Technique to Use?
- Analytic evaluation: Keystroke-Level Model, GOMS, state transition networks, Hick's and Fitts's laws (see the sketch after this list)
- Expert reviews: cognitive walkthrough, heuristic evaluation, guideline checklists, consistency inspection
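As an illustration of analytic evaluation, the sketch below predicts an expert's task time with the Keystroke-Level Model and estimates a single pointing movement with Fitts's law. The KLM operator times are the standard published values from Card, Moran, and Newell; the Fitts's-law intercept and slope are illustrative placeholders, since real values must be measured for a given input device.

```python
import math

# Standard KLM operator times (seconds): K = keystroke, P = point with
# a mouse, H = home hands between devices, B = mouse-button press or
# release, M = mental preparation.
KLM_TIMES = {"K": 0.20, "P": 1.10, "H": 0.40, "B": 0.10, "M": 1.35}

def klm_estimate(operators: str) -> float:
    """Predicted expert task time for a sequence of KLM operators."""
    return sum(KLM_TIMES[op] for op in operators)

def fitts_time(distance: float, width: float,
               a: float = 0.1, b: float = 0.1) -> float:
    """Fitts's law (Shannon form): MT = a + b * log2(D/W + 1).
    The intercept a and slope b here are illustrative placeholders,
    not measured device constants."""
    return a + b * math.log2(distance / width + 1)

# Example: select a menu item -- mentally prepare, home to the mouse,
# point at the menu, click, point at the item, click.
print(f"KLM estimate: {klm_estimate('MHPBPB'):.2f} s")
print(f"Fitts pointing time (D=300 px, W=20 px): {fitts_time(300, 20):.2f} s")
```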
Which Technique to Use?
- Observational methods: think-aloud protocol, co-discovery method, comparative evaluation, journaled sessions
- Query techniques: contextual inquiry, interviews, questionnaires
- Controlled experiments: eye tracking, performance measurement (a comparison sketch follows this list)
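A controlled experiment typically compares a performance measure across design variants. A minimal sketch, assuming made-up completion times for two hypothetical designs and using Welch's t-test from SciPy:

```python
from scipy import stats

# Hypothetical completion times (seconds) for the same task on two
# design variants; the numbers are invented for illustration.
design_a = [41.2, 38.5, 45.0, 52.3, 39.8, 44.1, 47.6, 42.9]
design_b = [55.4, 49.8, 61.2, 58.7, 53.1, 60.5, 57.0, 52.6]

# Welch's t-test (no equal-variance assumption) asks whether the mean
# completion times differ between the two designs.
t_stat, p_value = stats.ttest_ind(design_a, design_b, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference in mean completion time is statistically significant.")
```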
How to Measure
- Objective measurements: data collected from users during evaluation
- Subjective measurements: users' opinions, observations, and responses to interview and follow-up questions
Measuring Methods
- Time to complete a task
- Percent of task completed (per time unit)
- Number of user errors
- Time spent in errors
- Number of times that a user is not able to proceed or makes a serious mistake
- Number of times that help is needed
- Number of commands used
- Number of repetitions of failed commands
- Overall satisfaction with a product
- User comments and questions
(A sketch of deriving several of these measures from a session log follows.)
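A minimal sketch of computing some of these objective measures from one participant's logged session; the record fields and names are hypothetical, not from the lecture:

```python
from dataclasses import dataclass

@dataclass
class SessionLog:
    """One participant's log for one task; field names are hypothetical."""
    task_time: float        # seconds to complete (or give up)
    steps_completed: int    # task steps finished
    steps_total: int        # task steps in the script
    errors: int             # user errors observed
    time_in_errors: float   # seconds spent recovering from errors
    help_requests: int      # times help was needed

def summarize(log: SessionLog) -> dict:
    """Derive some of the slide's objective measures from raw session data."""
    pct_done = 100.0 * log.steps_completed / log.steps_total
    return {
        "time_to_complete_s": log.task_time,
        "percent_completed": pct_done,
        # percent of task completed per minute, a rate-style measure
        "percent_per_minute": pct_done / (log.task_time / 60.0),
        "error_count": log.errors,
        "percent_time_in_errors": 100.0 * log.time_in_errors / log.task_time,
        "help_requests": log.help_requests,
    }

print(summarize(SessionLog(180.0, 9, 10, 2, 25.0, 1)))
```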
Some Metrics from ISO 9241

Usability objective           | Effectiveness measure                       | Efficiency measure                                | Satisfaction measure
Suitability for the task      | Percentage of goals achieved                | Time to complete a task                           | Rating scale for satisfaction
Appropriate for trained users | Number of power features used               | Relative efficiency compared with an expert user  | Rating scale for satisfaction with power features
Learnability                  | Percentage of functions learned             | Time to learn criterion                           | Rating scale for ease of learning
Error tolerance               | Percentage of errors corrected successfully | Time spent on correcting errors                   | Rating scale for error handling
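The "relative efficiency compared with an expert user" entry treats efficiency as effectiveness per unit time. A minimal sketch of that reading, with illustrative numbers; the slides do not reproduce the exact ISO formula, so this is one plausible interpretation:

```python
def efficiency(effectiveness_pct: float, task_time_s: float) -> float:
    """Effectiveness achieved per unit time (% per second)."""
    return effectiveness_pct / task_time_s

def relative_user_efficiency(user_eff_pct, user_time_s,
                             expert_eff_pct, expert_time_s) -> float:
    """An ordinary user's efficiency as a percentage of an expert's,
    in the spirit of the ISO 9241 relative-efficiency measure."""
    return 100.0 * efficiency(user_eff_pct, user_time_s) \
                 / efficiency(expert_eff_pct, expert_time_s)

# A user who achieves 90% of the goals in 300 s, versus an expert who
# achieves 100% in 180 s (illustrative numbers):
print(f"{relative_user_efficiency(90, 300, 100, 180):.0f}%")  # 54%
```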
Ways to Determine if a Usability Goal Is Achieved
With respect to information on:
- An existing system or previous version
- Competitive systems
- Carrying out the task without the use of a computer system
- An absolute scale
- Your own prototype
- The user's own earlier performance
- Each component of a system separately
- A successive split of the difference between the best and worst values observed in user tests
An Example – Usability Specification
An electronic meeting calendar
- Allows the user to schedule and keep track of future meetings
- Replaces a paper-based system
Usability goals are specified early on so as to drive the design activity
Usability Specification - 1
- Attribute: Ease of learning
- Measuring concept: ease of first use of the system without training
- Measuring method: time to create the first entry in the to-do list
- Now level: 30 seconds on the paper-based system
- Worst case: 1 minute
- Planned level: 45 seconds
- Best case: 30 seconds (equivalent to now)
(A sketch of checking measurements against these levels follows.)
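A usability specification like this can be encoded and checked directly against measured values. A minimal sketch, assuming time-based measures where lower is better; the class and field names are my own, not from the lecture:

```python
from dataclasses import dataclass

@dataclass
class UsabilitySpec:
    """Levels are in the measuring method's units (seconds here);
    lower is better for time-based measures."""
    attribute: str
    worst_case: float
    planned_level: float
    best_case: float

    def assess(self, measured: float) -> str:
        if measured <= self.best_case:
            return "meets best case"
        if measured <= self.planned_level:
            return "meets planned level"
        if measured <= self.worst_case:
            return "acceptable, but below planned level"
        return "fails worst case"

# The ease-of-learning specification from the slide: time to create
# the first to-do entry without training.
spec = UsabilitySpec("Ease of learning", worst_case=60.0,
                     planned_level=45.0, best_case=30.0)
for observed in (28.0, 40.0, 50.0, 75.0):
    print(f"{observed:5.1f} s -> {spec.assess(observed)}")
```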
Usability Specification - 2
- Attribute: Task efficiency
- Measuring concept: scheduling a weekly meeting
- Measuring method: time it takes to enter a weekly meeting appointment
- Now level:
- Worst case:
- Planned level:
- Best case:
Usability Specification - 3
- Attribute: Recoverability
- Measuring concept:
- Measuring method:
- Now level:
- Worst case:
- Planned level:
- Best case:
Discussion
Imagine you have been asked to produce a prototype for this meeting calendar system. What would be an appropriate prototyping technique to enable you to test the design according to the usability specification given here, and why?
Usability Evaluation Process
- Determine what to find out
- Develop the tasks for each experiment
- Write scripts
- Get some users
- Set up the testing environment with samples
- Run the test and collect data
- Analyze the data
Perform the Test
Pre-test
- Have the users fill out a pre-test questionnaire, if any
- Provide instructions to them
During the test
- Let the users proceed with the scripts
- Find out how they work with the product
- Collect data on how they are doing
Post-test
- Have the users fill out a post-test questionnaire
- Interview them verbally
- Organize and summarize the test data
Process the Data
- Document the findings
- Identify problems: severity and frequency
- Prioritize problems (see the sketch after this list)
- Find out the reasons
- Work out solutions
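One common way to prioritize is to weight each problem's severity by how many participants it affected. A minimal sketch with made-up problem records; the scoring scheme is one simple choice, not a standard from the lecture:

```python
# Hypothetical problem records: (description, severity on a 1-4 scale,
# frequency = fraction of participants affected).
problems = [
    ("Menu terminology confusing",            3, 0.60),
    ("Cannot cancel a submitted query",       4, 0.25),
    ("Window is hard to move",                2, 0.45),
    ("No arrows showing relation direction",  3, 0.15),
]

# Rank by severity weighted by frequency, highest priority first.
prioritized = sorted(problems, key=lambda p: p[1] * p[2], reverse=True)
for desc, severity, freq in prioritized:
    print(f"score={severity * freq:4.2f}  sev={severity}  freq={freq:.0%}  {desc}")
```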
Case Study
Information retrieval tasks for INTUITIVE
- Navigation and exploration: the user's knowledge about the database is imprecise; the system needs to show what is in the database and allow the user to select entities for queries
- Query formation: translate the user's need, expressed via a graphical notation, to SQL internally (a sketch of the idea follows this list)
- Previewing the retrieved data: allow the user to select a subset of the retrieved items
- Presentation of retrieved items
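INTUITIVE's actual mapping is not described in the slides; the sketch below only illustrates the general idea of turning a graphically selected entity and its attribute constraints into SQL. All names are hypothetical:

```python
def build_query(entity: str, attributes: list[str],
                constraints: dict[str, str]) -> str:
    """Render a user's graphical selection (an entity, the attributes
    to show, and simple equality constraints) as a SQL string.
    Illustrative only: a real system must parameterize values rather
    than splice them into the query text."""
    select = ", ".join(attributes) if attributes else "*"
    sql = f"SELECT {select} FROM {entity}"
    if constraints:
        where = " AND ".join(f"{col} = '{val}'"
                             for col, val in constraints.items())
        sql += f" WHERE {where}"
    return sql

# The user picked the 'meeting' entity, chose two attributes to view,
# and constrained the room attribute.
print(build_query("meeting", ["title", "start_time"], {"room": "A101"}))
# SELECT title, start_time FROM meeting WHERE room = 'A101'
```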
Usability Evaluation
- User testing: users were videotaped and timed while performing increasingly complex tasks; captured data included time taken, errors, help accessed, and task steps missed
- Expert evaluation: HCI expert evaluators used Nielsen's heuristic checklist
Experiment Results
- Error count: experts identified 86 usability problems; users identified 38, not a subset of the above
- Cost: 33.5 hours for heuristic evaluation versus 125 hours for end-user testing; hiring HCI experts is more expensive
- Ease of problem fixing: HCI experts are better at accurate, thorough reporting and at identifying causes
Experiment Results (Cont.)
Problems identified by end users but not by HCI experts: 39% of those identified by users
Some examples:
- Repeated resubmission of the same queries
- Getting stuck while creating a query
- Being unable to make sense of the results
- Difficulty in moving windows
Experiment Results (Cont.)
Problems identified by both end users and HCI experts: more feature-related than performance-related
Some examples:
- No complete view of the query – only one entity at a time
- Confusing menu terminology
- Unpredictable attribute selection
Experiment Results (Cont.)
Problems identified by HCI experts but not by end users: 40% of those identified by experts
Some examples:
- No arrows indicating the direction of relations
- No way to cancel the operation after a query is submitted
- Slow response time when displaying results
- Lack of a clean-up command
Discussions
Heuristic evaluation
- Identifies an interface error by predicting the user problems it will cause
- Good at finding poor terminology and lack of clarity
- The ten heuristics are inadequate as a guide
- A subjective process
Discussions (Cont.)
End-user testing
- Identifies the symptom and infers its cause
- Good at finding problems that arise while performing real tasks
- Task-based: may miss features not encountered in the tasks
- Users tend to blame themselves rather than the interface
Summary
- Both techniques share the same goals, but the actual results are quite different: end-user testing indicates the symptom of a problem, while heuristic evaluation identifies its cause
- Heuristic evaluation helps analyze observed problems
- Observation of novices is still vital, as many problems are a consequence of the user's knowledge, or lack of it
- Usability testing is costly and time-consuming; it is necessary to use a variety of techniques and to focus on the aspects each one does best
Comparative Study of Heuristic Evaluation and User Testing
A study published by the National Institutes of Health
- Heuristic evaluation, cognitive walkthroughs, and usability testing are increasingly used to evaluate and improve the design of clinical applications
- Need to find out how they can be used effectively
- Conducted heuristic evaluation and usability testing on four major patient record systems, which cover 80% of the market in general dentistry
- Both methods indicate that there are significant usability problems with those systems
Comparative Study (Cont.)
Heuristic evaluation
- Evaluators were dentists with a significant background in information systems and usability evaluation
- Each evaluated the four dental systems
Usability testing
- Four groups of users, with five novice users in each
- Each participant used one system and worked through nine clinical tasks using a think-aloud protocol
- Two researchers recorded both task outcomes (rates of completed, incomplete, and incorrectly completed tasks) and the types of usability problems for each task
Comparative Study (Cont.)
- An average of 50% of empirically determined usability problems were identified by heuristic evaluation
- Some statements of heuristic violations were specific enough to precisely identify the actual usability problems
- Other violations were less specific, but still revealed usability problems and poor task outcomes
- "Heuristic evaluation may, under certain circumstances, be a useful tool to determine design problems early in the development cycle."
Usability Problems Identified with the Two Usability Methods
- "Specific" heuristic violations, which directly predicted a usability problem
- "General" heuristic violations, which suggested a usability problem