benchmarking information and call handling on powercom · 2001. 1. 18. · ben shneiderman’s...

Benchmarking information and call handling on PowerCom Master Thesis Department of Computer Science Umeå University Marcus Nyberg Supervisors: Lecturer Lena Kallin Westin Department of Computing science, Umeå University Associate professor Mikael Goldstein Usability & Interaction Laboratory, Ericsson Research Stockholm, Sweden 2001-01-11


Benchmarking information and call handling on PowerCom

Master Thesis
Department of Computer Science

Umeå University

Marcus Nyberg

Supervisors:

Lecturer Lena Kallin Westin
Department of Computing science, Umeå University

Associate professor Mikael Goldstein
Usability & Interaction Laboratory, Ericsson Research

Stockholm, Sweden 2001-01-11


Abstract

PowerCom is a prototype application for a Personal Digital Assistant featuring an information visualisation technique called Flip Zooming. The purpose is to give both an overview and a detailed view of information presented on a small touch sensitive screen. Quick retrieval of related objects is enabled by information integration, where advanced call handling is included. The usability of PowerCom was compared to two other devices in a controlled experiment using 18 subjects that were exposed to four information handling tasks and three typical call handling tasks. Effectiveness, efficiency, mental workload and subjective satisfaction were measured. Major differences regarding effectiveness and efficiency, disfavouring PowerCom, were found for one information and one call handling task. The call handling task included both the creation of a conference call and creating a private call with one of the conference members. The problems can be explained in terms of using drag and drop and its effect on the breadth of actions available at each navigation step. Previous studies have argued that the breadth should be increased rather than the navigation depth. The results here imply that this is not the case for call handling tasks.

Swedish abstract (English translation)

PowerCom is a prototype application for a PDA (Personal Digital Assistant) built on Flip Zooming, a technique for visualising information. Its aim is to give both an overview and a detailed picture of information presented on a small touch sensitive screen. Integrating different objects, including telephone calls, enables quick access to related information. The usability of PowerCom was compared with two other interfaces in a controlled experiment with 18 subjects, who performed four information handling tasks and three typical tasks involving the handling of several telephone calls. Degree of completion, efficiency, mental effort and subjective satisfaction were measured. Differences, to the disadvantage of PowerCom, were recorded for one information handling task and one telephone call task that involved creating a conference call and a private call with a conference member. These problems can be explained by drag and drop and its effect on the breadth of choices at each navigation step. Previous studies have shown that the breadth should be increased rather than the navigation depth. The results of this study indicate that this is not the case for tasks that involve handling several telephone calls.


Acknowledgements

First of all I would like to thank Staffan Björk and the rest of the people at PLAY research studio for providing me with information about the prototype and giving valuable feedback on my interface suggestions and experimental design. I would like to thank my supervisor at Umeå University, Lena Kallin Westin, for reading through the master thesis several times and giving useful comments on the paper from an independent view. Many thanks to my supervisor at Usability & Interaction Lab at Ericsson Research, Mikael Goldstein, who has guided me through the work and helped me with much advice along the way. I would also like to thank the other members of the Usability & Interaction Lab, Jost Werdenhoff, Mikael Anneroth and Didier Chincholle, for different kinds of support.


Table of Contents

INTRODUCTION
    ASSIGNMENT
    THESIS OUTLINE

BACKGROUND
    PERSONAL DIGITAL ASSISTANTS
    CALL HANDLING
    INTEGRATION OF CELLULAR TELEPHONES AND PDAS

INTERFACE CHARACTERISTICS

POWERCOM

HEURISTIC EVALUATION OF THE POWERCOM GUI
    METHOD
        Jakob Nielsen's ten usability heuristics
        Ben Shneiderman’s eight golden rules
        Task scenarios
    RESULTS
    IMPROVEMENTS
    DISCUSSION

THE KEYSTROKEMAPPER
    INTRODUCTION
    EVALUATION
        Temporal aspect and annotation
        Scalability
    DISCUSSION

THE EXPERIMENT
    HYPOTHESES
    EXPERIMENTAL DESIGN
    INDEPENDENT VARIABLES
        Usability & Interaction laboratory
        Subjects
        Interfaces and apparatus
        Tasks
    DEPENDENT VARIABLES
        Effectiveness
        Efficiency
        Optimum Path
        Mental work load assessment
        Subjective satisfaction
    PROCEDURE

RESULTS
    OBJECTIVE MEASURES
        Effectiveness
        Efficiency
        Optimum Path
    SUBJECTIVE MEASURES
        Mental Workload (NASA-TLX)
        Subjective satisfaction

DISCUSSION
    POWERCOM CONCEPT
    BREADTH VS. DEPTH
    IDIOMATIC VS. METAPHORICAL ELEMENTS
    OBJECT-FUNCTION VS. FUNCTION-OBJECT
    VISIBILITY AND FEEDBACK
    EXPERIMENTAL SHORTCOMINGS

CONCLUSIONS
    FUTURE WORK

REFERENCES

APPENDIX 1 – TASKS PRESENTED TO THE SUBJECTS

APPENDIX 2 – THE NASA-TLX MENTAL WORKLOAD FORM

APPENDIX 3 – SUBJECTIVE SATISFACTION FORM

APPENDIX 4 – THE RESULTS

APPENDIX 5 – THE STATISTICAL ANALYSIS

APPENDIX 6 – THE KEYSTROKEMAPPER CHARTS


Introduction

Assignment

The Usability & Interaction Laboratory at Ericsson Research in Kista occasionally takes part in joint projects with external research units. One such unit is PLAY, a research studio of the Interactive Institute located in Gothenburg, Sweden. The co-operation can take different forms from project to project. In this case, PLAY did most of the creative work and implementation of prototypes, while the Usability & Interaction Laboratory focused on the evaluation and the usability tests.

The project has its origin in two different ideas. The first one came from PLAY, who had an idea on how to visualise information on a small touch sensitive display, which eventually gave birth to the PowerView prototype [Björk, Redström, Ljungstrand & Holmquist 2000]. This application was intended to give mobile workers fast access to information in e.g. an Address Book or a Calendar. It was implemented on a Casio Cassiopeia E-11 PDA (Personal Digital Assistant) and benchmarked against the standard Windows CE interface running on the same Casio device [Hellstrand 1999]. This evaluation showed many interesting results. One of relevance to this master thesis is that no significant differences existed between the two interfaces regarding task completion. The second idea came from an article published by Siemens [Grundel & Schneider-Hufschmidt 1999], which described direct manipulation of call handling (handling more than one telephone call at the same time). The Siemens researchers had developed a prototype designed for large screens, which gave the Usability & Interaction Laboratory the idea to implement a prototype on a PDA. PowerView was extended with functions for advanced call handling and the second prototype, named PowerCom, was ready for a usability evaluation in May 2000.

The assignment in this master thesis was primarily to perform these usability evaluations according to the standard procedure at the Usability & Interaction Laboratory. Important parts were planning the study, designing representative tasks, finding subjects, performing the actual study and analysing the results. Some in-depth studies concerning information visualisation and call handling were also planned. These studies and the usability evaluation resulted in an alternative solution designed as a paper based mock-up. This can unfortunately not be presented in this thesis since it will be further investigated at the Usability & Interaction Laboratory. A special logging technique for capturing the user actions, called KeystrokeMapper [Goldstein, Werdenhoff & Backström 2000], which also falls within the information visualisation area, was used throughout the whole evaluation. Some time was spent on evaluating and trying to improve this method.

The whole work was carried out at Ericsson Research, Applications Research, Usability & Interaction Laboratory in Kista, Stockholm from April to October 2000.


Thesis outline

The first chapter gives some background information on Personal Digital Assistants, cellular telephones and the possible integration of the two. The area of call handling is a central feature in this process and will therefore be explained in more detail. Following this, a few interface characteristics of interest in this thesis, and when designing interfaces for PDAs, are discussed. These will also be used later to classify the interfaces and to explain the differences between the interfaces in the concluding chapter. Then the PowerCom prototype is described, together with some typical tasks, in order to give a good picture of the application. A heuristic evaluation of PowerCom was performed by the author before the actual experiment. The potential usability problems found and the methods used are discussed. In the next chapter, the focus is on the KeystrokeMapper. This tool was used throughout the evaluation to log and visualise the users' actions. An explanation of this paper based prototype is given and some suggestions on how to improve it are presented as well. After that comes a thorough description of the experiment. The experimental design, devices and interfaces, what was measured and how it was done are presented in detail. The following section contains a quantitative analysis of the results, before going into a more qualitative discussion. Important issues to consider when designing call handling interfaces and some proposals for further work conclude the thesis.


Background

Personal Digital Assistants

The abbreviation PDA stands for Personal Digital Assistant, which explains the device very well. A PDA is a small cognitive tool that comes in many different models and helps people remember daily tasks and common information. Simplified, a PDA can be seen as a digital filofax [Hellstrand 1999], containing e.g. all the upcoming appointments, to-do tasks and various telephone numbers. These PDAs are rapidly increasing in popularity among people who need access to information while on the move [Björk, Redström, Ljungstrand & Holmquist 2000]. They are made for use in mobile environments, so it cannot be assumed that the lighting is good, the working position is comfortable, etc. The intended tasks are different compared to those carried out on a personal computer since the PDA is designed to support the user’s work in a mobile environment. Often, the users want to put in as little effort as possible when performing these tasks and they might be under time pressure due to other ongoing activities. Therefore, the interaction with a small handheld device should be simple and clear and tailored for the mobile tasks.

Another important aspect is that the users are not willing to spend much time on reading manuals and learning how the device or the interface works. Goldstein, Alsiö & Werdenhoff [1999] suggested that small computers are treated differently from large ones, since they found no evidence that the media equation [Reeves & Nass 1996] was applicable to PDAs. The media equation states that people behave politely towards computers, television and media. However, the findings of Goldstein, Alsiö & Werdenhoff [1999] showed that this might not be the case for small handheld devices. Perhaps factors like size, computational power and number of buttons have an effect on the users’ expectations of the PDA usability. This puts new and challenging demands on the design of an interface whose size limits the number of objects that can be presented on the screen simultaneously and that is manipulated through an inferior pointing device or hardware buttons.

Figure 1 shows a typical PDA and an example of interaction with a so-called stylus. Tapping on a touch sensitive screen with such a tool is the most common interaction method. The stylus replaces the mouse used when interacting with a personal computer. Most PDAs use a single-click interaction model, but double-clicks and drag & drop are possible to implement. If the user wants to enter information, either a small soft QWERTY keyboard or a JOT character recognition area can be shown at the bottom of the screen.

Figure 1. An example of the interaction with a PDA, using a stylus to enter letters on the soft QWERTY keyboard.


Being a device designed for use in mobile situations, a PDA must also give the users an opportunity to retrieve information using one-handed interaction. The buttons below the display in Figure 1 are used for this purpose. Common applications and items can be chosen with them. What are the classical features of a PDA then? The following five applications can be said to exist in almost every PDA on the market today:

Contacts: Often called Address Book. Information about friends and contacts is stored here. It can be the full address, telephone numbers, email addresses, web page address, birthday, interests, etc.

Calendar: Upcoming meetings and appointments are written down in the calendar. Their due date and time can be added and a reminder (e.g. an alarm signal) can be selected.

Tasks: A list of important tasks that need to be carried out. As in the Calendar application, a reminder, e.g. an alarm signal, can be added.

Mailbox: In this application, mails that have been transferred from the PC can be read while on the move. New mails can also be written and transferred to the PC before finally being sent to the receiver.

Notes: Long or short notes about whatever the user wants.

Other common applications are a basic calculator and a find/search function, in which the users can search for keywords in the names of the objects, or if more advanced, in the contents of all data. The list of applications is extended all the time and programs of all kinds are put into these small computers as they increase in power and storage capacity. Some examples of new applications are a web browser, an image viewer, a sound recorder, a video clip viewer and games. With the exception of a few devices, no communication directly through the PDA is yet possible. The mails, notes, pictures, etc. still have to be transferred to a PC before sending them to the destination. The next step is therefore to build communication applications like telephone calls, Internet browsing and mobile awareness [Falk 1999], i.e. the indications of people with a similar device being physically close, into the PDA. The area of communication leads us into a topic of interest in this thesis, namely call handling.

Call handling

Call handling is a term that refers to having multiple telephone calls active at the same time. Many people have probably tried the simplest form of call handling at home, i.e. when you have an active call and it is indicated by two tones that someone else wants to speak to you. By pressing a certain key combination, a swap to the waiting call is made and the active call is put on hold. In this report, call handling for cellular telephones is discussed and more complex states and actions are considered. For example, a conference call containing up to five persons can be started and a member of it can be selected for a private call. All call handling states available for cellular telephones are listed below; they are adapted from Christian Ewertz [1999].


1. One active call (you are talking to the person)
2. One held call (the other person cannot hear you and you cannot hear him/her)
3. One waiting call (someone is calling you)
4. One active call and one held call
5. One active call and one waiting call
6. One active call, one held call and one waiting call
7. One active conference call (3-6 persons are talking simultaneously)
8. One held conference call
9. One active conference call and one held call
10. One active call and one held conference call
11. One active conference call and one waiting call
12. One active conference call, one held call and one waiting call
13. One active call, one held conference call and one waiting call

As these states show, cellular telephones today can only handle two ongoing calls at the same time. If you e.g. have one active call and one held call when a third person calls you, one of the first two calls must be ended in order to answer the waiting call. An alternative is to merge the first two calls into a conference call and then answer the waiting call. Another technical limit makes it impossible to handle two conference calls at the same time. You can of course participate in a conference call started and handled by another person and then start up a second conference call yourself, but you cannot be the centre of two conference calls at the same time. Theoretically, networks of conference calls can be created that enable many more than five persons to speak to each other. However, this will not be discussed any further in this paper. The technical limitations are likely to disappear if we look a couple of years ahead and consider e.g. the mobile Internet revolution. Therefore, we can use them as a guide but we should not let them limit the design solutions.
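The thirteen states in the list can be reproduced from a small set of rules, which may help when checking an interface design against them. The following Python sketch is an illustration only and is not taken from the thesis: the slot model (active, held, waiting) and all names are assumptions. The first two rules restate the text (something must be going on, and the user cannot be the centre of two conference calls). The third rule is merely an observation about the list, which contains no state that combines a held call and a waiting call without an active call.

```python
from itertools import product

# Hypothetical encoding (not from the thesis): the active and held slots are
# empty, a single call, or a conference call; the waiting slot is a flag.
NONE, CALL, CONF = "none", "call", "conference call"


def is_listed_state(active, held, waiting):
    """Return True if the slot combination appears in the list of states."""
    if active == NONE and held == NONE and not waiting:
        return False  # idle telephone: no call at all
    if active == CONF and held == CONF:
        return False  # the user cannot be the centre of two conference calls
    if waiting and held != NONE and active == NONE:
        return False  # observation only: no held + waiting state without an active call
    return True


def describe(active, held, waiting):
    """Render a slot combination in the wording used by the list."""
    parts = []
    if active != NONE:
        parts.append(f"one active {active}")
    if held != NONE:
        parts.append(f"one held {held}")
    if waiting:
        parts.append("one waiting call")
    return ", ".join(parts)


SLOT_VALUES = (NONE, CALL, CONF)
STATES = [s for s in product(SLOT_VALUES, SLOT_VALUES, (False, True))
          if is_listed_state(*s)]

for state in STATES:
    print(describe(*state))
print(len(STATES), "states")
```

Under these assumptions the filter keeps exactly thirteen of the eighteen possible slot combinations. If the technical limits disappear, as the text anticipates, relaxing a rule in `is_listed_state` shows which additional states a design would have to cover.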

To move between the states listed above, either a user action or a waiting telephone call is required. These can be classified as events. The most common events available in cellular telephones today are listed below. Transfer call is not available to private persons in the Swedish telephone network except as specially designed company solutions. Therefore, this event has not been evaluated in any way.

1. Make a call
2. Someone calls you
3. Answer a call
4. Reject a call
5. End a call
6. Put a call on hold
7. Activate a held call
8. Swap between two calls (put the active call on hold and activate the held call)
9. Create a conference call (merge two telephone calls into one)
10. Start a private call with a conference call member (other members are put on hold)
11. Transfer a call (connect an active call with a held call)

Some of these events automatically trigger others. If you for instance have an active call and choose to call a second person, the first one is put on hold when the operation is initiated. Moving to a certain state from a particular state might also require more than one event. If the user e.g. has one active call and wants to create a conference call by calling another person, the upper sequence in Figure 2 below is a possible solution initiated by the user.

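[Figure 2, upper sequence. States: One active call, One held call, One active call & one held call, One conference call. Event labels: "hold call", "create conference".]

[Figure 2, lower sequence. States: One active call, One conference call. Event label: "add one person to the conference call".]

Figure 2. Two examples of the user-activated events required, and the states passed through, when moving from the state One active call to the state One active conference call.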
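The upper sequence of Figure 2 can be read as a walk through a state machine in which each event moves the telephone from one state to the next. The sketch below is a minimal, table-driven illustration in Python; the state and event strings paraphrase the lists in the text, the table covers only the few transitions needed here, and the middle event ("make a call") is inferred from the surrounding text rather than read from the figure.

```python
# Hypothetical state and event names that paraphrase the lists in the text.
ACTIVE = "one active call"
HELD = "one held call"
ACTIVE_AND_HELD = "one active call and one held call"
CONFERENCE = "one active conference call"

# (current state, event) -> next state. Only the handful of transitions
# needed for this example are included; the table is deliberately incomplete.
TRANSITIONS = {
    (ACTIVE, "put call on hold"): HELD,
    (HELD, "activate held call"): ACTIVE,
    (HELD, "make a call"): ACTIVE_AND_HELD,
    # Making a call while one is active puts the first call on hold automatically.
    (ACTIVE, "make a call"): ACTIVE_AND_HELD,
    (ACTIVE_AND_HELD, "swap"): ACTIVE_AND_HELD,
    (ACTIVE_AND_HELD, "create conference call"): CONFERENCE,
}


def run(state, events):
    """Apply the events in order and return the list of states visited."""
    trace = [state]
    for event in events:
        try:
            state = TRANSITIONS[(state, event)]
        except KeyError:
            raise ValueError(f"{event!r} is not available in state {state!r}") from None
        trace.append(state)
    return trace


# Explicit hold first, as in the upper sequence of Figure 2.
explicit = run(ACTIVE, ["put call on hold", "make a call", "create conference call"])
# Relying on the automatic hold that "make a call" triggers.
implicit = run(ACTIVE, ["make a call", "create conference call"])
print(" -> ".join(explicit))
print(" -> ".join(implicit))
```

Both event sequences end in the conference state; the second is shorter only because one event triggers another, which is the behaviour described in the paragraph before the figure.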

This was call handling from a cellular telephone perspective. If we examine these devices further we see that they are becoming more advanced all the time. Today they can be used for much more than just making telephone calls.

Integration of cellular telephones and PDAs

At least two different device families exist that try to integrate voice and information in one device. The first family is the communicators, which are information devices with voice features. The other is advanced cellular telephones called smartphones, voice centric devices with information features. In addition to regular telephone applications, the smartphones offer a Calendar, an Address Book, Notes, access to the Internet via WAP (Wireless Application Protocol), etc. The interaction style is also changing from button-based to more direct manipulation, with the information presented on a larger display. Stylus input on a touch sensitive screen is even possible on a few smartphones, e.g. the Ericsson R380.

But is the smartphone really a PDA in a cellular telephone and not a cellular telephone in a PDA [Usability & Interaction Laboratory 2000]? The answer differs depending on whom you ask. A cellular telephone manufacturer might favour the first alternative, while a company more focused on handheld computers believes the second alternative is best. No matter which alternative you choose, there is a convergence of the two devices that implies an integration between data and voice. Therefore, as well as investigating a cellular telephone with PDA applications, we can investigate a PDA with added communication applications. This is also more appropriate in this paper, since one of the main topics in this master thesis is call handling by direct manipulation. PDAs have their origin in the computer area and have traditionally had larger displays, better input facilities and more computational power than cellular telephones. It has been suggested by Nielsen [1997] that the design of multiple features, including telephone calls, is better done with computer thinking, since the experience from integrating different user interfaces is larger in that area. When dealing with call handling through direct manipulation on a handheld computer, advanced software can solve complex events in one step instead of several unnatural steps. By letting the actions be performed very quickly in the background, only noticed by the software, the interaction can be made more direct and intuitive. An example of this is shown in the bottom sequence of Figure 2, where the user again has one active call and wants to create a conference call. It would be much easier to simply add a new conference call member by selecting or somehow moving this person to the ongoing call. All the actions required would then be solved in the background by the software, unseen by the user.
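The idea of one gesture hiding several primitive events can be made concrete with a small sketch. The Python code below is a toy model written for this purpose, not the PowerCom implementation: the `Phone` class, its method names and the contact names are all hypothetical. A single call to `add_to_ongoing_call` performs hold, dial and merge in the background and records them in a log that the user would never see.

```python
from dataclasses import dataclass, field

MAX_OTHER_PARTIES = 5  # the text mentions conference calls of up to five persons


@dataclass
class Phone:
    """Toy model (hypothetical): parties in the active call, parties on hold, event log."""
    active: list = field(default_factory=list)
    held: list = field(default_factory=list)
    log: list = field(default_factory=list)

    # Primitive events, named after the event list in the text.
    def hold(self):
        self.held, self.active = self.active, []
        self.log.append("put call on hold")

    def dial(self, person):
        self.active = [person]
        self.log.append(f"make a call to {person}")

    def merge(self):
        self.active, self.held = self.held + self.active, []
        self.log.append("create conference call")

    # One user gesture, e.g. dropping a contact on the ongoing call.
    def add_to_ongoing_call(self, person):
        if self.held:
            raise RuntimeError("this sketch does not cover a call already on hold")
        if len(self.active) >= MAX_OTHER_PARTIES:
            raise RuntimeError("conference call is full")
        if not self.active:
            self.dial(person)  # nothing ongoing: the gesture is a plain call
            return
        self.hold()
        self.dial(person)
        self.merge()


phone = Phone(active=["Anna"])
phone.add_to_ongoing_call("Ben")
print(phone.active)  # parties now in the conference call
print(phone.log)     # primitive events performed in the background
```

The user performs one action while the log shows three events, which corresponds to the difference between the lower and upper sequences of Figure 2.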

Grundel and Schneider-Hufschmidt [1999] considered these possibilities and proposed a direct manipulation interface to control communication processes, including call handling. They developed a prototype user interface for telephone applications on a large screen. The application was called Communication Circle [Grundel and Schneider-Hufschmidt 1999], and the interaction concept was to move communication partners in or out of the circle by direct manipulation, including drag & drop operations. Grundel and Schneider-Hufschmidt [1999] also stressed the advantages of direct interaction and the fact that the actions are initiated by the user, who does not have to explicitly select a function and know its precise meaning. The Communication Circle prototype was evaluated in a usability test with good results, especially regarding subjective satisfaction. Figure 3 shows two images of the basic design and schematic representation.

Figure 3 (ab). The basic design and schematic representation of the Communication Circle. Adapted from Grundel and Schneider-Hufschmidt [1999, p. 3-4].

In Figure 3b, some of the interface elements and possible actions are numbered. Their explanations, as described by the designers, follow below. No explanation of number 6 was given in the article.

1. Initiate consultation by pressing a softkey
2. Initiate consultation by touching or clicking with the mouse
3. Direct party selection by touching or clicking with the mouse
4. Direct party selection by drag & drop
5. Direct party selection by pressing a softkey

Future development was planned to include a user interface implementation on a smartphone or a PDA. PowerCom is an interface that takes this step and implements direct manipulation of call handling on a PDA. The ideas from the Communication Circle are used, but the similarities between the interfaces are few, as will become apparent. This application, featuring complex call handling combined with classic PDA functions, is explained in detail later in this paper. First, some interface characteristics of importance for PDA interfaces as well as call handling interfaces will be introduced.

Interface characteristics

Idiomatic vs. Metaphorical Paradigms

Cooper [1995] discusses three interface paradigms in his book About Face: the technology paradigm, the metaphor paradigm and the idiomatic paradigm. The technology paradigm is based on understanding how things work in order to successfully manage the interface. Considering the variety of users with different backgrounds, this is a bad design method according to Cooper. The users probably do not have the time or the will to learn the complex details of a device and its software: "Users would rather be successful than knowledgeable" [Cooper 1995, p. 55].

The metaphor paradigm is based on intuiting how things work by relating interface objects to existing knowledge. Therefore, there is no need to understand how the software works in detail. This paradigm has been used in many applications and operating systems, e.g. Microsoft Windows. Metaphors are regarded as an efficient way to boost initial performance and can exist on many levels. Visual metaphors use a picture to indicate what an object represents. Functional metaphors can be a whole program concept, e.g. an Address Book in a paper-based style with animations. Object/tool metaphors can be drawing with a pen or erasing text with an eraser. Organisational metaphors can be an action like moving objects between different areas. According to Cooper [1995], metaphors are risky because they rely on the designer and the user perceiving the associations in a similar way.

The idiomatic paradigm is based on learning how things work. Idioms are figures of speech, like knock me down with a feather or spitting image. They are understood simply because they are distinctive: "All idioms must be learned. Good idioms only need to be learned once" [Cooper 1995, p. 59]. If the idiomatic paradigm is compared to the technology paradigm, both include some kind of learning. The difference is that idioms just have to be learned, not understood, in order to be used. Many elements and actions in GUIs, even those based on a metaphor paradigm, are in some sense idiomatic. Some examples are menus, double-clicks, right-clicks and close-buttons. Cooper [1995] separates the idiomatic paradigm and the metaphorical paradigm completely in his discussions. However, just as idiomatic elements can exist in a metaphorical interface, metaphorical elements can exist in an idiomatic interface. An example of the latter will be presented in the form of PowerCom.

Breadth vs. depth

Breadth has been defined as the number of choices per menu and depth as the number of menu levels by Miller [1981]. He conducted an experiment where he compared users’ performance when searching for information in a semantic hierarchy on an interactive computer terminal. Speed and accuracy were measured as the breadth varied from two to 64, while the depth varied from one to six. If we express the hierarchies in terms of $B^D$, where $B$ is the breadth and $D$ the depth, the examined hierarchies were $2^6$, $4^3$, $8^2$ and $64^1$. Miller [1981] found that the $8^2$ hierarchy was the best of the four and suggested that the depth of a hierarchy should be minimised. A few studies have been made in the area since then, and the results have been confirmed for web page design by Larson and Czerwinski [1998], who concluded that a 32 x 16 (depth=2) hierarchy was better than an $8^3$ hierarchy. Both these hierarchies were experimentally constructed and regularly ordered, which is not always the case with a graphical interface. Goldstein et al [1999] investigated the interface of a smartphone GUI to see how the depth factor affected task completion. They defined depth as the number of different windows that had to be traversed by an experienced user in order to accomplish a particular task. The results showed that navigation depth was critical for successful task completion, and the authors recommended a maximum depth of five levels/windows/dialogs for novice users. No definition of the breadth for such an interface has been presented, but in this paper the breadth of a GUI will be calculated by considering which interface operations can be applied to each object presented on the screen at each procedural level. These interface operations are one part of Cooper’s [1995] canonical vocabulary, where they are classified as compounds. This is described further in the following chapter.
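As a quick check on the hierarchies discussed above, the following Python sketch computes the number of end items in each structure. The function name `terminal_items` is only illustrative, and the sketch assumes a regular hierarchy in which every menu on a level offers the same number of choices.

```python
from math import prod


def terminal_items(breadths):
    """Number of end items reachable in a regular hierarchy.

    `breadths` lists the number of choices on each level, top-down,
    so its length is the depth of the hierarchy.
    """
    return prod(breadths)


# Miller [1981]: breadth B repeated over depth D
miller = {"2^6": [2] * 6, "4^3": [4] * 3, "8^2": [8] * 2, "64^1": [64]}
# Larson and Czerwinski [1998]
larson = {"32 x 16": [32, 16], "8^3": [8] * 3}

for name, levels in {**miller, **larson}.items():
    print(f"{name}: depth={len(levels)}, items={terminal_items(levels)}")
```

Under these assumptions each of Miller's hierarchies holds 64 items and both of the Larson and Czerwinski structures hold 512, so the comparisons only vary how the same number of items is distributed between breadth and depth.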

Canonical Vocabulary

Restricting the vocabulary is one of the reasons that GUIs are superior to text-based interfaces, according to Cooper [1995]. He discusses the restricted set of mouse actions compared to the infinite number of character combinations. He calls basic actions such as Tap, Release, Keypress and Drag atomic elements. The fewer the atomic elements, the easier an interface is to learn, but the fewer things can be expressed in it. Cooper [1995] has created a canonical vocabulary where the atomic elements are the base for the compounds, which are interface operations like Double-tap, Button-click and Drag & Drop. The compounds in turn form the base for actions like Delete and Create. The canonical vocabulary discussion is closely related to the breadth vs. depth discussion. As mentioned in the previous chapter, the number of objects on the screen in combination with the compounds is used in this paper to roughly decide the breadth. This can be expressed in a formula according to Equation 1, where n is the number of objects at a given level and m is the number of compounds at the same level.

\[
\text{breadth} \approx \sum_{1}^{n} \text{objects} \cdot \sum_{1}^{m} \text{compounds} \qquad \text{(Equation 1)}
\]

This equation gives an approximate value for the breadth. However, sometimes the breadth is smaller, since not every compound is applicable to every object, and sometimes it is larger, when different variants of a single compound can be applied to the objects. A common guideline [Sun 1999] when designing applications for handheld devices is to use a single-click interaction model. The atomic elements and the compounds are reduced to one when using this model (m=1), which means that only the number of objects decides the breadth at each procedural level.

Object-Function vs. Function-Object

The introduction of GUIs and direct manipulation has resulted in an object-oriented design on both the implementation and the interface level. This is characterised by first selecting an object and then the function to apply to this object. Object-orientation has been considered one of the factors that contribute to the usability of direct manipulation interfaces. Kunkel et al [1995] decided to investigate this and defined four ways to specify an object and activate a function:

• Explicit Object-Function. The first click is on the object and the second on the function.

• Explicit Function-Object. The first click is on the function and the second on the object.

• Implicit Object-Function. A drag & drop operation of the object to the function.

• Implicit Function-Object. A drag & drop operation of the function to the object.

The results of their experiment showed that no significant differences existed between the Object-Function and Function-Object interaction styles. However, explicit selection was superior to implicit selection in terms of efficiency and frequency of errors. A question that the authors [Kunkel et al. 1995] discuss is the importance of other factors. One is the use of metaphors, e.g. dragging a document to a waste-basket supports the Object-Function syntax, while the use of an eraser to delete text supports a Function-Object syntax. In the experiment, the user task was to paint different shapes on a screen. This task might be better suited to a Function-Object interaction style, since you normally bring the paint to the house or move the pencil towards the paper. This gives an example of how difficult it is to measure only one factor, since they are all related to each other. Every single situation has to be considered separately with regard to breadth, metaphors, Object-Function, etc.
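The four combinations defined by Kunkel et al. can be summarised as a small data structure with two independent dimensions. The Python sketch below is only an illustration of that taxonomy; the class names and the wording of the generated steps are assumptions and are not taken from the original study.

```python
from dataclasses import dataclass
from enum import Enum


class Selection(Enum):
    EXPLICIT = "two separate clicks"
    IMPLICIT = "one drag & drop operation"


class Order(Enum):
    OBJECT_FUNCTION = "object first"
    FUNCTION_OBJECT = "function first"


@dataclass(frozen=True)
class InteractionStyle:
    selection: Selection
    order: Order

    def steps(self, obj, function):
        """User actions needed to apply `function` to `obj` in this style."""
        if self.order is Order.OBJECT_FUNCTION:
            first, second = obj, function
        else:
            first, second = function, obj
        if self.selection is Selection.EXPLICIT:
            return [f"click {first}", f"click {second}"]
        return [f"drag {first} onto {second}"]


# All four styles from the list above
styles = [InteractionStyle(s, o) for s in Selection for o in Order]

waste_basket = InteractionStyle(Selection.IMPLICIT, Order.OBJECT_FUNCTION)
print(waste_basket.steps("document", "waste-basket"))
```

The last two lines model the waste-basket metaphor mentioned above as an implicit Object-Function operation.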

PowerCom

Some parts of the explanation of PowerCom are taken from discussions with the main designer, Staffan Björk. The concept of PowerCom and some ideas behind the design decisions, as explained by the creators, can be found in Björk [2000], Björk & Redström [2000], Björk, Redström, Ljungstrand & Holmquist [2000] and Holmquist [2000].

Introduction

The interface is designed in a very different way from what a Microsoft Windows user expects. Common GUI elements like buttons, menus and scrollbars have been avoided in order to fully explore alternative interaction styles [Björk & Redström 2000]. The purpose is to evaluate interaction elements that are less used in common GUIs and perhaps find an efficient representation. PowerCom is built on direct manipulation of objects, including drag & drop, and introduces interface elements like dropareas. PowerCom is also designed to efficiently visualise information on a small display, which is discussed in more detail below. It is an interface developed to give users fast access to common information on PDAs and consists of four basic applications: Contacts, Calendar, Notes and Mailbox. Advanced simulated call handling is possible by linking the telephone calls to the contacts. The interaction is task oriented and the applications are designed to support specific tasks, e.g. finding information about a meeting, searching for or reading mail from a specific person, or starting a conference call with a group of people.

Information visualisation

Grasping the whole, seeing patterns and gaps, recognising trends, etc. are well-known human problems [Shneiderman 1998]. Information visualisation techniques attempt to solve these problems and provide the user with an efficient visual presentation that gives a qualitative overview of complex data sets and not just an ant's-eye view. Shneiderman [1998] gives many examples of how information visualisations can be used. Some examples are grouping related information visually, compacting information into a smaller area, using hierarchical search in overviews to locate more detailed information, and allowing zooming and details-on-demand techniques. An overview is not enough for many tasks, though; a detailed view is often needed as well. One typical example, where both an overview and a detailed view are needed, is moving files between different folders in Windows Explorer. This example also shows that the meaning of objects of primary interest often depends on the context they exist in.

Several information visualisation techniques have been presented that intend to provide the user with enough information, both in detail and in overview [Shneiderman 1998]. These techniques are named Focus+Context visualisations and some classical examples are the Fisheye View [Furnas 1986], the Bi-focal Display [Spence & Apperly 1982] and the Perspective Wall [Mackinlay et al. 1991]. To the left in Figure 4, the Bi-focal Display is shown, where the data is distorted in a 2D perspective along the horizontal dimension only. To the right in the same figure, the Perspective Wall is displayed. Here a 3D perspective is used that was originally developed to present temporal data, but the distance from the focus can also represent spatial differences. These two information visualisation techniques can be labelled as continuous Focus+Context visualisations, since they also provide the user with more exact information about the spatial or temporal distance from the focus to the context objects.

Figure 4. Two ways of visualising the same information, to the left according to the Bi-focal Display and to the right according to the Perspective Wall. Adapted from an example by Lars-Erik Holmquist [2000].

The Focus+Context techniques were originally developed for presentation on a large screen and not much research has been done on how to apply these techniques to a small display. At least one method has been developed though, and that is Flip Zooming [Holmquist 2000]. This technique presents discrete and sequential information divided into a number of tiles from top left to bottom right. One object is centralised and enlarged; this is the focus of attention. The other objects are smaller and might contain less detailed information; this is the context. In Figure 5 below, three examples of Flip Zooming in the PowerCom GUI are shown: two are from the Contacts application and one is from the Calendar. As you can see, the visualisations are identical for all context objects. The distance from the focus to a context object is of less importance and need not be visualised in a particular way.

Figure 5 (abc). The left and middle screenshots (a and b) show the Address Book in the Contacts application in PowerCom with focus on the letters S and N respectively. The right picture (c) displays a year view from the Calendar in PowerCom with focus on the month of August.

The tile in focus is allowed to take up most of the screen space in order to give the user a detailed focus. The other tiles are smaller and give the user less information, depending on the screen space available. However, enough information is available to give the user a clear picture of the surrounding context. In the Contacts application examples (Figures 5a and 5b), the lines in a miniaturised tile indicate how many contacts there are with surnames starting with that letter. The focus is easily changed in a random access style by flipping another tile into focus. This can be done by tapping on it directly, and since the objects are sequentially ordered, forward and backward navigation is also possible. In the Calendar (Figure 5c) the context is allowed to show more details, which is an advantage as long as the focus is clear. If the amount of information in the context is too large, some of the idea behind Focus+Context visualisation is lost.
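The focus handling described above can be sketched as a small model with one focus tile and sequentially ordered context tiles. This Python sketch is an illustration only and is not based on the PowerCom source code; the class name `FlipZoomView` and the choice to stop at the first and last tile instead of wrapping around are assumptions.

```python
from dataclasses import dataclass


@dataclass
class FlipZoomView:
    tiles: list       # sequentially ordered items, e.g. letters or months
    focus: int = 0    # index of the tile currently in focus

    def tap(self, index):
        """Random access: flip the tapped tile into focus."""
        if not 0 <= index < len(self.tiles):
            raise IndexError("no such tile")
        self.focus = index

    def forward(self):
        """Sequential navigation to the next tile (assumed to stop at the end)."""
        self.focus = min(self.focus + 1, len(self.tiles) - 1)

    def backward(self):
        """Sequential navigation to the previous tile (assumed to stop at the start)."""
        self.focus = max(self.focus - 1, 0)

    def layout(self):
        """(tile, role) pairs in reading order, top left to bottom right."""
        return [(tile, "focus" if i == self.focus else "context")
                for i, tile in enumerate(self.tiles)]


address_book = FlipZoomView(list("ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
address_book.tap(address_book.tiles.index("S"))   # as in Figure 5a
address_book.backward()                           # sequential step to R
```

In this model every context tile gets the same role regardless of its distance from the focus, which mirrors the discrete character of Flip Zooming.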

PowerCom implements hierarchical Flip Zooming visualisations: a selected object is presented in more detail and visualised by Flip Zooming at a more detailed information level. On personal computers with a large screen, it is possible to visualise information from different levels in a Flip Zooming hierarchy simultaneously. This is sometimes impossible on small PDA displays, which probably makes it more difficult for a user to find the wanted information. At the bottom level, when an object has been chosen, there is in fact no need for an information visualisation of that particular object. However, in PowerCom the Flip Zooming technique is also used to display connections between objects. These connections are called information links and are explained in the next chapter.

Information links

When retrieving information for different types of objects, most PDAs require the user to move between the different applications and imagine the connections between the information. This is a complex task, especially on a PDA where only one application window can be displayed at a time. The information links in PowerCom are of a semantic type and form a heterogeneous context to the chosen object in focus [Björk 2000]. In spite of having objects from different information domains, the Flip Zooming technique can be used to switch from one focus to another. In Figure 6, one example of this is shown, where the user's task is to find the meetings booked with Stig Smedberg. Figure 6a shows the Overview with focus on the Contacts application. If the user taps on the letter S in the Contacts tile, the address book with focus on S is displayed (Figure 6b). With a tap on Stig Smedberg, the information context view for him is opened (Figure 6c). For a detailed view of the meetings with Stig Smedberg, the Calendar tile is put into focus by tapping on it. From this point (Figure 6d), the user can open a context view for one of the meetings by dragging the chosen meeting to the Open Context droparea and dropping it there. Note that in this prototype the links are defined by a database file that the program reads on startup. The user can also create their own links by simply dragging an object to the desired information context displayed on the bottom bar. A possible improvement of the software would be to create some links automatically.
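One way to picture the information links is as a graph whose nodes are objects from different applications. The Python sketch below is a minimal illustration under stated assumptions: links are treated as symmetric, objects are represented as (application, name) pairs, and the meeting and mail entries are invented sample data, not contents of the prototype's database file.

```python
from collections import defaultdict


class LinkStore:
    """Hypothetical store of links between objects of different kinds."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, a, b):
        """Create a link between two objects (assumed to be symmetric)."""
        self._links[a].add(b)
        self._links[b].add(a)

    def context(self, obj, kind=None):
        """Objects linked to `obj`, optionally restricted to one application."""
        linked = self._links.get(obj, set())
        return sorted(o for o in linked if kind is None or o[0] == kind)


store = LinkStore()
stig = ("contact", "Stig Smedberg")
store.link(stig, ("meeting", "Budget review"))      # invented sample data
store.link(stig, ("meeting", "Project kick-off"))   # invented sample data
store.link(stig, ("mail", "Agenda"))                # invented sample data

# The task from Figure 6: find the meetings booked with Stig Smedberg
print(store.context(stig, kind="meeting"))
```

Restricting the context by `kind` corresponds to putting one application tile, such as the Calendar, into focus within a context view.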

Figure 6 (abcd). An interaction sequence showing the information links property in PowerCom. The user’s task is to find the meetings booked with Stig Smedberg.

Call handling in PowerCom

Several context views can be open at the same time and the user can quickly shift focus between them to retrieve different kinds of information. The multiple views represent the users’ parallel activities. A telephone call can also be seen as one kind of activity and is therefore part of the information contexts and integrated in the views. Since a telephone call is closely related to a contact, it is by default associated with the context view of the contact. However, as described in Figure 6, the focus can be shifted between different kinds of information within a context, and a telephone call can be made from any context view where a contact is represented. Of course, the contact needs to have a telephone and an associated telephone number. A simple call handling example is described in Figure 7 below.

In the leftmost screenshot the user is talking to Kent Nilsson when a waiting call from Lisa Person is indicated. By tapping on the Lisa Person context on the bottom bar (Figure 7a), the focus is changed to her call (Figure 7b). It can then be answered by tapping on the telephone icon to the right of her name (Figure 7b). Kent Nilsson is automatically put on hold when this action is carried out. A swap back to Kent Nilsson can be made by tapping on his context on the bottom bar (Figure 7c), with the outcome that Kent Nilsson's call is activated (Figure 7d).

Figure 7 (abcd). An interaction sequence in PowerCom showing call waiting, answer & put on hold, and swap between two calls.

This example shows that while multiple context views are an option for information handling, they are a requirement when handling more than one telephone call at the same time. However, some call handling situations need to be analysed separately. The auditory modality is primarily used when having a telephone conversation and the visual modality is used when handling information. This allows the users to simultaneously talk to a person and look for information. A context view containing an active telephone call is therefore kept semi-active when switching focus to another context view not containing a telephone call. It is semi-active in the sense that the conversation can continue and is indicated on the bottom bar. Creating conference calls is another central feature that needs to be considered separately.

In order to be consistent with the Focus+Context technique, the people participating in the conference call must be within the same context, i.e. they must all be in focus. Because of the information integration, a person can be linked to many objects and hence be represented in many context views at the same time. Therefore, the telephone icon was chosen to represent the status of a telephone call with a person, and the icon must be movable between the different context views. Since the telephone icon is attached and closely related to a person in Contacts, it was decided that the person’s name should be presented next to the telephone icon at all times. An example of this is shown below, where the user first creates a conference call and then a private call.

Figure 8 (abcd). An example of the creation of a conference call in PowerCom, followed by a selection of a conference member for a private conversation.

The leftmost screenshot (Figure 8a) displays a call handling situation where the user is talking to Stig Smedberg and has Jessica Gren on hold. A conference call is then created by dragging the telephone icon to the right of Stig Smedberg to the Jessica Gren context on the bottom bar (Figure 8a). Both names are visualised next to each other in the focus view when this operation is completed (Figure 8b). A private conversation with Stig Smedberg can be initiated by dragging the telephone icon back to the Stig Smedberg context on the bottom bar. It can also be done as shown in Figure 8b, where the user first taps on the Stig Smedberg context on the bottom bar. The telephone icon to the right of Stig Smedberg in his context then indicates an ongoing call in another context (Figure 8c). By tapping on it, the call with him is moved to this context, as shown in Figure 8d.
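The call handling sequences in Figures 7 and 8 can be approximated by a small state model in which each context view holds the people taking part in its call. The Python sketch below is a simplified illustration and not the PowerCom implementation. It assumes that only one context has an active call at a time, that every other call is held, and that the remaining conference member is held when someone is moved out for a private call (the text does not state this last point).

```python
class CallModel:
    """Hypothetical model of calls attached to context views."""

    def __init__(self):
        self.calls = {}      # context name -> set of people in that context's call
        self.active = None   # context whose call is currently active

    def answer(self, context, person):
        """Answer or start a call in `context`; any other call is put on hold."""
        self.calls.setdefault(context, set()).add(person)
        self.active = context

    def swap(self, context):
        """Activate the call in another context; the previous one is held."""
        if not self.calls.get(context):
            raise ValueError("no call in that context")
        self.active = context

    def move(self, person, source, target):
        """Move a person's telephone icon from one context to another."""
        self.calls[source].remove(person)
        if not self.calls[source]:
            del self.calls[source]
        self.calls.setdefault(target, set()).add(person)
        self.active = target

    def state(self, context):
        members = self.calls.get(context, set())
        if not members:
            return "idle"
        kind = "conference" if len(members) > 1 else "call"
        status = "active" if context == self.active else "held"
        return f"{status} {kind}"


# Figure 8: talking to Stig Smedberg with Jessica Gren on hold
model = CallModel()
model.answer("Jessica Gren", "Jessica Gren")
model.answer("Stig Smedberg", "Stig Smedberg")
model.move("Stig Smedberg", "Stig Smedberg", "Jessica Gren")   # create conference
conference_state = model.state("Jessica Gren")
model.move("Stig Smedberg", "Jessica Gren", "Stig Smedberg")   # private call
```

Both the drag & drop variant and the tap variant of the private call end in the same state, so a single `move` operation is enough for this sketch.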

Something to consider is the effect the information context views have on the call handling situations. Two persons you have spoken to in a conference call are likely to be present in the same context view. This means a shortcut to the creation of a conference call and a less complex interaction sequence.

Characteristics

PowerCom is an interface that uses the idiomatic paradigm in the Flip Zooming concept [Holmquist 2000]. The tiles that expand and shrink depending on the user's taps cannot be intuited; they have to be learned. The information contexts, where objects are linked to each other, are another idiomatic design feature. Interface elements like dropareas, where a function is activated on an object when it is dropped there, can be said to implement idiomatic drag & drop.

If you take a closer look at the interface, some metaphorical elements can be found though. The Address Book in the Contacts application is one example of this, where there is one tile for each letter. This can be seen as a filofax metaphor, with one page for each letter on which you write down your contacts. The tap on a tile can be compared to selecting a letter and turning to that page in one action. Furthermore, the tile in focus is larger than the tiles representing the context. This is based on how human vision works: by selecting different objects the user can move the centre of attention to another area without losing the overview. However, an interface can be mainly idiomatic or mainly metaphorical, and PowerCom is definitely mainly idiomatic.

PowerCom is built on direct manipulation including drag & drop, which means many options for what to do with the objects. The canonical vocabulary is quite large, to use the words of Cooper [1995]. The atomic elements are tap, release tap and drag. Together they form the compounds tap and drag & drop, which can be applied to each object. Several parallel information contexts and the integration of information and call handling increase the number of objects presented on the screen. In combination with the drag & drop operations, the result is a larger breadth. If we look at the navigation depth for simple tasks, it is smaller than the recommended five or six windows [Goldstein, Anneroth & Book, 1999]. Tasks that are more complex naturally exceed this number. We must also remember that the information contexts are designed for quick retrieval. Where the user has to perform several consecutive tasks using another PDA, he/she only has to perform one task using PowerCom's task-oriented design. This has a big effect on the total number of navigation steps for multiple tasks. Therefore, PowerCom can be considered an interface with quite a large breadth and a smaller navigation depth, which is in line with the recommendations.
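The effect of the two compounds on the breadth can be illustrated with the approximation from Equation 1, breadth ≈ n · m. In the Python sketch below, the screen contents are a hypothetical example and not a measurement of an actual PowerCom view; the function names are likewise only illustrative.

```python
def approximate_breadth(n_objects, n_compounds):
    """Equation 1: assume every compound applies to every object."""
    return n_objects * n_compounds


def counted_breadth(applicable):
    """Count only the object/compound pairs that are actually available."""
    return sum(len(compounds) for compounds in applicable.values())


# Hypothetical screen with three objects (not taken from PowerCom itself)
screen = {
    "contact tile": {"tap", "drag & drop"},
    "telephone icon": {"tap", "drag & drop"},
    "bottom bar context": {"tap"},
}
compounds = {"tap", "drag & drop"}

print(approximate_breadth(len(screen), len(compounds)))  # 6
print(counted_breadth(screen))                           # 5
print(approximate_breadth(len(screen), 1))               # 3 (single-click model, m=1)
```

The difference between the first two figures shows why Equation 1 is only approximate, and the last line shows how a single-click model halves the breadth for the same number of objects.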

This interface is, like many other GUIs, built on an Object-Function interaction style. The function is normally not specified directly by the user though; instead it is defined by the actual tap. Retrieving information is all about viewing and selecting, which means that an object is chosen by tapping on it and another view is automatically displayed. Creating links between objects is a central feature that requires a drag & drop operation; this is done by dragging an object to another information context. In this case, the function is defined by the action itself as well. Some exceptions exist for specific actions such as modify and open. If the user wants to modify an object, it is done in an implicit Object-Function style exactly as described by Kunkel et al. [1995].

Implementation

The PowerCom prototype was implemented by the PLAY research studio under Windows CE using WABA [Wabasoft 2000], which is a subset of the Java programming language designed for writing applications for small devices. A Casio Cassiopeia E-105 was the device chosen for the implementation; a picture of it is displayed in Figure 9. The reasons for choosing this device were simply that it had the most computational power and one of the best touch sensitive screens, considering e.g. resolution and colours, when the project started at the beginning of 2000.

The device has four buttons on the front below the display: one is used for navigation through menus and three are used as quick shortcuts to the applications Main Menu, Calendar and Contacts. It also has three buttons on the left side, one of which is used to record voice messages. The other two are navigation buttons, where one is used for exiting a view and the other one is the action wheel. The latter is a navigation button that can be rotated in two directions as well as pressed, and is used for navigating through menus and selecting items. In this experiment, the hardware buttons were not implemented at all, partly due to the results of the previous PowerView experiment [Hellstrand 1999].

Technical specifications

Display - 240x320 pixels, TFT liquid crystal, 65,536-colour touch sensitive screen
CPU - 131 MHz MIPS R4000
Memory - 32 MB RAM, 16 MB ROM
Built in - Microphone, speaker, stereo earphone jack
Weight - 255 grams
Dimensions - 19.8 x 82.8 x 129.8 mm

Figure 9. PowerCom running on a Casio Cassiopeia E-105.

The former prototype, PowerView, was optimised for one-handed information retrieval using the action wheel. However, this could not be properly tested during the experiment since the users systematically refused to use one-handed navigation. Although one hand was occupied holding a cellular telephone while standing and they were prompted by the experimenter to use one-handed navigation, they preferred to hold the device in an awkward position and use the stylus.

Many possible reasons for this are discussed in the paper by Hellstrand [1999], and one recommendation was not to spend time implementing an interface optimised for one-handed navigation on a Casio Cassiopeia, since the form factor and the design of the hardware buttons are not suitable for this purpose. This does not mean that one-handed navigation should not be considered. In the PowerView experiment the device and application were described as a digital filofax and the tasks were information oriented. A paper-based calendar is often used by holding it in one hand and a pen in the other. If you present the PDA as a communication device including filofax applications and let the subjects solve call handling tasks, the result might be different although the form factor is not optimal. However, the decision was made not to use the hardware buttons.

This Casio device is a regular PDA and has no built-in telephone, i.e. no real telephone calls can be made and the call handling tasks have to be simulated somehow. Fortunately, the WABA programming language offers the possibility to play back sound clips through the speaker of the device. Certain call handling events were therefore chosen to trigger the playback of certain sound clips. For example, if you called a person, a sound clip associated with that person was played back. When the person was put on hold or the call was ended, the sound clip was stopped. Interleaving the participating persons' sound clips was a solution for simulating advanced call handling situations, such as a conference call.
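The simulation idea can be sketched as a function that decides which clip segments to play for the people in the active call. The Python sketch below does not use the WABA sound API, and the text does not describe how the prototype interleaved its clips. The segment names, the `CLIPS` table and the round-robin interleaving are all assumptions made for illustration.

```python
from itertools import chain, zip_longest

# Hypothetical clip segments per person (invented names)
CLIPS = {
    "Stig Smedberg": ["stig-1", "stig-2"],
    "Jessica Gren": ["jessica-1", "jessica-2", "jessica-3"],
}


def interleave(*clips):
    """Round-robin interleaving of segments from several clips."""
    mixed = chain.from_iterable(zip_longest(*clips))
    return [segment for segment in mixed if segment is not None]


def playlist(active_people):
    """Segments to play for the people in the active call (empty if none)."""
    return interleave(*(CLIPS[person] for person in active_people))


print(playlist(["Stig Smedberg"]))                  # ordinary call
print(playlist(["Stig Smedberg", "Jessica Gren"]))  # simulated conference call
print(playlist([]))                                 # on hold or ended: nothing plays
```

A call with one person plays that person's clip unchanged, while a conference alternates between the participants' segments until all clips are exhausted.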


Heuristic evaluation of the PowerCom GUI

A heuristic evaluation was not part of the original master thesis work. However, it is a natural step before performing a more formal usability evaluation. If an informal evaluation is performed, many usability problems can be eliminated, which makes the following experiment more useful. Informal testing is flexible, less expensive, less time-consuming and more participatory compared to formal experiments [Nielsen 1993, Mack & Nielsen 1994]. It is also very well suited to the early phases of a design process. This informal evaluation can be carried out in many ways, e.g. as an ethnographical study, through interviews, by task scenarios or a combination of different methods. The most common approach is that an HCI expert conducts a heuristic evaluation following some predefined rules for interface design, e.g. Jakob Nielsen’s ten guidelines [Nielsen 1993, Mack & Nielsen 1994] or Ben Shneiderman’s eight golden rules [Shneiderman 1998]. These guidelines are based both on designers' experience and on human cognitive capabilities and have been validated in a number of experiments.

Method

The guidelines mentioned above, combined with a few typical task scenarios, were chosen for this heuristic evaluation. Nielsen's usability heuristics and the golden rules of Shneiderman have many parts in common, so a combination of the two was used when working through the different scenarios. The purpose of the task scenarios was to get a picture of the interaction at a higher level without getting lost in too many details. Some of these guidelines were developed particularly for personal computers and large screens. They are not applicable to the design of small displays for PDAs and were therefore not considered.

Jakob Nielsen's ten usability heuristics

These ten usability heuristics are taken from Jakob Nielsen's website [Useit 2000] and are a refined version of the ten originally developed usability heuristics that can be found in Nielsen [1993, p. 19-20].

• Visibility - The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.

• Match between system and the real world - The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. It should follow real-world conventions, making information appear in a natural order.

• User control and freedom - Users often choose system functions by mistake and will need a clearly marked "emergency exit" to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.

• Consistency and standards - Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.

• Error prevention - Even better than good error messages is a careful design that prevents a problem from occurring in the first place.


• Recognition rather than recall - Make objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable.

• Flexibility and efficiency of use - Accelerators, unseen by the novice user, may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.

• Aesthetic and minimalist design - Dialogues should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.

The following two guidelines were not considered in this evaluation, since PowerCom did not provide any error messages to the user and no documentation was used in the experiment.

• Help users recognise, diagnose, and recover from errors - Error messages should be expressed in plain language and precisely indicate the problem, and suggest a solution.

• Help and documentation - Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user's task, list concrete steps to be carried out, and not be too large.

Ben Shneiderman’s eight golden rules

These rules are taken from the book Designing the User Interface [Shneiderman 1998, p. 74-76].

• Consistency - There are many types of consistency to consider, e.g. consistent sequences of actions in similar situations and identical terminology in different prompts and menus. Very similar to Nielsen’s consistency and standards.

• Use of shortcuts - Frequent users desire to reduce the number of interactions and hence increase the pace. Some possible accelerators are special function keys and hidden commands. Compare to Nielsen’s flexibility and efficiency of use.

• Informative feedback - For every user action, there should be system feedback. The amount of feedback depends on the type of action. For major and infrequent actions the system response should be more substantial.

• Closure of dialogs - A sequence of actions should be designed to give the users a feeling of accomplishment. There should be a beginning, a middle and an end to give a sense of relief and an opportunity to drop plans and options in order to prepare for the next action.

• Error prevention and error handling - The system should be designed to prevent errors as far as possible, e.g. by using menu selections instead of forms. If an error nevertheless occurs, the system should give simple and constructive instructions for recovery.

• Permit easy reversal of actions - As often as possible, actions should be reversible. This relieves the user of a lot of anxiety, knowing that errors can be undone. It also encourages exploration of unfamiliar features.

• Support internal locus of control - The users should initiate the actions and not just be respondents to them. Surprising system actions build up dissatisfaction and give no sense of being in charge.


• Reduce short-term memory load - Human information processing in short-term memory is limited. The magic number 7±2 presented by Miller [1956] is a rule of thumb for how many items a human can keep in short-term memory simultaneously. Chunking objects into groups is one way to extend this limit. Still, interfaces should be kept clear and simple and multiple pages should be consolidated.

Task scenarios

Information handling

For all four applications (Contacts, Calendar, Mailbox and Notes), the tasks find information, enter new information and change existing information were chosen. Some Focus+Context tasks, such as find all the mails from a specific person, find all persons linked to a specific meeting, link a note to a meeting context or link a mail to a specific person, were also analysed.

Call handling

The possible call handling states have been mentioned earlier in this paper. From this list, the most common states and the most common ways to move between them were selected. These include simple operations such as call a person and end the call with a person. More complex call handling operations, like call a second person and put the first one on hold, answer a waiting call and put the ongoing one on hold, swap between two calls, create a conference call and start a private call with a conference member, were also considered.

Results

The potential usability problems are sorted by the guideline violated.

Informative feedback

PowerCom is an interface built on a new visualisation technique and designed to fully explore alternative interaction styles. Proper feedback is therefore needed in order for the users to better understand the concept.

Headings were not presented in all the tiles, which may confuse the users. It is not always obvious what is a mail and what is a meeting in the context views. Some of the headings were placed at different positions in the tile, which is also a consistency issue. The recommendation is to provide headings everywhere and to place them at the same position in the tile.

No direct feedback was given when the information of an object was changed. If the user wanted to make sure it had been done properly, the context of the object had to be reactivated. When a change has been made, it is an advantage to give the user direct feedback. This is also related to the closure guideline presented by Shneiderman, in which he discusses the clear termination of a task.


The icons indicating the states of telephone calls were very small and some of them were difficult to understand. The difference between some states was not clearly visible and a redesign of some icons would improve the interface.

When a conference call was created, no visual feedback was given. The participating conference members were not presented next to each other and no indication was given that a conference call was ongoing. This complex call handling task should give better feedback to the users so they can be sure of who participates in the conference call.

Visibility

The persons in the Contacts application and the notes in the Notes application were ordered alphabetically. In the Overview a shortcut was provided where the users could press a tile symbolising a letter directly. However, no letters were written out due to the limited screen space available. Even a user who understood how the tiles were ordered had to count from A to the sought letter. This can be solved in many ways, the best of which is to write out all the letters and rearrange the other information in the tile. Another solution would be to write out some of the letters to at least indicate what information is represented. A third would be to group the letters, for instance three and three.
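The third suggestion, grouping the letters three and three, can be illustrated with a short Python sketch. It assumes the 26-letter English alphabet and a hypothetical function name; the letters actually used in PowerCom and the tile layout are not specified here.

```python
import string


def group_letters(group_size=3):
    """Split A-Z into consecutive groups, one label per Overview tile."""
    letters = string.ascii_uppercase  # assumption: plain A-Z
    return [letters[i:i + group_size]
            for i in range(0, len(letters), group_size)]


print(group_letters())
```

With groups of three, the 26 letters yield nine labels, the last of which holds only two letters.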

Some commands necessary to navigate were hidden and not easily figured out during first-time usage. To move up a level in the information hierarchy or to close an information context, the background had to be tapped. However, neither of these two operations was visible to the user. A solution is to write out on the background what happens if it is tapped. This information can disappear after e.g. 20 successful taps on it. Creating buttons for these two functions can also solve the problem. However, this is not in line with the alternative interaction style thinking.

In the Overview, the applications gave no hint of where to tap in order to enter the application. The solution was simple: the user can tap anywhere. However, for a novice user, a visible place to tap is better. Creating a tile heading with inverted text, making it look like a tappable area, is one alternative that also solves the heading problem. A second possibility is to use colours to indicate that some of the information in the tiles can be tapped as a shortcut.

Consistency

In the Overview, all applications provided shortcuts so the user could make a selection directly and move to the next level in the hierarchy. In the Calendar application, this shortcut was not implemented although enough information was given for it. A simple solution is to just add this shortcut possibility.

Different icons representing states of telephone calls were used for cellular telephone numbers and stationary telephone numbers. This in itself does not have to be a usability problem. However, the icons behaved differently when tapped. A simple solution would be to change the terminology so the two different icons behave the same way.


Recognition rather than recall

When looking for an object, the user has to remember some part of the name in order to find the object. No list of objects or search function is implemented to help information retrieval. A recommendation is to provide an optional list of objects to help users who do not remember the exact name of the object.

Aesthetic and minimalist design

The tiles representing the four applications contained a great deal of information in the Overview and some of it could be removed. For example, knowing the total number of objects is not necessary. Flip Zooming is a technique that in a way reduces the number of objects by grouping them together. However, the focus and the groups of objects could be more clearly indicated by e.g. colours, headings and faded backgrounds to assist the user's information processing.

Speak the users’ language

The Calendar was called Meetings, the Mailbox was called Letters, Cancel was labelled Abandon and Save & Close was called Close & Save. These might be considered details, but all these small things increase the users’ cognitive load. It is better to use standard words to avoid confusion.

Task scenarios

The task scenarios were helpful when analysing the call handling tasks. Simple actions like making a call and swapping between calls did not seem to be a usability problem. However, if we consider more complex call handling, e.g. creating a conference call, a few problems were found. The solution to them was not obvious and many different solutions were discussed for the creation of conference calls and private calls. Merging telephone calls by merging context views was the most common way to try to create a conference call when a few colleagues tested the scenarios. However, this is not in line with the information concept, where merging views means creating links between the different contexts. Moving persons between different context views was another option. However, there is the problem that a person can be present in several contexts at the same time. Therefore, something else had to represent the state of a telephone call, and the designers had already chosen and implemented an icon next to the names for this purpose.

According to the designers, a conference call was created by dragging the telephone icon to another context containing a second telephone call. This could be a complex task for a novice user and a better solution might be to allow the dragging of the entire name to another context, i.e. to see the person and the telephone call to the person as one unit. Choosing one of the conference call participants for a private call could also be solved by a drag & drop of the telephone icon back to the original context. Another option was to change focus to the context containing the person selected for a private conversation and then tap on his telephone icon to activate the call. When trying this operation, problems with the telephone icons were found. The icons in the contexts on the bottom bar did not reflect the state of the telephone calls within the contexts, which might cause confusion and uncertainty about the telephone call states among the users. If it is a held call, this should be indicated on the bottom bar and if it is a conference call, a symbol representing this should be displayed there.

The task scenarios showed that the integration of information and call handling did not have any simple solutions. The idea that the centre of attention could be a meeting as well as a telephone call was more complex than expected. Some designs that were an advantage from a call handling point of view proved to be inconsistent with the information context view, and vice versa.

Improvements

Not all of the suggested changes were carried out before the formal experiment. Some were judged as not critical for the evaluation, too time-consuming to do something about or not consistent with the underlying thoughts of the application. No changes were made that affected the main application design.

The feedback was improved by writing out the names of the conference members next to each other. If e.g. the telephone icon of a person was dragged to another context, the name was automatically added to this context as well.

Headings were created for the applications in Overview. By inverting them, the visibility problem of knowing where in the tile to tap was also solved.

The telephone icon design was both a visibility problem and a consistency problem. Creating new cellular telephone icons that behaved in the same way as the stationary ones solved parts of this.

Some letters were displayed in Overview for the Contacts and Notes applications to indicate what the tiles contained.

Text was written out on the background that explicitly said what effect a background tap had.

Discussion

As mentioned above, this informal evaluation is very well suited to being applied early in the design process. At the beginning of this heuristic evaluation, the PowerCom application was already implemented and running. The ideal would be to start with a paper mock-up, do an informal usability evaluation of that, then move on to a presentation program on a computer where the screens can be altered according to the users’ actions. Only after these low fidelity prototypes have been evaluated should the actual implementation be considered. The main reason for this is that it is very difficult to change some parts of the interface when it has been fully implemented. Another reason is that the designer/programmer is less willing to make changes late in the process. He/she has then created a complete application and might feel it is not his/her design if too many things are changed according to a usability evaluator’s opinions. A third reason is that time is often a critical factor at the end of the design process. During this expert evaluation, all three of these problems were experienced.


The KeystrokeMapper

Introduction

Efficiently logging user actions and efficiently describing them is a problem many usability evaluators experience. Existing tools do not provide an opportunity to visualise user interaction and most analysis is done on a text level. Noldus Observer Pro and UsabilityWare [Noldus 2000] are two tools made to support the evaluation process, but they are only designed to tag and classify deviating actions. When the user follows the Optimum path [Mohageg 1992], it is not indicated in any way. The Optimum path has been defined as the shortest navigation path through the interface in order to solve a task. This is the path a skilled, experienced user would take to accomplish a task. Sometimes several equally good Optimum paths can exist. The KeystrokeMapper is a data visualisation tool that proposes a raw way of visualising and annotating novice user behaviour when interacting with an interface compared to an experienced user's Optimum path [Goldstein, Werdenhoff & Backström 2000]. This is done by indicating the Optimum path as a straight diagonal string of keystroke actions and in the same graph plotting the user keystrokes according to a specific annotation. So far, the KeystrokeMapper exists only as a paper-based mock-up.

An example of how the KeystrokeMapper can be used during the experiment is shown in Figure 10 below. The two parts of the picture show how to log the user's actions, either in real time or when watching video recordings of a user solving a predefined task. The left picture (Figure 10a) displays the starting state where only the Optimum path is plotted. As the user performs keystroke actions, they are indicated in the graph by the evaluator (Figure 10b).

Figure 10 (ab). An example of how the KeystrokeMapper can be used to log a novice user's actions compared to the Optimum path for a predefined task.

If the user switches to or from the Optimum path, it is indicated by an arrow. The keystroke actions are considered at a reasonably high level: a double tap is considered as one action, several consecutive scrolling taps are also considered as one action, and so is entering information in a text area. The annotation proposed by Goldstein, Werdenhoff & Backström [2000] is the following:

(•••) Optimum path keystroke actions performed by the user appear as filled circles.

(ooo) Optimum path keystroke actions not mimicked by the user are left unfilled.

(aaa) Alternative user keystroke actions to solve a task are indicated this way. They may not be as effective as the Optimum path, but they will eventually lead the user to an accomplished task or a subtask.

(+++) Deviating user keystroke actions that are not necessary to take the user closer to solving a task or a subtask.

(P) Task successfully passed.
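The annotation above can be read as a simple classification of each logged action. The following Python sketch is only an illustration under stated assumptions: the function name, the action labels and the example task are hypothetical, the set of alternative actions must be supplied by the evaluator, and the order of actions on the Optimum path is ignored for simplicity.

```python
FILLED, UNFILLED, ALTERNATIVE, DEVIATING = "•", "o", "a", "+"


def annotate(optimum_path, user_actions, alternatives=frozenset()):
    """Return (action, symbol) pairs for one user and one task.

    optimum_path: ordered actions of the Optimum path.
    user_actions: ordered actions the user actually performed.
    alternatives: actions the evaluator judges to lead towards the goal
                  although they are not on the Optimum path.
    """
    marks = []
    performed = set()
    for action in user_actions:
        if action in optimum_path:
            performed.add(action)
            marks.append((action, FILLED))
        elif action in alternatives:
            marks.append((action, ALTERNATIVE))
        else:
            marks.append((action, DEVIATING))
    # Optimum path actions the user never mimicked stay unfilled.
    for action in optimum_path:
        if action not in performed:
            marks.append((action, UNFILLED))
    return marks


optimum = ["open Contacts", "tap letter A", "tap Anna", "tap phone icon"]
user = ["open Calendar", "open Contacts", "scroll list",
        "tap Anna", "tap phone icon"]
for action, symbol in annotate(optimum, user, alternatives={"scroll list"}):
    print(symbol, action)
```

In this made-up log the user first enters the wrong application (+), scrolls instead of using the letter shortcut (a), and never performs the shortcut step, which therefore remains unfilled (o).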

As mentioned, the KeystrokeMapper only exists as a paper and pencil mock-up and has not been evaluated yet. Therefore, it was decided to use the tool in this experiment to see if it is applicable during a usability experiment and to investigate the possibilities to improve the visualisation of a novice user's path compared to the Optimum path by e.g. aggregating the results from many plots into one. One drawback of this method is that a separate plot has to be made for each user and each task. The total number of plots becomes subjects*tasks*interfaces.
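As a worked instance of this product, the numbers of this study (18 subjects, seven tasks and three interfaces) can be inserted; the variable names below are chosen for illustration only.

```python
subjects, tasks, interfaces = 18, 7, 3

# One plot per subject, task and interface.
total_plots = subjects * tasks * interfaces
print(total_plots)  # 378
```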

Evaluation

Temporal aspect and annotation

In the original version proposed by Goldstein, Werdenhoff & Backström [2000], the temporal aspect is not considered; each keystroke action is only visualised once. The plot in Figure 10b only indicates a deviation from the Optimum path and not how many keystroke actions were needed. If the user performs an action twice, it is visualised by an arrow going back to that action. The question is what happens if the user's behaviour is repetitive and he/she performs the same sequence several times. Goldstein, Werdenhoff & Backström [2000] have mentioned this problem but not suggested a solution. By adding a temporal aspect and moving to the right all the time when the graph is plotted, the visualisation might be simpler to understand as well as more descriptive. Another advantage is that the usage of the KeystrokeMapper in real time during an experiment is improved. The complexity for the experimenter is significantly decreased when not having to keep track of arrows back and forth and instead just moving right along the timeline. It might be added that repetitive behaviour is very common among novice users.

When adding a temporal aspect, the annotation was modified slightly. If the user follows the Optimum path without time latency, it is annotated with filled circles (•••) as explained before.


If the user, after having had some trouble, finds the right way and follows the Optimum path with a time latency, it is annotated with filled circles in a grey shade, as in Figure 11. Another way to improve the visualisation and give the readers a feeling for what the action sequence represents is to plot the deviating keystroke actions (+++) as a curved line symbolising the user's trouble finding the right way. Alternative actions (aaa) leading towards the goal can be visualised as a straight line towards the Optimum path (Figure 11). This was also the version of the KeystrokeMapper used in this evaluation. Some example user paths can be found in Appendix 6.

Figure 11. The KeystrokeMapper version 2 with a temporal aspect added. This is the same task as in Figure 14.
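One possible reading of the temporal version is sketched below in Python. It is an interpretation, not the procedure used to draw the figures: the function name and the style labels are assumptions, x is taken to advance with every logged action, and y is taken to be the number of Optimum path steps completed so far.

```python
def plot_points(optimum_path, user_actions):
    """Return (x, y, style) for every user action on a timeline plot.

    'filled' means on the Optimum path with no delay, 'grey' means on
    the Optimum path after a delay, 'off' means any other action.
    """
    points = []
    progress = 0  # Optimum path steps completed so far
    for x, action in enumerate(user_actions, start=1):
        if progress < len(optimum_path) and action == optimum_path[progress]:
            progress += 1
            # No delay only if every action so far was an Optimum step.
            style = "filled" if x == progress else "grey"
        else:
            style = "off"
        points.append((x, progress, style))
    return points


print(plot_points(["A", "B", "C"], ["A", "X", "B", "C"]))
# [(1, 1, 'filled'), (2, 1, 'off'), (3, 2, 'grey'), (4, 3, 'grey')]
```

Because x only ever increases, a repeated sequence simply extends the plot to the right instead of requiring arrows back to earlier actions.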

The temporal aspect can also be extended into a continuous form by using the ideas presented by Urokohara et al. [2000]. They have suggested a method called NEM (Novice Expert ratio Method), which analyses the completion time for each successive keystroke in order to calculate a novice/expert user time ratio. A high ratio for a keystroke action is an indication of a usability problem. However, no visualisation of the user path towards the goal is made. By combining NEM with the KeystrokeMapper, we might get both a visualisation of the user's path and an indication of the time spent on each keystroke (Figure 12).

Figure 12. The KeystrokeMapper with a continuous temporal aspect added.
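The novice/expert time ratio per keystroke action can be sketched as follows. The function names, the example times and in particular the threshold of 3.0 are assumptions for illustration; no threshold value is given in the text above.

```python
def nem_ratios(novice_times, expert_times):
    """Novice/expert time ratio for each keystroke action (same order)."""
    if len(novice_times) != len(expert_times):
        raise ValueError("both lists must cover the same keystroke actions")
    return [n / e for n, e in zip(novice_times, expert_times)]


def flag_problems(ratios, threshold=3.0):
    """Indices of actions whose ratio exceeds an assumed threshold."""
    return [i for i, r in enumerate(ratios) if r > threshold]


novice = [2.0, 9.0, 1.5, 12.0]   # seconds per action, made-up values
expert = [1.0, 1.5, 1.0, 2.0]
ratios = nem_ratios(novice, expert)
print(ratios)                 # [2.0, 6.0, 1.5, 6.0]
print(flag_problems(ratios))  # [1, 3]
```

The flagged indices point at the keystroke actions where the novice lost most time relative to the expert, which are the places to inspect in the corresponding plot.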


By grouping the deviating and alternative actions together, we know what kind of action it is and it does not have to be written out in the plot. In this way, cluttering the graph with too much information is avoided.

Scalability

When the total number of keystroke actions increases, the paper size becomes a problem. However, there are many ways to solve this. The resolution can be varied depending on the user's actions: if a completely wrong application is entered, a less detailed plot can be applied to that specific part. When the user then gets back on track, the resolution can be increased again. The situation that the number of steps to solve a task exceeds the number possible to visualise in one plot might arise anyway. In that case, the user's path can first be plotted on a higher level, and if any deviations from the Optimum path have occurred, they can be analysed in detail in a separate graph using a higher resolution. Still, there is a limit of around 40 keystroke actions for a plot in portrait mode and 60 for a plot in landscape mode when using A4 paper. This limitation is not relevant on a large screen with software using scrollbars.

The second scalability problem is that many plots are needed to visualise all the users' behaviour. It would be very useful if the results from all the users on one task could be aggregated onto one plot. Siirtola [2000] has proposed a way to directly manipulate parallel coordinates where he describes how to visualise and summarise a set of polylines. An example of what a parallel coordinate plot can look like is displayed in Figure 13 below [Ward 2000].

Figure 13. A parallel coordinate plot of polylines showing the relation between different car engine factors. Each line symbolises the relations for one single car engine. Used with permission of Matt Ward [2000].

Multiple novice user paths can be seen as a set of polylines. By summarising and visualising them in a parallel coordinate plot using the methods Siirtola [2000] suggests, the patterns and relations between different user actions could be revealed.


If we look at Figure 13, a low value on the first variable seems to produce a high value on the second variable considering all the different combinations of car engines. In the same way, the relation between different procedural steps in the novice user's path could be indicated by displaying several paths in the same plot. Unfortunately, some drawbacks were found when using the method in combination with the KeystrokeMapper. To start with, in Figure 13 the number of polylines is several hundred, which makes the patterns visible. In a usability evaluation the number of subjects, and hence polylines, is very seldom more than 20, and this is not enough for a parallel coordinate visualisation to clearly display patterns in the form of common novice user paths. The variables in Figure 13 have additive properties and the exact position is of less importance when finding relations. If we consider the KeystrokeMapper, the variables (actions) are of a more substitutive character and the exact position is of importance to understand the users’ path. Another problem concerns the annotation: when the number of polylines increases, it is impossible to write out the letters symbolising alternative and deviating actions without cluttering the plot. Just removing them is not a very good solution either. One user can perform a keystroke action which in that context symbolises a deviating action, while another user can at the same place in the plot perform an action that follows the Optimum path towards the goal.

Discussion

The KeystrokeMapper is a very good and supportive tool when viewing all the video recordings from an experiment and analysing the results. If the user's keystroke actions for each task are carefully logged in a plot, the video only has to be viewed once; all the analysis can then be done by looking at the plots. Task completion times, keystrokes, Optimum path usage and deviations from the Optimum path can all be gathered on one page. These results can then be compared and common problem areas for each task identified. The plot is also a very good way to visualise the user's actions and a better alternative than writing a long story in words.

The KeystrokeMapper, both as suggested by Goldstein, Werdenhoff & Backström [2000] and as modified in this paper, is a qualitative rather than quantitative method and more suitable to visualise individual user paths, typical deviations from the Optimum path and alternative sequences. It is a tool that can be of help when identifying areas of interest and appropriate parameters for a quantitative analysis.

If implemented as a computer-based product, the video recordings and the plots can be linked to each other and viewed simultaneously, both when entering the user path and when analysing it. Intelligent algorithms can be used to search for patterns, find common problem areas and then construct typical interaction sequences from the plots that have been entered into the system. Some kind of hierarchical visualisation is another possibility, where the most common paths for problem areas can be organised into separate visualisations. Tables containing quantitative values of which particular keystroke actions are chosen in certain situations can be constructed. What percentage of the subjects are still using the Optimum path after a number of keystrokes is another question that this software could answer. If time stamps are entered into the system, analyses can be made of single keystrokes in a way similar to what has been proposed by Urokohara et al. [2000].
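The question of how many per cent of the subjects are still on the Optimum path after a number of keystrokes could be answered along the following lines. This is a minimal Python sketch of a hypothetical computer-based version; the function name and the example paths are made up, and "still on the Optimum path" is taken to mean that the first k logged actions equal the first k Optimum path actions.

```python
def share_on_optimum(optimum_path, user_paths, k):
    """Percentage of users whose first k actions follow the Optimum path."""
    prefix = optimum_path[:k]
    on_path = sum(1 for path in user_paths if path[:k] == prefix)
    return 100.0 * on_path / len(user_paths)


optimum = ["A", "B", "C", "D"]
users = [
    ["A", "B", "C", "D"],        # follows the Optimum path throughout
    ["A", "B", "X", "C", "D"],   # deviates at the third action
    ["A", "Y", "B", "C", "D"],   # deviates at the second action
    ["Z", "A", "B", "C", "D"],   # deviates at the first action
]
for k in range(1, 5):
    print(k, share_on_optimum(optimum, users, k))
```

For these four made-up paths the share drops from 75 per cent after one keystroke to 25 per cent after three, which shows at which step most users leave the Optimum path.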


The experiment

The usability of information handling and call handling on PowerCom was evaluated by benchmarking it against two interfaces already existing on the market. The first one was the Nokia 9110 Communicator and the second one was Windows CE complemented by a Nokia 6110 cellular telephone. Independent variables were the Usability & Interaction laboratory, the subjects, the interfaces and the tasks. The subjects carried out the same seven tasks on all three interfaces in the same order and several dependent variables were analysed. The objective variables effectiveness, efficiency, Optimum path and deviating keystrokes were recorded using digital video. The subjective variables mental workload and subjective satisfaction were recorded through questionnaires.

Hypotheses

The following null hypotheses were set for the three interface conditions:

• No difference in effectiveness, i.e. level of completion, for any task.
• No difference in completion time for any task.
• No difference in the number of keystrokes for any task.
• No difference in the use of Optimum path for any task.
• No difference in mental workload for any scale.
• No difference in subjective satisfaction for any statement.

The hypotheses were tested using a Repeated measurement general linear model (GLM) in the SPSS 10.0 software [SPSS 1999]. The significance level was set to the standard 5% and the overall differences were analysed using a univariate approach. More information about this analysis and how to interpret the results is found on page 44 in the Results chapter. Multiple pairwise comparisons were Bonferroni adjusted. Each hypothesis, and hence dependent variable, was tested and analysed for each task, scale or statement using a separate statistical model of type:

Output = average + interface effects + user effects + learning effects + tiredness effects + … + usability lab (Equation 2)

The different outputs describe the same thing using different words, namely the usability of the interface. The interaction between task effects and interface effects exists, but has not been considered since each task is evaluated independently and the same tasks are presented on all three interfaces. This model is significantly simplified due to the experimental design explained in the next chapter.

Experimental design

A 3x7 factor (Interface (PowerCom, Nokia 9110 Communicator, Windows CE and Nokia 6110)) x (Task (1-7)) within subject repeated measurement was used. The last term means that all subjects did all tasks on all interfaces. Egan [1988] has shown that individual differences are often larger than the differences between interfaces or differences in training procedures. This experimental design minimises these individual differences, because a person who does well on one interface is likely to do well on all interfaces. Similarly, a person who performs badly probably does so on every interface. By using a repeated measurement design, the problems with selecting narrowly defined categories of users can also be eliminated. A narrow selection is also inappropriate in this case, because PDAs and cellular telephones are intended to be used by the public at large. To avoid learning effects and tiredness effects, the three interfaces were counterbalanced according to all possible permutations of interface orderings. The 18 subjects were randomly assigned to a time for the experiment and hence randomly divided into six groups (Table 1).

Group  Users  Interface 1           Interface 2           Interface 3
1      1-3    PowerCom              Nokia Communicator    WinCE and Nokia 6110
2      4-6    Nokia Communicator    WinCE and Nokia 6110  PowerCom
3      7-9    WinCE and Nokia 6110  PowerCom              Nokia Communicator
4      10-12  WinCE and Nokia 6110  Nokia Communicator    PowerCom
5      13-15  Nokia Communicator    PowerCom              WinCE and Nokia 6110
6      16-18  PowerCom              WinCE and Nokia 6110  Nokia Communicator

Table 1. The user groups and their interface presentation order according to all possible permutations of three elements.
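The counterbalancing scheme can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name and the sequential numbering of subjects are hypothetical, the random assignment of subjects to time slots is not modelled, and the orderings come out in a different sequence than the groups in Table 1.

```python
from itertools import permutations

# Interface labels as used in Table 1.
INTERFACES = ("PowerCom", "Nokia Communicator", "WinCE and Nokia 6110")


def counterbalanced_groups(n_subjects=18):
    """Split subjects 1..n evenly over all orderings of the interfaces."""
    orders = list(permutations(INTERFACES))  # 3! = 6 presentation orders
    per_group, remainder = divmod(n_subjects, len(orders))
    if remainder:
        raise ValueError("subjects must divide evenly over the orderings")
    groups = []
    for g, order in enumerate(orders):
        first = g * per_group + 1
        users = list(range(first, first + per_group))
        groups.append((users, order))
    return groups


groups = counterbalanced_groups()
for number, (users, order) in enumerate(groups, start=1):
    print(number, f"{users[0]}-{users[-1]}", " | ".join(order))
```

With 18 subjects and six orderings, each interface is presented first, second and third to exactly six subjects, which is what balances learning and tiredness effects across the interfaces.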

When using this experimental design, the model in Equation 2 is reduced by a couple of factors. The interface effects are of course still there, but the user effect is removed by the repeated measurement design. The learning effects and tiredness effects are balanced through the varied device order. The effects from the usability lab were constant and can be neglected. Therefore, the final model becomes very simple:

Output = average + interface effects (Equation 3)

If the null hypotheses are true, the interface effects are zero and no differences exist between the three interfaces for any of the Output values.
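How a within subject design removes the user effect before the interface effect is tested can be shown with the textbook sums-of-squares decomposition for a one-way repeated measurement analysis. This is a minimal sketch, not the SPSS GLM procedure used in the study: the function name and the example data are invented for illustration, and neither the sphericity check nor the Bonferroni adjusted pairwise comparisons are included.

```python
def repeated_measures_f(scores):
    """One-way repeated measurement ANOVA (univariate approach).

    scores[s][i] is the value recorded for subject s on interface i.
    Returns (F, df_interface, df_error).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of interface conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    interface_means = [sum(row[i] for row in scores) / n for i in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_interface = n * sum((m - grand) ** 2 for m in interface_means)
    # The user effect is estimated and taken out of the error term.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_interface - ss_subjects

    df_interface = k - 1
    df_error = (k - 1) * (n - 1)
    f = (ss_interface / df_interface) / (ss_error / df_error)
    return f, df_interface, df_error


# Invented example: three subjects, three interfaces.
f, df1, df2 = repeated_measures_f([[1, 2, 3], [2, 4, 3], [3, 3, 6]])
print(f, df1, df2)
```

With three interfaces and 18 subjects the degrees of freedom become 2 and 34, which is the form in which the F values are reported in the Results chapter.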

Independent variables

Usability & Interaction laboratory

The department of Application research at Ericsson Research in Kista has access to a Usability Laboratory for evaluations of different interfaces and systems. The laboratory consists of two rooms separated by a one-way mirror (Figure 14 on the next page). The rooms are sound insulated and the windows are covered by dark curtains. The control room has computers and video recording equipment that the experimenter uses to control the cameras and microphones in the test room. In this experiment two cameras in the test room were used. One was located in the ceiling above the test persons and videotaped the interfaces; the other was located in front of the subject to record facial expressions. The picture from the camera in the ceiling was recorded all the time and the picture recording the faces was sometimes mixed into the resulting movie. This makes it easier to interpret the users’ actions when viewing the film. The experimenter communicated with the subjects by a microphone that could be switched on and off.


The sounds from the test room were recorded continuously and therefore the subject could say something to the experimenter at any time.

Figure 14. Two pictures taken from the Usability & Interaction Laboratory showing the control room and the test room.

Subjects

The 18 subjects were mainly Ericsson employees and were chosen from the Usability & Interaction Laboratory internal database. This database can only be found on the Ericsson Intranet and consists of volunteers that have registered as willing to participate in usability studies. Some master thesis workers and vacation staff were also recruited. The subjects were 13 men and five women and ranged in age from 19 to 43 (mean=29). They all owned a cellular telephone and used it on a daily basis. Most of them had tried some form of call handling but only a few had tried advanced operations such as creating a conference call. Ten of them owned or had owned a PDA; the other eight had tried using a PDA or seen a person using one. They were all very familiar with computers and the operating system Windows. None of the subjects had any experience of the three interfaces that were used in the study. The subjects all spoke Swedish and English fluently. At the completion of the test, they received a reward of four cinema tickets.

Interfaces and apparatus

PowerCom

This application running on a Casio Cassiopeia E-105 has been described in the chapter about PowerCom. It can be seen as an interface in this context because no possibility to exit was given once it had been started. PowerCom is the only interface of the three using direct manipulation with a stylus.

Nokia 9110 Communicator

The Nokia 9110 Communicator (see Figure 15) can be used as an ordinary cellular telephone in closed mode, but can also be used as a kind of integrated handheld computer and cellular telephone in open mode. The operating system is called GEOS and consists of several partly separated applications. The ones of interest in this experiment are Digital Cellular Telephone, Internet with email, Contacts, Notes, and Calendar. The navigation between the different applications is done using the hard keys above the keyboard. The display is not touch sensitive. Therefore, the navigation through menus and selection of menu items within an application requires the soft keys to the right of the keyboard and/or the arrow keys. A few shortcuts are possible by entering letters directly. If we compare this interface to PowerCom, we see an interface with a smaller breadth at each level. Sometimes, this means a larger navigation depth, in submenus for instance. There are several reasons for this difference. One is the separation of applications. Only objects from one domain/application are visible at a time. This reduces the number of objects on the screen. Another reason is the restricted canonical vocabulary: the only atomic element [Cooper 1995] is a keypress. This means that the breadth is decided by the number of keys to choose between together with the number of objects on the screen. If the keyboard is considered as one unit, both of them are restricted.

Technical specification

Display - Monochrome, 360 x 120 pixels, not touch sensitive

CPU - Embedded AMD 486

Built in - Memory card, microphone, speaker, small keyboard

Weight - 253 grams

Dimension - 27 x 158 x 56 mm

Figure 15. The Nokia 9110 Communicator in open mode.

The interface is very straightforward and mainly idiomatic since the options or objects are presented in long lists, highlighted by using the arrow keys and selected by pressing one of the softkeys situated on the right hand side of the screen. However, a couple of metaphor exceptions exist, e.g. the person highlighted in the Contacts application is presented as a small business card to the right of the list. The telephone application uses a simple but effective room metaphor to visualise the different calls.

The interaction style is explicit Object-Function [Kunkel et al. 1995]. An object is chosen with the arrow keys and then a function by using one of the four softkeys. More complex actions, e.g. the creation of a conference call and the selection of a conference member for a private call, are carried out in a Function-Object style.

Windows CE + Nokia 6110

Windows CE is the standard operating system that comes with the Casio Cassiopeia E-105. It has only applications for information handling and needed to be complemented by a Nokia 6110 cellular telephone for the call handling tasks. This condition is in fact two separate interfaces. However, from the point of view of the experiment, it can be seen as one interface with two parts that are used to solve the seven tasks. The two devices are pictured in Figure 16 on the next page.


Technical specification Nokia 6110

Display - Five lines for text, numbers and graphics. Approximately 15 letters per line

Memory - 50 names and numbers in telephone, 250 in SIM card and an unknown number of Calendar items

Weight - 137 grams (slim battery)

Dimensions - 28 x 47 x 128 mm

A technical specification of the Casio device is presented in the PowerCom chapter.

Figure 16. Windows CE running on a Casio Cassiopeia E-105 and a Nokia 6110.

If we first look at Windows CE it is, like PowerCom, a graphical interface implementing direct manipulation. It is a minimised version of Windows 95/98/NT, which uses the desktop metaphor. The screen size on a handheld computer is much smaller though, and this metaphor is not very suitable. Although many icons present visual metaphors, Windows CE can be regarded as a partly idiomatic interface, unless the similarity to Windows 95/98/NT is itself regarded as a metaphor. Most users have extensive experience of at least one of these operating systems for stationary computers. The designers have tried to keep as many features as possible from the PC interface, which means the number of objects on the screen is very large. The canonical vocabulary of Windows CE can be compared to PowerCom although the atomic elements [Cooper 1995] are limited to tap only. The compounds consist of tap, double-tap and tap-and-hold and form quite a large vocabulary. This in combination with the many objects results in an interface with a large breadth of actions available at each point. The navigation depth is within the recommended six windows and most operations are carried out in an Object-Function style.

The Nokia 6110 is the opposite of Windows CE. The navigation in menus can only be done using the softkeys and arrow keys next to the screen. As for the Nokia 9110 Communicator, the canonical vocabulary is very restricted since the atomic elements consist of keypresses only. If we consider the number buttons as one unit, we get a total breadth of at most five at each level including the two softkeys and two arrow keys. Logically the navigation depth has to be increased significantly with the use of submenus. The interaction is explicit Function-Object style all the way and since the interface is based on menus and submenus it must be classified as mainly idiomatic. However, some visual metaphors in the form of icons are used to indicate telephone call states. In order to get a clear picture of the telephone display and these metaphorical elements when the subjects were making a telephone call, handsfree equipment was used.


Apparatus for simulating answering persons

To make the conditions for all three interfaces as equal as possible, real telephone calls were not used for the two Nokia interfaces either. The sound clips played back on the PowerCom prototype were used here as well. In this case, two solutions were adopted. One was that the sound clips were played back on a laptop computer and the answering telephone was pressed against the computer’s loudspeaker. The second was that the sound clips were saved as a voice mail message and the answering telephone was turned off. The apparatus required for these operations was:

One IBM Thinkpad 760XL laptop computer was used to play back voice messages for the simulated conversation. Four wav-files were recorded and simulated four different persons. The procedure was that the experimenter answered the call from the subject with one of the cellular telephones mentioned below, started the sound clip and then pressed the telephone against the loudspeaker of the computer.

An Ericsson GH388 cellular telephone simulated two of the answering persons. Two SIM cards were used to enable different telephone numbers for the fictitious persons. This is required in certain call handling situations if you want the stored name associated with the telephone number to be displayed on the device screen.

Two voice mail messages were recorded using an Ericsson SH888 cellular telephone. The reason for this was to minimise the load on the experimenter and automate as much as possible. The voice mail messages were changed between the call handling tasks.

An Ericsson DT368 DECT telephone was included to simulate one person and increase the realism by letting different stored persons in the devices have different telephone numbers.

Tasks

Four information handling tasks (1-4) that represent the normal usage of a PDA were created. Three typical call handling tasks (5-7) were designed to test the most common operations available. Some of the tasks were quite complex and therefore divided into subtasks. All the tasks were given a detailed scenario to give the subjects a meaningful context and were identical for all three interfaces. The intention was to present the tasks in an ascending order of complexity. This presentation order and their Optimum path complexity described in terms of breadth vs. depth are listed in Table 2 on the next page. As introduced in the chapter covering the Interface Characteristics, B^D is used to describe the task for each interface (B=breadth, D=depth).

An interface is not regularly ordered and cannot be classified in this way without using average breadth values. The approximate breadth has been measured considering Equation 1, i.e. each compound that can be applied to each object. Only meaningful (according to the experimenter) choices and actions have been taken into account. For the tasks divided into subtasks (tasks 1, 6 and 7), breadth and depth have been calculated for the whole task as well as for the subtasks individually. The breadth value for the whole task is an average value and the depth value for the whole task includes all the relevant steps. The tasks as presented to the subjects can be found in Appendix 1.


Category              Task  Description                                      PowerCom  Nokia 9110 Comm.  WinCE + Nokia 6110
                                                                             B^D       B^D               B^D
Information handling  1a    Find contact info                                8^2       7^3               9^4
                      1b    Change contact info                              12^4      5^3               10^4
                      1     -                                                10^6      6^6               9^8
                      2     Find note info                                   8^5       6^5               8^4
                      3     Find meeting info                                10^7      6^3               10^5
                      4     Find emails about meeting                        8^8       6^11              9^7
Call handling         5     Call a person                                    8^4       7^5               6^4
                      6a    Answer waiting call and put ongoing on hold      5^2       5^1               5^1
                      6b    Swap between two calls                           8^1       10^2              5^1
                      6c    End a call & continue held                       10^2      5^2               4^1
                      6     -                                                9^5       8^5               5^3
                      7a    Call a second person and put the first on hold   6^3       7^3               5^4
                      7b    Create a conference call                         18^1      6^1               5^3
                      7c    Start a private call with a conference member    18^1      5^3               4^4
                      7     -                                                12^5      6^7               5^11

Table 2. The presentation order of the tasks and their complexity measured in breadth and depth considering the Optimum path for each task on the three interfaces. The approximate breadth values are calculated according to Equation 1.
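The relation between subtask and whole-task values can be checked with a small sketch. It reads each Table 2 entry as breadth raised to depth, an interpretation supported by the fact that the subtask depths add up to the depth given for the whole task. The helper names are hypothetical, and treating B^D as the number of end points in a regular menu tree is a simplification, since the text notes that a real interface is not regularly ordered.

```python
# (breadth, depth) pairs for the subtasks 7a, 7b and 7c, read from Table 2.
TASK7 = {
    "PowerCom": [(6, 3), (18, 1), (18, 1)],
    "Nokia 9110 Communicator": [(7, 3), (6, 1), (5, 3)],
    "WinCE + Nokia 6110": [(5, 4), (5, 3), (4, 4)],
}

# Whole-task (breadth, depth) values for task 7, read from Table 2.
TASK7_WHOLE = {
    "PowerCom": (12, 5),
    "Nokia 9110 Communicator": (6, 7),
    "WinCE + Nokia 6110": (5, 11),
}


def total_depth(subtasks):
    """Depth of the whole task: all relevant steps of the subtasks."""
    return sum(depth for _, depth in subtasks)


def search_space(breadth, depth):
    """End points of a regular tree with the given breadth and depth."""
    return breadth ** depth


for interface, subtasks in TASK7.items():
    breadth, depth = TASK7_WHOLE[interface]
    assert total_depth(subtasks) == depth
    print(interface, depth, search_space(breadth, depth))
```

Under this reading PowerCom offers the widest choice at each step of task 7 but the shortest path, while the Nokia 6110 combination offers few choices per step along a path more than twice as long.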

Dependent variables

Effectiveness

Effectiveness can be explained as the level of completion. It is normally measured as the percentage of subjects that pass a task on the first try, or as the percentage of tasks passed on the first try for an interface. In this experiment, both information and call handling tasks were presented. They are very different when it comes to fault tolerance. When solving an information handling task several errors may be accepted by the user, but call handling tasks involve other humans that might not accept more than a few attempts. Therefore, the effectiveness was measured using a specially designed scoring system where each subtask was given a score between 0 and 1 points.

• 1 point was given for each subtask successfully solved on the first try.
• 0.5 points were given for each subtask solved on the second try.
• 0 points were given if the user failed on the second attempt as well.
• 0.25 points were subtracted from the score if the experimenter had to give a hint.

Asking for help was also scored as fail. Subtasks that were not reached on the first attempt because of an early error in a preceding subtask were also scored as fail. This means that if the subject failed the first subtask two times in a row, a score of 0 points was given to the following subtasks as well. During the test some situations occurred that had not been considered before and a few adjustments to the scoring system were made. If a subject e.g. phoned the wrong person, 0.25 points were subtracted. If a subject by mistake deleted some data, the same reduction in score was administered. By using this scoring system, the call handling tasks were judged a bit harder than the information handling tasks and a maximum of two attempts was allowed for the call handling tasks.
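The scoring rules can be summarised in a small function. This is a sketch under assumptions rather than the procedure actually used by the experimenter: the function and parameter names are hypothetical, the 0.25 deduction is assumed to apply once per hint or mistake, and the score is assumed never to fall below 0 because each subtask is stated to score between 0 and 1.

```python
def subtask_score(solved_on_attempt, hints=0, mistakes=0,
                  asked_for_help=False, reached=True):
    """Score one subtask on the 0 to 1 scale.

    solved_on_attempt: 1 or 2 if solved on that try, None if both tries failed.
    hints: number of hints given by the experimenter (0.25 points each).
    mistakes: number of further 0.25 deductions, e.g. phoning the wrong
        person or deleting data by mistake.
    asked_for_help: asking for help is scored as fail.
    reached: False if the subtask was never reached because a preceding
        subtask failed.
    """
    if asked_for_help or not reached or solved_on_attempt is None:
        return 0.0
    base = {1: 1.0, 2: 0.5}[solved_on_attempt]
    # Assumption: the score is clamped so that it never becomes negative.
    return max(0.0, base - 0.25 * (hints + mistakes))


print(subtask_score(1))            # solved on the first try
print(subtask_score(2, hints=1))   # second try, after one hint
print(subtask_score(None))         # failed twice
```

Summing the subtask scores over the 12 subtasks gives the total score per subject and interface.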

Efficiency

Efficiency has been defined as the cognitive resources expended in relation to the accuracy and completeness of the achieved goals [Gleiss 1992]. In this experiment, efficiency was simply measured as completion time and number of keystrokes. The time was measured from the moment the subject had correctly understood the task and started solving it until the moment an answer had been written down and the subject reported the task as accomplished. This completion time was compared to an optimum time, which serves as a usability criterion. The optimum time was measured as the time it takes for an expert to solve the task using the Optimum path (discussed below) and doing the different operations at a normal and relaxed pace. The total number of keystrokes used for a task is closely related to the completion time. A large number of keystrokes is often an effect of a long completion time (or vice versa).

Optimum Path

The Optimum path [Mohageg 1992] has been defined in the chapter about the KeystrokeMapper. In this experiment the tasks were quite small and the Optimum path could be described close to a keystroke level. A simplified version of the Optimum path measure was applied, where using Optimum path = 1 and not using Optimum path = 0. This discrete way of measuring the Optimum path usage can be a bit harsh in certain situations. If a user makes one or two extra operations, the Optimum path variable is set to zero. However, the user still has solved the task in a very good way. Therefore, this measure was complemented by a deviation metric as proposed by Mohageg [1992]. As can be seen in Equation 4, the total number of keystrokes required for a novice user to solve a task is divided by the number of relevant keystrokes, i.e. the Optimum path. A large keystroke deviation ratio is a sign of uncertainty during performance.

D = (no. of keystrokes) / (no. of relevant keystrokes) (Equation 4)
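The two Optimum path measures can be expressed directly in code. The sketch below is illustrative only: the function names and the key labels in the example are invented, and a user path is assumed to be available as a simple list of keystrokes.

```python
def optimum_path_used(user_keystrokes, optimum_path):
    """Binary measure: 1 if the user followed the Optimum path exactly, else 0."""
    return 1 if list(user_keystrokes) == list(optimum_path) else 0


def keystroke_deviation(user_keystrokes, optimum_path):
    """Equation 4: total keystrokes divided by relevant (Optimum path) keystrokes."""
    return len(user_keystrokes) / len(optimum_path)


# Hypothetical example: the user makes one detour of two extra keystrokes.
optimum = ["Contacts", "Smith", "Call"]
user = ["Contacts", "Notes", "Back", "Smith", "Call"]

print(optimum_path_used(user, optimum))
print(round(keystroke_deviation(user, optimum), 2))
```

In this example the binary measure drops to 0 although the detour is small, while the deviation ratio of about 1.67 shows how far the user actually strayed; a ratio of 1 corresponds to following the Optimum path.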

Mental workload assessment

A multidimensional rating scale called NASA-TLX [Hart and Staveland 1988] was used. It reflects the mental, physical and temporal demands as well as the performance, effort and frustration levels. Each of the six scales was a single 100 mm line without numbers and with bipolar descriptors at each end (see Appendix 2). The left hand descriptor was set as Low and the right hand descriptor as High, except for the Performance scale where the left descriptor was Good and the right was Poor. This means that low values indicate positive ratings in the form of low workload and good performance and high values indicate negative ratings in the form of high workload and poor performance.

Subjective satisfaction

The subjects were asked to rate each interface according to 15 different statements. Each statement had a scale ranging from 1 to 9 with bipolar descriptors at each end (see Appendix 3). The left hand descriptor represented a low rating and the right hand descriptor a high rating for the specific part of the interface. The design of this questionnaire and the statements in it are from the examples in Shneiderman [1998, chapter 4]. The 15 statements were chosen to represent many different aspects of the interfaces, e.g. icons, terminology, system speed, first impression, etc.

Procedure

On arriving at the Usability & Interaction Lab the subjects were shown the two rooms and given a short explanation of how usability tests are carried out in general. When seated in the test room, they were asked a few questions to record their background and previous knowledge of computers and PDAs. The test situation was then explained for both information and call handling and the recorded messages and the lack of audible interaction were mentioned. In order to give the subjects a good picture of call handling, some of the main features were discussed. Then the following procedure was repeated for each interface:

• Introduction to the device and explanation of the interaction properties. This introduction to the interaction styles was done without telling the subjects how to actually perform different tasks. The purpose was only to help them overcome the initial learning step. This particularly applies to PowerCom, which has some unusual interface elements. The subjects were primed on the idiomatic features of this interface. Previous studies [Goldstein, Bretan, Sallnäs & Björk 1999] have shown that this might be necessary when evaluating new interaction styles. If not, the subjects might not even consider the possibilities.

• Five minutes of familiarisation and free browsing through the applications. The subjects had the opportunity to try different operations themselves during the familiarisation period and no video recordings were made. This fact was stressed to the subjects and they were encouraged to use trial and error. No manuals were allowed at any point.

• Performing the seven tasks. The subjects were told to solve a task, report to the experimenter when ready and wait for instructions to continue. Some communication took place during the test, e.g. if the experimenter did not understand why the user took a certain path. If the subjects had severe problems solving a task, a hint was given first and, if that was not enough, help. This whole session was recorded on video.

• Fill out the mental workload form. During this part, the subjects were of course allowed to ask questions about the scales and statements.

• Fill out the subjective satisfaction form. The same applies to this part as the previous one.

Between each interface condition a break of about ten minutes was inserted. After all three interfaces had been completed, some concluding remarks from the users were registered and they were given the reward in the form of four cinema tickets. The total time for the experiment was about 150 minutes including breaks. The whole course of events is summarised and displayed in Figure 17 on the next page.


Figure 17. Course of events for the experiment.

The first four tasks were, as discussed above, information handling related and did not require any interaction from the experimenter. The last three tasks, concerning call handling, were also automated for the PowerCom interface, but the two Nokia devices required the experimenter to answer and make some telephone calls. This experimenter interaction did not affect the subject’s performance since the calls were made when the user had reached a certain point and the waiting calls were answered after a specific time interval. The times for the automated interface PowerCom were used in these procedures to get an absolutely fair comparison.

[Figure 17 content: flowchart of the session. Introduction (introduction to the usability laboratory, interview, introduction to call handling), followed by Experiment - Interface A, Experiment - Interface B and Experiment - Interface C. Each experiment block consists of an introduction (interaction explanation, familiarisation with the interface), tasks (information handling, call handling) and an assessment (mental workload (NASA-TLX), subjective satisfaction). The session ends with a final assessment (concluding remarks, debriefing and payment). Time labels in the figure: 50 min, 10 min, 40 min, 10 min, 40 min.]


Task 5. The experimenter had to answer the call by starting the sound clip on the laptop computer associated with the called person and then press the Ericsson DT368 DECT telephone against the loudspeaker of the laptop computer. When the call was ended, the sound clip was stopped and the telephone hung up.

Task 6 (abc). After the subject had successfully telephoned the first fictitious person and listened to the voice mail message recorded on the Ericsson SH888, the experimenter had to call the subject. This call was made using the Ericsson GH388 and a sound clip played back on the laptop computer according to the procedure in task 5. When the task was finished the sound clip was stopped.

Task 7 (abc). This task could not start before the experimenter had changed the voice mail message for the Ericsson SH888 telephone and switched SIM card in the GH388. The purpose of this was to make the right persons answer the subjects' telephone calls. After having made a telephone call and listened to the alternative voice mail message recorded on the Ericsson SH888, the subject was supposed to call a second person. The experimenter, now using the Ericsson GH388 with the second SIM card and another sound clip played back on the laptop computer, answered this call. The user's creation of a conference call and a private call did not require any interaction from the experimenter. When the task was finished, the sound clips were again stopped.


Results

The recorded data used in the repeated measurement analysis was assumed to be normally distributed. According to Hyperstat Online [2000], measures of reading ability, job satisfaction and memory are among the many psychological variables that are approximately normally distributed. A repeated measurement analysis also works very well even if the data are only approximately normally distributed and sometimes even with very wide deviations from normality [Hyperstat Online 2000]. However, for the differences between the interfaces to be statistically significant when using a univariate analysis approach, a sphericity test of the data must also be fulfilled. A sphericity test takes the variance of the measurements and the correlation into account, as well as the assumption of normality. In this experiment, Mauchly's test of sphericity was used to check this. Some of the differences recorded in the experiment did not fulfil this sphericity test and could therefore not be regarded as statistically significant. The complete documentation of results can be found in Appendix 4 and the statistical measures are recorded in the following Appendix 5.

Objective measures

Effectiveness

The effectiveness was regarded as the most important measure, since it indicates how many subjects passed the tasks. The analysis was carried out on a subtask level to get a more detailed result. The average score for each task and interface is displayed in Figure 18 below.

Figure 18. The average effectiveness (score) for each subtask on the three interfaces. Tasks 1-4 are information handling tasks and tasks 5-7 are call handling tasks.

The general outcome of this experiment was that the subjects performed very well using the Nokia 9110 Communicator and Windows CE + Nokia 6110. With a maximum score of 12 the Nokia Communicator received 11.53 points, while Windows CE + Nokia 6110 got 11.74 points. If we compare these results to PowerCom's 10.40 points we get a difference that cannot be neglected. When considering the call handling tasks the maximum score was 7 points. PowerCom received 5.74, while the Nokia 9110 Communicator got 6.63 and Nokia 6110 6.88 points. Note that the Nokia 6110 was used with only one hand in most cases. These high numbers regarding call handling tasks set the standard for future call handling interfaces. Nothing less than a performance close to 100 per cent for the novice user is acceptable.
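To relate the scores to the 100 per cent criterion, they can be expressed as a share of the maximum score. The figures below are the averages quoted in the text; the helper name is hypothetical and the rounding to one decimal is a presentation choice.

```python
# Average total scores (maximum 12) and call handling scores (maximum 7).
TOTAL = {
    "PowerCom": 10.40,
    "Nokia 9110 Communicator": 11.53,
    "Windows CE + Nokia 6110": 11.74,
}
CALL_HANDLING = {
    "PowerCom": 5.74,
    "Nokia 9110 Communicator": 6.63,
    "Nokia 6110": 6.88,
}


def percent_of_max(score, maximum):
    """Score as a percentage of the maximum, rounded to one decimal."""
    return round(100 * score / maximum, 1)


for name, score in TOTAL.items():
    print("total", name, percent_of_max(score, 12))
for name, score in CALL_HANDLING.items():
    print("call handling", name, percent_of_max(score, 7))
```

For call handling this gives roughly 95 and 98 per cent for the two Nokia devices against about 82 per cent for PowerCom.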

Considerable differences regarding the effectiveness existed for the total score.Comparisons between the interfaces for each task showed that no major differencesexisted between Nokia 9110 Communicator and Windows CE + Nokia 6110, but thatthe users had performed worse on the following tasks using PowerCom:

- Task 1b, change a telephone number of a contact
- Task 7b, create a conference call
- Task 7c, start a private call with a conference member

Note that no differences of importance were found between PowerCom and Windows CE for the information handling tasks (tasks 1-4).

Efficiency

Effectiveness and efficiency are two related measurements. A long completion time and a large number of keystrokes are signs of uncertainty during performance, which often results in task failure. The observed completion times and the differences between them are displayed in Figure 19 below.

Figure 19. The average completion times for each task on the three interfaces. Tasks 1-4 are information handling tasks and tasks 5-7 are call handling tasks.

[Figure 19 chart. X-axis: Task (1 to 7). Y-axis: Completion time (seconds), 0 to 350. Series: PowerCom, Nokia 9110 Communicator, Windows CE + Nokia 6110.]

All the differences mentioned for effectiveness were reflected in this measure as well. This time the differences were significant for task one, find contact and change telephone number (F[2,34] = 8.010, p = 0.001), and task seven, create a conference call and start a private call with a conference member (F[2,34] = 51.930, p < 0.001). Pairwise comparisons showed that the subjects performed tasks one and seven significantly faster using the Nokia 9110 Communicator than using PowerCom or Windows CE + Nokia 6110. For task seven, the Nokia 6110 was also significantly faster than PowerCom. A notable time difference was also found for task five, call a person and take a message. A closer look at the numbers showed that the subjects performed this task faster using PowerCom compared to using the Nokia 9110 Communicator or Nokia 6110. Again, no differences of importance existed between PowerCom and Windows CE for the information handling tasks (tasks 1-4).

If we look at the number of keystrokes used, we find a significant difference between the three interfaces for task one, find contact and change telephone number (F[2,34] = 20.032, p < 0.001). Pairwise comparisons revealed that the subjects used significantly fewer keystrokes on Nokia 9110 Communicator than on PowerCom and Windows CE + Nokia 6110 for this task. Significant differences also existed for task three, find info about a meeting (F[2,34] = 7.251, p = 0.002). This time, the pairwise comparisons displayed a difference between Nokia 9110 Communicator and PowerCom in favour of the Nokia device.

Figure 20. The average number of keystrokes for each task on the three interfaces. Tasks 1-4 are information handling tasks and tasks 5-7 are call handling tasks.

The time difference for task seven, create a conference call and then a private call, had an impact on the number of keystrokes for this task as well. The subjects used fewer keystrokes on Nokia 9110 Communicator than on PowerCom and Windows CE + Nokia 6110 for this task. Nokia 6110 also required fewer keystrokes compared to PowerCom. For task five, call a person and take a message, the situation was reversed and PowerCom was superior to Nokia 6110 in terms of the number of keystrokes used.

[Figure 20 chart. X-axis: Task (1 to 7). Y-axis: No. of keystrokes, 0 to 40. Series: PowerCom, Nokia 9110 Communicator, Windows CE + Nokia 6110.]

Optimum Path

No statistical analysis has been made of the Optimum path usage or of the relation between the completion time and an expert user's time, since it is regarded as a more qualitative analysis. Deciding when the Optimum path has been used and when not is a highly subjective decision made by the experimenter. However, the relation of a novice user's path to the Optimum path and the relation between the completion time for a novice user and the completion time for an expert user are interesting to consider. The Optimum path followed by an expert user is in most cases the designer's model of how to solve a task. Therefore, these relations give a picture of how well the designer's conceptual model maps onto the novice users' mental model [Norman 1993] and indicate where usability problems occurred.

Each interface is constructed differently and some tasks require more keystrokes and a longer completion time on one interface compared to another. Table 3 below shows the ratio (average user completion time / Optimum path time) and (average number of user keystrokes / Optimum path keystrokes) for each task on the three interfaces. A ratio higher than 2 is considered a notable deviation from the Optimum time and Optimum path and depicted in bold style. If we look at the PowerCom numbers, we can see that the tasks giving the users most trouble (tasks one and seven) have a high ratio for both completion time and keystrokes. Task five, which was performed faster using PowerCom, has a very low ratio for both measures.

Task  1            2            3            4            5            6            7
IF    time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
PC    4.29  4.75   1.53  1.27   2.32  1.38   2.44  1.75   1.19  1.28   1.38  1.32   2.37  2.17
NC    2.47  1.52   1.84  1.37   1.90  1.79   1.75  1.27   1.79  1.46   1.41  1.27   1.14  1.04
W6    3.82  2.65   2.17  2.38   1.92  1.68   1.39  2.10   2.44  2.74   1.30  1.38   1.44  1.29

Table 3. The ratio time = (average user completion time / Optimum path time) and keyst = (average number of user keystrokes / Optimum path keystrokes) for each task on the three interfaces (IF). PC = PowerCom, NC = Nokia 9110 Communicator and W6 = Windows CE + Nokia 6110. Tasks 1-4 are information handling tasks and tasks 5-7 are call handling tasks.

Task three was performed using significantly fewer keystrokes on Nokia 9110 Communicator compared to PowerCom. Still, the keystroke ratio was much higher for the Nokia device, as the table shows. This indicates that the uncertainty during performance was higher using Nokia 9110 Communicator in spite of the absolute difference in number of keystrokes. A large completion time ratio compared to the keystroke ratio indicates that the subjects spent a long time thinking about which action to perform. The opposite, a large keystroke ratio compared to the completion time ratio, is a sign of the subjects using trial and error. This is the case for many tasks solved using Windows CE, which is an interface very similar to Windows 95/98/NT that all subjects knew very well.
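The reading of Table 3 described above can be sketched in a few lines of Python. The ratios are copied from Table 3; the function names are illustrative, and reducing the comparison of the two ratios to a simple "which one is larger" test is a simplifying assumption, since the text gives no threshold for what counts as a large difference.

```python
# Ratios from Table 3: interface -> (time ratio, keystroke ratio) for tasks 1-7.
TABLE_3 = {
    "PC": [(4.29, 4.75), (1.53, 1.27), (2.32, 1.38), (2.44, 1.75),
           (1.19, 1.28), (1.38, 1.32), (2.37, 2.17)],
    "NC": [(2.47, 1.52), (1.84, 1.37), (1.90, 1.79), (1.75, 1.27),
           (1.79, 1.46), (1.41, 1.27), (1.14, 1.04)],
    "W6": [(3.82, 2.65), (2.17, 2.38), (1.92, 1.68), (1.39, 2.10),
           (2.44, 2.74), (1.30, 1.38), (1.44, 1.29)],
}
NOTABLE = 2.0  # a ratio above 2 counts as a notable deviation


def ratio(user_average, optimum):
    """Average user value divided by the Optimum path value."""
    return user_average / optimum


def notable_tasks(interface):
    """Task numbers where both the time and the keystroke ratio exceed 2."""
    return [
        task
        for task, (time_ratio, keystroke_ratio) in enumerate(TABLE_3[interface], start=1)
        if time_ratio > NOTABLE and keystroke_ratio > NOTABLE
    ]


def dominant_symptom(time_ratio, keystroke_ratio):
    """Simplified reading: the larger ratio decides (assumption, no threshold)."""
    return "thinking" if time_ratio > keystroke_ratio else "trial and error"


for interface in TABLE_3:
    print(interface, notable_tasks(interface))
```

For PowerCom this singles out tasks one and seven, the two tasks named in the text as giving the users most trouble.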

Subjective measures

Mental Workload (NASA-TLX)

Significant differences between the interfaces existed on three out of the six scales: Mental demand, Effort and Frustration level (F[2,34] > 9.516, p < 0.002). PowerCom was rated as significantly more demanding than Nokia 9110 Communicator on the three scales mentioned and as significantly more demanding than Windows CE + Nokia 6110 on Mental demand and Frustration level. Significant differences also existed between Nokia 9110 Communicator and Windows CE + Nokia 6110 in favour of the first device, regarding Effort level. Considerable differences were found for the Performance scale as well. Both PowerCom and Windows CE + Nokia 6110 were rated as more demanding than the Nokia 9110 Communicator according to the comparison made between the interfaces.

As Figure 21 shows, PowerCom was only rated better (not significantly) than Windows CE + Nokia 6110 on Physical Demand, which probably is a result of the many keypresses on the telephone. The Nokia 9110 Communicator is rated as less mentally demanding than Windows CE + Nokia 6110 overall, including the Performance level. This subjective performance difference is not consistent with the objective performance (Appendix 4), where Windows CE + Nokia 6110 was the best interface considering the total score.

Figure 21. The average mental workload ratings for each scale on all three interfaces.

Note that the maximum average value for the Mental workload rating was 54 (frustration level on PowerCom), which is only slightly higher than the middle value on the scale. This indicates that none of the three interfaces was considered very demanding regarding mental workload.

Subjective satisfaction

Significant differences were found on nine out of 15 statements (F[2,34] > 3.787, p < 0.034). PowerCom was rated significantly lower than the Nokia 9110 Communicator on eight ratings and significantly lower than Windows CE + Nokia 6110 on six ratings. PowerCom was also rated significantly higher than Nokia 9110 Communicator on the statement regarding system speed and responses. No significant differences existed between the two Nokia devices. A difference seemed to exist for statement six as well, logical steps to complete a task, where the Nokia 9110 Communicator was rated higher than PowerCom.

The subjective ratings (Mental Workload and Subjective Satisfaction) showed very large differences disfavouring PowerCom. One reason for this could be that the subjects performed worst on the last task using PowerCom, while they performed best on the last task using the other two interfaces. The subjects had that fresh in mind when filling out the forms, although this is not necessarily the only explanation.

[Figure 21 chart. X-axis: Mental demand, Physical demand, Temporal demand, Performance, Effort, Frustration level. Y-axis: Rating (mm), 0 to 100. Series: PowerCom, Nokia 9110 Communicator, Windows CE + Nokia 6110.]

Another argument for this is that Windows CE + Nokia 6110 is rated very high on average in spite of some problems using Windows CE. In the concluding interview, Windows CE was rated as equal to PowerCom. An interesting point regarding the Subjective Satisfaction ratings is that the information appliance Nokia 6110 is rated highest on telling the status of calls and on overall call handling experience in spite of the small display. One explanation is that the good objective performance affects the subjective satisfaction. Goldstein, Alsiö and Werdenhoff [1999] have discussed this problem of performance bias in usability evaluations. All the average ratings are displayed in Figure 22.

Figure 22. The average subjective satisfaction ratings for each statement on the three interfaces.

[Figure 22 chart. X-axis: Overall reaction, Feeling of lost, Screen sequence, System info, Predictable result, Steps in task, Info arrangement, Icons and labels, Terminology, Learnability, Trial and error, System speed, Overall call handling, Status of calls, User control call handling. Y-axis: Rating (1-9). Series: PowerCom, Nokia Communicator, Windows CE + Nokia 6110.]

Discussion

What, then, are the reasons for the differences observed in this experiment? Why did the subjects fail on some tasks using PowerCom? These questions and a few more are discussed below. Interface details are not mentioned separately, since many of them have been considered in the heuristic evaluation.

PowerCom concept

If we first look at the main idiomatic characteristic of PowerCom, namely the Flip Zooming technique, no usability problems can be related directly to it. The users did not have any problems understanding the interaction. Some wanted the focus to be in the same place all the time and were confused when the tiles moved around on the screen. A dedicated area for the focus is a possible solution, implemented in the former prototype PowerView. The advantage is a consistent place for the focus, which means the users know where to look. One drawback is that it becomes harder to perceive a linear ordering of the objects. Moving up a level in a Flip Zooming hierarchy is done by tapping on the background, i.e. the space between the tiles. When the tiles move around, the background spaces also move around. This caused greater irritation than the moving focus did. The users wanted to tap on the same spot in order to quickly go up a level in the hierarchy or reverse an action.

The information links between different objects, which form the information contexts, were an appreciated feature for the users who missed the overview in a regular PDA. Unfortunately, many users did not quite grasp the information linking concept. In spite of having to use the links in task three when searching for the participants at a scheduled meeting, only two out of 18 subjects used the information links in the following task four. Instead of using the Optimum path, opening a scheduled meeting and reading the mails in the meeting's information context, the users browsed through the whole Mailbox searching for them the hard way. The lack of headings in the information contexts also made some subjects unsure of what kind of objects were represented in each tile.

The idea that a telephone call can be the focus of attention as well as an information object caused no problems for the less advanced call handling tasks, but when difficult operations were encountered the situation changed. In many situations the subjects had problems knowing what was information related and what was call handling related. More reasons for the problems are discussed in the following chapters.

Breadth vs. depth

If we first consider the information handling tasks, PowerCom has an average breadth that ranges from eight to twelve and a depth from five to eight. There is nothing in the results indicating that an increased depth affects the users' performance negatively on PowerCom. If the results from all three interfaces are considered we get a depth range from three to eleven, but still no indication that depth is critical. The subjects' comments during and after the test showed that depth is still important when designing interfaces. Just a couple of extra clicks when navigating back and forth searching for information made many subjects react with dissatisfaction. These findings are consistent with the ones presented by Ödman [2000]. If we make the same analysis of the breadth, we see that some of the problems can be explained in terms of breadth. PowerCom extends the canonical vocabulary and makes use of an additional compound [Cooper 1995], drag & drop. The users have to think about other ways than single taps to manipulate an object, which increases the breadth considerably. In Windows CE many tasks can be solved by a double-tap; this extra compound also increases the breadth compared to a single-tap interaction model.

In tasks 5-7, call handling was introduced to the users. If we first look at PowerCom, where information and call handling are integrated, the breadth factor is further increased compared to the information handling tasks. The main reason is the many objects represented on the screen simultaneously in combination with the extended canonical vocabulary. Task seven was performed significantly worse using PowerCom and this task also had the largest average breadth. The most severe usability problems occurred at the level where the breadth peaked (Figure 23).

Figure 23. Task 7b in PowerCom, create a conference call.

Figure 24. Task 6b in Nokia 9110 Communicator, swap to the held call.

When looking at the two other call handling interfaces, we see that Nokia 6110, with its small canonical vocabulary and small menu breadth, had a performance level close to 100 per cent although the navigation depth was large. No usability problems were found and only two subjects made an accidental mistake each. The Nokia 9110 Communicator keeps the breadth and depth more or less constant compared to the information handling tasks. The telephone function has its own application and the interaction buttons are the same. However, something interesting happened in subtask 6b (Figure 24). A simple swap operation caused trouble for many subjects. This subtask required the users to move between two calls using the vertical arrow keys (∧, ∨); just pressing the softkeys next to the screen was not enough. In spite of having used the arrow keys in the four information handling tasks, many subjects somehow did not consider them obvious. They started looking at all the QWERTY keyboard keys and pressed key 1 or key 2 (call number) and key K or key L (first letter in the called person's name). For those who did not understand that the arrow keys were the right choice, the breadth at this level increased from four to at least 12. This suggests that call handling tasks cannot be compared to information handling tasks.


As mentioned, recorded messages were used in this experiment to simulate the telephone calls. Nevertheless, the users behaved differently than when searching for information, where a mistake was ignored and a trial and error style adopted. For the call handling tasks, the smallest mistake seemed to affect the subjects negatively and every action was considered carefully. There are many possible explanations for this behaviour; one is the involvement of other humans. If you make an error, some other person will immediately know about your incompetence. In our culture, it is important to be viewed as competent and no one wants to make a fool of themselves in public.

There seems to be a speed-accuracy trade-off [Wickens 1992, p. 318-322] in the users' behaviour. For information handling tasks, speed is most important, but for call handling tasks accuracy is crucial. Regarding call handling systems for personal computers, it has been suggested that voice is just another datatype [Nielsen 1997]. According to this experiment, this is certainly not the case for handheld devices. This speed-accuracy trade-off is related to the breadth vs. depth discussion. A smaller breadth gives few choices at each level and a high accuracy. A smaller breadth also results in a larger depth and probably a slower performance. Figure 25 below is a 3D graph showing the effects of breadth and depth on the percentage of users that passed the call handling tasks (5-7) on their first attempt. It also indicates in which areas the different interfaces were situated regarding breadth and depth. For detailed information on the breadth and depth of particular tasks, see Table 2.

Figure 25. The effects of breadth and depth on the percentage of users that passed the whole call handling tasks (5-7) on their first attempt (the subtasks have not been analysed individually). The graph is created using multiple isolines in the 3D field software.


This graph shows that the breadth is of highest importance and that it is advisable to increase the depth rather than the breadth for call handling tasks. This is not in line with the previous studies in the area, but should obviously be considered in telephone interfaces. An interesting point is that while a couple of extra steps in the information handling tasks were spotted at once by the users, not a single subject complained about the many navigation steps needed for the Nokia 6110 in task seven (conference & private call).
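The trade-off between breadth and depth can be illustrated with a simplified menu model. This model is an assumption introduced here for illustration and is not the breadth and depth measure used for Table 2: if a number of functions are arranged in a balanced tree with the same number of choices at every level, the minimum depth follows directly from the breadth. The figure of 64 functions is hypothetical.

```python
def min_depth(functions, breadth):
    """Smallest depth d such that breadth ** d >= functions.

    Assumes a balanced menu tree with `breadth` choices at every level.
    """
    if functions < 1 or breadth < 2:
        raise ValueError("need at least one function and a breadth of two or more")
    depth = 0
    reachable = 1
    while reachable < functions:
        reachable *= breadth
        depth += 1
    return depth


# Hypothetical device with 64 functions: fewer choices per step means more steps.
for breadth in (2, 4, 8, 16):
    print(f"breadth {breadth:2d} -> depth {min_depth(64, breadth)}")
```

Under this model, halving the number of choices per step costs only a few extra steps, which is in line with the recommendation to reduce breadth when accuracy matters more than speed.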

Idiomatic vs. metaphorical elements

The only information handling task that resulted in serious trouble for the subjects was task 1b (changing the telephone number of a contact) on PowerCom. The problems in this task can be related to the breadth vs. depth discussion. However, they can also be explained in terms of idiomatic features. Dragging the contact to the Modify Entry drop area is an example of an implicit Function-Object interaction style [Kunkel et al. 1995]. This idiomatic operation seems to be very difficult for novice users to figure out; only six out of 18 subjects managed it by themselves. Most users knew they were supposed to do something with the contact object and something with the Modify Entry text, but they could not put it together. When given the hint "think about drag & drop", most users managed to solve the task, which indicates that this idiom will not be a problem once it has been learnt.

The creation of a conference call in PowerCom could be solved in two ways. The fastest one was an idiomatic drag & drop. The telephone icon of one person was dragged to the other person's context on the bottom bar and dropped there (Figure 8a). The second solution was to drag the name of one person to the other person's context on the bottom bar and then tap on his telephone in that context, which was in active-in-other-context mode. The results showed that although the subjects had been primed with idiomatic drag & drop, only one subject out of 18 used this Optimum path without hints. The other solution, which was more of a metaphorical drag & drop where a person was moved to another telephone call (Figure 26a), did not match the subjects' thinking very well either.

The most common operations that the subjects tried were two other metaphorical drag & drops. The first one was to drag the context representing the held call to the active context on the bottom bar (Figure 26b). This can be seen as an attempt to merge the two contexts (and telephone calls) into one and is not consistent with the concept of information contexts. The other operation was to drag the context representing the held call into the tile representing the Contacts in the active information context (Figure 26c). In this case the users associated the Contacts tile with a telephone call area of some kind and tried to add the held call to it. This action is possible to implement, although it is very strange from an information handling point of view. Adding the objects of a context view to the Contacts in focus is not possible.


Figure 26 (abc). Three common ways the subjects tried to create a conference call in PowerCom. The leftmost solution is the only one leading towards the goal.

It seems that complex and less general idiomatic drag & drops are difficult to learn. The subjects were primed several times during the experiment without any success. The metaphorical drag & drops were the opposite; they came more naturally to the subjects in advanced call handling situations. This can also be related to the breadth vs. depth discussion. An idiomatic drag & drop can look very different: an object can be dragged to any object, which gives (no. of objects)² options. A metaphorical drag & drop attempts to draw on existing knowledge among the users. Therefore, the breadth can be reduced significantly if the right metaphor is chosen. Nokia 9110 Communicator uses a simple room metaphor to visualise the states of different telephone calls. Simple call handling events, e.g. swap, are also carried out within this metaphor concept. However, complex call handling events are solved in a Function-Object style not using the room metaphor at all. This is briefly discussed in the next chapter.
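The estimate of (no. of objects)² options for an idiomatic drag & drop can be checked by enumerating every source and target pair. The sketch below is only an illustration; the object names are hypothetical and do not correspond to an actual PowerCom screen.

```python
from itertools import product


def single_tap_options(objects):
    """With single taps only, each object offers one possible action."""
    return len(objects)


def drag_drop_options(objects):
    """Every object can be dragged onto every object (self included),
    following the estimate given in the text."""
    return len(list(product(objects, repeat=2)))


# Hypothetical screen content, for illustration only.
screen = ["contact", "mail", "meeting", "held call", "active call", "drop area"]

print("single tap:", single_tap_options(screen))
print("drag & drop:", drag_drop_options(screen))
```

The count grows with the square of the number of objects, so every object added to the screen widens the breadth far more under drag & drop than under single taps.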

Object-Function vs. Function-Object

I have not found any results clearly indicating that one interaction style is preferable. However, when handling complex call situations, it seems safe to have a dedicated function. One example is the Nokia 9110 Communicator, which has a dedicated softkey that says Conference Call when this is a possible option and a softkey that says Conference commands when a conference call is active. Considering the restricted canonical vocabulary of the Nokia 9110 Communicator (m=1 according to Equation 1), the task is reduced to simple label following [Wharton et al. 1994]. Nokia 6110 has another smart Function-Object solution. One of the two softkeys is dedicated to the most common operation in the specific situation and the other is called Options. In most cases the users want to execute the common operation and can do that in one keypress. For the other cases additional navigation in a submenu is required. If Object-Function on another level is considered, namely the integration of the Address Book and the Telephone Book in PowerCom, a successful solution is encountered. In this way, the users can choose an object (contact) first and then the function (address, information context, or telephone call). This was superior to the separated applications in Nokia 9110 Communicator. The subjects often mixed these two applications up, which caused great confusion.
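The two softkey patterns described above can be sketched as a mapping from call state to softkey labels. This is a hedged illustration only: the state names, the labels for the most common operation and the rule for when a conference is possible are all hypothetical. Only the labels Options, Conference Call and Conference commands come from the text, and the sketch is not taken from either device's software.

```python
# Hypothetical call states and "most common operation" labels.
MOST_COMMON_OPERATION = {
    "idle": "Menu",
    "incoming call": "Answer",
    "active call": "End call",
}


def phone_softkeys(state):
    """Nokia 6110 style: one softkey for the likeliest action, one for the rest."""
    return (MOST_COMMON_OPERATION[state], "Options")


def conference_softkey(calls_in_progress, conference_active):
    """Nokia 9110 style: a dedicated label that appears only when it applies."""
    if conference_active:
        return "Conference commands"
    if calls_in_progress >= 2:  # assumption: two calls are needed to form a conference
        return "Conference Call"
    return None


print(phone_softkeys("incoming call"))
print(conference_softkey(calls_in_progress=2, conference_active=False))
```

In both patterns the user chooses among at most two labelled softkeys at each step, which keeps the breadth small at the cost of extra steps for the less common operations.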


Visibility and feedback

PowerCom is built on Flip Zooming, an information visualisation technique that has proved to be useful, both in this experiment and in the evaluation of the former prototype PowerView [Hellstrand 1999]. However, PowerView was designed for information handling only and the main concept was unchanged for PowerCom. The call handling features were added on top of the information handling. Therefore, PowerCom is not particularly designed to visualise telephone calls. From a call handling point of view, the focus and context are more or less limited to the bottom bar. This is in fact equal to the visualisation of Nokia 6110, in spite of that device's limited display size. The integration of information and call handling results in many objects on the screen. This affects the visibility and feedback, since more interface elements have to be considered when evaluating the state of the system. The Nokia 9110 Communicator has a separate telephone application and all information in it concerns telephone calls. Nokia 6110 cannot visualise the call handling states and events completely; instead an intelligent feedback solution is adopted. Whenever the users perform an operation, it is written out clearly on the screen what they just did. The users never have to feel unsure whether they performed the right operation, since they can compare their goal to the feedback text directly.

Experimental shortcomings

The experiment unfortunately started just before the release of two new devices/interfaces. The first one was the Ericsson R380 smartphone, which could have been used instead of the Nokia 9110 Communicator. The Ericsson R380 implements direct manipulation with a stylus and is closer to PowerCom in interaction style. However, Nokia 9110 Communicator and Ericsson R380 have many similarities regarding call handling and perhaps the results would have been similar as well.

The other interface was PocketPC, which could have been used instead of Windows CE. In this interface, many of the problems encountered for Windows CE in this experiment have been corrected. This interface might have been a better comparison than the mediocre Windows CE.

The task design could have been slightly different. Task 1b proved to be very difficult for the users and gave many subjects a bad start to the experiment. This task could have been placed as number four (last of the information handling tasks) in order to get an increasing complexity.

The separation of information and call handling tasks may have controlled the users too much. They knew which interface to choose between Windows CE and Nokia 6110, and the complexity of having two separate interfaces was somehow lost. Interleaving information and call handling tasks would also better reflect normal usage.

Since mobile devices were evaluated in this experiment, one-handed navigation or at least a standing posture should be considered in one way or another.


Conclusions

Before concluding this thesis, it should be stressed that this evaluation of PowerCom was conducted with novice users. Performance at first-time usage after only a few minutes of introduction was measured. The results would probably have been different if expert users' performance had been taken into account. PowerCom is also the interface with the fewest similarities to existing ones and might have the largest increase in performance after longer training. However, the importance of novices' performance should not be neglected. The success of many commercial products depends on how well a customer understands the interface when he/she tries it for the first time. If it is difficult, he/she might not even buy the product. When advanced features like call handling are considered and other humans are involved, another aspect is added. If a user has failed to create a conference call twice in a row, does he/she dare to try one more time?

PowerCom is a prototype interface and not a finished product. Therefore, some information handling applications and features are obviously missing and need to be implemented before taking another step in the development. Many of those possible usability improvements have been mentioned in the heuristic evaluation. Still, the information retrieval tasks were performed as well using PowerCom as using the Nokia 9110 Communicator and Windows CE, and the Flip Zooming technique efficiently guided the users through the information hierarchies.

Call handling proved to be something different, and the integration in PowerCom was less successful regarding the advanced call handling tasks. Many reasons have been discussed, but as a conclusion some important issues and lessons from this experiment are listed below:

• Since the two tasks, information handling and call handling, are very different in nature from a user perspective, a separate call handling application or at least a separate call handling visualisation is recommended.

• Accuracy is regarded as much more important than speed when performing call handling tasks. Therefore, the breadth should be reduced to minimise the number of choices. If necessary, increase the depth instead.

• Advanced call handling, e.g. creating a conference call, can benefit from a single-click interaction model and a Function-Object selection style to limit the breadth. This interaction style suits one-handed navigation very well.

• The use of drag & drop increases the canonical vocabulary, but also the expressiveness of the interface. Call handling also seems to have a strong metaphorical affordance for drag & drop in advanced call handling situations.

• If using drag & drop, limit the number of objects on the screen, since the increased canonical vocabulary increases the breadth as well. Avoid complex idiomatic drag & drops; use a simple and general implementation of this feature instead.


Future Work

Some of the possible future work has already been done as a natural continuation of this master thesis. A paper-based design of a separate call handling visualisation has been developed and some different interaction styles have been discussed. Unfortunately, this work cannot be presented in this paper, since it will be investigated further at Ericsson Research. Testing this design in another experiment is the next logical step, where the goal should be Nokia performance (close to 100 per cent).

Regarding the Flip zooming technique, there are several areas where this techniquecould be implemented. All applications where a detailed picture of the objects in focusis needed at the same time as an overview picture suit this technique.


References

Björk, S. (2000). Hierarchical Flip Zooming: Enabling Parallel Exploration of Hierarchical Visualisations. In Proceedings of Advanced Visual Interfaces (AVI) 2000.

Björk, S. and Redström, J. (2000). Redefining the Focus and Context of Focus+Context Visualisation. In Proceedings of IEEE Symposium on Information Visualisation 2000.

Björk, S., Redström, J., Ljungstrand, P., and Holmquist, L.E. (2000). POWERVIEW: Using Information Links and Information Views to Navigate and Visualise Information on Small Displays. In Handheld and Ubiquitous Computing 2000 (HUC2k), Bristol, U.K.

Björk, S., Hellstrand, M., Redström, J., Goldstein, M., Ljungstrand, P., Anneroth, M., Holmquist, L-E., Werdenhoff, J. & Chincholle, D. (1999). An action control but no action: Users dismiss single-handed navigation on PDAs. Full paper submitted to CHI'2000.

Cooper, A. (1995). About Face: The essentials of user interface design. Programmer Press, IDG Books, Foster City, California.

Egan, D.E. (1988). Individual Differences in Human-Computer Interaction. In Handbook of Human-Computer Interaction, p. 543-568, ed. by M. Helander. Amsterdam: Elsevier Science Publishers.

Ewertz, C. (1999). Definition of call handling states for cellular telephones. Internal Ericsson Paper.

Falk, J. (1999). Mobile Awareness. Master of Science Thesis, Department of Informatics, Göteborg University.

Furnas, G.W. (1986). Generalised Fisheye Views. Proceedings of CHI´86, ACM Press.

Gleiss, N. (1992). Usability concepts and evaluation, TELE2, p. 24-30, Swedish Telecom, English Edition.

Grundel, C. & Schneider-Hufschmidt, M. (1999). A direct manipulation user interface for the control of communication processes. Making call handling manageable. Conference of Human Factors in Telecommunication 1999.

Goldstein, M., Alsiö, G., & Werdenhoff, J. (1999). The media equation does not always apply: People are not polite towards small computers. Internal Ericsson Paper.


Goldstein, M., Bretan, I., Sallnäs, E.-L., Björk, H. (1999). Navigational abilities in audial voice-controlled dialogue structures. In Behaviour and Information Technology, 1999, Vol. 18, No. 2, p. 83-95.

Goldstein, M., Werdenhoff, J., Backström, T. (2000). What does the user do: A tool for visualising the novice user's interaction relative to Optimum path. Full paper presented at NordiCHI'2000.

Goldstein, M., Anneroth, M., & Book, R. (1999). Usability Evaluation of a High-Fidelity Smart Phone Prototype: Task Navigation Depth Affects Effectiveness. In Proceedings of HCI International’99, 2, p. 38-42. New Jersey: Lawrence Erlbaum Associates.

Hart, S.G. and Staveland, L.E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Human Mental Workload, ed. by P.A. Hancock and N. Meshkati, North-Holland: Elsevier Science Publishers B.V.

Hellstrand, M. (1999). Comparing the PowerView interface with the traditional Windows CE interface: Novice users favour two-handed navigation over one-handed on PDAs. Master Thesis in psychology, University of Lund.

Holmquist, L.E. (2000). Breaking the Screen Barrier. Ph.D. thesis, Göteborg University, Department of Informatics.

Hyperstat Online (2000). Available at http://davidmlane.com/hyperstat/. Responsible for page: David M. Lane (2000-10-18).

Kunkel, K., Bannert, M., Fach, P.W. (1995). The influence of design decisions on the usability of direct manipulation user interfaces. Behaviour and information technology (14), no. 2, p. 93-106.

MacKinlay, J.D., Robertson, G.G., & Card, S.K. (1995). The perspective wall: Detail and context smoothly integrated. In Proceedings of CHI´95, ACM Press.

Miller, D.P. (1981). The Depth Breadth Trade-Off in Hierarchical Computer Menus. In Proceedings of the Human Factors Society 25th Annual Meeting, Human Factors Society, Santa Monica: CA, p. 296-300.

Miller, G.A. (1956). The magic number seven plus or minus two: Some limits of our capacity for information processing. In Psychological Review, 63(2), p. 81-97.

Mohageg, F.M. (1992). The influence of hypertext linking structures on the efficiency of information retrieval. In Human Factors, 34(3), p. 351-367.

Nielsen, J. (1993). Usability Engineering. Academic Press, Boston.

Nielsen, J., & Mack, R. L. (1994). Usability Inspection Methods. John Wiley & Sons, New York, NY.


Noldus (2000). Noldus Video-Pro product sheet. Available at http://www.noldus.com/products/index.html. Responsible for page: Noldus Information Technology (2000-10-18).

Useit (2000). Jakob Nielsen's website containing a refined version of the usability heuristics. Available at http://www.useit.com/papers/heuristic/heuristic_list.html. Responsible for page: Jakob Nielsen (2000-10-18).

Norman, D. (1990). The Design of Everyday Things. Basic Books, New York.

Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S. & Carey, T. (1994). Human-computer-interaction. Addison Wesley. Reading, Mass.

Reeves, B. & Nass, C. (1996). The media equation. How people treat computers, television and new media like real people and places. Cambridge University Press, CLSI Publications, Stanford, California.

Shneiderman, B. (1998). Designing the User Interface: Strategies for effective Human-Computer Interaction. Third edition. Addison-Wesley.

Siirtola, H. (2000). Direct Manipulation of Parallel Coordinates. In proceedings of CHI'2000. Interactive Posters, ACM Press, p. 119-120.

Spence, R. & Appearly, M.D. (1982). Data Base Navigation: An office environment for the professional. Behaviour and information technology, 1, p. 43-54.

SPSS (1999). SPSS Base 10.0 Applications Guide. SPSS Inc., Chicago, Illinois.

Sun (1997). From desktop to consumer devices: The Applet Writer's Style Guide Draft - Version 0.8. Sun Microsystems, Inc.

Urokohara, H., Tanaka, K., Furuta, K., Honda, M. and Kurosu, M. (2000). NEM: "Novice Expert ratio Method" A usability evaluation method to generate a new performance measure. In proceedings of CHI'2000, Extended Abstracts, ACM Press, p. 185-186.

Usability & Interaction Laboratory (2000). Internal Ericsson webpage.

Wabasoft (2000). Information about the WABA programming language, developer tools, application examples, help, etc. Available at: http://www.wabasoft.com. Responsible for page: Wabasoft Corporation (2000-10-11).

Ward, M. (2000). Introduction to Data Visualisation: Vis '97 Tutorial. Available at http://www.cs.wpi.edu/~matt/v97_tut/. Responsible for page: Matthew Ward. (2000-10-11).


Wharton, C., Rieman, J., Lewis, C., Polson, P. (1994). The Cognitive Walkthrough Method: A Practitioner’s Guide. In Usability Inspection Methods, Nielsen, J. and Mack, R.L. (Eds.), p. 105-141.

Wickens, C.D. (1992). Engineering psychology and human performance. 2nd ed., New York: Harper Collins Publisher Inc.

Ödman, J. (2000). Riktlinjer för utformning av webgränssnitt för intranät - en kritisk granskning. Master Thesis in Computer Science, Royal Institute of Technology. Available at: http://www.d.kth.se/~d94-jod/anvandbarhet.html. Responsible for page: Jonas Ödman (2000-10-11).


Appendix 1 – Tasks presented to the subjects

Task one

You have received a moving card from your good friend Glen Jonsson containing information about his new telephone number, new email address, etc.

Check the details for Glen Jonsson's telephone number and email address in the device and write them down below.

Do they match the details you received in the moving card? If they do not, change them.

Task two

You cannot remember your username for a certain computer service that you very seldom use. Find out if there is a note that says something about a Username. If you can find a note like that, write down what it says below.

MOVING CARD

Glen Jonsson
Storgatan 17
987 43 Stockholm

Telephone number:
+46 70 123 45 67

Email address:
[email protected]


Task three

Your boss just phoned you and wanted to meet in the first week of August. Is it possible to fit in a meeting on August 4 before 12:00? If not, write down below the event that is booked at this time, as well as the participants.

Task four

You are going to Gothenburg on September 18 for a few days' vacation and you are waiting for some additional information to arrive from the hotel as well as from one of your friends. See if any emails have been received about the trip to Gothenburg in September. If so, write down the headers of the emails below.

Task five

Scenario

While you were out for a while, Peter Berg called you, so you give him a call on his work telephone to see what he wanted.

Action sequence

• Call Peter Berg
• Write down his message
• End the call


Task six

Scenario

You are in charge of a department running a number of projects. You call one of your staff members, Kent Nilsson, to find out how his project is going. After having spoken to him for a while, your manager Lisa Person calls you about an important matter. You answer the call from her without ending the Kent Nilsson call. When you have listened to what she has to say, you go back to the call with Kent Nilsson without ending the important call from Lisa Person. You pass on the information from your manager to him, then end the call with Kent Nilsson. Then you continue the call with Lisa Person again for some further discussions. Finally, you end the call with your manager.

Action sequence

• Call Kent Nilsson
• Write down his message
• Answer the incoming call from your manager Lisa Person without ending the ongoing call with Kent Nilsson
• Write down her message
• Go back to Kent Nilsson without ending the ongoing call with Lisa Person
• Make sure you are only talking to Kent Nilsson
• End the call with Kent Nilsson
• Continue the call with Lisa Person
• Make sure you are only talking to Lisa Person
• End the call with Lisa Person


Task seven

Scenario

Your friend, Jessica Gren, wants to sell her car. You are interested and call her to discuss the price. After having spoken to her for a few minutes, you feel that you need a second opinion. You decide to call Stig Smedberg, without ending the call with Jessica Gren. You and Stig Smedberg decide to arrange a group call situation where you all can speak together and Stig Smedberg can ask Jessica Gren about some details. When the three of you have spoken for a while, you start a private call with Stig Smedberg to make a final decision about buying the car. When that has been made, you recreate the group call and pass on the decision to Jessica Gren. Finally, you end the group call.

Action sequence

• Call Jessica Gren
• Write down her message
• Call Stig Smedberg without ending the ongoing call with Jessica Gren
• Write down his message
• Arrange so you all can talk in a group call situation
• Make sure you can hear both Jessica Gren and Stig Smedberg simultaneously
• Then start a one-to-one call with Stig Smedberg without ending the call with Jessica Gren
• Make sure you are only speaking to Stig Smedberg
• Recreate the group call situation where the three of you can speak together
• Make sure you can hear both Jessica Gren and Stig Smedberg simultaneously again
• End the group call


Appendix 2 – The NASA-TLX mental workload form

Place a mark on each scale below that represents the magnitude of each factor for all the tasks you have just performed using the interface XX.

1. Mental demand
How much mental and perceptual activity was required (e.g., thinking, deciding, calculating, remembering, looking, searching, etc.)? Was the task easy or demanding, simple or complex, exact or forgiving?

Low High

2. Physical demand
How much physical activity was required (e.g., pushing, pulling, turning, controlling, activating, etc.)? Was the task easy or demanding, slow or brisk, slack or strenuous, restful or laborious?

Low High

3. Temporal demand
How much time pressure did you feel due to the rate or pace at which the task or task elements occurred? Was the pace slow and leisurely or rapid and frantic?

Low High

4. Performance
How successful do you think you were in accomplishing the goals of the task set by the experimenter (or yourself)? How satisfied were you with your performance in accomplishing these goals?

Good Poor

5. Effort
How hard did you have to work (mentally and physically) to accomplish your level of performance?

Low High

6. Frustration level
How insecure, discouraged, irritated, stressed and annoyed versus secure, gratified, content, relaxed and complacent did you feel during the task?

Low High


Appendix 3 – subjective satisfaction form

1. Overall reaction to the system
   terrible  1 2 3 4 5 6 7 8 9  wonderful

2. The feeling of being lost occurred
   always  1 2 3 4 5 6 7 8 9  never

3. The sequence of screens
   confusing  1 2 3 4 5 6 7 8 9  clear

4. The system keeps you informed about what happens
   never  1 2 3 4 5 6 7 8 9  always

5. Performing an operation leads to a predictable result
   never  1 2 3 4 5 6 7 8 9  always

6. The steps to complete a task
   illogical  1 2 3 4 5 6 7 8 9  logical

7. Arrangement of information on the screen
   illogical  1 2 3 4 5 6 7 8 9  logical

8. The icons and labels
   hard to understand  1 2 3 4 5 6 7 8 9  easy to understand

9. The terminology relates to the tasks performed
   never  1 2 3 4 5 6 7 8 9  always

10. Learning to use the interface
   difficult  1 2 3 4 5 6 7 8 9  easy

11. Exploration of features by trial and error
   discouraging  1 2 3 4 5 6 7 8 9  encouraging

12. The system speed and its responses
   too slow  1 2 3 4 5 6 7 8 9  fast enough

13. Overall reaction to the call handling features
   terrible  1 2 3 4 5 6 7 8 9  wonderful

14. Telling who is speaking and the status of the other calls
   difficult  1 2 3 4 5 6 7 8 9  easy

15. The user control and degree of freedom for call handling
   low  1 2 3 4 5 6 7 8 9  high


Appendix 4 – The results

Effectiveness and Optimum Path

In the following three tables for the different interfaces, the scores for all subjects on all tasks are displayed. A bold number indicates that the Optimum path was used. At the bottom, the average score on the particular tasks is calculated.

PowerCom

         Task
Subject  1a    1b    2     3     4     5     6a    6b    6c    7a    7b    7c    Total
1        1     0,75  1     1     1     1     1     0     0     1     0,5   0,5   8,75
2        1     0,75  1     1     1     1     1     1     1     1     1     1     11,75
3        1     0     1     1     1     1     1     0,5   0,5   1     0,5   0     8,5
4        1     1     1     1     1     1     1     1     1     1     0,75  0,5   11,25
5        1     1     1     1     1     1     1     1     1     1     0,5   0,5   11
6        1     0,75  1     1     1     1     1     1     1     1     0,75  0     10,5
7        0,75  0,5   1     1     1     1     1     1     1     1     1     1     11,25
8        1     0,75  1     1     1     1     1     1     1     0,5   0,25  0,25  9,75
9        1     0,75  1     1     1     1     1     1     1     1     1     0,5   11,25
10       1     1     1     1     1     0,75  1     1     1     1     0,5   0,5   10,75
11       1     1     1     1     1     1     1     1     1     1     0,75  0,75  11,5
12       0,75  1     1     1     1     1     1     1     1     1     0     0     9,75
13       1     0,5   1     1     1     1     1     0,5   0,5   0     0     0     7,5
14       1     0,75  1     1     1     1     1     1     1     1     0,75  0,25  10,75
15       1     1     1     1     1     1     1     1     1     1     1     1     12
16       1     0     1     1     1     0,75  1     1     1     1     1     1     10,75
17       1     0,5   1     1     1     1     1     0,5   0,5   1     0,25  0,5   9,25
18       1     0,5   1     1     1     1     1     1     1     1     1     0,5   11
Avg      0,97  0,69  1,00  1,00  1,00  0,97  1,00  0,86  0,86  0,92  0,64  0,49  10,40
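The Total column and the Avg row follow directly from the individual scores. The short Python sketch below is an illustration only and not part of the original study: it shows how a subject's total and a task's average could be recomputed from the PowerCom table, with the decimal commas converted first. The helper names `parse_score`, `row_total` and `column_average` are hypothetical.

```python
def parse_score(cell: str) -> float:
    """Convert a score written with a decimal comma (e.g. '0,75') to a float."""
    return float(cell.replace(",", "."))


def row_total(cells: list[str]) -> float:
    """Sum the twelve task scores of one subject."""
    return sum(parse_score(cell) for cell in cells)


def column_average(cells: list[str]) -> float:
    """Average one task's scores over all subjects."""
    return sum(parse_score(cell) for cell in cells) / len(cells)


# Subject 1 on PowerCom: tasks 1a 1b 2 3 4 5 6a 6b 6c 7a 7b 7c.
subject_1 = "1 0,75 1 1 1 1 1 0 0 1 0,5 0,5".split()
total_subject_1 = row_total(subject_1)

# Task 7c on PowerCom, subjects 1 to 18.
task_7c = "0,5 1 0 0,5 0,5 0 1 0,25 0,5 0,5 0,75 0 0 0,25 1 1 0,5 0,5".split()
average_7c = column_average(task_7c)

print(total_subject_1, round(average_7c, 2))
```

Under these assumptions the results agree with the table: 8,75 for subject 1 and roughly 0,49 for task 7c.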

Nokia 9110 Communicator

         Task
Subject  1a    1b    2     3     4     5     6a    6b    6c    7a    7b    7c    Total
1        1     1     1     1     1     1     1     1     1     1     1     1     12
2        1     1     1     1     1     1     1     1     1     1     1     1     12
3        1     1     1     1     1     1     1     1     1     1     1     1     12
4        0,75  0,75  1     0,75  1     0     1     0,5   0,5   1     1     1     9,25
5        0,75  1     1     1     1     1     1     1     1     1     1     1     11,75
6        0,75  1     1     1     0,75  1     1     0     0     1     1     1     9,5
7        1     1     1     1     1     1     1     1     1     1     1     1     12
8        1     1     1     1     1     1     1     0,5   0,5   1     1     1     11
9        1     1     1     1     1     1     1     1     1     1     1     1     12
10       1     1     1     1     1     1     1     1     1     1     1     1     12
11       1     1     1     1     1     1     1     0,25  0,25  1     1     1     10,5
12       1     1     1     1     1     1     1     1     1     1     1     1     12
13       1     1     1     1     1     1     1     1     1     1     1     1     12
14       1     1     1     1     1     1     1     1     1     0,75  1     1     11,75
15       1     0,75  1     1     1     1     1     1     1     1     1     1     11,75
16       1     1     1     1     1     1     1     1     1     1     1     1     12
17       1     1     1     1     1     1     1     1     1     1     1     1     12
18       1     1     1     1     1     1     1     1     1     1     1     1     12
Avg      0,96  0,97  1,00  0,99  0,99  0,94  1,00  0,85  0,85  0,99  1,00  1,00  11,53


Windows CE + Nokia 6110

         Task
Subject  1a    1b    2     3     4     5     6a    6b    6c    7a    7b    7c    Total
1        1     0,5   1     1     1     1     1     1     1     1     1     1     11,5
2        1     1     1     1     1     1     1     1     1     1     1     1     12
3        1     0,75  1     1     1     1     0,5   0,5   0,5   1     1     1     10,25
4        1     0,75  1     1     1     1     1     1     1     1     1     1     11,75
5        1     1     1     1     1     1     1     1     1     1     1     1     12
6        1     1     1     1     1     1     1     1     1     1     1     1     12
7        1     1     1     1     1     1     1     1     1     1     1     1     12
8        1     0,75  1     1     1     1     1     1     1     1     1     1     11,75
9        1     1     1     1     1     1     1     1     1     1     1     1     12
10       0,75  0,75  1     1     1     0,5   1     1     1     1     1     1     11
11       1     1     1     1     1     1     1     1     1     1     1     1     12
12       0,75  1     1     1     1     1     1     1     1     1     1     1     11,75
13       1     1     1     1     1     1     1     1     1     1     1     1     12
14       1     1     1     1     0,75  1     1     1     1     1     1     1     11,75
15       1     1     1     1     1     1     1     1     1     1     1     1     12
16       1     1     1     1     1     1     1     1     1     1     1     1     12
17       1     1     0,75  1     1     1     1     1     1     1     1     1     11,75
18       1     0,75  1     1     1     1     1     1     1     1     1     1     11,75
Avg      0,97  0,90  0,99  1,00  0,99  0,97  0,97  0,97  0,97  1,00  1,00  1,00  11,74

Efficiency – completion time & number of keystrokes

Times and keystrokes for an expert user

Expert user times and keystrokes for Optimum path (IF = interface, keyst = keystrokes; PC = PowerCom, NC = Nokia 9110 Communicator, W6 = Windows CE + Nokia 6110).

Task     1            2            3            4            5            6            7
IF       time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
PC       55    7      35    6      45    9      40    9      50    5      120   14     140   17
NC       50    7      35    6      45    4      70    13     50    6      120   12     140   16
W6       50    9      35    5      45    6      70    8      50    5      110   11     150   23

In the three tables below, the completion times and number of keystrokes for all subjects are displayed. At the bottom, the average measures are calculated.

PowerCom

Task     1            2            3            4            5            6            7
Subject  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
1        295   45     46    6      130   13     99    18     60    9      356   35     223   41
2        174   25     51    7      94    9      82    12     39    5      101   15     225   37
3        331   26     27    6      93    12     112   13     65    9      203   25     361   34
4        128   20     20    6      68    9      61    10     50    5      142   18     348   45
5        184   31     48    10     85    12     70    9      53    5      112   14     256   24
6        174   38     54    7      79    12     58    12     67    5      114   16     421   58
7        237   37     53    6      80    9      87    16     62    5      163   19     217   21
8        229   37     54    7      73    10     77    15     40    5      102   15     388   44
9        203   40     58    9      62    9      108   16     52    9      118   16     215   27
10       224   43     62    7      115   10     132   21     99    10     215   25     408   31
11       98    9      47    7      99    14     46    11     69    7      131   14     445   41
12       178   40     25    7      61    11     39    11     53    6      106   17     304   37
13       269   38     51    6      114   18     144   21     52    6      183   20     403   46
14       196   30     86    10     115   9      165   25     66    5      114   14     503   44
15       97    14     44    7      68    9      59    12     54    5      191   15     269   26
16       460   55     62    8      253   27     121   19     74    9      174   19     247   31
17       374   28     93    12     97    10     185   18     48    5      270   17     402   43
18       398   43     84    9      195   20     112   24     71    5      189   18     333   33
Avg      236,1 33,3   53,6  7,6    104,5 12,4   97,6  15,7   59,7  6,4    165,8 18,4   331,6 36,8


Nokia 9110 Communicator

Task     1            2            3            4            5            6            7
Subject  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
1        195   21     133   26     81    4      78    13     57    6      119   16     145   16
2        80    7      47    6      86    7      97    13     45    6      123   12     135   16
3        128   11     81    10     59    4      171   18     71    8      153   14     147   16
4        185   19     47    7      183   13     272   22     243   25     347   27     169   16
5        134   9      89    6      91    10     123   13     76    7      145   14     181   18
6        156   11     63    7      84    4      102   14     66    6      260   18     201   16
7        68    7      54    6      48    4      115   14     132   14     109   12     134   16
8        75    7      54    6      63    6      81    15     93    9      240   23     123   18
9        87    10     49    6      84    7      141   24     58    6      118   12     130   16
10       173   7      64    6      106   6      150   18     105   6      181   12     236   16
11       101   7      33    6      54    4      105   17     138   15     298   21     139   16
12       77    9      32    6      60    10     62    13     57    6      104   16     136   16
13       181   7      112   13     101   6      128   16     102   6      167   12     200   16
14       95    11     53    6      68    4      205   22     62    7      122   14     172   20
15       156   16     70    8      84    10     95    16     72    7      166   14     167   16
16       90    8      47    6      69    4      78    14     89    10     140   12     135   16
17       66    7      55    6      106   9      165   22     83    8      112   12     151   20
18       179   17     77    11     112   17     43    13     59    6      140   14     165   16
Avg      123,7 10,6   64,4  8,2    85,5  7,2    122,8 16,5   89,3  8,8    169,1 15,3   159,2 16,7

Windows CE + Nokia 6110

Task     1            2            3            4            5            6            7
Subject  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
1        184   23     69    12     145   21     78    13     91    12     115   12     180   28
2        68    9      31    6      57    7      91    17     54    7      118   12     174   24
3        141   16     35    8      58    7      103   23     138   15     230   27     188   29
4        106   13     35    8      82    8      84    15     108   15     126   16     167   33
5        139   17     109   17     87    12     73    15     114   15     122   13     214   32
6        75    11     49    5      64    6      105   21     71    7      113   13     190   28
7        97    9      62    7      82    9      154   17     105   16     139   17     207   28
8        242   21     217   38     81    8      106   17     91    15     129   17     183   32
9        129   13     115   20     89    16     158   34     59    5      90    13     174   26
10       369   53     63    6      152   11     147   22     407   43     203   16     365   31
11       398   41     103   11     128   8      106   16     171   12     223   17     343   28
12       292   45     43    8      87    9      70    10     90    15     125   17     180   30
13       139   21     64    7      65    6      62    10     99    11     116   14     273   33
14       135   20     55    9      58    7      102   22     85    9      103   13     182   31
15       207   31     90    13     92    16     78    11     127   12     124   11     191   29
16       233   22     34    7      83    12     46    8      94    9      141   11     211   24
17       152   10     135   25     48    7      131   23     154   18     187   19     254   38
18       333   54     61    7      98    11     58    9      139   11     163   15     207   29
Avg      191,1 23,8   76,1  11,9   86,4  10,1   97,3  16,8   122,1 13,7   142,6 15,2   215,7 29,6

Relative efficiency

In the table below, the ratios time = (average user completion time / expert user time following Optimum path) and keyst = (average number of user keystrokes / Optimum path keystrokes) are shown (Cond = condition, keyst = keystrokes).

Task     1            2            3            4            5            6            7
Cond     time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst  time  keyst
PC       4,29  4,75   1,53  1,27   2,32  1,38   2,44  1,75   1,19  1,28   1,38  1,32   2,37  2,17
NC       2,47  1,52   1,84  1,37   1,90  1,79   1,75  1,27   1,79  1,46   1,41  1,27   1,14  1,04
W6       3,82  2,65   2,17  2,38   1,92  1,68   1,39  2,10   2,44  2,74   1,30  1,38   1,44  1,29
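The two ratios can be recomputed from the tables in this appendix. The Python sketch below is an illustration only, not the tool used in the study: it takes the expert values for task 1 and the two-decimal task 1 averages reported in Appendix 5, and the names `EXPERT_TASK1`, `AVERAGE_TASK1` and `relative_efficiency` are hypothetical.

```python
# Expert (Optimum path) values for task 1: (time in seconds, keystrokes).
EXPERT_TASK1 = {"PC": (55, 7), "NC": (50, 7), "W6": (50, 9)}

# Average subject values for task 1, two-decimal figures from Appendix 5.
AVERAGE_TASK1 = {
    "PC": (236.06, 33.28),
    "NC": (123.67, 10.61),
    "W6": (191.06, 23.83),
}


def relative_efficiency(average: float, optimum: float) -> float:
    """Ratio of the average subject measure to the expert's Optimum path measure."""
    return average / optimum


ratios = {}
for interface, (avg_time, avg_keyst) in AVERAGE_TASK1.items():
    opt_time, opt_keyst = EXPERT_TASK1[interface]
    ratios[interface] = (
        round(relative_efficiency(avg_time, opt_time), 2),
        round(relative_efficiency(avg_keyst, opt_keyst), 2),
    )

for interface, (time_ratio, keyst_ratio) in ratios.items():
    print(interface, time_ratio, keyst_ratio)
```

A ratio of 1 would mean that the average subject matched the expert. For task 1 the sketch gives the same values as the first column pair of the table, e.g. 4,29 and 4,75 for PowerCom.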


Mental workload

In the following three tables representing the different interfaces, the mental workload for each subject is displayed. At the bottom the average rating for each scale is calculated.

PowerCom

         Question
Subject  1    2    3    4    5    6    Total
1        35   22   62   63   20   55   257
2        23   52   23   16   34   51   199
3        46   13   50   73   64   68   314
4        26   28   6    41   35   75   211
5        44   28   46   31   40   28   217
6        41   39   41   19   59   77   276
7        21   17   12   39   60   50   199
8        71   78   76   47   79   74   425
9        59   9    31   29   64   12   204
10       49   7    27   24   43   52   202
11       35   11   33   43   40   43   205
12       52   50   30   75   59   47   313
13       80   26   25   75   75   83   364
14       92   55   77   61   78   87   450
15       61   62   38   27   66   67   321
16       60   57   55   25   62   50   309
17       33   32   54   35   32   52   238
18       59   8    2    22   39   0    130
Avg      49,3 33,0 38,2 41,4 52,7 53,9 268,6
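In these tables the Total column equals the plain sum of the six scale ratings; no pairwise weighting of the scales, as described in the original NASA-TLX procedure, appears to have been applied. The Python sketch below only illustrates that reading of the data. `SCALES` and `raw_tlx_total` are hypothetical names, and the sketch is not the scoring tool used in the study.

```python
# The six NASA-TLX scales in the order used by the form in Appendix 2.
SCALES = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "performance",
    "effort",
    "frustration level",
)


def raw_tlx_total(ratings: tuple[int, ...]) -> int:
    """Unweighted total: the sum of one subject's six scale ratings."""
    if len(ratings) != len(SCALES):
        raise ValueError("expected one rating per NASA-TLX scale")
    return sum(ratings)


# Subject 1 on PowerCom, taken from the table above.
subject_1 = (35, 22, 62, 63, 20, 55)
total_subject_1 = raw_tlx_total(subject_1)
print(total_subject_1)
```

For subject 1 this reproduces the tabulated total of 257.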

Nokia 9110 Communicator

         Question
Subject  1    2    3    4    5    6    Total
1        42   37   13   43   22   15   172
2        8    12   9    2    19   2    52
3        11   9    13   18   13   8    72
4        25   13   13   11   14   52   128
5        21   33   40   12   34   52   192
6        37   10   44   13   16   43   163
7        13   6    10   5    6    7    47
8        26   42   15   13   26   48   170
9        59   58   62   7    49   9    244
10       5    25   3    2    4    2    41
11       16   6    3    17   12   23   77
12       24   20   17   38   17   15   131
13       26   32   47   2    70   56   233
14       62   54   30   29   28   42   245
15       16   37   88   3    15   38   197
16       32   41   38   25   30   43   209
17       36   69   68   32   45   36   286
18       12   4    25   25   18   1    85
Avg      26,2 28,2 29,9 16,5 24,3 27,3 152,4


Windows CE + Nokia 6110

         Question
Subject  1    2    3    4    5    6    Total
1        34   72   28   48   37   26   245
2        29   50   15   18   29   25   166
3        11   10   2    52   49   14   138
4        4    4    3    7    6    6    30
5        68   58   54   46   38   61   325
6        54   63   43   77   64   76   377
7        20   12   22   7    54   7    122
8        52   53   76   27   77   41   326
9        51   18   52   12   42   2    177
10       25   0    2    30   43   32   132
11       15   12   8    17   12   8    72
12       17   15   34   60   22   21   169
13       47   71   47   5    63   66   299
14       12   14   24   27   20   18   115
15       29   70   27   13   31   61   231
16       38   39   45   35   36   27   220
17       68   71   84   69   66   86   444
18       4    24   1    2    28   1    60
Avg      32,1 36,4 31,5 30,7 39,8 32,1 202,7

Subjective satisfaction

In the following three tables representing the different interfaces, the subjective satisfaction for each subject is displayed. At the bottom the average rating for each question is calculated (Quest = question, Subj = subject).

PowerCom

         Quest
Subj     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  Total
1        6   3   5   3   3   6   7   4   7   4   9   8   7   5   7   84
2        6   7   7   5   6   7   5   6   8   8   9   9   6   5   6   100
3        4   6   2   2   3   5   5   6   5   3   2   8   1   2   5   59
4        4   4   6   5   7   6   5   7   5   7   5   8   4   4   4   81
5        8   6   6   8   6   6   7   6   5   7   7   7   7   3   6   95
6        6   6   8   7   6   7   6   6   6   7   4   8   3   4   4   88
7        5   4   7   6   3   4   6   7   7   7   5   8   3   4   4   80
8        7   5   7   3   4   6   7   7   7   6   8   8   5   4   3   87
9        7   5   9   8   7   7   7   8   6   6   8   8   8   7   8   109
10       5   7   5   4   4   6   7   8   8   4   5   8   4   5   7   87
11       5   5   7   6   6   7   8   7   7   8   6   8   6   4   5   95
12       5   4   5   4   4   4   5   3   4   4   6   8   3   4   4   67
13       3   3   7   3   3   7   7   2   7   3   3   5   4   3   3   63
14       3   3   3   3   3   4   4   4   4   4   4   7   2   2   3   53
15       6   3   4   4   4   7   7   8   6   3   6   7   4   6   4   79
16       5   3   5   5   3   7   7   7   8   7   5   4   5   4   5   80
17       6   4   7   3   7   7   7   7   8   7   6   4   7   8   7   95
18       8   7   8   7   7   7   8   8   9   7   8   8   7   5   5   109
Avg      5,5 4,7 6,0 4,8 4,8 6,1 6,4 6,2 6,5 5,7 5,9 7,3 4,8 4,4 5,0 83,9


Nokia 9110 Communicator

         Quest
Subj     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  Total
1        3   7   7   7   7   8   7   7   6   8   7   7   8   9   8   106
2        9   8   6   3   8   7   8   9   8   9   7   7   7   9   7   112
3        8   7   8   7   8   6   5   8   7   7   7   7   8   7   8   108
4        8   4   5   8   7   8   5   8   5   8   5   7   7   8   6   99
5        7   7   7   6   7   8   6   5   7   6   6   4   8   6   6   96
6        8   4   4   8   8   8   8   6   7   9   6   8   9   9   8   110
7        7   7   8   8   7   8   8   9   8   8   7   8   8   9   8   118
8        7   8   7   8   6   7   7   8   8   7   7   8   8   8   8   112
9        4   5   5   8   6   7   5   7   7   5   5   6   6   8   5   89
10       9   9   9   7   7   8   9   9   8   8   8   8   8   8   8   123
11       7   6   8   8   8   8   9   7   8   7   7   9   7   9   8   116
12       7   7   7   8   7   8   8   8   7   8   7   8   7   8   8   113
13       7   7   7   6   7   8   4   8   7   8   7   5   7   8   8   104
14       3   7   5   7   4   5   6   6   6   7   6   2   6   6   6   82
15       7   7   6   8   8   9   6   8   7   5   6   4   4   4   7   96
16       5   7   6   5   6   6   5   6   6   7   5   2   6   6   6   84
17       7   4   6   7   7   7   6   7   8   7   5   4   7   8   7   97
18       7   5   8   8   8   8   9   9   7   7   6   6   7   9   8   112
Avg      6,7 6,4 6,6 7,1 7,0 7,4 6,7 7,5 7,1 7,3 6,3 6,1 7,1 7,7 7,2 104,3

Windows CE + Nokia 6110

         Quest
Subj     1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  Total
1        3   5   2   4   7   6   4   5   7   5   7   8   7   7   7   84
2        4   6   6   8   6   6   9   9   9   7   7   7   6   6   6   102
3        5   6   7   6   8   5   7   5   6   7   7   6   5   8   7   95
4        9   9   8   7   8   8   7   9   8   9   5   8   8   9   7   119
5        5   3   4   6   6   6   4   2   8   6   6   4   8   8   6   82
6        6   4   7   8   6   8   8   6   7   8   4   8   6   8   7   101
7        6   4   7   8   8   9   7   8   7   9   5   6   8   9   8   109
8        6   6   6   5   7   7   7   8   7   7   8   8   9   9   9   109
9        6   6   5   5   9   8   6   9   7   7   7   9   7   6   6   103
10       7   6   8   8   9   9   9   4   8   8   7   7   8   8   8   114
11       7   9   7   6   7   8   7   8   8   9   8   8   9   9   7   117
12       7   8   8   7   8   8   8   8   8   8   8   8   8   8   8   118
13       5   4   6   7   7   7   4   6   6   7   7   2   6   7   4   85
14       7   8   6   7   7   7   7   7   7   7   7   7   7   8   8   107
15       4   5   6   6   5   6   6   7   5   6   5   6   8   7   7   89
16       6   7   7   5   6   6   6   7   5   6   6   6   6   7   6   92
17       5   2   3   6   4   4   5   8   6   6   4   5   6   7   7   78
18       9   7   8   9   8   8   8   8   7   7   8   9   7   9   7   119
Avg      5,9 5,8 6,2 6,6 7,0 7,0 6,6 6,9 7,0 7,2 6,4 6,8 7,2 7,8 6,9 101,3


Appendix 5 – The statistical analysis

The tables below give the p-values both for the overall comparisons, according to a univariate analysis, and for the pairwise comparisons. If the p-value is less than 5%, a grey background indicates it. The letters after the pairwise comparisons tell which interface was superior (PC = PowerCom, NC = Nokia 9110 Communicator, W6 = Windows CE + Nokia 6110).

For each section the data have been tested with Mauchly's test of sphericity. The null hypothesis was that sphericity holds, which means that a significant result violates this assumption. In those cases, a grey background indicates it.
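The degrees of freedom in the F[2,34] values below are what a one-way within-subjects design with 3 interfaces and 18 subjects gives: (3 - 1) = 2 and (3 - 1)(18 - 1) = 34. The Python sketch below shows how such an F ratio is formed. It assumes a plain repeated-measures ANOVA without any sphericity correction, it is not the SPSS procedure used for the thesis, and the function names and the toy data are hypothetical.

```python
def degrees_of_freedom(subjects: int, conditions: int) -> tuple[int, int]:
    """Degrees of freedom (conditions, error) for a one-way repeated-measures ANOVA."""
    return conditions - 1, (conditions - 1) * (subjects - 1)


def rm_anova_f(data: list[list[float]]) -> float:
    """F ratio for a table with one row per subject and one column per condition."""
    n = len(data)      # number of subjects
    k = len(data[0])   # number of conditions (here: interfaces)
    grand_mean = sum(sum(row) for row in data) / (n * k)
    condition_means = [sum(row[j] for row in data) / n for j in range(k)]
    subject_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    # What remains after removing condition and subject effects is the error term.
    ss_error = ss_total - ss_conditions - ss_subjects

    df_conditions, df_error = degrees_of_freedom(n, k)
    return (ss_conditions / df_conditions) / (ss_error / df_error)


# Toy data (not from the study): 3 subjects measured under 3 conditions.
toy = [[1, 2, 4], [2, 3, 4], [3, 5, 6]]
f_toy = rm_anova_f(toy)
df_thesis = degrees_of_freedom(subjects=18, conditions=3)
print(round(f_toy, 3), df_thesis)
```

Applied to a full column set from Appendix 4, for example the three interfaces' completion times for one task, the same function yields the kind of overall F value reported in the tables. The pairwise comparisons and Mauchly's test are not covered by this sketch.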

Effectiveness

                            Average points          Significance
Task                        PC      NC      W6      overall                  PC-NC     PC-W6     NC-W6
1a - find contact           0,972   0,954   0,972   F[2,34]=0,159 p=0,854    1,000 NC  1,000     1,000 NC
1b - change tel.no.         0,694   0,972   0,903   F[2,34]=8,308 p=0,001    0,012 NC  0,061 W6  0,288 NC
2 - find a note             1,000   1,000   0,986   F[2,34]=1,000 p=0,378    -         0,994 PC  0,994 NC
3 - find meeting info       1,000   0,986   1,000   F[2,34]=1,000 p=0,378    0,994 PC  -         0,994 W6
4 - find emails             1,000   0,986   0,986   F[2,34]=0,486 p=0,619    0,994 PC  0,994 PC  1,000
5 - call a person           0,972   0,944   0,972   F[2,34]=0,191 p=0,827    1,000 PC  1,000     1,000 W6
6a - answer & put on hold   1,000   1,000   0,972   F[2,34]=1,000 p=0,378    -         0,994 PC  0,994 NC
6b - swap                   0,861   0,847   0,972   F[2,34]=1,211 p=0,311    1,000 PC  0,311 W6  0,430 W6
6c - end & activate         0,861   0,847   0,972   F[2,34]=1,211 p=0,311    1,000 PC  0,311 W6  0,430 W6
7a - call & put on hold     0,917   0,986   1,000   F[2,34]=1,519 p=0,233    0,863 NC  0,562 W6  0,994 W6
7b - create conference      0,639   1,000   1,000   F[2,34]=19,678 p=0,000   0,001 NC  0,001 W6  -
7c - start private call     0,486   1,000   1,000   F[2,34]=37,000 p=0,000   0,000 NC  0,000 W6  -
Total                       10,403  11,528  11,736  F[2,34]=10,757 p=0,000   0,033 NC  0,000 W6  1,000 W6

Task                        p-value for the sphericity test
1a - find contact           0,104
1b - change tel.no.         0,005
2 - find a note             .
3 - find meeting info       .
4 - find emails             0,060
5 - call a person           0,000
6a - answer & put on hold   .
6b - swap                   0,016
6c - end & activate         0,016
7a - call & put on hold     0,000
7b - create conference      .
7c - start private call     .
Total                       0,020

Efficiency – completion time

                                    Average completion time [sec]  Significance
Task                                PC        NC        W6         overall                  PC-NC     PC-W6     NC-W6
1 - find/change contact             236,06    123,67    191,06     F[2,34]=8,010 p=0,001    0,002 NC  0,551 W6  0,047 NC
2 - find a note                     53,61     64,44     76,11      F[2,34]=2,230 p=0,123    0,542 PC  0,156 PC  1,000 NC
3 - find meeting info               104,50    85,50     86,44      F[2,34]=1,546 p=0,228    0,552 NC  0,503 W6  1,000 NC
4 - find emails                     97,61     122,83    97,33      F[2,34]=2,664 p=0,084    0,256 PC  1,000 W6  0,194 W6
5 - call a person                   59,67     89,33     122,06     F[2,34]=7,669 p=0,002    0,053 PC  0,004 PC  0,330 NC
6 - answer/hold/swap/end/activate   165,78    169,11    142,61     F[2,34]=1,011 p=0,374    1,000 PC  0,500 W6  0,457 W6
7 - call/hold/conference/private    331,56    159,22    215,72     F[2,34]=51,930 p=0,000   0,000 NC  0,000 W6  0,001 NC


Task                                p-value for the sphericity test
1 - find/change contact             0,406
2 - find a note                     0,070
3 - find meeting info               0,353
4 - find emails                     0,633
5 - call a person                   0,044
6 - answer/hold/swap/end/activate   0,020
7 - call/hold/conference/private    0,075

Efficiency – keystrokes

Average number of keystrokes (columns PC, NC, W6) and significance (columns overall, PC-NC, PC-W6, NC-W6)

| Task | PC | NC | W6 | overall | PC-NC | PC-W6 | NC-W6 |
|---|---|---|---|---|---|---|---|
| 1 - find/change contact | 33,28 | 10,61 | 23,83 | F[2,34]=20,032 p=0,000 | 0,000 NC | 0,111 W6 | 0,005 NC |
| 2 - find a note | 7,61 | 8,22 | 11,89 | F[2,34]=2,879 p=0,070 | 1,000 PC | 0,108 PC | 0,441 NC |
| 3 - find meeting info | 12,39 | 7,17 | 10,06 | F[2,34]=7,251 p=0,002 | 0,005 NC | 0,374 W6 | 0,103 NC |
| 4 - find emails | 15,72 | 16,50 | 16,83 | F[2,34]=0,290 p=0,750 | 1,000 PC | 1,000 PC | 1,000 NC |
| 5 - call a person | 6,39 | 8,78 | 13,72 | F[2,34]=8,501 p=0,001 | 0,252 PC | 0,003 PC | 0,116 NC |
| 6 - answer/hold/swap/end/activate | 18,44 | 15,28 | 15,17 | F[2,34]=3,196 p=0,053 | 0,246 NC | 0,095 W6 | 1,000 W6 |
| 7 - call/hold/conference/private | 36,83 | 16,67 | 29,61 | F[2,34]=64,877 p=0,000 | 0,000 NC | 0,010 W6 | 0,000 NC |

| Task | p-value for the sphericity test |
|---|---|
| 1 - find/change contact | 0,274 |
| 2 - find a note | 0,018 |
| 3 - find meeting info | 0,797 |
| 4 - find emails | 0,011 |
| 5 - call a person | 0,050 |
| 6 - answer/hold/swap/end/activate | 0,312 |
| 7 - call/hold/conference/private | 0,000 |

Mental workload

Average rating (columns PC, NC, W6) and significance (columns overall, PC-NC, PC-W6, NC-W6)

| Scale | PC | NC | W6 | overall | PC-NC | PC-W6 | NC-W6 |
|---|---|---|---|---|---|---|---|
| Mental demand | 49,28 | 26,17 | 32,11 | F[2,34]=9,517 p=0,001 | 0,000 NC | 0,053 W6 | 0,807 NC |
| Physical demand | 33,00 | 28,22 | 36,44 | F[2,34]=0,917 p=0,410 | 1,000 NC | 1,000 PC | 0,621 NC |
| Temporal demand | 38,22 | 29,89 | 31,50 | F[2,34]=1,058 p=0,358 | 0,767 NC | 0,697 W6 | 1,000 NC |
| Performance | 41,39 | 16,50 | 30,67 | F[2,34]=10,741 p=0,000 | 0,000 NC | 0,390 W6 | 0,023 NC |
| Effort | 52,72 | 24,33 | 39,83 | F[2,34]=16,608 p=0,000 | 0,000 NC | 0,061 W6 | 0,015 NC |
| Frustration level | 53,94 | 27,33 | 32,11 | F[2,34]=12,290 p=0,000 | 0,000 NC | 0,014 W6 | 1,000 NC |

| Scale | p-value for the sphericity test |
|---|---|
| Mental demand | 0,199 |
| Physical demand | 0,872 |
| Temporal demand | 0,329 |
| Performance | 0,044 |
| Effort | 0,981 |
| Frustration level | 0,273 |


Subjective satisfaction

Average rating (columns PC, NC, W6) and significance (columns overall, PC-NC, PC-W6, NC-W6)

| Question | PC | NC | W6 | overall | PC-NC | PC-W6 | NC-W6 |
|---|---|---|---|---|---|---|---|
| 1. Overall reaction | 5,50 | 6,67 | 5,94 | F[2,34]=2,501 p=0,097 | 0,127 NC | 1,000 W6 | 0,540 NC |
| 2. Feeling of being lost | 4,72 | 6,44 | 5,83 | F[2,34]=5,114 p=0,011 | 0,006 NC | 0,228 W6 | 0,888 NC |
| 3. Sequence of screens | 6,00 | 6,61 | 6,17 | F[2,34]=0,656 p=0,525 | 0,902 NC | 1,000 W6 | 1,000 NC |
| 4. System information | 4,78 | 7,06 | 6,56 | F[2,34]=12,340 p=0,000 | 0,001 NC | 0,005 W6 | 0,852 NC |
| 5. Predictable results | 4,78 | 7,00 | 7,00 | F[2,34]=16,229 p=0,000 | 0,000 NC | 0,002 W6 | 0,852 NC |
| 6. Steps to complete a task | 6,11 | 7,44 | 7,00 | F[2,34]=6,704 p=0,004 | 0,001 NC | 0,217 W6 | 0,570 NC |
| 7. Arrangement of info on screen | 6,39 | 6,72 | 6,61 | F[2,34]=0,309 p=0,736 | 1,000 NC | 1,000 W6 | 1,000 NC |
| 8. Icons and labels | 6,17 | 7,50 | 6,89 | F[2,34]=3,788 p=0,033 | 0,032 NC | 0,627 W6 | 0,517 NC |
| 9. Terminology related to task | 6,50 | 7,06 | 7,00 | F[2,34]=1,434 p=0,252 | 0,288 NC | 0,852 W6 | 1,000 NC |
| 10. Learning to use the interface | 5,67 | 7,28 | 7,17 | F[2,34]=9,744 p=0,000 | 0,009 NC | 0,010 W6 | 1,000 NC |
| 11. Exploration by trial and error | 5,89 | 6,33 | 6,44 | F[2,34]=0,878 p=0,425 | 1,000 NC | 0,808 W6 | 1,000 W6 |
| 12. System speed and responses | 7,28 | 6,11 | 6,78 | F[2,34]=4,279 p=0,022 | 0,013 PC | 0,464 PC | 0,579 W6 |
| 13. Overall reaction call handling | 4,78 | 7,11 | 7,17 | F[2,34]=15,429 p=0,000 | 0,002 NC | 0,000 W6 | 1,000 W6 |
| 14. Telling the status of calls | 4,39 | 7,72 | 7,78 | F[2,34]=37,335 p=0,000 | 0,000 NC | 0,000 W6 | 1,000 W6 |
| 15. User control for call handling | 5,00 | 7,22 | 6,94 | F[2,34]=15,864 p=0,000 | 0,001 NC | 0,003 W6 | 1,000 NC |

| Scale | p-value for the sphericity test |
|---|---|
| 1. Overall reaction | 0,990 |
| 2. Feeling of being lost | 0,617 |
| 3. Sequence of screens | 0,478 |
| 4. System information | 0,819 |
| 5. Predictable results | 0,291 |
| 6. Steps to complete a task | 0,044 |
| 7. Arrangement of info on screen | 0,060 |
| 8. Icons and labels | 0,441 |
| 9. Terminology related to task | 0,045 |
| 10. Learning to use the interface | 0,144 |
| 11. Exploration by trial and error | 0,022 |
| 12. System speed and responses | 0,086 |
| 13. Overall reaction call handling | 0,217 |
| 14. Telling the status of calls | 0,306 |
| 15. User control for call handling | 0,087 |


Appendix 6 – The KeystrokeMapper charts

Task 7 – PowerCom

The full task description is available in Appendix 1.

Task 6 – Nokia 9110 Communicator

The full task description is available in Appendix 1.
