
Analysis of Four Usability Evaluation Methods Applied to Augmented Reality Applications

Hector Martínez a,b, Payel Bandyopadhyay c,d

a SenseTrix, PL 20 FI-00101 Helsinki, Finland
b Tampere University of Technology, PO Box 527, FI-33101 Tampere, Finland

c University of Helsinki, P.O. 68 (Gustaf Hällströmin katu 2b), FI-00014 Helsinki, Finland
d Helsinki Institute for Information Technology (HIIT), PO Box 68, 00014 University of Helsinki, Finland

Abstract

The way users interact with computers and mobile devices is changing drastically with new emerging technologies. Augmented Reality (AR) is one of these technologies, and it defines a new way of user interaction. A large amount of research work has been done on evaluating user interfaces of traditional systems such as mobile devices and web interfaces. Since AR is an emerging technology, the number of systematic evaluations of AR interfaces is relatively low. In this project, a systematic evaluation of the user interface of an AR application has been carried out. Out of the existing usability evaluation methods, four have been chosen as a guideline to evaluate the targeted AR application. In order to cover all aspects of usability methods, two methods from usability inspection (cognitive walkthrough and heuristic evaluation), one from usability testing (laboratory observation) and one from user reports (questionnaire) have been chosen. The AR application evaluated in this project is “Augment - 3D Augmented Reality”. The results obtained from the four usability methods are described in this document.
Usually, due to limited time and resources, applying all the methods to evaluate a user interface is not feasible. Hence, a comparison of the results of the four usability methods has been carried out. In this comparison, a justification, based on the results obtained, of which usability evaluation method is more suitable for AR interfaces is presented. Finally, a set of design guidelines for AR applications is proposed.

Keywords: Augmented Reality, Usability evaluation, Cognitive walkthrough, Heuristic evaluation, Laboratory observation, Questionnaire

1. Introduction

Nowadays, there are many innovative ways of user interaction with environments, ranging from real to virtual environments. Figure 1 shows an overview of the reality–virtuality continuum defined by Milgram [1]. In a virtual environment (also known as virtual reality), the user sees a completely synthetic environment which bears no connection with the real environment. This means that the user remains unaware of the surrounding real environment. Augmented Reality (AR) is a variation of the virtual environment concept [2]. It combines real and virtual objects in a real environment [3]. This means that the user can see the real environment with some additional objects added to it. Any application having the following properties can be classified

Preprint submitted to Elsevier April 24, 2014


as an AR application [4]:

• Combination of real and virtual objects in a real environment.

• Interactive and real time operation.

• Registration (alignment) of real and virtual objects with each other.

Figure 1: Milgram’s reality–virtuality continuum [1].

AR is one of the most promising research areas of user interfaces. Whenever a user interface is involved in any system, the concerns of usability evaluation and design guidelines for the user interface appear. Usability evaluation plays an important role in the application development process. Usually, application developers are experts in their respective fields, and their developed user interfaces might seem very simple to use from their point of view. Unfortunately, the target users are often novice users of the applications and find it difficult to use them. Hence, usability evaluation plays a crucial role in any application having a user interface.

Though AR applications have existed for many years, the number of usability evaluations applied to AR interfaces is low [5]. Therefore, in this project a systematic evaluation of an AR application has been done. The AR application that has been chosen is “Augment” [6]. This application is available for download on Google Play and the App Store.

The rest of the document is structured in the following manner. Section 2 provides background information on usability methods in general, with a focus on the usability methods adapted in this project. Section 3 provides the description and analysis of the AR application that has been used as a prototype for evaluation. Section 4 describes the detailed adaptation and results of the methods that have been used to evaluate the AR application. Section 5 shows a comparison of the results of the adapted usability methods. Section 6 describes a set of design guidelines for AR interfaces, based on the results of the adapted usability methods. Section 7 finally concludes the whole work done in this project.

2. Background

AR, being one of the most promising technologies, is gaining importance in various fields of application. Therefore, usability evaluation of AR applications is of prime concern.

Usability evaluation of traditional systems like mobile applications or web interfaces is done based on pre-defined usability methods. These methods can be categorised as [4, 7]: inspection methods, testing methods and user reports. Table 1 shows usability evaluation methods and their corresponding category. Note that this is an approximate categorization; there may be other methods that also fall in these categories.

Inspection methods: In these methods, evaluators are involved. These methods are less time consuming than the other categories. Methods like heuristic evaluation, cognitive walkthrough, feature inspection, guideline checklists and perspective-based inspection fall in this category. In this document, two of these methods have been chosen to evaluate the AR prototype. The two chosen methods are:

Heuristic evaluation [3]: In heuristic evaluation, one or more evaluators are recruited. The evaluators are often novice to the given system’s design. Evaluators examine the user interface of a given prototype and try to find problems in the interface’s compliance with 10 standard usability principles. These principles are often called “heuristics” because they are more in the nature of rules of thumb than specific usability guidelines. The 10 heuristics are explained as follows ([8, 9]):



1. Visibility of system status: The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.

2. Match between system and the real world: The system should speak the users’ language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.

3. User control and freedom: Users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state without having to go through an extended dialogue. Support undo and redo.

4. Consistency and standards: Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.

5. Error prevention: Even better than good error messages is a careful design which prevents a problem from occurring in the first place. Either eliminate error-prone conditions or check for them and present users with a confirmation option before they commit to the action.

6. Recognition rather than recall: Minimize the user’s memory load by making objects, actions, and options visible. The user should not have to remember information from one part of the dialogue to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.

7. Flexibility and efficiency of use: Accelerators – unseen by the novice user – may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.

8. Aesthetic and minimalist design: Dialogues should not contain information which is irrelevant or rarely needed. Every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.

9. Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.

10. Help and documentation: Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too large.

Table 1: Various usability evaluation methods and their corresponding category [7].

Category             Usability evaluation methods
Inspection methods   Heuristics, Cognitive walkthroughs, Pluralistic
                     walkthroughs, Feature inspections, Guideline checklist,
                     Perspective-based inspection
Testing methods      Co-discovery, Question asking protocol, Think aloud
                     protocol, Performance measurement, Field observation,
                     Laboratory observation
User reports         Interview, Questionnaire
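To make the bookkeeping of a heuristic evaluation concrete, the findings can be logged as (heuristic number, problem, severity) records and then tallied per heuristic. The following is a minimal illustrative sketch, not tooling from this study; the example findings and the 0–4 severity scale are assumptions for illustration.

```python
# Illustrative sketch (not from the paper): recording heuristic-evaluation
# findings as (heuristic number, problem, severity) and summarising them.
# Severity values here follow a common 0-4 rating scale (an assumption).
from collections import Counter

HEURISTICS = {
    1: "Visibility of system status",
    2: "Match between system and the real world",
    3: "User control and freedom",
    4: "Consistency and standards",
    5: "Error prevention",
    6: "Recognition rather than recall",
    7: "Flexibility and efficiency of use",
    8: "Aesthetic and minimalist design",
    9: "Help users recognize, diagnose, and recover from errors",
    10: "Help and documentation",
}

def summarise(findings):
    """Count how many problems were logged against each heuristic."""
    counts = Counter(h for h, _problem, _severity in findings)
    return {HEURISTICS[h]: n for h, n in counts.items()}

# Hypothetical findings an evaluator might log for an AR application:
findings = [
    (1, "No feedback while the 3D model is loading", 3),
    (1, "Tracking loss is not signalled to the user", 4),
    (10, "No in-app help for creating a marker", 2),
]
print(summarise(findings))
```

Such a per-heuristic tally makes it easy to see which usability principles the interface violates most often.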

Cognitive walkthrough [10]: Cognitive walkthrough is an evaluation method which is used to inspect the interaction between the user and the interface through some pre-defined tasks.



In this method, the main focus is on exploratory learning [11]. Exploratory learning in this context means how well a novice user is able to use the interface without any prior training. This method can be applied either at the early stage of designing an interface with paper prototypes or during the beta testing phase. This method also includes recruiting evaluators who are system designers and designers who are novice to the given user interface [12]. The system designers prepare the first phase of activity, which involves preparing the task and the corrective actions. The designers who are novice to the given user interface then try to analyse the task from the perspective of a novice user. This method is based on the CE+ theory [13], which defines four phases of activity as follows:

• The user sets a goal to be accomplished.

• The user searches the interface for available actions.

• The user selects an action that seems likely to make progress toward the goal.

• The user performs the action and checks to see whether the feedback indicates that progress is being made towards the goal.

For every given step that a user would have taken to complete a task, all the above steps are repeated. The task of the system designers in the first phase includes several prerequisites to the cognitive walkthrough procedure, which are [13]:

1. A general description of who the users will be and what relevant knowledge they possess.

2. A specific description of one or more representative tasks to be performed with the system.

3. A list of the correct actions required to complete each of these tasks with the interface being evaluated.

Then the designer who is novice to the user interface tries to answer the following four questions for each correct action involved in a task [10]:

1. Will the user try to achieve the right effect?

2. Will the user notice that the correct action is available?

3. Will the user associate the correct action with the desired effect?

4. If the correct action is performed, will the user see that progress is being made towards the solution of the task?
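The per-action record produced by these four questions can be sketched as a small data structure: an action "fails" the walkthrough if any of the four answers is "no". This is an illustrative sketch only; the action names and answers below are hypothetical, not the study's actual data.

```python
# Illustrative sketch (not the authors' tooling): recording the four
# cognitive-walkthrough questions for each correct action of a task.
# An action fails the walkthrough if any answer is False ("no").

QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the desired effect?",
    "If the correct action is performed, will the user see progress?",
)

def walkthrough(actions):
    """actions: list of (action name, tuple of 4 yes/no answers).
    Returns the names of actions where the walkthrough predicts failure."""
    failures = []
    for name, answers in actions:
        if len(answers) != len(QUESTIONS):
            raise ValueError("one answer per question is required")
        if not all(answers):
            failures.append(name)
    return failures

# Hypothetical answers for two steps of a "place a TV on a table" task:
actions = [
    ("Tap BROWSE to open the 3D galleries", (True, True, True, True)),
    ("Create a marker from the provided image", (True, False, True, True)),
]
print(walkthrough(actions))
```

The returned list points the evaluators directly at the steps where a novice user is predicted to get stuck.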

Testing methods: In these methods, users are involved. These methods are more time consuming and costly than the other categories [14]. These methods measure the extent to which the product satisfies its target users. Factors affecting usability testing are shown in Figure 2. Methods like co-discovery, question asking protocol, think aloud protocol, performance measurement, laboratory testing and field observation fall in this category. In this project, one of these methods has been chosen to evaluate the AR prototype. The chosen method is:

Figure 2: Usability testing evaluation methods [15].

Laboratory evaluation [15]: Laboratory evaluation is done in a controlled environment where the experimenter has full control of the assignment of subjects, treatment variables and manipulation of variables [16]. This “laboratory” does not need to be a dedicated laboratory [17]. Laboratory in this context means a controlled environment which mimics a real life scenario. This method is most advantageous for evaluating user interfaces because the experimenter has the possibility to make video/audio recordings of the user interface and user interactions. This helps the experimenter to analyse how users are going to use a given user interface. Since, in laboratory evaluation, the experimenter has full control of the assignment of variables and the recruitment of participants, the experimenter can recruit participants depending on the target users. The participants can be novice users of the application, users who have used similar types of applications before, people with a computer science (CS) background (like CS students, researchers, professors) or with a non-CS background. Laboratory evaluation gives the experimenter entire freedom to choose the target users; it all depends on the experimenter’s needs in evaluating the application. The users perform a set of pre-defined tasks in a usability laboratory. In laboratory testing, the experimenter can provide the users with two types of tasks to find usability problems: structured tasks and unstructured tasks [15]. The details of these tasks are explained below:

1. Structured tasks: These tasks are structured in a way that the experimenter creates a step-by-step to-do list which the user follows in order to complete the task. The steps needed to complete the task are clearly defined by the experimenter. These tasks can be written down in a very detailed manner, for example by providing a realistic scenario explaining what the user needs to do. Figure 3 demonstrates an example of a structured task.

2. Unstructured tasks: These tasks are written down at an abstract level. The users have full control of the steps that need to be taken in order to complete the given task. In our project, we have used this task type; hence an example of this type of task can be found in our task description.
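Whichever task type is used, evaluators typically reduce the session recordings to simple quantitative measures such as error count and time spent. A minimal sketch of such a reduction, assuming a hypothetical per-session event log transcribed from the recordings (the event names are our own, not from the study):

```python
# Illustrative sketch, not from the paper: deriving common laboratory-
# observation measures (errors, time on task) from a hypothetical
# per-participant event log transcribed from the session recording.

def session_metrics(log):
    """log: list of (timestamp in seconds, event name) tuples, in order."""
    errors = sum(1 for _t, event in log if event == "error")
    duration = log[-1][0] - log[0][0] if log else 0
    return {"errors": errors, "time_spent_s": duration}

# Hypothetical transcription of one recorded session:
log = [
    (0, "task_start"),
    (35, "error"),    # e.g. participant tapped SCAN instead of BROWSE
    (80, "error"),
    (210, "task_done"),
]
print(session_metrics(log))
```

Aggregating such per-session metrics across participants gives the error and timing figures that laboratory observation reports are usually built on.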

Usually, video/audio recordings of the user tasks and interactions with the user interface are made. From the recorded videos/audios, evaluators try to analyse the number of errors, time spent and user satisfaction.

Figure 3: A sample of a structured user task definition [18].

User reports: In these methods, users are involved. These methods are less time consuming than the other categories. Methods like interviews and questionnaires fall in this category. In this project, one of these methods has been chosen to evaluate the AR prototype. The chosen method is:

Questionnaire [19]: Questionnaires are usually administered before or after the testing methods. It is often difficult to measure certain aspects of users objectively; in those cases this method is used to gather subjective data from users. It involves querying users to gather opinions and preferences related to the operability, effectiveness, understandability and aesthetics of the user interface [4]. This method is referred to as an indirect method and cannot be used as a single method to analyse the user interface. The main reason for this is that the technique does not analyse the user interface of the prototype but collects opinions about it from the users. Raw data of users’ behaviours collected with other methods are considered more reliable than raw data of users’ opinions about the user interface [19]. In this method, users are first allowed to use the prototype, and then they fill in or rate pre-defined questions prepared by the evaluators. The evaluators collect the data and try to analyse them in some statistical format.
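The statistical analysis of questionnaire answers is often as simple as summarising per-question ratings. A minimal sketch, assuming hypothetical 1–5 Likert-scale ratings (the data below are invented for illustration):

```python
# Illustrative sketch (hypothetical data): summarising 1-5 Likert-scale
# questionnaire ratings with simple descriptive statistics.
from statistics import mean, median

def summarise_question(ratings):
    """Return mean, median and sample size for one question's ratings."""
    return {"mean": round(mean(ratings), 2),
            "median": median(ratings),
            "n": len(ratings)}

# Hypothetical ratings from five participants for one question:
ratings = [4, 5, 3, 4, 4]
print(summarise_question(ratings))
```

Reporting the mean alongside the median helps flag skewed answer distributions, which are common with small participant counts.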

3. Evaluated application

This section provides an overall description of the AR application that has been used in this project for the usability study.

3.1. System Description

The selected application for the usability study is “Augment”. The application is targeted at users who want to provide AR solutions for the sales and marketing fields. However, the application can be used by anyone for fun or for other ideas that users may come up with. Figure 4 shows a screenshot of the initial view of “Augment”. The developers [6] describe the application as:

“Augment is a mobile app that lets you and your customers visualize your 3D models in Augmented Reality, integrated in real time in their actual size and environment. Augment is the perfect Augmented Reality app to boost your sales and bring your print to life in 3 simple steps.”

The application has two versions that provide different features. The versions are:

i. Free version.

ii. Paid versions (Starter, Business, Enter-prise).

The free version has limited features compared to the paid versions. In the paid versions, users can upload their own 3D models and markers, use the History feature, etc. For the study presented in this document, the free version has been chosen. The 3D models can be uploaded from the following sites/software:

i. Sketchup.

ii. ArtiosCAD.

iii. AutoCAD.

iv. Rhino V5.

Figure 4: A screenshot of the initial view of “Augment”.

It also supports plugins for the following software:

i. 3ds Max plugin (for 3ds Max 2012, 2013, 2014).

ii. Cinema 4D plugin (for both Mac and PC).

iii. Blender plugin (for Blender 2.68 and above).

3.2. Software Requirements

“Augment” is an AR application currently available on the following platforms:

i. Android.

ii. iOS.

A Windows version of this application is not yet available. “Augment” supports 3D models in the Collada, Wavefront and STL file formats. Table 2 provides a more detailed description.



Table 2: The 3 standard 3D formats supported in Augment that can be exported from most 3D software [6].

3D file format   Extension      The following zipped files should be uploaded on Augment Manager
Collada          .dae or .zae   .dae + texture files, or .zae alone
Wavefront        .obj           .obj + .mtl (materials) + textures
STL              .stl           .stl only (an STL file is never textured and appears blue in Augment)
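The packaging rules of Table 2 can be expressed as a small lookup. The rules themselves come from the table; the function and its name are our own illustration, not part of Augment's actual upload API.

```python
# Illustrative sketch of the packaging rules in Table 2. The rules are from
# the table; the function itself is our own illustration, not Augment's API.

ALLOWED = {
    ".dae": "zip of .dae + texture files",
    ".zae": ".zae alone",
    ".obj": "zip of .obj + .mtl (materials) + textures",
    ".stl": ".stl only (untextured, shown in blue)",
}

def packaging_hint(filename):
    """Return how a model file should be packaged for upload, per Table 2."""
    for ext, hint in ALLOWED.items():
        if filename.lower().endswith(ext):
            return hint
    return "unsupported format"

print(packaging_hint("chair.OBJ"))
print(packaging_hint("chair.fbx"))
```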

3.3. System architecture

Due to the fact that the application has been made by a third party and not by the authors, a detailed description of the architecture of the application cannot be provided in this document. Also, the results obtained in this document cannot be directly applied to improve the design of the application.

3.4. System Functionality

The application allows users to create AR experiences with their phones and tablets. The application can be used for 2 purposes. The purposes are:

i. Sales and design.

ii. Interactive print.

Sales and design: This functionality of the application is represented by the “BROWSE” option (see Figure 4) on the user interface. This option allows users to select the desired 3D model to use for the augmentation. After the selection of the 3D model, the application view changes to the camera view and the positioning of the 3D model begins (in the cases where the feature is available). From that view, the user is also able to create a marker for the augmentation and use the different options for 3D model manipulation (rotation, translation and scaling) and sharing (using e-mail and/or social networks). Figure 5 shows how this feature works.

Figure 5: A screenshot of 3D Model user interface [6].

Interactive print: This functionality of the application is represented by the “SCAN” option (see Figure 4) on the user interface. This option is intended to be used in the cases where the user either wants to scan a QR code or aims to detect one of the predefined markers that the application contains. This option can also be used for visualizing models of a catalogue in 3D. The only requirement is that the image to be used for the augmentation is registered on their site. In order for users to know this, each image in the catalogue should contain the “Augment” logo. Figure 6 shows how this feature works. The possibility of scanning QR codes has been analysed only in the heuristic evaluation due to the limitations of the available time.

The application provides several 3D models that can be used to augment the real environment. “Augment” uses image tracking technology for the positioning and orientation of the 3D models. Users can decide to use the predefined images or to use their own images by creating a marker within the application. A marker is an image that the “Augment” application recognizes in order to place the 3D model in space, at the right scale, in Augmented Reality. Figure 7 shows a screenshot of the “Augment” application where a virtual stool has been augmented over a cover image from a magazine which is acting as a marker.

Figure 6: A screenshot of the Scan user interface [6].

Figure 7: Screenshot from “Augment” showing how a marker can be used to place a 3D model.

As explained before, Android has been selected as the operating system for the analysis of the four evaluation methods. With the aim of analysing the consistency of the application through different use cases, two devices have been selected to perform the evaluations. The selected devices are a mobile phone and a tablet (see specifications in Appendix A). The reason for the selection of these devices is that they are different enough in terms of screen, camera resolution and processor to detect inconsistencies in the design of the application.

Table 3: Maximum number of textures per model according to the selected 3D model's texture resolution [6].

Color space   512 x 512         1024 x 1024       2048 x 2048
RGB           33 textures max   8 textures max    2 textures max
RGBA          25 textures max   6 textures max    1 texture max

3.5. System limitations

This application has the following two limitations [6]:

i. Platform limitation.

ii. 3D model limitations.

Platform limitation: This application is currently not available for Windows. Therefore, users having a Windows phone or tablet cannot access this application, even if it interests them.

3D model limitations: Since a mobile device is not as powerful as a professional computer, the 3D models in Augment have certain limitations on total polygon count and file size. This means that not every 3D model will be compatible with this application. The current polygon count limit is between 125,000 and 400,000 polygons. The zip file uploaded to “Augment” (see Table 2) must not exceed 15 MB. There is also a limitation on the file size of the textures of 3D models. The details are shown in Table 3.
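The numeric limits above lend themselves to a simple pre-upload check. The polygon and zip-size figures come from the text; the checker itself, including the assumption that the polygon cap varies by device within the stated range, is our own illustrative sketch, not a feature of Augment.

```python
# Illustrative sketch of the 3D-model limits described above: a polygon cap
# between 125,000 and 400,000 (assumed to depend on the device) and a 15 MB
# zip limit. This checker is our own illustration, not Augment's behaviour.

MAX_ZIP_MB = 15

def check_model(polygons, zip_size_mb, device_polygon_cap=125_000):
    """Return a list of human-readable problems; an empty list means the
    model fits within the limits stated in the paper."""
    problems = []
    if polygons > device_polygon_cap:
        problems.append(f"too many polygons: {polygons} > {device_polygon_cap}")
    if zip_size_mb > MAX_ZIP_MB:
        problems.append(f"zip too large: {zip_size_mb} MB > {MAX_ZIP_MB} MB")
    return problems

print(check_model(polygons=90_000, zip_size_mb=10))   # fits the limits
print(check_model(polygons=500_000, zip_size_mb=20))  # violates both
```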

These limitations somewhat restrict the usage of this application. The above mentioned feature limitations are not easy to understand for users with a non-CS background. Even users with a CS background may need some knowledge about graphics.

4. Usability methods

In this project, four usability methods have been chosen to evaluate the AR application “Augment”. The selected usability methods for the proposed study are the following:

i. Usability inspection: Cognitive walkthrough.

ii. Usability inspection: Heuristic evaluation.

iii. Testing methods: Laboratory observation.

iv. User reports: Questionnaires.

The details of each method are described below. Each of the subsections below is further sub-divided into experimental design and results. Experimental design describes how the method was adapted. Results describes the outputs obtained from the respective method.

4.1. Cognitive walkthrough

4.1.1. Experimental design

As mentioned in Section 2, this method is divided into 2 phases:

• Preparation phase

• Evaluation phase

The details of the experiment are mentioned below:

Participants/Evaluators: This method has been performed by two evaluators. One of the evaluators performed the preparation phase and the other performed the evaluation phase. Since the application used in this project was not developed by the authors of this document, the authors played the evaluators' roles: one was a doctoral student in the AR area and the other a research assistant in the human-computer interaction area. Both evaluators were familiar with AR concepts and some concepts of human-computer interaction. Neither evaluator had previous experience of using the cognitive walkthrough method for evaluating applications. Both evaluators were provided with lectures on usability evaluation methods, so that the method could be applied in a proper way.

Procedure: This phase is often referred to as the preparation phase. In this phase, one of the evaluators (referred to as evaluator 1) prepared the task that was to be evaluated by the other evaluator and defined the target users of the application.

Task chosen: The first step in the cognitive walkthrough consists of one of the evaluators (referred to as evaluator 1) choosing the task that will be evaluated. Since the application used in this project, “Augment”, is targeted at purchasing products from catalogues, evaluator 1 defined a task of choosing a 3D model to imagine how the model would look if placed in the desired surrounding.

Task description: The task description was provided from the point of view of first-time users of the application [20]. Evaluator 1 described a primary task that a user might do with the given application. The described task contained further sub-tasks which were required to achieve the higher-level task described below [21]. The task prepared by evaluator 1 was:

Select a Samsung Smart TV 55” and place it in a realistic scale over a table. Then, take a photo and save it. For the augmentation, use the provided image to create a marker. The system will be in a state such that someone could immediately start testing.

Correct sequence of actions: For each task that is analysed, a corresponding correct sequence of actions is described. The correct actions to complete the above defined task are described in Figure 8.

Anticipated users: Evaluator 1 described the targeted users as people who have experience with smartphones but limited experience with AR applications on smartphones. They should have basic knowledge of how to use an application on a smartphone and will have gone through the demo of the AR application.

User’s initial goal: If the user might have some other goals at the beginning of the task, then they are listed down. Here, evaluator 1 noted what goals the user might have at the start of the interaction. In our case, the user might have 2 goals:

i. The user might select the option to “browse” the 3D galleries (a step towards our task) - success.

ii. The user might select the option to “scan” 2D catalogues - failure.

Figure 8: Correct steps to be taken to complete the defined task.

According to evaluator 1, if the user chose the first option, to "browse galleries", the user would definitely complete the given task and it would be a "success" story. It might also happen that the user chooses the second option, to "scan images"; in that case, no matter which correct actions the user takes afterwards, the user will not reach the given goal and the result would be a "failure". Appendix B shows the cognitive start-up sheet used in this project.

Data collection: This phase is often known as the evaluation phase. Data were collected in the following way: evaluator 1 served as "scribe", who recorded the actions. Evaluator 2 served as "facilitator", who performed the task and evaluated the user interface. A complete analysis of the interaction between the user and the interface was done. The evaluation was done in the following 3 ways:

i. The facilitator tried to compare the user's goals and the goals required to operate the user interface.

ii. Given a goal, the facilitator tried to determine the problems a user might have in selecting the appropriate action to go a step further towards the goal.

iii. The facilitator tried to evaluate how likely it is that the users' goals might change with the correct user actions and the system's response.

The facilitator chose an action and recorded answers to the four questions (mentioned in Section 2). To assist this process, the scribe used a data collection sheet, mentioned in the preparation phase, containing the four questions and the list of possible user actions. Second, the scribe took individual notes on the corrective steps provided by the facilitator.

The cognitive walkthrough session took approximately two hours to complete. At the end of the cognitive walkthrough, the evaluators expressed their overall conclusions about the AR application according to the task.

Data analysis: For every step of the given task the facilitator answered the following four questions:

1. Will the user try to achieve the right effect?

2. Will the user notice that the correct action is available?

3. Will the user associate the correct action with the desired effect?

4. If the correct action is performed, will the user see that progress is being made towards the solution of the task?

The scribe recorded the following data:

1. Number of attempts to complete the task.

2. Bugs and usability design issues.

3. Number of times the application crashed.
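As a purely illustrative sketch of the record keeping described above (the structure, names and example step are hypothetical, not taken from the study's actual data-collection sheet), each evaluated step can be paired with the four questions and the facilitator's answers:

```python
from dataclasses import dataclass, field

# The four cognitive-walkthrough questions answered for every step.
CW_QUESTIONS = (
    "Will the user try to achieve the right effect?",
    "Will the user notice that the correct action is available?",
    "Will the user associate the correct action with the desired effect?",
    "If the correct action is performed, will the user see progress?",
)

@dataclass
class StepRecord:
    """One row of a scribe's data-collection sheet (hypothetical layout)."""
    step: str                                     # e.g. "Open the application"
    answers: dict = field(default_factory=dict)   # question -> "yes"/"no"
    notes: str = ""                               # corrective steps noted

def record_step(step, yes_no, notes=""):
    """Pair each of the four questions with the facilitator's yes/no answer."""
    return StepRecord(step, dict(zip(CW_QUESTIONS, yes_no)), notes)

# Hypothetical example: a step where question 2 fails (hard-to-notice action).
row = record_step("Swipe the menu bar to reveal more options",
                  ["yes", "no", "yes", "yes"],
                  notes="Menu affordance not visible; user needs a hint.")
failed = [q for q, a in row.answers.items() if a == "no"]
```

Steps where any question is answered "no" are the candidate usability problems that the scribe's notes then elaborate.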

4.1.2. Results

In this section, the outcomes of the cognitive walkthrough are presented. Each of the sub-topics corresponds to one of the measured data that have been analysed. The evaluations of the facilitator are shown in Appendix B. The evaluator has used a mobile device to evaluate the AR application.

Table 4: Specific issues identified during the cognitive walkthrough. Solutions marked with an asterisk (*) indicate changes that were discussed before the CW was done, but were also revealed by the CW.

Description | Usability impact
After opening the application user may get confused with 2 options (*) | Serious
User may get confused in choosing the desired 3D model | Cosmetic
User may not know that swiping the menu bar will show more options | Critical
User may not be able to create the marker in a proper way | Critical
User may not be able to enlarge the 3D model as desired | Critical
Option to rotate the marker is not visible | Cosmetic
User may not use the "help" menu (*) | Serious

Number of attempts to complete the task: The facilitator had sufficient previous knowledge of using the AR application. Hence, the task was completed at the first attempt. This previous knowledge has biased this result. Due to limited time and resources, hiring a separate evaluator (having no previous knowledge of the system) was not possible. If the facilitator had not had any previous knowledge of the AR application, this result might have changed.

Bugs and usability design issues: The evaluators have identified two software bugs that had some impact on the evaluators' ability to complete tasks efficiently. The first is the rate at which the application crashed. The second is the device not working properly. These results are reported in the next part. In addition, the evaluators identified 7 areas where "Augment" could be improved to make it easier to use, easier to learn by exploration, and to better support achievement of goals. The design issues judged to have a Critical, Serious or Cosmetic impact on usability are listed in Table 4. The definitions of the criteria used are given below [22]:

• Critical problems

Will prevent users from completing tasks and/or

Will recur across the majority of the users.

• Serious problems

Will increase users' time to complete the task severely and/or

Will recur frequently across users and

Users will still manage to complete the task eventually.

• Cosmetic problems

Will increase users' time to complete the task slightly and/or

Will recur infrequently across users and

Users will complete the task easily.
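These criteria can be read as a simple decision rule. The following sketch is our own illustration of that rule, with hypothetical boolean observations as inputs; it is not part of the original evaluation:

```python
def severity(prevents_completion, recurs_widely, severe_time_impact):
    """Map observed criteria to a severity category.

    Follows the definitions above: critical problems block task completion
    or recur across the majority of users; serious problems slow users down
    severely but the task is still completed eventually; anything milder
    is cosmetic.
    """
    if prevents_completion or recurs_widely:
        return "critical"
    if severe_time_impact:
        return "serious"
    return "cosmetic"

# Hypothetical examples loosely mirroring the kinds of issues in Table 4:
swipe_menu = severity(prevents_completion=True, recurs_widely=True,
                      severe_time_impact=True)
help_menu = severity(prevents_completion=False, recurs_widely=False,
                     severe_time_impact=True)
```

In practice the "and/or" clauses leave room for evaluator judgment; the rule above is only one consistent reading of them.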

Number of times the application crashed: The facilitator used a mobile device to evaluate the application. The application crashed 5 times during the evaluation phase, which lasted two hours. Had the facilitator used a tablet, the results might have been different.

The scribe used a tablet to prepare the experiment. Although the application did not crash on the tablet, the device stopped working properly after the application was installed. Hence, users might be forced to uninstall the application, even if it served their purpose of visualising a catalogue in 3D.

4.2. Heuristic evaluation

4.2.1. Experimental design

The heuristic evaluation method requires more than one evaluator to be reliable, because a single evaluator cannot find all design problems. As a result of the limited resources, the authors of this document have acted as evaluators. The evaluation has been based on the 10 heuristics described by Nielsen [8, 9]. The evaluators have analysed the application against the 10 heuristics individually, without any interaction. After the individual evaluation, the results have been compared in a group session.

Each evaluator has used the application individually and with a different device (the device specifications can be found in Appendix A). Evaluator 1 has used the mobile phone with the application in Spanish, while evaluator 2 has used the tablet with the application in English.

4.2.2. Results

Each evaluator has provided an individual report of the heuristic evaluation. The reports can be found in Appendix C. In this section, the final results after both evaluations are discussed. Table 5 provides a summary of the errors found by the two evaluators. As the evaluations were carried out individually, the reports obtained from the two evaluators were not unified. Therefore, a table summarising the number of design problems found in both cases was needed.

As can be seen from the table, several design problems have been found in both evaluations. Also, as Nielsen stated, not all problems have been found by both evaluators, and therefore the need for more than one evaluator is justified. Some of the problems, though, have been found by only one evaluator as a result of the use of different devices. Also, some problems are related to translation only, and therefore have been found only by evaluator 1, who used the application in Spanish.
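Nielsen's point can be made concrete by comparing the sets of problem identifiers reported by each evaluator. A minimal sketch, using made-up identifiers rather than the actual problem numbers from Appendix C:

```python
# Hypothetical problem IDs found by each evaluator (not the real numbering).
evaluator_1 = {"p3", "p4", "p7", "p8"}
evaluator_2 = {"p4", "p9"}

found_by_both = evaluator_1 & evaluator_2     # overlap between the reports
all_known = evaluator_1 | evaluator_2         # union: every problem found
missed_by_one = all_known - found_by_both     # problems a single evaluator
                                              # would have missed

coverage_1 = len(evaluator_1) / len(all_known)  # share found by evaluator 1
```

Whenever `missed_by_one` is non-empty, relying on a single evaluator would have left problems undetected, which is exactly the situation observed in Table 5.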

There are several problems that have been found by both evaluators. In fact, there is at least one design problem for every heuristic, which means that additional effort is needed to enhance the application. One important problem stated by both evaluators is the way in which the two main options of the application (Browse and Scan) are displayed. When starting the application, both options are displayed without further help (violation of heuristic 10). As a result, the user may get confused (violation of heuristic 1) and choose the wrong option. Moreover, if that happens, the back button (which by default is used to return to the previous view in Android) exits the application, preventing the user from undoing the mistake (violating the error prevention heuristic) and violating the standards (heuristic number 4) of the platform.

Some functional problems have also been considered design problems, as they also violate some of the heuristics. The option of scanning a tracker is sometimes confusing, as there is no feedback on why a tracker is good or not. Although the application clearly shows whether a tracker is valid, there is no information on why, or on how to obtain a valid tracker. Also, when a QR code is not recognised by the application, there is no further information about why it has not been recognised, even if the QR code was on the screen. These problems violate several heuristics, such as 1, 2, 5, 6 and 9.

Another common problem found by both evaluators is related to the manipulation of the 3D objects. Rotation and translation of the 3D models is not possible in all axes. Moreover, adding new models to the augmented view resets all changes made by the user in terms of rotation, translation and scaling (without an undo option, which violates heuristic 5).

As explained before, some problems are related to the specific device. From evaluator 1 (mobile phone), problems numbered 3, 7 and 8 do not appear on the tablet. Therefore, a problem of consistency in the application has been found when comparing results from both evaluators. The application should maintain consistency across all screen sizes.

Regarding language translation, several design problems have been found. Some of them are related to missing translations, which means that the messages were displayed in English although the application was in Spanish (problems numbered 3, 16 and 22 from evaluator 1). Another kind of language problem is related to a bad translation of the original words, creating non-understandable texts (problems numbered 17, 18 and 21 from evaluator 1). Therefore, several improvements in the translation of the application are recommended from the heuristic evaluation.

Table 5: Summary of the problems found by both evaluators correlated to the 10 heuristics. For each evaluator, the number of design problems found for each heuristic is shown. Also, the identifying number of each problem is shown in parenthesis.

Heuristic | Evaluator 1, number of problems found (# problem) | Evaluator 2, number of problems found (# problem)
1. Visibility of system status | 10 (3, 4, 6, 7, 8, 10, 11, 13, 19, 29) | 2 (1.i, 1.ii)
2. Match between system and the real world | 9 (3, 8, 12, 16, 17, 18, 20, 21, 22) | 1 (2)
3. User control and freedom | 1 (2) | 1 (3)
4. Consistency and standards | 4 (1, 2, 7, 8) | 1 (4)
5. Error prevention | 5 (4, 14, 30, 31, 33) | 5 (5.i, 5.ii, 5.iii, 5.iv, 5.v)
6. Recognition rather than recall | 2 (7, 10) | 2 (6.i, 6.ii)
7. Flexibility and efficiency of use | 3 (7, 9, 26) | 1 (7)
8. Aesthetic and minimalist design | 3 (24, 25, 27) | 0
9. Help users recognize, diagnose, and recover from errors | 2 (3, 5) | 6 (5.i, 5.ii, 5.iii, 5.iv, 9.i, 9.ii)
10. Help and documentation | 1 (22) | 1 (10)

A comprehensive list of all design problems found in both evaluations can be found in Appendix C.

4.3. Laboratory observation

4.3.1. Experimental design

In order to evaluate the AR application from the perspective of the target users, a laboratory evaluation has been performed. In this project, the goal of using the laboratory evaluation method is to determine how well smartphone users are able to use the given AR application, in this case "Augment", for a desired purpose (described by the authors of this document). The details of the laboratory evaluation method deployed are described below.

Setting: The usability laboratory was set up to simulate a part of a room with a TV table, as at home. The laboratory set-up was made to look as realistic as possible, so that users would perform the tasks as a customer would have done at home. Due to limited resources, there was a small difference, as shown in Figure 9(a) and Figure 9(b), between how a TV table would have been placed at home and how it was placed in the laboratory. The test environment was controlled in the sense that all the users were provided the same device, all the users performed the experiment in a very quiet environment, all the tests were done using the same network with the same speed, and the language of the application was English for all the users. The device that was provided to the users was:

• Samsung Galaxy Tab 10.1 (Touchscreen)


(a) TV table at home (b) TV table in laboratory environment

Figure 9: A comparison of real life scenario with laboratory set up.

Model Number: GT-P7500 (see Appendix A for specifications).


Test subjects: Seven users (four females and three males) aged between 19 and 24 years were recruited to participate in the experiment. All the users were bachelor's or master's degree students of Computer Science at the University of Helsinki. All the recruited users were smartphone users, but none of them had any experience with AR applications. To avoid biasing the results, none of the users had any personal relation with the experimenters. None of the users had English as their mother tongue, but all were able to read, write, speak and understand English properly. Only English-speaking users were recruited, as the application's language was set to English. Hence, language played an important role in understanding the user interface of "Augment".

Tasks: All the users were given the same task to perform. The task provided to the users was unstructured. The task was described from the perspective of the purpose of the AR application being evaluated ("Augment"). Since "Augment" is targeted at customers who would visualise the 2D model from the catalogue in 3D and try to see how the model looks in their surroundings, the task was described in that manner. We tried to provide the task that a user would perform before purchasing a smart TV for his/her home. The following is the exact task definition that was provided to the users:

Use the application to select a 3D model which should be Samsung Smart TV 55” and imagine that you will buy a Samsung Smart TV 55”. Before purchasing the Samsung Smart TV 55” you would like to visualise how Samsung Smart TV 55” would look in your surrounding. So, place the 3D model in such a way as you would have done in a real scenario. We will provide you with an image over which you will place the 3D model.

Procedure: Before the start of the laboratory evaluations, all the users were given a live demo of the application. All the users were asked whether they felt comfortable about using the application. After assuring that the users were confident about using the application, the experiment started. The demo provided to the users included choosing the 3D models functionality; how to select the marker; how to place the 3D model over the marker; how to rotate and how to move the 3D model; how to zoom in or zoom out the 3D model; how to use the help menus; and how to save the desired AR user interface. We did not specify a time limit to the users. The evaluations lasted for 5-7 minutes, followed by a questionnaire session.

Roles: The evaluation sessions were conducted by the 2 authors of this document. One author showed a live demo of the AR application to the users in all sessions and the other author did a video recording of the user interactions in all sessions.

Data collection: The data collection of the laboratory evaluations was done by video and audio recording of the user interactions. All the recordings were done after having permission from all the users. The video and audio recordings were done using a Canon PowerShot A2500. Figure 10(a) and Figure 10(b) show screenshots from the video recordings of user interactions from 2 different users.

(a) user 1 selecting the 3D model (b) user 2 placing the 3D model in desired position

Figure 10: Screenshots from the video recordings of user interactions from 2 different users.

Data analysis: The analysis of the video and audio recordings was done accurately so that the maximum number of usability problems could be found. The laboratory tests helped in analysing the following measures:

• Task completion time.

• Usability problems encountered by users.

• Problems that the cognitive walkthrough evaluator thought might never occur but actually occurred (the results of this measure are shown in comparison subsection 5.1).

• Success rate of task completion.

• Number of users who used the help menus.

A record of all the video numbers was kept while the above measurements were made. This was crucial because an analysis of the number of usability problems faced by each individual user was kept.

4.3.2. Results

In this section, the outcomes of the lab evaluation are presented. Each of the sub-topics corresponds to one of the measurements that have been analysed.

Task completion time: The time taken by each user to complete the given task was measured. The average task completion time of all the 7 users was 1.12 minutes. Though all the users were treated equally, one of the users (user number 4) had some advantages over the other users. User number 4 not only saw the live demo of the application (given by the experimenters) but also saw the previous user performing the same task. User number 4 also did a demo herself before performing the experiment. This affected the task completion time severely. User number 4 took approximately half the time that the other users took to complete the task. This demonstrates that once a user starts using this AR application, the user will get more familiar with the user interface and hence it will be much easier to complete the tasks. The graph in Figure 11 demonstrates these findings. Though user 4 completed the task in approximately half the duration of the other users, the user missed a few steps required to complete the task and also took a few wrong steps. All the details of these errors are described below.

Figure 11: Graph showing the task completion times of the 7 users. All the 7 users were novice users in terms of using AR applications. All the 7 users were shown a live demo of the application and how to perform the task. User number 4 saw the demo once, saw the previous user performing the same task and did a demo herself before performing the experiment. Hence, the task completion time of user 4 was approximately half that of the average of other users, but the user made a lot of errors while completing the task.

Usability problems encountered by users: In this case, the evaluators have analysed the problems faced by users by observing the recorded videos. The problems have been categorised according to the standard measures [23] used: 'critical', 'serious' and 'cosmetic' problems. The definitions of the criteria used in the case of laboratory observation are given below [22] (these are modified with respect to the ones used in Cognitive walkthrough - section 4.1):

• Critical problems

Prevented test subjects from completing tasks and/or

Recurred across the majority of the test subjects.

• Serious problems

Increased test subjects' time to complete the task severely and/or

Recurred frequently across test subjects and

Test subjects still managed to complete the task eventually.

• Cosmetic problems

Increased test subjects' time to complete the task slightly and/or

Recurred infrequently across test subjects and

Test subjects could complete the task easily.

A total of six usability problems (one critical, three serious and two cosmetic problems) were experienced by all the 7 users in the laboratory observation. Since the testing was done using a tablet, most of the problems that would have occurred using a mobile device did not come forward. Table 6 shows a summary of the problems experienced by the 7 users in the laboratory observation. The detailed descriptions of the problems have been attached as an appendix to this report.

Table 6: Number of identified usability problems (total number of individual problems found in all users and total number of common problems experienced by all users).

Usability problems | Individual problems | Common problems
Critical problems | 3 | 1
Serious problems | 4 | 3
Cosmetic problems | 2 | 2

Success rate of task completion: Only user 1 and user 2 could complete the task successfully, as specified in the task. Both users took exactly the same steps as anticipated by the evaluator (who prepared the cognitive walkthrough task). The remaining 5 users completed the task but missed many steps that were required to complete it.

Number of users who used the help menus: None of the users utilised the help menu option provided in the user interface. Most of them asked for manual help from the experimenter. This observation clearly demonstrates that designers should try to design the user interface in such a manner that it requires as little learning as possible.

4.4. Questionnaire

4.4.1. Experimental design

After the laboratory tests, users were asked to fill in a questionnaire. The goal of the questionnaire is to evaluate the degree of user satisfaction after the laboratory tests performed by the users. The questionnaire contains 13 statements that have been designed taking into account the results of the heuristic evaluation. However, in this chapter we analyse the results of the questionnaires as an isolated usability evaluation method.

As stated before, the users that have filled in the questionnaire are the same users that performed the laboratory tests (i.e. 7 university students (4 female and 3 male) aged between 19 and 24, smartphone users and with no previous experience in AR applications). Users were asked to grade their conformity with the 13 statements from 1 (totally disagree) to 5 (totally agree), and they had the opportunity to comment on every statement individually in a text area. The questionnaire answers can be found in Appendix E. The reader should note that although the laboratory tests and the questionnaires have been carried out by the same users, the results obtained from the questionnaire do not reflect the same results obtained from observation of the laboratory tests. However, this chapter will not take this fact into account and will analyse the raw data of the questionnaires as if the observations obtained from the laboratory tests had not been performed.

4.4.2. Results

Figure 12 shows the frequency of each mark for the 13 statements.
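The frequencies of the kind plotted in Figure 12 can be recomputed from the raw questionnaire answers with a simple count. A minimal sketch, using illustrative grades rather than the study's actual data (which is in Appendix E):

```python
from collections import Counter

# Illustrative 1-5 grades given by 7 users for one statement
# (NOT the study's actual answers).
grades_statement = [4, 5, 4, 3, 5, 4, 4]

freq = Counter(grades_statement)              # mark -> how many users gave it
# One bar group of a Figure-12-style chart is then simply:
bars = [freq.get(mark, 0) for mark in range(1, 6)]
```

Repeating this count for each of the 13 statements yields the full frequency chart.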

In the following, the 13 statements and an analysis of the results obtained for each statement are presented.

Figure 12: Results from the questionnaires. The image shows the frequency of every mark for each statement. The reader should note that there were 4 statements (numbers 4, 5, 8 and 12 - refer to section 4.4.2) that were not answered by all users.

1. The system provided me with feedback on what I was working on. I was not confused or lost while performing the task.

The majority of the users felt that they were in control of the application and that they were not confused while using it.

2. The messages that appeared in the application were self-explanatory and I understood what they were trying to explain.
The messages were also clear for all users.

3. I was in control of the application all the time.
The majority of the users rated this statement positively. However, one user rated this statement as 2, showing that not all users are happy with the way the application is controlled.

4. I could easily undo/redo any action if I felt like doing it.
There is no uniformity in the results of this statement. One user commented that rotation with fingers was a hard task.

5. If I mistakenly chose a wrong 3D model, I could easily stop uploading it.
The majority of users rated this statement as 3. The reason for this grade is probably that they did not face such a problem (some stated this in the comment), as they felt pleased with the selected model, even if it was not the model that they had been asked to select.

6. The navigation through the application was easy.
Users were able to navigate through the application easily. The users were instructed before performing the tests, so the results of this statement are as expected.

7. The application had many errors and/or crashed.
All users agreed that this statement is false, as they did not encounter any error or crash (note that one user rated this statement as 5, but in the comment of this statement he wrote "Didn't crash at all, always a good thing for an app", which means that he misunderstood the marking of this statement).

8. The application always verified the 3D model before loading.
Users agreed with this statement. Probably the question should have been redefined to more clearly reflect the problems found with the other usability methods.

9. The option to select the desired function was always clear and available all the time, and the icon images helped me to know the appropriate functionality of the available options.
All users rated this statement as 4. This may suggest that although they were comfortable with the layout of the options, some improvements can still be made to provide a better experience. One user commented that the words are easy to understand even for a non-native English speaker.

10. It was easy to find the desired options at any time.
The grades of this statement are positive in general. However, as all users had been instructed before using the application, a more unified grading (i.e. a majority of 5s) would have been expected.

11. The application was well designed visually.
Users considered that the application is well designed visually.

12. If error messages appeared, they were clear in their description and probable steps to recover from them were provided.
Users rated this statement either as good (4 or 5) or as neutral (3). The reason for these grades is that users did not encounter errors while performing the task. This is also reflected in the comments on this statement.

13. When I needed help, the demo videos and feedback helped me to complete my task successfully.
The majority of users rated this statement positively. However, they may have been rating not only the help of the application, but also the instructions presented to them before the tests, as reflected in one comment.

As a general conclusion of the results obtained from the questionnaires, it can be said that users generally felt comfortable using the application. A larger number of users is required to obtain more robust conclusions. However, given the restricted conditions in terms of resources and time of this work, the results can be considered appropriate and may open a way for future evaluations. Instructing the users has probably introduced a bias in how a new user would use the application. However, the instruction session was essential, as users were not familiar with AR technology. Also, users did not encounter errors and crashes while using the application. Although this is a good aspect of the application, several errors were detected while performing the other usability evaluations with the phone. As these errors did not appear during the laboratory sessions, the design problems related to errors and their texts were not found by the users.

In a future laboratory session, it would be interesting to include tests with users that have not been instructed in the application before using it. In this new session, results would probably differ for some statements (e.g. statement 5). Also, another test with the phone instead of the tablet would probably have produced more errors/crashes in the tests, and some statements would have received different grades (e.g. statement 7).

It would also be important to carry out another session with users that are familiar with the concept of AR. From the results obtained in the grades and comments of the questionnaires, the majority of users were surprised by the novelty of what they were seeing for the first time. Therefore, users were concentrated more on understanding AR and how to use it than on detecting the real design problems. This could have been very different if the application being evaluated had been from a field more familiar to them, such as a messaging application.


In that case, they would have had the chance to compare how the application is designed against previous knowledge from earlier experiences. One positive aspect of this fact, however, is that although they were not familiar with AR, they were able to use the application rapidly, showing once more the fast learning curve of AR technology, as explained in [24].

Finally, one important design problem found while performing the other usability evaluations was the translation of text to Spanish. The laboratory tests have been performed with the English version only. In a future laboratory session, it would be good to include some tests with Spanish speakers in order to analyse these language problems. However, due to the restricted resources and time, and in order to maintain uniformity in the results, testing with other languages has been left outside of this first laboratory session.

5. Comparison of results

This section presents a comparison of the results found with the four usability methods. The purpose of this comparison is to find the most appropriate method to apply when evaluating an AR application. The main reason for analysing this factor is that, due to limited time and resources, organizations are not able to apply all the usability methods. Hence, choosing the best usability method is an important issue, so that most of the usability problems can be found by the method.

5.1. Cognitive walkthrough and lab observation results comparison

The problems that the users encountered with respect to the task analysis carried out in the Cognitive Walkthrough by the facilitator have been demonstrated. In the Cognitive Walkthrough, the facilitator analysed the same task that was given to the users to perform in the laboratory evaluation. The scribe provided the correct steps that a user would have taken to complete the task. Figure 8 describes the task, divided into steps, that the facilitator used to evaluate the user interface of the AR application ("Augment"). First, the behaviour of the users in those steps is described. Then, the opinion of the facilitator is compared with it. The following list shows the steps of the correct actions described in Figure 8 and, for each step, describes the corresponding user behaviour.

a) Since all users were smartphone users, they could easily open the application at the first attempt.

b) The analysis of this step is a bit tricky. This is because, in the demo session, none of the users was shown the possibility of selecting the second option. So, all users selected exactly the same option that was shown in the demo session. As experimenters, we think that if the users had been shown the possibility of selecting the second option, they would have done so. Hence, many of the users would have chosen the wrong option and would have been directed down the wrong path.

c) Though the exact name of the 3D model to be used was given in the search task, none of the users used the search box to search for the model. The possibility of using the search option was not shown in the demo session. Hence, this suggests that most users will probably ignore the search option provided at the top of the user interface of this application. Most of the users tried to fetch the 3D model from the catalogue of 3D models. The raw data obtained from the video recordings is shown in Appendix D. One of the serious problems found was finding the 3D model. A few users took some manual help from the experimenters, while a few users chose an entirely different 3D model than the one asked for in the task in order to move forward towards task completion. Figure 13 presents the findings: it shows the percentage of users who could easily find the 3D model and the percentage of users who had difficulty in finding it.

d1) The demo was shown using the tablet in horizontal orientation. Hence, all the users held the tablet in horizontal orientation. Therefore, it is not possible to make any evaluation of this step.

Figure 13: The blue color represents the percentage of users who could find the desired 3D model very easily. The red color represents the percentage of users who could not find the desired 3D model easily.

d2) One of the critical problems found was creating the tracker for placing the 3D model. Since a tablet was used for the usability testing, the 3D model appeared on the user interface even if the tracker was not created. Unfortunately, on a mobile device, the 3D model will not appear if users are not able to create the tracker properly. Hence, this problem was categorised as critical. If a mobile device had been used for the laboratory testing, most of the users would not have completed the task. Figure 14 summarizes the results of this observation: it shows the percentage of users who could easily create the tracker and the percentage of users who had difficulty in creating it. The raw data obtained from the video recordings is shown in Appendix D.

d3) To create the tracker, the user had to press the window button that provides a help menu with information about how to use the marker. Though the users who were able to create the marker opened the window, none of them read the help instructions. They performed as shown in the demo session.

d4) Perform the scan: this is the step where the creation of the marker is performed. Users who remembered the need for creating the marker were able to scan it, as they had been told in the demo session to press the screen when the green light appears. However, some users needed more time than others to obtain the green color, as there was no additional information in the application on how to improve the scanning of the marker.

Figure 14: The blue color represents the percentage of users who could easily create the tracker. The red color represents the percentage of users who could not create the tracker easily.

e) Most of the users could easily place the model. This is because a tablet was used in the laboratory evaluations; on the tablet, even if the user does not create a marker, the model will appear in front of the camera.

e.i) The flash option was not shown to the users in the demo. None of the users used the flash option, though it was one of the options displayed in the bottom menu.

e.ii) a) All the users could easily enlarge the 3D model. b) All the users could make the 3D model smaller. Both of these options were shown clearly in the demo session to all the users.

e.iii) Most of the users who used the rotate button tried to rotate the model in the x and z directions as well. The current state of the application allows rotation only in the y direction. Figure 15 summarizes the results of this observation. The raw data obtained from the video recordings is shown in Appendix D.

f) All the users could perform this step successfully, as it was shown in the demo session.

Figure 15: The plot shows that users using the "rotate" button tried to rotate the model in the x and z axes as well.

Hence, the results of the two methods demonstrate that the facilitator's opinion is not always accurate, due to the large variability of the users' behaviours. Thus, both methods should be used to find a larger number of usability problems.

5.2. Heuristic evaluation and questionnaire

In this section, the results obtained from the heuristic evaluation and the questionnaires are compared. Although the number of problems found by the evaluators in the heuristic evaluation was large, the users have not reflected this fact in the questionnaires. The statements presented in the questionnaire were designed to address the heuristic problems found. However, users rated all statements positively and, therefore, the problems found in the heuristic evaluation have not appeared in the questionnaires. Consequently, a proper comparison of results cannot be carried out for these two methods.

As mentioned before, several problems found in the heuristic evaluation are related to the specific device (mobile phone) and the language of the application (Spanish). As the questionnaires were filled in by the users who carried out the laboratory tests with the tablet and with the application in English, it is clear that those problems could not have been found in the questionnaires.

Some further tests have already been proposed in order to obtain a larger set of questionnaires that could help to detect design problems. However, with the current results, the questionnaire method cannot be suggested as a good tool for detecting design problems compared to the heuristic method. Therefore, for small-scale evaluations, the heuristic method is suggested if the number of users for the questionnaire method is low.
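Had the questionnaire ratings been less uniformly positive, even a simple aggregation could have surfaced problem areas. The sketch below is illustrative only: the statement texts and ratings are invented (not the study's data), and the threshold of 3 on the 5-point scale is an assumption.

```python
# Sketch (assumed data): aggregate 5-point Likert questionnaire ratings
# to flag statements that might indicate design problems.
# Statement names and ratings below are hypothetical.

def mean(xs):
    return sum(xs) / len(xs)

def flag_problem_statements(ratings_by_statement, threshold=3.0):
    """Return statements whose mean rating falls below the threshold
    (on a 1-5 scale where higher means more positive)."""
    return sorted(
        s for s, ratings in ratings_by_statement.items()
        if mean(ratings) < threshold
    )

ratings = {
    "The search option was easy to find": [2, 3, 2, 3, 2, 3, 2],
    "Creating the tracker was straightforward": [3, 2, 3, 2, 3, 3, 2],
    "Taking and saving a photo was easy": [5, 4, 5, 5, 4, 5, 5],
}

print(flag_problem_statements(ratings))
```

In the actual study all statements would have passed such a threshold, which is precisely why the questionnaire detected so few problems.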

6. Conclusion

6.1. Summary

In this document, the analysis of four usability evaluation methods has been presented. From all available usability methods, two methods from usability inspection (cognitive walkthrough and heuristic evaluation), one from usability testing (laboratory observation) and one from user reports (questionnaire) were selected. An already available AR application, called "Augment", was selected as the target of the four evaluations. The goal of this work is to analyse the four methods and to determine their suitability for the proposed aim.

Due to the limited resources and time, the authors of this document acted as evaluators in the methods where expert evaluators were required, while a limited number of users (7 real users) carried out the user evaluations. Although the application is available for two operating systems, Android was selected as the target operating system due to the same restrictions.

6.2. Discussion

During the evaluation carried out in this work, several design problems have been found. The results obtained show that there is a need for the developers to revisit the application in order to enhance the design of the interface.

Even though several problems were found in the laboratory observations, these problems have not been reflected by the users in the questionnaires. This may be due to the fact that users are familiar neither with the design problems nor with AR applications.

Of the four methods, the questionnaire is the one that provided the lowest number of design problems. Although it is good practice to include users with no previous knowledge of AR applications, users with prior knowledge could probably have provided better results in the questionnaires.

However, the combination of methods has shown good results, as not all design problems were found by all methods. Moreover, using two evaluators in the heuristic evaluation has demonstrated that using more than one evaluator in the method provides better results.
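The observation that two evaluators find more problems than one is consistent with the well-known problem-discovery model of Nielsen and Landauer. As a sketch (the rate L = 0.31 is a commonly cited average across studies, not a value measured in this work):

```python
# Sketch: Nielsen & Landauer's problem-discovery model, often used to
# argue for multiple evaluators. found(n) = N * (1 - (1 - L)^n), where
# N is the total number of problems and L is the probability that a
# single evaluator finds a given problem (L = 0.31 is a commonly
# cited average, assumed here for illustration).

def problems_found(n_evaluators, total_problems, l=0.31):
    return total_problems * (1 - (1 - l) ** n_evaluators)

for n in (1, 2, 3, 5):
    share = problems_found(n, total_problems=1.0)
    print(f"{n} evaluator(s): {share:.0%} of problems expected")
```

Under these assumptions, a second evaluator raises the expected share of discovered problems from roughly a third to over a half, which matches the benefit observed here.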

According to the results obtained from the laboratory tests, novice users of this application seem to learn it quickly. In order to evaluate the learnability of the AR application, further analysis based on the common problems has been performed. The four possible user categories are summarized as:

1. Users who easily found the 3D model.

2. Users who could not easily find the 3D model.

3. Users who could easily create the tracker for the 3D model.

4. Users who could not easily create the tracker for the 3D model.

In order to analyse the percentage of users who faced problems in both selecting the 3D model and creating the tracker, users who faced either of the problems, and users who faced none of the problems, a graph showing the results is presented in Figure 16.
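The four categories above can be computed directly from the session observations. The sketch below uses hypothetical per-user outcomes (not the study's raw data, although the study likewise had 7 participants) to show the bookkeeping behind a plot like Figure 16:

```python
# Sketch (hypothetical observation log): classify each lab-session user
# by whether they easily found the 3D model and easily created the
# tracker, and report the share per category.

from collections import Counter

# (found_model_easily, created_tracker_easily) per user -- assumed
# values for illustration, not the study's raw data.
observations = [
    (True, True), (True, False), (False, True),
    (False, False), (True, True), (False, False), (True, False),
]

def categorize(found_model, created_tracker):
    if found_model and created_tracker:
        return "no problem"
    if not found_model and not created_tracker:
        return "both problems"
    return "model problem only" if not found_model else "tracker problem only"

counts = Counter(categorize(m, t) for m, t in observations)
for category, n in sorted(counts.items()):
    print(f"{category}: {n / len(observations):.0%}")
```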

With the aim of detecting the largest possible number of design problems, the methods have been combined with two different devices (Appendix A) and two languages (English and Spanish). This decision has proved to be a good approach, as several design problems were found only for a specific device and/or for a specific language.

6.3. Design guidelines

In this section, a series of design guidelines on how to improve the application and how to perform usability evaluations of this kind of AR application is presented.

Figure 16: Red color represents the percentage of users who faced both problems. Prussian blue color represents the percentage of users who did not face any problem. Light blue color represents the percentage of users who found selecting the 3D model a problem but could easily create the tracker. Yellow color represents the percentage of users who could easily select a 3D model but could not create the tracker.

6.3.1. Design guidelines for “Augment”

The application has shown several design problems. Some recommendations to enhance the application interface are the following:

• Follow the standards of Android platform.

• Provide more intuitive interfaces and a moreorganized “option dialog”.

• Improve the manipulation of 3D objects.

• Provide more information about AR con-cept and about what a good marker is.

• Provide help tips in the appropriate context.

• Provide better translations for the supported languages.

6.3.2. Design guidelines for AR evaluation

Evaluating AR interfaces is a relatively novel issue compared to other interfaces. Therefore, the combination of several methods, like the approach presented in this document, is suggested in order to obtain more accurate results. From the four methods studied in this work, the following conclusions can be summarized:


• Combining one usability inspection method with one usability testing method is recommended to obtain a reliable outcome.

• Using more than one expert in the inspection methods is suggested.

• If the questionnaire method is going to be used, the number of users filling it in should be large enough and include a variety of users, including AR experts.

6.4. Future work

The work presented in this document faces the problem of limited resources and time due to the restrictions of the context of the work. In order to obtain more robust results, further evaluations need to be done.

One of the first ideas for future work on the proposed study is to run a larger laboratory experiment including more users, in order to obtain more statistically consistent results in both the laboratory observations and the questionnaires. Including AR experts in the pool of users could provide a large advantage, as many users appeared to pay attention to the novelty of the AR concept rather than trying to carry out the task properly. The experiments should also include both devices used by the evaluators, and having several users carry out the tasks in Spanish would be desirable. Under ideal conditions, more user tests with other devices, including iOS devices, as well as with other languages, should be carried out.

Apart from more user tests, the cognitive walkthrough and heuristic evaluation methods should be carried out by more evaluators. It would be desirable to recruit 3 evaluators for each method in order to obtain more accurate results.

Finally, including further methods apart from the four studied here would provide a better overview of which usability evaluation method is more suitable for the evaluation of AR interfaces.

7. Acknowledgement

This project work has been part of the Design and User Evaluation of Augmented-Reality Interfaces course held at the University of Helsinki. The authors of this document thank Dr. Elisa Schaeffer for her guidance and periodic feedback throughout this project.

8. References

[1] P. Milgram and F. Kishino. A taxonomy of mixed reality visual displays. IEICE Trans. Information Systems, E77-D(12):1321–1329, Dec 1994.

[2] Ronald T. Azuma. A survey of augmented reality. Presence: Teleoperators and Virtual Environments, 6(4):355–385, Aug 1997.

[3] Ivan E. Sutherland. A head-mounted three dimensional display. In Proceedings of the December 9-11, 1968, Fall Joint Computer Conference, Part I, AFIPS '68 (Fall, part I), pages 757–764, New York, NY, USA, 1968. ACM.

[4] Nektarios N. Kostaras and Michalis N. Xenos. Assessing the usability of augmented reality systems.

[5] Andreas Dunser, Raphael Grasset, Hartmut Seichter, and Mark Billinghurst. In 2nd International Workshop on Mixed Reality User Interfaces: Specification, Authoring, Adaptation (MRUI '07), 2007.

[6] Augment. Augment official website, http://augmentedev.com/, 2011.

[7] M.N. Mahrin, P. Strooper, and D. Carrington. Selecting usability evaluation methods for software process descriptions. In Software Engineering Conference, 2009. APSEC '09. Asia-Pacific, pages 523–529, Dec 2009.


[8] Jakob Nielsen and Rolf Molich. Heuristic evaluation of user interfaces. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 249–256. ACM, 1990.

[9] Jakob Nielsen. Heuristic evaluation. Usability inspection methods, 24:413, 1994.

[10] Cathleen Wharton, John Rieman, Clayton Lewis, and Peter Polson. Usability Inspection Methods, chapter The Cognitive Walkthrough Method: A Practitioner's Guide, pages 105–140. John Wiley & Sons, Inc., New York, NY, USA, 1994.

[11] John Rieman, Marita Franzke, and David Redmiles. Usability evaluation with the cognitive walkthrough. In Conference Companion on Human Factors in Computing Systems, CHI '95, pages 387–388, New York, NY, USA, 1995. ACM.

[12] Heather Desurvire, Jim Kondziela, and Michael E. Atwood. What is gained and lost when using methods other than empirical testing. In Posters and Short Talks of the 1992 SIGCHI Conference on Human Factors in Computing Systems, CHI '92, pages 125–126, New York, NY, USA, 1992. ACM.

[13] Xiangyu Wang. Using cognitive walkthrough procedure to prototype and evaluate dynamic menu interfaces: A design improvement. In Computer Supported Cooperative Work in Design, 2008. CSCWD 2008. 12th International Conference on, pages 76–80, April 2008.

[14] S. Rosenbaum. Usability evaluations versus usability testing: when and why? Professional Communication, IEEE Transactions on, 32(4):210–216, Dec 1989.

[15] M. Alshamari and P. Mayhew. Task design: Its impact on usability testing. In Internet and Web Applications and Services, 2008. ICIW '08. Third International Conference on, pages 583–589, June 2008.

[16] F.H.A. Razak, H. Hafit, N. Sedi, N.A. Zubaidi, and H. Haron. Usability testing with children: Laboratory vs field studies. In User Science and Engineering (i-USEr), 2010 International Conference on, pages 104–109, Dec 2010.

[17] Jesper Kjeldskov and Connor Graham. A review of mobile HCI research methods. In Human-Computer Interaction with Mobile Devices and Services, pages 317–335. Springer Berlin Heidelberg, 2003.

[18] Morten Sieker Andreasen, Henrik Villemann Nielsen, Simon Ormholt Schrøder, and Jan Stage. What happened to remote usability testing?: An empirical study of three methods. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '07, pages 1405–1414, New York, NY, USA, 2007. ACM.

[19] Andreas Holzinger. Usability engineering methods for software developers. Commun. ACM, 48(1):71–74, January 2005.

[20] Cathleen Wharton, John Rieman, Clayton Lewis, and Peter Polson. The cognitive walkthrough method: A practitioner's guide. In Usability Inspection Methods. John Wiley and Sons, New York, 1994.

[21] Liu Fang. Usability evaluation on websites. In 9th International Conference on Computer-Aided Industrial Design and Conceptual Design, pages 141–144, November 2008.

[22] Henry Been-Lirn Duh, Gerald C. B. Tan, and Vivian Hsueh-hua Chen. Usability evaluation for mobile device: A comparison of laboratory and field tests. In Proceedings of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services, MobileHCI '06, pages 181–186, New York, NY, USA, 2006. ACM.

[23] Rolf Molich. Usable Web Design. Nyt Teknisk Forlag, 2008.


[24] Desi D. Sumadio and Dayang R.A. Rambli. Preliminary evaluation on user acceptance of the augmented reality use for education. In 2010 Second International Conference on Computer Engineering and Applications, pages 461–465, 2010.


Appendix A. Devices used for the evaluation

The evaluation of the application "Augment" has been carried out on two different devices running Android OS: a mobile phone and a tablet. The reason for this selection is to evaluate the consistency of the design and usability of the same application on two remarkably different devices using the same OS. While the mobile phone has a smaller screen and a less powerful camera and processor, the tablet provides a large screen with better hardware specifications. Table A.7 shows the specifications of both devices.

Table A.7: Specifications of the selected devices.

Mobile phone: Sony Xperia E
  OS: Android 4.1.1 (Jelly Bean)
  Screen: 320 x 480 pixels, 3.5 inches
  Video resolution: 640 x 480 px
  Processor: Qualcomm MSM7227A Snapdragon 1 GHz

Tablet: Samsung Galaxy Tab 10.1
  OS: Android 4.0.4 (Ice Cream Sandwich)
  Screen: 800 x 1280 pixels, 10.1 inches
  Video resolution: 720p
  Processor: Nvidia Tegra 2 T20 dual-core 1 GHz


Appendix B. Cognitive walkthrough

Appendix B.1. Preparation Phase

The next page shows the start-up sheet used in the cognitive walkthrough preparation phase.


Cognitive walkthrough start-up sheet

Interface: Augment - 3D
Task: Visualize your furniture in your room before purchasing it
Evaluator(s):
Date:

Task Description: Describe the task from the point of view of the first-time user. Include any special assumptions about the state of the system assumed when the user begins to work.
Select a furniture 3D model, place it in your room in a desired place and save it. The system will be in a state such that someone could immediately start testing.

Action Sequence: Make a numbered list of the atomic actions that the user should perform to accomplish this task.
a) Open the application
b) Choose the desired option
c) Choose the desired 3D model
   c1) Only for smartphone in vertical position
   c2) Select "create marker"
   c3) Read and close help window
   c4) Perform the scan
d) Place it in your environment in a desired way:
   i) Turn on flash (if required)
   ii) Adjust the scale of the 3D model:
      a) Make it big if required
      b) Make it small if required
   iii) Rotate it to your desired location
e) Take a photo of it and save it

Anticipated Users: Briefly describe the class of users who will use this system. Note what experience they are expected to have with systems similar to this one, or with earlier versions of this system.
People who have experience with smartphones but limited experience with Augmented Reality applications on smartphones. They should have basic knowledge of how to use an application on a smartphone and will have gone through the demo of the Augmented Reality application.

User's Initial Goals: List the goals the user is likely to form when starting the task. If there are likely goal structures, list them.
In the Augment application, the user might form 2 goals:
i) the user might select the option to browse the 3D galleries (a step towards our task) - success
ii) the user might select the option to scan 2D catalogues - failure


Appendix B.2. Evaluation Phase

The next page shows the evaluation sheet (divided into 2 parts) used for the cognitive walkthrough evaluation.


We consider that the user understands the goal of the app and has some basic knowledge of how AR works. The sheet evaluates the following steps:

Step a: Open the application
Step b: Choose the desired option
Step c: Choose the desired 3D model
Step c1 (vertical smartphone only): Swipe to see more buttons
Step c2: Select "Create marker"
Step c3: Read and close help window
Step c4: Perform the scan [see note 1]
Step d: Place it in the right place
Step d.i: Turn on flash
Step d.ii: Adjust the scale of the 3D model (applied in the next columns)
Step d.ii.a: Make it big if required
Step d.ii.b: Make it small if required (there was no need for this step)
Step d.iii: Rotate it to the desired location. It can be done either by rotating the paper (not related to the app) or by using the rotation function of the app.
Step e: Take a photo of it and save it

Marking convention: 1 = totally wrong, 2 = quite wrong, 3 = confusing, 4 = quite obvious, 5 = totally obvious. A dash (-) denotes a cell with no recorded mark.

Question 1: Will the user try to achieve the right effect?
Marks, in step order: 5, 5, 4, 2, 3, 5, 5, 5, -, no need, 3, -, 4, 4.
Notes:
- (b) It is clear that the user needs to select one option.
- (c) The user will probably scroll down and find the 3D model. The user may also use the search box for finding the model.
- (c1) The user has been asked to create a marker, so he/she may directly try to point the camera at the image that will act as marker.
- (c2) When the "create" option appears, the user will probably select it.
- (c3) It is clear that the user needs to press Accept to continue.
- (c4) The user will try to scan the image.
- (d) The user will take the marker to the desired place.
- (d.i) No flash is available.
- (d.ii) The user may not understand that he/she is able to scale the model by pinching the screen.
- (d.iii) The user may rotate the marker so that the object will be rotated.
- (e) The user may understand that the photo can be taken directly from this view.

Question 2: Will the user notice that the correct action is available?
Marks, in step order: 5, 4, 5, 1, 5, 5, 5, 5, -, 1, -, 1, 4.
Notes:
- (b) The user may understand that the option is one of the two available options.
- (c1) The option is "hidden" in the toolbar, as only 3 icons are visible in the vertical position of the phone.
- (d.ii) The option for resizing is only explained in the help, so unless the user is using the help feature, the option is not visible.
- (d.iii) The option to rotate by using the software is neither visible nor intuitive.
- (e) An icon is available with an image of a camera.

Question 3: Will the user associate the correct action with the desired effect?
Marks, in step order: 5, 3, 5, 1, 4, 5, 5, 5, -, 3, -, 3, 4.
Notes:
- (b) There are 2 options: a) Browse 3D models (the right option) may be chosen, as the task asks to "select a furniture 3D model"; b) the SCAN option may be chosen, as it says that this option is suited to scanning an image.
- (c) When the right model is found, the user will understand that he/she needs to click on it.
- (c1) It is very unlikely that the user will realize that he/she needs to scroll the toolbar to find the right option.
- (c2) It is likely that the user will associate the "create marker" option with the right step to follow.
- (c3) Colors and text are self-explanatory.
- (d) Movement of the marker will show a movement in the 3D model.
- (d.ii) If users are familiar with zooming images on smartphones, they may understand that they can scale with the same gesture.
- (d.iii) If the user finds the option for rotating, he/she may understand that it will rotate the model. However, the rotation is not the desired one.
- (e) If the user recognizes the icon, he/she will know that it is the button to press.

Question 4: If the correct action is performed, will the user see that progress is being made towards the solution of the task?
Marks, in step order: 5, 4, 4, 5, 5, 5, 5, 5, -, 4, -, 5, 5.
Notes:
- (b) If the user selects the right option, the selection of models appears, which may lead the user to think he/she is going in the right direction. However, some users may think that this is not the right way to proceed.
- (c) The user will see that the next screen is a video capture screen.
- (c1) If the user scrolls the toolbar, the "create marker" option appears in first place.
- (c2) A help message appears explaining the procedure to create the marker.
- (c3) A new window with a frame will appear.
- (c4) The next view is the actual augmented scene.
- (d.ii) If the user scales the model, he/she will probably see that it is being scaled, although small scale changes may not be noticed.
- (e) A message confirms that the image has been saved.

Note 1 (step c4): We do not consider here the problem of not getting the appropriate image, nor failures due to technological issues, as they do not concern design problems.
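The evaluation sheet lends itself to a small analysis script. The sketch below encodes the marks of the first five steps (taken from the ratings above; the question labels are abbreviations of the sheet's questions) and flags steps marked "confusing" or worse on any question:

```python
# Sketch: represent part of the cognitive walkthrough sheet as marks
# per question and step, then flag steps scoring 3 ("confusing") or
# lower on at least one question. Only the first five steps are
# encoded here; labels abbreviate the sheet's questions.

SCALE = {1: "totally wrong", 2: "quite wrong", 3: "confusing",
         4: "quite obvious", 5: "totally obvious"}

# question -> {step: mark}
marks = {
    "tries right effect": {"a": 5, "b": 5, "c": 4, "c1": 2, "c2": 3},
    "notices action":     {"a": 5, "b": 4, "c": 5, "c1": 1, "c2": 5},
    "associates action":  {"a": 5, "b": 3, "c": 5, "c1": 1, "c2": 4},
    "sees progress":      {"a": 5, "b": 4, "c": 4, "c1": 5, "c2": 5},
}

def problem_steps(marks, threshold=3):
    """Steps marked at or below the threshold on at least one question."""
    flagged = set()
    for per_step in marks.values():
        flagged.update(s for s, m in per_step.items() if m <= threshold)
    return sorted(flagged)

print(problem_steps(marks))
```

On these marks, steps b, c1 and c2 stand out, matching the walkthrough's finding that the hidden toolbar (c1) and option selection (b) are the weakest points.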


Appendix C. Heuristic evaluation

In this appendix, the results of the individual heuristic evaluations are shown.

Appendix C.1. Evaluator 1

This section presents the results obtained from evaluator 1, summarized as a table.

# Problem Heuristicnumber

Possible solution

1 Options are not providedwhen pressing the defaultbutton of Android devices

4 Provide options whenpressing the button.

2 When pressing back from the“selecting model” window,

the application exists insteadof going back to the previouswindow (the initial window).This happens in more places.

3, 4 Provide history navigationthrough the application.

3 A message appears whentrying to use the SCAN

feature saying that the app isnot going to work, but it still

works. The same messageappears in the AR feature.

The message and the optionsprovided create confusion.

Also, the app works inSpanish, but this message

appears in English.

1, 2, 9 Do not show the errormessage or clarify themessage. Also, provide

translation for the message.

4 When scanning a QR code, IfQR code is detected but notrecognized, it gets a loadingdialog and after that a new

interface with the view of thecamera, a button for taking aphoto and a button with anunknown behaviour (when

pushing it nothing happens).

1, 5 Provide an error messageinstead of a new interface.

34

Page 35: Analysis of Four Usability Evaluation Methods Applied to ...people.cs.vt.edu/~pbandyop/homepageFiles/Augmented... · Analysis of Four Usability Evaluation Methods Applied to Augmented

5 A message appears aftersome time when SCAN

feature has found no codes orimages to scan. It says it was

not found and offers to tryagain, but the only option is

to push accept and thattakes the user back to themain window, but not toreally “try again” option.

9 Implement the “try again”option.

6 Saving images from the appis working but the

application says that theimage has been saved to the

gallery before the imageappears in the gallery.

1 Either wait until the image issaved to display the messageor replace the message with atext saying “the image willappear soon in the gallery”.

7 When using the AR feature,the icon bar or tool bar

shows 3 icons if using thephone vertically and there is

no sign of more icons.However, there are more

icons that can be accessed bysliding the finger over the

icon bar. When usinghorizontally, 5 icons appear,but there is no evidence of

the availability of more icons(same as before).

1, 4, 6, 7 Show hints for the user toknow that there are more

available icons.

8 The text of the icons iscropped for some icons.

1, 2, 4 Reduce text or avoidcropping.

9 When trying to create a newmarker, the help message

appears every time,regardless the feature hasbeen used before and the

user is experienced with theapp.

7 Show help message only thefirst time.

35

Page 36: Analysis of Four Usability Evaluation Methods Applied to ...people.cs.vt.edu/~pbandyop/homepageFiles/Augmented... · Analysis of Four Usability Evaluation Methods Applied to Augmented

10 When trying to create a newmarker with an image thatthe system considers notappropriated, a message

saying that it is not a goodmarker appears, but there isno way to know why it is not

a good marker.

1, 6 Provide information on whyit is not a good marker.

11 Sometimes, the app switchesbetween “not a good

marker” and “too close”which creates confusion as

user may still think that theimage is valid as a marker.

1 Provide more preciseinformation about the

suitability of the marker.

12 “Too close” message isdisplayed, but sometimes

getting closer takes the userto a “good marker”.

Therefore, the “too close”message creates confusion.

2 Provide more preciseinformation about the

suitability of the marker.

Problem 13: When using the sharing function, it is not clear what is going to be shared. The user may think an image is being shared, but what is actually shared is a link to a webpage showing an image of the 3D model and a QR code. The page asks the user to scan the QR code to visualize the 3D model in AR, but scanning the code (with the Augment app) takes the user to the download page of the Augment app. Heuristic: 1. Recommendation: Provide more information about what is going to be shared.

Problem 14: Sharing by e-mail does not always work (it might be a problem of a third-party app). Heuristic: 5. Recommendation: Check against errors.

Problem 15: After trying to share twice by e-mail, the app crashed; after the crash, the app keeps crashing when trying to open it again (programming error). Recommendation: Handle crash events.
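A hedged sketch of this recommendation: wrap failure-prone actions (such as third-party e-mail sharing) so that a failure is logged and reported instead of leaving the app permanently crashed. The function names are hypothetical, not Augment's API:

```python
import logging
import sys

# A real app would persist this to a crash-log file for later diagnosis.
logging.basicConfig(level=logging.ERROR)

def crash_handler(exc_type, exc_value, exc_tb):
    # Last-resort hook: log the failure so the next launch can
    # detect the crash and return to a safe state.
    logging.error("Unhandled error", exc_info=(exc_type, exc_value, exc_tb))

sys.excepthook = crash_handler

def share_by_email(send):
    # Wrap a failure-prone action (e.g. handing off to a third-party
    # e-mail app) so one failed share cannot take down the whole app.
    try:
        send()
        return "shared"
    except Exception:
        logging.exception("Share by e-mail failed")
        return "sharing failed, please try again"
```

The key point is that the user sees a recoverable message rather than a repeated crash on every relaunch.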


Problem 16: The app works in Spanish, but the model categories are in English. Heuristic: 2. Recommendation: Translate the categories.

Problem 17: Some messages do not use natural language. For example: “Usted requiere estar firmado al sistema para disfrutar de esta característica”. The word “firmado” is never used in this context; the appropriate word would be “registrado”. Heuristic: 2. Recommendation: Improve the translation.

Problem 18: The help has been translated into Spanish, but the sentences lack meaning. For example, “Fresco” makes no sense here and probably comes from a bad translation of “Cool”; “Genial” would be a more appropriate word. Heuristic: 2. Recommendation: Improve the translation.

Problem 19: When trying to send a marker to an e-mail account, the app asks for the e-mail address, but the keyboard shows up very late. Heuristic: 1. Recommendation: Improve the behaviour.

Problem 20: When sending the marker by e-mail, the user feels he/she is sending his/her own marker, but what is actually sent is a link to a webpage explaining how the application works. Heuristic: 2. Recommendation: Explain clearly what is going to be shared.

Problem 21: The words “marcador” and “rastreador” are used for the same purpose; coherency is needed here. Heuristic: 2. Recommendation: Always use the same notation.

Problem 22: When using the help in Spanish, the external help is in English. Heuristics: 2, 10. Recommendation: Provide external help in Spanish.

Problem 23: There is no option to change the language (functionality problem). Recommendation: Provide a feature for changing the language.


Problem 24: When the app starts, it offers 2 options (AR and SCAN). Later, in the application options, the SCAN feature is clearly visible, but the AR option is not: the user needs to scroll down (even below the Help section) to reach Categories and enter one of them to access the same feature that was easily accessible when starting the app. Heuristic: 8. Recommendation: Reorganize the application options.

Problem 25: The list of categories is displayed in the same context as the other options. It may be more suitable to provide an AR option next to the scan option and to display the categories inside the AR option. Heuristic: 8. Recommendation: Reorganize the application options.

Problem 26: There are shortcuts, but they require the user to be registered. For some of them (own models) this may make sense, but others (history of 3D models, favourites) should be accessible without registration. Heuristic: 7. Recommendation: Allow some shortcuts for non-registered users.

Problem 27: There are 3 interaction options (translate, scale and rotate), but only rotate is available in the tool bar. Heuristic: 8. Recommendation: Provide the other 2 interaction options as buttons, or alternatively eliminate rotation from the tool bar.

Problem 28: Rotation is possible around 1 axis only, and translation is possible only along the plane defined by the marker (functionality problem). Recommendation: Provide interaction in more axes/planes.
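Rotation around all 3 axes can be composed from per-axis rotation matrices; the following minimal sketch (plain Python, with hypothetical helper names) illustrates the mathematics behind the recommendation:

```python
import math

def rot_x(a):
    # Rotation matrix about the x-axis by angle a (radians).
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    # Rotation matrix about the y-axis (the only axis Augment supports).
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    # Rotation matrix about the z-axis.
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def apply(m, v):
    # Multiply a 3x3 matrix by a 3-vector.
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
```

Exposing all three rotations (e.g. mapping two-finger gestures to x- and z-axis rotation) would let users orient a model such as the TV in problem 28 naturally.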


Problem 29: There is one button in the main AR view whose function is not clear (the button actually resets the model, so it is only possible to understand what it does after some modifications have been made to the model). Heuristic: 1. Recommendation: Provide a hint or a better icon so the user understands the behaviour.

Problem 30: The button can be pressed by mistake (as the user is not aware of its behaviour), and it eliminates the modifications to the model without asking the user and without any “undo” option. Heuristic: 5. Recommendation: Provide an undo option for the action.
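One standard way to provide such an undo option is a history stack of transform states; the sketch below is illustrative only (Augment exposes no such API, and all names here are hypothetical):

```python
class ModelState:
    """Minimal undo support for model transforms (illustrative sketch)."""

    def __init__(self):
        self.transform = {"scale": 1.0, "rotation": 0.0}
        self._history = []

    def modify(self, **changes):
        # Save the current transform before applying a change, so an
        # accidental press of the reset button can be undone.
        self._history.append(dict(self.transform))
        self.transform.update(changes)

    def reset(self):
        # Resetting goes through modify(), so it is undoable too.
        self.modify(scale=1.0, rotation=0.0)

    def undo(self):
        # Restore the most recently saved transform, if any.
        if self._history:
            self.transform = self._history.pop()
```

The same history stack would also cover problem 33 (removing models), since a destructive action becomes just another undoable entry.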

Problem 31: When adding a second model to the AR view, the translation, rotation and scaling changes made to the first model disappear and the model is reset to its original position. Heuristic: 5. Recommendation: Maintain the position and orientation of previous models.

Problem 32: Scaling models when more than 1 is visible is more complicated than when only 1 is present (programming problem). Recommendation: Improve the control.

Problem 33: Adding models to the scene is possible, but removing them is not. Heuristic: 5. Recommendation: Provide an undo and/or delete option.

Appendix C.2. Evaluator 2

This section presents the results of evaluator 2 as a written document.


Heuristic Evaluation of the Augmented Reality Application

1. Visibility of system status:

i) The application does not show what an ideal tracker should look like. The user creates a tracker without knowing the ideal way to create one.

ii) The application does not explain why the tablet cannot detect the QR code or the Augment symbol even when the right image is being scanned.

2. Match between system and the real world

In the “scan” part of the application, no error message is provided if the tablet camera is unable to detect the QR code or the Augment symbol. This confuses the user, who cannot tell whether the correct action is being performed or not.

3. User control and Freedom

Very poor: the user has hardly any control of the system. At the start of the application there are 2 options the user can choose:

i) BROWSE

ii) SCAN

If the user chooses “browse” and then wants to go back to “scan”, pressing the “back” button simply closes the application without showing the “2 options” interface again. Every time the user wants to navigate from one option to the other, the application has to be closed and relaunched.

4. Consistency and standards

Consistency is not maintained. Since the model is 3D, it should be possible to rotate it around all 3 axes, but the “rotate” button is not consistent with this: the 3D model can be rotated only around the y-axis, whereas users need rotation in all 3 directions to place the model in the right orientation.

5. Error Prevention

No error messages are shown, and no confirmation is given when the user performs a task. For example:

i) If the user mistakenly selects the wrong 3D model, no confirmation is shown asking whether the user really wants to select this model.

ii) If the user mistakenly chooses the wrong “tracker” (which can happen because the interface gives green-colored feedback even when the tracker is not in the correct position), there is no confirmation of whether the user is satisfied with the tracker.

iii) No confirmation is requested before the user clicks the “save” button; the user can click “save” by mistake even before reaching the desired position of the model.

iv) If the user presses the “email tracker” button even though no tracker has been created, the user is still asked to enter an e-mail address; no feedback is given that the tracker has not been created and hence nothing can be e-mailed.

v) The “share” button directly asks the user to choose the mode of sharing. Only after completing all the sharing steps (such as writing the e-mail address when sharing through Gmail) is the user told that the 3D model first needs to be uploaded to the website. If the error were shown earlier, the user would have been warned beforehand.


6. Recognition rather than recall

Instructions for using the system are not visible or easily retrievable when appropriate.

i) When the user opens the application, two options are shown directly, and the user is confused about what to do.

ii) When the user clicks the “browse” button, a catalogue of articles is displayed, but no guidance is provided on how to use the catalogue or how to select from it.

7. Flexibility and efficiency of use

No such feature is found in this prototype. 

8. Aesthetic and minimalist design

Information is relevant and to the point.

9. Help users recognize, diagnose, and recover from errors

i) In the “scan” part, the user does not understand why the camera is not recognising the QR code or the Augment symbol even when they are present; no feedback is given.

ii) In the “browse” part, the user is given no feedback on how to place the background image so that it can be used as a marker.

10. Help and documentation

No help or documentation is provided on how to use the application when the user launches it for the first time. Every user has to watch the video on Augment's website to learn how to use the system.

11. MISCELLANEOUS:

i) On the tablet, the model was often not placed in the correct position or at the correct size.


ii) The application provides limited functionality in the free version and full functionality in the paid version, but no feedback about this is shown while using the application; for example, a note that a particular feature is only available in the paid version, or a message such as “you are currently using a free version which has limited functionality”.


Appendix D. Laboratory observations

In this appendix, the results of the laboratory observations are shown.

Appendix D.1. Raw data of figure 11

1. The time required to complete the task by User 1: 1.24 mins.

2. The time required to complete the task by User 2: 1.42 mins.

3. The time required to complete the task by User 3: 1.05 mins.

4. The time required to complete the task by User 4: 0.54 mins.

5. The time required to complete the task by User 5: 1.00 mins.

6. The time required to complete the task by User 6: 1.33 mins.

7. The time required to complete the task by User 7: 1.30 mins.

Appendix D.2. Raw data of figure 13

1. User 1 easily found the 3D model described in the search task.

2. User 2 easily found the 3D model.

3. User 3 could not find the 3D model.

4. User 4 could not find the 3D model.

5. User 5 could not find the 3D model.

6. User 6 was at first confused on seeing a list of 3D models in the catalogue but later on could find the exact 3D model.

7. User 7 could easily find the 3D model.

Appendix D.3. Raw data of figure 14

1. User 1 could easily create the tracker.

2. User 2 could easily create the tracker.

3. User 3 was confused regarding how to create the tracker. He asked one of the experimenters for manual help to move forward towards the goal.

4. User 4 (who had seen the demo once, had seen the previous user performing the same task, and did a demo herself before performing the experiment) could easily create the tracker.

5. User 5 could not even remember that he/she had to create a tracker.

6. User 6 was totally confused on how to create the tracker. User asked for manual help.

7. User 7 could not even remember that he/she had to create a tracker.

Appendix D.4. Raw data of figure 16

1. User 1 used the rotate button to rotate the 3D model. Since the rotate button only provides the option of rotating the 3D model around the y-axis, the user tried unsuccessfully to rotate the 3D model around the x-axis. User 1 also used the tracker to rotate the 3D model.

2. User 2 used the rotate button to rotate the 3D model. Since the rotate button only provides the option of rotating the 3D model around the y-axis, the user tried unsuccessfully to rotate the 3D model around the x-axis and z-axis. User 2 was a bit confused while using the rotate button, because rotating the TV around the y-axis would put the TV in a top-view position with respect to the table, which is not an obvious way to place a TV. User 2 also used the tracker to rotate the 3D model.


3. User 3 used the rotate button to rotate the 3D model. User 3 did not use the tracker to rotate the 3D model.

4. User 4 used the rotate button to rotate the 3D model. User 4 did not use the tracker to rotate the 3D model.

5. User 5 used the rotate button to rotate the 3D model. Since the rotate button only provides the option of rotating the 3D model around the y-axis, the user tried unsuccessfully to rotate the 3D model around the x-axis and z-axis. User 5 did not use the tracker to rotate the 3D model.

6. User 6 used the rotate button to rotate the 3D model. Since the rotate button only provides the option of rotating the 3D model around the y-axis, the user tried unsuccessfully to rotate the 3D model around the x-axis and z-axis. User 6 could create a tracker but did not use it to rotate the 3D model.

7. User 7 used the rotate button to rotate the 3D model. Since the rotate button only provides the option of rotating the 3D model around the y-axis, the user tried unsuccessfully to rotate the 3D model around the x-axis and z-axis. User 7 could create a tracker and used it to rotate the 3D model.


Appendix E. Questionnaire

The next page shows the questionnaire that was given to the users.


User Evaluation Questionnaire

Questionnaire for Augment application

Place: University of Helsinki

We request your help for the design evaluation of Augment application. Please complete the

following questionnaire based on your experience when trying to complete the requested task.

Thank you for your time.

Device:  [ ] Provided tablet   [ ] Provided smartphone   [ ] Other: _____________________________

Language of the app:  [ ] English   [ ] Spanish   [ ] Other: _____________________________

Age (optional):

For what purpose do you use your smartphone?  [ ] Receive/send calls   [ ] Messaging   [ ] Calls + messaging   [ ] All above + using apps

Have you used any Augmented Reality application before?  [ ] Yes   [ ] No

Answer the questionnaire by rating the statements from 1 (totally disagree) to 5 (totally agree). After each question, you can add a comment to specify the reasons for your answer.

1. The system provided me with feedback on what I was working on. I was not confused or lost while performing the task

1 2 3 4 5


Comment:

2. The messages that appeared in the application were self-explanatory and I understood what they were trying to explain

1 2 3 4 5

Comment:

3. I was in control of the application all the time

1 2 3 4 5

Comment:

4. I could easily undo/redo any action if I wanted to

1 2 3 4 5

Comment:

5. If I mistakenly chose a wrong 3D model, I could easily stop uploading it

1 2 3 4 5

Comment:

6. The navigation through the application was easy

1 2 3 4 5


Comment:

7. The application had many errors and/or crashed

1 2 3 4 5

Comment:

8. The application always verified the 3D model before loading

1 2 3 4 5

Comment:

9. The option to select the desired function was always clear and available, and the icon images helped me to know the appropriate functionality of the available options

1 2 3 4 5

Comment:

10. It was easy to find the desired options at any time

1 2 3 4 5

Comment:

11. The application was well designed visually

1 2 3 4 5


Comment:

12. If error messages appeared, they were clear in their description, and probable steps to recover from the error were provided

1 2 3 4 5

Comment:

13. When I needed help, the demo videos and feedback helped me to complete my task successfully

1 2 3 4 5

Comment:

14. Comments / Testimonial:

Thank you very much for taking the time to complete this questionnaire. Your feedback is valued and very much appreciated!