
Evaluating Early Prototypes in Context: Trade-offs, Challenges, and Successes

Derek Reilly, David Dearman, Michael Welsman-Dinelle, and Kori Inkpen
Dalhousie University

Early-prototyping approaches that ultimately sacrifice some realism or allow some bias in order to approximate contextual evaluation can provide useful evaluations of design choices.

PERVASIVE computing, October–December 2005. Published by the IEEE CS and IEEE ComSoc. 1536-1268/05/$20.00 © 2005 IEEE

RAPID PROTOTYPING

Pervasive computing’s mixed success as a paradigm in everyday life makes evident the need for evaluation methods that provide insights into likely usage patterns.[1] Focused research questions tend to be explored in lab experiments,[2] while larger projects often involve implementation and field evaluation of a completely realized concept. It’s in the interest of designers and researchers alike to achieve the middle ground: that is, contextual evaluation of early or incomplete prototypes. However, getting participant buy-in can be difficult, results can be misattributed owing to prototype deficiencies, and a host of other methodological challenges exist. We’ve investigated two approaches that let us evaluate early prototypes by achieving useful approximations of evaluation in context. We used these approaches to evaluate prototypes designed to facilitate navigation in groups.

The case for approximation

Pervasive computing applications aren’t straightforward to create. The up-front costs of hardware and software infrastructure are usually very high owing to a lack of standards and know-how and to a need to use expensive, emerging, or experimental technology. Users often must learn new skills for interaction and alter their activities in ways that are perhaps more disruptive than those presented by “traditional” desktop applications. It’s unsurprising that so much research in the area has occurred in controlled laboratory settings or that development sometimes bypasses prototype evaluation altogether.

In addition, ongoing evaluation of prototypes is important. A cautionary example is the mobile guide developed for the Exploratorium, an interactive science museum in San Francisco.[3] Design of a complete working prototype was based on a needs analysis for museum visitors. The developers then evaluated the resulting prototype on site, only to find that the mobility and hands-on nature of the exhibits made the guide cumbersome to use most of the time. A complete redesign led to a passive solution that implicitly collected experiences for the museum visitor to retrieve later. Such late revelations are costly.

Ideally, developers would evaluate prototypes in context earlier and continually during development to avoid the need to scrap fully articulated designs. However, without a robust, functioning system, getting the buy-in required to permit true contextual evaluation is difficult.[4] This is particularly pronounced in pervasive applications, which are integrated with the environment and context of use.

Our two approaches address this seeming conundrum by recognizing the challenges of evaluating early prototypes in context, accepting certain limitations yet providing real evaluative power. The first, a variant of experience prototyping,[5] permits some bias in evaluation to allow use of prototypes in real situations. The second approach, which employs Wizard of Oz prototyping or functional approximation, sacrifices some realism to evaluate the prototype with impartial participants. By acknowledging and working with these approaches’ inherent limitations, we’ve been able to evaluate a range of early prototypes. The results must be viewed in relation to the limitations but can provide valuable insight into the suitability and effectiveness of designs for pervasive computing applications.

Group navigation

Technology supporting group navigation might increase the quality of communication by promoting awareness of one another’s actions and intentions or by providing shared visual aids. It might decrease the chances of navigation error by making information available when needed or by supporting group problem-solving approaches. Motivated by our own experiences traveling in groups, we identified three lightweight “technological interventions” and explored each in isolation, allowing distinct lines of inquiry to develop as appropriate. In combination, the research helped us better understand how technology might support group navigation.

Coordinated Views, the first technological intervention, considers how sharing common views and annotations on handheld computers might assist groups navigating together in close proximity. For example, when viewing a map that’s larger than the handheld screen, a group can automatically converge on the same portion of the map. Sharing annotations might let the group better communicate and record plans.

Marked-Up Maps looks at how to use RFID-tagged paper maps with handheld devices to support navigation and information retrieval. The impetus for this research was to consider the common situation of several people sharing one paper map. Adding handheld interactivity might permit capturing chunks of the map for collaborative search tasks, while maintaining a large, shared view (the paper map).

Rendezvous explores different technologies’ impact on the act of meeting at a negotiated time and place. Technologies studied include a location-aware handheld application and regular mobile phones. Scenarios include establishing a meeting place dynamically, responding to lateness, and making last-minute changes to meeting plans.

Partly because we’re most interested in understanding the problem domain and not explicitly trying to develop a complete system, the need for contextual evaluation has driven our prototyping efforts from the outset. However, our experiences also apply to those who are more interested in designing real applications.

Experience prototyping

This approach emphasizes “the experiential aspect of whatever representations are needed to successfully (re)live or convey an experience with a product, space or system.”[5] As an overall philosophy for prototyping, you can apply it to understanding existing practices, communicating design ideas, and evaluating the prototype. One concrete approach to experience prototyping is for designers themselves to become immersed in the target population’s current practices to better understand their needs and perspective.

In our research, we’ve applied an experience-prototyping technique in which researchers are participants in actual scenarios; that is, they’re not just pretending to be members of the target population. Researchers or designers mediate their own authentic experiences using prototypes in context, yielding powerful, visceral impressions that can provide insight when subject to careful reflection. The participating researchers’ desire for the activity to succeed can mitigate their bias toward the technology (we discuss this bias in more detail later). Designers can also more effectively manage the early prototypes’ rough edges; there’s less possibility of misinterpreting a problem with a prototype as a problem with the design. Finally, directly confronting the design assumptions’ impact is invaluable to researchers and designers when they’re refining their prototype.

To investigate experience prototyping, we used Coordinated Views and Marked-Up Maps.

Coordinated Views

We explored the use of our Coordinated Views software during City Chase (www.thecitychase.com), an organized city scavenger hunt. Three pairs of researchers participated in the race. During the race, each pair had access to our Coordinated Views prototype.

The prototype provided shared views and annotations using Bluetooth peer-to-peer communication on iPaq handheld computers (see figure 1). We identified suitable documents for use in our scavenger hunt: a bus schedule and a transit map with landmarks. We developed the prototype implementation around the use of these two documents and implemented logging to capture interactions. Users could continually synchronize with their partner’s view or maintain an independent view. A button toggled the function of the stylus between panning and annotating the document. Annotations appeared on both PDAs.
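The interaction model described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the prototype's actual code: the class and method names (SharedDocument, Handheld, pair, follow_partner) are hypothetical, and the Bluetooth link is replaced by two objects sharing state in memory.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]


@dataclass
class SharedDocument:
    """Annotations visible on every paired device (hypothetical structure)."""
    annotations: List[Point] = field(default_factory=list)


@dataclass
class Handheld:
    """State of one device; every name here is an illustrative assumption."""
    name: str
    doc: SharedDocument
    view_origin: Point = (0, 0)      # top-left corner of the visible portion
    follow_partner: bool = True      # synchronized view vs. independent view
    stylus_mode: str = "pan"         # the button toggles "pan" / "annotate"
    partner: Optional["Handheld"] = field(default=None, repr=False, compare=False)
    log: List[str] = field(default_factory=list)  # interaction log

    def toggle_stylus(self) -> None:
        """Switch the stylus between panning and annotating."""
        self.stylus_mode = "annotate" if self.stylus_mode == "pan" else "pan"
        self.log.append(f"stylus mode -> {self.stylus_mode}")

    def stylus(self, a: int, b: int) -> None:
        """In pan mode (a, b) is a drag offset; in annotate mode it is a point."""
        if self.stylus_mode == "pan":
            x, y = self.view_origin
            self.view_origin = (x + a, y + b)
            self.log.append(f"pan to {self.view_origin}")
            # A partner that is synchronizing converges on the same portion.
            if self.partner is not None and self.partner.follow_partner:
                self.partner.view_origin = self.view_origin
        else:
            # The document is shared, so the mark shows up on both devices.
            self.doc.annotations.append((a, b))
            self.log.append(f"annotate at {(a, b)}")


def pair(a: Handheld, b: Handheld) -> None:
    """Stand-in for establishing the peer-to-peer connection."""
    a.partner, b.partner = b, a
```

Keeping the synchronization choice on the receiving device mirrors the description that each user decided whether to follow the partner's view or stay independent, while annotations were always shared.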

We decided to permit the use of other resources beyond the Coordinated Views prototype. This would let us continue the race if something went wrong with the prototype or if we had difficulty managing the prototype during more physically demanding points in the race. These resources included direct alternatives to the software, including laminated paper maps and bus schedules, and constant access by cell phone to a “control center” of people ready to research an information request.

The participants had high expectations of the software’s utility. This remained true even after a pilot trial in which no one used it. They largely dismissed this as being due to the pilot’s condensed course, which was in a very familiar part of town and involved familiar landmarks. However, the software remained largely unused on the race day as well.

We conducted a debriefing after the race. The participants suggested several reasons why they didn’t use the software. In addition to the availability of alternate, familiar resources, some participants felt that they were too familiar with the area to require more involved research, which the Coordinated Views software might have facilitated. Certainly the event’s time pressure was also a factor in their choices. The only time they used PDAs was when they were tied up in transit. The one pair that actually used shared annotations did so on the ferry, and another pair tried to use the PDA with GPS for positioning only while riding the bus. Finally, weather played a role: it rained heavily during the race’s first half, which motivated the participants to use the laminated maps.

Marked-Up Maps

An experience-prototyping opportunity for Marked-Up Maps presented itself in an upcoming conference that several in our group were to attend. Before the conference, we built a prototype that linked a paper tourist map of Nottingham, UK, with basic tourist-related information available on a PDA.

We affixed RFID labels to the back of the map to indicate map locations. We attached an RFID reader to the back of a PDA such that users could query locations by holding the PDA display-side up in front of a map region (see figure 2a). In this first prototype, users retrieved general information about a location by clicking a button on the PDA when it was over the location. Because we couldn’t easily connect the reader directly to the PDA, a server on a laptop connected to the reader processed the RFID reads. Pressing a physical button on the PDA (mapped to a request for the server’s base URL) caused the server to place HTML-formatted information mapped to the most recently read tag ID at the URL, which the system then displayed on the PDA screen. In this way, the prototype permitted point-and-click interaction with individual locations on the paper map.
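The server logic just described (remember the most recent tag read, and answer a request for the base URL with that tag's page) could look roughly like the following. It is a minimal sketch under assumptions: the class name, the tag IDs, and the page contents are hypothetical, and the HTTP and RFID-reader layers are left out so that only the lookup behavior is shown.

```python
from typing import Dict, Optional


class TagInfoServer:
    """Keeps the last tag read and serves the page mapped to it (illustrative)."""

    def __init__(self, pages: Dict[str, str]) -> None:
        self.pages = pages                    # tag ID -> HTML-formatted information
        self.last_tag: Optional[str] = None   # most recently read tag ID

    def on_tag_read(self, tag_id: str) -> None:
        """Called whenever the reader reports a tag."""
        self.last_tag = tag_id

    def handle_base_url_request(self) -> str:
        """Called when the PDA button triggers a request for the base URL."""
        if self.last_tag is None:
            return "<p>No location has been read yet.</p>"
        return self.pages.get(self.last_tag, "<p>No information for this location.</p>")


if __name__ == "__main__":
    # Hypothetical tag IDs and content, for illustration only.
    server = TagInfoServer({"tag-017": "<h1>Castle</h1><p>Opening hours ...</p>"})
    server.on_tag_read("tag-017")             # PDA held over a map location
    print(server.handle_base_url_request())   # button press on the PDA
```

Because the PDA only ever requests one URL, all of the state lives on the server; the cost is that a button press always answers for the last read, even if the PDA has since moved away from that location.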

We used the prototype while touring downtown Nottingham (see figure 2b). We obtained tourist information from various Web sites about Nottingham and its attractions. Unfortunately, we didn’t resolve the prototype’s glitches until after the conference, and only one of us remained in the city. That person used the prototype intermittently throughout one day of sightseeing. He took notes as he used the prototype and made summary notes at the end of the day.

This combination of a PDA and paper map, and the wired connection to the laptop in a backpack, made access cumbersome while moving, especially when only one person was using the system. So, the user accessed information from the PDA using the map while sitting down. This was acceptable because he was using the information to plan an itinerary and explore options. Feedback that a read had occurred would have been welcome, however. (Before we used the prototype outdoors we had relied on an audio cue from the server to indicate a read, but this audio signal was too soft to be heard outdoors.)

Figure 1. Participation in a city scavenger hunt: (a) Coordinated Views software on paired iPaq handhelds, (b) a team planning its next move, without the handhelds, and (c) the pair on the move, with an observer in tow.

After a short period of using the prototype, the user determined that framing queries according to themes (for example, admissions and hours of operation, or bus schedules) would have been desirable. However, this wasn’t possible with the prototype. Also, the interface was good to browse with, but because the user had manually associated the guide resources (the HTML pages) to the map landmark icons and knew what was “hidden” behind the links, he couldn’t really “surf” the map resources. Eventually the PDA became a secondary tool for occasionally retrieving details, and the user kept the map handy while touring.

Lessons and recommendations

In both projects, taking the techniques into an actual usage scenario provided valuable insight. We evaluated our techniques in unforgiving, nonfabricated contexts, giving ourselves as researchers an intimate appreciation for the technique’s potential in the context.

Introspection was important during and after each experience. The feeling that we didn’t want to use the handhelds during the City Chase was palpable throughout the event. After the event, as a group we could discuss and reflect on our experiences. Our awareness of needs and possible patterns of use for Marked-Up Maps emerged largely on account of the touring scenario. This first-hand experience facilitated the development of subsequent map implementations and a second functional prototype.

The evaluations required fewer overall resources (cameras, observers) than they would have had we used nonresearcher participants. We also didn’t need to feel that our technology was imposing on the participants’ experiences. For the solitary experimenter/tourist in Nottingham, reflection on the application’s utility could occur at a natural pace, and tie-ups due to prototype limitations (waking the devices, handling RFID misreads, and missing details in the hypertext) could be accepted without prejudicing the experience as a whole. The City Chase’s added time pressure didn’t allow for this luxury. We used the Coordinated Views software only when we had enough time and little else to do.

Researchers who are working alone can reflect on their own use in context but might miss some of their own behavior or the impact of some aspect of the technology. When concentrating on use in a real environment, such researchers might forget that they’re supposed to also be observing their experience. Even if they’re vigilant, they might not be able to detect subtle or characteristic behavior patterns, because these are often more obvious through focused arm’s-length observation. This might be addressed somewhat if a group conducts the evaluation (as with the City Chase), perhaps including a researcher whose primary purpose is observation, not participation. However, as we experienced in the City Chase, groups can also easily get caught up in the moment, depending on the nature of the activity.

Figure 2. Testing the Marked-Up Maps prototype: (a) Users can obtain details about a tourist attraction by placing a handheld computer over a paper map of Nottingham with RFID tags embedded. (b) A researcher uses the prototype in Nottingham’s Broadmarsh bus station.

Both these studies involved activities in which we could reasonably participate. They required no specialized knowledge or experience (although the City Chase required a certain level of fitness). Reflective self-evaluation in a real scenario can take place only if the researchers are suitable participants. This quickly becomes problematic in applied domains (for example, archeology and police work). It might be possible in many cases to find a real scenario that shares many aspects of a specific target activity and in which researchers could reasonably take part. Our City Chase scenario fit this description well. This scenario isn’t a likely target for technology development but shares enough in common with other group navigation scenarios to be a reasonable test bed or activity in which to explore the Coordinated Views prototype. Additionally, authentic activities for experience prototyping often come with a definite time frame; prototyping needs to manage time carefully to avoid a missed opportunity.

One potential problem would be bias in the researcher’s vision of a technique’s potential. When researchers are also participants, they need to be particularly careful not to let the interaction experience descend into artifice. It’s tempting to get carried away with a new technique’s potential or perceived benefits. Expectations of interaction style and environmental impact can cloud your impressions of the actual experience to the point where you might adjust your behavior to accommodate your expectations, no matter how problematic or limiting the device is.

The real-life scenario’s impact can be stronger than any researcher bias, however, as was the case during the City Chase. In contrast, the Nottingham touring experience was to some extent driven by the information resource we built to augment the map. A landmark often was of interest because it was linked from the map, not necessarily because it was of interest to a tourist. In addition to researcher bias and expectations for a prototype, researchers are motivated to capture useful information and so might focus unduly on technology use to the detriment of the experience as a whole. This might have limits, however. Even in the Nottingham study, the user used the marked-up map with the PDA only when he was seated in an area that facilitated this use.

We need to be careful when drawing conclusions from personal experiences. As we discussed, several powerful sources of bias and challenges to authenticity are at play when researchers become participants in mobile environments. In the case of the City Chase, it was tempting after the event to dismiss outright both the technique and the use of handhelds in such contexts. However, in both cases we gained valuable experiences that aided us during subsequent prototyping.

Wizard of Oz and other reasonable facsimiles

In Wizard of Oz prototyping, a prototype mocks up some or all of the functionality such that it appears to the user to be a functioning system. Functional approximations are often more complete than typical prototypes but sacrifice detail or accuracy in their operation.

To evaluate both types of prototypes, we needed to perform controlled experimental simulation. Both the effort necessary to achieve consistent interaction and the required presence of the “wizards” (behind-the-scenes humans who run the simulation) made longer-term or ad hoc use prohibitive for our Wizard of Oz prototype. Our functional-approximation prototype was also incomplete in ways that prohibit its use in real scenarios. For example, the prototype didn’t capture gestural interaction with the fidelity that a functioning system would require. The system’s response to user interaction was therefore correspondingly coarse grained. While in other domains prototype fidelity might not impact usability results,[6] perceived responsiveness, granularity, and accuracy can significantly affect the user experience of pervasive applications.

Rendezvous

This study aimed to explore how location awareness affects a rendezvous between two individuals. It required a realistic environment to set the stage for the scenarios and immerse the participants in a setting they could relate to and interact with. We had identified a suitable setting in a busy shopping district. However, no suitable location-sensing technology was available in the area to provide positioning data with the fidelity and robustness that the study required. To address this, we implemented a Wizard of Oz prototype to provide and communicate location information.

We intended the prototype to provide the participants with a perceived connection between their handheld computer and a location-awareness service, providing constant updates of their partner’s location and some communication facilities. To accomplish this, a wizard followed each rendezvousing participant, pushing information updates onto that participant’s handheld (see figure 3). Each wizard was responsible for updating the participant’s location (using a Bluetooth connection to the participant’s handheld) and communicating that location to the other participant’s wizard (using two-way radios). During certain scenarios, participants could update the chosen meeting place and request an acknowledgment from their partner. The wizards also facilitated this transaction.
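The relay between participants and wizards can be modeled as a small message flow. The sketch below is an assumption-laden illustration rather than the study's software: the class names (HandheldDisplay, Wizard) and methods are hypothetical, the two-way radio is a direct method call, and Bluetooth range is reduced to a boolean flag.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

Location = Tuple[float, float]


@dataclass
class HandheldDisplay:
    """What a participant sees; the wizard writes to it over 'Bluetooth'."""
    own_location: Optional[Location] = None
    partner_location: Optional[Location] = None
    in_bluetooth_range: bool = True   # simplification of the range limitation

    def push(self, **updates) -> bool:
        """Apply an update if the wizard is close enough; report delivery."""
        if not self.in_bluetooth_range:
            return False
        for key, value in updates.items():
            setattr(self, key, value)
        return True


@dataclass
class Wizard:
    """Follows one participant and simulates the location-awareness service."""
    handheld: HandheldDisplay
    other_wizard: Optional["Wizard"] = field(default=None, repr=False, compare=False)

    def observe_participant(self, location: Location) -> Tuple[bool, bool]:
        """Update the participant's own position, then radio it to the other wizard."""
        delivered_own = self.handheld.push(own_location=location)
        delivered_partner = False
        if self.other_wizard is not None:
            delivered_partner = self.other_wizard.radio_in(location)
        return delivered_own, delivered_partner

    def radio_in(self, partner_location: Location) -> bool:
        """Receive the partner's position by radio and push it to the handheld."""
        return self.handheld.push(partner_location=partner_location)


def link(a: Wizard, b: Wizard) -> None:
    """Stand-in for the two-way radio channel between the wizards."""
    a.other_wizard, b.other_wizard = b, a
```

Returning the delivery status makes the weak point of the setup explicit: an update only reaches a participant while that participant's wizard is within range of the handheld.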

The technique worked well when all proceeded as planned; however, at times the prototype influenced the study. The Bluetooth connection between participant and wizard was subject to range limitations. While we envisioned that the wizard and participant would be in close proximity at all times (see figure 3), this wasn’t always the case. For example, several participants would run to make street-crossing signals, resulting in disconnections. In two instances, disconnections became so frequent that we had to throw out the study data. The participants became used to the devices disconnecting. Whenever their partner wasn’t shown to be moving, they would ask the wizard if they had lost the connection. In addition, reconnection required a pause in the scenario to let the wizard confirm the connection.

Given the need for proximity, participants sometimes overheard communication between wizards. The shopping district’s noise level was often very high, requiring wizards to raise their voices to be heard over vehicle traffic and pedestrian conversations. We observed that this distracted several participants.

We also had to ensure that the participants could take their time to complete the scenarios so that the wizards could keep up while performing their tasks. We therefore gave generous time frames to complete each scenario, where a more restrictive time frame would probably have been more natural.

A final issue related to the participants’ perception of the position data’s fidelity. When things ran smoothly, the wizards updated positional data with a reliability and precision that intermittent GPS or WiFi positioning couldn’t match. So, the user might have perceived the information’s utility to be greater than what a real implementation could reasonably deliver. The ways in which a rendezvous application might use position data are influenced by the nature of the technology used.[7] So, our study provided valuable data given a what-if scenario of consistent, accurate updates, which must be qualified against actual implementations’ capabilities.

Figure 3. Implementing the Wizard of Oz rendezvous application: (a) The basic approach to location awareness. (b) An observer and a wizard closely followed each participant. [Diagram labels: User A, User B, Wizard A, Wizard B; actual Bluetooth connection between each user and wizard; two-way radio with headphones and microphone between the wizards, carrying communication of location and user interactions; perceived connection from each user to GPS and to the remote application.]

Marked-Up Maps

The first Marked-Up Maps prototype, which we described earlier, permitted point-and-click interaction with individual points on a map. This was suitable for information access tied to clearly visible landmarks on a map or for providing overview information pertaining to a map grid square. However, we wanted to explore more complex interactions, including filtering information by selecting categories on the map, tracing between two points on a map, and circling map regions. These interactions are suitable for map interaction in general and for expressing queries with a shared map in a group navigation setting. We restricted our evaluation to use by a single user, however, feeling that we could examine group interaction after evaluating the basic techniques.

In this study, the prototype gave visual and audio feedback indicating the selection operations performed but didn’t retrieve information. Our software logged whether a particular interaction matched an expected approach for a given task. We gave the user an indication of the items selected, by highlighting map icons, drawing paths between map icons selected in sequence, or highlighting selected map grid squares (see figure 4). We call the prototype a functional approximation not simply because it was incomplete but, more importantly, because the technology limited the granularity with which it captured user interactions. This is similar to a location-based application prototype that uses cell-tower positioning, where the final implementation will have greater positional accuracy.

The granularity of the RFID grid we could create was very coarse, allowing one tag per 30 mm square without interference. This obviously prohibited interactions with finer regions, such as tracing a walking path as it winds along a map. We considered several alternatives, including using larger or coarser maps, magnetic tracking, and vision-based methods. Ultimately we decided that RFID, despite its obvious accuracy limitations, offered the most direct path for an early prototype evaluation.
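To make the effect of a 30 mm tag pitch concrete, the sketch below maps positions on the paper map to grid squares. Only the 30 mm figure comes from the text; the function names, the coordinate convention (millimeters from the map's top-left corner), and the A3-sized map in the example are assumptions for illustration.

```python
import math
from typing import List, Tuple

TAG_PITCH_MM = 30  # from the text: one tag per 30 mm square

Cell = Tuple[int, int]


def tag_count(width_mm: float, height_mm: float, pitch_mm: float = TAG_PITCH_MM) -> int:
    """Number of tags needed to cover a map with one tag per grid square."""
    return math.ceil(width_mm / pitch_mm) * math.ceil(height_mm / pitch_mm)


def cell_for_point(x_mm: float, y_mm: float, pitch_mm: float = TAG_PITCH_MM) -> Cell:
    """Grid square (column, row) that contains a point on the map."""
    return (int(x_mm // pitch_mm), int(y_mm // pitch_mm))


def cells_for_path(points: List[Tuple[float, float]], pitch_mm: float = TAG_PITCH_MM) -> List[Cell]:
    """Collapse a finely sampled path into the ordered squares it passes through."""
    cells: List[Cell] = []
    for x, y in points:
        cell = cell_for_point(x, y, pitch_mm)
        if not cells or cells[-1] != cell:   # skip repeats of the current square
            cells.append(cell)
    return cells


if __name__ == "__main__":
    # Hypothetical A3-sized map (420 mm x 297 mm): 14 columns x 10 rows.
    print(tag_count(420, 297))
    # A winding 30 mm stretch of path registers as only two squares.
    print(cells_for_path([(5, 5), (12, 8), (20, 14), (31, 16)]))
```

Any detail of a gesture smaller than one square disappears in this mapping, which is why path tracing could only be approximated.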

Although the granularity was poor, we were able to mock up interactions by detecting certain coarse interactions and interpreting them in software as finer interactions appropriate to a given task. For example, the prototype’s software interpreted a sequence of RFID tag reads that corresponded (roughly) to a map’s center region as a request for information about a “pedestrianized” area, which was generally contained in that region. The prototype interpreted reading a sequence of tags that contained or skirted a given street’s boundary as a swipe along the length of that street if this constituted a suitable interaction for a given task.
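One way to picture this task-dependent interpretation is as a lookup from a sequence of grid-square reads to the finer interaction that best explains it. The following is a hedged sketch, not the prototype's algorithm: the function names, the overlap threshold, the task labels, and the example regions and streets are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]


def _best_match(reads: List[Cell], candidates: Dict[str, Set[Cell]],
                min_overlap: float) -> Optional[str]:
    """Name of the candidate whose squares cover the largest share of the reads."""
    best_name, best_score = None, 0.0
    for name, cells in candidates.items():
        score = sum(1 for r in reads if r in cells) / len(reads)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_overlap else None


def interpret_reads(reads: List[Cell], task: str,
                    regions: Dict[str, Set[Cell]],
                    streets: Dict[str, Set[Cell]],
                    min_overlap: float = 0.6) -> Optional[str]:
    """Interpret coarse tag reads as a finer interaction suited to the task.

    `task` is "region" or "street" (assumed labels); `min_overlap` is an
    assumed tolerance so that reads only need to match roughly.
    """
    if not reads:
        return None
    if task == "region":
        name = _best_match(reads, regions, min_overlap)
        return f"region:{name}" if name else None
    if task == "street":
        if len(set(reads)) < 2:       # a swipe has to cross more than one square
            return None
        name = _best_match(reads, streets, min_overlap)
        return f"swipe:{name}" if name else None
    return None


if __name__ == "__main__":
    # Hypothetical layout: a central pedestrianized area and one street,
    # including a square that merely skirts the street.
    regions = {"pedestrianized": {(2, 2), (2, 3), (3, 2), (3, 3)}}
    streets = {"main": {(0, 1), (1, 1), (2, 1), (3, 1), (1, 0)}}
    print(interpret_reads([(2, 2), (3, 2), (3, 3), (4, 3)], "region", regions, streets))
    print(interpret_reads([(0, 1), (1, 1), (1, 0), (2, 1)], "street", regions, streets))
```

Passing the task in explicitly reflects the constraint in the text that a sequence counted as a swipe only when that was a suitable interaction for the task at hand.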

The participants underwent brief training to become familiar with the interaction techniques that the prototype supported. They then performed several tasks that didn’t specify which interaction technique to use. In general, they didn’t remark that selection operations were difficult or unnatural, and as a group they predominantly used the same interaction techniques to perform specific tasks.

This ran counter to the results of an earlier “make believe” evaluation, where we presented the same maps, gave participants handhelds that were turned off, and simply asked them to demonstrate how they might interact with the paper maps to retrieve answers to questions provided to them. During that evaluation, participants envisioned point-and-click as the predominant (and sometimes only) interaction style. The semifunctional prototype created a level of realism and coherence that the make-believe prototype couldn’t. So, we could assess receptivity to interaction styles that didn’t appear obvious to the participants in the make-believe evaluation.

The functional-approximation approach restricted how we designed our study and analyzed our findings, however. On the map of Halifax we highlighted a grid square corresponding to each RFID tag. The squares were regularly spaced, so they didn’t represent specific landmarks (see figure 4). When we asked the participants to trace a path along a set of streets, the visual feedback indicated that they had selected the grid squares that contained the streets in question. If the study’s purpose had been to determine optimal visual feedback on the handheld device during selection operations, this prototype implementation would have been suboptimal. Regardless, the feedback is clearly insufficient for tracing routes. When we were devising this study, we needed to decide whether to include tasks such as route tracing for which the prototype might not provide the best solution. The prototype’s limitations regarding a subset of the activities might cloud the participants’ overall impression. However, we ultimately decided to include such tasks in our evaluation because we were interested in collecting participant evaluations about the interaction technique as an approach in general. We adjusted the order in which participants conducted activities to mitigate any negative impressions given by the path feedback.

Figure 4. The second Marked-Up Maps prototype: (a) A paper map used with the prototype. (b) One way the prototype provided feedback was to highlight selected map grid squares in red.

The prototype didn’t go beyond providing visual and audio feedback because we intended to explore receptivity to interaction techniques, not to evaluate a complete system. If a participant assumed that the desired information would be easily retrieved from a selection, it seems reasonable to expect that he or she would be positive toward the related interaction technique. Needless to say, it’s difficult to evaluate something that doesn’t actually fulfill the function it’s meant to perform. Despite encouraging results in relation to interaction techniques for selection operations, we needed to be careful to not conclude that the system is on the right track in terms of the entire workflow.

Lessons and recommendations

As has been discussed here and elsewhere,4 Wizard of Oz and other techniques requiring monitoring or following participants can be difficult to manage in mobile environments and can limit realism. For example, the semifunctional and incomplete Marked-Up Maps prototype was better suited to focused evaluation in a stationary location. However, the Wizard of Oz implementation in the Rendezvous study still permitted mobile evaluation.

Determining appropriate levels of fidelity and granularity in a prototype is critical before developing and evaluating it. Indeed, rapid prototyping tools for pervasive computing often facilitate this by allowing the introduction of error.8,9 While it’s tempting to use the technology you plan to deploy in your prototypes, the technology can often require further work before being ready (for example, combining RFID with other techniques might improve granularity, and better algorithms might yield better positional accuracy in a location-aware system). On the other hand, as we mentioned before, using a Wizard of Oz approach or an alternate technology might introduce expectations that aren’t attainable with the deployed technology. If the capabilities suggested by the prototype are beyond even the far-term capabilities of a real system, the results will have limited practical utility.
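One way to picture the deliberate introduction of error is to degrade a simulated or wizard-supplied position before the prototype uses it, so that participants see accuracy closer to what the deployed technology could deliver. The sketch below is an assumption-laden illustration: the function name, the uniform-disc error model, and the units are hypothetical and are not taken from the tools cited above.

```python
import math
import random


def noisy_position(x_m, y_m, error_radius_m, rng):
    """Displace a true position by a random offset within error_radius_m.

    The offset is drawn uniformly over a disc; a real positioning system's
    error distribution would likely differ (assumption for illustration).
    """
    angle = rng.uniform(0.0, 2.0 * math.pi)
    distance = error_radius_m * math.sqrt(rng.random())  # uniform over area
    return (x_m + distance * math.cos(angle), y_m + distance * math.sin(angle))


# Hypothetical use: report a position with up to 15 m of error.
rng = random.Random(2005)  # seeded so a study session can be replayed
reported = noisy_position(120.0, 80.0, 15.0, rng)
print(reported)
```

Keeping the error radius as an explicit parameter lets the evaluator choose a level of fidelity before the study rather than discovering it during the study.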

Pretending that a prototype is more accurate than it really is poses its own difficulties. As sometimes occurred with Marked-Up Maps, participants can pick up on cues indicating that the system is less accurate than you or the interface suggest. A possible consequence is that the participant simply presumes the system will work “as advertised,” which complicates evaluation of user interactions and impressions. Similar difficulties can occur when a Wizard of Oz prototype’s mechanics become apparent.

Providing some indication of a system’s functionality, even given these limitations, can elicit interactions and impressions that are valuable to designers and researchers. When individuals envision functionality without an interactive prototype, they can be unduly influenced by prior experience and expectations.

We would like to say that experience prototyping and Wizard of Oz prototyping, taken together, constitute a suitable alternative to true contextual evaluation for early prototypes, one gaining realism at the expense of impartiality, the other impartiality at the expense of realism. As our experiences show, however, each pervasive application design poses unique challenges that you must consider when applying these approaches. Regardless, it’s often a good trade-off to sacrifice some measure of realism to evaluate early prototypes. Evaluation of early pervasive computing prototypes in context is a pragmatic exercise, but one that is nonetheless informed by general approaches that reflect and adapt to the challenges of pervasive application development.

the AUTHORS

Derek Reilly is a PhD student in computer science at Dalhousie University and a member of the EDGE (Exploring Dynamic Groupware Environments) Lab. His research considers ways to integrate geographic maps with other spatial information in context. He received his BSc in computer science from McGill University. He’s a member of the ACM. Contact him at the Faculty of Computer Science, Dalhousie Univ., Halifax, NS B3H 1W5, Canada; [email protected].

David Dearman is a graduate student in the Faculty of Computer Science at Dalhousie University and a member of the EDGE (Exploring Dynamic Groupware Environments) Lab. His research focuses on using mobile location-aware devices in social coordination. He received his BCSc in computer science from Dalhousie University. Contact him at the Faculty of Computer Science, Dalhousie Univ., Halifax, NS B3H 1W5, Canada; [email protected].

Michael Welsman-Dinelle is an undergraduate student majoring in computer science at Dalhousie University. During the winter term of 2005 he contributed to human-computer interaction research in the EDGE (Exploring Dynamic Groupware Environments) Lab as part of a co-op work term and intends to tackle more research questions in the future for his honors thesis. Contact him at the Faculty of Computer Science, Dalhousie Univ., Halifax, NS B3H 1W5, Canada; [email protected].

Kori Inkpen is an associate professor in the Faculty of Computer Science at Dalhousie University and is the director of the EDGE (Exploring Dynamic Groupware Environments) Lab. Her main research interests involve computer support for collaboration and ubiquitous computing. She received her PhD in computer science from the University of British Columbia. She’s a member of the ACM. Contact her at the Faculty of Computer Science, Dalhousie Univ., Halifax, NS B3H 1W5, Canada; [email protected].

ACKNOWLEDGMENTS

Thanks to all researchers involved in the studies discussed in this article: Ritchie Argue, Colin Bate, Vicki Ha, Kirstie Hawkey, Melanie Kellar, Joe MacInnes, Bonnie MacKay, Mike Nunes, Karen Parker, Malcolm Rodgers, and Tara Whalen. We also thank the Natural Sciences and Engineering Research Council of Canada and Dalhousie University for supporting the research, and the article reviewers for their helpful comments.

REFERENCES

1. G. Abowd and E. Mynatt, “Charting Past, Present and Future Research in Ubiquitous Computing,” ACM Trans. Computer-Human Interaction, vol. 7, no. 1, 2000, pp. 29–58.

2. J. Kjeldskov and K. Graham, “A Review of Mobile HCI Research Methods,” Proc. 5th Int’l Symp. Human-Computer Interaction with Mobile Devices and Services (Mobile HCI 2003), LNCS 2795, Springer-Verlag, 2003, pp. 317–335.

3. M. Fleck et al., “From Informing to Remembering: Deploying a Ubiquitous System in an Interactive Science Museum,” IEEE Pervasive Computing, vol. 1, no. 2, 2002, pp. 13–21.

4. M. Isomursu et al., “Experience Clip: Method for User Participation and Evaluation of Mobile Concepts,” Proc. 8th Biennial Conf. Participatory Design, ACM Press, 2004, pp. 83–92.

5. M. Buchenau and J. Suri, “Experience Prototyping,” Proc. Conf. Designing Interactive Systems: Processes, Practices, Methods, and Techniques (DIS 2000), ACM Press, 2000, pp. 424–433.

6. M. Walker et al., “High-Fidelity or Low-Fidelity, Paper or Computer? Choosing Attributes When Testing Web Prototypes,” Proc. Human Factors and Ergonomics Soc. 46th Ann. Meeting, Human Factors and Ergonomics Soc., 2002, pp. 661–665.

7. M. Chalmers, “Seamful Design and Ubicomp Infrastructure,” presented at the UbiComp 2003 workshop, At the Crossroads: The Interaction of HCI and Systems Issues in UbiComp, www.equator.ac.uk/var/uploads/M.%20Chalmers%20(2003),%20Seamful%20Design%20and%20Ubicomp%20Infrastructure.2003.pdf.

8. L. Yang et al., “Topiary: A Tool for Prototyping Location-Enhanced Applications,” Proc. 17th Ann. ACM Symp. User Interface Software and Technology, ACM Press, 2004, pp. 217–226.

9. R. Hull et al., “Rapid Authoring of Mediascapes,” Proc. 6th Int’l Conf. Ubiquitous Computing (UbiComp 04), LNCS 3205, Springer-Verlag, 2004, pp. 125–142.
