
Perceptually Driven Cognitive Mapping in a Virtual Urban Environment

Randall W. Hill, Jr. Changhee Han Michael van Lent

USC Institute for Creative Technologies, 13274 Fiji Way, Suite 600, Marina del Rey, CA 90292-7008

{hill, changhee, vanlent}@ict.usc.edu

Abstract

This paper describes a method for building a cognitive map of a virtual urban environment. Our routines enable virtual humans to map their environment using a realistic model of perception. We based our implementation on a computational framework proposed by Yeap and Jefferies (Yeap & Jefferies 1999) for representing a local environment as a structure called an Absolute Space Representation (ASR). Their algorithms compute and update ASRs from a 2-1/2D sketch[1] of the local environment, and then connect the ASRs together to form a raw cognitive map. Our work extends the framework developed by Yeap and Jefferies in three important ways. First, we implemented the framework in a virtual training environment, the Mission Rehearsal Exercise (Swartout et al. 2001). Second, we describe a method for acquiring a 2-1/2D sketch in a virtual world, a step omitted from their framework but one that is essential for computing an ASR. Third, we extend the ASR algorithm to map regions that are partially visible through exits of the local space. Together, the implementation of the ASR algorithm along with our extensions will be useful in a wide variety of applications involving virtual humans and agents who need to perceive and reason about spatial concepts in urban environments.

Introduction

Our goal is to develop virtual humans with believable perceptual and spatial behaviors. For a growing number of computer games, military training simulations, and immersive learning environments, the willingness of the participant to suspend disbelief hinges on the realism of the behavior of the virtual humans. Behaviors such as self-location and way-finding have been investigated extensively in mobile robot applications, but there are numerous other spatial tasks, more human in nature, that need to be simulated in these applications. Interesting examples include communicating spatial information in natural language and social conventions such as initially blocking a doorway with your body and then stepping back to invite the visitor in. In military training simulations these include coordinated tactical movements, crowd control, avoiding snipers and ambushes, selecting helicopter landing zones, and establishing a security perimeter, to name a few. Underlying all these behaviors is the ability to perceive and build a spatial representation of the environment.

[1] Marr (1982) defines a 2-1/2D sketch to be a list of surfaces and their spatial layout. The sketch only includes the visible portions of the surfaces in the agent's field of view.

Copyright © 2002, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Humans are quite good at remembering the layout of the places they inhabit or have visited and using this information to reason about everyday tasks such as finding the local grocery store and locating a parking space in spite of the traffic jam at one end of the parking lot. Becoming familiar with the configuration of a place like a town is a process that involves walking around, looking at buildings, trees, landmarks, streets, and other details of the environment that are subsequently encoded into memories that make the place recognizable and easily navigated. The process of forming these spatial memories is called cognitive mapping (Chown, Kaplan & Kortenkamp 1995; Kuipers 1978, 2000; Yeap 1988; Yeap & Jefferies 1999). The ability to build a cognitive map is useful for any agent that has a need for tracking its location, navigating, and determining where places are located with respect to one another (Chown, Kaplan & Kortenkamp 1995; Kortenkamp, Bonasso & Murphy 1998; Kuipers 1978, 2000; Levitt & Lawton 1990).

This paper describes a method for building a cognitive map of a synthetic urban setting based on the realistic limits of human visual perception. Humans have a limited field of view and cannot see through solid objects like walls, and these same limitations are imposed on our virtual agents. Only by making a series of observations from different perspectives over time can a cognitive map be built.

We based our implementation on a computational framework proposed by Yeap and Jefferies (Yeap & Jefferies 1999) that represents a local environment as a structure called an Absolute Space Representation (ASR). Building an ASR involves perceiving the local surroundings, the area immediately visible to the viewer, and computing the boundaries and exits of this space. The boundaries are obstacles that prohibit movement through the space, such as walls. Exits are gaps in the boundaries that permit the agents to leave one local space and enter another. For example, a room would be a single ASR with a number of boundaries (walls) and a single exit (the door). The exit would connect to another ASR (the hallway) with a number of boundaries and exits (doors) connecting to more ASRs representing other offices. By exploring a series of local spaces, representing them as ASRs, and connecting them together via their exits, a viewer builds a raw cognitive map[2]. We have taken this framework and extended it in a number of ways:

• We applied a theoretical computational framework of cognitive mapping to a training application that includes virtual humans in a virtual environment. To date most cognitive theories have been implemented in mobile robots, whose perceptual abilities are somewhat different than a human's, and whose purpose is not to exhibit human-like behavior. Yeap tested his theory with a simulated robot in a 2D world. Our cognitive mapping is done in the urban environment of the Mission Rehearsal Exercise (Swartout et al. 2001). Urban environments are of particular interest to game developers and the military simulation community.

• We extract a 2-1/2D sketch from a scene in a graphically rendered virtual world. Yeap finesses the issue of perception by assuming that a 2-1/2D map is going to be available. Computer games and military simulations generally also avoid the perception step by using a database of 3D models.

• We extended Yeap and Jefferies' cognitive mapping algorithms (Yeap & Jefferies 1999). Instead of limiting the agent to building only one ASR at a time, focusing only on the immediate surroundings, we save the residue of what has been perceived through the exits in the local environment and begin the construction of new ASRs before the areas are actually visited. We made this extension because we believe that cognitive mapping must not be limited to places that have been physically explored. Virtual humans need to build cognitive maps in anticipation of the next space they will enter.

Figure 1: View of a street in a virtual urban environment.

[2] A raw cognitive map contains just information about the local environment, without the addition of semantic interpretation (Yeap 1988; Yeap & Jefferies 1999).

Figure 2: View of a second street with trees.

Motivation

As previously stated, we are developing virtual humans for an immersive military training environment called the Mission Rehearsal Exercise (MRE) System. In the MRE the participants interact with virtual soldiers to perform missions involving tasks such as securing an area from attack, controlling an angry crowd, tending to an injured child, and securing a landing zone for a medevac helicopter. To perform these tasks the virtual soldiers must explore their surroundings, locate a suitable clear space, identify the potential lanes of attack into that space, and position themselves to block these lanes of attack. Performing these tasks requires spatial knowledge about landing zones and lanes of attack, as well as perception of the environment to locate regions and exits that match those spatial concepts.

Many current applications finesse perception and spatial reasoning as much as possible. Computer games (Liden 2001), virtual worlds (Doyle 2002), and military simulations (Reece, Kraus & Dumanoir 2000; Stanzione et al. 1996) often require a designer to annotate the environment with invisible spatial references to help virtual humans behave believably. Another approach is to give agents omniscient perception, giving them a complete map of the static environment and the current location of every dynamic entity. The alternative, demonstrated by the research presented here and the research of Terzopoulos and Rabie (Terzopoulos & Rabie 1995), is to give virtual humans realistic perception of their environment. Perception would be realistic both in the types of information sensed (no invisible spatial cues, no map) and in the limitations on that sensing (no 360-degree field of view, no seeing through walls). As the virtual human moves around and views the environment from different perspectives, it constructs a cognitive map of its surroundings and uses that map for spatial reasoning.

Creating a cognitive map of the virtual environment, based on realistic perception, has a number of advantages over annotating the environment with spatial references. Different virtual humans can represent the environment with different cognitive maps based on their roles and knowledge. While the underlying ASR representation may be the same, the annotations placed on the spatial map would depend on the role and knowledge of the virtual human. A local resident's cognitive map of their home city, including street names and friends' houses, would be very different from the cognitive map of a soldier sent to defend that city, which might include lines of attack and defensive strong points. Different map representations, based on different roles, will have far-reaching implications for the behavior of the virtual humans, affecting everything from natural language understanding and generation to movement and goal selection. In addition, cognitive mapping doesn't require the environment designer to embed spatial information in the environment, which can be a time-consuming process. When spatial knowledge is encoded in the model, the designer must anticipate every behavior that could potentially be associated with a feature, leaving little for the agent to decide.

A cognitive map built from realistically limited perception also has a number of advantages over giving agents omniscient perception. At first it might seem that omniscient agents are simpler since they don't require a realistic model of perception. However, for their behavior to be believable, omniscient agents must pretend to ignore the sensory information they wouldn't realistically perceive. Differentiating between the information they should and should not pretend to ignore requires a model of realistic perception at some level. In fact, realistically limited perception can help to guarantee that a virtual human is behaving believably by not allowing its behavior to be affected by information a real human would not have. Realistic perception will lead to virtual humans that explore the environment and look around realistically to map their surroundings. In addition, these agents will get lost and make realistic mistakes based on their limited knowledge of the environment.

Building a Cognitive Map

Based on the Absolute Space Representation (ASR) algorithm developed by Yeap and Jefferies (Yeap & Jefferies 1999), our virtual human maps the local environment by continuously perceiving a scene, constructing a 2-1/2D sketch of the surfaces, building a local map, and connecting it with other local maps that it has already constructed in the process of exploring a virtual town. Our mapping algorithm takes into account major static objects (i.e., buildings and trees) that exist anywhere in the urban environment. Buildings are represented in the virtual environment by polygons that form the walls and roof of each building. Features on the walls (i.e., doors and windows) are texture-mapped onto the polygons and are thus ignored by our system. Each tree is represented by two or three polygons arranged in an X or star shape, with the image of the tree texture-mapped onto each polygon. The perception system constructs a 2-1/2D sketch from these static objects as described below.

The basic idea behind Yeap's theory of cognitive maps (Yeap 1988) is to build a representation of the open space around the viewer. As previously mentioned, this space is defined by the boundaries and exits that surround the viewer. The key to Yeap's construction of a raw cognitive map is the identification of the exits, which are defined as gaps between obstacles. This is the commonsense definition of an exit. But how does one compute it? We need to start by looking for gaps in the surfaces surrounding the viewer, beginning by looking for occluded edges. An exit is a way of leaving a local space. It is also a signal to compute a new ASR. Exits serve another important purpose in that they identify places in the space that have not been uncovered yet. These are occluded places that the viewer is not sure about. Such a place may not actually be an exit, merely a spot that has not been explored yet. If the goal is to build a complete raw cognitive map of an area, then the exits may actually be areas one needs to explore more fully, thus guiding the mapping process.

Figure 3: Detecting edges in the urban scene from Figure 1.

Figure 4: Detecting edges in the urban scene from Figure 2.

Constructing a 2-1/2D sketch

Yeap and Jefferies' cognitive mapping algorithm takes as input a 2-1/2D sketch of the scene (Marr 1982; Yeap & Jefferies 1999). The sketch is the set of boundary surfaces, including depth information, currently perceived by the viewer. These surfaces are represented as an ordered list of edges (with vertices), as they appear from left to right in the field of view. But how is this sketch constructed? The answer depends on the domain of the application. Yeap tested the algorithm in a 2D simulated domain, but no details are provided about how the sketch was derived. In a mobile robot domain, the sensors and computer vision system detect the edges and surfaces and recognize objects in an effort to determine that the obstacles are indeed buildings or other real things. Much progress has been made in this area (e.g., see Kortenkamp, Bonasso & Murphy 1998 on mobile robotics), but it still remains a significant challenge. One of the contributions of this paper is an approach to building a 2-1/2D sketch in graphically rendered virtual environments.

We took a hybrid approach to building the 2-1/2D sketch that combines the use of the graphical model, known as the scene graph, which contains nodes corresponding to the objects in the scene, a graphics-rendering engine, and visual routines for edge detection. Each of the buildings and other objects in Figures 1 and 2 is represented as a node in the scene graph that will be rendered in real time. Rather than relying on computer vision to recognize that these are buildings or trees, we simplify the process by using the scene graph to differentiate between individual buildings, individual trees, and the ground. But this only takes us part of the way toward building a 2-1/2D sketch. To do this, we take the following steps:

1. Traverse the scene graph and assign a unique number to each node corresponding to a static object (i.e., building or tree). This is done by taking advantage of the node pre-draw callback function in the graphics routines. As a result, each of the static objects, which are fairly simple boxes or star-shaped "trees" underneath the texture maps, is assigned a unique number that will be used later for edge detection.

2. Cull the nodes, leaving only the visible ones. This step creates the occlusions that the viewer would experience in the real world. Without this step the model would be transparent to the viewer, enabling the virtual human to see through solid walls. This step is essential for creating a 2-1/2D sketch; without the occlusions the viewer would be able to create a full 3D model.

3. Draw each node with its assigned number (color). The result of this step can be seen in Figures 3 and 4, where the static objects appear as different colors, corresponding to the unique numbers that were assigned.

4. Find the edges between the ground and the static objects using standard edge detection techniques.

   • Use the graphics z-buffer to get the depth into the picture; we need the (x, y, z) positions of the points.

   • Assume the color (number) of the ground is known. Scan from the sky downward to find the ground edge, doing this across the image.

The result is a set of line segments along the boundaries between the static objects and the ground. Pixelation may result in short line segments that have to be joined together to form longer lines. These longer lines are smoothed out using standard edge detection techniques.

The output from this step is a 2-1/2D sketch, which is a set of edges and vertices in a format that can be used for Yeap's ASR algorithm, described in the next section[3].

[3] For the details of the algorithm, see Yeap and Jefferies (1999).
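To make step 4 concrete, here is a minimal sketch of the column-by-column scan in Python. It assumes the color-coded ID render from steps 1-3 and the z-buffer are available as 2D arrays, and that a caller-supplied `unproject` function maps a pixel and its depth to a world-space point; `ground_id` and `sky_id` are hypothetical identifiers, and this is an illustration of the idea rather than the authors' implementation.

```python
import numpy as np

def extract_ground_boundaries(id_buffer, depth_buffer, ground_id, sky_id, unproject):
    """Scan each image column from the sky downward and record the first
    pixel where a static object sits directly above a ground pixel.
    Returns, per object ID, boundary points ordered left to right."""
    height, width = id_buffer.shape
    boundaries = {}  # object ID -> list of (x, y, z) boundary points
    for col in range(width):
        for row in range(height - 1):
            above, below = id_buffer[row, col], id_buffer[row + 1, col]
            # An object/ground transition marks a boundary pixel in this column.
            if below == ground_id and above not in (ground_id, sky_id):
                boundaries.setdefault(above, []).append(
                    unproject(col, row, depth_buffer[row, col]))
                break
        # Columns showing only sky and ground contribute no boundary point.
    return boundaries
```

Consecutive points sharing an object ID would then be joined into segments and smoothed, as described above.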

Figure 5: (a) The 2-1/2D sketch of the scene in (c). (b) Boundary segments representing trees are identified in red. (c) The boundary segments overlaid onto the urban environment, with trees indicated by circles.

Mapping the local space

Once a 2-1/2D sketch has been built, the key to computing an ASR is detecting where the boundaries and exits are located in the local space. Exits serve not only the obvious functional role of providing egress from a local space; passing through an exit also triggers the construction of a new local map, which is represented as an ASR (Yeap & Jefferies 1999). Exits serve as the connections between the maps of local spaces (ASRs), and the raw cognitive map ends up being a network of exit nodes connecting local maps. Finding the boundaries of the local space is important for defining the extent of the area. Locating the exits is essential, both as a way of indicating how to leave a local space and as a way of connecting pieces of the cognitive map together into a whole.

Exits are detected by looking for places in the scene where one surface partially occludes another. The gap between the two surfaces is what Yeap and Jefferies (Yeap & Jefferies 1999) call an occluded edge. An occluded edge has a visible vertex, which is also called the occluding vertex and is closest to the viewer, and an occluded vertex, which is where the occluded edge intersects with the backmost surface.

Let's assume we want to calculate an exit in Figure 6 using Yeap and Jefferies' algorithm (Yeap & Jefferies 1999). An occluded edge, CD, divides the surfaces in the current field of view. Thus, we split the surfaces into two groups: one containing all the surfaces left of vertex C and the other containing all the surfaces right of vertex C. An exit is the shortest span between the occluding vertex (i.e., C) and a point in the second group of surfaces. In this example, CJ is selected as the shortest span. Candidates that were rejected as longer spans include CD, CE, CF, CP, CG, CH, CI, CK, and CL. Point P is identified because CP is perpendicular to surface FG. In this case, CJ is a doubtless exit because J, the selected vertex, is not the occluded vertex. The CJ exit is the gap that must be crossed in order to reach the occluded edge. If CD were the shortest span, this exit would have been a doubtful exit.

Figure 6: Calculating an exit in a 2-1/2D sketch.

To identify an exit, the surfaces from the current 2-1/2D sketch are scanned in order, from left to right, in a search for occluding vertices. Since the exit is the gap that must be crossed in order to reach an occluded edge, identifying the exit starts with the occluded edge. Each unoccluded vertex is chosen and the closest point on a surface contained in the opposite group is found. The exit is the edge formed by the unoccluded vertex and the closest point. Once identified, the exit is inserted into the list of surfaces in its logical place, adjacent to the surfaces contributing the vertices. The surfaces beyond the exit are trimmed from the ASR. They are no longer taken into consideration for mapping the local space since they have been determined to be outside the exit. Yeap discards the trimmings, but in our implementation this residue is saved and used to map spaces outside of one's local space. This will be discussed in more detail in the section on mapping outside the local space. Figure 7 shows a 2-1/2D map of a virtual urban environment before the exits are identified. Figure 8 shows the same 2-1/2D map with the doubtless and doubtful exits identified and displayed. The slight differences between the boundaries in the two images are the result of slight variations in the edge detection between runs.
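As a concrete illustration of the span computation, the following Python sketch finds the shortest span from an occluding vertex to the surfaces in the opposite group. The candidate points for each surface are its endpoints plus, when it falls within the segment, the foot of the perpendicular from the occluding vertex (point P on FG in Figure 6). The function names are ours, and the opposite group is assumed to be nonempty.

```python
import math

def candidate_points(c, surface):
    """Endpoints of a surface plus, when it lies within the segment,
    the foot of the perpendicular from c (cf. point P in Figure 6)."""
    (ax, ay), (bx, by) = surface
    yield (ax, ay)
    yield (bx, by)
    abx, aby = bx - ax, by - ay
    denom = abx * abx + aby * aby
    if denom > 0.0:
        t = ((c[0] - ax) * abx + (c[1] - ay) * aby) / denom
        if 0.0 < t < 1.0:  # the foot falls inside the segment
            yield (ax + t * abx, ay + t * aby)

def shortest_span(c, opposite_surfaces):
    """Closest point in the opposite surface group to the occluding vertex c;
    the exit is the segment between them (CJ in the example above)."""
    best = min((p for s in opposite_surfaces for p in candidate_points(c, s)),
               key=lambda p: math.dist(c, p))
    return best, math.dist(c, best)
```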

Updating the local map

Since the viewer's perspective changes over time, the ASR must continually be updated. Even a simple shift in gaze will uncover more details about the environment. Moving through the environment will cause some occlusions to be uncovered and others to be formed, so the question is how to incorporate this information into the raw cognitive map.

Figure 7: A top-down perspective of a 2-1/2D sketch of the virtual environment as perceived from the position marked by the end of the red line on the right.

Figure 8: The 2-1/2D sketch from Figure 7 after exit identification. The building boundaries are shown as dark lines, the doubtless exits as thin lines, and the doubtful exits as dotted lines. The viewer's position is shown by the rightmost red line.

Yeap and Jefferies (Yeap & Jefferies 1999) distinguish between two kinds of exits: doubtful and doubtless. A doubtless exit is one that takes the viewer out of the local space. It consists of two unoccluded vertices; both must have been visible at some time during the mapping process. A doubtless exit is the shortest possible span between two surfaces. Once Yeap's algorithm has determined that an exit is doubtless, it no longer needs to be updated.

When one of an exit's vertices is occluded, it is a doubtful exit. As the viewer moves through the environment, this type of exit must be updated. This is because, as one's perspective changes, more of the occluded edge may be uncovered, and the location of the occluded vertex will also change to be the shortest distance spanning the gap. This goes on until one of two things happens: either the exit is identified as doubtless (i.e., both vertices are unoccluded) or the occluded surface is completely uncovered and it is discovered that there is no exit.

The ASR is updated once per frame, where the frame rate may be as high as 20-30 frames per second. This may prove to be excessive in the long run, but it works for now. Each update involves the following steps[4]:

1. Sense the environment and construct a 2-1/2D sketch. Call this perspective CURRENT-VIEW.

2. Check whether the viewer is still inside the current ASR. This can be achieved with a simple intersection test: draw a line from the viewer's current location to the initial position in the ASR and check whether this line intersects with the surface of the ASR.

3. If an exit has NOT been crossed, update the doubtful exits based on the CURRENT-VIEW. If the change in perspective uncovers an occlusion, this will cause the size of the corresponding doubtless exit to decrease. For each doubtful exit:

   a. Label the two surfaces that contribute vertices to the doubtful exit as S1 and S2.

   b. If CURRENT-VIEW includes S1 and S2, then replace the doubtful exit with the surfaces that lie between S1 and S2. Note: we found that we had to relax this condition somewhat because there are cases where the vertices of the doubtful exit are outside of the field of view of the agent.

4. Else, if an exit has been crossed, the viewer is no longer in the local space represented by the current ASR. The next section deals with this situation, which involves extending the raw cognitive map with the current ASR and either starting a new ASR or using a previously computed one.

[4] These steps are based on the extend-ASR algorithm in Yeap and Jefferies (1999).
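Schematically, one update cycle might look like the following Python sketch. The agent, ASR, and cognitive-map interfaces (`build_sketch`, `exit_crossed`, `doubtful_exits`, and so on) are hypothetical stand-ins for whatever representation is used, not an API from the paper.

```python
def update_local_map(agent, cognitive_map, current_asr):
    """One per-frame update following steps 1-4 above (illustrative only)."""
    # Step 1: sense the environment and construct a 2-1/2D sketch.
    current_view = agent.build_sketch()
    # Step 2: intersection test between the line back to the ASR's
    # initial position and the ASR's surfaces.
    if not current_asr.exit_crossed(agent.position):
        # Step 3: update each doubtful exit against the new view.
        for exit_ in list(current_asr.doubtful_exits()):
            s1, s2 = exit_.contributing_surfaces()
            if current_view.contains(s1) and current_view.contains(s2):
                current_asr.replace_exit(exit_, current_view.surfaces_between(s1, s2))
        return current_asr
    # Step 4: an exit was crossed; extend the raw cognitive map and start
    # a new ASR or reuse a previously computed one (see the next section).
    cognitive_map.add(current_asr)
    return cognitive_map.asr_at(agent.position) or cognitive_map.new_asr(agent.position)
```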

Extending the Cognitive Map

As new areas are mapped, they are added to a network of ASRs that comprises the raw cognitive map. Whenever the viewer crosses an exit and enters a previously unexplored area, a new ASR is computed. Figure 9 shows a raw cognitive map with three ASRs. In this example the viewer starts where the arrows begin and proceeds up the street, turns left at an alley, goes between two buildings, and enters an open area surrounded by some buildings. The first ASR maps the street and ends when the street enters an intersection with another street, the second ASR represents the alleyway between the buildings, and the third ASR is still being formed for the open area, as shown on the left side of Figure 9. Note that the third ASR contains doubtful exits on the left and right sides of the viewer. This indicates that the area has not yet been completely mapped. Once the viewer's perspective has been rotated, these areas will be filled in with surfaces and doubtless exits. Figure 10 shows a more complete map of the third ASR overlaid onto the image of the town.

Extending a raw cognitive map requires the ability to recognize an area that has previously been visited; otherwise areas would be re-mapped every time they were visited. The recognition routine is triggered when the viewer crosses an exit.

When the viewer crosses an exit, there are three possible cases:

1. The newly entered space was previously mapped and the exit is a known connector between the two ASRs. When this is the case, no updates to the raw cognitive map are required. Use the ASR from the raw cognitive map as a map of the local space.

2. The newly entered space was previously mapped, but it was not known that this exit connected these two ASRs. In this case, update the raw cognitive map to reflect the fact that this exit is a connector, and use the ASR from the raw cognitive map.

3. The newly entered space is unexplored, so the viewer must begin mapping it. The steps in mapping this space are: (1) place the just-exited ASR into the raw cognitive map, (2) create a new ASR, and (3) connect the ASR the viewer just departed with the new ASR at the exit point.
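The three cases reduce to a small dispatch routine; the sketch below again uses hypothetical map and ASR interfaces:

```python
def on_exit_crossed(cognitive_map, old_asr, exit_, position):
    """Handle the three cases above when the viewer crosses an exit."""
    known = cognitive_map.asr_containing(position)
    if known is not None:
        # Cases 1 and 2: the space was previously mapped. If this exit was
        # not yet known to join the two ASRs, record the connection.
        if not cognitive_map.connected(old_asr, known, via=exit_):
            cognitive_map.connect(old_asr, known, via=exit_)
        return known
    # Case 3: unexplored space. Store the departed ASR, create a new one,
    # and connect the two at the exit point.
    cognitive_map.add(old_asr)
    new_asr = cognitive_map.new_asr(position)
    cognitive_map.connect(old_asr, new_asr, via=exit_)
    return new_asr
```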


Figure 9: Three ASRs are shown connected together. The third ASR contains both doubtless (thin lines) and doubtful (dotted lines) exits.

Mapping Outside the Local Space

We developed an extension to Yeap and Jefferies' algorithm that enables the viewer to map spaces outside the current ASR. In their version, the ASR algorithm maps the local space by iteratively identifying exits and trimming off the surfaces beyond the exit. The only thing that is mapped is what is in the current local space as they define it. Our extension to Yeap's approach is to use the surfaces beyond exits to create a preliminary map of spaces that aren't local to the agent.

We do not believe that humans discard what they see on the other side of an exit. The cognitive mapping process is not confined to one's local space. A person walking around in an unfamiliar building will probably focus their attention on perceiving and mapping the local space, but it seems highly improbable that they would ignore the layout of a room that happens to be on the other side of a door or down a hallway. In fact, what is seen down the hallway (or down the street), which is a different local space, may provide important information that will impact the behavior of the viewer even before that space is entered.

An example of this arises in the context of an application that we have been working on for a military training exercise in peacekeeping operations. Some virtual soldiers are looking for an open area that would be suitable for a medevac helicopter to land. A quick glance down an alley or street may reveal that there is no open space in the immediately adjacent spaces, but that further down the street there is a major intersection where it may be possible for a helicopter to land. The intersection can be observed and partially mapped without physically leaving the current local space. If we restricted the cognitive mapping to only areas that had been physically visited, then the soldiers would have to behave unrealistically to acquire knowledge that is literally right before their eyes. For example, a soldier standing on the upper end of the first ASR shown in Figure 10 would be able to see into the intersection that is covered by the red shading. But according to Yeap and Jefferies (1999), this would not be mapped, and therefore would not be accessible, unless the soldier took a step out of the current ASR toward the intersection.

To map areas outside of the current local space, we modified the ASR algorithm so that the areas outside the exits are not discarded. These are saved to form partial ASRs of the adjacent local spaces.

Figure 10: A cognitive map, including residual-ASRs (shaded regions) constructed from the residue of local computations.

Figure 11: Another cognitive map of a different virtual environment.

The basic idea is to not only compute an ASR of the current local space, but at the same time also map the perceivable surroundings outside the local space. We call these surroundings residual-ASRs, since they are built from the residue trimmed off of the current ASR. Residual-ASRs are updated every perception cycle, and their composition relies completely on the successive visual perspectives of the viewer. Computing a residual-ASR involves the following steps:

1. Each perception cycle, create a 2-1/2D sketch of the area in the agent's field of view[5]. We refer to this sketch as the CURRENT-VIEW.

2. Subtract the current ASR from the CURRENT-VIEW. Call the remainder the residue. This computation involves two steps:

   a. For each currently visible exit in the ASR, identify the surfaces and gaps in the CURRENT-VIEW that appear through that exit. Designate these surfaces and spaces as the residue for that exit.

   b. Once the residue for an exit has been identified, use it to compute an ASR, i.e., identify the exits (doubtless and doubtful) and the surfaces using the same algorithm described previously. The result is the current-residual-ASR for that exit.

3. After each perception cycle, update the cumulative residual-ASR for each of the exits. The current-residual-ASR is only a snapshot; its results are used to update the cumulative residual-ASR. The updating may involve adding new surfaces, changing exits from doubtful to doubtless, or reducing the size of doubtless exits where occlusions are uncovered.

[5] This is the same 2-1/2D sketch that is used as input to the ASR-update algorithm.
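In outline, the residual computation per perception cycle might look like this Python sketch, where `compute_asr` stands for the exit-finding procedure described earlier and the remaining names (`visible_exits`, `seen_through`, `ResidualASR`, `merge`) are hypothetical:

```python
def update_residual_asrs(current_view, current_asr, residual_asrs):
    """One perception cycle of residual mapping, following steps 1-3 above."""
    for exit_ in current_asr.visible_exits():
        # Step 2a: the residue is whatever the view shows through this exit.
        residue = current_view.seen_through(exit_)
        if not residue:
            continue
        # Step 2b: run the usual exit-finding algorithm on the residue,
        # yielding a snapshot: the current-residual-ASR for this exit.
        snapshot = compute_asr(residue)
        # Step 3: fold the snapshot into the cumulative residual-ASR,
        # adding surfaces, promoting doubtful exits to doubtless, and
        # shrinking doubtless exits where occlusions were uncovered.
        residual_asrs.setdefault(exit_.id, ResidualASR()).merge(snapshot)
```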

With this extension to the basic ASR algorithm, a virtual human can map the perceivable areas outside of the local space while retaining the spatial interpretation afforded by the ASR. But what happens to these residual-ASRs as the viewer travels from one local space to another? There are three cases we have considered:

1. As the viewer moves from one local space (ASR) to another, all of the residual-ASRs are saved and indexed by the location of the exit through which the residue was collected. An ASR may have multiple residual-ASRs, one for each exit. When the viewer re-enters an ASR, the residual-ASRs become available again.

2. When a viewer goes through an exit into an area that was not previously visited, it will likely have a residual-ASR that it computed for that space. At this point the residual-ASR is discarded and an ASR is computed. In our future work we will use the residual-ASR as a starting point for computing a new ASR.

3. When the viewer looks through an exit into a local space that has already been visited, the viewer will recognize the space as having already been mapped, so it will not create a residual-ASR. It recognizes the space by taking the coordinates of the exit and indexing into the raw cognitive map, which contains all the exits and their locations.

This extension to Yeap and Jefferies' theory and algorithms provides the viewer with the ability to map areas outside of its local space. Figures 10 and 11 show some residual-ASRs shaded in red. For example, on the right-hand side of Figure 10 there is a residual-ASR for the exit between the two buildings, looking out to the space beyond. In some cases phantom edges were detected, due in part to the occlusions in the environment. In Figures 10 and 11, there is a slight mismatch between the lines of the cognitive map and the background urban image due to scaling and alignment differences in the two software packages used to produce the two images.

Applications of Cognitive Maps

Once a cognitive map of an area of the environment has been generated, the virtual human who generated that map can use it in a number of ways. In the Mission Rehearsal Exercise (Swartout et al. 2001) mentioned earlier, many of the predicates used by the virtual human's planner involve spatial concepts. These predicates represent concepts such as individuals or groups occupying a specific region (medic-at-injury-site, crowd-in-landing-zone) and exits/entrances to a region being covered (landing-zone-secure, injury-site-secure). Currently the status of these predicates is updated through the script that drives the exercise. However, we are changing how these predicates are calculated so that they are grounded in the virtual human's perception and spatial reasoning. In the new approach the virtual human will create a cognitive map that includes ASRs corresponding to regions such as the landing zone and injury site. Updating a predicate such as medic-at-injury-site will involve visually locating the medic and comparing the medic's location to the boundaries of the injury-site ASR. Updating the landing-zone-secure predicate will involve visually inspecting each exit of the landing-zone ASR to ensure that friendly soldiers are protecting the exits.
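For instance, a predicate such as medic-at-injury-site reduces to a containment test against the corresponding ASR. A minimal sketch, with hypothetical perception and map interfaces:

```python
def medic_at_injury_site(perception, cognitive_map):
    """True if the visually located medic lies inside the injury-site ASR."""
    medic_pos = perception.locate("medic")          # None if not in view
    if medic_pos is None:
        return False
    site = cognitive_map.asr_named("injury-site")   # role-specific annotation
    return site is not None and site.contains(medic_pos)
```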

In addition to updating spatial predicates, a cognitive map can also be used to implement spatially oriented strategies. For example, a flanking maneuver might involve locating the ASR the enemy is in and attacking through two of that ASR's exits simultaneously. Inherent in this strategy are the concepts of scouting (examining many ASRs to locate the enemy) and desirable defensive positions (ASRs that have a small number of exits). An ASR with a single exit may not be desirable, as it leaves no escape route.
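As a toy illustration of such a heuristic (again assuming a hypothetical map interface), candidate defensive positions could be filtered by exit count:

```python
def defensive_candidates(cognitive_map, max_exits=2):
    """ASRs with few exits are easier to cover, but at least two exits
    are needed to keep an escape route (an illustrative heuristic)."""
    return [asr for asr in cognitive_map.asrs()
            if 2 <= len(asr.exits()) <= max_exits]
```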

Cognitive maps will also be useful in communicating spatial information between agents. If both agents have similar cognitive maps then, once a common set of names for ASRs and exits has been negotiated, the agents can reference features of each other's cognitive maps. Furthermore, one agent can add to another agent's cognitive map (at an abstract level) by communicating spatial information about areas that the second agent hasn't seen. For example, a sergeant might report to his lieutenant: "We've located a suitable space for a landing zone. It's an open area through the west exit of this area. It has three lanes of approach, which have been secured."

Related Work

Cognitive mapping research has been applied in the areas of mobile robotics, military simulations, and computer games. We briefly summarize the relationship of the research in these three areas to our own research (Hill, Han & van Lent 2002).

Kuipers (Kuipers 1978) did some of the groundbreaking work in cognitive mapping. He recently proposed a spatial semantic hierarchy (Kuipers 2000) as a way of representing knowledge of a large-scale space. The spatial semantic hierarchy is actually a set of distinct but related ways of describing space, including sensory, control, causal, topological, and metrical representations. He and Remolina also recently developed a formal logic for causal and topological maps (Remolina & Kuipers 2001). Kuipers has tested his approach on simulated robots. There are numerous other researchers in mobile robotics who have also developed and implemented cognitive mapping techniques, e.g., see (Kortenkamp, Bonasso & Murphy 1998; Levitt & Lawton 1990). Chown et al. (Chown, Kaplan & Kortenkamp 1995) developed the PLAN system, which also uses viewer-based information to build a cognitive map. PLAN was implemented with a connectionist network with the purpose of integrating wayfinding with cognitive mapping. While the research in mobile robotics has a lot in common with our domain, one of the chief differences is that many of their methods were developed to deal with noisy sensors and the difficulty of discerning one's location. Our emphasis is somewhat different in that we are trying to build agents with believable human-like behaviors. The sensors are not noisy, but they do operate with limitations. The end use of our cognitive maps is also somewhat different in that we are concerned not just with wayfinding but also with spatial awareness for a wide variety of tasks that robots are not normally concerned about.

Computer game characters commonly have perceptual omniscience. Their perception is not modeled after human capabilities and limitations. To achieve human-like behavior the designers have to give the appearance of limited perception. Alternatively, their superhuman capabilities are attributed either to superior ability or to cheating, which can be disheartening for human players. Spatial reasoning is frequently programmed into the environment rather than into the game's characters (Liden 2001). The game map consists of nodes linked together into a graph structure, which are then used as paths for the characters. For the characters to exhibit intelligent behavior, knowledge is encoded into the nodes and links about what behavior is appropriate at those locations. So a node or link may have information saying that a location is good for an ambush or that the character should crawl when traversing this link to remain under cover. As we mentioned earlier in this paper, the designers have to encode everything into the environment. While this is efficient in terms of runtime computation, it does not address the issue of generality. It is a labor-intensive process that must be done for each new game environment. An alternative to real-time spatial reasoning is to automatically pre-compute and store information about the environment using the methods described here. This would avoid the problem of having to analyze and hand-encode the spatial characteristics of the environment into the map representation. Laird (Laird 2001) is the one exception in the computer games world. He combines the use of simulated perception (to trace the walls) and mapping to support more sophisticated AI-based behaviors such as ambushing and trapping.

Military simulations generally require a lot of spatial reasoning. Research on virtual humans for military simulations has addressed the issue of spatial reasoning more broadly than in games. For example, Reece et al. (Reece, Kraus & Dumanoir 2000) have built on the work in path planning from AI and robotics. For areas outside of buildings they use A* search and represent the space with cell decomposition and graph planning (which is somewhat similar to what games do). But the characters are not limited to the graph when moving through most environments. Forbus et al. (Forbus, Mahoney & Dill 2001) are striving to apply qualitative spatial reasoning to both military simulations and strategy games. They are currently looking at ways to improve path planning to take into consideration trafficability, visibility, and fields of fire. They use a hybrid approach that combines the representations from cell decomposition and skeletonization. Up until now, however, they have focused on analyzing terrain rather than urban environments.

Acknowledgments

The work described in this paper was supported in part by the U.S. Army Research Office under contract DAAD19-99-C-0046. The content of this article does not necessarily reflect the position or the policy of the US Government.

References

Chown, E., Kaplan, S., and Kortenkamp, D. 1995. Prototypes, location, and associative networks (PLAN): Towards a unified theory of cognitive maps. Cognitive Science 19:1-51.

Doyle, P. 2002. Believability through context. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems. New York: ACM Press.

Forbus, K., Mahoney, J., and Dill, K. 2001. How qualitative spatial reasoning can improve strategy game AIs. In Proceedings of the 2001 AAAI Spring Symposium on Artificial Intelligence and Interactive Entertainment. AAAI Press.

Hill, R., Han, C., and van Lent, M. 2002. Perceptually driven cognitive mapping of urban environments. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems.

Kortenkamp, D., Bonasso, R.P., and Murphy, R., eds. 1998. Artificial Intelligence and Mobile Robots. Menlo Park, CA: AAAI Press.

Kuipers, B. 1978. Modeling spatial knowledge. Cognitive Science 2:129-153.

Kuipers, B. 2000. The spatial semantic hierarchy. Artificial Intelligence 119:191-233.

Laird, J. 2001. It knows what you are going to do: Adding anticipation to a Quakebot. In Proceedings of the Fifth International Conference on Autonomous Agents. Montreal, Canada, May 28-June 1.

Laird, J. and van Lent, M. 2001. Human-level AI's killer application: Interactive computer games. AI Magazine 22(2), Summer.

Levitt, T. and Lawton, D. 1990. Qualitative navigation for mobile robots. Artificial Intelligence 44(3):305-361.

Liden, L. 2001. Using nodes to develop strategies for combat with multiple enemies. In Proceedings of the 2001 AAAI Spring Symposium on Artificial Intelligence and Interactive Entertainment. AAAI Press.

Marr, D. 1982. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W.H. Freeman and Company.

Reece, D., Kraus, M., and Dumanoir, P. 2000. Tactical movement planning for individual combatants. In Proceedings of the 9th Conference on Computer Generated Forces and Behavior Representation. Orlando, FL.

Remolina, E. and Kuipers, B. 2001. A logical account of causal and topological maps. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-01). Seattle, WA.

Stanzione, T., Evans, A., Chamberlain, F., Buettner, C., Mabius, L., Fisher, J., Sousa, M., and Lu, H. 1996. Multiple elevation structures in the Improved Computer Generated Forces terrain database. In Proceedings of the 6th Computer Generated Forces and Behavioral Representation Conference. University of Central Florida.

Swartout, W., Hill, R., Gratch, J., Johnson, L., Kyriakakis, C., LaBore, C., Lindheim, R., Marsella, S., Miraglia, D., Moore, B., Morie, J., Rickel, J., Thiebaux, M., Tuch, L., Whitney, R., and Douglas, J. 2001. Toward the Holodeck: Integrating graphics, sound, character and story. In Proceedings of the Fifth International Conference on Autonomous Agents. New York: ACM Press.

Terzopoulos, D. and Rabie, T. 1995. Animat vision: Active vision in artificial animals. In Proceedings of the Fifth International Conference on Computer Vision (ICCV '95), 801-808. Cambridge, MA, June.

Yeap, W.K. 1988. Towards a computational theory of cognitive maps. Artificial Intelligence 34:297-360.

Yeap, W.K. and Jefferies, M.E. 1999. Computing a representation of the local environment. Artificial Intelligence 107:265-301.