
International Journal of Research (IJR) Vol-1, Issue-7, August 2014 ISSN 2348-6848


Heuristic Search Techniques in Video-Game Pathfinding: A Survey of Issues and Techniques

Azeem Mohammad1, Supreethi K.P2

1Department of Computer Science and Engineering, JNTU College of Engineering, Hyderabad, India

Email: [email protected]

2Department of Computer Science and Engineering, JNTU College of Engineering, Hyderabad, India

Email: [email protected]

Abstract

Real-time heuristic search algorithms must keep their planning time per move within a bound that is independent of problem size. They are suited to environments where memory and time are limited and a fast response is required. Pathfinding in video games is a prime example: multiple units need to react promptly to the player's commands. Classical real-time heuristic search techniques are hard to apply because of their state re-visitation problem. Recent algorithms use a database of pre-computed subgoals to improve performance. However, pre-computation time can be long, and there is no guarantee that the pre-computed data adequately cover the search space. To address these shortcomings, hill climbing and dynamic programming are combined to eliminate the state re-visitation problem.

Keywords: Pathfinding, search space, real-time search.

1. INTRODUCTION

Pathfinding is an active research area in many computing domains, and gaming is one of the most important. Many methods have been devised to find the least-cost path between two points. Because movement is a main aspect of video games, there is a need for methods that compute a path in little time and with little memory. Finding a shortest path within the bounded time period that a real-time game imposes is a difficult task, and this constraint makes pathfinding methods more complex.

Pathfinding computes the shortest route between two nodes so that an agent can move from one point to another. Video games are one of its real-time applications. Heuristic search methods play a significant part in video-game pathfinding, and better, more advanced methods are still being developed to reduce time and memory requirements.

2. HEURISTIC SEARCH

Heuristic search is a core area of Artificial Intelligence (AI) research, and its algorithms have been used in planning, game playing and agent control. A heuristic function informs the search about the goal: it gives an informed way to guess which neighbour of a node will lead to a goal. This information about which nodes seem most promising is expressed as a heuristic function h(n), which takes a node n and returns a non-negative real number that is an estimate of the path cost from node n to a goal node.
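As an illustration of h(n), the sketch below gives two heuristics that are commonly used on grid maps. The grid representation (cells as (x, y) tuples), the unit cost of straight moves and the sqrt(2) cost of diagonal moves are assumptions made for this example and are not taken from the surveyed papers.

```python
import math

def manhattan_distance(node, goal):
    """Estimate for 4-connected grids: every move costs 1 (assumed)."""
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

def octile_distance(node, goal):
    """Estimate for 8-connected grids: straight moves cost 1, diagonal moves sqrt(2) (assumed)."""
    dx = abs(node[0] - goal[0])
    dy = abs(node[1] - goal[1])
    # Take min(dx, dy) diagonal steps, then cover the remainder with straight steps.
    return (math.sqrt(2) - 1) * min(dx, dy) + max(dx, dy)

h_straight = octile_distance((0, 0), (3, 0))
h_diagonal = octile_distance((0, 0), (2, 2))
h_manhattan = manhattan_distance((0, 0), (2, 3))
```

Both functions ignore walls, so they never overestimate the true path cost on such grids.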

Commonly listed search techniques include Generate and Test, Hill Climbing, Simulated Annealing, Depth-First Search, Breadth-First Search, and Best-First Search (or A* Search).







The term heuristic is used for algorithms that find solutions among all possible ones but give no guarantee that the best one will be found. They may therefore be considered approximate rather than exact algorithms.

3. REAL-TIME HEURISTIC SEARCH METHODS

Real-time heuristic search algorithms satisfy a constant upper bound on the amount of planning per action, independent of problem size. This property is important in a number of applications, including autonomous robots and agents in video games. A common problem in video games is searching for a path between two points. In most real-time games, agents are expected to act quickly in response to the player's commands and other agents' actions.

3.1 LRTA*: CORE ALGORITHM

The core of most real-time heuristic search algorithms is an algorithm called Learning Real-Time A* (LRTA*). LRTA* is a special case of value iteration, or real-time dynamic programming, and has a problem that has prevented its use in video-game pathfinding. Specifically, the algorithm updates a single heuristic value per move on the basis of the heuristic values of nearby states. This means that when the initial heuristic values are overly optimistic (i.e., too low), LRTA* will revisit these states many times, each time making updates of a small magnitude. This behaviour is known as scrubbing and appears highly irrational to an observer.
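The behaviour described above can be reproduced with a minimal sketch of basic LRTA* with a one-step lookahead. The function names, the small grid map, the unit move cost and the max_moves cutoff are all assumptions made for this example; the map contains a dead-end pocket so that the initial Manhattan heuristic is too optimistic there.

```python
def lrta_star(start, goal, neighbors, cost, h0, max_moves=10_000):
    """Basic LRTA* with one-step lookahead; returns (path, learned heuristic)."""
    h = {}                                   # heuristic values learned so far

    def h_of(s):
        return h.get(s, h0(s, goal))         # fall back to the initial heuristic

    s, path = start, [start]
    while s != goal and len(path) <= max_moves:
        # Planning: look one move ahead and pick the most promising neighbour.
        best = min(neighbors(s), key=lambda n: cost(s, n) + h_of(n))
        # Learning: update the single heuristic value of the current state.
        h[s] = max(h_of(s), cost(s, best) + h_of(best))
        # Control: move to that neighbour.
        s = best
        path.append(s)
    return (path if s == goal else None), h


# Hypothetical 5x3 map; '#' is a wall. The bottom-left corridor is a dead end.
GRID = [".....",
        ".###.",
        "...#."]

def neighbors(s):
    """Passable 4-connected neighbours of cell s = (x, y)."""
    x, y = s
    cells = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in cells
            if 0 <= cy < len(GRID) and 0 <= cx < len(GRID[0])
            and GRID[cy][cx] != "#"]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

path, learned = lrta_star((0, 2), (4, 2), neighbors, lambda a, b: 1, manhattan)
revisits = len(path) - len(set(path))        # > 0 means states were re-visited
```

On this map the agent walks into the pocket, raises the heuristic values there a little on each pass, and only then leaves, which is the scrubbing pattern the text refers to.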

There have been attempts to speed up the learning process in LRTA*. Most of the resulting algorithms can be described by the following four attributes.

The local search space is the set of states whose heuristic values are accessed in the planning stage. The local learning space is the set of states whose heuristic values are updated; common choices are the current state only, all states within the local search space, and previously visited states and their neighbours. A learning rule is used to update the heuristic values of the states in the learning space. The control strategy decides on the move following the planning and learning phases; commonly used strategies include the first move of an optimal path to the most promising frontier state, the entire path, and backtracking moves.

3.2 THE ADVENT OF LRTA*

Keeping the dynamic-programming-style learning rule, researchers have attempted to speed up the learning process and make state re-visitation less apparent.

A later variant, LSS LRTA*, expands the local search space using A* and updates the heuristic values of all states in the local search space in order to speed up learning. This significantly reduces state re-visitation but does not eliminate the scrubbing problem, and it can still result in highly suboptimal paths.

3.2.1 Pre-computed subgoals

Performance can be improved significantly by solving a number of problems off-line and storing the solutions in a database. Then, on-line, these solved problems can be used to guide the agent by directing it to a nearby subgoal instead of a distant goal.

Several previously developed real-time heuristic search algorithms use pre-computed subgoals.

4. D LRTA*

Although in general planning a goal is often represented as a conjunction of simple subgoals, among the algorithms considered so far the only real-time heuristic search algorithm to implement subgoaling is D LRTA* [1]. In its pre-processing phase, D LRTA* uses the clique abstraction of Sturtevant and Buro (2005) to create a smaller search graph. The clique abstraction collapses a set of fully connected states into a single abstract state and can be applied iteratively to compute progressively smaller graphs. For example, a 2-level abstraction applies the clique abstraction to a graph that has already been abstracted once. Similarly, an a-level abstraction applies the clique abstraction a times. If we assume that each abstraction level reduces the graph by a constant factor, the number of states in an a-level abstract graph shrinks exponentially in a. This abstraction technique in effect partitions the map into a number of regions, with each region corresponding to a single abstract state. Then, for every pair of distinct abstract states, D LRTA* computes an optimal path between the corresponding representative states (e.g., centroids of the regions) in the original non-abstracted space.

EXAMPLE OF D LRTA* OPERATION

(a) Off-line, the map is partitioned into seven regions (or abstract states). Each vacant cell is labelled with its region number.

(b) Off-line, an optimal path between the centroids of two regions (C1 and C2) is computed, and the entry state to the next region (E) is recorded as a subgoal for this pair of regions.

(c) On-line, the agent intends to travel from S to G; it determines the corresponding regions and sets the pre-computed entry state E as its subgoal.

There are three key problems with D LRTA*.







First, because entry states (i.e., subgoals) have to be computed and stored for each pair of distinct regions, the number of regions has to be kept relatively small. In D LRTA* this is accomplished by applying the clique abstraction procedure multiple times so that the regions become progressively larger and fewer in number. A side effect is that regions will no longer be cliques and may, in fact, be quite complex in themselves. As a result, LRTA* may encounter heuristic depressions within a region.

Second, each state in the original space needs to be assigned to a region. Since the regions are irregular in shape, explicit membership records must be maintained. This may require as much additional memory as storing the original grid-based map.

Third, clique abstraction is a non-trivial process and puts an extra programming burden on practitioners (e.g., game developers).

5. TIME BOUNDED A* SEARCH

Another recent high-performance real-time search algorithm is Time-Bounded A* (TBA*), a time-bounded variant of classic A*. It expands states in an A* fashion using a closed list and an open list, away from the original start state and towards the goal, until the goal state is expanded. However, unlike A*, which computes a complete path before committing to the first action, TBA* time-slices the planning by periodically interrupting its search to act. Initially, before a complete path to the goal is known, the agent takes an action that moves it towards the most promising state on the open list. If on a subsequent time slice an alternative most promising path is formed and the agent is not on that path, it backtracks its steps as necessary. This interleaving of planning, acting, and backtracking is done in such a way that both real-time behaviour and completeness are ensured.

The size of the time slice is given as a parameter to the algorithm, using as a metric the number of states it is allowed to expand before planning must be interrupted. Within a single time slice, however, operations for both state expansion and backtracking through the closed list (to form the path to the most promising state on the open list) must be performed. The cost of the latter type of operation is thus converted to state-expansion equivalents (typically several backtracking steps can be performed at the same computational cost as a single state expansion).

A key advantage of TBA* over LRTA*-based algorithms is that it retains its closed and open lists across planning steps. Thus, on each planning step it does not start planning from scratch but continues with the open and closed lists from the previous planning step. Also, it does not need to update heuristics online to ensure completeness, nor does it require a pre-computation phase. While the lack of pre-computation is certainly its strong side, the negatives include high suboptimality when the amount of time per move is low and high on-line space complexity due to storing the closed and open lists.
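The time-slicing idea can be sketched as an A* search whose open and closed lists persist between calls. This is only an illustration of the planning side under stated assumptions: the class name, the step(budget) interface, the open 5x5 grid and the consistent Manhattan heuristic are hypothetical, and the agent's own movement, its backtracking, and the conversion of path-tracing cost into expansion equivalents are all omitted.

```python
import heapq

class TimeSlicedAStar:
    """A* that keeps its open and closed lists between time slices (sketch)."""

    def __init__(self, start, goal, neighbors, cost, h):
        self.goal, self.neighbors, self.cost, self.h = goal, neighbors, cost, h
        self.g = {start: 0}
        self.parent = {start: None}
        self.open = [(h(start, goal), start)]    # heap of (f, state)
        self.closed = set()
        self.best = start                        # most promising state so far
        self.done = False

    def step(self, budget):
        """Expand at most `budget` states; return the path from start to self.best."""
        expanded = 0
        while self.open and not self.done and expanded < budget:
            _, s = heapq.heappop(self.open)
            if s in self.closed:
                continue                         # stale heap entry
            self.closed.add(s)
            expanded += 1
            if s == self.goal:                   # goal expanded: search is finished
                self.best, self.done = s, True
                break
            for n in self.neighbors(s):
                g_new = self.g[s] + self.cost(s, n)
                # Assumes a consistent heuristic, so closed states are never reopened.
                if n not in self.closed and g_new < self.g.get(n, float("inf")):
                    self.g[n], self.parent[n] = g_new, s
                    heapq.heappush(self.open, (g_new + self.h(n, self.goal), n))
        if not self.done:
            while self.open and self.open[0][1] in self.closed:
                heapq.heappop(self.open)         # discard stale entries at the top
            if self.open:
                self.best = self.open[0][1]
        # Trace parents back through the closed list to form the path.
        path, s = [], self.best
        while s is not None:
            path.append(s)
            s = self.parent[s]
        return path[::-1]


def neighbors(s):
    """4-connected neighbours on an open 5x5 grid (hypothetical map)."""
    x, y = s
    cells = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in cells if 0 <= cx < 5 and 0 <= cy < 5]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

search = TimeSlicedAStar((0, 0), (4, 4), neighbors, lambda a, b: 1, manhattan)
slices, final_path = 0, None
while not search.done and slices < 100:
    final_path = search.step(budget=3)           # the agent would act after each slice
    slices += 1
```

Each call resumes from the lists left by the previous one, so no expansion is repeated; the price is that both lists stay in memory for the whole search.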

6. INTUITION FOR KNN LRTA*

kNN LRTA* attempts to address the shortcomings of D LRTA* without using state abstraction. In our design of kNN LRTA* we address the three shortcomings of D LRTA* listed earlier. In doing so, we identify two key aspects of subgoal-based real-time heuristic search. First, we need to define a set of subgoals that is efficient to compute and store off-line. Second, we need to define a way for the agent to find a subgoal relevant to its current problem on-line.

Intuitively, if an LRTA*-controlled agent is in a state s and is going to a state s_goal, then the best subgoal is a state s_ideal_subgoal that lies on an optimal path between s and s_goal and can be reached by LRTA* along an optimal path with no state re-visitation.







Given that there can be multiple optimal paths between two states, it is unclear how to detect, in a computationally efficient way, the LRTA* agent's deviation from an optimal path immediately after it occurs.

On the positive side, detecting state re-visitation can be done efficiently by running a simple greedy hill-climbing agent. This is based on the fact that if a hill-climbing agent can reach a state b from a state a without encountering a local minimum or a plateau in the heuristic, then an LRTA* agent can travel from a to b without state re-visitation.
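The hill-climbing check can be sketched as follows. The function name, the max_steps cutoff, the small grid map and the Manhattan heuristic are assumptions made for this example: the agent repeatedly moves to the neighbour with the lowest heuristic value and gives up as soon as no neighbour is strictly better.

```python
def hill_climb_reachable(a, b, neighbors, h, max_steps=250):
    """True if a greedy agent reaches b from a with h strictly decreasing on every move."""
    s, steps = a, 0
    while s != b:
        if steps == max_steps:
            return False                         # assumed cutoff on travel length
        candidates = neighbors(s)
        if not candidates:
            return False
        best = min(candidates, key=lambda n: h(n, b))
        if h(best, b) >= h(s, b):
            return False                         # local minimum or plateau
        s = best
        steps += 1
    return True


# Hypothetical 5x3 map; '#' is a wall. The bottom-left corridor is a dead end.
GRID = [".....",
        ".###.",
        "...#."]

def neighbors(s):
    """Passable 4-connected neighbours of cell s = (x, y)."""
    x, y = s
    cells = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in cells
            if 0 <= cy < len(GRID) and 0 <= cx < len(GRID[0])
            and GRID[cy][cx] != "#"]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

along_corridor = hill_climb_reachable((0, 2), (2, 2), neighbors, manhattan)
across_wall = hill_climb_reachable((0, 2), (4, 2), neighbors, manhattan)
```

Each step inspects only the immediate neighbours of the current state, which is why the check is cheap compared with detecting deviation from an optimal path.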

Thus, we propose an efficiently computable approximation to s_ideal_subgoal. Namely, we define the kNN LRTA* subgoal for a pair of states s and s_goal as the state farthest along an optimal path between s and s_goal that can be reached by a simple hill-climbing agent. In summary, we select subgoals to remove any scrubbing but do not guarantee that the LRTA* agent stays on an optimal path between the subgoals. In practice, however, only a tiny fraction of our subgoals are reached by the hill-climbing agent suboptimally, and even then the suboptimality is minor.

This approximation to the ideal subgoal allows us to compute a series of subgoals for a given pair of start and goal states. Intuitively, we compress an optimal path into a series of key states such that each of them can be reached from its predecessor without scrubbing. The compression allows us to save a large amount of memory without much impact on time per move. Indeed, hill climbing from one of the key states to the next requires inspecting only the immediate neighbours of the current state and selecting one of them greedily. The re-visitation-free reachability of one subgoal from another addresses the first key shortcoming of D LRTA*, where the agent may get trapped within a single complex region and thus be unable to reach its prescribed subgoal.
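A minimal sketch of this compression is given below. It assumes a linear scan from the far end of the path to find the farthest hill-climbing-reachable state (the surveyed papers may search differently), treats the goal as the final subgoal, and uses a hand-written optimal path on a hypothetical grid map where the source would run A*.

```python
def hill_climb_reachable(a, b, neighbors, h, max_steps=250):
    """True if a greedy agent reaches b from a with h strictly decreasing."""
    s, steps = a, 0
    while s != b:
        candidates = neighbors(s)
        if steps == max_steps or not candidates:
            return False
        best = min(candidates, key=lambda n: h(n, b))
        if h(best, b) >= h(s, b):
            return False                         # local minimum or plateau
        s, steps = best, steps + 1
    return True

def compress_path(path, neighbors, h):
    """Compress a path into subgoals, each hill-climbing reachable from the previous one."""
    subgoals, anchor = [], 0                     # anchor: index of the last committed state
    while anchor < len(path) - 1:
        far = anchor + 1                         # fallback: the next state on the path
        for j in range(len(path) - 1, anchor, -1):
            if hill_climb_reachable(path[anchor], path[j], neighbors, h):
                far = j                          # farthest reachable state found
                break
        subgoals.append(path[far])
        anchor = far
    return subgoals


# Hypothetical 5x3 map; '#' is a wall.
GRID = [".....",
        ".###.",
        "...#."]

def neighbors(s):
    x, y = s
    cells = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in cells
            if 0 <= cy < len(GRID) and 0 <= cx < len(GRID[0])
            and GRID[cy][cx] != "#"]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Optimal path from (0, 2) to (4, 2), written out by hand for this example.
optimal_path = [(0, 2), (0, 1), (0, 0), (1, 0), (2, 0),
                (3, 0), (4, 0), (4, 1), (4, 2)]
subgoals = compress_path(optimal_path, neighbors, manhattan)
```

Only the subgoals need to be stored; the states between them are recovered on-line by hill climbing.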

However, it is still infeasible to compute and then compress an optimal path between every two distinct states in the original search space. This problem can be solved by compressing only a pre-determined fixed number of optimal paths between random states off-line. Then, on-line, kNN LRTA*, tasked with going from s to s_goal, retrieves the most similar compressed path from its database and uses the associated subgoals. We define the (dis-)similarity of a database path to the agent's current situation as the maximum of the heuristic distances between s and the path's beginning and between s_goal and the path's end. The maximum is used because we would like both ends of the path to be heuristically close to the agent's current state and the goal, respectively. Indeed, the heuristic distance ignores walls, and thus a large heuristic distance to either end of the path tends to make that end unreachable by hill climbing.

We illustrate this intuition with a simple example. The following figure shows kNN LRTA* operation off-line. On this map, two random start and goal pairs are selected and optimal paths are computed between them. Then each path is compressed into a series of subgoals such that each of the subgoals can be reached from the previous one via hill climbing. The path from S1 to G1 is compressed into two subgoals and the other path is compressed into a single subgoal.

EXAMPLE OF KNN LRTA* OFF-LINE OPERATION:







(a): two (start, goal) pairs are chosen: (S1;G1) and (S2;G2).

(b): optimal paths between them are computed by running A*.

(c): the two paths are compressed into a total of three subgoals.

Once this database of two records is built, kNN LRTA* can be tasked with solving a problem on-line. In the on-line example figure it is tasked with going from the state S to the state G. The database is scanned and the similarity between (S;G) and each of the two database records is determined. The records are sorted by their similarity: (S1;G1) followed by (S2;G2). Then the agent runs reachability checks, from S to Si and from Gi to G, where i runs over the database indices in the order of record similarity. In this example, S1 is found to be unreachable by hill climbing from S and thus the record (S1;G1) is discarded. The second record passes the hill-climbing checks and the agent is tasked with going to its first subgoal.
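The retrieval step described above can be sketched as follows. The record layout (dictionaries with "start", "goal" and "subgoals" keys), the two sample records and the grid map are hypothetical, and returning None when no record passes the checks is an assumption, since the text here does not say what the agent does in that case.

```python
def hill_climb_reachable(a, b, neighbors, h, max_steps=250):
    """True if a greedy agent reaches b from a with h strictly decreasing."""
    s, steps = a, 0
    while s != b:
        candidates = neighbors(s)
        if steps == max_steps or not candidates:
            return False
        best = min(candidates, key=lambda n: h(n, b))
        if h(best, b) >= h(s, b):
            return False                         # local minimum or plateau
        s, steps = best, steps + 1
    return True

def dissimilarity(record, s, goal, h):
    """Maximum of the heuristic distances at the two ends of the record."""
    return max(h(s, record["start"]), h(goal, record["goal"]))

def select_record(database, s, goal, neighbors, h):
    """Return the most similar record that passes both hill-climbing checks, or None."""
    for record in sorted(database, key=lambda r: dissimilarity(r, s, goal, h)):
        if (hill_climb_reachable(s, record["start"], neighbors, h)
                and hill_climb_reachable(record["goal"], goal, neighbors, h)):
            return record
    return None


# Hypothetical 5x3 map; '#' is a wall.
GRID = [".....",
        ".###.",
        "...#."]

def neighbors(s):
    x, y = s
    cells = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(cx, cy) for cx, cy in cells
            if 0 <= cy < len(GRID) and 0 <= cx < len(GRID[0])
            and GRID[cy][cx] != "#"]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

database = [
    # Heuristically closest to the query, but its start lies behind a wall.
    {"start": (2, 0), "goal": (4, 2), "subgoals": [(4, 2)]},
    # Slightly less similar, but both ends are hill-climbing reachable.
    {"start": (0, 1), "goal": (4, 1), "subgoals": [(0, 0), (4, 1)]},
]
chosen = select_record(database, (2, 2), (4, 2), neighbors, manhattan)
```

As in the example in the text, the most similar record is discarded because its start cannot be reached by hill climbing, and the second record is selected.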

EXAMPLE OF KNN LRTA* ON-LINE OPERATION:







(a): the agent intends to travel from S to G.

(b): similarity of (S;G) to (S1;G1) and (S2;G2) is computed.

(c): while (S1;G1) is more similar to (S;G) than (S2;G2), its beginning S1 is not reachable from S via hill climbing and hence the record (S2;G2) is selected and the agent is tasked with going to subgoal 1.

The similarity plus hill-climbing check approach makes the state abstraction of D LRTA* unnecessary, thereby addressing its other two key shortcomings: high memory requirements and a complex pre-computation phase.

7. HILL CLIMBING AND DYNAMIC PROGRAMMING SEARCH (HCDPS)

The HCDPS algorithm operates in two stages: offline and online. The offline stage is performed once, before any searches, and pre-computes information to speed up subsequent searches. The offline stage may take a considerable amount of time and is not real-time. The online stage takes a given search problem and uses the pre-computed information to efficiently solve the problem in real-time.

During the offline stage, the algorithm analyzes its search space and pre-computes a database of subgoals. The database covers the space such that any pair of start and goal states will have a series of subgoals in the database. This is accomplished by abstracting the space. We partition the space into regions in such a way that any state in a region is mutually reachable via hill climbing with a designated state, called the representative of the region. Since the abstraction builds regions using hill climbing, which is also used in the online phase, we are guaranteed that from any start state our agent can hill climb to the representative of some region. Likewise, for any goal state there is a region that the goal falls into, which means that the agent will be able to hill climb from that region's representative to the goal. All we need now is a hill-climbable path between the representative of the start region and the representative of the goal region.

For every pair of close regions, we run A* in the ground-level space to compute an optimal path between the region representatives. We then use dynamic programming to assemble the computed optimal paths into paths between more distant regions, until we have an approximately optimal path between the representatives of any two regions. Once the paths are computed, they are compressed into a series of subgoals in the kNN LRTA* fashion. Specifically, each subgoal is selected to be reachable from the preceding one via hill climbing. Each such sequence of subgoals is stored as a record in the subgoal database. Finally, we build an index for the database that maps any state to its region representative in constant time.
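The dynamic-programming step can be sketched as an all-pairs computation over region representatives. The text does not name the exact recurrence, so the Floyd-Warshall scheme used here is an assumption, as are the function names, the base_paths layout (a dictionary from region-index pairs to a cost and a list of ground-level states) and the three-region sample data.

```python
def assemble_region_paths(n_regions, base_paths):
    """All-pairs costs and next hops between regions from paths between close regions."""
    inf = float("inf")
    cost = [[inf] * n_regions for _ in range(n_regions)]
    next_hop = [[None] * n_regions for _ in range(n_regions)]
    for i in range(n_regions):
        cost[i][i] = 0
    for (i, j), (c, _states) in base_paths.items():
        if c < cost[i][j]:
            cost[i][j], next_hop[i][j] = c, j
    # Floyd-Warshall: allow region k as an intermediate stop, one k at a time.
    for k in range(n_regions):
        for i in range(n_regions):
            for j in range(n_regions):
                if cost[i][k] + cost[k][j] < cost[i][j]:
                    cost[i][j] = cost[i][k] + cost[k][j]
                    next_hop[i][j] = next_hop[i][k]
    return cost, next_hop

def region_path(i, j, next_hop, base_paths):
    """Concatenate stored base paths along the next-hop chain from region i to region j."""
    if i != j and next_hop[i][j] is None:
        return None                              # no known route between the regions
    states = []
    while i != j:
        k = next_hop[i][j]
        segment = base_paths[(i, k)][1]
        # Skip the first state of later segments: it repeats the previous representative.
        states.extend(segment if not states else segment[1:])
        i = k
    return states


# Three hypothetical regions in a row, with representatives (0,0), (2,0) and (4,0).
base_paths = {
    (0, 1): (2, [(0, 0), (1, 0), (2, 0)]),
    (1, 0): (2, [(2, 0), (1, 0), (0, 0)]),
    (1, 2): (2, [(2, 0), (3, 0), (4, 0)]),
    (2, 1): (2, [(4, 0), (3, 0), (2, 0)]),
}
cost, next_hop = assemble_region_paths(3, base_paths)
far_path = region_path(0, 2, next_hop, base_paths)
```

The assembled paths pass through intermediate representatives, which is one reason they are only approximately optimal; they would then be compressed into subgoals as described above.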

Online, for a given pair of start and goal states, we use the index to find their region representatives. The subgoal path between the region representatives is retrieved from the database. The agent first hill climbs from its start state to the region representative. The agent then uses the record's subgoals one by one until the end of the record is reached. Finally, the agent hill climbs from the region representative to the goal state.

8. CONCLUSION

In this paper we considered the problem of real-time heuristic search, in which planning time per move does not depend on the number of states. kNN LRTA* provides a mechanism for selecting subgoals automatically. The resulting algorithm was shown to be theoretically complete and, on large video-game maps, substantially outperformed the previous state-of-the-art algorithms D LRTA* and TBA* along several important performance measures.

HCDPS is the first real-time heuristic search algorithm with neither heuristic learning nor maintenance of open and closed lists. Database pre-computation with HCDPS is two orders of magnitude faster than with kNN LRTA* and D LRTA*. Finally, its read-only database gives it a smaller per-agent memory footprint than A* or TBA* with two or more agents. Overall, we feel HCDPS is presently the best real-time search algorithm for video-game pathfinding on static maps.

9. REFERENCES

[1] V. Bulitko, Y. Björnsson, and R. Lawrence, "Case-based subgoaling in real-time heuristic search for video game pathfinding," J. Artif. Intell. Res., vol. 39, pp. 269-300, 2010.

[2] W. Zhang, "Complete anytime beam search," in Proc. 15th Nat. Conf. Artif. Intell., 1998, pp. 425-430.

[3] R. Lawrence and V. Bulitko, "Database-driven real-time heuristic search in video-game pathfinding," IEEE Trans. Comput. Intell. AI Games, vol. 5, no. 3, pp. 227-241, Sep. 2013.

[4] R. Korf, "Real-time heuristic search," Artif. Intell., vol. 42, no. 2-3, pp. 189-211, 1990.



[5] S. Koenig and X. Sun, "Comparing real-time and incremental heuristic search for real-time situated agents," Autonom. Agents Multi-Agent Syst., vol. 18, no. 3, pp. 313-341, 2009.

[6] V. Bulitko, M. Luštrek, J. Schaeffer, Y. Björnsson, and S. Sigmundarson, "Dynamic control in real-time heuristic search," J. Artif. Intell. Res., vol. 32, pp. 419-452, 2008.

[7] V. Bulitko, Y. Björnsson, N. R. Sturtevant, and R. Lawrence, "Real-time heuristic search for pathfinding in video games," July 7, 2010.

[8] R. Lawrence and V. Bulitko, "Taking learning out of real-time heuristic search for video-game pathfinding," in AI 2010: Advances in Artificial Intelligence, Dec. 2010.

[9] N. Sturtevant and M. Buro, "Partial pathfinding using map abstraction and refinement," in Proc. Nat. Conf. Artif. Intell., 2005, pp. 1392-1397.

[10] N. Sturtevant, "Memory-efficient abstractions for pathfinding," in Proc. Artif. Intell. Interactive Digit. Entertain., 2007, pp. 31-36.

[11] R. Korf, "Depth-first iterative deepening: An optimal admissible tree search," Artif. Intell., vol. 27, no. 3, pp. 97-109, 1985.

[12] M. Shimbo and T. Ishida, "Controlling the learning process of real-time heuristic search," Artif. Intell., vol. 146, no. 1, pp. 1-41, 2003.

[13] Y. Björnsson, V. Bulitko, and N. Sturtevant, "TBA*: Time-bounded A*," in Proc. Int. Joint Conf. Artif. Intell., 2009, pp. 431-436.

[14] I. Pohl, "Heuristic search viewed as path finding in a graph," Artif. Intell., vol. 1, no. 3, pp. 193-204, 1970.

[15] C. Hernández and J. A. Baier, "Fast subgoaling for pathfinding via real-time search," in Proc. Int. Conf. Artif. Intell. Planning Syst., F. Bacchus, C. Domshlak, S. Edelkamp, and M. Helmert, Eds., 2011, pp. 327-330.
