TRANSCRIPT
Artificial Intelligence in Game Design
N-Grams and Decision Tree Learning
Predicting Player Actions
• Type of games where this works best:
  – Player has choice between few possible actions
    • Attack left
    • Attack right
  – Character can take simultaneous counteraction
    • Each player action has a “correct” counteraction
      – Attack left → defend left
      – Attack right → defend right
• Goal: Character should learn to “anticipate” current player action based on past actions
Probabilistic Approach
• Keep track of last n actions taken by player
  – Size of n is “window” of character memory
• Compute probability of next player action based on these
  – Base decision about counteraction on that probability
• Example: Last 10 player actions L L R L R L L L R L
• Estimated probabilities of next action:
Left attack 70%
Right attack 30%
Probabilistic Actions
• Simple majority approach
  – Since left attack has highest probability, defend left
  – Problem: Character will take same action for long periods
    • Too predictable!
  – Example:
    • Player’s next 4 attacks are from right
    • Character will still choose defend left, since left attacks still have highest probability
    • Character looks very stupid!

      player:    L L R L R L L L R L R R R R
      character:                     L L L L

      (Left attack is still the majority player action in the last 10 moves)
Probabilistic Actions
• Probabilistic approach: Choose action with same probability as corresponding player action
• Biased towards recent player actions
• Far less predictable
  – Player may notice that character is changing tactics without being aware of how this is done

  Left attack 70%   →  Left defend 70%
  Right attack 30%  →  Right defend 30%
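The probabilistic approach can be sketched in a few lines. The action names, window size, and function name below are illustrative, not from any particular engine:

```python
# Sketch of the probabilistic counteraction: remember the last n player
# attacks and pick a defense with the same probabilities the player attacks.
import random
from collections import Counter, deque

WINDOW = 10  # n: size of the character's "memory"

def choose_defense(history, rng=random):
    """Defend a side with the same probability the player attacks it."""
    counts = Counter(history)                 # e.g. {"L": 7, "R": 3}
    sides = list(counts)
    weights = [counts[s] for s in sides]
    return rng.choices(sides, weights=weights)[0]

history = deque("LLRLRLLLRL", maxlen=WINDOW)  # last 10 player attacks
# For this history: P(defend left) = 0.7, P(defend right) = 0.3
```

Because the history is a `deque` with `maxlen`, appending each new attack automatically drops the oldest one, sliding the window forward.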
Window Size
• Key question: What is good value for n?
  – n = size of “window” used to determine probabilities
• Can be too small (n = 2, for example)
  – Character has no “memory” of past player actions
    L L R L R L L L R L R [R L]  →  Left defend 50%, Right defend 50%
• Can be too large (n = 20, for example)
  – Too slow to react to changes in player tactics
    L L L L L L L L L L L L R R R R R R R R  →  Left defend 60%, Right defend 40%
• No best solution
  – Will need to experiment
N-Grams
• Conditional probabilities based on sequence of user actions
• Example:
  – “Last two player attacks were left and then right”
  – “What has the player done next after the last times they attacked left then right?”
• After the last 4 left-then-right attacks:
  – Attacked right 3 times
  – Attacked left 1 time
• Conclusion: Player has 75% chance of attacking right next
N-Grams Example
• Example:
  – Window of memory = last 12 actions
  – Base decision on last two actions taken by player (past sequence length = 2)
• Goal: Determine what action player is likely to take next given last two actions L R
  – Previous actions: L L R L R R L L R L L R ?
  – Previous cases of L R:
    • Followed by L twice
    • Followed by R once
Left attack 67%
Right attack 33%
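The lookup just described can be sketched as a scan over the window for every place the prefix occurred, counting what the player did next (function name is illustrative):

```python
# N-gram conditional probability over a window of past actions.
from collections import Counter

def next_action_probs(history, prefix):
    """P(next action | last len(prefix) actions == prefix), over the window."""
    n = len(prefix)
    followers = Counter(
        history[i + n]                        # the action that came next
        for i in range(len(history) - n)
        if history[i:i + n] == prefix
    )
    total = sum(followers.values())
    return {a: c / total for a, c in followers.items()}

# The slide's example: window "L L R L R R L L R L L R", prefix "L R"
probs = next_action_probs("LLRLRRLLRLLR", "LR")  # {"L": 2/3, "R": 1/3}
```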
N-Grams and Sequence Size
• Number of statistics to keep grows exponentially with length of past sequences
  – Number of possible actions = a
  – Past sequence length = L
  – Number of possible configurations of past sequences = a^L
• L must be small (no more than 2 or 3)
Storing N-Grams
• Algorithm:
  – Keep statistics for all possible action strings of length L
    • Based on “window” of past actions
• Example: L L L L R R L L R L L R

  Previous action string | Instances of next action | P(next is L) | P(next is R)
  L L                    | L L R R R                | 40%          | 60%
  L R                    | L R                      | 50%          | 50%
  R L                    | L L                      | 100%         | 0%
  R R                    | L                        | 100%         | 0%
N-Grams and Updating
• When player takes new action
  – Add instance for that action and update statistics
  – Remove oldest action from list and update statistics
    • Move “window”
• Example: L L L L R R L L R L L R L
  – Rightmost L is the new action; the oldest L at the left is now outside the “window”

  Previous action string | Instances of next action | P(next is L) | P(next is R)
  L L                    | L R R R                  | 25%          | 75%
  L R                    | R L L                    | 67%          | 33%
  R L                    | L L                      | 100%         | 0%
  R R                    | L                        | 100%         | 0%
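The add/remove bookkeeping above can be sketched as a small class that keeps (prefix, next-action) counts over a sliding window. The class and method names are illustrative:

```python
# Sliding-window n-gram statistics, updated incrementally per player action.
from collections import Counter

class NGramPredictor:
    def __init__(self, seq_len=2, window_size=12):
        self.seq_len = seq_len
        self.window_size = window_size
        self.window = []                      # recent player actions
        self.counts = Counter()               # (prefix, follower) -> count

    def update(self, action):
        """Add the new action's statistics, then drop the oldest action."""
        n = self.seq_len
        self.window.append(action)
        if len(self.window) > n:              # a new (prefix, follower) pair
            prefix = tuple(self.window[-n - 1:-1])
            self.counts[(prefix, action)] += 1
        if len(self.window) > self.window_size:   # move the "window"
            old = self.window[:n + 1]             # oldest pair falls outside
            self.counts[(tuple(old[:n]), old[n])] -= 1
            self.window.pop(0)

    def predict(self, prefix):
        """Probabilities of the next action given the last seq_len actions."""
        matches = {f: c for (p, f), c in self.counts.items()
                   if p == tuple(prefix) and c > 0}
        total = sum(matches.values())
        return {f: c / total for f, c in matches.items()}
```

Feeding the slide's window L L L L R R L L R L L R gives P(L | L L) = 40%; after the new action L arrives, the same query gives 25%, matching the updated table.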
Decision Trees
• Simple tree traversal
  – Node = question
  – Branch to follow = answer
  – Leaf = final action to take
• Can create dynamically from a set of examples
• Example tree:

  Hit points < 5?
    yes → Obstacle between myself and player?
            yes → hide
            no  → run
    no  → Within one unit of player?
            yes → attack
            no  → Path to player clear?
                    yes → run towards player
                    no  → run sideways
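The example tree can be encoded as nested dicts and walked with a simple loop. The situation keys (`hit_points_low`, etc.) are illustrative names for the questions:

```python
# An inner node holds a question and its yes/no branches; a bare string is a
# leaf action.
TREE = {
    "question": "hit_points_low",             # hit points < 5?
    "yes": {"question": "obstacle_between",   # obstacle between myself and player?
            "yes": "hide",
            "no": "run"},
    "no": {"question": "near_player",         # within one unit of player?
           "yes": "attack",
           "no": {"question": "path_clear",   # path to player clear?
                  "yes": "run towards player",
                  "no": "run sideways"}},
}

def decide(tree, situation):
    """Simple tree traversal: follow the branch that answers each question."""
    node = tree
    while isinstance(node, dict):
        node = node["yes" if situation[node["question"]] else "no"]
    return node                               # leaf: the action to take
```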
Decision Tree Learning
• Based on a set of training examples
  – Attribute values about current situation
    • From world or from character state
  – Target action character should take on basis of those attribute values
• Example: Estimating driving time
  – Attributes:
    • Hour of departure: 8, 9, or 10
    • Weather: sunny, cloudy, or rainy
    • Accident on route: yes or no
    • Stalled car on route: yes or no
  – Desired action:
    • Commute time: short, medium, or long
Training Examples

Example  Hour  Weather  Accident  Stall  Commute
1 8 sunny no no long
2 8 cloudy no yes long
3 10 sunny no no short
4 9 rainy yes no long
5 9 sunny yes yes long
6 10 sunny no no short
7 10 cloudy no no short
8 9 rainy no no medium
9 9 sunny yes no long
10 10 cloudy yes yes long
11 10 rainy no no short
12 8 cloudy yes no long
13 9 sunny no no medium
Decision Tree Building
node BuildTree(examples[] E) {
    if (all examples in E have same target action T) {
        return new leaf with action T
    }
    else {
        choose “best” attribute A to create branches for examples E
        set question at this node to A
        for (all possible values V of attribute A) {
            create branch with value V
            E_V = all examples in E that have value V for attribute A
            attach BuildTree(E_V) to that branch
        }
    }
}
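The pseudocode translates directly to Python. Since the slides leave the “best” attribute heuristic open (ID3's information gain is the usual choice, covered below), it is passed in as a function here; the data format and example names are illustrative:

```python
# Direct Python rendering of the BuildTree pseudocode.
def build_tree(examples, attributes, choose_attribute):
    """examples: list of (attribute_dict, target_action) pairs."""
    targets = {t for _, t in examples}
    if len(targets) == 1:                     # all examples share one action
        return targets.pop()                  # leaf holding that action
    best = choose_attribute(examples, attributes)
    node = {"question": best, "branches": {}}
    rest = [a for a in attributes if a != best]
    for v in {attrs[best] for attrs, _ in examples}:  # one branch per value
        subset = [(attrs, t) for attrs, t in examples if attrs[best] == v]
        node["branches"][v] = build_tree(subset, rest, choose_attribute)
    return node

# Tiny hypothetical training set, with "pick the first attribute" standing in
# for the "best attribute" rule:
examples = [({"hp": "low",  "cover": "yes"}, "hide"),
            ({"hp": "low",  "cover": "no"},  "run"),
            ({"hp": "high", "cover": "yes"}, "attack"),
            ({"hp": "high", "cover": "no"},  "attack")]
first = lambda ex, attrs: attrs[0]
tree = build_tree(examples, ["hp", "cover"], first)
```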
Decision Tree Building
• Start tree by using BuildTree(all examples) to create root
• Example: Suppose Hour were chosen at root:

  Hour: examples 1-13 (4 short, 2 medium, 7 long)
    8  → examples 1, 2, 12 (0 short, 0 medium, 3 long) → leaf: Long
    9  → examples 4, 5, 8, 9, 13 (0 short, 2 medium, 3 long) → another question
    10 → examples 3, 6, 7, 10, 11 (4 short, 0 medium, 1 long) → another question
ID3 Decision Tree Learning
• Choose attribute with highest information gain
• Based on definition of entropy from information theory
• Entropy over examples E with target actions T:

  Entropy(E) = − Σ_T (|E_T| / |E|) log₂ (|E_T| / |E|)

  where E_T = the examples in E whose target action is T
• Example: entropy for training set
  = −(4/13) log₂(4/13) − (2/13) log₂(2/13) − (7/13) log₂(7/13) = 1.42
  (4 short examples, 2 medium examples, 7 long examples)
ID3 Decision Tree Learning
• Expected entropy remaining after attribute A is used:
  – Sum of entropies of child nodes down branches V
  – Weighted by percentage of examples down that branch

  Remaining(A) = Σ_V (|E_V| / |E|) Entropy(E_V)

• Information gain for attribute A = original entropy minus that value

  Attribute | Entropy remaining | Information gain
  hour      | 0.65              | 0.77
  weather   | 1.29              | 0.13
  accident  | 0.92              | 0.50
  stall     | 1.17              | 0.25

  → hour is the best attribute at root
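The entropy and gain numbers in the table can be reproduced in a few lines over the commute training set; a sketch:

```python
# Entropy and information gain, checked against the commute table above.
from math import log2
from collections import Counter

def entropy(examples):
    """Entropy over the target actions of (attributes, target) examples."""
    n = len(examples)
    counts = Counter(t for _, t in examples)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def information_gain(examples, attribute):
    """Original entropy minus expected entropy after splitting on attribute."""
    n = len(examples)
    remaining = 0.0
    for v in {attrs[attribute] for attrs, _ in examples}:
        subset = [(a, t) for a, t in examples if a[attribute] == v]
        remaining += (len(subset) / n) * entropy(subset)
    return entropy(examples) - remaining

rows = [  # (hour, weather, accident, stall, commute) from the training table
    (8, "sunny",  "no",  "no",  "long"),   (8, "cloudy",  "no",  "yes", "long"),
    (10, "sunny", "no",  "no",  "short"),  (9, "rainy",   "yes", "no",  "long"),
    (9, "sunny",  "yes", "yes", "long"),   (10, "sunny",  "no",  "no",  "short"),
    (10, "cloudy","no",  "no",  "short"),  (9, "rainy",   "no",  "no",  "medium"),
    (9, "sunny",  "yes", "no",  "long"),   (10, "cloudy", "yes", "yes", "long"),
    (10, "rainy", "no",  "no",  "short"),  (8, "cloudy",  "yes", "no",  "long"),
    (9, "sunny",  "no",  "no",  "medium"),
]
commute_examples = [({"hour": h, "weather": w, "accident": a, "stall": s}, c)
                    for h, w, a, s, c in rows]
```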
Using ID3 in Games
• ID3 efficient enough for on-line learning
  – Execution time proportional to number of examples
• Best applied to games if:
  – Few possible actions for NPC to choose from (attack, retreat, defend)
  – Key attributes have few possible values (or continuous values partitioned into predefined ranges)
  – Have easy way to determine desired actions NPC should take based on significant number of player actions
    • May require tweaking rules of game!
Black and White Game
• Player given “creature” at beginning of game
• Creature “trained” by player by observing player actions in different situations
  – What other armies to attack in what circumstances
• Later in game (after creature grows), creature takes those same actions
Black and White Game
• Sample training set:
Example Allegiance Defense Tribe Attack
1 friendly weak Celtic no
2 enemy weak Celtic yes
3 friendly strong Norse no
4 enemy strong Norse no
5 friendly weak Greek no
6 enemy medium Greek yes
7 enemy strong Greek no
8 enemy medium Aztec yes
9 friendly weak Aztec no
Black and White Game
• Tree created from examples:

  Allegiance?
    friendly → No attack
    enemy    → Defense?
                 weak   → Attack
                 medium → Attack
                 strong → No attack
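The learned tree can be written out as a function and checked against all nine training examples. Note that Tribe never appears: Allegiance and Defense already separate the examples perfectly. A sketch:

```python
# The learned Black & White tree above, as a yes/no attack decision.
def creature_decision(allegiance, defense):
    if allegiance == "friendly":
        return "no"                   # friendly armies are never attacked
    return "no" if defense == "strong" else "yes"

# The nine training rows: (allegiance, defense, attack)
training = [("friendly", "weak",   "no"),  ("enemy", "weak",   "yes"),
            ("friendly", "strong", "no"),  ("enemy", "strong", "no"),
            ("friendly", "weak",   "no"),  ("enemy", "medium", "yes"),
            ("enemy",    "strong", "no"),  ("enemy", "medium", "yes"),
            ("friendly", "weak",   "no")]
```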
Black and White Game
• Nature of game suited to decision tree learning:
  – Large number of examples created by player actions
  – Known target actions based on player actions
  – Small number of possible actions and attribute values
• Game specifically designed around algorithm!