Game tree search. Thanks to Andrew Moore and Fahiem Bacchus for slides!

Upload: jack-short

Post on 19-Jan-2016


Game tree search
In this lecture we will cover some basics of
• two-player
• zero-sum
• discrete
• finite
• deterministic
games of perfect information


More on the meaning of “Zero-Sum”
• We will focus on “Zero-Sum” games.
• “Zero-sum” games are fully competitive
  • if one player wins, the other player loses
  • more specifically, the amount of money I win (lose) at poker is the amount of money you lose (win)
• More general games can be cooperative
  • some outcomes are preferred by both of us, or at least our values aren’t diametrically opposed


Is Rock Paper Scissors “Zero-Sum”?
• Scissors cut paper, paper covers rock, rock smashes scissors
• Represented as a matrix: Player I chooses a row, Player II chooses a column
• Payoff to each player in each cell (P I / P II)
• 1: win, 0: tie, -1: loss
• is this game “zero-sum”?

                    Player II
                 R       P       S
Player I   R    0/0    -1/1    1/-1
           P    1/-1    0/0   -1/1
           S   -1/1    1/-1    0/0
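One way to check the question above is to add the two payoffs in every cell. The following is a minimal Python sketch; the function name `is_zero_sum` and the list-of-tuples encoding of the matrix are assumptions made for illustration, not part of the slides.

```python
def is_zero_sum(payoffs):
    """Return True if the two players' payoffs cancel in every cell."""
    return all(p1 + p2 == 0 for row in payoffs for (p1, p2) in row)

# Rows: Player I plays R, P, S. Columns: Player II plays R, P, S.
# Each cell is (payoff to Player I, payoff to Player II).
rps = [
    [(0, 0), (-1, 1), (1, -1)],
    [(1, -1), (0, 0), (-1, 1)],
    [(-1, 1), (1, -1), (0, 0)],
]

print(is_zero_sum(rps))
```

Every cell sums to zero, so under this encoding Rock Paper Scissors is zero-sum.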


Is The Prisoner’s Dilemma “Zero-Sum”?
• Dilemma: Two prisoners are in separate cells and there is not enough evidence to convict them
• If one confesses while the other doesn’t:
  • the confessor goes free
  • the other is sentenced to 4 years
• If both confess:
  • both are sentenced to 3 years
• If neither confesses:
  • both are sentenced to 1 year on a minor charge
• Is this game “zero sum”?

Sentences in years (Prisoner I / Prisoner II):

                              Prisoner II: confess?
                               Do        Don’t
Prisoner I: confess?  Do       3/3       0/4
                      Don’t    4/0       1/1


Is The Coffee-Bot Dilemma “Zero-Sum”?
• Two robots: Green (Craig’s), Red (Fahiem’s)
  • one cup of coffee and one cup of tea left
  • both Craig and Fahiem prefer coffee (value 10)
  • but tea is acceptable (value 8)
• Both robots go for coffee:
  • they collide and get no payoff
• Both go for tea:
  • collide and get no payoff
• One goes for coffee, other for tea:
  • coffee robot gets 10
  • tea robot gets 8

Payoffs (row robot / column robot):

            Coffee    Tea
Coffee       0/0      10/8
Tea          8/10     0/0
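The same per-cell test can be applied to the Prisoner’s Dilemma and the Coffee-Bot Dilemma. The sketch below lists the distinct cell sums of each matrix. The helper name `cell_sums` and the matrix encoding are assumptions for illustration. The prisoner entries are sentences in years, so they are costs, not winnings.

```python
def cell_sums(payoffs):
    """Return the sorted distinct values of (first entry + second entry)."""
    return sorted({a + b for row in payoffs for (a, b) in row})

# Prisoner's Dilemma, years for (Prisoner I, Prisoner II).
# Rows: Prisoner I confesses / doesn't. Columns: Prisoner II likewise.
prisoners = [
    [(3, 3), (0, 4)],
    [(4, 0), (1, 1)],
]

# Coffee-Bot Dilemma, payoff to (row robot, column robot).
# Rows and columns: Coffee, Tea.
coffee = [
    [(0, 0), (10, 8)],
    [(8, 10), (0, 0)],
]

print(cell_sums(prisoners))  # more than one distinct sum
print(cell_sums(coffee))     # more than one distinct sum
```

In a zero-sum game every cell would sum to 0. Here the sums vary from cell to cell, so some outcomes are better (or worse) for both players at once, which is the cooperative element described earlier.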


With a search space defined for II-Nim, we can define a Game Tree.
• A game tree looks like a search tree
• Layers reflect alternating moves between A and B
• Player A doesn’t decide where to go alone
  • after Player A moves to a state, B decides which of the state’s children to move to.
• Thus, A must have a strategy:
  • A must know what to do for each possible move of B.
  • “What to do” will depend on how B plays.
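The idea of alternating layers and a strategy for A can be sketched in Python. The rules of II-Nim are not reproduced in this transcript, so the sketch assumes a generic two-pile Nim: a move removes one or more matches from a single pile, and the player who takes the last match wins. The names `successors`, `minimax` and `best_reply` are hypothetical.

```python
from functools import lru_cache

def successors(piles):
    """All states reachable by removing 1+ matches from one pile."""
    result = set()
    for i, n in enumerate(piles):
        for take in range(1, n + 1):
            nxt = list(piles)
            nxt[i] = n - take
            result.add(tuple(sorted(nxt)))
    return sorted(result)

@lru_cache(maxsize=None)
def minimax(piles, a_to_move):
    """Value for A: +1 if A can force a win from here, -1 otherwise."""
    if sum(piles) == 0:
        # Assumed rule: whoever took the last match has won,
        # so the player now to move has lost.
        return -1 if a_to_move else +1
    values = [minimax(s, not a_to_move) for s in successors(piles)]
    # A picks the best child for A; B picks the worst child for A.
    return max(values) if a_to_move else min(values)

def best_reply(piles, a_to_move):
    """The successor the player to move should choose."""
    pick = max if a_to_move else min
    return pick(successors(piles), key=lambda s: minimax(s, not a_to_move))

print(minimax((2, 2), True), minimax((1, 2), True), best_reply((1, 2), True))
```

Calling `best_reply` at every state A might face, one per possible move of B, is what it means for A to hold a strategy instead of a single path.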


Question: what happens if there are loops in the tree?
How would looping influence your determination of the minimax value for a node?


Managing games with fewer states than game tree nodes.

• Imagine you have a game with N states, that each state has b successors, and the length of the game is usually D moves.

• Minimax will expand O(b^D) states, which is both the BEST and the WORST case scenario. This is different from regular DFS!

• But what if N is less than b^D? In chess, for example, b^D = 10^120, but N = 10^40 ...

Note: info on this slide WON’T be on exams!


Managing games with fewer states than game tree nodes.

• Make a huge array of size N. Give each element in the array one of the following values:

• ?: We don’t know who wins from this state

• W: We know white wins from this position

• B: We know black wins from this position.

• Mark all terminal states with their values (‘W’ or ‘B’).

Suppose we have 4 pieces left at the end of a chess game. With enough computing power, we can compute, for all such game states, whether the position is a win for White, a win for Black, or a draw.

Note: info on this slide WON’T be on exams!


Managing games with fewer states than game tree nodes.

• Look through all states that remain marked with a ‘?’.
  • For states where W is about to move:
    • If all successors are marked ‘B’, mark the current state ‘B’.
    • If any successor is marked ‘W’, mark the current state ‘W’.
    • Otherwise leave the state unchanged.
  • For states where B is about to move:
    • If all successors are marked ‘W’, mark the current state ‘W’.
    • If any successor is marked ‘B’, mark the current state ‘B’.
    • Otherwise leave the state unchanged.

• Repeat until none of the elements in the array change their value.

• Any state remaining at ‘?’ is a state from which no one can force a win.

• Note: to turn this algorithm into a strategy, you also need to record pointers from given states to their best successors. This is a DYNAMIC PROGRAMMING technique.

Note: info on this slide WON’T be on exams!
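The marking procedure on the last two slides can be sketched as follows. This is a minimal Python sketch, not a chess endgame solver: the function name `solve`, the dictionary-based encoding, and the seven-state toy game are all assumptions made for illustration.

```python
def solve(successors, to_move, terminal_value):
    """Label every state 'W', 'B' or '?' by repeated sweeps.

    successors:     dict state -> list of successor states
    to_move:        dict non-terminal state -> 'W' or 'B'
    terminal_value: dict terminal state -> 'W' or 'B'
    """
    label = {s: "?" for s in successors}
    label.update(terminal_value)
    best = {}  # pointer from a state to a winning successor
    changed = True
    while changed:  # repeat until a full sweep changes nothing
        changed = False
        for s, kids in successors.items():
            if label[s] != "?" or not kids:
                continue
            me = to_move[s]
            other = "B" if me == "W" else "W"
            winning = [k for k in kids if label[k] == me]
            if winning:  # any successor wins for the mover
                label[s], best[s] = me, winning[0]
                changed = True
            elif all(label[k] == other for k in kids):
                label[s] = other  # every successor loses for the mover
                changed = True
    return label, best

# Hypothetical toy game; p and q form a loop that nobody can escape.
successors = {
    "start": ["x", "y"], "x": ["win_b", "win_w"], "y": ["win_w"],
    "win_w": [], "win_b": [], "p": ["q"], "q": ["p"],
}
to_move = {"start": "W", "x": "B", "y": "B", "p": "W", "q": "B"}
label, best = solve(successors, to_move, {"win_w": "W", "win_b": "B"})
print(label, best)
```

The work is proportional to the number of states times the number of sweeps, not to the number of game tree nodes, and the looping states `p` and `q` simply stay at ‘?’.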


Question: what happens if there are loops in the tree?


[Figure: game tree for Example 1. Node labels A, B, s1, s2, s3, s4, s5; leaf values 14, 12, 8, 2, 4, 9, 11, 2; annotations β = 8 and α = 2, then 4, then ....]

Example 1: We are currently expanding possible moves for player A, from left to right. Which of the node expansions above could we prune, and why?


[Figure: the Example 1 tree, now annotated β = 8 and α = 9.]

Once we discover a node with value ‘9’, there is no need to expand the nodes to the right!
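The cutoff in Example 1 happens because α = 9 has reached or passed β = 8: the min player above already has an option worth 8 and will never let play reach a node worth at least 9. A minimal Python sketch of the scan at a single max node follows. It assumes the node's children evaluate to 2, 4, 9, 11, 2 from left to right, which is one reading of the figure and is not confirmed by the transcript; `expand_max_node` is a hypothetical helper name.

```python
def expand_max_node(child_values, beta):
    """Scan a max node's children left to right under an inherited beta."""
    alpha = float("-inf")
    expanded = []
    for value in child_values:
        expanded.append(value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # the min ancestor would never choose this node
    pruned = child_values[len(expanded):]
    return alpha, expanded, pruned

alpha, expanded, pruned = expand_max_node([2, 4, 9, 11, 2], beta=8)
print(alpha, expanded, pruned)
```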


[Figure: game tree for Example 2. Node labels B, A, s1, s2, s3, s4, s5; leaf values 6, 2, 7, 9, 3, 4, 2, 8; annotations α = 7 and β = 9, then 3, then ....]

Example 2: We are currently expanding possible moves for player B, from left to right. Which of the node expansions above could we prune, and why?


[Figure: the Example 2 tree, now annotated α = 7 and β = 3.]

Once we discover a node with value ‘3’, there is no need to expand the nodes to the right!


Rational Opponents
• This all assumes that your opponent is rational
  • e.g., will choose moves that minimize your score
• Storing your strategy is a potential issue:
  • you must store “decisions” for each node you can reach by playing optimally
  • if your opponent has unique rational choices, this is a single branch through game tree
  • if there are “ties”, opponent could choose any one of the “tied” moves: must store strategy for each subtree
• What if your opponent doesn’t play rationally?
  • will it affect quality of outcome? will your stored strategies work?
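One way to explore the last question is to play the minimax strategy against different opponents. The sketch below is illustrative only: the two-level tree and the `rational` and `careless` opponent policies are hypothetical, and it recomputes moves on the fly instead of looking them up in a stored strategy.

```python
def minimax(node, maximizing):
    """Minimax value of a nested-list tree; leaves are numbers."""
    if not isinstance(node, list):
        return node
    values = [minimax(c, not maximizing) for c in node]
    return max(values) if maximizing else min(values)

def play(node, maximizing, opponent_choice):
    """Max plays its minimax move; min plays opponent_choice(children)."""
    while isinstance(node, list):
        if maximizing:
            node = max(node, key=lambda c: minimax(c, False))
        else:
            node = opponent_choice(node)
        maximizing = not maximizing
    return node

tree = [[3, 12], [2, 20]]  # hypothetical game, max to move at the root

def rational(children):
    return min(children, key=lambda c: minimax(c, True))

def careless(children):
    return children[-1]  # always takes the last listed move

print(minimax(tree, True), play(tree, True, rational), play(tree, True, careless))
```

In this toy game the careless opponent leaves max with 12 where the minimax value was 3, so the outcome is no worse than promised. Two caveats remain: the other branch would have paid 20 against this particular opponent, and a strategy stored only for nodes reachable under rational play may have no entry for the node an irrational move leads to.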


Heuristic evaluation functions in games

Some issues in heuristic search

• How far should we search in our game tree to determine the value of a node, if we only have a fixed amount of time?

• What if we stop our search at a level in the search tree where subsequent moves dramatically change our evaluation?
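Both issues show up in a depth-limited search that falls back on a heuristic at the cutoff. The following is a minimal Python sketch; the function name `depth_limited_value`, the toy tree, and the heuristic scores are invented for illustration.

```python
def depth_limited_value(state, depth, maximizing, successors, evaluate):
    """Minimax value of `state`, searching at most `depth` plies ahead."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)  # heuristic estimate at the search horizon
    values = [
        depth_limited_value(c, depth - 1, not maximizing, successors, evaluate)
        for c in children
    ]
    return max(values) if maximizing else min(values)

# Hypothetical toy game: a dict of successors and a made-up heuristic score.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORE = {"root": 0, "a": 5, "b": 1, "a1": -10, "a2": 6, "b1": 2, "b2": 3}

def successors(state):
    return TREE.get(state, [])

shallow = depth_limited_value("root", 1, True, successors, SCORE.get)
deeper = depth_limited_value("root", 2, True, successors, SCORE.get)
print(shallow, deeper)
```

With a depth limit of 1 the search trusts the heuristic value of `a`; one ply deeper the opponent's reply `a1` reverses that judgment. This is the situation described in the second bullet.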


Heuristic evaluation functions in games

• Think of a few games and suggest some heuristics for estimating the “goodness” of a position
  • chess?
  • checkers?
  • your favorite video game?
  • “find the last parking spot”?
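As one possible answer for chess, a common starting point is a material count. The sketch below uses the conventional 1/3/3/5/9 piece values; the encoding of a position as a string of piece letters (uppercase for White, lowercase for Black) is an assumption made for illustration, and the slides do not prescribe any particular heuristic.

```python
# Conventional piece values; the king is not counted.
PIECE_VALUE = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_balance(pieces):
    """White material minus Black material for a string of piece letters."""
    total = 0
    for ch in pieces:
        value = PIECE_VALUE.get(ch.lower())
        if value is None:
            continue  # ignore anything that is not a piece letter
        total += value if ch.isupper() else -value
    return total

# White: king, queen, rook, three pawns. Black: king, two rooks, two pawns.
print(material_balance("KQRPPPkrrpp"))
```

A positive result favours White and a negative one favours Black. Material alone ignores position, which is why it is only an estimate of “goodness”.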


Question: is there an alpha-beta version you can use to search this tree?