5 adversarial
Post on 01-Nov-2015
-
Adversarial Search
Chapter 5
-
Outline
Optimal decisions
α-β pruning
Imperfect, real-time decisions
Stochastic Games
2
-
Games vs. search problems
"Unpredictable" opponent → the solution is a strategy specifying a move for every possible opponent reply
Time limits → it is unlikely that the goal will be found, so the agent must approximate
3
-
Exercise
1. Tough question. Do you recognize this?
2. Name the game.
3. Is this an interesting or a dull game? Why?
4. Characterize the game: Performance Measure, Environment, Actuators and Sensors (PEAS)
4
-
Game tree (2-player, deterministic, turns)
5
-
Exercise in pairs
Can you think of a heuristic function for the Tic-Tac-Toe game?
Using the heuristic function, devise a strategy to play tic-tac-toe.
If both players play their best, what is the depth of the tree (how many moves)?
(One move in this game corresponds to two plies, where each ply is one player's turn.)
6
-
Tic-Tac-Toe heuristic function
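The heuristic shown on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of one commonly used Tic-Tac-Toe heuristic: the number of lines still open to X minus the number still open to O. The board encoding (a 9-character string) and the names `place`, `open_lines` and `heuristic` are assumptions made for illustration, not necessarily what the slide showed.

```python
# Hypothetical heuristic: (lines still open to X) - (lines still open to O).
# The board is a 9-character string read row by row; ' ' marks an empty cell.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]
EMPTY = " " * 9


def place(board, cell, player):
    """Return a new board with `player` written into `cell` (0..8)."""
    return board[:cell] + player + board[cell + 1:]


def open_lines(board, player):
    """Count the lines that contain no piece of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))


def heuristic(board):
    """Positive values favour X, negative values favour O."""
    return open_lines(board, "X") - open_lines(board, "O")


print(heuristic(place(EMPTY, 4, "X")))  # X in the center
print(heuristic(place(EMPTY, 0, "X")))  # X in a corner
print(heuristic(place(EMPTY, 1, "X")))  # X on an edge
```

Under this heuristic the center opening scores highest, which matches the usual intuition that the center cell takes part in the most lines.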
7
-
Exercise in pairs
Compute the average branching factor.
Could the branching factor be reduced? How?
What would the reduced branching factor be?
You just prune the search tree.
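One way to reduce the branching factor, assuming the exercise has board symmetry in mind, is to treat positions that are rotations or reflections of each other as the same position. The sketch below counts positions after the first plies with and without that reduction. The helper names are hypothetical, and the count ignores wins, which cannot occur this early in the game.

```python
def transforms(board):
    """Yield the 8 rotations/reflections of a 9-character board string."""
    rows = [board[0:3], board[3:6], board[6:9]]
    for _ in range(4):
        # Rotate the 3x3 grid by 90 degrees.
        rows = ["".join(rows[2 - r][c] for r in range(3)) for c in range(3)]
        yield "".join(rows)
        yield "".join(row[::-1] for row in rows)  # mirror image of the rotation


def canonical(board):
    """Pick one representative per symmetry class."""
    return min(transforms(board))


def successors(board, player):
    """All boards reachable by placing `player` on an empty cell."""
    return [board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "]


def positions_after(plies):
    """Return (raw position count, count up to symmetry) after `plies` plies."""
    frontier = {" " * 9}
    for ply in range(plies):
        player = "XO"[ply % 2]
        frontier = {s for b in frontier for s in successors(b, player)}
    return len(frontier), len({canonical(b) for b in frontier})


print(positions_after(1))  # (9, 3): center, corner, edge
print(positions_after(2))  # (72, 12)
```

With symmetry taken into account, the first move has 3 genuinely different options instead of 9, and the first two plies lead to 12 different positions instead of 72.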
8
-
Minimax
Perfect play for deterministic games
Idea: choose move to position with highest minimax value
= best achievable payoff against best play
E.g., 2-ply game:
9
-
Minimax
10
-
Minimax algorithm
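The pseudocode on this slide is not in the transcript. A minimal sketch of minimax over an explicit game tree is given below. The nested-list tree encoding and the leaf values are assumptions chosen for illustration (a 2-ply tree with three moves for MAX and three replies for MIN), and `best_move` is a hypothetical helper name.

```python
def minimax(node, maximizing=True):
    """Return the minimax value of a game tree given as nested lists.

    A leaf is a number (the utility for MAX); an internal node is a list
    of child nodes. Levels alternate between MAX and MIN.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(tree):
    """Index of the root move whose resulting position has the highest value."""
    values = [minimax(child, maximizing=False) for child in tree]
    return max(range(len(values)), key=values.__getitem__)


# Illustrative 2-ply tree: MAX moves first, then MIN replies.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree))    # 3
print(best_move(tree))  # 0
```

MIN would answer the three root moves with 3, 2 and 2 respectively, so MAX picks the first move and the root value is 3.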
11
-
Exercise in pairs
Analyze the minimax algorithm
Compare your strategy for Tic-Tac-Toe with the minimax algorithm
12
-
Properties of minimax
Complete? Yes (if tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? $O(b^m)$
Space complexity? $O(bm)$ (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for "reasonable" games → exact solution completely infeasible
13
-
Optimal decisions in multiplayer games
14
-
α-β pruning example
15
-
α-β pruning example
16
-
α-β pruning example
17
-
α-β pruning example
18
-
α-β pruning example
19
-
Properties of α-β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = $O(b^{m/2})$ → doubles depth of search
A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
20
-
α-β Pruning
α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max
If v is worse than α, max will avoid it → prune that branch
Define β similarly for min
21
-
The α-β algorithm
22
-
The α-β algorithm
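The algorithm itself is missing from the transcript. The following is a minimal sketch of α-β search over a nested-list game tree, under the same assumed encoding as a plain list-of-lists tree with numeric leaves. The `visited` list is a hypothetical addition that records which leaves were actually evaluated, so the effect of pruning can be seen.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Minimax value of `node`, skipping branches that cannot affect the result.

    alpha: best value found so far for MAX along the current path.
    beta:  best value found so far for MIN along the current path.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record every leaf that is evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            if value >= beta:      # MIN above will never allow this branch
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        if value <= alpha:         # MAX above already has something better
            break
        beta = min(beta, value)
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # illustrative values
seen = []
print(alphabeta(tree, visited=seen))  # 3
print(seen)                           # [3, 12, 8, 2, 14, 5, 2]
```

On this tree the result is the same as plain minimax, but the leaves 4 and 6 are never evaluated: once the second MIN node is known to be worth at most 2, MAX already prefers the first move.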
23
-
Imperfect Real-Time Decisions
Suppose we have 100 secs and explore $10^4$ nodes/sec → $10^6$ nodes per move
Standard approach:
cutoff test:
e.g., depth limit (perhaps add quiescence search)
evaluation function
= estimated desirability of position
24
-
Evaluation functions
For chess, typically a linear weighted sum of features:
$$\text{EVAL}(s) = w_1 f_1(s) + w_2 f_2(s) + \dots + w_n f_n(s) = \sum_{i=1}^{n} w_i f_i(s)$$
where $w_i$ = the values of the pieces, e.g. 9 for a queen, 3 for a bishop and 1 for a pawn, and $f_i(s)$ = a feature of the position. $f_i$ could be the number of pieces of each kind.
It assumes independence between features
Could be modified to include nonlinear combinations of features
$w_i$ could be estimated using machine learning
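The weighted sum can be sketched in a few lines. The queen, bishop and pawn weights come from the slide; the rook and knight weights, the dictionary encoding of a position, and the choice of taking each feature as "White's count minus Black's count" are assumptions made for this example.

```python
# Material weights: Q, B and P are from the slide; R = 5 and N = 3 are
# commonly used values added here as an assumption.
WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}


def eval_material(white_counts, black_counts):
    """EVAL(s) = sum_i w_i * f_i(s), with f_i = White's count - Black's count."""
    return sum(
        weight * (white_counts.get(piece, 0) - black_counts.get(piece, 0))
        for piece, weight in WEIGHTS.items()
    )


white = {"Q": 1, "R": 2, "B": 2, "N": 2, "P": 8}
black = {"Q": 1, "R": 2, "B": 1, "N": 2, "P": 7}
print(eval_material(white, black))  # 4: White is a bishop and a pawn ahead
```

Because the sum is linear, each feature contributes independently of the others, which is the independence assumption mentioned on the slide.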
25
-
Cutting off search
MinimaxCutoff is identical to MinimaxValue except:
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
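The two substitutions can be sketched as follows. The tree encoding is an assumption: each node is a pair `(estimate, children)`, where `estimate` plays the role of Eval for that position and a node with no children is terminal, so its estimate is its true utility. All numbers are made up for illustration.

```python
def cutoff(node, depth, limit):
    """Cutoff? test: stop at the depth limit or at a terminal position."""
    _, children = node
    return depth >= limit or not children


def evaluate(node):
    """Eval: the static estimate stored with the position."""
    estimate, _ = node
    return estimate


def minimax_cutoff(node, limit, depth=0, maximizing=True):
    """Minimax with Terminal? replaced by Cutoff? and Utility replaced by Eval."""
    if cutoff(node, depth, limit):
        return evaluate(node)
    _, children = node
    values = [minimax_cutoff(c, limit, depth + 1, not maximizing) for c in children]
    return max(values) if maximizing else min(values)


def leaf(value):
    return (value, [])


# Hypothetical tree: the static estimates 4, 3 and 5 are deliberately imperfect.
tree = (0, [
    (4, [leaf(3), leaf(12), leaf(8)]),
    (3, [leaf(2), leaf(4), leaf(6)]),
    (5, [leaf(14), leaf(5), leaf(2)]),
])
print(minimax_cutoff(tree, limit=2))  # 3: full search, true minimax value
print(minimax_cutoff(tree, limit=1))  # 5: shallow search trusts the estimates
```

With the depth limit at 1, the search prefers the third move because its estimate is highest, even though a deeper search shows that move is worth only 2. The quality of a cut-off search therefore depends on the quality of Eval.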
Does it work in practice?
$b^m = 10^6$, b = 35 → m ≈ 4
4-ply lookahead is a hopeless chess player!
4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ Deep Blue, Kasparov
Kasparov: m = 12. Full minimax: m = 100.
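The arithmetic behind the 4-ply figure can be checked directly. This sketch uses the time budget and node rate stated earlier in the deck; the function name `reachable_depth` is hypothetical, and the second figure relies on the deck's claim that perfect move ordering lets α-β search twice as deep.

```python
import math

seconds_per_move = 100
nodes_per_second = 10**4
budget = seconds_per_move * nodes_per_second  # 10**6 nodes per move


def reachable_depth(nodes, branching_factor):
    """Solve b**m = nodes for m (a fractional number of plies)."""
    return math.log(nodes) / math.log(branching_factor)


plain = reachable_depth(budget, 35)  # plain minimax on chess
pruned = 2 * plain                   # alpha-beta with perfect move ordering
print(round(plain, 1), round(pruned, 1))  # 3.9 7.8
```

So plain minimax reaches roughly 4 plies within the budget, and well-ordered α-β reaches roughly 8.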
26
-
Stochastic Games
White movement
Black movement
Element of chance
27
-
Stochastic Games
Characterize the environment for Backgammon
Deterministic/Stochastic
Continuous/Discrete
Dynamic/Semidynamic/Static
Fully observable/Partially observable
Episodic/Sequential
28
-
Stochastic Games
29
-
Stochastic Games
Positions do not have definite minimax values; instead, compute the expected value of a position
Generalize the minimax value to the expectiminimax value
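$$
\text{EXPECTIMINIMAX}(s) =
\begin{cases}
\text{UTILITY}(s) & \text{if TERMINAL-TEST}(s) \\
\max_a \text{EXPECTIMINIMAX}(\text{RESULT}(s, a)) & \text{if PLAYER}(s) = \text{MAX} \\
\min_a \text{EXPECTIMINIMAX}(\text{RESULT}(s, a)) & \text{if PLAYER}(s) = \text{MIN} \\
\sum_r P(r)\, \text{EXPECTIMINIMAX}(\text{RESULT}(s, r)) & \text{if PLAYER}(s) = \text{CHANCE}
\end{cases}
$$
r represents a possible dice roll
30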
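A minimal sketch of the expectiminimax recursion is shown below. The tree encoding is an assumption: a leaf is a number, MAX and MIN nodes are `(kind, children)` pairs, and a chance node lists `(probability, child)` pairs, one per outcome r. The example uses a coin flip instead of dice to keep the tree small.

```python
def expectiminimax(node):
    """Value of a game tree that contains MAX, MIN and CHANCE nodes."""
    if isinstance(node, (int, float)):
        return node                      # terminal: utility
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        # Expected value: each outcome weighted by its probability P(r).
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node type: {kind}")


# Illustrative tree: MAX picks a move, a fair coin is flipped, MIN replies.
move_a = ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))])
move_b = ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [5, 7]))])
root = ("max", [move_a, move_b])

print(expectiminimax(move_a))  # 4.0
print(expectiminimax(move_b))  # 2.5
print(expectiminimax(root))    # 4.0
```

Move B contains the single best outcome (10), but MIN would never allow it, and its expected value of 2.5 is below the 4.0 of move A.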
-
Stochastic Games
EXPECTIMINIMAX, in addition to MIN and MAX, must also consider the possible dice rolls
Time complexity: $O(b^m n^m)$, where b is the branching factor, m is the depth of the search tree, and n is the number of distinct rolls.
Alternative: Monte Carlo simulation
From the start position, the program simulates thousands of games against itself using random dice rolls
31
-
Partially Observable Games
Battleship
Kriegspiel: a variant of chess in which each player sees only his/her own pieces on the board. The referee sees all the pieces.
Each player, on his/her turn, announces a move to the referee; the opponent does not hear the move
The referee announces whether the move is legal or illegal; if it is illegal, the player may keep proposing moves until a legal one is found
The referee announces, e.g., "Capture on square X" or "Check by D", where D is the direction of the check.
The referee also announces checkmate or stalemate.
32
-
Card games
Question:
Why are most card games different from dice games?
Is Domino similar to a card game?
33
-
Domino
Possible algorithm:
Consider all possible deals of the invisible dominoes
Solve each one as if it were a fully observable game
Choose the move that has the best outcome averaged over all the deals, i.e., if each deal s occurs with probability P(s), then the move to choose is:
$$\operatorname*{argmax}_a \sum_s P(s)\, \text{MINIMAX}(\text{RESULT}(s, a))$$
Run MINIMAX if computationally feasible, or H-MINIMAX otherwise
35
-
Domino
Is it computationally feasible to consider all possible deals of the invisible dominoes?
Alternative: Monte Carlo simulation
Take a random sample of N deals, where the probability of deal s appearing in the sample is proportional to P(s):
$$\operatorname*{argmax}_a \frac{1}{N} \sum_{i=1}^{N} \text{MINIMAX}(\text{RESULT}(s_i, a))$$
This method is called averaging over clairvoyance
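Both the exact average over all deals and its sampled approximation can be sketched on a toy problem. Everything specific here is hypothetical: the three deals, their probabilities, the two moves, and the lookup table `VALUE`, which stands in for MINIMAX(RESULT(s, a)) so that no real domino solver is needed.

```python
import random

# Hypothetical deals with their probabilities P(s).
P = {"deal_1": 0.5, "deal_2": 0.3, "deal_3": 0.2}

# Stand-in for MINIMAX(RESULT(s, a)): the value of each move if the deal were known.
VALUE = {
    "deal_1": {"safe": 1, "risky": 0},
    "deal_2": {"safe": 1, "risky": 0},
    "deal_3": {"safe": 1, "risky": 3},
}
MOVES = ["safe", "risky"]


def exact_scores():
    """Sum over all deals of P(s) * value, for each move."""
    return {a: sum(P[s] * VALUE[s][a] for s in P) for a in MOVES}


def sampled_scores(n, rng):
    """Average value over n deals drawn with probability proportional to P(s)."""
    deals = rng.choices(list(P), weights=list(P.values()), k=n)
    return {a: sum(VALUE[s][a] for s in deals) / n for a in MOVES}


def best(scores):
    return max(scores, key=scores.get)


rng = random.Random(0)  # fixed seed so the run is repeatable
print(best(exact_scores()))            # safe
print(best(sampled_scores(1000, rng)))
```

The exact scores are 1.0 for the safe move and 0.6 for the risky one. With 1000 sampled deals the estimate for the risky move stays close to 0.6, so the sampled choice agrees with the exact one while evaluating only the sampled deals.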
37
-
Games in practice
Checkers: Chinook uses alpha-beta search with a database of $39 \times 10^{12}$ precomputed endgame positions.
Chess: Deep Blue used 30 IBM RS/6000 processors doing alpha-beta search, together with 480 custom VLSI chess processors. It searched up to $30 \times 10^9$ positions per move, with an evaluation function of over 8000 features.
Othello: human champions refuse to compete against computers, which are too good.
Go: Computer programs on the 19x19 board play at advanced amateur level. In Go, b > 300, so most programs use Monte Carlo simulation and pattern knowledge bases to suggest plausible moves.
Bridge: The Bridge Baron program won the 1997 computer bridge championship. It uses complex hierarchical plans involving high-level ideas, but is not optimal.
Scrabble: Using a dictionary, a program chooses the highest-scoring move. This gives a good but not expert player, since the game is partially observable and stochastic. Quackle defeated former world champion David Boys 3-2 in 2006.
38
-
Summary
Games are fun to work on!
They illustrate several important points about AI
perfection is unattainable → must approximate
good idea to think about what to think about (metareasoning)
39