aima slides - drexel universityedge.cs.drexel.edu/regli/classes/ai/slides/chapter05.pdfdeep blue,...
TRANSCRIPT
Gameplaying
Chapter5,Sections1{5
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
1
Outline
}
Perfectplay
}
Resourcelimits
}
�{�
pruning
}
Gamesofchance
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
2
Gamesvs.searchproblems
\Unpredictable"opponent)
solutionisacontingencyplan
Timelimits)
unlikelyto�ndgoal,mustapproximate
Planofattack:
�algorithmforperfectplay(VonNeumann,1944)
��nitehorizon,approximateevaluation(Zuse,1945;Shannon,1950;
Samuel,1952{57)
�pruningtoreducecosts(McCarthy,1956)
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
3
Types of games
deterministic chance
perfect information
imperfect information
chess, checkers,go, othello
backgammonmonopoly
bridge, poker, scrabblenuclear war
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 4
Minimax
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value
= best achievable payo� against best play
E.g., 2-ply game:
MAX
3 12 8 642 14 5 2
MIN
3
A1
A3
A2
A 13A 12A 11A 21 A 23
A 22A 33A 32
A 31
3 2 2
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 5
Minimaxalgorithm
functionM
inimax-Decision(gam
e)returnsan
operator
foreachop
inOperators[gam
e]do
Value[op]
M
inimax-Value(Apply(op,gam
e),gam
e)
end
returntheop
withthehighestValue[op]
functionM
inimax-Value(state,gam
e)returnsa
utility
value
ifTerminal-Test[gam
e](state)then
returnUtility[gam
e](state)
elseifmax
istomoveinstatethen
returnthehighestM
inimax-ValueofSuccessors(state)
else
returnthelowestM
inimax-ValueofSuccessors(state)
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
6
Propertiesofminimax
Complete??
Optimal??
Timecomplexity??
Spacecomplexity??
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
7
Propertiesofminimax
Complete??Yes,iftreeis�nite(chesshasspeci�crulesforthis)
Optimal??Yes,againstanoptimalopponent.Otherwise??
Timecomplexity??O(bm
)
Spacecomplexity??O(bm
)(depth-�rstexploration)
Forchess,b�35,m
�100for\reasonable"games
)
exactsolutioncompletelyinfeasible
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
8
Resourcelimits
Supposewehave100seconds,explore104nodes/second
)
106nodespermove
Standardapproach:
�cuto�
test
e.g.,depthlimit(perhapsaddquiescencesearch)
�evaluation
function
=estimateddesirabilityofposition
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
9
Evaluation functions
Black to move
White slightly better
White to move
Black winning
For chess, typically linear weighted sum of features
Eval(s) = w1f1(s) + w2f2(s) + : : :+ wnfn(s)
e.g., w1 = 9 with
f1(s) = (number of white queens) { (number of black queens)
etc.
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 10
Digression: Exact values don't matter
MIN
MAX
21
1
42
2
20
1
1 40020
20
Behaviour is preserved under any monotonic transformation of Eval
Only the order matters:
payo� in deterministic games acts as an ordinal utility function
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 11
Cuttingo�search
MinimaxCutoffisidenticaltoMinimaxValueexcept
1.Terminal?isreplacedbyCutoff?
2.UtilityisreplacedbyEval
Doesitworkinpractice?
bm
=106;
b=35
)
m
=4
4-plylookaheadisahopelesschessplayer!
4-ply�
humannovice
8-ply�
typicalPC,humanmaster
12-ply�
DeepBlue,Kasparov
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
12
�{� pruning example
MAX
3 12 8
MIN 3
3
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 13
2
2
X X
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 14
14
14
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 15
5
5
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 16
2
2
3
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 17
Propertiesof�{�
Pruningdoesnota�ect�nalresult
Goodmoveorderingimprovese�ectivenessofpruning
With\perfectordering,"timecomplexity=O(bm
=2)
)
doublesdepthofsearch
)
caneasilyreachdepth8andplaygoodchess
Asimpleexampleofthevalueofreasoningaboutwhichcomputations
arerelevant(aformofmetareasoning)
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
18
Why is it called �{�?
..
..
..
MAX
MIN
MAX
MIN V
� is the best value (to max) found so far o� the current path
If V is worse than �, max will avoid it ) prune that branch
De�ne � similarly for min
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 19
The�{�
algorithm
BasicallyMinimax+keeptrackof�,�
+prune
functionM
ax-Value(state,gam
e,�,�)returnstheminimaxvalueofstate
inputs:state,currentstateingame
gam
e,gamedescription
�,thebestscoreformax
alongthepathtostate
�,thebestscoreformin
alongthepathtostate
ifCutoff-Test(state)thenreturnEval(state)
foreachsinSuccessors(state)do
�
M
ax(�,M
in-Value(s,gam
e,�,�))
if�
�
�
thenreturn�
end
return�
functionM
in-Value(state,gam
e,�,�)returnstheminimaxvalueofstate
ifCutoff-Test(state)thenreturnEval(state)
foreachsinSuccessors(state)do
�
M
in(�,M
ax-Value(s,gam
e,�,�))
if�
�
�
thenreturn�
end
return�
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
20
Deterministicgamesinpractice
Checkers:Chinookended40-year-reignofhumanworldchampionMar-
ionTinsleyin1994.Usedanendgamedatabasede�ningperfectplay
forallpositionsinvolving8orfewerpiecesontheboard,atotalof
443,748,401,247positions.
Chess:DeepBluedefeatedhumanworldchampionGaryKasparovina
six-gamematchin1997.DeepBluesearches200millionpositionsper
second,usesverysophisticatedevaluation,andundisclosedmethodsfor
extendingsomelinesofsearchupto40ply.
Othello:humanchampionsrefusetocompeteagainstcomputers,who
aretoogood.
Go:humanchampionsrefusetocompeteagainstcomputers,whoare
toobad.Ingo,b>
300,somostprogramsusepatternknowledgebases
tosuggestplausiblemoves.
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
21
Nondeterministic games
E..g, in backgammon, the dice rolls determine the legal moves
Simpli�ed example with coin- ipping instead of dice-rolling:
MIN
MAX
2
CHANCE
4 7 4 6 0 5 −2
2 4 0 −2
0.5 0.5 0.5 0.5
3 −1
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 22
Algorithm
fornondeterministicgames
Expectiminimaxgivesperfectplay
JustlikeMinimax,exceptwemustalsohandlechancenodes:
:::
ifstateisachancenodethen
returnaverageofExpectiMinimax-ValueofSuccessors(state)
:::
Aversionof�{�
pruningispossible
butonlyiftheleafvaluesarebounded.Why??
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
23
Nondeterministicgamesinpractice
Dicerollsincreaseb:21possiblerollswith2dice
Backgammon�
20legalmoves(canbe6,000with1-1roll)
depth4=20�(21�20)3�1:2�109
Asdepthincreases,probabilityofreachingagivennodeshrinks
)
valueoflookaheadisdiminished
�{�
pruningismuchlesse�ective
TDGammonusesdepth-2search+verygoodEval
�
world-championlevel
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
24
Digression: Exact values DO matter
DICE
MIN
MAX
2 2 3 3 1 1 4 4
2 3 1 4
.9 .1 .9 .1
2.1 1.3
20 20 30 30 1 1 400 400
20 30 1 400
.9 .1 .9 .1
21 40.9
Behaviour is preserved only by positive linear transformation of Eval
Hence Eval should be proportional to the expected payo�
AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 25
Summary
Gamesarefuntoworkon!(anddangerous)
TheyillustrateseveralimportantpointsaboutAI
}
perfectionisunattainable)
mustapproximate
}
goodideatothinkaboutwhattothinkabout
}
uncertaintyconstrainstheassignmentofvaluestostates
GamesaretoAIasgrandprixracingistoautomobiledesign
AIMA
Slidesc StuartRussellandPeterNorvig,1998
Chapter5,Sections1{5
26