aima slides - drexel universityedge.cs.drexel.edu/regli/classes/ai/slides/chapter05.pdfdeep blue,...

26

Upload: vohanh

Post on 18-Mar-2018

218 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Gameplaying

Chapter5,Sections1{5

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

1

Page 2: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Outline

}

Perfectplay

}

Resourcelimits

}

�{�

pruning

}

Gamesofchance

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

2

Page 3: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Gamesvs.searchproblems

\Unpredictable"opponent)

solutionisacontingencyplan

Timelimits)

unlikelyto�ndgoal,mustapproximate

Planofattack:

�algorithmforperfectplay(VonNeumann,1944)

��nitehorizon,approximateevaluation(Zuse,1945;Shannon,1950;

Samuel,1952{57)

�pruningtoreducecosts(McCarthy,1956)

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

3

Page 4: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Types of games

deterministic chance

perfect information

imperfect information

chess, checkers,go, othello

backgammonmonopoly

bridge, poker, scrabblenuclear war

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 4

Page 5: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Minimax

Perfect play for deterministic, perfect-information games

Idea: choose move to position with highest minimax value

= best achievable payo� against best play

E.g., 2-ply game:

MAX

3 12 8 642 14 5 2

MIN

3

A1

A3

A2

A 13A 12A 11A 21 A 23

A 22A 33A 32

A 31

3 2 2

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 5

Page 6: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Minimaxalgorithm

functionM

inimax-Decision(gam

e)returnsan

operator

foreachop

inOperators[gam

e]do

Value[op]

M

inimax-Value(Apply(op,gam

e),gam

e)

end

returntheop

withthehighestValue[op]

functionM

inimax-Value(state,gam

e)returnsa

utility

value

ifTerminal-Test[gam

e](state)then

returnUtility[gam

e](state)

elseifmax

istomoveinstatethen

returnthehighestM

inimax-ValueofSuccessors(state)

else

returnthelowestM

inimax-ValueofSuccessors(state)

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

6

Page 7: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Propertiesofminimax

Complete??

Optimal??

Timecomplexity??

Spacecomplexity??

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

7

Page 8: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Propertiesofminimax

Complete??Yes,iftreeis�nite(chesshasspeci�crulesforthis)

Optimal??Yes,againstanoptimalopponent.Otherwise??

Timecomplexity??O(bm

)

Spacecomplexity??O(bm

)(depth-�rstexploration)

Forchess,b�35,m

�100for\reasonable"games

)

exactsolutioncompletelyinfeasible

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

8

Page 9: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Resourcelimits

Supposewehave100seconds,explore104nodes/second

)

106nodespermove

Standardapproach:

�cuto�

test

e.g.,depthlimit(perhapsaddquiescencesearch)

�evaluation

function

=estimateddesirabilityofposition

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

9

Page 10: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Evaluation functions

Black to move

White slightly better

White to move

Black winning

For chess, typically linear weighted sum of features

Eval(s) = w1f1(s) + w2f2(s) + : : :+ wnfn(s)

e.g., w1 = 9 with

f1(s) = (number of white queens) { (number of black queens)

etc.

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 10

Page 11: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Digression: Exact values don't matter

MIN

MAX

21

1

42

2

20

1

1 40020

20

Behaviour is preserved under any monotonic transformation of Eval

Only the order matters:

payo� in deterministic games acts as an ordinal utility function

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 11

Page 12: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Cuttingo�search

MinimaxCutoffisidenticaltoMinimaxValueexcept

1.Terminal?isreplacedbyCutoff?

2.UtilityisreplacedbyEval

Doesitworkinpractice?

bm

=106;

b=35

)

m

=4

4-plylookaheadisahopelesschessplayer!

4-ply�

humannovice

8-ply�

typicalPC,humanmaster

12-ply�

DeepBlue,Kasparov

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

12

Page 13: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

�{� pruning example

MAX

3 12 8

MIN 3

3

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 13

Page 14: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

2

2

X X

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 14

Page 15: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

14

14

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 15

Page 16: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

5

5

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 16

Page 17: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

2

2

3

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 17

Page 18: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Propertiesof�{�

Pruningdoesnota�ect�nalresult

Goodmoveorderingimprovese�ectivenessofpruning

With\perfectordering,"timecomplexity=O(bm

=2)

)

doublesdepthofsearch

)

caneasilyreachdepth8andplaygoodchess

Asimpleexampleofthevalueofreasoningaboutwhichcomputations

arerelevant(aformofmetareasoning)

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

18

Page 19: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Why is it called �{�?

..

..

..

MAX

MIN

MAX

MIN V

� is the best value (to max) found so far o� the current path

If V is worse than �, max will avoid it ) prune that branch

De�ne � similarly for min

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 19

Page 20: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

The�{�

algorithm

BasicallyMinimax+keeptrackof�,�

+prune

functionM

ax-Value(state,gam

e,�,�)returnstheminimaxvalueofstate

inputs:state,currentstateingame

gam

e,gamedescription

�,thebestscoreformax

alongthepathtostate

�,thebestscoreformin

alongthepathtostate

ifCutoff-Test(state)thenreturnEval(state)

foreachsinSuccessors(state)do

M

ax(�,M

in-Value(s,gam

e,�,�))

if�

thenreturn�

end

return�

functionM

in-Value(state,gam

e,�,�)returnstheminimaxvalueofstate

ifCutoff-Test(state)thenreturnEval(state)

foreachsinSuccessors(state)do

M

in(�,M

ax-Value(s,gam

e,�,�))

if�

thenreturn�

end

return�

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

20

Page 21: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Deterministicgamesinpractice

Checkers:Chinookended40-year-reignofhumanworldchampionMar-

ionTinsleyin1994.Usedanendgamedatabasede�ningperfectplay

forallpositionsinvolving8orfewerpiecesontheboard,atotalof

443,748,401,247positions.

Chess:DeepBluedefeatedhumanworldchampionGaryKasparovina

six-gamematchin1997.DeepBluesearches200millionpositionsper

second,usesverysophisticatedevaluation,andundisclosedmethodsfor

extendingsomelinesofsearchupto40ply.

Othello:humanchampionsrefusetocompeteagainstcomputers,who

aretoogood.

Go:humanchampionsrefusetocompeteagainstcomputers,whoare

toobad.Ingo,b>

300,somostprogramsusepatternknowledgebases

tosuggestplausiblemoves.

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

21

Page 22: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Nondeterministic games

E..g, in backgammon, the dice rolls determine the legal moves

Simpli�ed example with coin- ipping instead of dice-rolling:

MIN

MAX

2

CHANCE

4 7 4 6 0 5 −2

2 4 0 −2

0.5 0.5 0.5 0.5

3 −1

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 22

Page 23: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Algorithm

fornondeterministicgames

Expectiminimaxgivesperfectplay

JustlikeMinimax,exceptwemustalsohandlechancenodes:

:::

ifstateisachancenodethen

returnaverageofExpectiMinimax-ValueofSuccessors(state)

:::

Aversionof�{�

pruningispossible

butonlyiftheleafvaluesarebounded.Why??

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

23

Page 24: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Nondeterministicgamesinpractice

Dicerollsincreaseb:21possiblerollswith2dice

Backgammon�

20legalmoves(canbe6,000with1-1roll)

depth4=20�(21�20)3�1:2�109

Asdepthincreases,probabilityofreachingagivennodeshrinks

)

valueoflookaheadisdiminished

�{�

pruningismuchlesse�ective

TDGammonusesdepth-2search+verygoodEval

world-championlevel

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

24

Page 25: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Digression: Exact values DO matter

DICE

MIN

MAX

2 2 3 3 1 1 4 4

2 3 1 4

.9 .1 .9 .1

2.1 1.3

20 20 30 30 1 1 400 400

20 30 1 400

.9 .1 .9 .1

21 40.9

Behaviour is preserved only by positive linear transformation of Eval

Hence Eval should be proportional to the expected payo�

AIMA Slides c Stuart Russell and Peter Norvig, 1998 Chapter 5, Sections 1{5 25

Page 26: AIMA Slides - Drexel Universityedge.cs.drexel.edu/regli/Classes/AI/slides/chapter05.pdfDeep Blue, Kaspa rov AIMA Slides c Stuart Russell and P eter Norvig, 1998 Chapter 5, Sections

Summary

Gamesarefuntoworkon!(anddangerous)

TheyillustrateseveralimportantpointsaboutAI

}

perfectionisunattainable)

mustapproximate

}

goodideatothinkaboutwhattothinkabout

}

uncertaintyconstrainstheassignmentofvaluestostates

GamesaretoAIasgrandprixracingistoautomobiledesign

AIMA

Slidesc StuartRussellandPeterNorvig,1998

Chapter5,Sections1{5

26