lecture6games - dept.cs.williams.edu
2/14/17
1
Games
Andrea Danyluk, February 15, 2017
Announcements
• Programming Assignment 1: Search
  – Still in progress
  – A note about designing heuristics:
    • Add a "feature" at a time
    • Consider different weights for different features
    • Think beyond adding heuristic information together
    • Once you have a function that works well, remove elements to determine whether you really need them
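The heuristic-design advice above can be sketched in a few lines. This is only an illustration, not the assignment's code: the grid state, the `GOAL` cell, and the two features `row_distance` and `col_distance` are hypothetical stand-ins for whatever features your own search problem suggests.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, int]            # hypothetical state: (row, column) on a grid
Feature = Callable[[State], float]

GOAL: State = (0, 0)               # hypothetical goal cell


def row_distance(state: State) -> float:
    """Feature 1: vertical distance to the goal."""
    return abs(state[0] - GOAL[0])


def col_distance(state: State) -> float:
    """Feature 2: horizontal distance to the goal."""
    return abs(state[1] - GOAL[1])


def make_heuristic(features: Sequence[Feature],
                   weights: Sequence[float]) -> Callable[[State], float]:
    """Build a heuristic as a weighted sum of features."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature")

    def heuristic(state: State) -> float:
        return sum(w * f(state) for f, w in zip(features, weights))

    return heuristic


# Add one feature at a time, then try different weights.
rows_only = make_heuristic([row_distance], [1.0])
both = make_heuristic([row_distance, col_distance], [1.0, 1.0])


# "Think beyond adding": features can also be combined with max.
def combined_by_max(state: State) -> float:
    return max(row_distance(state), col_distance(state))
```

Dropping a feature from the list passed to `make_heuristic` and re-running the search is one simple way to check whether that feature is really needed.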
Today
• Games (repeated from last time)
  – Planning/problem solving in the presence of an adversary ⇒ adversarial search
  – Why games?
    • Easy to measure success or failure
    • States and rules are generally easy to specify
    • Interesting and complex
      – Space and time complexity
      – Uncertainty of adversaries' action, rolls of dice, etc.
Go
• AlphaGo became the first program to beat a human professional Go player without handicaps on a full 19x19 board.
• In Go, b > 300
• Uses Monte Carlo tree search to select moves.
• Uses knowledge learned from a combination of reinforcement and deep learning.
Backgammon
• TD Gammon uses depth-2 search + very good evaluation function + reinforcement learning (Gerry Tesauro, IBM)
• World-champion level play
• 1st AI world champion in any game!
Poker
• Libratus [Sandholm and Brown, CMU] won $1.7m (in chips) from 4 professional poker players over 20 days in January 2017
• No-limit Texas Hold'em
• Hard because it's a game of imperfect information. Can't see the opponent's hand.
• The "final frontier" in games…
[Adapted from CS188 Berkeley]
Types of Games
                        Deterministic                        Chance
Perfect Information     Chess, Checkers, Go, Connect Four    Backgammon
Imperfect Information   Battleship, Guess Who?               Bridge, Poker, Scrabble
[Adapted from Russell and from CS188 Berkeley]
Want algorithms for calculating a strategy (policy) that recommends a move in each state
Connect Four Demo
• With perfect play, first player can force a win by starting in the middle column.
• By starting in one of the two adjacent columns, the first player allows the second player to reach a draw.
• By starting in any of the four outer columns, the first player allows the second player to force a win.
• There exist perfect players – my demo program is not one of them.
Game Playing as a Search Problem
Note that each level in the game tree (i.e., each half move) is called a ply.
Formulating Game Playing as Search
• States S
  – Description of the current state/configuration of the game
• Players P = {1, 2, …, n}
  – Will take turns in the games we consider
• Actions A
  – Legal actions may depend on player and state
• Transition model
  – Defines the result of an action applied to a state for a particular player
  – Result is a new state
• Terminal test
  – Function on states; returns T if state is a terminal state and F otherwise
• Utility function S x P -> value
  – Also called objective function or payoff function
[Adapted from CS188 Berkeley]
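To make the formulation concrete, here is a minimal sketch for a toy game that is not from the lecture: two players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins. The state encoding and all function names are assumptions chosen to mirror the components listed above.

```python
from typing import List, Tuple

# State: (stones left in the pile, player to move: 1 or 2).
State = Tuple[int, int]


def actions(state: State) -> List[int]:
    """Legal actions: take 1 or 2 stones, never more than remain."""
    stones, _ = state
    return [n for n in (1, 2) if n <= stones]


def result(state: State, action: int) -> State:
    """Transition model: remove the stones and pass the turn."""
    stones, player = state
    return (stones - action, 3 - player)


def terminal_test(state: State) -> bool:
    """The game ends when the pile is empty."""
    return state[0] == 0


def utility(state: State, player: int) -> int:
    """Utility S x P -> value: +1 if `player` took the last stone, else -1.

    In a terminal state the player to move faces an empty pile,
    so the other player made the final move and wins.
    """
    _, to_move = state
    winner = 3 - to_move
    return 1 if player == winner else -1
```

For example, `result((3, 1), 2)` gives `(1, 2)`: one stone is left and it is player 2's turn.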
Games vs Search Problems
• "Unpredictable" opponent ⇒ solution is a strategy
• Time limits ⇒ unlikely to reach terminal states.
  – Must approximate
Minimax Search
• When it's your turn, generate (ideally) the complete game tree.
• Select the move that is best for you, assuming that your opponent will, at each opportunity, select the move that is worst for you (and thus best for him/her/itself)
An Example: 2-player zero-sum game
[Figure, built up over several slides: a game tree with alternating Max and min levels and terminal utilities of 1 and -1. Minimax values are backed up one node at a time from the leaves toward the root.]
Minimax Search revisited
• A state-space search tree
• Players alternate turns
• Each node has a minimax value: best achievable utility against a rational adversary
[Adapted from CS188 Berkeley]
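The minimax value of a node can be written as a recurrence. The form below is the standard textbook one, not copied from the slides; it uses the same function names as the MINIMAX-DECISION pseudocode and assumes a two-player game in which Max and min alternate turns.

```latex
\[
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s) \\[4pt]
\max\limits_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if it is Max's turn in } s \\[4pt]
\min\limits_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if it is min's turn in } s
\end{cases}
\]
```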
Another Example
[Figure, built up over several slides: a second game tree with numeric leaf utilities. Minimax values are backed up level by level from the leaves to the root.]
But really done depth-first
Really…
[Figure, built up over several slides: a depth-first evaluation of the game tree; the first backed-up value shown is 3.]
function MINIMAX-DECISION(state) returns an action a
  return arg max a in ACTIONS(state) MIN-VALUE(RESULT(state, a))

function MIN-VALUE(state) returns a utility value v
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = infinity
  for each a in ACTIONS(state) do
    v = MIN(v, MAX-VALUE(RESULT(state, a)))
  return v

function MAX-VALUE(state) returns a utility value v
  if TERMINAL-TEST(state) then return UTILITY(state)
  v = -infinity
  for each a in ACTIONS(state) do
    v = MAX(v, MIN-VALUE(RESULT(state, a)))
  return v
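The pseudocode above translates almost line for line into Python. The sketch below assumes a deliberately simple representation that is not from the slides: a game tree is a nested list, a leaf is an integer utility for Max, and an action is the index of a child.

```python
import math
from typing import List, Union

# Assumed representation: a leaf is an int utility (from Max's point of
# view); an internal node is a list of child subtrees.
Tree = Union[int, List["Tree"]]


def terminal_test(state: Tree) -> bool:
    return isinstance(state, int)


def utility(state: Tree) -> int:
    return state


def actions(state: Tree) -> range:
    # An action is the index of the child to move to.
    return range(len(state))


def result(state: Tree, action: int) -> Tree:
    return state[action]


def max_value(state: Tree) -> float:
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for a in actions(state):
        v = max(v, min_value(result(state, a)))
    return v


def min_value(state: Tree) -> float:
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for a in actions(state):
        v = min(v, max_value(result(state, a)))
    return v


def minimax_decision(state: Tree) -> int:
    """Return the action whose resulting state has the highest min-value."""
    return max(actions(state), key=lambda a: min_value(result(state, a)))


# A two-ply example: Max moves at the root, min replies.
example_tree: Tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
```

On `example_tree` the three min nodes have values 3, 2, and 2, so `minimax_decision(example_tree)` picks action 0 and the root's minimax value is 3. The recursion explores the tree depth-first, just as the slides describe.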
Minimax Reality
• Can rarely explore entire search space to terminal nodes.
  – DFS has good space complexity, but bad time complexity
• Choose a depth cutoff – i.e., a maximum ply
• Need an evaluation function
  – Returns an estimate of the expected utility of the game from a given position
  – Must order the terminal states in the same way as the true utility function
  – Must be efficient to compute
• Trading off plies for heuristic computation
• More plies makes a difference
• Consider iterative deepening
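A depth cutoff and an evaluation function can be added to minimax with only a few changes. The sketch below is one possible arrangement under stated assumptions, not the lecture's implementation: the game tree is a nested list with numeric leaves, and `evaluate` is a made-up evaluation function that returns the true utility at a leaf and the mean of its children's estimates elsewhere.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # assumed: int leaf or list of subtrees


def is_leaf(node: Tree) -> bool:
    return not isinstance(node, list)


def evaluate(node: Tree) -> float:
    """Hypothetical evaluation function.

    Returns the true utility at a leaf, so terminal states are ordered
    exactly as the utility function orders them; elsewhere it returns
    the mean of the children's estimates as a crude guess.
    """
    if is_leaf(node):
        return float(node)
    return sum(evaluate(child) for child in node) / len(node)


def value(node: Tree, depth: int, maximizing: bool) -> float:
    """Depth-limited minimax value: stop at leaves or at the cutoff."""
    if is_leaf(node) or depth == 0:
        return evaluate(node)
    child_values = [value(c, depth - 1, not maximizing) for c in node]
    return max(child_values) if maximizing else min(child_values)


def best_move(node: Tree, depth: int) -> int:
    """Index of Max's best child when searching `depth` plies ahead."""
    return max(range(len(node)),
               key=lambda i: value(node[i], depth - 1, False))


def iterative_deepening(node: Tree, max_depth: int) -> List[int]:
    """Best move found at each cutoff 1..max_depth, shallowest first."""
    return [best_move(node, d) for d in range(1, max_depth + 1)]


example_tree: Tree = [[3, 12, 8], [2, 4, 6], [14, 15, 1]]
```

With a one-ply cutoff the estimate favors the third move (mean 10), but searching two plies shows that min would answer it with 1, so the first move (guaranteed 3) is better: `iterative_deepening(example_tree, 2)` returns `[2, 0]`. A real player would loop over depths until its time limit expires and play the move from the deepest completed search.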