Techniques for Computing Game-Theoretic Solutions
Vincent Conitzer Duke University
Parts of this talk are joint work with Tuomas Sandholm
Introduction
• Increasingly, computer science is confronted with settings where multiple, self-interested parties interact
– network routing
– job scheduling
– electronic commerce
– e-government
– …
• Parties can be humans or computers/software agents
[Illustrations: a routing network with nodes A, B, C, D; jobs 1–3 scheduled on machines 1 and 2; an auction with valuations v(·) = $500 and v(·) = $700; agents’ preference orderings]
Where do we face difficult computational problems?
• Running the mechanism under which the agents interact
• Computing how the agents should act under the mechanism (strategically)
• Computing which mechanisms give the best outcomes (under strategic behavior)
this talk
Game theory
Rock-paper-scissors

            Rock      Paper     Scissors
Rock        0, 0      -1, 1     1, -1
Paper       1, -1     0, 0      -1, 1
Scissors    -1, 1     1, -1     0, 0
The presentation game
                                      Pay attention    Do not pay attention
Put effort into presentation          4, 4             -16, -14
Do not put effort into presentation   0, -2            0, 0
“Should I buy an SUV?” (a.k.a. the Prisoner’s Dilemma)

                Buy SUV      Buy compact
Buy SUV         -10, -10     -7, -11
Buy compact     -11, -7      -8, -8

(Costs: driving an SUV costs 5, a compact 3; an accident costs each driver 5 when the cars are the same type, and costs the SUV driver 2 and the compact driver 8 when they differ.)
dominance
Nash equilibrium
• Nash equilibrium: a strategy for each player such that no player can benefit by unilaterally changing her strategy
The presentation game (row: E = effort, NE = no effort; column: A = pay attention, NA = do not):

        A         NA
E       4, 4      -16, -14
NE      0, -2     0, 0

Two Nash equilibria: (E, A) and (NE, NA)

Rock-paper-scissors:

        0, 0      -1, 1     1, -1
        1, -1     0, 0      -1, 1
        -1, 1     1, -1     0, 0

No Nash equilibria!
Really? Can a game have no Nash equilibria?
Nash equilibrium…
• Mixed strategy: a probability distribution over (pure) strategies
The presentation game has a third Nash equilibrium, in mixed strategies: the row player plays E with probability 1/10 and NE with 9/10; the column player plays A with probability 4/5 and NA with 1/5.

           4/5       1/5
1/10       4, 4      -16, -14
9/10       0, -2     0, 0

Rock-paper-scissors has one Nash equilibrium: both players mix uniformly.

          1/3       1/3       1/3
1/3       0, 0      -1, 1     1, -1
1/3       1, -1     0, 0      -1, 1
1/3       -1, 1     1, -1     0, 0
• At least one Nash equilibrium always exists (in finite games) [Nash 50]
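As a concrete illustration of the definition (a sketch added here, not from the talk): a mixed-strategy profile is a Nash equilibrium iff, for each player, every pure strategy earns at most the profile's expected payoff (any profitable mixed deviation would imply a profitable pure one). Checking the presentation game's mixed equilibrium:

```python
from fractions import Fraction

def is_nash_equilibrium(A, B, p, q):
    """True iff no pure-strategy deviation by either player earns more
    than the expected payoff of the mixed profile (p, q)."""
    m, n = len(A), len(A[0])
    row_val = sum(p[i] * q[j] * A[i][j] for i in range(m) for j in range(n))
    col_val = sum(p[i] * q[j] * B[i][j] for i in range(m) for j in range(n))
    row_ok = all(sum(q[j] * A[i][j] for j in range(n)) <= row_val for i in range(m))
    col_ok = all(sum(p[i] * B[i][j] for i in range(m)) <= col_val for j in range(n))
    return row_ok and col_ok

# The presentation game: rows are E/NE (effort), columns are A/NA (attention).
A = [[4, -16], [0, 0]]    # row player's payoffs
B = [[4, -14], [-2, 0]]   # column player's payoffs
p = [Fraction(1, 10), Fraction(9, 10)]  # E with probability 1/10
q = [Fraction(4, 5), Fraction(1, 5)]    # A with probability 4/5
print(is_nash_equilibrium(A, B, p, q))  # → True
```

Against q, both E and NE give the row player exactly 0, and against p both A and NA give the column player −7/5 — the indifference that makes mixing an equilibrium.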
How do we compute solutions?
• Computing dominance is easy (and many, though not all, variants are as well [C. & Sandholm EC05])
• Computing Nash equilibria is harder…
A useful reduction (SAT -> game) [C. & Sandholm IJCAI03/extended draft]
Formula: (x1 or -x2) and (-x1 or x2)
Solutions: x1 = true, x2 = true; x1 = false, x2 = false
Game:
              x1      x2      +x1     -x1     +x2     -x2     (x1 or -x2)  (-x1 or x2)  default
x1            -2,-2   -2,-2   0,-2    0,-2    2,-2    2,-2    -2,-2        -2,-2        0,1
x2            -2,-2   -2,-2   2,-2    2,-2    0,-2    0,-2    -2,-2        -2,-2        0,1
+x1           -2,0    -2,2    1,1     -2,-2   1,1     1,1     -2,0         -2,2         0,1
-x1           -2,0    -2,2    -2,-2   1,1     1,1     1,1     -2,2         -2,0         0,1
+x2           -2,2    -2,0    1,1     1,1     1,1     -2,-2   -2,2         -2,0         0,1
-x2           -2,2    -2,0    1,1     1,1     -2,-2   1,1     -2,0         -2,2         0,1
(x1 or -x2)   -2,-2   -2,-2   0,-2    2,-2    2,-2    0,-2    -2,-2        -2,-2        0,1
(-x1 or x2)   -2,-2   -2,-2   2,-2    0,-2    0,-2    2,-2    -2,-2        -2,-2        0,1
default       1,0     1,0     1,0     1,0     1,0     1,0     1,0          1,0          ε, ε
• Every satisfying assignment (if there are any) corresponds to an equilibrium with utilities 1, 1
• Exactly one additional equilibrium, with utilities ε, ε, always exists
What about just computing one (any) Nash equilibrium?
• Complexity was completely open for a long time
– [Papadimitriou STOC01]: “together with factoring […] the most important concrete open question on the boundary of P today”
• Recent sequence of papers shows that computing one (any) Nash equilibrium is PPAD-complete [Daskalakis, Goldberg, Papadimitriou 05; Chen, Deng 05]
– Just as hard in symmetric games [C. 03/Tardos 03]
• All known algorithms require exponential time (in the worst case)
Search-based approaches
• Suppose we know the support Xi of each player i’s mixed strategy in equilibrium
• Then, we have a simple linear feasibility problem (in the variables p-i and each player’s equilibrium utility ui):
– for both players i, for any si ∈ Xi: Σs-i p-i(s-i) ui(si, s-i) = ui
– for both players i, for any si ∈ Si − Xi: Σs-i p-i(s-i) ui(si, s-i) ≤ ui
– (plus: each p-i is a probability distribution with support contained in X-i)
• Thus, we can search over supports
– this is the basic idea underlying the methods in [Dickhaut & Kaplan 91; Porter, Nudelman, Shoham AAAI04; Sandholm, Gilpin, C. AAAI05]
• Dominated strategies can be eliminated
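The support search can be sketched concretely. Below is an illustrative pure-Python implementation (not the code from the cited papers; all function names are chosen here, and for simplicity it only tries equal-size supports and skips degenerate indifference systems): for each candidate support pair, solve the indifference equations exactly over fractions, and keep the solution iff all probabilities are nonnegative and no pure strategy outside the support is a profitable deviation.

```python
from fractions import Fraction
from itertools import combinations

def solve_linear(M, b):
    """Solve M x = b by Gauss-Jordan elimination over exact fractions;
    returns None if the matrix is singular."""
    n = len(M)
    T = [[Fraction(M[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    for col in range(n):
        piv = next((r for r in range(col, n) if T[r][col] != 0), None)
        if piv is None:
            return None
        T[col], T[piv] = T[piv], T[col]
        for r in range(n):
            if r != col and T[r][col] != 0:
                f = T[r][col] / T[col][col]
                for c in range(col, n + 1):
                    T[r][c] -= f * T[col][c]
    return [T[i][n] / T[i][i] for i in range(n)]

def support_equilibrium(A, B, Sr, Sc):
    """Try supports (Sr, Sc): solve the indifference equations, then check
    nonnegativity and that no outside pure strategy is a better response."""
    k = len(Sr)
    # q makes the row player indifferent on Sr (unknowns: q_j for j in Sc, and u).
    M = [[A[i][j] for j in Sc] + [-1] for i in Sr] + [[1] * k + [0]]
    sol = solve_linear(M, [0] * k + [1])
    if sol is None:
        return None
    q, u = sol[:k], sol[k]
    # p makes the column player indifferent on Sc (unknowns: p_i and v).
    M = [[B[i][j] for i in Sr] + [-1] for j in Sc] + [[1] * k + [0]]
    sol = solve_linear(M, [0] * k + [1])
    if sol is None:
        return None
    p, v = sol[:k], sol[k]
    if any(x < 0 for x in p + q):
        return None
    pf, qf = dict(zip(Sr, p)), dict(zip(Sc, q))
    for i in range(len(A)):          # no profitable row deviation
        if sum(A[i][j] * qf[j] for j in Sc) > u:
            return None
    for j in range(len(A[0])):       # no profitable column deviation
        if sum(B[i][j] * pf[i] for i in Sr) > v:
            return None
    return pf, qf

def find_equilibrium(A, B):
    """Enumerate equal-size supports, smallest first."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for Sr in combinations(range(m), k):
            for Sc in combinations(range(n), k):
                eq = support_equilibrium(A, B, Sr, Sc)
                if eq is not None:
                    return eq
    return None

# Rock-paper-scissors: every proper support is rejected, and the search
# returns the unique (uniform) equilibrium.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
p, q = find_equilibrium(A, B)
```

Exact rational arithmetic matters here: the deviation checks compare payoffs for equality-tight constraints, which floating point would get wrong on some instances.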
A class of hard games [Sandholm, Gilpin, C. AAAI05]
0, 2 0, 3 3, 0 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 0, 3 0, 2 3, 0 0, 2
2, 4 2, 0 2, 0 2, 0 4, 2 2, 0 3, 3
3, 3 2, 0 2, 0 2, 0 2, 4 2, 0 4, 2
0, 2 3, 0 0, 3 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 3, 0 0, 2 0, 3 0, 2
4, 2 2, 0 2, 0 2, 0 3, 3 2, 0 2, 4
(Annotation: an equilibrium places probability 1/3 on three of each player’s strategies and probability 0 on the other four.)
Eliminability concepts
Dominance: the strategy always does worse than some other (mixed) strategy
– strong argument
– local reasoning
– easy to compute
– often does not apply
Nash equilibrium: the strategy does not appear in the support of any Nash equilibrium
– weaker argument
– global reasoning
– hard to compute
– applies more often
          .5        .5
.5        3, 2      2, 3
.5        2, 3      3, 2
0         4, 0      0, 1

(The bottom strategy receives probability 0 in equilibrium, yet it is not dominated.)
Is there something “in between” that combines good aspects of both? Yes! [C. & Sandholm AAAI05]
.5        3, 2      2, 3
.5        2, 3      3, 2
          2, 0      2, 1
Definition as game between attacker and defender
• Stage 1: Defender specifies probabilities on the E strategies (er* must receive probability > 0)

Example (er* = sr3, Er = {sr3, sr4}, Ec = {sc3, sc4}):

        sc1      sc2      sc3      sc4
sr1     2, 2     2, 2     2, 0     2, 0
sr2     2, 2     2, 2     2, 0     2, 0
sr3     0, 2     0, 2     3, 0     0, 3
sr4     0, 2     0, 2     0, 3     3, 0

[Illustration: probabilities such as 0.5, 0.4, 0.4, 0.3 placed on the E strategies]
• Stage 2: Attacker chooses one of the E strategies that has positive probability to attack, and chooses a (possibly mixed) attacking strategy
• Stage 3: Defender chooses on which (non-E) strategy to place the remainder of the probability
– If the attacking strategy outperforms the attacked one, the attacker wins
er* = sr3, Er = {sr3, sr4}, Ec = {sc3, sc4}

        sc1      sc2      sc3      sc4
sr1     2, 2     2, 2     2, 0     2, 0
sr2     2, 2     2, 2     2, 0     2, 0
sr3     0, 2     0, 2     3, 0     0, 3
sr4     0, 2     0, 2     0, 3     3, 0
A spectrum of elimination power
• The larger the Ei sets, the more strategies are eliminable
• If the Ei sets include all strategies, then a strategy is eliminable if and only if no Nash equilibrium places positive probability on it
• If the Ei sets are empty (with the exception of er*) then er* is eliminable if and only if it is dominated
dominance ——(larger Ei sets)——→ Nash equilibrium
Alternative definition
• Stage 1: Defender specifies probabilities on the E strategies (er* must receive probability > 0)
• Stage 2: Attacker chooses one of the E strategies with positive probability to attack
• Stage 3: Defender distributes the remainder of the probability (over non-E strategies)
• Stage 4: Attacker chooses an attacking strategy
– If the attacking strategy outperforms the attacked one, the attacker wins
er* = sr3, Er = {sr3, sr4}, Ec = {sc3, sc4}

        sc1      sc2      sc3      sc4
sr1     2, 2     2, 2     2, 0     2, 0
sr2     2, 2     2, 2     2, 0     2, 0
sr3     0, 2     0, 2     3, 0     0, 3
sr4     0, 2     0, 2     0, 3     3, 0
Equivalence
• Theorem. The alternative definition is equivalent to the original one.
• Proof based on duality (more specifically, Minimax Theorem [von Neumann 1927])
Mixed integer programming approach (using alternative definition)
• Continuous variables: pi(ei) and pi^{e-i}(si); binary variables: bi(ei)
• maximize pr(er*)
• subject to
– for both players i, for any ei ∈ Ei:
  Σe-i p-i(e-i) + Σs-i p-i^{ei}(s-i) = 1
– for both players i, for any ei ∈ Ei:
  pi(ei) ≤ bi(ei)
– for both players i, for any ei ∈ Ei and any di ∈ Si:
  Σe-i p-i(e-i)·(ui(ei, e-i) − ui(di, e-i)) + Σs-i p-i^{ei}(s-i)·(ui(ei, s-i) − ui(di, s-i)) ≥ (bi(ei) − 1)·Ui
where Ui is the maximum difference between two of player i’s utilities
• Number of binary variables = |Er| + |Ec|
– exponential only in this!
Eliminating strategies in the hard game
0, 2 0, 3 3, 0 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 0, 3 0, 2 3, 0 0, 2
2, 4 2, 0 2, 0 2, 0 4, 2 2, 0 3, 3
3, 3 2, 0 2, 0 2, 0 2, 4 2, 0 4, 2
0, 2 3, 0 0, 3 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 3, 0 0, 2 0, 3 0, 2
4, 2 2, 0 2, 0 2, 0 3, 3 2, 0 2, 4
[Annotations: the equilibrium probabilities (1/3 on three strategies per player, 0 on the rest) and the sets Er, Ec of strategies considered for elimination]
Another preprocessing technique for computing a Nash equilibrium [C. & Sandholm AAMAS06]
[Diagram: the original game O containing a subgame G in one block, the rest labeled H, and the reduced game R built from the equilibrium of G — detailed on the following slides]
Required structure on the original game O (rows u1, …, uk and s1, …, sm; columns v1, …, vl and t1, …, tn):

       v1 … vj … vl                       t1 … tj … tn
ui     (arbitrary payoffs)                ci1, bi  …  cij, bi  …  cin, bi
si     a1, di1  …  aj, dij  …  al, dil    (subgame G)

That is:
– against any fixed vj, all the si give the row player the same utility aj
– against any fixed ui, all the tj give the column player the same utility bi
Solve for equilibrium of G (recursively)
• Obtain:
– the equilibrium distributions pG(si), pG(tj)
– the players’ expected payoffs in equilibrium, πr and πc
Reduced game R (rows u1, …, uk and one aggregate row s; columns v1, …, vl and one aggregate column t):

       v1 … vj … vl                                t
ui     (as in O)                                   Σj pG(tj) cij, bi
s      a1, Σi pG(si) di1  …  al, Σi pG(si) dil     πr, πc

– (ui, t): expected payoffs when the column player plays the equilibrium of G
– (s, vj): expected payoffs when the row player plays the equilibrium of G and the column player plays vj
– (s, t): expected payoffs when both players play the equilibrium of G

• Theorem. The profile in which the row player plays ui with probability pR(ui) and si with probability pR(s)·pG(si), and the column player plays vj with probability pR(vj) and tj with probability pR(t)·pG(tj), constitutes a Nash equilibrium of the original game.
Example
O:
        v1       t1       t2
u1      2, 2     0, 3     2, 3
s1      1, 2     4, 0     0, 4
s2      1, 4     0, 4     4, 0

G (the s × t block):
        t1       t2
s1      4, 0     0, 4
s2      0, 4     4, 0
Equilibrium of G: both players mix 0.5 / 0.5; payoffs πr = πc = 2.

R:
        v1       t
u1      2, 2     1, 3
s       1, 3     2, 2
Equilibrium of R: both players mix 0.5 / 0.5.

Combined equilibrium of O: the row player plays (u1, s1, s2) with probabilities (0.5, 0.25, 0.25); the column player plays (v1, t1, t2) with probabilities (0.5, 0.25, 0.25).
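On this example, the construction of R can be carried out mechanically. A small Python sketch (illustrative; `reduce_game` and its index conventions are invented here, with strategies referred to by row/column index):

```python
from fractions import Fraction

def reduce_game(Ao, Bo, u_rows, s_rows, v_cols, t_cols, pG_s, pG_t):
    """Build the reduced game R from O: the s-rows collapse into one
    aggregate row "s" and the t-columns into one aggregate column "t",
    using the equilibrium (pG_s, pG_t) of the subgame G."""
    A, B = {}, {}
    for ui in u_rows:
        for vj in v_cols:  # (ui, vj): payoffs carried over from O unchanged
            A[ui, vj], B[ui, vj] = Ao[ui][vj], Bo[ui][vj]
        # (ui, t): the column player plays the equilibrium of G
        A[ui, "t"] = sum(pG_t[tj] * Ao[ui][tj] for tj in t_cols)
        B[ui, "t"] = sum(pG_t[tj] * Bo[ui][tj] for tj in t_cols)
    for vj in v_cols:      # (s, vj): the row player plays the equilibrium of G
        A["s", vj] = sum(pG_s[si] * Ao[si][vj] for si in s_rows)
        B["s", vj] = sum(pG_s[si] * Bo[si][vj] for si in s_rows)
    # (s, t): both players play the equilibrium of G
    A["s", "t"] = sum(pG_s[si] * pG_t[tj] * Ao[si][tj] for si in s_rows for tj in t_cols)
    B["s", "t"] = sum(pG_s[si] * pG_t[tj] * Bo[si][tj] for si in s_rows for tj in t_cols)
    return A, B

# The example: rows 0 = u1, 1 = s1, 2 = s2; columns 0 = v1, 1 = t1, 2 = t2.
Ao = [[2, 0, 2], [1, 4, 0], [1, 0, 4]]  # row player's payoffs in O
Bo = [[2, 3, 3], [2, 0, 4], [4, 4, 0]]  # column player's payoffs in O
half = Fraction(1, 2)
A, B = reduce_game(Ao, Bo, [0], [1, 2], [0], [1, 2],
                   {1: half, 2: half}, {1: half, 2: half})
# R matches the slide: (u1,v1)=(2,2), (u1,t)=(1,3), (s,v1)=(1,3), (s,t)=(2,2)
```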
A more difficult example
= the game that we solved before, with the strategies relabeled (u1 = a2, s1 = a1, s2 = a3; v1 = b2, t1 = b1, t2 = b3)!

        b1       b2       b3
a1      4, 0     1, 2     0, 4
a2      0, 3     2, 2     2, 3
a3      0, 4     1, 4     4, 0
• But how (in general) do we find the correct labeling of the strategies as ui, si, vj, tj? Can it be done in polynomial time?
Let’s try to use satisfiability
        b1       b2       b3
a1      4, 0     1, 2     0, 4
a2      0, 3     2, 2     2, 3
a3      0, 4     1, 4     4, 0

• Say that v(σ) = true if we label σ as one of the si or tj (that is, we put it “in” G)
• If a1 and a2 are both in G, then b1 must also be in G, because a1 and a2 get different payoffs against b1
• Equivalently: v(a1) and v(a2) ⇒ v(b1)
– as a clause: (-v(a1) or -v(a2) or v(b1))
• Theorem: all such clauses are satisfied ⇔ the required structure is obtained
Clauses for the example
        b1       b2       b3
a1      4, 0     1, 2     0, 4
a2      0, 3     2, 2     2, 3
a3      0, 4     1, 4     4, 0

• v(a1) and v(a2) ⇒ v(b1) and v(b2) and v(b3)
• v(a1) and v(a3) ⇒ v(b1) and v(b3)
• v(a2) and v(a3) ⇒ v(b2) and v(b3)
• v(b1) and v(b2) ⇒ v(a1) and v(a2)
• v(b1) and v(b3) ⇒ v(a1) and v(a3)
• v(b2) and v(b3) ⇒ v(a1) and v(a2) and v(a3)
• Complete characterization of solutions:
– set at most one variable to true for each player (does not reduce the game)
– set all variables to true (G = the whole game!)
– the only nontrivial solution: set v(a1), v(a3), v(b1), v(b3) to true
Simple algorithm
• Algorithm to find a nontrivial solution:
– start with any two variables for the same agent set to true
– follow the implications
– if all variables end up set to true, restart with the next pair of variables
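This search can be sketched in a few lines of Python (illustrative; the function name and clause representation are invented here, and a real implementation would derive the clauses from the payoff matrix rather than take them as input):

```python
def find_nontrivial_labeling(row_labels, col_labels, clauses):
    """Seed with a pair of variables for the same player set to true,
    propagate the implication clauses to a fixed point, and accept the
    result unless it covers the whole game. `clauses` is a list of
    (pair, implied): if both strategies in `pair` are in G, every
    strategy in `implied` must be in G too."""
    all_labels = set(row_labels) | set(col_labels)
    seeds = [(x, y) for labels in (row_labels, col_labels)
             for i, x in enumerate(labels) for y in labels[i + 1:]]
    for seed in seeds:
        true = set(seed)
        changed = True
        while changed:  # follow the implications to a fixed point
            changed = False
            for pair, implied in clauses:
                if set(pair) <= true and not set(implied) <= true:
                    true |= set(implied)
                    changed = True
        if true != all_labels:
            return true  # nontrivial solution found
    return None

# Clauses for the example game (as derived on the slide):
clauses = [
    (("a1", "a2"), ("b1", "b2", "b3")),
    (("a1", "a3"), ("b1", "b3")),
    (("a2", "a3"), ("b2", "b3")),
    (("b1", "b2"), ("a1", "a2")),
    (("b1", "b3"), ("a1", "a3")),
    (("b2", "b3"), ("a1", "a2", "a3")),
]
# Pass 1 seeds (a1, a2) and ends up setting every variable; pass 2 seeds
# (a1, a3) and stops at the nontrivial fixed point {a1, a3, b1, b3}.
print(find_nontrivial_labeling(["a1", "a2", "a3"], ["b1", "b2", "b3"], clauses))
```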
Solving the example with the algorithm (pass 1)
b1 b2 b3
a1 4, 0 1, 2 0, 4
a2 0, 3 2, 2 2, 3
a3 0, 4 1, 4 4, 0
• v(a1) and v(a2) ⇒ v(b1) and v(b2) and v(b3)
• v(a1) and v(a3) ⇒ v(b1) and v(b3)
• v(a2) and v(a3) ⇒ v(b2) and v(b3)
• v(b1) and v(b2) ⇒ v(a1) and v(a2)
• v(b1) and v(b3) ⇒ v(a1) and v(a3)
• v(b2) and v(b3) ⇒ v(a1) and v(a2) and v(a3)
• Variables set to true: v(a1), v(a2) → v(b1), v(b2), v(b3) → v(a3) — all variables, so this pass fails
Solving the example with the algorithm (pass 2)
b1 b2 b3
a1 4, 0 1, 2 0, 4
a2 0, 3 2, 2 2, 3
a3 0, 4 1, 4 4, 0
• v(a1) and v(a2) ⇒ v(b1) and v(b2) and v(b3)
• v(a1) and v(a3) ⇒ v(b1) and v(b3)
• v(a2) and v(a3) ⇒ v(b2) and v(b3)
• v(b1) and v(b2) ⇒ v(a1) and v(a2)
• v(b1) and v(b3) ⇒ v(a1) and v(a3)
• v(b2) and v(b3) ⇒ v(a1) and v(a2) and v(a3)
• Variables set to true: v(a1), v(a3) → v(b1), v(b3) — a fixed point, and a nontrivial solution
Algorithm complexity
• Theorem. Requires at most O((#rows + #columns)^4) clause applications
– that is, quadratic in the size of the game matrix if the game is square
• Can improve in practice by caching previous results
Preprocessing the hard game
2, 4 4, 2 3, 3
3, 3 2, 4 4, 2
4, 2 3, 3 2, 4
0, 2 1.5, 1.5 0, 2 0, 2
2, 4 2, 0 4, 2 3, 3
3, 3 2, 0 2, 4 4, 2
4, 2 2, 0 3, 3 2, 4
0, 2 0, 3 3, 0 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 0, 3 0, 2 3, 0 0, 2
2, 4 2, 0 2, 0 2, 0 4, 2 2, 0 3, 3
3, 3 2, 0 2, 0 2, 0 2, 4 2, 0 4, 2
0, 2 3, 0 0, 3 0, 0 0, 2 0, 0 0, 2
0, 2 0, 0 0, 0 3, 0 0, 2 0, 3 0, 2
4, 2 2, 0 2, 0 2, 0 3, 3 2, 0 2, 4
0, 3 3, 0
3, 0 0, 3
0, 3 3, 0 0, 0
0, 0 0, 0 1.5, 1.5
3, 0 0, 3 0, 0
1.5, 1.5 0, 0
0, 0 1.5, 1.5
0, 3 3, 0
3, 0 0, 3
0, 3 3, 0 0, 0 0, 0
0, 0 0, 0 0, 3 3, 0
3, 0 0, 3 0, 0 0, 0
0, 0 0, 0 3, 0 0, 3
[Annotations: equilibrium probabilities for the subgames — 1/2, 1/2 for the 2×2 blocks and 1/3, 1/3, 1/3 for the 3×3 block — and the probabilities they induce in the original 7×7 game]
Another game
        2, 1     4, 0
        1, 0     3, 1

(The top row dominates the bottom row; after eliminating it, the left column dominates the right, so iterated dominance yields the outcome (2, 1).)
What if player 1 commits first?
        2, 1     4, 0
        1, 0     3, 1

If player 1 commits to the bottom row (probability 0 on top, 1 on bottom), player 2 best-responds with the right column, and player 1 gets 3 instead of 2.
What if player 1 commits first?
        2, 1     4, 0
        1, 0     3, 1

If player 1 commits to the mixed strategy (1/2 − ε on top, 1/2 + ε on bottom), player 2 strictly prefers the right column, and player 1 gets almost 3.5.
Computing optimal mixed strategies to commit to [C. & Sandholm EC06]
For every follower pure strategy t, solve (ul = leader’s utility, uf = follower’s utility):

maximize Σs ps ul(s, t)
subject to
  for all t′: Σs ps uf(s, t) ≥ Σs ps uf(s, t′)
  Σs ps = 1

Choose the solution with the highest objective value.
Example: solve

maximize 2 pUp + 1 pDown
subject to
  1 pUp ≥ 1 pDown
  pUp + pDown = 1
solution: pUp = 1, pDown = 0, objective = 2

maximize 4 pUp + 3 pDown
subject to
  1 pDown ≥ 1 pUp
  pUp + pDown = 1
solution: pUp = .5, pDown = .5, objective = 3.5
2, 1 4, 0
1, 0 3, 1
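When the leader has only two pure strategies, these per-follower LPs can be solved without an LP library: each has a single free variable p (the probability on the top row), the best-response constraints carve out an interval, and the linear objective is maximized at an endpoint. A sketch under those assumptions (the function name is invented here; as in the LP's weak inequality, the follower is assumed to break ties in the leader's favor):

```python
from fractions import Fraction

def optimal_commitment_2xN(A, B):
    """Optimal mixed strategy to commit to for a 2-row leader.
    A = leader's payoffs, B = follower's payoffs. For each follower
    column t, keep t a (weak) best response and maximize the leader's
    payoff over p = P(top row); return (value, p, t) for the best t."""
    n = len(A[0])
    best = None
    for t in range(n):
        lo, hi = Fraction(0), Fraction(1)
        feasible = True
        for t2 in range(n):
            # Best-response constraint against t2, rearranged as a*p + c >= 0:
            # p*B[0][t] + (1-p)*B[1][t] >= p*B[0][t2] + (1-p)*B[1][t2]
            a = (B[0][t] - B[1][t]) - (B[0][t2] - B[1][t2])
            c = B[1][t] - B[1][t2]
            if a > 0:
                lo = max(lo, Fraction(-c) / a)
            elif a < 0:
                hi = min(hi, Fraction(-c) / a)
            elif c < 0:
                feasible = False  # constraint can never hold
        if not feasible or lo > hi:
            continue  # t can never be made a best response
        for p in (lo, hi):  # linear objective: an endpoint is optimal
            value = p * A[0][t] + (1 - p) * A[1][t]
            if best is None or value > best[0]:
                best = (value, p, t)
    return best

# The example game: committing to (1/2, 1/2) makes the right column a
# best response and earns the leader 3.5.
A = [[2, 4], [1, 3]]  # leader (row) payoffs
B = [[1, 0], [0, 1]]  # follower (column) payoffs
print(optimal_commitment_2xN(A, B))  # → (Fraction(7, 2), Fraction(1, 2), 1)
```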
Optimal computer player for “Liar’s Dice” games
• (One variant of) Liar’s Dice:
– a player rolls some number of dice under a cup and peeks
– she makes a claim about the total score (and may lie)
– the next player can accept the claim or call a bluff
– if the next player accepts, she has to claim a higher number on her own turn

[Game-tree illustration: sample claims (“9”, “10”, “5”, “7”) with “accept”/“bluff” moves, showing outcomes where the red player wins]
Conclusions
• To act strategically (according to game theory), we need algorithms for computing game-theoretic solutions
• In computer science, we often get to design the game, too
– network protocols, e-commerce mechanisms, …
• Many important computational questions remain:
– even just running the mechanism can be hard
– finding the optimal mechanism (given strategic behavior) is even harder
Thank you for your attention!