Best Reply Mechanisms Justin Thaler and Victor Shnayder

Post on 05-Feb-2016

Page 1: Best Reply Mechanisms

Best Reply Mechanisms

Justin Thaler and Victor Shnayder

Page 2: Best Reply Mechanisms

What are best-reply dynamics?

•Start with an arbitrary strategy profile

•In each step let some player switch his strategy to be a best reply to the current strategies of the others.

Page 3: Best Reply Mechanisms

What are best-reply dynamics?

Definition: A repeated-reply mechanism for a private-information game G:

• Extensive-form game with perfect recall (same players)

• At most M steps. In each step, a single player i announces an element of A_i

• Players play in round-robin order

• Stop when all players “pass” in n consecutive steps

• Enforce the action profile of the most recently announced actions

• If M steps go by without stopping, penalize the players

Page 4: Best Reply Mechanisms

What are best-reply dynamics?

•Need a penalty to ensure non-convergence is not in best interest of any player.

•Realistic modeling assumption for BGP, TCP, etc.

•Best-reply dynamics is the strategy profile of a repeated-reply mechanism in which each player i updates to i’s best-reply to the other players’ strategies each time it is i’s turn.
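The round-robin process described above can be sketched in Python. This is only an illustration under stated assumptions: the game is full-information and two-player, the payoff tables are borrowed from the deck's Example 1, ties are broken in favor of the current action so that a player can "pass", and the names `best_reply` and `best_reply_dynamics` are hypothetical. The penalty for non-convergence is represented simply by returning `None`.

```python
from typing import Optional, Tuple

# Payoffs indexed as table[row_action][col_action] (values from Example 1).
ROW_PAYOFF = [[5, 0], [10, 4]]
COL_PAYOFF = [[5, 0], [0, 4]]


def best_reply(player: int, profile: Tuple[int, int]) -> int:
    """Best reply of `player` to the other player's current action.

    Assumption: ties go to the current action, which counts as a "pass".
    """
    row, col = profile
    if player == 0:
        payoffs = [ROW_PAYOFF[a][col] for a in range(len(ROW_PAYOFF))]
        current = row
    else:
        payoffs = [COL_PAYOFF[row][a] for a in range(len(COL_PAYOFF[0]))]
        current = col
    best = max(payoffs)
    return current if payoffs[current] == best else payoffs.index(best)


def best_reply_dynamics(start: Tuple[int, int],
                        max_steps: int) -> Optional[Tuple[int, int]]:
    """Round-robin best replies; stop once all n players pass in a row.

    Returns the enforced profile, or None if max_steps (the M of the
    definition) elapse first, which is where a penalty would apply.
    """
    n = 2
    profile = start
    passes = 0
    for step in range(max_steps):
        player = step % n
        action = best_reply(player, profile)
        if action == profile[player]:
            passes += 1
            if passes == n:
                return profile
        else:
            passes = 0
            profile = (action, profile[1]) if player == 0 else (profile[0], action)
    return None
```

Starting from `(0, 0)`, the row player switches first, the column player follows, and both then pass, so the process stops at `(1, 1)`.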

Page 5: Best Reply Mechanisms

Why best reply dynamics?

•If convergence occurs, we have a highly justifiable Nash Equilibrium

•Computationally simple

•Players only need private information

•Feasible in distributed, asynchronous settings

•Prescribed by existing protocols (Ex: BGP)

Page 6: Best Reply Mechanisms

Why best reply dynamics?

•In light of Theorems 1 and 2 (which we’ll see soon):

•Often gives a non-VCG way of creating incentive compatible mechanisms (?). And sometimes without $$$.

•Often get collusion-proofness, Pareto-efficiency

Page 7: Best Reply Mechanisms

Outline

•When do best reply dynamics work?

•Universal max-solvability (UMS)

•Thm: UMS implies convergence to unique NE, collusion-proofness

•Example applications (correlated markets, BGP, etc)

•Connections to strategy-proofness

•Discussion

Page 8: Best Reply Mechanisms

Universal max-dominance

•A subset T of S is universally max-dominated if:

•Very strong condition!

•Existence of a max-dominated set is strictly stronger than existence of a dominated strategy:

•There exist s_i, s_i' such that u_i(s_i, s_{-i}) < u_i(s_i', s_{-i}) for all s_{-i}

Page 9: Best Reply Mechanisms

Universal max-solvability (UMS)

•A game G is universally max-solvable if we can iteratively remove universally max-dominated strategy sets and get to a single strategy for each player.

•Stronger condition than solvable by iterated removal of strictly dominated strategies (IRSDS)

Page 10: Best Reply Mechanisms

Example 1

5, 5     0, 0
10, 0    4, 4

Solvable by IRSDS, but not UMS. Neither player has a universally max-dominated set. Note that the unique NE is not Pareto efficient (PE), and best-reply dynamics are not incentive compatible for the row player.
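To make the "solvable by IRSDS" claim concrete, here is a minimal Python sketch of iterated removal of strictly dominated strategies for a two-player game given as payoff tables. The function name `irsds` and the table layout are assumptions made for illustration; the payoffs are those of Example 1. The sketch does not test universal max-dominance.

```python
def irsds(row_payoff, col_payoff):
    """Iteratively remove strictly dominated pure strategies.

    Tables are indexed as table[row][col]. Returns the surviving
    row and column indices.
    """
    rows = list(range(len(row_payoff)))
    cols = list(range(len(row_payoff[0])))
    changed = True
    while changed:
        changed = False
        # Remove one row that is strictly worse than another surviving row
        # against every surviving column.
        for r in rows:
            if any(all(row_payoff[r2][c] > row_payoff[r][c] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
                break
        # Same check for columns, using the column player's payoffs.
        for c in cols:
            if any(all(col_payoff[r][c2] > col_payoff[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols


ROW_PAYOFF = [[5, 0], [10, 4]]
COL_PAYOFF = [[5, 0], [0, 4]]
surviving = irsds(ROW_PAYOFF, COL_PAYOFF)
```

For Example 1 the top row is removed first, then the left column, leaving the bottom-right cell with payoffs (4, 4).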

Page 11: Best Reply Mechanisms

Example 2

0, 1    0, 1
1, 1    1, 0

UMS

Page 14: Best Reply Mechanisms

Example 3 (UMS)

       L       M       R
A    1, 9    2, 9    2, 9
B    3, 1    3, 2    3, 2
C    3, 1    4, 3    5, 4

Page 19: Best Reply Mechanisms

Theorems

Theorem 1: G is UMS ⇒ G has a unique pure NE, and it is collusion-proof.

Corollary: Collusion-proof NE ⇒ NE is Pareto optimal

Note that solvability by IRSDS suffices for a unique pure NE. UMS is needed for collusion-proofness and PE.
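The contrast between Example 1 and Example 2 can be checked by brute force. The sketch below enumerates pure Nash equilibria of a two-player game and tests each for Pareto optimality. The function names and the table layout are assumptions made for illustration, and the check covers Pareto optimality only, not collusion-proofness.

```python
from itertools import product


def pure_nash(row_payoff, col_payoff):
    """All pure profiles (r, c) where neither player gains by deviating."""
    n_rows, n_cols = len(row_payoff), len(row_payoff[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        row_ok = all(row_payoff[r][c] >= row_payoff[r2][c] for r2 in range(n_rows))
        col_ok = all(col_payoff[r][c] >= col_payoff[r][c2] for c2 in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def is_pareto_optimal(profile, row_payoff, col_payoff):
    """True if no other pure profile is at least as good for both players
    and yields a different payoff pair."""
    r, c = profile
    u = (row_payoff[r][c], col_payoff[r][c])
    for r2, c2 in product(range(len(row_payoff)), range(len(row_payoff[0]))):
        v = (row_payoff[r2][c2], col_payoff[r2][c2])
        if v[0] >= u[0] and v[1] >= u[1] and v != u:
            return False
    return True


# Example 1 (solvable by IRSDS, not UMS) and Example 2 (UMS).
EX1 = ([[5, 0], [10, 4]], [[5, 0], [0, 4]])
EX2 = ([[0, 0], [1, 1]], [[1, 1], [1, 0]])
```

In Example 1 the only pure NE is the bottom-right cell, which is Pareto-dominated by (5, 5). In Example 2 the only pure NE is the bottom-left cell, which is Pareto optimal.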

Page 20: Best Reply Mechanisms

Proof of Theorem 1:

•By contradiction: G is UMS, so fix an elimination sequence of universally max-dominated strategy sets.

•Let s* be the final strategy profile.

•If s* is not a collusion-proof NE, some set of players T can deviate and be better off.

•Let s be the new profile, in which the players in T change strategy from s*.

•Among the deviating strategies, let s_i be the first one eliminated. It was max-dominated at that point, so s_i* is strictly better for i, and i cannot be better off. Contradiction.

Page 21: Best Reply Mechanisms

Example 1

5, 5     0, 0
10, 0    4, 4

Solvable by IRSDS, but not UMS. Neither player has a universally max-dominated set. Note that the unique NE is not Pareto efficient (PE), and best-reply dynamics are not incentive compatible for the row player.

Page 22: Best Reply Mechanisms

Theorems

Theorem 2: If G is UMS with private information, then best-reply dynamics are incentive-compatible in ex-post NE, and converge to the unique NE of the induced full-information game.

Proof: Similar to Theorem 1. The main idea is that a strategy eliminated in the t-th step of the UMS elimination process can never be used after the (nt)-th step of the best-reply mechanism.

Page 23: Best Reply Mechanisms

Correlated two-sided markets

•Agents: buyers and sellers

•Game: weighted bipartite graph -- buyers on one side, sellers on the other

•Buyers have preference order over sellers (higher edge weight = higher preference)

•Sellers prefer buyers connected by heavier edges

Page 24: Best Reply Mechanisms

Correlated two-sided markets are UMS

•Let e be maximum weight edge. Choosing it universally max-dominates all other strategies of both endpoints.

•Remove the two endpoints of e and all incident edges, repeat.

•Therefore, best reply dynamics converge to ex-post NE.
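The elimination argument above corresponds to a greedy procedure: repeatedly take the heaviest remaining edge and drop its endpoints. A minimal Python sketch follows, assuming distinct edge weights and a hypothetical input format (a dict from `(buyer, seller)` pairs to weights); the function name `greedy_matching` and the sample market are illustrative only.

```python
def greedy_matching(weights):
    """Match buyers and sellers by repeatedly taking the heaviest edge.

    weights: dict mapping (buyer, seller) -> edge weight (assumed distinct).
    Returns a list of (buyer, seller, weight) in the order they are matched.
    """
    remaining = dict(weights)
    matching = []
    while remaining:
        # The maximum-weight edge is each endpoint's most preferred option.
        (buyer, seller), weight = max(remaining.items(), key=lambda kv: kv[1])
        matching.append((buyer, seller, weight))
        # Remove both endpoints together with all their incident edges.
        remaining = {
            edge: w for edge, w in remaining.items()
            if edge[0] != buyer and edge[1] != seller
        }
    return matching


# Hypothetical market with two buyers and two sellers.
market = {("b1", "s1"): 9, ("b1", "s2"): 7, ("b2", "s1"): 8, ("b2", "s2"): 3}
result = greedy_matching(market)
```

Here `b1` and `s1` are matched first (weight 9), which leaves only the edge between `b2` and `s2`.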

Page 25: Best Reply Mechanisms

Extended Example: BGP

Page 26: Best Reply Mechanisms

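Internet routing: BGP

• Receive update messages from neighbors announcing routes to d.

• Choose a single neighbor, whose route you prefer most, to send traffic through.

• Announce your new route to all your neighbors.

[Figure: nodes 1 and 2, each linked to destination d and to each other. Route preferences, most preferred first: node 1: 12d, 1d; node 2: 21d, 2d.]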

Page 27: Best Reply Mechanisms

Internet routing: BGP

•BGP is asynchronous, distributed

•Prescribes best-reply dynamics

•But does BGP converge?

•And is BGP “incentive compatible”? Do ASes have an incentive to deviate from the protocol?

Page 28: Best Reply Mechanisms

Does BGP Converge?

•We can break this into two questions:

•Does a stable solution even exist in the static game?

•If so, will BGP find such a solution?

•But we only need one answer.

Page 29: Best Reply Mechanisms

Does a Stable Solution Exist?

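[Figure: nodes 1, 2, and 3, each linked to destination d. Route preferences, most preferred first: node 1: 13d, 1d; node 2: 21d, 2d; node 3: 32d, 3d.]

No stable solution exists!

It is actually NP-complete to determine existence in general networks.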
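For this three-node instance, the absence of a stable solution can be confirmed by enumerating all route assignments. The Python sketch below is a simplified model, not BGP itself: it assumes a route through a neighbor is available only when that neighbor currently uses the matching suffix, that a direct route is always available, and that each node picks its most preferred available route. The names `PREFS`, `available`, `best_reply`, and `stable_states` are hypothetical.

```python
from itertools import product

# Route preferences from the slide, most preferred first.
# (1, 3, "d") means node 1 forwards to node 3, which forwards to d.
PREFS = {
    1: [(1, 3, "d"), (1, "d")],
    2: [(2, 1, "d"), (2, "d")],
    3: [(3, 2, "d"), (3, "d")],
}


def available(path, state):
    """Direct routes are always available; a route via a neighbor needs
    the neighbor to be using exactly the rest of the path."""
    if len(path) == 2:
        return True
    return state[path[1]] == path[1:]


def best_reply(node, state, prefs):
    """Most preferred route of `node` that is available in `state`."""
    for path in prefs[node]:
        if available(path, state):
            return path
    raise ValueError("no available route")


def stable_states(prefs):
    """All assignments in which every node already plays its best reply."""
    nodes = sorted(prefs)
    result = []
    for choice in product(*(prefs[n] for n in nodes)):
        state = dict(zip(nodes, choice))
        if all(best_reply(n, state, prefs) == state[n] for n in nodes):
            result.append(state)
    return result
```

Under these assumptions `stable_states(PREFS)` is empty: each node wants to route through its neighbor exactly when that neighbor routes directly, and the three conditions cannot hold at once around the cycle.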

Page 30: Best Reply Mechanisms

Does BGP Converge When A Stable Solution Exists?

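[Figure: nodes 1 and 2, each linked to destination d and to each other. Route preferences, most preferred first: node 1: 12d, 1d; node 2: 21d, 2d.]

•Notice that multiple NE exist.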

•And asynchronous best-reply dynamics do not necessarily converge.

•So must not be UMS.
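The two-node instance can be simulated to see both effects. The sketch below uses a simplified model rather than BGP itself: a route via the neighbor is available only while the neighbor routes directly, and simultaneous activation of both nodes stands in for one possible asynchronous schedule. All function names are hypothetical.

```python
# Route preferences from the slide, most preferred first.
PREFS = {1: [(1, 2, "d"), (1, "d")], 2: [(2, 1, "d"), (2, "d")]}


def best_reply(node, state):
    """Most preferred route that is currently available to `node`."""
    for path in PREFS[node]:
        if len(path) == 2 or state[path[1]] == path[1:]:
            return path
    raise ValueError("no available route")


def simultaneous_step(state):
    """Both nodes update at once, each reacting to the old state."""
    return {node: best_reply(node, state) for node in PREFS}


def round_robin(state, max_steps):
    """Nodes update one at a time; stop once every node passes in a row."""
    state = dict(state)
    nodes = sorted(PREFS)
    passes = 0
    for step in range(max_steps):
        node = nodes[step % len(nodes)]
        reply = best_reply(node, state)
        if reply == state[node]:
            passes += 1
            if passes == len(nodes):
                return state
        else:
            passes = 0
            state[node] = reply
    return None


both_direct = {1: (1, "d"), 2: (2, "d")}
after_one = simultaneous_step(both_direct)
after_two = simultaneous_step(after_one)
```

With simultaneous updates the state flips between "both direct" and "both via the neighbor" forever, while round-robin updates from the same start settle on node 1 routing through node 2. Swapping the order of play would settle on the mirror-image outcome, which is the second equilibrium.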

Page 31: Best Reply Mechanisms

So What Do We Do?

• Approach #1: Use mechanism design to achieve IC convergence, but solution must be distributed.

• Approach #2: Identify conditions (on network topology and/or AS preferences) under which BGP converges and is IC.

• Both approaches are canonical problems in Distributed Algorithmic Mechanism Design.

Page 32: Best Reply Mechanisms

Approach #2 for Convergence

• Griffin et al. (1999): If BGP fails to converge, then there exists a Dispute Wheel.

•Each u_i would rather route clockwise through u_{i+1} than through Q_i

Image Source: Levin et al. “Internet Routing and Games,” 2008.

Page 33: Best Reply Mechanisms

Approach #2 for Convergence

• Gao and Rexford (2001): Identified reasonable conditions based on economic structure of the Internet that guarantee No Dispute Wheel and hence convergence. (No bounds on convergence rate given).

•But limited progress made until recently on conditions for guaranteeing that BGP is IC.

Page 34: Best Reply Mechanisms

Approach #2 for Incentive Compatibility

• Theorem 3: Assuming non-convergence after n^3 rounds is penalized, and No Dispute Wheel holds, routing games are UMS.

•Corollary: Under the above conditions, best-reply strategies are IC in collusion-proof ex-post NE.

•Corollary: Under the Gao-Rexford conditions, BGP converges in O(n^3) time and is IC.

Page 35: Best Reply Mechanisms

Theorem 3

• Proof sketch: The case of finding the first universally max-dominated action set is general.

•Find a node a1 with at least 2 actions. Let R be a1’s most preferred existing route. One of two cases must occur:

Page 36: Best Reply Mechanisms

Theorem 3

1. Every node a2 on R prefers the suffix of R leading from a2 to d. In this case, if u is the closest node to d on R with at least two actions, then (u, d) universally max-dominates all other actions of u, and we’re done.

2. Some node a2 on R prefers some other path over the suffix of R leading from a2 to d. In this case, we repeat the analysis at a2. Eventually we either form a dispute wheel or find ourselves in Case 1.

Page 37: Best Reply Mechanisms

What’s left in Routing?

•Complete characterization of BGP convergence (No Dispute Wheel sufficient, not necessary).

•Conditions for convergence to globally optimal solution. Can it even be efficiently found?

•Do mechanism design and/or $$$ have a role to play?

•Changes in network topology?

Page 38: Best Reply Mechanisms

Other applications

•Congestion control

• Criticism: Best-reply dynamics are only somewhat descriptive of how TCP works in practice.

•Cost sharing games

•Matching games (stable-roommate, intern assignment)

•Auctions (unit demand bidders, GSP)

• Relies a lot on VCG results

• Main contribution is proof of convergence! (opposite of BGP)

Page 39: Best Reply Mechanisms

Relationship to DSIC

[Diagram labels: θ, ex-post NE, play s(θ), outcome]

Given a UMS game, best-replying is a strategy that gives an ex-post NE.

Get a direct-revelation, dominant-strategy IC mechanism.

Good: New way to create DSIC mechanisms.

Bad: Impossibility results limit the class of problems amenable to this approach (at least without money or limits on preferences).

Page 40: Best Reply Mechanisms

Discussion

•What is the main contribution?

1. Sufficient conditions for IC convergence of best-reply dynamics. General enough to encompass many applications, esp. BGP.

2. Bounds on time to convergence.

3. New framework for developing IC mechanisms?

Page 41: Best Reply Mechanisms

Next Steps

1.Necessary conditions for best-reply dynamics to converge? To be IC (under what definition?)?

2.Better-reply dynamics? Other types of dynamics aka algorithms? What types of dynamics are reasonable or “natural”?

Page 42: Best Reply Mechanisms

Economists and Complexity

See recent blog post by Noam Nisan: Does complexity of equilibria matter?

Kamal Jain: “If your laptop can’t find it then neither can the market.”

Jeff Ely: “Solving the n-body problem is beyond the capabilities of the world’s smartest mathematicians. How do those rocks-for-brains planets manage to pull it off?”