Adaptively Learning Tolls to Induce Target Flows
Aaron Roth (joint work with Jon Ullman and Steven Wu)


Page 1: Adaptively Learning Tolls to Induce Target Flows

Aaron Roth
Joint work with Jon Ullman and Steven Wu


Page 3: Non-Atomic Congestion Games

• A graph $G = (V, E)$ representing a road network
• A latency function $\ell_e(\cdot)$ on each edge $e$
• Infinitely many (infinitesimally small) players
• For each commodity $i$, a mass $r_i$ of players who want to route flow between $s_i$ and $t_i$
• Actions for players are paths in the graph
• An action profile induces a multi-commodity flow $f$

Example latency functions:
$\ell_1(x) = x^2$,  $\ell_2(x) = x^3 + 3$,  $\ell_3(x) = \log x \cdot \log\log x$,  $\ell_4(x) = 2^x$,  $\ell_5(x) = 1$

Page 4: Equilibrium Flows

• "Nobody can improve their total latency by switching paths."
• Let $\mathcal{P}_i$ denote the set of $s_i$-$t_i$ paths in the graph.
• Let $\ell_P(f) = \sum_{e \in P} \ell_e(f_e)$ denote the latency of path $P$ under flow $f$.

Definition: A feasible multi-commodity flow $f$ is a Wardrop equilibrium if for every commodity $i$ with $r_i > 0$, and for all paths $P, P' \in \mathcal{P}_i$ with $f_P > 0$:  $\ell_P(f) \le \ell_{P'}(f)$.

Page 5: Routing games are potential games

• Equilibrium flows minimize the following potential function, among all feasible multi-commodity flows:

$$\Phi(f) = \sum_{e \in E} \int_0^{f_e} \ell_e(x)\, dx$$

• Convex so long as the $\ell_e$ are non-decreasing
• Strongly convex if the $\ell_e$ are strictly increasing, in which case the equilibrium is unique
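An illustrative sketch (not from the talk) of the potential-minimization view: it computes the Wardrop equilibrium of a tiny two-edge network by minimizing $\Phi$. The network, the latencies $\ell_1(x) = x$ and $\ell_2(x) = 1$, and the grid search (standing in for a proper convex solver) are all my assumed choices.

```python
# Illustrative sketch: compute the Wardrop equilibrium of a two-edge network by
# minimizing the potential  Phi(f) = sum_e integral_0^{f_e} ell_e(x) dx.
# Assumed example: one unit of flow from s to t over two parallel edges with
# ell_1(x) = x and ell_2(x) = 1; a feasible flow is described by x = flow on edge 1.

import numpy as np

def potential(x):
    # integral_0^x ell_1 = x^2 / 2,   integral_0^{1-x} ell_2 = (1 - x)
    return x**2 / 2 + (1 - x)

# Minimize the (convex) potential over the feasible set [0, 1] by grid search.
grid = np.linspace(0.0, 1.0, 100001)
x_eq = grid[np.argmin(potential(grid))]

print(f"equilibrium flow on edge 1: {x_eq:.2f}")         # 1.00: everyone takes edge 1
print(f"path latencies at equilibrium: {x_eq:.2f} vs 1")  # equal, as Wardrop requires
```

At the minimizer all flow uses edge 1, where the latency equals that of the alternative path, so no player can improve by switching.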

Page 6: Manipulating equilibrium flow (classic problem)

• Suppose you can set a toll $t_e \ge 0$ on each edge $e$.
• The potential function becomes $\Phi_t(f) = \sum_{e \in E} \left( \int_0^{f_e} \ell_e(x)\, dx + t_e f_e \right)$
• This changes the equilibrium flow.
• Goal: set tolls to induce some target flow $g$ in equilibrium
  • E.g. the socially optimal flow
  • Always possible
  • Computationally tractable
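Continuing the same assumed example (my illustration, not from the talk): adding a toll to the potential shifts the equilibrium, and the toll $t_1 = 1/2$ is enough to induce the socially optimal half-and-half split as the new equilibrium.

```python
# Illustrative sketch continued: the tolled potential is
#     Phi_t(f) = sum_e ( integral_0^{f_e} ell_e(x) dx + t_e * f_e ).
# With tolls t = (0.5, 0), the equilibrium of the two-edge example moves from
# (1, 0) to the socially optimal target flow g = (0.5, 0.5).

import numpy as np

def tolled_potential(x, t1, t2):
    # x = flow on edge 1,  1 - x = flow on edge 2
    return (x**2 / 2 + t1 * x) + (1 - x) * (1 + t2)

grid = np.linspace(0.0, 1.0, 100001)
for t1 in (0.0, 0.5):
    x_eq = grid[np.argmin(tolled_potential(grid, t1, 0.0))]
    print(f"toll on edge 1 = {t1:.1f}  ->  equilibrium flow on edge 1 = {x_eq:.2f}")
# toll 0.0 -> 1.00,   toll 0.5 -> 0.50
```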

Page 7: A Natural Problem [Bhaskar, Ligett, Schulman, Swamy FOCS 14]

• You don't know the latency functions...
• But you have the power to set tolls and see what happens.
• Agents play the equilibrium flow given your tolls.
  • i.e. you have access to an oracle that takes a toll vector $t$ and returns $f(t)$, the equilibrium flow given $t$.
• Want to learn tolls that induce some target flow $g$ in polynomially many rounds.

Page 8: The [BLSS] solution in a nutshell

• Assume the latency functions are convex polynomials of fixed degree; the only unknowns are the coefficients (e.g. $\ell_e(x) = a_e x^2 + b_e x + c_e$).
• Write down a convex program to solve for coefficients and tolls that induce the target flow, using the Ellipsoid method.
• Every day, use the tolls at the centroid of the current ellipsoid.
• If the induced flow is not the target flow, you can find a separating hyperplane.
• So: the number of rounds to convergence is at most the running time of Ellipsoid.

Page 9: The [BLSS] solution in a nutshell

• Very neat! (Read the paper!)
• A couple of limitations:
  • Latency functions must have a simple, known form: the only unknowns are a small number of coefficients.
  • Latency functions must be convex.
  • Heavy machinery: have to run Ellipsoid. Computationally intensive. Centralized.

Page 10: Some desiderata

• Remove assumptions on latency functions:
  • No known parametric form
  • Not necessarily convex
  • Not necessarily Lipschitz
• Make the update elementary
  • Ideally decentralized

Page 11: Proposed algorithm: Tâtonnement

1. Initialize $t_e^1 = 0$ for all edges $e$. Let $j = 1$.
2. Observe the equilibrium flow $f^j = f(t^j)$.
3. While $f^j$ is not (approximately) the target flow $g$:
   1. For each edge $e$ set: $t_e^{j+1} = t_e^j + \eta\,(f_e^j - g_e)$, projected onto the allowed toll range.
   2. Set $j = j + 1$ and observe $f^j = f(t^j)$.

• Natural, simple, distributed.
• Why should it work? (A runnable sketch of this loop appears after this slide.)
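A runnable sketch of the loop above on the same assumed two-edge example (my code, not the authors'): the equilibrium oracle $f(t)$ is implemented by minimizing the tolled potential, and the step size $\eta$ and toll cap are illustrative choices.

```python
# Tatonnement sketch (illustrative): raise the toll on edges carrying more flow
# than the target, lower it on edges carrying less, and let players re-equilibrate.
# Assumed network: two parallel s-t edges, ell_1(x) = x, ell_2(x) = 1, total mass 1.
# Target flow g = (0.5, 0.5), the socially optimal split.

import numpy as np

def equilibrium_oracle(t):
    """Return the equilibrium flow f(t): minimize the tolled potential by grid search."""
    x = np.linspace(0.0, 1.0, 100001)                    # candidate flow on edge 1
    phi = x**2 / 2 + t[0] * x + (1 - x) * (1 + t[1])     # tolled potential
    x_eq = x[np.argmin(phi)]
    return np.array([x_eq, 1 - x_eq])

g = np.array([0.5, 0.5])        # target flow
t = np.zeros(2)                 # initial tolls
eta, t_max = 0.2, 10.0          # step size and toll cap (illustrative)

for j in range(60):
    f = equilibrium_oracle(t)                         # observe today's flow
    t = np.clip(t + eta * (f - g), 0.0, t_max)        # tatonnement / projected GD step

print("final tolls:  ", np.round(t, 3))                       # approx [0.5, 0.0]
print("induced flow: ", np.round(equilibrium_oracle(t), 3))   # approx [0.5, 0.5]
```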

Page 12: Reverse engineering why it should work

• View the interaction as repeated play of a game between a toll player and a flow player.
• What game are they playing?
  • The flow player's strategies are feasible flows; the toll player's strategies are tolls in some bounded range.
  • What are their cost functions?
• How are they playing it?

Page 13: Behavior of the flow player is clear

• Every day, the flow player plays the equilibrium flow given the current tolls:

$$f^j = f(t^j) = \arg\min_{\text{feasible } f} \; \Phi_{t^j}(f)$$

• If we define the flow player's cost to be:

$$c_F(f, t) = \sum_{e \in E} \left( \int_0^{f_e} \ell_e(x)\, dx \; + \; t_e f_e \right)$$

then the flow player is playing a best response every day.

Page 14: Behavior of the toll player?

• Consistent with playing online (projected) gradient descent with the loss function:

$$L_j(t) = \sum_{e \in E} t_e\,(g_e - f_e^j)$$

• whose gradient $(g_e - f_e^j)_{e \in E}$ is the gradient, in $t$, of the cost function:

$$c_T(t, f) = \sum_{e \in E} t_e\,(g_e - f_e)$$
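Spelled out in my notation (with $\Pi$ the projection onto the allowed toll range), one step of projected online gradient descent on the loss $L_j$ is exactly the toll update from the algorithm on Page 11:

$$t_e^{j+1} \;=\; \Pi\!\left[\, t_e^{j} - \eta\,\frac{\partial L_j}{\partial t_e}(t^{j}) \right] \;=\; \Pi\!\left[\, t_e^{j} - \eta\,(g_e - f_e^{j}) \right] \;=\; \Pi\!\left[\, t_e^{j} + \eta\,(f_e^{j} - g_e) \right].$$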

Page 15: So the algorithm is consistent with...

• Repeated play of the following game:
  • The flow player chooses a feasible flow $f$ and pays $c_F(f, t)$.
  • The toll player chooses tolls $t$ in a bounded range and pays $c_T(t, f)$.
• Where, in rounds:
  • The toll player updates his strategy with online gradient descent, and
  • The flow player best responds.

Page 16: Questions

• Does the equilibrium of this game correspond to the target flow?
• Does this repeated play converge (quickly) to equilibrium?

Page 17: Questions

• Does the equilibrium of this game correspond to the target flow? Yes.
• Suppose tolls $t$ induce the target flow, i.e. $f(t) = g$. Then $(t, g)$ is a Nash equilibrium: the flow player is best responding by definition, and the toll player's cost $\sum_e t_e (g_e - f_e) = 0$ no matter which tolls he plays, so $t$ is a best response too.
• Conversely, suppose $(t, f)$ is a Nash equilibrium. Then $f_e \le g_e$ for every edge $e$: otherwise there is an edge with $f_e > g_e$, and the toll player would set $t_e$ to the maximum allowed toll.
• Thinking a little bit harder, the edges with $f_e < g_e$ can be handled similarly, so at equilibrium $f = g$.

Page 18: Questions

• Does this repeated play converge (quickly) to equilibrium?
• It does in zero-sum games! [Freund and Schapire '96]
  • The actual strategies of the gradient-descent player, together with the empirical average of the best responder's strategies, converge to approximate min-max play.
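The fact being invoked, stated informally in my notation for a zero-sum payoff $C(t, f)$ that is linear in $t$ and concave in $f$ (as it will be here): if the $t$-player's total regret after $T$ rounds is $R(T)$ and the $f$-player best responds every round, then the $t$-player's time-averaged strategy $\bar t$ satisfies

$$\max_{f}\; C(\bar t, f) \;\le\; \min_{t}\max_{f}\; C(t, f) \;+\; \frac{R(T)}{T},$$

i.e. $\bar t$ is an $R(T)/T$-approximate min-max strategy, and a symmetric statement holds for the empirical average of the best responder's play.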

Page 19: So it converges in a zero sum game

Page 20: Strategic Equivalence

• Adding a strategy-independent term to a player's cost function does not change that player's best-response function
• ...and so doesn't change the equilibria of the game.

Page 21: So it converges in a zero sum game

• Add the strategy-independent term $-\sum_{e \in E} g_e t_e$ to the flow player's cost (it does not depend on the flow player's strategy $f$).

Page 22: So it converges in a zero sum game

• Add $-\sum_{e \in E} g_e t_e$ to the flow player's cost, and $-\sum_{e \in E} \int_0^{f_e} \ell_e(x)\, dx$ to the toll player's cost; each added term is independent of that player's own strategy, so best responses and equilibria are unchanged.
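Writing out the transformed costs (my reconstruction from the terms shown on this slide and the cost functions on Pages 13-15): the flow player's cost becomes the negative of the toll player's, so the game is zero sum with payoff

$$C(t, f) \;=\; \sum_{e \in E} t_e\,(g_e - f_e) \;-\; \sum_{e \in E} \int_0^{f_e} \ell_e(x)\, dx,$$

which the toll player minimizes and the flow player maximizes.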


Page 24: So the dynamics converge!

• So by the regret bound of online gradient descent: the toll player reaches an $\epsilon$-approximate min-max strategy in $T$ rounds, for $\epsilon = O(1/\sqrt{T})$ (hiding the dependence on the toll bound and the size of the network).
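For reference, the standard bound being used (my notation: $D$ bounds the diameter of the toll space, $G$ bounds the gradient norms $\|g - f^j\|$): projected online gradient descent with step size $\eta = D/(G\sqrt{T})$ guarantees

$$\sum_{j=1}^{T} L_j(t^{j}) \;-\; \min_{t}\,\sum_{j=1}^{T} L_j(t) \;\le\; D\,G\,\sqrt{T},$$

so dividing by $T$ and applying the zero-sum argument above, the toll player's play is an $\epsilon$-approximate min-max strategy with $\epsilon = DG/\sqrt{T}$.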

Page 25: Do approximate min-max tolls guarantee an approximately target flow?

• Yes! Recall that if the latency functions are strictly increasing, then the potential $\Phi_t(f)$ is strongly convex in $f$ for all $t$, so an approximate minimizer is close to the exact equilibrium flow.

Page 26: Upshot

So: for any fixed class of latency functions such that:
1. Each $\ell_e$ is bounded in a fixed range, and
2. Each $\ell_e$ is strictly increasing,

this simple process results in a flow $f^T$ that is within $\epsilon$ of the target flow $g$ after polynomially many rounds.
• Can get better bounds with further assumptions
  • E.g. the $\ell_e$ are Lipschitz continuous

Page 27: Questions

• Exact convergence without assumptions on latency functions?
• Extensions to other games?
  • Seem to crucially use the fact that equilibrium is the solution to a convex optimization problem...

Page 28: Thank You!