Mechanism design for computationally limited agents
Tuomas Sandholm
Computer Science Department
Carnegie Mellon University
Outline
• Part I: Limited deliberation to determine valuations: A study of common auction mechanisms
• Part II: Limited deliberation to determine valuations: Designing new mechanisms
• Part III: Other ideas for mechanism design for computationally limited agents
Part I: Limited deliberation to determine valuations: A study of
common auction mechanisms
Bidders may need to compute their valuations for (bundles of) goods
• In many applications, e.g.
– Vehicle routing problem in transportation exchanges
– Manufacturing scheduling problem in procurement
• Value of a bundle of items (tasks, resources, etc) =
value of solution with those items - value of solution without them
• Information gathering fits the model as well
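The “value with minus value without” definition above can be sketched concretely. The following toy example is hypothetical: a brute-force routing solver stands in for a real vehicle-routing engine, and the numbers are illustrative only.

```python
from itertools import permutations

def route_cost(points):
    """Cost of the cheapest tour from a depot at (0, 0) through all points
    (brute force; only feasible for tiny instances)."""
    depot = (0, 0)
    if not points:
        return 0.0
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    best = float("inf")
    for order in permutations(points):
        stops = [depot, *order, depot]
        best = min(best, sum(dist(stops[i], stops[i + 1])
                             for i in range(len(stops) - 1)))
    return best

def bundle_value(current_tasks, bundle, revenue_per_task):
    """Value of a bundle of delivery tasks = profit of the optimal solution
    with the bundle minus profit of the optimal solution without it."""
    profit_without = len(current_tasks) * revenue_per_task - route_cost(current_tasks)
    with_tasks = current_tasks + bundle
    profit_with = len(with_tasks) * revenue_per_task - route_cost(with_tasks)
    return profit_with - profit_without
```

Even this toy version shows why valuation determination is computationally hard: each bundle query requires solving two optimization problems, and the underlying routing problem is NP-hard at realistic sizes.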
Software agents for auctions
• Software agents exist that bid on behalf of the user
• We want to enable agents to not only bid in auctions, but also determine the valuations of the items
• Agents use computational resources to compute valuations
• Valuation determination can involve computing on NP-complete problems (scheduling, vehicle routing, etc.)
• Optimal solutions may not be possible to determine due to limitations in agents’ computational abilities (i.e., agents have bounded rationality)
Recall
• A bidder in an auction can pay cost c to find out his own valuation => Vickrey auction ceases to have a dominant strategy [Sandholm ICMAS-96, International J. of Electronic Commerce 2000]
– Same model studied in “information acquisition in auctions” [Compte and Jehiel 01, Rezende 02, Rasmussen 03]
Bounded rationality
• Work in economics has largely focused on descriptive models
• Some models based on limited memory in repeated games [Papadimitriou, Rubinstein, …]
• Some AI work has focused on models that prescribe how computationally limited agents should behave [Horvitz; Russell & Wefald; Zilberstein & Russell; Sandholm & Lesser; Hansen & Zilberstein, …]
– Simplifying assumptions
• Myopic deliberation control
• Asymptotic notions of bounded optimality
• Conditioning on performance but not path of an algorithm
• Simplifications can work well in single-agent settings, but any deviation from full normativity can be catastrophic in multiagent settings
Incorporate deliberation (computing) actions into agents’ strategies => deliberation equilibrium
Normative control of deliberation
• In our setting agents have
– Limited computing, or
– Costly computing
• Agents must decide how to use their limited resources in an efficient manner
• Agents have anytime algorithms and use performance profiles to control their deliberation
Anytime algorithms can be used to approximate valuations
• Solution improves over time
• Can usually “solve” much larger problem instances than complete algorithms can
• Allow trading off computing time against quality
– Decision is not just which bundles to evaluate, but how carefully
• Examples
– Iterative refinement algorithms: Local search, simulated annealing
– Search algorithms: Depth first search, branch and bound
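A minimal sketch of the anytime idea, using a hypothetical example (local search for two-way number partitioning): the quality history is monotone, so the run can be cut off at any time and still yields the best solution found so far.

```python
import random

def anytime_partition(nums, steps, seed=0):
    """Anytime local search for 2-way number partitioning.
    Returns the history of the best imbalance (|sum A - sum B|) after each
    step; the run can be stopped at any point with a valid solution."""
    rng = random.Random(seed)
    side = [rng.random() < 0.5 for _ in nums]
    def imbalance():
        a = sum(n for n, s in zip(nums, side) if s)
        return abs(sum(nums) - 2 * a)
    best = imbalance()
    history = [best]
    for _ in range(steps):
        i = rng.randrange(len(nums))
        side[i] = not side[i]          # try flipping one element
        cur = imbalance()
        if cur <= best:
            best = cur                 # keep the (weakly) improving move
        else:
            side[i] = not side[i]      # undo the worsening move
        history.append(best)
    return history
```

Because `history` never increases, stopping at step t always gives the best solution seen so far: exactly the quality-versus-time tradeoff that a performance profile then summarizes statistically.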
Performance profiles of anytime algorithms
• Statistical performance profiles characterize the quality of an algorithm’s output as a function of computing time
• There are different ways of representing performance profiles
– Earlier methods were not normative: they do not capture all the possible ways an agent can control its deliberation
• Can be satisfactory in single-agent settings, but catastrophic in multiagent systems
Performance profiles
[Figure: a deterministic performance profile vs. a profile with variance introduced by different problem instances. Axes: computing time (x), solution quality (y); the optimum is marked. This representation ignores conditioning on the path.] [Horvitz 87, 89, Dean & Boddy 89]
Table-based representation of uncertainty in performance profiles
[Figure: a table mapping computing time (columns) and solution quality (rows) to probabilities, e.g. entries such as .08, .19, .24 for each time step.] [Zilberstein & Russell IJCAI-91, AIJ-96]
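With a table representation, a simple deliberation-control rule can be sketched: keep paying for computing steps while the expected one-step quality gain exceeds the cost. This is exactly the kind of myopic simplification the slides criticize, and the profile numbers below are hypothetical.

```python
def expected_quality(profile, t):
    """Expected solution quality after t computing steps, where profile[t]
    maps each possible quality level to its probability."""
    return sum(q * p for q, p in profile[t].items())

def myopic_stop_time(profile, cost_per_step):
    """Myopic rule: compute while the expected quality gain of the next
    step exceeds its cost. Adequate in some single-agent settings, but
    not normative: it ignores the path and multi-step lookahead."""
    t = 0
    while (t + 1 < len(profile) and
           expected_quality(profile, t + 1) - expected_quality(profile, t) > cost_per_step):
        t += 1
    return t

# Hypothetical table profile: step -> {quality: probability}
profile = [{0: 1.0}, {2: 0.5, 4: 0.5}, {4: 0.5, 5: 0.5}, {5: 1.0}]
```

With cost 1 per step, the expected qualities are 0, 3, 4.5, 5, so the marginal gains are 3, 1.5, 0.5 and the rule stops after two steps.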
Conditioning on solution quality so far [Hansen & Zilberstein AAAI-96]
Performance profile tree [Larson & Sandholm TARK-01]
• Normative
– Allows conditioning on path of solution quality
– Allows conditioning on path of other solution features
– Allows conditioning on problem instance features (different trees to be used for different classes)
• Constructed from statistics on earlier runs
[Figure: performance profile tree. Nodes are labeled with solution qualities (e.g. 0, 2, 3, 4, 5, 6, 10, 15, 20); an edge from node A to a child B carries the conditional probability P(B|A) of observing B’s solution quality next.]
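The tree supports stopping decisions that condition on the path observed so far. The sketch below is hypothetical (qualities, probabilities, and per-step cost are made up): after observing different intermediate nodes, the continue/stop decision can differ even at the same depth, which a deterministic or table profile cannot express.

```python
from dataclasses import dataclass, field

@dataclass
class PPNode:
    """Performance profile tree node: the solution quality observed here,
    plus (probability, child) pairs for the next observation."""
    quality: float
    children: list = field(default_factory=list)

def best_value(node, cost_per_step):
    """Value of optimal stopping from this node: either stop with the
    current quality, or pay for one more step and continue optimally."""
    if not node.children:
        return node.quality
    go = sum(p * best_value(c, cost_per_step) for p, c in node.children) - cost_per_step
    return max(node.quality, go)

def should_continue(node, cost_per_step):
    """Continue iff another costly step beats stopping now, conditioned on
    the path that led to this node."""
    if not node.children:
        return False
    go = sum(p * best_value(c, cost_per_step) for p, c in node.children) - cost_per_step
    return go > node.quality

# Hypothetical tree: two intermediate observations (quality 3 vs. 5) lead
# to different optimal decisions at the same depth and cost.
node_a = PPNode(3, [(0.5, PPNode(4)), (0.5, PPNode(8))])
node_b = PPNode(5, [(1.0, PPNode(6))])
root = PPNode(0, [(0.6, node_a), (0.4, node_b)])
```

At cost 1.5 per step, continuing from `node_a` is worth 0.5·4 + 0.5·8 − 1.5 = 4.5 > 3, while continuing from `node_b` is worth 6 − 1.5 = 4.5 < 5: same depth, opposite decisions.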
Performance profile tree…
• Can be augmented to model
– Randomized algorithms
– Agent not knowing which algorithms others are using
– Agent having uncertainty about others’ problem instances
• Agent can emulate different scenarios of others
[Figure: augmented performance profile tree with random nodes (edges labeled p(0), p(1)) in addition to value nodes labeled with solution qualities.]
Our results hold in this augmented setting
Roles of computing
• Computing by an agent
– Improves the solution to the agent’s own problem(s)
– Reduces uncertainty as to what future computing steps will yield
– Improves the agent’s knowledge about others’ valuations
– Improves the agent’s knowledge about what problems others may have computed on and what solutions others may have obtained
• Our results apply to different settings
– Computing increases the valuation (reduces cost)
– Computing refines the valuation estimate
Strategic computing [Larson & Sandholm]
• Good estimates of the other bidders’ valuations can allow an agent to tailor its bids to achieve higher utility
• Definition. Strong strategic computing: Agent uses some of its deliberation resources to compute on others’ problems
• Definition. Weak strategic computing: Agent uses information from others’ performance profiles
• How an agent should allocate its computation (based on results it has obtained so far) can depend on how others allocate their computation
– Deliberation equilibrium
Theorems on strategic computing
                                           Counter-speculation   Strategic computing?
Auction mechanism                          by rational agents?   Limited computing   Costly computing
Single item for sale:
  First price sealed-bid                   yes                   yes                 yes
  Dutch (1st price descending)             yes                   yes                 yes
  Vickrey (2nd price sealed bid)           no                    no                  no
  English (1st price ascending)            no                    yes                 yes
Multiple items for sale:
  Generalized Vickrey                      no                    yes                 yes
  (On which <bidder, bundle> pair to allocate the next computation step?)
If performance profiles are deterministic, only weak strategic computing can occur
New normative deliberation control method uncovered a new phenomenon
Costly computing in English auctions
• Dominant strategy mechanism for rational bidders
• Thrm: If at most one performance profile is stochastic, no strong strategic computing occurs in equilibrium
• Thrm: If at least two performance profiles are stochastic, strong strategic computing can occur in equilibrium
– Despite the fact that agents learn about others’ valuations by waiting and observing others’ bids
– Passing & restarting computation during the auction is allowed
– Proof. Consider an auction with two bidders:
• Agent 1 can compute for free
• Agent 2 incurs cost 1 for each computing step
Performance profiles of the proof
[Figure: two one-step performance profiles, each starting at quality 0. Agent 1’s problem yields valuation high1 with probability p(high1) and low1 otherwise; agent 2’s problem yields high2 with probability p(high2) and low2 otherwise, where low2 < low1 < high2 < high1.]
Since computing one step on 2’s problem does not yield any information, we can treat computing for two steps on 2’s problem atomically
Proof continued…
• Agent 1 has a dominant strategy:
– Compute only on own problem & increment bid whenever
• Agent 1 does not have the highest bid and
• Highest bid is lower than agent 1’s valuation
• Agent 2’s strategy:
– CASE 1: bid1 > low1
• Agent 2 knows that agent 1 has valuation high1
• Agent 2 cannot win, and thus has no incentive to compute or bid
– CASE 2: bid1 < low2
• Agent 2 continues to increment its own bid
• No need to compute since it knows that its valuation is at least low2
– CASE 3: low2 ≤ bid1 ≤ low1
• Agent 2’s strategy depends on the performance profiles
Decision problem of agent 2 in CASE 3
[Figure: agent 2’s decision tree for CASE 3, with agent 2’s utilities at the leaves and low2 < low1 < high2 < high1. From the decision node, agent 2 can Withdraw (utility 0), Bid, Compute on 2’s problem, or Compute on 1’s problem. Chance nodes resolve agent 1’s performance profile (high1 vs. low1) and agent 2’s (high2 vs. low2). Each computing step on 1’s problem costs 1 and the two steps on 2’s problem cost 2, so leaf utilities take forms such as high2 − low1 − 2, low2 − low1 − 2, high2 − low1 − 3, low2 − high1 − 3, −1, −2, −3.]
Under what conditions does strong strategic computing occur?
[Figure: region plot over the unit square — x-axis: probability that agent 1 will have its high valuation; y-axis: probability that agent 2 will have its high valuation — showing for which probability pairs strong strategic computing occurs in equilibrium.]
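The region can be illustrated numerically with a stylized version of agent 2’s CASE 3 decision. Everything below is hypothetical: the valuation numbers, the assumption that the winner pays roughly low1 when agent 1 turns out low, and the restriction to three of the tree’s options (bidding blind is omitted). The cost structure mirrors the proof: 2 steps to learn one’s own valuation, 1 step to learn agent 1’s type.

```python
def best_action(p1, p2, low1=2, high2=8):
    """Agent 2's best option in a stylized CASE 3. Hypothetical values
    satisfying low2 < low1 < high2 < high1 (here low1 = 2, high2 = 8).
    p1 = P(agent 1's valuation is high1), p2 = P(agent 2's is high2)."""
    gain = high2 - low1   # agent 2's surplus if it wins at a price of ~low1
    eu = {
        # Withdraw immediately: no computing cost, no surplus.
        "withdraw": 0.0,
        # Spend 2 steps on own problem, then bid iff valuation is high2;
        # agent 2 wins only when agent 1 turns out to have low1.
        "compute_on_own": p2 * (1 - p1) * gain - 2,
        # Strong strategic computing: spend 1 step to learn agent 1's type;
        # withdraw if high1, else spend 2 more steps on own problem and
        # bid iff the valuation is high2.
        "compute_on_1s": -p1 + (1 - p1) * (p2 * gain - 3),
    }
    return max(eu, key=eu.get)
```

In this sketch, computing on agent 1’s problem first pays off when p1 is moderately high (learning agent 1’s type cheaply avoids wasting two steps), computing on one’s own problem wins when p1 is low, and withdrawing wins when agent 1 is almost surely strong.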
Other variants we have solved
• Agents cannot pass on computing during the auction & continue computing later during the auction
– Can make a difference in English auctions with costly computing, but strong strategic computing is still possible in equilibrium
• Agents can/cannot compute after the auction
• 2-agent bargaining (again with performance profile trees)
– Larson, K. and Sandholm, T. 2001. Bargaining with Limited Computation: Deliberation Equilibrium. Artificial Intelligence, 132(2), 183-217.
– Larson, K. and Sandholm, T. 2002. An Alternating Offers Bargaining Model for Computationally Limited Agents. In Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), Bologna, Italy, July.
Conclusions
• Software agents participating in auctions may need to compute valuations under computational limitations
– This adds other possibilities to the agents’ strategies
• Modeled computing normatively as part of each agent’s strategy
– Deliberation equilibrium
– Showed under which auction mechanisms and which models of bounded rationality strategic computing can/cannot occur
• Deliberation resources may be used strategically
– Strong strategic vs. weak strategic computing
– Deep interaction between incentives and computing
• Dominant strategy mechanisms can become strategy-prone
– Even the English auction with costly computing
The future?
• In many B2B settings, automated bidders can compute valuations dynamically faster than humans
• Some future research directions
– Using our deliberation control method in systems
• Manufacturing planning, networks, …
– New (market) mechanisms
• Game-theoretically engineered to work well under (different) models of bounded rationality
• Our results show that even the most common mechanism design principles (e.g., revelation principle) cease to hold
• Our normative deliberation control method = basis for new design principles?
Part II: Limited deliberation to determine valuations:
Designing new mechanisms
[Larson & Sandholm AAMAS-05]
Mechanism desiderata
• Preference formation-independent
– Mechanism should not be involved in agents’ preference formation process (otherwise the revelation principle applies trivially)
• Deliberation-proof
– In equilibrium, no agent should have incentive to strategically deliberate
• Non-misleading
– In equilibrium, no agent should follow a strategy that causes others to believe that its true preferences are impossible
Proof sketch. Given any outcome function, it is always possible to construct an example where agents are best off knowing the valuation of another agent
Indirect/multi-step mechanisms provide information to agents
– Example: Ascending auction
• Information revealed to bidders: at price p there are k bidders remaining in the auction
Is it possible to satisfy the three desiderata via a multi-stage mechanism?
• Thm: There does not exist any strategy-dependent, preference-formation independent mechanism that is both
– deliberation proof, and
– non-misleading
• Proof sketch. Look at information sets in the game induced by the indirect mechanism
– Case 1: Game does not provide enough information to stop strategic deliberation (ascending auction)
– Case 2: Game does provide enough information BUT agents play a signaling game
• Pooling equilibria (misleading)
Future work
• Overcoming the impossibility result by relaxing the properties
– Encourage strategic deliberation
• Incentives for agents to share information?
– Relax preference-formation independent property
• Mechanism guides deliberation
Part III: Other ideas for mechanism design for computationally limited
agents
Recall from last lecture
• With computationally limited agents, a non-truthful mechanism can be better than a truth-promoting one
– [Conitzer & Sandholm: “Computational Criticisms of the Revelation Principle”, 2003]
2nd-chance mechanism [in paper “Computationally Feasible VCG Mechanisms” by Nisan & Ronen, EC-00]
• (Interesting unrelated fact: Any VCG mechanism that is maximal in range is incentive compatible)
• Observation: only way an agent can improve its utility in a VCG mechanism where an approximation algorithm is used is by helping the algorithm find a higher-welfare allocation
• Second-chance mechanism: let each agent i submit a valuation function vi and an appeal function li: V -> V. The mechanism (using algorithm k) computes k(v), k(l1(v)), k(l2(v)), …, k(ln(v)) and picks among those the allocation that maximizes welfare. Pricing is based on the unappealed v.
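A minimal single-item sketch of this idea. The approximation rule `k_half` and the numbers are hypothetical, and the VCG-style payments based on the unappealed reports are omitted: the point is only that an agent’s appeal can steer the approximation algorithm to a better allocation, while welfare is always measured under the original reports.

```python
def second_chance(v, appeals, k):
    """v[i] is agent i's reported value for a single item; appeals is a
    list of functions l_i mapping report vectors to report vectors; k is
    an approximation algorithm returning a winning agent. The mechanism
    runs k on the reports and on every appealed version, then keeps the
    candidate winner with the highest *reported* value."""
    candidates = [k(v)] + [k(l(v)) for l in appeals]
    return max(candidates, key=lambda i: v[i])

def k_half(v):
    """A deliberately weak approximation rule: award the item to the first
    agent whose report is at least half of the maximum report."""
    top = max(v)
    return next(i for i, x in enumerate(v) if x >= top / 2)
```

With reports [5, 9], `k_half` alone picks agent 0; an appeal that zeroes out everything but the maximum lets `k_half` find agent 1, and the mechanism keeps that higher-welfare allocation. The only way an appeal helps its submitter is by helping the algorithm find a better outcome, which is the observation the mechanism is built on.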
Work based on the assumption that agents can only solve problems that
are worst-case polynomial time
• Bartholdi, Tovey, and Trick. 1989. The Computational Difficulty of Manipulating an Election. Social Choice and Welfare.
• Bartholdi and Orlin. 1991. Single Transferable Vote Resists Strategic Voting. Social Choice and Welfare.
• Nisan and Ronen. 2000. Computationally Feasible VCG Mechanisms. EC-00.
• O’Connell and Stearns. 2000. Polynomial Time Mechanisms for Collective Decision Making. SUNYA-CS-00-1.
• Conitzer, V. and Sandholm, T. 2002. Complexity of Manipulating Elections with Few Candidates. National Conference on Artificial Intelligence (AAAI).
• Conitzer, V. and Sandholm, T. 2003. Universal Voting Protocol Tweaks to Make Manipulation Hard. International Joint Conference on Artificial Intelligence (IJCAI).
• Conitzer, V. and Sandholm, T. 2003. How Many Candidates Are Needed to Make Elections Hard to Manipulate? Conference on Theoretical Aspects of Rationality and Knowledge (TARK).
• …