Exploring Bidding Strategies for Market-Based Scheduling
DESCRIPTION
Exploring Bidding Strategies for Market-Based Scheduling. Daniel M. Reeves, Michael P. Wellman, Jeffrey K. MacKie-Mason, Anna Osepayshvili (University of Michigan, Ann Arbor). Strategies for complex market games. Allocation problem: deciding how to assign the available resources to agents.
TRANSCRIPT
04/19/23 aRman aRtuc for CSC84200 1
Exploring Bidding Strategies for Market-Based Scheduling
Daniel M. Reeves, Michael P. Wellman, Jeffrey K. MacKie-Mason, Anna Osepayshvili (University of Michigan, Ann Arbor)
Strategies for Complex market games
Allocation problem: deciding how to assign the available resources to agents.
Centralized vs. distributed information: centralized allocation gives superior results but is not applicable because of the agents' private interests.
Resource allocation mechanism: the communication process that determines which agents get which resources, based on the messages exchanged.
Market games are difficult to solve, and even more complicated when resources are complements for some agents.
No known optimal bidding strategy in multiple item simultaneous ascending auctions.
Market based scheduling
Scheduling problem = resource allocation problem in which resources are distinguished by the time periods in which they are available, so that a schedule is an allocation of these resources over time.
Market based scheduling: a configuration of markets that allocates resources over time.
Focus on the strategic problem faced by an agent participating in a market based scheduling mechanism.
Simultaneous Auctions
Simultaneous ascending auctions work well when a competitive price equilibrium exists (in the multiple-good case, when the goods are substitutes).
But such simultaneous ascending auctions can fail badly when there are complements.
How should agents behave when faced with separate markets for complements?
Scheduling problem definition
M units of a schedulable resource (time slots), 1, …, M.
Each of N agents has a single job that can be accomplished using the resource.
Agent j's job requires λj time slots to complete; if j acquires λj time slots by deadline t, it accrues value vj(t).
Single unit: λj = 1; otherwise multiple unit. Fixed deadline: every agent has the same deadline; otherwise variable deadline.
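As a concrete reading of the definitions above, a minimal sketch in Python (the `Agent` class and its `value` method are illustrative names, not from the paper):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    """One bidder: the job needs lam slots; v maps a deadline t to the
    value v_j(t) accrued if the job completes by slot t."""
    lam: int
    v: dict

    def value(self, slots):
        """Value of holding a set of slot indices: the job completes at the
        lam-th earliest slot held; fewer than lam slots are worth nothing."""
        if len(slots) < self.lam:
            return 0
        t = sorted(slots)[self.lam - 1]  # completion time
        return self.v.get(t, 0)
```

For example, an agent with λ = 2 that values completion by slot 2 at 8 and by slot 3 at 5 gets value 8 from holding slots {1, 2} and value 5 from {1, 3}.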
Factory Scheduling
Auction mechanism
A separate auction is run for each slot.
Bid price on slot m, βm: the highest bid made on m so far.
Ask price on slot m: βm + ε. A new bid on slot m is admissible only if it is at least the ask price.
An auction is quiescent when a round passes with no admissible bids.
Auctions proceed concurrently. When all of them are simultaneously quiescent, all close and allocate their respective slots per the last admitted bids.
Because no slot is committed until they all are, an agent’s bidding strategy on one slot cannot be contingent on the outcome for another slot.
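The closing rule above can be sketched as a round-based loop. This is a simplified illustration, not the paper's implementation; the `bids(ask, winning)` agent interface is an assumption:

```python
def run_auctions(agents, M, eps=1):
    """Simultaneous ascending auctions run to quiescence (a sketch).
    beta[m] is the current bid price on slot m; winner[m] the winning agent."""
    beta = {m: 0 for m in range(1, M + 1)}
    winner = {m: None for m in range(1, M + 1)}
    quiescent = False
    while not quiescent:
        quiescent = True  # a full round with no admissible bid closes all auctions
        for j, agent in enumerate(agents):
            winning = {m for m in beta if winner[m] == j}
            # ask price: current bid price plus increment once a bid is in place
            ask = {m: beta[m] + eps if winner[m] is not None else beta[m]
                   for m in beta}
            for m, b in agent.bids(ask, winning).items():
                if m not in winning and b >= ask[m]:  # admissible bid
                    beta[m], winner[m] = b, j
                    quiescent = False
    return beta, winner
```

With two single-slot bidders valuing one slot at 5 and 3, the price ratchets up by ε per round until the lower-value bidder drops out.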
Straightforward bidding – SB
SB takes a vector of perceived prices for the slots.
SB bids those prices for the bundle of slots that would maximize the agent's surplus if it were to win all of its bids at those prices.
If agent j is assigned a set of slots X, it accrues vj(X); if it obtains X at prices p, its surplus is σ(X, p) = vj(X) − Σm∈X pm.
If agent j was winning the set X′ in the previous round, its current perceived prices are pm = βm for m ∈ X′, and the ask price otherwise.
SB agent j bids bjm = pm for each m ∈ X*, where X* = arg maxX σ(X, p).
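The surplus maximization in SB can be written as a brute-force search over bundles. A sketch (exponential in the number of slots, fine for small M; the function names are illustrative):

```python
from itertools import combinations

def sb_bundle(value_fn, prices):
    """Return (X*, surplus) maximizing sigma(X, p) = v(X) - sum_{m in X} p_m.
    `prices` maps slot -> perceived price; the empty bundle has surplus 0."""
    best, best_surplus = frozenset(), 0
    slots = list(prices)
    for r in range(1, len(slots) + 1):
        for X in combinations(slots, r):
            s = value_fn(set(X)) - sum(prices[m] for m in X)
            if s > best_surplus:
                best, best_surplus = frozenset(X), s
    return best, best_surplus
```

An SB agent would then bid the perceived price pm on every slot in X*.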
Baseline Strategy Performance
No anticipation of other agents' strategies.
No-regret: from the agent's perspective, no bidding policy other than the current one would have been a better response to the other agents' bids.
SB is a Bayesian equilibrium for the single-unit, fixed-deadline problem.
But for multiple-unit problems, allocations can differ from the optimal by large amounts.
Baseline Strategy Performance – 2
One SB path leads to a state where agent 2 is winning both slots at prices (4, 3). In the next round agent 1 bids 4 for the second slot and the auction ends, with agent 1 receiving slot 2 and agent 2 receiving slot 1 at a price of 4. SB thus leads to a result with value 5, while the optimal allocation would have produced 8.
Agent 2 could instead have offered 5 for slot 2 and would have been better off, with a surplus of −1 rather than −4! SB is not reasonable as a general strategy.
Alternative Bidding Strategies
How to find an equilibrium strategy for the simultaneous ascending auction for simple scheduling?
The space of joint preferences is very large: an agent's preference depends on its job length plus a payoff for each of M deadlines, so the space is (M + 1) × N dimensional. The number of bidding rounds can also be quite large (small bid increments).
The strategy space is all functions mapping the Cartesian product of the space of preferences and the space of all price-quote histories into a vector of next-round bids.
Finding an optimum by enumeration is computationally infeasible. So how can an optimal strategy be derived?
With complementary slots there is a problem of exposure. Exposure problem: in order to obtain the combination it prefers, an agent must expose itself to the risk of paying for a far less desirable (or worthless) subset.
Sunk Awareness
SB ignores the OPPORTUNITY COST OF NOT BIDDING: SB agents bid as if the incremental cost of slots they are currently winning is the full price, but the incremental cost is actually zero. (The cost is sunk!)
Sunk-aware strategy: permit agents to account for the true incremental cost of slots they are currently winning.
A sunk-aware agent bids as if the incremental cost of slots it is currently winning lies on the interval between zero and the current bid price.
Agent j's perceived price for slot m is k·βm if j is winning m, and βm + ε otherwise, where k ∈ [0, 1] is the sunk-awareness parameter.
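The perceived-price rule is a one-liner. A sketch (names are illustrative):

```python
def perceived_prices(beta, winning, k, eps=1):
    """Sunk-aware perceived prices: a slot the agent currently wins is priced
    at k * beta[m] (k in [0, 1]); any other slot at the ask price beta[m] + eps.
    k = 1 treats the full price as the cost (pure SB); k = 0 treats it as sunk."""
    return {m: k * beta[m] if m in winning else beta[m] + eps for m in beta}
```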
Proposed Method
Select a set of candidate strategies and then evaluate their performance against each other through statistical simulation based on an evolutionary game.
Strategies are assigned population frequencies, and samples of agents compete against each other.
Strategies that perform well are rewarded with higher population frequencies; poor strategies are weeded out.
Generating Payoff Matrices
Estimate the payoff matrix for a restricted game. A strategy is a function that maps agent preferences plus auction information to bids.
Construct agents implementing the selected strategies and calculate the expected payoffs with respect to specified distributions from which agents are drawn.
Consider only reflex agents: they use only information from the current auction round, nothing previous.
Consider only a specific parameterized family of strategy functions (only sunk awareness for this study).
Generating Payoff Matrices – 2
An element of the matrix is an N-vector of expected payoffs associated with a particular strategy profile.
Example profile for 5 agents: {0.5, 0.5, 1, 1, 1} (one k value per agent). There is a distinct element in the payoff matrix for each possible strategy combination.
Estimate an entry with a Monte Carlo simulator: draw preferences and assign them to agents, simulate the auction for the given strategy profile to quiescence, and average the surplus.
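The Monte Carlo estimate of one matrix entry can be sketched as follows; `draw_prefs` and `simulate` stand in for the preference sampler and the auction simulator (assumed interfaces, not the paper's code):

```python
import random

def estimate_payoffs(profile, draw_prefs, simulate, trials=1000, seed=0):
    """Average surplus per agent for one strategy profile: draw preferences,
    simulate the simultaneous auctions to quiescence, accumulate surpluses."""
    rng = random.Random(seed)
    totals = [0.0] * len(profile)
    for _ in range(trials):
        prefs = draw_prefs(rng)
        for i, s in enumerate(simulate(profile, prefs)):
            totals[i] += s
    return [t / trials for t in totals]
```

Repeating this for every distinct profile fills in the restricted game's payoff matrix.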
Evolutionary Search for Equilibria
An agent population that has reached a fixed point of the replicator dynamics is a candidate (mixed-strategy) Nash equilibrium.
Every pure strategy with > 0 representatives in the fixed-state population does equally well in expectation against the others.
Iterative algorithm for finding an equilibrium: increase the proportion of well-performing strategies at the expense of the others: pg(s) ∝ pg−1(s)(EP(s) − W).
EP(s): the average payoff of s over the profiles in which it appears; W: a lower bound on payoffs.
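The update rule pg(s) ∝ pg−1(s)(EP(s) − W) can be sketched one generation at a time (W, a lower bound on payoffs, is passed in; names are illustrative):

```python
def replicator_step(p, ep, W):
    """One replicator-dynamics generation: reweight each pure strategy by its
    expected payoff above the lower bound W, then renormalize to proportions."""
    raw = {s: p[s] * (ep[s] - W) for s in p}
    total = sum(raw.values())
    return {s: raw[s] / total for s in p}
```

Iterating until the proportions are stationary yields a fixed point, the candidate equilibrium.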
Game Settings
Use Monte Carlo simulation to generate an expected payoff matrix for every combination of strategies playing against each other.
Then find Nash equilibria with: replicator dynamics (evolutionary tournament), GAMBIT (a computational game solver), and Amoeba (a function-minimization algorithm).
Solving Payoff Matrices with Gambit
GAMBIT takes the full matrix representation of a strategic-form game,
iteratively eliminates strongly dominated strategies,
and, applying a subdivision algorithm, enumerates all Nash equilibria.
It cannot take advantage of symmetry in a payoff matrix, which would reduce the number of distinct profiles drastically.
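Symmetry matters because payoffs then depend only on the multiset of strategies played, so the number of distinct profiles drops from S^N to C(N + S − 1, N). A quick check of the savings:

```python
from math import comb

def distinct_symmetric_profiles(num_strategies, num_agents):
    """Number of multisets of size N drawn from S strategies: C(N + S - 1, N)."""
    return comb(num_agents + num_strategies - 1, num_agents)
```

With 14 strategies and 2 agents this gives 105 distinct profiles (versus 14² = 196 ordered ones), matching the two-agent experiment later in the deck.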
Searching for Equilibria with Amoeba
A Nash equilibrium is a global minimum of a function of the deviation payoffs (one standard such function: f(p) = Σx max[u(x, p) − u(p, p), 0]²),
where u(x, p) is the payoff from playing strategy x against everyone else playing strategy p.
For any p ∈ NE, f(p) is zero.
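A sketch of such an objective for a symmetric game with row-payoff matrix A (this specific sum-of-squared-deviation-gains form is one standard choice consistent with the slide's description, not necessarily the paper's exact function):

```python
def deviation_objective(A, p):
    """f(p) = sum_x max(u(x, p) - u(p, p), 0)^2: zero exactly when no pure
    strategy x gains by deviating from the symmetric mixture p, i.e. at a NE.
    Amoeba (Nelder-Mead) would then minimize f over the probability simplex."""
    def u_pure(x):  # payoff of pure strategy x against mixture p
        return sum(p[t] * A[x][t] for t in range(len(p)))
    u_mix = sum(p[x] * u_pure(x) for x in range(len(p)))
    return sum(max(u_pure(x) - u_mix, 0.0) ** 2 for x in range(len(p)))
```

For the anti-coordination game A = [[0, 2], [1, 0]], the symmetric mixed equilibrium p = (2/3, 1/3) drives f to zero, while a non-equilibrium point gives a positive value.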
Replicator Dynamics and Biased Sampling
Payoff-matrix calculation and equilibrium search are done together.
Start with an initial set of population proportions for each pure strategy.
Sample from the preference distribution and iterate the auction to quiescence (strategies are randomly drawn to participate according to their population proportions, so successful strategies are sampled more often!).
After a number of samples, apply the replicator dynamics using the realized average payoffs, and iterate, calculating a sequence of new generations until the population proportions are stationary.
This accumulates a statistically precise estimate of the expected payoff matrix.
Experiments with Sunk-Awareness
k = multiples of 1/20 from 0 to 1, designated by 0, 1, …, 20 for simplicity.
Fix M, N and ε; vary the distributions for preferences:
Uniform job length: λ ~ U[1, M].
Constant job length (fixed for all j).
Exponential job lengths (drawn from an exponential distribution).
Deadline values for each slot are initialized as integers ~ U[1, 50].
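One preference draw under these settings might look like the following sketch (`draw_preferences` and its arguments are illustrative names, not the paper's code):

```python
import random

def draw_preferences(M, dist="uniform", lam_fixed=2, rng=None):
    """Draw one agent's job length under the chosen distribution, plus an
    integer deadline value ~ U[1, 50] for each of the M slots."""
    rng = rng or random.Random()
    if dist == "uniform":
        lam = rng.randint(1, M)
    elif dist == "constant":
        lam = lam_fixed
    else:  # exponential, rounded and clipped to the feasible range 1..M
        lam = min(M, max(1, round(rng.expovariate(1.0))))
    v = {t: rng.randint(1, 50) for t in range(1, M + 1)}
    return lam, v
```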
Uniform Job Length – Figure 1
Representation of the payoff matrix for the restricted game with profiles {18,18,18,18,18}, {18,18,18,18,19}, …, {20,20,20,20,20}.
Uniform Job Length – Figure 1
Uniform Job Length – Figure 2
Running the payoff matrix through the replicator dynamics,
the population evolves to all playing 20, which is a NE.
This can be deduced from the payoff matrix, where the all-20 profile scores highest.
20 is a dominant strategy for this game and the only NE.
Uniform Job Length – Figure 2
Uniform Job Length – Figure 3
Replicator dynamics converge to the same NE independent of the initial populations.
Constant Job Length – Figure 4
Fix λj = 2 for all j and consider the set of strategies {16, 17, 18, 19, 20}. Evolutionary dynamics converge to {0.745, 0.255, 0, 0, 0}: a mixed-strategy NE.
Constant Job Length – Figure 5
Fix λj = 2 for all j and consider the set of strategies {0, 8, 12, 16, 20}. Evolutionary dynamics converge to {0, 0, 0, 1, 0}, confirmed by GAMBIT.
Exponential Job Length – Figure 6
Consider the set of strategies {16, 17, 18, 19, 20}. Evolutionary dynamics converge to {0, 1, 0, 0, 0}. GAMBIT shows this equilibrium is not unique.
Varying the Number of Players: Two Agents
Exponential preferences. Investigate strategies {0, 3, 6, 8, 10, 11, …, 17, 18, 20}: 105 distinct profiles.
NE: everybody plays 15. GAMBIT: this is one of three symmetric equilibria.
Discussion
For 8 and 10 players, k = 1 is dominant. Using exponential preferences, the equilibrium k value is monotone in the number of agents N.
Sensitivity Analysis
Are the equilibria found robust, or would they change with further sampling?
There is a probability distribution for each of the expected payoffs in the payoff matrix. Sampling from these distributions independently, generate new payoff matrices and check for the equilibrium.
Several of the results presented are impervious to sampling noise!
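The resampling check can be sketched as follows, approximating each entry's sampling distribution as normal around its estimate (an assumption made for illustration; `stderr` holds hypothetical per-entry standard errors):

```python
import random

def resample_matrix(payoffs, stderr, rng):
    """Draw one perturbed payoff matrix: each expected payoff is resampled
    independently from its estimated sampling distribution; the equilibrium
    computation is then re-run on the perturbed matrix."""
    return {profile: tuple(rng.gauss(mu, stderr[profile][i])
                           for i, mu in enumerate(vec))
            for profile, vec in payoffs.items()}
```

Counting how often the same equilibrium survives across perturbed matrices gauges its robustness to sampling noise.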
Best Response to SB
We cannot derive an unrestricted characterization of equilibrium behavior in the full strategy space, but we can find restricted equilibria in selected environments by simulation.
Relax the restriction from one agent, still constraining others.
What if all except one are SB? What is the best response? Can this be characterized as a variant of SB?
Best Response to SB – 2
Even relatively simple scenarios with one or two SB agents can call for rather sophisticated bidding strategies.
Be skeptical that any simple strategy form will capture general situations where information revelation is pivotal.
Conclusion
It is difficult to draw conclusions about strategy choices in even a relatively simple simultaneous ascending auction game. No analytic methods!
Coordinating the allocation of all significantly related resources in the world through a single mechanism is infeasible. (Simultaneous auctions!)
For particular environments, it is possible to derive constrained equilibria through search.
Some pointers!
Trading Agent Competition (TAC) http://tac.eecs.umich.edu/
Wellman, Walsh, Wurman, MacKie-Mason, "Auction Protocols for Decentralized Scheduling", Games and Economic Behavior 35, 271-303 (2001)
Reeves, Wellman, MacKie-Mason, Osepayshvili, “Exploring Bidding Strategies for Market-Based Scheduling”, ACM Conference on Electronic Commerce, 2003