
©Intelligent Agent Technology and Application, 2008, Ai Lab NJU

Agent Technology

Negotiation

Sept. 2008 © Gao Yang, Ai Lab NJU

Outline

1. Introduction

2. Vote

3. Bid

4. Bargain

5. Summary


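Starting from Tian Ji's Horse Race


Tian Ji's Horse Race

The King of Qi races his horses in the order "top, middle, bottom". Grade for grade, the King's horses are better than Tian Ji's. The winner of each race gains 100 taels of gold and the loser pays 100 taels. In which order should Tian Ji field his horses?

Payoff to Tian Ji (taels of gold) when the King of Qi plays f1 = <top, middle, bottom>:

Tian Ji's order                   Payoff
f1 = <top, middle, bottom>        -300
f2 = <top, bottom, middle>        -100
f3 = <middle, top, bottom>        -100
f4 = <middle, bottom, top>        -100
f5 = <bottom, middle, top>        -100
f6 = <bottom, top, middle>        +100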
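The payoff row above can be reproduced by enumerating Tian Ji's six orderings. This is a minimal sketch: the numeric strength ranks are an assumption (King top > Tian top > King middle > Tian middle > King bottom > Tian bottom) chosen because it reproduces the payoffs on the slide; the names `KING`, `TIAN` and `tian_payoff` are illustrative.

```python
from itertools import permutations

# Assumed strength ranks (not given on the slide): each of the King's horses
# beats Tian Ji's horse of the same grade but loses to the next grade up.
KING = {"top": 6, "middle": 4, "bottom": 2}
TIAN = {"top": 5, "middle": 3, "bottom": 1}
STAKE = 100  # taels of gold won or lost per race

def tian_payoff(tian_order, king_order=("top", "middle", "bottom")):
    """Total gold Tian Ji wins (negative = loses) over the three races."""
    total = 0
    for tian_horse, king_horse in zip(tian_order, king_order):
        total += STAKE if TIAN[tian_horse] > KING[king_horse] else -STAKE
    return total

# Payoff of every ordering against the King's fixed order.
payoffs = {order: tian_payoff(order)
           for order in permutations(("top", "middle", "bottom"))}
best_order = max(payoffs, key=payoffs.get)

for order, value in payoffs.items():
    print(order, value)
print("best:", best_order)
```

Only the ordering that sacrifices the bottom horse against the King's top horse yields a positive payoff.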


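The Prisoner's Dilemma

What if the King of Qi's strategy also varies? What if we move from zero-sum to non-zero-sum games?

The prisoner's dilemma problem

– If each prisoner considers only his own interest, he will choose to confess;

– yet the best outcome of the prisoner's dilemma is for both prisoners to resist.

Payoffs are (Prisoner 1, Prisoner 2). Judging by the payoffs, "Cooperate" in this table means confessing and "Defect" means resisting.

                         Prisoner 2
                         Cooperate    Defect
Prisoner 1   Cooperate   (-9,-9)      (0,-10)
             Defect      (-10,0)      (-1,-1)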


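The Essence of the Prisoner's Dilemma

In a multi-agent system, if every agent is self-interested (maximizing its own gain), then the combination of the agents' individually optimal strategies is not necessarily the optimal strategy for the multi-agent system as a whole.

Multi-agent negotiation
– In a self-interested multi-agent system, an optimal strategy is obtained through negotiation.
– Unlike in a distributed system, the designer does not need to specify a strategy for each agent.

The underlying conflict between individual interest and collective interest

What, exactly, is an optimal strategy?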


Three Terms

– Coordination is a property of a system of agents performing some activity in a shared environment.

– Cooperation (collaboration) is coordination among nonantagonistic agents.

– Negotiation is coordination among competitive or simply self-interested agents.

Distributed rational decision making


Research Goals of Negotiation

Classical DAI

– The system designer fixes an interaction protocol which is uniform for all agents. The designer also fixes a strategy for each agent.

MAS

– The interaction protocol is given. Each agent determines its own strategy (maximizing its own good, via a utility function, without looking at the global task).


Four Different Criteria

We need to compare negotiation protocols. Each such protocol leads to a solution, so we determine how good these solutions are.

Social welfare: maximize the utility of all agents.
– Sum of all utilities.
– It requires inter-agent utility comparisons.


Pareto Efficiency

Pareto efficiency: a solution $x$ is Pareto-optimal (also called efficient) if and only if there is no other solution $x'$ with

$$(1)\ \exists\, ag:\ ut_{ag}(x') > ut_{ag}(x), \qquad (2)\ \forall\, ag:\ ut_{ag}(x') \ge ut_{ag}(x).$$

– Pareto efficiency measures global good, and it does not require questionable inter-agent utility comparisons.

– Social welfare maximizing solutions are a subset of Pareto efficient ones. (Why?)

– Answer: Once the sum of the payoffs is maximized, an agent's payoff can increase only if another agent's payoff decreases.

Stability???


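Four Different Criteria (cont.)

Nash equilibrium: an agent's best strategy depends on the other agents. A strategy profile $S^*_A = \langle S^*_1, S^*_2, \dots, S^*_{|A|} \rangle$ is a Nash equilibrium if and only if, for every agent $i$, $S^*_i$ is the best strategy for agent $i$ when the other agents choose the strategies $\langle S^*_1, S^*_2, \dots, S^*_{i-1}, S^*_{i+1}, \dots, S^*_{|A|} \rangle$.

– Two main problems in applying Nash equilibrium: some games have no pure-strategy Nash equilibrium, and some games have multiple Nash equilibria, so which one should the agents play? (See next page.)

Dominant strategy: the agent's best strategy does not depend on the other agents. Such a strategy is called dominant.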


No Pure-Strategy Nash Equilibrium


Multiple Nash Equilibria

When b > a > c > d (payoffs are (Agent 1, Agent 2)):

                       Agent 2
                       Action 1    Action 2
Agent 1   Action 1     (a,a)       (c,b)
          Action 2     (b,c)       (d,d)
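Both situations can be checked mechanically by testing, cell by cell, whether either agent gains from a unilateral deviation. The sketch below is illustrative: the numeric values for a, b, c, d are an arbitrary choice satisfying b > a > c > d, and matching pennies is a standard textbook game used here as the no-pure-equilibrium example (it does not appear on the slides).

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """payoffs[(row, col)] = (u1, u2). Return all pure-strategy Nash equilibria."""
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for row, col in product(rows, cols):
        u1, u2 = payoffs[(row, col)]
        # Agent 1 cannot gain by switching rows; agent 2 cannot gain by switching columns.
        row_is_best = all(payoffs[(other, col)][0] <= u1 for other in rows)
        col_is_best = all(payoffs[(row, other)][1] <= u2 for other in cols)
        if row_is_best and col_is_best:
            equilibria.append((row, col))
    return equilibria

# Illustrative numbers with b > a > c > d (assumed, not from the slides).
A_VAL, B_VAL, C_VAL, D_VAL = 3, 4, 2, 1
two_equilibria_game = {
    (1, 1): (A_VAL, A_VAL), (1, 2): (C_VAL, B_VAL),
    (2, 1): (B_VAL, C_VAL), (2, 2): (D_VAL, D_VAL),
}

# Matching pennies: a standard game without a pure-strategy equilibrium.
matching_pennies = {
    (1, 1): (1, -1), (1, 2): (-1, 1),
    (2, 1): (-1, 1), (2, 2): (1, -1),
}

print(pure_nash_equilibria(two_equilibria_game))
print(pure_nash_equilibria(matching_pennies))
```

In the first game the two off-diagonal cells are equilibria, which is exactly the selection problem raised above.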


Optimal Strategies for the Prisoner's Dilemma under the Different Criteria

Social welfare: Both defect.

Pareto efficiency: All outcomes are Pareto optimal, except when both cooperate.

Dominant strategy: Both cooperate.

Nash equilibrium: Both cooperate.

                         Prisoner 2
                         cooperate    defect
Prisoner 1   cooperate   (-9,-9)      (0,-10)
             defect      (-10,0)      (-1,-1)
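The four verdicts can be recomputed directly from the payoff table. This is a small sketch using the table's own labels; the function names are illustrative.

```python
from itertools import product

ACTIONS = ("cooperate", "defect")
# PAYOFF[(action of prisoner 1, action of prisoner 2)] = (u1, u2), as in the table.
PAYOFF = {
    ("cooperate", "cooperate"): (-9, -9),
    ("cooperate", "defect"): (0, -10),
    ("defect", "cooperate"): (-10, 0),
    ("defect", "defect"): (-1, -1),
}
OUTCOMES = list(product(ACTIONS, ACTIONS))

def social_welfare_optimum():
    """Outcome with the largest sum of utilities."""
    return max(OUTCOMES, key=lambda o: sum(PAYOFF[o]))

def pareto_optimal():
    """Outcomes for which no other outcome is at least as good for all and better for one."""
    def dominated(o):
        return any(
            all(x >= y for x, y in zip(PAYOFF[p], PAYOFF[o]))
            and any(x > y for x, y in zip(PAYOFF[p], PAYOFF[o]))
            for p in OUTCOMES
        )
    return [o for o in OUTCOMES if not dominated(o)]

def dominant_action(player):
    """Action that is a best reply to every action of the other player, if any."""
    def payoff(mine, other):
        outcome = (mine, other) if player == 0 else (other, mine)
        return PAYOFF[outcome][player]
    for action in ACTIONS:
        if all(payoff(action, other) >= payoff(alt, other)
               for other in ACTIONS for alt in ACTIONS):
            return action
    return None

def nash_equilibria():
    """Outcomes from which neither prisoner gains by deviating alone."""
    result = []
    for a1, a2 in OUTCOMES:
        best1 = all(PAYOFF[(a1, a2)][0] >= PAYOFF[(x, a2)][0] for x in ACTIONS)
        best2 = all(PAYOFF[(a1, a2)][1] >= PAYOFF[(a1, x)][1] for x in ACTIONS)
        if best1 and best2:
            result.append((a1, a2))
    return result

print(social_welfare_optimum(), pareto_optimal(), dominant_action(0), nash_equilibria())
```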

How to escape from PD?


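Types of Multi-Agent Systems

Cooperative multi-agent systems

– Typical example: cooperative box pushing; the agents share a common goal.

Semi-competitive multi-agent systems

– Typical example: robot navigation; the agents have different goals, but the goals do not conflict.

Competitive multi-agent systems

– Typical example: playing chess; the agents have conflicting goals.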


Three Negotiation Mechanisms

Voting

Auctions

Bargaining


Voting Mechanisms

Agents give input to a mechanism, and its outcome is taken as the solution for the agents.

Motivation: 3 candidates, 3 voters

Voter   1st   2nd   3rd
V1      A     B     C
V2      B     C     A
V3      C     A     B

Comparing A and B: majority for A. Comparing A and C: majority for C. Comparing B and C: majority for B.

Desired preference ordering: A > B > C > A ???

Nonexistence of the desired preference ordering

How to design this software?
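The cycle in the three-voter example can be confirmed with a pairwise majority count. A minimal sketch; `PREFS` and `majority_winner` are illustrative names.

```python
from itertools import combinations

# Each voter's ranking, most preferred first (from the table above).
PREFS = {
    "V1": ["A", "B", "C"],
    "V2": ["B", "C", "A"],
    "V3": ["C", "A", "B"],
}

def majority_winner(x, y, prefs):
    """Return the alternative preferred by a majority in the pair (x, y), or None on a tie."""
    votes_x = sum(1 for ranking in prefs.values()
                  if ranking.index(x) < ranking.index(y))
    votes_y = len(prefs) - votes_x
    if votes_x == votes_y:
        return None
    return x if votes_x > votes_y else y

pairwise = {(x, y): majority_winner(x, y, PREFS)
            for x, y in combinations("ABC", 2)}
print(pairwise)
```

A beats B, B beats C, and C beats A, so no transitive social ordering is consistent with all three majorities.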


Voting Mechanisms (cont.)

Let $A$ be the set of agents and $O$ the set of possible outcomes ($O$ could be equal to $A$, or a set of laws).

The vote of agent $i$ is described by a binary relation

$$\succ_i \;\subseteq\; O \times O,$$

which we assume to be asymmetric, strict and transitive. We denote by Ordering the set of all such binary relations.


Six Criteria for Voting Mechanisms

Six properties of a social choice rule

– A social preference ordering $\succ^*$ should exist for all possible inputs (individual preferences).

– $\succ^*$ should be defined for every pair $o, o' \in O$.

– $\succ^*$ should be asymmetric and transitive over $O$.

– The outcome should be Pareto efficient: if $\forall i \in A:\ o \succ_i o'$, then $o \succ^* o'$.

– The scheme should be independent of irrelevant alternatives.

– No agent should be a dictator in the sense that $o \succ_i o'$ implies $o \succ^* o'$ for all preferences of the other agents.


Arrow's Theorem

Arrow's impossibility theorem

– No social choice rule satisfies all of these six conditions.

No voting mechanism is completely fair and democratic!

Nobel Prize in Economics, 1972


Bidding for the 2010 World Cup

Candidate countries
– Egypt
– Libya
– Morocco
– Nigeria
– South Africa
– Tunisia

Voting method
– ???


Plurality Voting

Plurality protocol

– A majority voting protocol where all alternatives are compared simultaneously, and the one with the highest number of votes wins.

– It does not satisfy independence of irrelevant alternatives.

– For example:

60% of agents: $a \succ b$, and 40% of agents: $b \succ a$.
The first social choice result is a.

Introduce an irrelevant alternative c:
30% of agents: $a \succ c \succ b$, 30% of agents: $c \succ a \succ b$, and 40% of agents: $b \succ a \succ c$.
The second social choice result is b.

However, 60% of the agents still prefer a to b.
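The vote-splitting effect in this example can be replayed in a few lines. This is a sketch only: the ranking of the 40% group after c is introduced is taken as b, a, c, which is an assumption about the garbled slide; only its first choice matters for plurality.

```python
from collections import Counter

def plurality_winner(profile):
    """profile: list of (weight, ranking); every group votes for its top choice only."""
    tally = Counter()
    for weight, ranking in profile:
        tally[ranking[0]] += weight
    winner = max(tally, key=tally.get)
    return winner, dict(tally)

def share_preferring(x, y, profile):
    """Total weight of the groups that rank x above y."""
    return sum(w for w, r in profile if r.index(x) < r.index(y))

before = [(60, ["a", "b"]), (40, ["b", "a"])]
after = [(30, ["a", "c", "b"]), (30, ["c", "a", "b"]), (40, ["b", "a", "c"])]

print(plurality_winner(before))
print(plurality_winner(after))
print(share_preferring("a", "b", after))
```

Introducing c splits the former a-voters, so b wins although the pairwise comparison between a and b is unchanged.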


Binary Voting

Binary protocol

– The alternatives are voted on pairwise, and the winner stays to challenge further alternatives while the loser is eliminated.

– As with the plurality protocol, it does not satisfy independence of irrelevant alternatives.

– Further, the agenda can change the socially chosen outcome.


Binary Voting (cont.)

Binary protocol example
– 35% of agents have preferences $c \succ d \succ b \succ a$
– 33% of agents have preferences $a \succ c \succ d \succ b$
– 32% of agents have preferences $b \succ a \succ c \succ d$

[Figure: four different agendas (pairwise elimination trees) over the alternatives a, b, c and d.]
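How much the agenda matters for these preferences can be explored by running the protocol over every linear agenda. This sketch assumes the "winner stays" form described earlier (the first two alternatives are compared, and the winner meets each later alternative in turn); the tree-shaped agendas of the original figure are not reproduced.

```python
from itertools import permutations

# (percentage of agents, ranking with the most preferred alternative first)
PROFILE = [(35, "cdba"), (33, "acdb"), (32, "bacd")]

def pairwise_winner(x, y, profile):
    """Alternative supported by the larger share of agents in a two-way vote."""
    support_x = sum(w for w, r in profile if r.index(x) < r.index(y))
    support_y = sum(w for w, r in profile if r.index(y) < r.index(x))
    return x if support_x > support_y else y

def binary_protocol(agenda, profile):
    """Winner stays, loser is eliminated, following the order given by the agenda."""
    champion = agenda[0]
    for challenger in agenda[1:]:
        champion = pairwise_winner(champion, challenger, profile)
    return champion

winners = {"".join(agenda): binary_protocol(agenda, PROFILE)
           for agenda in permutations("abcd")}

for agenda in ("bdac", "cdab", "abdc", "cabd"):
    print(agenda, "->", winners[agenda])
```

With these numbers every alternative wins under some agenda, including d, which every agent ranks below c.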


Score-Based Voting

Borda protocol

– The Borda count assigns an alternative |O| points whenever it is highest in some agent's preference list, |O|-1 whenever it is second, and so on.

– The alternative with the highest count becomes the social choice.

– It can also lead to paradoxical results, for example via irrelevant alternatives.


Score-Based Voting (cont.)

Borda protocol example

                        Points
Agent   Preferences     a   b   c   d
1       a > b > c > d   4   3   2   1
2       b > c > d > a   1   4   3   2
3       c > d > a > b   2   1   4   3
4       a > b > c > d   4   3   2   1
5       b > c > d > a   1   4   3   2
6       c > d > a > b   2   1   4   3
7       a > b > c > d   4   3   2   1

Borda count: c wins with 20, b has 19, a has 18, d loses with 13.

Borda count with d removed: a wins with 15, b has 14, c loses with 13.
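The two counts can be reproduced with a short Borda implementation. A minimal sketch; `borda` and `PROFILE` are illustrative names, and removing an alternative is modelled by dropping it from every ranking before scoring.

```python
def borda(profile, alternatives):
    """Borda scores: the top of m remaining alternatives gets m points, the next m-1, ..."""
    scores = {x: 0 for x in alternatives}
    for ranking in profile:
        kept = [x for x in ranking if x in alternatives]
        for position, x in enumerate(kept):
            scores[x] += len(kept) - position
    return scores

# The seven agents' rankings from the table, most preferred first.
PROFILE = ["abcd", "bcda", "cdab", "abcd", "bcda", "cdab", "abcd"]

with_d = borda(PROFILE, "abcd")
without_d = borda(PROFILE, "abc")
print(with_d)
print(without_d)
```

Removing the last-placed alternative d reverses the order of the remaining three, which is the paradox referred to above.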


Insincere Voting

How should a social choice mechanism be designed when voting may be insincere?

If an agent can benefit from insincerely declaring his preferences, he will do so.


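Game-Theoretic Analysis of Voting

Utility
– First construct each agent's ordering structure with respect to the voters;
– a function assigns each ordering a value (between 0 and 1);
– define the loss function $d_j(o_i)$ = the Euclidean distance between outcome $o_i$ and agent $j$'s ideal allocation.

Comparing outcomes
– Centroid model
– Pareto model

Theoretical analysis based on game theory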


Auction Mechanisms

In voting, the protocol designer is assumed to want to enhance the social good.

In auctions, by contrast, the auctioneer wants to maximize his own profit.

– The auctioneer wants to sell an item and get the highest possible payment for it.

– The bidders want to acquire the item at the lowest possible price.


Auction Settings

– Private value
The value of the good depends only on the agent's own preferences.

The key is that the winning bidder will not resell the item in order to get utility.

For example: auctioning off a cake.

In other words, the value depends only on the bidder.


Auction Settings (cont.)

– Common value
An agent's value of an item depends entirely on other agents' values of it.

For example: auctioning treasury bills.

In other words, the value depends only on the other bidders.


Auction Settings (cont.)

– Correlated value
An agent's value depends partly on its own preferences and partly on others' values.

For example: a negotiation within a contracting setting.

– An agent can decrease the cost of the task.

– An agent can recontract out the task.

In other words, the value depends partly on the agent's own value and partly on the others' values.


English Auction

English (first-price open-cry) auction

– Each bidder is free to raise his bid. When no bidder is willing to raise anymore, the auction ends, and the highest bidder wins the item.

– An agent's strategy is a series of bids as a function of his private value, his prior estimates of the other bidders' valuations, and the past bids of others.


English Auction (cont.)

– In private value English auctions
An agent's dominant strategy is to always bid a small amount more than the current highest bid, and to stop when his private value is reached.

– In correlated value auctions
The rules are often varied so that the auctioneer increases the price at a constant rate or at a rate he thinks appropriate.
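The private value strategy can be simulated to see where the price ends up. This is an illustrative sketch, not a model of any real auction house rule: bidders raise in a fixed round-robin order by a fixed `increment`, and both of those details are assumptions.

```python
def english_auction(values, increment=1, start=0):
    """Ascending auction in which every bidder follows the private-value strategy:
    raise the current price by `increment` unless that would exceed his own value."""
    price = start
    leader = None
    someone_raised = True
    while someone_raised:
        someone_raised = False
        for bidder, value in values.items():
            # The current leader has no reason to outbid himself.
            if bidder != leader and price + increment <= value:
                price += increment
                leader = bidder
                someone_raised = True
    return leader, price

values = {"x": 10, "y": 7, "z": 4}
winner, price = english_auction(values)
print(winner, price)
```

The bidder with the highest value wins, and the final price lies within one increment of the second-highest private value.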


Sealed-Bid Auction

First-price sealed-bid auction

– Each bidder submits one bid without knowing the others' bids. The highest bidder wins the item and pays his own bid.

– An agent's strategy is a function of his private value and his prior estimates of the other bidders' valuations.

– There is no dominant strategy for bidding in this auction.

– The best strategy is to bid less than his true valuation, but how much less depends on what the others bid.


Sealed-Bid Auction (cont.)

– In private value auctions (common knowledge assumption): the probability distribution of the agents' values, for example a uniform distribution, is common knowledge.

The Nash equilibrium bid $b_i$ for every agent $i$ with private value $v_i$ is

$$b_i = \frac{|A| - 1}{|A|}\, v_i$$


Dutch Auction

Dutch (descending) auction

– The seller continuously lowers the price until one of the bidders takes the item at the current price.

– An agent's strategy
The Dutch auction is strategically equivalent to the first-price sealed-bid auction.


Vickrey Auction

Vickrey (second-price sealed-bid) auction

– Each bidder submits one bid without knowing the others' bids. The highest bidder wins, but at the price of the second highest bid.

– An agent's strategy is a function of his private value and his prior estimates of the other bidders' valuations.

– Theorem
A bidder's dominant strategy in a private value Vickrey auction is to bid his true valuation.
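The theorem can be spot-checked by brute force on a small grid: for every true value and rival bid, no alternative bid does better than the truthful one. A minimal sketch with a single rival; the names and the tie-breaking rule (ties go to the first listed bidder) are assumptions of this example.

```python
def vickrey_outcome(bids):
    """Return (winner, price): the highest bid wins and pays the second-highest bid.
    Ties are broken in favour of the bidder listed first (an assumption)."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[0], bids[ranked[1]]

def utility(bidder, true_value, bids):
    """Private-value utility: value minus price if the bidder wins, else 0."""
    winner, price = vickrey_outcome(bids)
    return true_value - price if winner == bidder else 0

def truthful_is_best(true_value, rival_bid, candidate_bids):
    """True if no candidate bid earns more than bidding the true value."""
    truthful = utility("me", true_value, {"me": true_value, "rival": rival_bid})
    return all(
        utility("me", true_value, {"me": bid, "rival": rival_bid}) <= truthful
        for bid in candidate_bids
    )

grid = range(0, 11)
print(all(truthful_is_best(v, r, grid) for v in grid for r in grid))
print(vickrey_outcome({"a": 10, "b": 7, "c": 4}))
```

Overbidding only adds wins at prices above the true value, and underbidding only loses auctions that would have been profitable.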


Vickrey Auction (cont.)

– Vickrey auctions are used to

allocate computation resources in operating systems,

allocate bandwidth in computer networks,

control building heating.


Vickrey Auction (cont.)

Are first-price auctions better for the auctioneer than second-price auctions?

Theorem: All 4 types of protocol produce the same expected revenue to the auctioneer (assuming (1) private value auctions, (2) values are independently distributed and (3) bidders are risk-neutral).

Why are second-price auctions not so popular among humans?

– Lying auctioneer.

– When the results are published, subcontractors know the true valuations and what they saved. So they might want to share the profit.
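The revenue theorem can be illustrated numerically for one special case. The sketch below assumes private values drawn independently and uniformly from [0, 1], first-price bidders using the equilibrium bid (n-1)/n times their value, and truthful second-price bidders; it is a Monte Carlo estimate, not a proof.

```python
import random

def simulate_revenue(n_bidders, rounds=50_000, seed=0):
    """Average auctioneer revenue under first-price and second-price sealed bids."""
    rng = random.Random(seed)
    first_price = 0.0
    second_price = 0.0
    for _ in range(rounds):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First price: the highest-value bidder wins and pays his shaded bid.
        first_price += values[-1] * (n_bidders - 1) / n_bidders
        # Second price: truthful bids, the winner pays the second-highest value.
        second_price += values[-2]
    return first_price / rounds, second_price / rounds

fp, sp = simulate_revenue(3)
print(round(fp, 3), round(sp, 3))
```

For this uniform setting both averages approach (n-1)/(n+1), which is 0.5 for three bidders.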


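Game-Theoretic Analysis of Auctions

Game-theoretic assumptions, elements, and process analysis of auctions

– Game-theoretic assumptions of auctions

An auction is a non-cooperative game with incomplete information.

Agent $i$'s subjective probability over the other agents' private values:

$$P_i(v_1, v_2, \dots, v_{i-1}, v_{i+1}, \dots, v_n \mid v_i) = \frac{P(v_1, \dots, v_i, \dots, v_n)}{P(v_i)}$$

The agents make their decisions independently; there is no agreement among them.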


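Game-Theoretic Analysis of Auctions (cont.)

– Elements of an auction

– Utility function: let $v_i$ denote agent $i$'s private value for the auctioned item and $b_i$ its bid. The utility is

$$u_i = \begin{cases} 0 & \text{item not obtained} \\ v_i - b_i & \text{item obtained (first-price, Dutch and English auctions)} \\ v_i - \max\{b_1, \dots, b_{i-1}, b_{i+1}, \dots, b_n\} & \text{item obtained (second-price auction)} \end{cases}$$

– Expected payoff

$$(v_i - b_i)\, P_i(b_i)$$

Here $P_i(b_i)$ is read as the probability that the bid $b_i$ wins.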


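Game-Theoretic Analysis of Auctions (cont.)

– Process analysis of an auction

Two agents. Each agent $i$ knows only its own private value $v_i$ ($0 \le v_i \le 1$); the other agent's private value is uniformly distributed on $[0, 1]$.

Payoff of agent $i$:

$$u_i(b_i, b_j, v_i, v_j) = \begin{cases} v_i - b_i & \text{if } b_i > b_j \\ \tfrac{1}{2}(v_i - b_i) & \text{if } b_i = b_j \\ 0 & \text{if } b_i < b_j \end{cases}$$

Expected payoff:

$$u_i = (v_i - b_i)\, P(b_j < b_i)$$

Optimal strategy, obtained by solving the differential equation:

$$\max_{b_i} u_i \;\Rightarrow\; b_i^*(v_i) = \frac{v_i}{2}$$

With n agents, the optimal bid is (n-1)v/n. Hence, in tendering, the more bidders there are, the better for the auctioneer.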
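One way to recover the two-bidder result is a guess-and-verify argument. The sketch below assumes that the rival bids linearly, $b_j = k\,v_j$ with some $0 < k \le 1$; this linearity is an assumption of the sketch, since the slide only states that a differential equation is solved.

```latex
% Rival's value v_j is uniform on [0,1] and the rival bids b_j = k v_j, so for 0 <= b_i <= k:
P(b_j < b_i) = P\!\left(v_j < \tfrac{b_i}{k}\right) = \frac{b_i}{k}

% Expected payoff of agent i as a function of its own bid:
u_i(b_i) = (v_i - b_i)\,\frac{b_i}{k}

% First-order condition:
\frac{d u_i}{d b_i} = \frac{v_i - 2 b_i}{k} = 0
\quad\Longrightarrow\quad
b_i^* = \frac{v_i}{2}

% The best reply is itself linear with slope 1/2, so k = 1/2 is self-consistent.
```

The best reply does not depend on $k$, so the symmetric equilibrium has both agents bidding half of their private value, matching $(n-1)v/n$ for $n = 2$.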


Auctions of Interrelated Goods

Inefficient Allocation and Lying in Interrelated Auctions

– Extended auctioning

auction of multiple items of a homogeneous good,

auction of heterogeneous unrelated goods,

auction of heterogeneous interrelated goods.


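Auctions of Interrelated Goods (cont.)

Example (task allocation)
– Two delivery tasks t1, t2 and two agents 1, 2.

[Figure: positions of agents A1 and A2 and of tasks t1 and t2; the edge labels are 1.0, 0.5 and 0.5.]

Cost of agent i     Agent 1    Agent 2
c_i(t1)             2          1.5
c_i(t2)             1          1.5
c_i(t1,t2)          2          2.5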



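The globally optimal solution is not reached by auctioning the tasks independently with truthful bidding. Giving both tasks to agent 1 costs c1(t1,t2) = 2, whereas the independent auctions below cost 1.5 + 1 = 2.5.

– t1 goes to agent 2 (for a price of 1.5) and t2 goes to agent 1 (for a price of 1).

– Even if agent 2 considers (when bidding for t2) that he already got t1 (so he bids cost(t1,t2) - cost(t1) = 2.5 - 1.5 = 1), he will get it only with a probability of 0.5, because his bid ties with agent 1's bid of 1.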



What about full lookahead?

Therefore:

– It pays off for agent 1 to bid more aggressively for t1 (up to 1.5 below its truthful bid).

– It does not pay off for agent 2, because agent 2 does not make a profit at t2 anyway.

– Agent 1 bids 0.5 for t1 (instead of 2), agent 2 bids 1.5. Therefore agent 1 gets it for 1.5. Agent 1 also gets t2 for 1.5.



If a1 has t1, its cost for t2 is c1(t1,t2) - c1(t1) = 2 - 2 = 0; otherwise it is c1(t2) = 1.

If a2 has t1, its cost for t2 is c2(t1,t2) - c2(t1) = 2.5 - 1.5 = 1; otherwise it is c2(t2) = 1.5.

So when a1 has t1 and bids for t2, it gets an extra profit of 1.5 - 0 = 1.5;

when a2 has t1 and bids for t2, it gets an extra profit of 1 - 1 = 0.

So when a1 bids for t1, it bids c1(t1) - extra profit = 2 - 1.5 = 0.5;

when a2 bids for t1, it bids c2(t1) - extra profit = 1.5 - 0 = 1.5.

A1 wins!
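The lookahead arithmetic above can be written out as a short computation. This is a sketch under the assumptions implicit in the slide: the lowest bid wins a task, the winner of t2 is paid the rival's stand-alone cost for t2, and extra profit cannot be negative. All function names are illustrative.

```python
# COST[agent][set of tasks] from the example.
COST = {
    1: {("t1",): 2.0, ("t2",): 1.0, ("t1", "t2"): 2.0},
    2: {("t1",): 1.5, ("t2",): 1.5, ("t1", "t2"): 2.5},
}

def marginal_cost_t2(agent, has_t1):
    """Cost of taking on t2, depending on whether the agent already holds t1."""
    if has_t1:
        return COST[agent][("t1", "t2")] - COST[agent][("t1",)]
    return COST[agent][("t2",)]

def extra_profit_from_t1(agent):
    """Profit the agent expects in the t2 auction if it has won t1 first."""
    rival = 2 if agent == 1 else 1
    rival_bid = marginal_cost_t2(rival, has_t1=False)  # rival bids its stand-alone cost
    own_cost = marginal_cost_t2(agent, has_t1=True)
    return max(0.0, rival_bid - own_cost)

def lookahead_bid_t1(agent):
    """Bid for t1 that discounts the truthful cost by the profit expected at t2."""
    return COST[agent][("t1",)] - extra_profit_from_t1(agent)

bids = {agent: lookahead_bid_t1(agent) for agent in (1, 2)}
winner_t1 = min(bids, key=bids.get)  # lowest cost bid wins the task
print(bids, "winner of t1:", winner_t1)
```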


Investigation in Auctions

Does it make sense to counterspeculate in private value Vickrey auctions?

Vickrey auctions were invented to avoid counterspeculation. But what if the private value for a bidder is uncertain? The bidder might be able to determine it, but he needs to invest a cost c to do so.

Example
– Suppose bidder 1 does not know the (private) value v1 of the item to be auctioned. To determine it, he needs to invest cost. We also assume that v1 is uniformly distributed with 0 <= v1 <= 1.


Investigation in Auctions (cont.)

– For bidder 2, the private value v2 of the item is fixed: 0 <= v2 <= 1/2. So his dominant strategy is to bid v2.

Should bidder 1 invest cost to determine his private value? How does this depend on knowing v2?

Answer: Bidder 1 should invest cost if and only if

$$v_2 \ge (2\,cost)^{1/2}$$

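Investigation in Auctions (cont.)

Proof (with $c$ = cost)

$$E_{noinfo} = \int_0^1 (v_1 - v_2)\, dv_1 = \frac{1}{2} - v_2$$

$$E_{info} = \int_0^{v_2} (-c)\, dv_1 + \int_{v_2}^{1} (v_1 - v_2 - c)\, dv_1 = \frac{1}{2} v_2^2 - v_2 + \frac{1}{2} - c$$

$$E_{info} \ge E_{noinfo} \iff \frac{1}{2} v_2^2 - v_2 + \frac{1}{2} - c \ge \frac{1}{2} - v_2 \iff \frac{1}{2} v_2^2 \ge c \iff v_2 \ge (2c)^{1/2}$$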

Bargaining Mechanisms

Axiomatic Bargaining Theory

We assume two agents 1, 2, each with a utility function $u_i: O \to \mathbb{R}$. If the agents do not agree on a result $o$, the fallback $o_{fallback}$ is taken.

Example (sharing 1 apple)
– How to share 1 apple?
– Agent 1 offers p (0 < p < 1), agent 2 agrees!

– Such deals are individually rational and each one is in Nash equilibrium!

Therefore we need axioms!


Axiomatic Bargaining Theory

Axioms on the global solution u* = <u1(o*), u2(o*)>.

Invariance: Absolute values of the utility functions do not matter, only relative values.

Symmetry: Exchanging the agents does not influence the solution o*.

Irrelevant alternatives: If O is made smaller but o* still remains, then o* remains the solution.

Pareto: It is not feasible to give both players higher utility than u* = <u1(o*), u2(o*)>.


Illustration of Feasible Solutions

[Figure: the set of feasible solutions in the (u1, u2) utility plane.]


Solution of Axiomatic Bargaining Theory

Theorem (Unique solution)
– The four axioms above uniquely determine a solution. This solution is given by

$$o^* = \arg\max_{o}\ \bigl[u_1(o) - u_1(o_{fallback})\bigr]\,\bigl[u_2(o) - u_2(o_{fallback})\bigr]$$
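For the apple example the solution can be found by a grid search over splits. This is an illustrative sketch: outcomes are pairs (share of agent 1, share of agent 2) on a grid of 101 splits, the fallback is that nobody gets anything, and the utility functions are assumptions chosen for the demonstration.

```python
def nash_bargaining(outcomes, u1, u2, fallback):
    """Outcome maximizing the product of utility gains over the fallback."""
    f1, f2 = u1(fallback), u2(fallback)
    # Only individually rational outcomes are candidates.
    rational = [o for o in outcomes if u1(o) >= f1 and u2(o) >= f2]
    return max(rational, key=lambda o: (u1(o) - f1) * (u2(o) - f2))

# Splits of one apple in steps of 1/100; fallback: no agreement, nobody gets anything.
splits = [(k / 100, 1 - k / 100) for k in range(101)]
no_deal = (0.0, 0.0)

# Both agents value the apple linearly.
linear = nash_bargaining(splits, lambda o: o[0], lambda o: o[1], no_deal)

# Agent 2 has a concave utility (square root of his share), agent 1 stays linear.
concave = nash_bargaining(splits, lambda o: o[0], lambda o: o[1] ** 0.5, no_deal)

print(linear, concave)
```

With symmetric linear utilities the apple is split evenly; with the concave utility for agent 2 the maximizer moves to about two thirds for agent 1.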


Strategic Bargaining Theory

No axioms: view it as a game!

Example revisited: sharing 1 apple. Protocol with finitely many steps: the last offerer just offers ε. This should be accepted, so the last offerer gets 1-ε.

This is unsatisfactory. Ways out:
– 1. Add a discount factor δ: in round n, only the δ^(n-1)-th part of the original value is available.
– 2. Bargaining costs: bargaining is not free; fees have to be paid.


Strategic Bargaining Theory (cont.)

Finite games: Suppose δ = 0.9. Then the outcome depends on the number of rounds.

Round   1's share   2's share   Total value   Offerer
...     ...         ...         ...           ...
n-3     0.819       0.181       0.9^(n-4)     2
n-2     0.91        0.09        0.9^(n-3)     1
n-1     0.9         0.1         0.9^(n-2)     2
n       1           0           0.9^(n-1)     1
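The rows of the table follow from backward induction: in each earlier round the offerer gives the responder exactly δ times what the responder would keep as offerer one round later. A minimal sketch; shares are fractions of the value available in that round, and the function name is illustrative.

```python
def backward_induction(steps, delta=0.9, last_offerer=1):
    """Rows (rounds before the last, offerer, agent 1's share, agent 2's share),
    starting from the last round and moving backwards in time."""
    rows = []
    offerer = last_offerer
    offerer_share = 1.0  # the last offerer keeps (almost) everything
    for back in range(steps):
        responder_share = 1.0 - offerer_share
        if offerer == 1:
            share1, share2 = offerer_share, responder_share
        else:
            share1, share2 = responder_share, offerer_share
        rows.append((back, offerer, share1, share2))
        # One round earlier the roles swap: the new offerer must concede delta times
        # what the other agent would keep as offerer in the following round.
        offerer_share = 1.0 - delta * offerer_share
        offerer = 2 if offerer == 1 else 1
    return rows

rows = backward_induction(4)
for back, offerer, share1, share2 in rows:
    print(f"n-{back}: offerer {offerer}, shares {share1:.3f} / {share2:.3f}")
```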



Infinite games: discount factor δ1 for agent 1, δ2 for agent 2.

Theorem: (Unique solution for infinite games)

In a discounted infinite round setting, there exists a unique Nash equilibrium: Agent 1 gets (1-δ2)/(1-δ1δ2). Agent 2 gets the rest. Agreement is reached in the first round.



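Proof:

Round   1's share            2's share    Offerer
...     ...                  ...          ...
t-2     1 - δ2(1 - δ1 π1)                 1
t-1                          1 - δ1 π1    2
t       π1                                1
...     ...                  ...          ...

$$\pi_1 = 1 - \delta_2 (1 - \delta_1 \pi_1) \;\Rightarrow\; \pi_1 = \frac{1 - \delta_2}{1 - \delta_1 \delta_2}$$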
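The fixed-point equation in the proof can be verified by iterating it and comparing against the closed form. A minimal sketch; the discount factors used are arbitrary illustrative values.

```python
def closed_form_share(delta1, delta2):
    """Agent 1's equilibrium share from the theorem."""
    return (1 - delta2) / (1 - delta1 * delta2)

def iterate_share(delta1, delta2, iterations=200, start=0.0):
    """Repeatedly apply pi1 <- 1 - delta2 * (1 - delta1 * pi1), the two-round recursion."""
    pi1 = start
    for _ in range(iterations):
        pi1 = 1 - delta2 * (1 - delta1 * pi1)
    return pi1

for d1, d2 in [(0.9, 0.9), (0.9, 0.8), (0.5, 0.95)]:
    print(d1, d2, closed_form_share(d1, d2), iterate_share(d1, d2))
```

The map is a contraction with factor δ1δ2 < 1, so the iteration converges to the same share from any starting point.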



Bargaining Costs

Agent 1 pays c1 per round, agent 2 pays c2.

Time t: 1 gets p, 2 gets 1-p.
Time t-1: 2 thinks: 1 gets p+c2, 2 gets 1-p-c2.
Time t-2: 1 thinks: 1 gets p+c2-c1, 2 gets 1-p-c2+c1.
Time t-2k: 1 thinks: 2 gets 1-p-k(c2-c1).



c1 = c2: Any split is in Nash equilibrium.

c1 < c2: Agent 1 gets all.

c1 > c2: Agent 1 gets c2, agent 2 gets 1-c2.


Selected Papers

Guttman and Maes, "Cooperative vs. Competitive Multi-Agent Negotiations in Retail Electronic Commerce". (PDF)

M. Beer, M. d'Inverno, M. Luck, N. R. Jennings, C. Preist and M. Schroeder (1999) "Negotiation in Multi-Agent Systems" Knowledge Engineering Review 14 (3) 285-289. (PS)

N. R. Jennings, S. Parsons, C. Sierra and P. Faratin (2000) "Automated Negotiation" Proc. 5th Int. Conf. on the Practical Application of Intelligent Agents and Multi-Agent Systems (PAAM-2000), Manchester, UK, 23-30. (PS)

Jennings et al., "Automated Negotiation: Prospects, Methods and Challenges". (PS)

