TRANSCRIPT
Evolution and Repeated Games
D. Fudenberg (Harvard)
E. Maskin (IAS, Princeton)
2
Theory of repeated games is important:
• central model for explaining how self-interested agents can cooperate
• used in economics, biology, political science, and other fields
            C       D
Cooperate  2,2    -1,3
Defect     3,-1    0,0
3
But theory has a serious flaw:
• although cooperative behavior possible, so is uncooperative behavior (and everything in between)
• theory doesn’t favor one behavior over another
• theory doesn’t make sharp predictions
4
Evolution (biological or cultural) can promote efficiency
• might hope that uncooperative behavior will be “weeded out”
• this view expressed in Axelrod (1984)
5
Basic idea:
• Start with a population playing the repeated-game strategy Always D
• Consider a small group of mutants using Conditional C (play C until someone plays D, thereafter play D)
  – does essentially the same against Always D as Always D does against itself
  – does much better against Conditional C than Always D does
• Thus Conditional C will invade Always D
• uncooperative behavior driven out
            C       D
Cooperate  2,2    -1,3
Defect     3,-1    0,0
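The invasion argument on this slide can be checked with a short simulation; this is a minimal sketch assuming the payoff matrix shown (the function names `always_d` and `conditional_c` are illustrative, not from the paper).

```python
# Prisoner's Dilemma payoffs from the slides:
# (C,C)->(2,2), (C,D)->(-1,3), (D,C)->(3,-1), (D,D)->(0,0)
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def always_d(my_hist, their_hist):
    return 'D'

def conditional_c(my_hist, their_hist):
    # Play C until someone plays D, thereafter play D
    return 'D' if ('D' in my_hist or 'D' in their_hist) else 'C'

def average_payoffs(s1, s2, periods=100):
    h1, h2, tot1, tot2 = [], [], 0, 0
    for _ in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        tot1 += p1; tot2 += p2
        h1.append(a1); h2.append(a2)
    return tot1 / periods, tot2 / periods

print(average_payoffs(always_d, always_d))            # (0.0, 0.0)
print(average_payoffs(conditional_c, always_d))       # (-0.01, 0.03)
print(average_payoffs(conditional_c, conditional_c))  # (2.0, 2.0)
```

Against a population of Always D the mutant sacrifices essentially nothing (one period of −1), while mutants who meet each other earn 2 per period, so Conditional C invades.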
6
But consider ALT: alternate between C and D until the pattern is broken, thereafter play D
• ALT can’t be invaded by some other strategy
  – the other strategy would have to alternate too, or else would do much worse against ALT than ALT does against itself
• Thus ALT is “evolutionarily stable”
• But ALT is quite inefficient (average payoff 1)

       C       D
C     2,2    -1,3
D     3,-1    0,0
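A quick way to see the average payoff of 1: two copies of ALT play (C,C), (D,D), (C,C), ..., so per-period payoffs alternate 2, 0, 2, 0. A minimal sketch (function names are illustrative):

```python
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def alt(my_hist, their_hist):
    # Alternate C, D, C, D, ... until either player breaks the pattern, then D forever
    for t, (a1, a2) in enumerate(zip(my_hist, their_hist)):
        want = 'C' if t % 2 == 0 else 'D'
        if a1 != want or a2 != want:
            return 'D'
    return 'C' if len(my_hist) % 2 == 0 else 'D'

def average_payoffs(s1, s2, periods=100):
    h1, h2, tot1, tot2 = [], [], 0, 0
    for _ in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        tot1 += p1; tot2 += p2
        h1.append(a1); h2.append(a2)
    return tot1 / periods, tot2 / periods

print(average_payoffs(alt, alt))  # (1.0, 1.0): inefficient relative to (2, 2)
# A non-alternating strategy (e.g. always C) fares much worse against ALT:
print(average_payoffs(lambda my, their: 'C', alt))  # (-0.97, 2.99)
```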
7
• Still, ALT is highly inflexible
  – relies on perfect alternation
  – if the pattern is broken, get D forever
• What if there is a (small) probability of a mistake in execution?
8
• Consider a mutant strategy s′, identical to ALT except if (by mistake) the alternating pattern is broken:
  – s′ “signals” an intention to cooperate by playing C in the following period
  – if the other strategy plays C too, s′ plays C forever
  – if the other strategy plays D, s′ plays D forever (identical to ALT before the pattern is broken)
• s′ and ALT each get about 0 against ALT after the pattern is broken
• s′ gets 2 against s′ after the pattern is broken; ALT gets about 0 against s′
• so s′ invades ALT

       C       D
C     2,2    -1,3
D     3,-1    0,0
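The comparison on this slide can be made concrete by injecting a single mistake and comparing continuation payoffs; a sketch, with a hypothetical implementation of the mutant:

```python
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def break_period(h1, h2):
    # First period at which the alternating C, D, C, D, ... pattern broke, else None
    for t, (a1, a2) in enumerate(zip(h1, h2)):
        want = 'C' if t % 2 == 0 else 'D'
        if a1 != want or a2 != want:
            return t
    return None

def alt(my, their):
    if break_period(my, their) is not None:
        return 'D'
    return 'C' if len(my) % 2 == 0 else 'D'

def mutant(my, their):
    b = break_period(my, their)
    if b is None:                        # pattern intact: behave exactly like ALT
        return 'C' if len(my) % 2 == 0 else 'D'
    if len(my) == b + 1:
        return 'C'                       # signal intention to cooperate
    return 'C' if their[b + 1] == 'C' else 'D'   # cooperate iff the signal was answered

def play(s1, s2, periods=100, mistakes=None):
    # mistakes: {period: (player, action)} -- forced deviations modeling execution errors
    h1, h2, tot1, tot2 = [], [], 0, 0
    for t in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        if mistakes and t in mistakes:
            who, act = mistakes[t]
            if who == 1: a1 = act
            else: a2 = act
        p1, p2 = PAYOFF[(a1, a2)]
        tot1 += p1; tot2 += p2
        h1.append(a1); h2.append(a2)
    return tot1 / periods, tot2 / periods

err = {2: (1, 'D')}   # player 1 mistakenly plays D in period 2
print(play(alt, alt, 100, err))        # (0.05, 0.01): about 0 after the break
print(play(mutant, alt, 100, err))     # (0.04, 0.04): also about 0
print(play(mutant, mutant, 100, err))  # (1.99, 1.95): cooperation resumes, about 2
```

The three runs reproduce the slide's comparison: after a mistake, ALT condemns both players to 0, while two mutants signal, re-coordinate, and return to payoff 2.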
9
Main results in paper (for 2-player symmetric repeated games)

(1) If s is evolutionarily stable and
  – discount rate r is small (the future is important)
  – mistake probability p is small (but p > 0)
then s is (almost) “efficient”

(2) If payoffs (v, v) are “efficient”, then there exists an ES strategy s (almost) attaining (v, v), provided
  – r small
  – p small relative to r

• generalizes Fudenberg–Maskin (1990), in which r = p = 0
10
Finite symmetric 2-player game g: A × A → R²
• symmetric: if g(a_1, a_2) = (v_1, v_2), then g(a_2, a_1) = (v_2, v_1)
• normalize payoffs so that the minmax value is 0:

  min_{a_2} max_{a_1} g_1(a_1, a_2) = 0

• V = convex hull of { g(a_1, a_2) : (a_1, a_2) ∈ A × A }
11
• (v_1, v_2) ∈ V is strongly efficient if v_1 + v_2 = w*, where

  w* = max_{(v_1, v_2) ∈ V} (v_1 + v_2)

• example:

   0,0    1,2
   2,1    0,0

  here the pairs with v_1 + v_2 = 3, i.e. (1,2) and (2,1), are strongly efficient
• in the Prisoner's Dilemma

       C       D
  C   2,2    -1,3
  D   3,-1    0,0

  (2,2) is the unique strongly efficient pair
12
Repeated game: g repeated infinitely many times
• period-t history: h^{t−1} = (a(1), …, a(t−1)), where a(τ) = (a_1(τ), a_2(τ))
• H = set of all histories
• repeated-game strategy s: H → A
  – assume finitely complex (playable by a finite computer)
• in each period, probability p that player i makes a mistake
  – chooses an action at random (equal probabilities for all actions)
  – mistakes independent across players
13
Normalized discounted payoffs, with discount rate r and mistake probability p:

U^{r,p}(s_1, s_2) = E[ (r/(1+r)) Σ_{t=1}^{∞} (1/(1+r))^{t−1} g_1(a_1(t), a_2(t)) | s_1, s_2, p ]

and, conditional on a history h,

U^{r,p}(s_1, s_2 | h) = E[ (r/(1+r)) Σ_{t=1}^{∞} (1/(1+r))^{t−1} g_1(a_1(t), a_2(t)) | s_1, s_2, p, h ]

(the payoff to the player using s_1, normalized so that a payoff of v every period has value v)
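The normalized discounted sum can be evaluated numerically for simple payoff streams; a sketch (truncating the infinite sum, with illustrative streams):

```python
def discounted_value(stream, r):
    # U = (r/(1+r)) * sum_{t>=1} (1/(1+r))^(t-1) * g_t, truncated at len(stream)
    delta = 1.0 / (1.0 + r)
    return (r / (1.0 + r)) * sum(g * delta**t for t, g in enumerate(stream))

r = 0.01
coop = [2] * 5000             # mutual cooperation: payoff 2 every period
alternating = [2, 0] * 2500   # ALT vs ALT: (C,C), (D,D), (C,C), ...

print(round(discounted_value(coop, r), 3))         # 2.0
print(round(discounted_value(alternating, r), 3))  # 1.005
```

The normalization makes a constant stream of 2 worth exactly 2; as r → 0 the weights spread evenly across periods, so the alternating stream's value approaches the average payoff 1 cited earlier.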
14
• informally, s is evolutionarily stable (ES) if no mutant s′ can invade a population with a big proportion playing s and a small proportion playing s′
• formally, s is ES w.r.t. (q̄, r, p) if, for all s′ ≠ s and all q ≤ q̄,

  (1−q) U^{r,p}(s, s) + q U^{r,p}(s, s′) ≥ (1−q) U^{r,p}(s′, s) + q U^{r,p}(s′, s′)

• evolutionary stability
  – expressed statically here
  – but can be given a precise dynamic meaning
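The ES inequality can be evaluated directly once the four match-up payoffs are known; a sketch using limiting-average payoffs (r → 0, no mistakes) for two illustrative match-ups from the earlier slides:

```python
def mutant_invades(U, q):
    # U: long-run payoffs, keys ('s','s'), ('s','m'), ('m','s'), ('m','m')
    incumbent_fitness = (1 - q) * U[('s', 's')] + q * U[('s', 'm')]
    mutant_fitness = (1 - q) * U[('m', 's')] + q * U[('m', 'm')]
    return mutant_fitness > incumbent_fitness

# Incumbent Always D, mutant Conditional C: the mutant matches the incumbent
# against Always D and earns 2 against itself, so it invades.
U1 = {('s', 's'): 0, ('s', 'm'): 0, ('m', 's'): 0, ('m', 'm'): 2}
print(mutant_invades(U1, q=0.05))   # True

# Incumbent ALT, mutant Always D: breaking the pattern forfeits ALT's average
# payoff of 1, so the mutant cannot invade.
U2 = {('s', 's'): 1, ('s', 'm'): 0, ('m', 's'): 0, ('m', 'm'): 0}
print(mutant_invades(U2, q=0.05))   # False
```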
15
• population of b players
• suppose time is measured in “epochs” T = 1, 2, …
• s_T = strategy state in epoch T
  – most players in population use s_T
• group of mutants (of size a) plays s′
  – a drawn randomly from {1, 2, …, ā}, where ā/b ≤ q̄
  – s′ drawn randomly from finitely complex strategies
• M random drawings of pairs of players
  – each pair plays the repeated game
• s_{T+1} = strategy with highest average score
16
Theorem 1: For any q̄, p, r, and ε > 0, there exists T̄ such that, for all T ≥ T̄, there exists M̄(T) such that, for all M ≥ M̄(T):

(i) if s is not ES, then Pr{ s_t = s for all t > T | s_T = s } < ε

(ii) if s is ES, then Pr{ s_t = s for all t > T | s_T = s } > 1 − ε
17
Let v* = min { v : (v, v′) is strongly efficient for some v′ }

Theorem 2: Given ε > 0 and q̄ > 0, there exist r̄ > 0 and p̄ > 0 such that, for all r ∈ (0, r̄] and p ∈ (0, p̄], if s is ES w.r.t. (q̄, r, p), then

  U^{r,p}(s, s | h) ≥ v* − ε for all h.
18
Examples:

   0,0    1,2
   2,1    0,0

v* = 1, so U^{r,p}(s, s | h) ≥ 1 − ε

       C       D
  C   2,2    -1,3
  D   3,-1    0,0

v* = 2, so U^{r,p}(s, s | h) ≥ 2 − ε
19
Proof:
• Suppose U^{r,p}(s, s | h) < v* − ε for some h
• will construct a mutant s′ that can invade s
• let h* = argmin_h U^{r,p}(s, s | h)
• if s = ALT, h* = any history for which the alternating pattern is broken
20
Construct s′ so that
• if h is not a continuation of h*, then s′(h) = s(h)
• after h*, strategy s′
  – “signals” willingness to cooperate by playing differently from s for 1 period (assume s is a pure strategy)
  – if the other player responds positively, plays strongly efficiently thereafter
  – if not, plays according to s thereafter
• after h* followed by the other player's one-period deviation from s, strategy s′
  – responds positively if the other strategy has signaled, and thereafter plays strongly efficiently
  – plays according to s otherwise
21
• because h* is already the worst history, s′ loses for only 1 period by signaling (small loss if r small)
• if p is small, the probability that s′ “misreads” the other player's intention is small
• hence, s′ does nearly as well against s as s does against itself (even after h*)
• s′ does very well against itself (strong efficiency) after h*:

  U^{r,p}(s′, s′ | h*) ≈ w*/2 > U^{r,p}(s, s | h*)
22
• remains to check how well s does against s′
• by definition of h*, U^{r,p}(s, s | h*) ≤ U^{r,p}(s, s | h) for all h
• ignoring the effect of p: after the deviation (the signal) by s′, the punishment is started again, and so

  U^{r,p}(s, s′ | h*) ≈ U^{r,p}(s, s | h*) < v* − ε

  Hence

  U^{r,p}(s, s′ | h*) ≤ U^{r,p}(s′, s′ | h*) − w̃ for some w̃ > 0

• so s does appreciably worse against s′ than s′ does against s′
23
• Summing up, for q ∈ (0, q̄] we have

  (1−q) U^{r,p}(s′, s) + q U^{r,p}(s′, s′) > (1−q) U^{r,p}(s, s) + q U^{r,p}(s, s′)

• so s is not ES
24
• Theorem 2 implies, for the Prisoner's Dilemma, that for any ε > 0,

  U^{r,p}(s, s | h) ≥ 2 − ε for r and p small

• doesn't rule out punishments of arbitrary (finite) length
25
• Consider strategy s with “cooperative” and “punishment” phases
  – in the cooperative phase, play C
  – stay in the cooperative phase until one player plays D, in which case go to the punishment phase
  – in the punishment phase, play D
  – stay in the punishment phase for m periods (and then go back to the cooperative phase) unless at some point some player chooses C, in which case restart the punishment
• For any m,

  U^{r,p}(s, s | h) → 2 (efficiency) as r → 0 and p → 0
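The phase strategy above can be sketched as a small state machine; a minimal sketch assuming the slides' Prisoner's Dilemma payoffs (constructor and helper names are illustrative):

```python
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def make_phase_strategy(m):
    # Cooperative phase: play C. Any observed D starts a punishment phase of
    # m periods of D; a C observed during punishment restarts the clock.
    state = {'punish_left': 0}
    def act(my, their):
        if my:  # update phase from last period's outcome
            last = (my[-1], their[-1])
            if state['punish_left'] == 0:
                if 'D' in last:
                    state['punish_left'] = m
            else:
                state['punish_left'] -= 1
                if 'C' in last:
                    state['punish_left'] = m   # restart punishment
        return 'D' if state['punish_left'] > 0 else 'C'
    return act

def play(s1, s2, periods, mistakes=None):
    h1, h2, tot1, tot2 = [], [], 0, 0
    for t in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        if mistakes and t in mistakes:
            who, act = mistakes[t]
            if who == 1: a1 = act
            else: a2 = act
        p1, p2 = PAYOFF[(a1, a2)]
        tot1 += p1; tot2 += p2
        h1.append(a1); h2.append(a2)
    return tot1 / periods, tot2 / periods

# One mistaken D in period 2 costs m = 2 punishment periods, then cooperation
# resumes, so long-run averages stay near the efficient payoff 2:
print(play(make_phase_strategy(2), make_phase_strategy(2), 1000, {2: (1, 'D')}))
# (1.997, 1.993)
```

With any fixed m, a rare mistake costs only finitely many periods before cooperation resumes, which is why the value tends to 2 as r, p → 0.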
26
Can sharpen Theorem 2 for the Prisoner's Dilemma:

Given q̄, there exist r̄ > 0 and p̄ > 0 such that, for all r ∈ (0, r̄] and p ∈ (0, p̄], if s is ES w.r.t. (q̄, r, p), then it cannot entail a punishment lasting more than

  (3 − 2q̄) / (2q̄) periods

Proof: very similar to that of Theorem 2
27
For r and p too big, an ES strategy s may not be “efficient”
• if p ≥ 1/2, then even fully cooperative strategies in the Prisoner's Dilemma generate payoffs of at most 1.5
• if r is large, we are effectively back in the one-shot case
28
Theorem 3: Let (v, v) ∈ V with v > 0.
For all ε > 0, there exist q̄ > 0 and r̄ > 0 such that, for all r ∈ (0, r̄], there exists p̄(r) > 0 such that, for all p ∈ (0, p̄(r)], there exists s, ES w.r.t. (q̄, r, p), for which

  U^{r,p}(s, s) ≥ v − ε
29
Proof: Construct s so that
• along the equilibrium path of (s, s), payoffs are (approximately) (v, v)
• punishments are nearly strongly efficient
  – deviating player (say 1) is minimaxed long enough to wipe out the gain
  – thereafter go to a strongly efficient point
  – overall payoffs after a deviation: approximately (v̂, w* − v̂), with v̂ < v for the deviator
• if r and p are small, (s, s) is a subgame-perfect equilibrium
30
• In the Prisoner's Dilemma, consider s that
  – plays C the first period
  – thereafter, plays C if and only if either both players played C the previous period or neither did
• strategy s
  – is efficient
  – entails punishments that are as short as possible
  – is a modification of Tit-for-Tat (C the first period; thereafter, do what the other player did the previous period)
• Tit-for-Tat is not ES
  – if a mistake (D, C) occurs, we get a wave of alternating punishments (C, D), (D, C), (C, D), … until another mistake is made
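The alternating punishment wave, and its absence under the modified strategy, show up directly in simulation; a sketch with one injected mistake (helper names are illustrative):

```python
PAYOFF = {('C', 'C'): (2, 2), ('C', 'D'): (-1, 3),
          ('D', 'C'): (3, -1), ('D', 'D'): (0, 0)}

def tit_for_tat(my, their):
    return their[-1] if their else 'C'

def modified_tft(my, their):
    # C the first period; thereafter C iff both players took the same action
    if not my:
        return 'C'
    return 'C' if my[-1] == their[-1] else 'D'

def play(s1, s2, periods, mistakes=None):
    h1, h2, tot1, tot2 = [], [], 0, 0
    for t in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        if mistakes and t in mistakes:
            who, act = mistakes[t]
            if who == 1: a1 = act
            else: a2 = act
        p1, p2 = PAYOFF[(a1, a2)]
        tot1 += p1; tot2 += p2
        h1.append(a1); h2.append(a2)
    return tot1 / periods, tot2 / periods

err = {2: (1, 'D')}   # a single mistaken D in period 2
# Tit-for-Tat: the mistake echoes forever as (C,D), (D,C), (C,D), ...
print(play(tit_for_tat, tit_for_tat, 100, err))    # (1.02, 1.02)
# Modified Tit-for-Tat: one period of (D,D), then cooperation resumes
print(play(modified_tft, modified_tft, 100, err))  # (1.99, 1.95)
```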
31
• Let s = play d as long as, in all past periods, either
  – both players played d, or
  – neither played d
• if a single player deviates from d
  – henceforth, that player plays b
  – the other player plays a
• s is ES even though inefficient
  – any attempt to improve on efficiency is punished forever
  – can't invade during the punishment, because the punishment is efficient

modified battle of the sexes:

       a      b      c      d
  a   0,0    4,1    0,0    0,0
  b   1,4    0,0    0,0    0,0
  c   0,0    0,0    0,0    0,0
  d   0,0    0,0    0,0    2,2
32
Consider a potential invader s′.
• For any h, s′ cannot do better against s than s does against itself, since (s, s) is an equilibrium; hence, for all h,

  U^{r,p}(s′, s | h) ≤ U^{r,p}(s, s | h)    (*)

• For s′ to invade, we need

  U^{r,p}(s′, s′ | h′) > U^{r,p}(s, s′ | h′) for some h′    (**)

  (otherwise s′ can't invade)
• Claim: (**) implies h′ involves a deviation from the equilibrium path of (s, s)
  – the only other possibility is that s′ differs from s on the equilibrium path
  – but then s′ is punished by s, and (*) makes the inequality (**) infeasible
• We thus have a deviation after h′, and the ensuing punishment payoffs are strongly efficient (they sum to w*): any gain s′ obtains against itself comes entirely at the other copy's expense
• Hence (**) cannot hold: s′ cannot invade, and s is ES
33
For Theorem 3 to hold, p must be small relative to r
• consider modified Tit-for-Tat against itself (play C if and only if both players took the same action last period)
• with every mistake, there is an expected loss of
  2 − (½ · 3 + ½ · (−1)) = 1 in the first period
  2 − 0 = 2 in the second period
• so overall the expected loss from mistakes is approximately 3(1+r)p/r
• by contrast, a mutant strategy that signals, etc., and doesn't punish at all against itself loses only about (1+r)p/r
• so if r is small enough relative to p, the mutant can invade
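The per-mistake losses quoted above can be checked directly; this sketch assumes the stage-game payoffs from the talk's Prisoner's Dilemma.

```python
# Stage-game payoffs: (C,C) gives 2 each; a mistaken (D,C) gives 3 to the
# deviator and -1 to the other player; (D,D) gives 0 each.

# First period after a mistake: each player is equally likely to be the
# deviator, so the expected per-player loss relative to (C,C) is:
first_period_loss = 2 - (0.5 * 3 + 0.5 * (-1))
# Second period: modified Tit-for-Tat has both players punish with (D,D):
second_period_loss = 2 - 0

print(first_period_loss, second_period_loss)  # 1.0 2
```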