Punishment and Forgiveness in Repeated Games

Upload: kieu

Post on 23-Feb-2016


A review of present values


Calculating sums

• In a repeated game with probability \(d\) of continuation after each round, the probability that the game is still going at round \(k\) is \(d^{k-1}\).

• Calculate expected winnings if you receive a fixed amount \(R\) in every round so long as the game continues:
\[ S = R + dR + d^2R + d^3R + d^4R + \dots = R\,(1 + d + d^2 + d^3 + d^4 + \dots) \]

• What is this infinite sum?


Adding forever

• The series \(1 + d + d^2 + d^3 + d^4 + \dots\) is known as a geometric series. When \(|d| < 1\), this series converges. That is to say, as \(n\) approaches infinity, the sum \(T_n = 1 + d + d^2 + d^3 + d^4 + \dots + d^n\) is bounded and has a limiting value \(S\).

• Let \(S = 1 + d + d^2 + d^3 + d^4 + \dots\) Then \(d \times S = d + d^2 + d^3 + d^4 + \dots\) and \(S - (d \times S) = 1\). So \(S \times (1 - d) = 1\) and \(S = 1/(1-d)\).
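A quick numerical check can make the limit concrete. The sketch below is an illustration rather than part of the original slides; the function names `partial_sum` and `geometric_limit` and the sample value d = 0.9 are assumptions chosen for the example.

```python
def partial_sum(d, n):
    """Return T_n = 1 + d + d**2 + ... + d**n."""
    return sum(d**k for k in range(n + 1))


def geometric_limit(d):
    """Closed form S = 1 / (1 - d), valid only when |d| < 1."""
    if abs(d) >= 1:
        raise ValueError("the series converges only for |d| < 1")
    return 1 / (1 - d)


d = 0.9  # illustrative continuation probability (an assumption)
for n in (10, 50, 200):
    # The partial sums approach the closed-form limit as n grows.
    print(n, partial_sum(d, n), geometric_limit(d))
```

As n grows, the printed partial sums should approach the closed-form value, which is what convergence of the series means here.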


What is the expected amount of money that you will get from your generous friend’s offer?

A) $60
B) $360
C) $10
D) $244
E) $40


Why so?

• On the first roll, she will give you $10. With probability 5/6, she will do a second roll and give you another $10. With probability \((5/6)^2\), she will not get a 6 on her first two rolls and will give you a third $10. With probability \((5/6)^3\), she will not get a 6 on her first three rolls and you will win another $10, and so on.

• So your expected winnings are
\[ \$10 \times \left(1 + \frac{5}{6} + \left(\frac{5}{6}\right)^2 + \dots + \left(\frac{5}{6}\right)^n + \dots\right) = \$10 \times \frac{1}{1 - 5/6} = \$10 \times 6 = \$60. \]
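The same arithmetic can be checked with exact fractions. This is a minimal sketch, assuming a $10 payment per roll and a 5/6 chance of continuing, as described above; the names `PAYMENT`, `P_CONTINUE`, and `truncated_expectation` are made up for the example.

```python
from fractions import Fraction

PAYMENT = 10                 # dollars paid on each roll
P_CONTINUE = Fraction(5, 6)  # probability that a roll is not a 6


def truncated_expectation(rounds):
    """Expected winnings counting only the first `rounds` payments."""
    return sum(PAYMENT * P_CONTINUE**k for k in range(rounds))


# Closed form from the geometric series: payment / (1 - continuation probability).
closed_form = PAYMENT / (1 - P_CONTINUE)
print(closed_form)                        # 60
print(float(truncated_expectation(100)))  # just under 60
```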


Punishment and reward in repeated play.

• Suppose we play Prisoners’ Dilemma repeatedly with no known last round, but where after each round there is a next round with probability \(d < 1\).

• We consider possibilities for rewards for “good behavior” and punishment for “bad behavior”.


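Prisoners’ Dilemma (the stage game)

Rows are Player 1's actions and columns are Player 2's actions. Each cell lists Player 1's payoff, then Player 2's payoff.

| Player 1 \ Player 2 | Cooperate | Defect |
|---|---|---|
| Cooperate | R, R | S, T |
| Defect | T, S | P, P |

T > R > P > S, where T = “Temptation”, R = “Reward”, P = “Punishment”, S = “Sucker”.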


Extreme Punishment: (the Grim Trigger Strategy)

• In Prisoners’ Dilemma, Grim Trigger is the following strategy.

• Play Cooperate on the first play. So long as the opponent plays Cooperate, you cooperate.

• But if the opponent ever defects, you defect on all remaining plays.

• Grim Trigger never forgives.
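The rule can be written as a small function of the history of play. The sketch below is illustrative only: the "C"/"D" encoding, the `play` helper, and the `always_defect` opponent are assumptions introduced for the example.

```python
def grim_trigger(my_history, their_history):
    """Cooperate until the opponent has ever defected; then defect forever."""
    return "D" if "D" in their_history else "C"


def always_defect(my_history, their_history):
    """Hypothetical opponent that defects in every round."""
    return "D"


def play(strategy1, strategy2, rounds):
    """Run both strategies for a fixed number of rounds; return their action strings."""
    h1, h2 = [], []
    for _ in range(rounds):
        a1 = strategy1(h1, h2)  # each strategy sees (own history, opponent history)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return "".join(h1), "".join(h2)


print(play(grim_trigger, grim_trigger, 5))   # ('CCCCC', 'CCCCC')
print(play(grim_trigger, always_defect, 5))  # ('CDDDD', 'DDDDD')
```

Against a defector, Grim Trigger is exploited once and then defects in every later round, which is the "never forgives" property.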


When is there a symmetric SPNE where all play Grim Trigger?

• Suppose that the other player is playing Grim Trigger.

• If you play Grim Trigger as well, then you will cooperate as long as the game continues, and you will receive a payoff of \(R\) in every round.

• If the probability that the game will continue after any round is \(d\), the expected payoff from playing Grim Trigger when the other guy is playing Grim Trigger is
\[ R \times (1 + d + d^2 + d^3 + d^4 + \dots) = \frac{R}{1-d}. \]


What if you defect against Grim Trigger?

• If you defect and the other guy is playing Grim Trigger, you will get a payoff of \(T > R\) the first time that you defect. But after this, the other guy will always play Defect. The best you can do then is to always defect as well.

• Your expected payoff from defecting is therefore
\[ T + P\,(d + d^2 + d^3 + d^4 + \dots) = T + \frac{Pd}{1-d}. \]


Cooperate vs Defect

• If the other guy is playing Grim Trigger and nobody has yet defected, your expected payoff from playing Cooperate is \(R/(1-d)\).

• If the other guy is playing Grim Trigger and nobody has yet defected, your expected payoff from playing Defect is \(T + Pd/(1-d)\).

• Cooperate is better for you if \(R/(1-d) > T + Pd/(1-d)\), which implies \(d > (T-R)/(T-P)\).

• Example: If \(T = 10\), \(R = 5\), \(P = 2\), then the condition is \(d > 5/8\).

• If \(d\) is too small, it pays to “take the money and run”.
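The comparison can be reproduced with exact arithmetic. The following sketch assumes the payoff streams described above (R forever versus T once and then P forever); the function names are hypothetical.

```python
from fractions import Fraction


def cooperate_value(R, d):
    """Expected payoff from cooperating forever against Grim Trigger."""
    return R / (1 - d)


def defect_value(T, P, d):
    """Expected payoff from defecting now and being punished forever after."""
    return T + P * d / (1 - d)


def grim_trigger_threshold(T, R, P):
    """Critical continuation probability (T - R) / (T - P)."""
    return Fraction(T - R, T - P)


T, R, P = 10, 5, 2
d_star = grim_trigger_threshold(T, R, P)
print(d_star)  # 5/8

for d in (Fraction(1, 2), Fraction(3, 4)):
    # Cooperation pays only when d is above the threshold.
    print(d, cooperate_value(R, d), defect_value(T, P, d))
```

At exactly d = 5/8 the two payoffs coincide, so the threshold is the point of indifference.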


Other equilibria?

• Grim Trigger is an SPNE if \(d\) is large enough.

• Are there other SPNEs?

• Yes. For example, both playing Always Defect is an equilibrium.

• If the other guy is playing Always Defect, what is your best response in any subgame?

• Another is: play Defect for the first 10 rounds, then play Grim Trigger.


The “Folk Theorem”: A general result

• The “good news”: In a repeated game with complete information, where the probability d that it will be continued to the next round is sufficiently close to 1, an efficient outcome can always be sustained as a subgame perfect Nash equilibrium.


More about the Folk Theorem

• “Not-so-good news”: In a repeated game of incomplete information with d close to one, not only can an efficient outcome be sustained as a Nash equilibrium, so can almost anything else.

• Possible explanation for why men wear neckties or women wear absurdly painful high heels.


Details of a Folk Theorem

• Consider a repeated game with an inefficient Nash equilibrium.

• Consider a strategy called Strategy A: do some quite arbitrary specified sequence of plays so long as everybody else does their specified drill. If anyone fails to do so, revert to your inefficient Nash equilibrium action.

• If everybody prefers the result when all follow the arbitrary sequence to the inefficient Nash equilibrium, then for d close to 1, the strategy profile “Everybody uses Strategy A” is a subgame perfect Nash equilibrium.


Forgiveness

• Does the grim trigger strategy have to be so unrelenting?

• In the real world, why might it not be a good idea to have an unforgiving punishment?

• What if you get a noisy signal about the other player’s action?

• What if the other player made a one-time mistake?

• This question is much wrestled with in religion and in politics.


Tit for Tat: a more forgiving strategy

• What if both players play the following strategy in the infinitely repeated Prisoners’ Dilemma?

• Cooperate on the first round. Then on any round do what the other guy did on the previous round.

• Suppose the other guy plays tit for tat.

• If I play tit for tat too, what will happen?
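One way to explore the question is to simulate a few rounds. This is a minimal sketch: the "C"/"D" encoding, the `play` helper, and the `defect_once` deviation (defect in round 1, cooperate afterwards) are assumptions made for illustration.

```python
def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous action."""
    return their_history[-1] if their_history else "C"


def defect_once(my_history, their_history):
    """Hypothetical deviation: defect in round 1, cooperate in every later round."""
    return "D" if not my_history else "C"


def play(strategy1, strategy2, rounds):
    """Run both strategies for a fixed number of rounds; return their action strings."""
    h1, h2 = [], []
    for _ in range(rounds):
        a1 = strategy1(h1, h2)  # each strategy sees (own history, opponent history)
        a2 = strategy2(h2, h1)
        h1.append(a1)
        h2.append(a2)
    return "".join(h1), "".join(h2)


print(play(tit_for_tat, tit_for_tat, 5))  # ('CCCCC', 'CCCCC')
print(play(defect_once, tit_for_tat, 5))  # ('DCCCC', 'CDCCC')
```

Two tit-for-tat players cooperate in every round. After a single defection, tit for tat punishes once and then returns to cooperation, which is what makes it more forgiving than Grim Trigger.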


Payoffs

• If you play tit for tat when the other guy is playing tit for tat, you get an expected payoff of
\[ R\,(1 + d + d^2 + d^3 + d^4 + \dots) = \frac{R}{1-d}. \]

• Suppose instead that you choose to play “Always Defect” when the other guy is playing tit for tat.

• You will get
\[ T + P\,(d + d^2 + d^3 + d^4 + \dots) = T + \frac{Pd}{1-d}. \]

• This is the same comparison as with Grim Trigger. Tit for tat is a better response to tit for tat than Always Defect if \(d > (T-R)/(T-P)\).


Another try

• Sucker punch him and then get him to forgive you.

• If the other guy is playing tit for tat and you play D on the first round, then C ever after, you will get a payoff of T on the first round, S on the second round, and then R forever. The expected payoff is
\[ T + Sd + d^2 R\,(1 + d + d^2 + d^3 + d^4 + \dots) = T + Sd + \frac{d^2 R}{1-d}. \]


Which is better?

• Tit for tat and “cheat and ask for forgiveness” give the same payoff from round 3 on.

• Cheat and ask for forgiveness gives T in round 1 and S in round 2.

• Tit for tat gives R in all rounds.

• So tit for tat is better if \(R + dR > T + dS\), which means \(d(R-S) > T-R\), or \(d > (T-R)/(R-S)\).

• If \(T = 10\), \(R = 6\), and \(S = 1\), this means \(d > 4/5\).

• But if \(T = 10\), \(R = 5\), and \(S = 1\), this would be the case only if \(d > 5/4\), which can’t happen. In this case, tit for tat could not be a Nash equilibrium.
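Both numerical examples can be verified with exact fractions. The sketch below assumes the two payoff streams described above (R in every round versus T, then S, then R forever); the function names are hypothetical.

```python
from fractions import Fraction


def tit_for_tat_value(R, d):
    """Expected payoff from playing tit for tat against tit for tat."""
    return R / (1 - d)


def cheat_then_cooperate_value(T, R, S, d):
    """Expected payoff from defecting once, then cooperating forever."""
    return T + S * d + d**2 * R / (1 - d)


def forgiveness_threshold(T, R, S):
    """Critical continuation probability (T - R) / (R - S)."""
    return Fraction(T - R, R - S)


for T, R, S in [(10, 6, 1), (10, 5, 1)]:
    d_star = forgiveness_threshold(T, R, S)
    # A threshold of 1 or more cannot be met by any probability d < 1.
    verdict = "attainable" if d_star < 1 else "not attainable with d < 1"
    print(T, R, S, d_star, verdict)
```

With T = 10, R = 6, S = 1 the threshold is 4/5; with R = 5 it is 5/4, which no continuation probability can exceed.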


Some Problems


Problem 1: Chapter 13

A FISHERMAN’S CATCH AND PAYOFF

| Number of Own Boats | Number of Other People’s Boats | Payoff |
|---|---|---|
| 1 | 2 | 25 |
| 1 | 3 | 20 |
| 1 | 4 | 15 |
| 2 | 2 | 45 |
| 2 | 3 | 35 |
| 2 | 4 | 20 |


What is the Nash equilibrium for the stage game for the three fishermen?

A) All send one boat.
B) All send two boats.
C) There is more than one Nash equilibrium for the stage game.
D) There are no pure strategy Nash equilibria, but there is a mixed strategy Nash equilibrium for the stage game.
E) There are no pure or mixed strategy Nash equilibria for the stage game.
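The pure-strategy equilibria can be found by brute force. This sketch assumes the payoff table reads as (own boats, total boats sent by the other two fishermen) → payoff and that each fisherman sends either 1 or 2 boats; the names `PAYOFF`, `payoff`, and `is_nash` are made up for the example. It checks pure strategies only.

```python
from itertools import product

# (own boats, total boats sent by the other two) -> payoff, copied from the table
PAYOFF = {(1, 2): 25, (1, 3): 20, (1, 4): 15,
          (2, 2): 45, (2, 3): 35, (2, 4): 20}


def payoff(profile, i):
    """Payoff to fisherman i when boats are sent according to `profile`."""
    own = profile[i]
    others = sum(profile) - own
    return PAYOFF[(own, others)]


def is_nash(profile):
    """True if no single fisherman gains by changing only his own boat count."""
    for i in range(len(profile)):
        for alt in (1, 2):
            deviated = profile[:i] + (alt,) + profile[i + 1:]
            if payoff(deviated, i) > payoff(profile, i):
                return False
    return True


equilibria = [p for p in product((1, 2), repeat=3) if is_nash(p)]
print(equilibria)  # [(2, 2, 2)]
```

Sending two boats pays more than sending one in every row of the table, so the search finds a single profile even though everyone would earn more (25 rather than 20) if all sent one boat.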


Can efficiency be sustained by the Grim Trigger?

• Suppose that the other two fishermen are playing the grim trigger strategy: send one boat so long as nobody has ever sent two boats, and if anybody ever sends two boats, send two boats ever after.

• If you and the others play the grim trigger strategy, you will always send 1 boat and so will they.


If others are playing grim trigger strategy, would you want to?

• If you play grim trigger, you will always send 1 boat. Your payoff will be 25 in every period.

• Assume that a fisherman discounts later profits by a factor \(d\) per period. The value of this stream is then
\[ 25\,(1 + d + d^2 + d^3 + \dots) = \frac{25}{1-d}. \]

• If instead you send 2 boats, you will get a payoff of 45 the first time, but only 20 thereafter.

• The value of this stream is \(45 + 20\,(d + d^2 + d^3 + \dots)\).

• Grim trigger is better if \(25\,(1 + d + d^2 + \dots) > 45 + 20\,(d + d^2 + \dots)\), that is, if \(20 < 5\,(d + d^2 + d^3 + \dots)\).

• This means \(20 < 5d/(1-d)\), which implies \(d > 4/5\).
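The threshold can be checked by valuing the two streams directly. This is a sketch under the assumptions stated above (25 per period on the cooperative path, 45 once and 20 thereafter after a deviation); `stream_value` and `critical_discount` are hypothetical helper names.

```python
from fractions import Fraction


def stream_value(first, later, d):
    """Value of receiving `first` now and `later` in every subsequent period."""
    return first + later * d / (1 - d)


def critical_discount(cooperate, deviate, punish):
    """Discount factor at which staying and deviating are worth the same."""
    return Fraction(deviate - cooperate, deviate - punish)


d_star = critical_discount(25, 45, 20)
print(d_star)  # 4/5

for d in (Fraction(7, 10), Fraction(9, 10)):
    stay = stream_value(25, 25, d)   # one boat forever
    cheat = stream_value(45, 20, d)  # two boats now, punished afterwards
    print(d, stay, cheat, stay > cheat)
```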


Problem 7

The stage game:

• Payoff to player 1 is \(V_1(x_1, x_2) = 5 + x_1 - 2x_2\).

• Payoff to player 2 is \(V_2(x_1, x_2) = 5 + x_2 - 2x_1\).

• The strategy set for each player is the interval \([1, 4]\).


What is a Nash equilibrium for the stage game?

A) Both players choose 4
B) Both players choose 3
C) Both players choose 2
D) Both players choose 1
E) There is no pure strategy Nash equilibrium.
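One way to approach the question is to ask how each player's payoff changes with that player's own choice. The short derivation below uses only the payoff functions stated in the problem; the partial-derivative notation is added here for illustration.

```latex
\[
\frac{\partial V_1}{\partial x_1} = 1 > 0,
\qquad
\frac{\partial V_2}{\partial x_2} = 1 > 0 .
\]
```

Each player's payoff is strictly increasing in that player's own action whatever the other player does, so the largest available action, 4, is a dominant choice for both. At (4, 4) each player gets 5 + 4 - 8 = 1, whereas both choosing 1 would give each 5 + 1 - 2 = 4, so the stage-game equilibrium is inefficient.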


Part b (i)

• If the strategy set is \(X = \{2, 3\}\), when is there a subgame perfect Nash equilibrium in which both players always play 2 so long as nobody has ever played anything else?

• Compare payoff \(V(2,2)\) forever with payoff \(V(3,2)\) in the first period, then \(V(3,3)\) ever after, where \(V(a,b)\) is a player's payoff from choosing \(a\) when the other player chooses \(b\).

• That is, compare 3 forever with 4 in the first period and then 2 forever.
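The comparison can be worked out explicitly. The derivation below is a sketch that assumes \(d\) is the discount factor (or continuation probability) used elsewhere in these slides and that only a single deviation followed by permanent punishment needs to be checked.

```latex
\[
\frac{3}{1-d} \;\ge\; 4 + \frac{2d}{1-d}
\;\iff\; 3 \;\ge\; 4(1-d) + 2d
\;\iff\; 2d \;\ge\; 1
\;\iff\; d \;\ge\; \tfrac{1}{2}.
\]
```

Under these assumptions, always playing 2 can be sustained when \(d\) is at least 1/2.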


Part b (ii): X = [1,4]

• When is there a subgame perfect equilibrium where everybody does y so long as nobody has ever done anything differently, and everybody does z > y if anyone ever does anything other than y?

• First of all, it must be that z = 4, because actions after a violation must be a Nash equilibrium of the stage game.

• When is it true that getting V(y,y) forever is better than getting V(4,y) in the first period and then V(4,4) forever?


Comparison

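• \(V(y,y)\) forever is worth \(V(y,y)/(1-d) = (5-y)/(1-d)\).

• \(V(4,y)\) and then \(V(4,4)\) forever is worth \((9 - 2y) + 1 \cdot d + 1 \cdot d^2 + \dots = 9 - 2y + d/(1-d)\), since \(V(4,y) = 5 + 4 - 2y\) and \(V(4,4) = 1\).

• It works out that \(V(y,y)\) forever is better if \(d\,(8 - 2y) > 4 - y\), that is, if \(d > 1/2\) for any \(y < 4\).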