The Value of Information in Newcomb's Problem and the Prisoners' Dilemma


PAUL SNOW


ABSTRACT. The acts in Newcomb's Problem and the Prisoners' Dilemma are viewed as experiments. The cost of the information yielded by each act-experiment is compared to the value of information in the two problems, which is zero. The non-dominant act-experiments cost more than their information is worth, and are therefore rejected.

Nozick (1969) poses the following puzzle, which he attributes to William Newcomb. The decision maker ("DM") chooses between two acts in ignorance of which of two exhaustive and mutually exclusive states of nature obtains. The states concern which of two actions has already been performed secretly by another person who claims to be a prophet. The pay-off matrix for DM is

              S1           S2
    A1     1 000    1 001 000
    A2         0    1 000 000

Act A1 dominates act A2. The wrinkle in the problem is that DM knows that state S2 obtains if and only if the prophet thought that DM would choose A2. DM is assumed to believe in the prophet's foresight,¹ enough so that the expected utility of A1 is less than that of A2 when the probabilities used reflect DM's belief about the prophet. Should DM choose A1 or A2?
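To make the assumed degree of belief concrete, suppose (an assumption not made in the text) that utility is linear in dollars and that DM assigns a single probability p to the prophet's having foreseen DM's act correctly, whichever act is chosen. Conditioning the state probabilities on the act then gives, as a rough sketch:

```latex
\begin{align*}
EU(A_1) &= p\,(1\,000) + (1-p)\,(1\,001\,000) = 1\,001\,000 - 1\,000\,000\,p,\\
EU(A_2) &= p\,(1\,000\,000) + (1-p)\,(0) = 1\,000\,000\,p,\\
EU(A_2) > EU(A_1) &\iff 2\,000\,000\,p > 1\,001\,000 \iff p > 0.5005 .
\end{align*}
```

Under these assumptions, a belief in the prophet only slightly better than a coin toss is enough to produce the ordering described.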

Much has been written about the problem. Eels (1982) provides a recent bibliography and review. The present discussion assumes that DM believes that the prophet's move is now unalterable, and that the prophet is not certain to be correct.²

Theory and Decision 18 (1985) 129-133. 0040-5833/85.10 © 1985 by D. Reidel Publishing Company.

Newcomb's problem is often described as a conflict between two principles of choice: the dominance principle (which leads to A1) and the expected utility principle (which, when applied a certain way, leads to A2). As Levi (1975) points out, one can just as well view the problem as a conflict between two ways of applying the expected utility principle. On the one hand, DM could view the state probabilities as fixed (which leads to A1); on the other, DM could alter the probabilities according to the act chosen (which leads to A2).
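The two ways of applying the expected utility principle can be compared numerically. The following is a minimal sketch, assuming utility linear in dollars; the function names and the accuracy parameter `p_correct` are illustrative and do not come from the paper.

```python
# Pay-offs in dollars, taken from the pay-off matrix for DM.
PAYOFF = {
    ("A1", "S1"): 1_000, ("A1", "S2"): 1_001_000,
    ("A2", "S1"): 0,     ("A2", "S2"): 1_000_000,
}


def eu_fixed(act, q_s2):
    """Expected pay-off when P(S2) = q_s2 whatever act is chosen."""
    return (1 - q_s2) * PAYOFF[(act, "S1")] + q_s2 * PAYOFF[(act, "S2")]


def eu_conditioned(act, p_correct):
    """Expected pay-off when the state probabilities depend on the act.

    Assumption: the prophet is correct with probability p_correct,
    so A1 goes with S1 and A2 goes with S2 with that probability.
    """
    p_s2 = p_correct if act == "A2" else 1 - p_correct
    return (1 - p_s2) * PAYOFF[(act, "S1")] + p_s2 * PAYOFF[(act, "S2")]


if __name__ == "__main__":
    for q in (0.1, 0.5, 0.9):
        # With fixed probabilities the gap is the same for every q.
        print("fixed", q, eu_fixed("A1", q) - eu_fixed("A2", q))
    for p in (0.6, 0.9):
        print("conditioned", p,
              eu_conditioned("A1", p), eu_conditioned("A2", p))
```

With fixed probabilities A1 leads by the same $1 000 for every value of `q_s2`; with act-conditioned probabilities and a sufficiently trusted prophet, A2 comes out ahead.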

There seems to be no controversy that, after choosing an act, DM may rationally revise the assessment of the state probabilities. These altered assessments might be expressed in "side bets", if such were available. Thus, for example, a DM who chose A1 would give good odds that S1 obtains. Side bets are unavailable, however.

Since the acts provide probabilistic information about the states but do not influence which state obtains, the acts may properly be viewed as experiments. Newcomb's problem then becomes one of picking which of the two experiments DM ought to perform.

It seems correct to argue that the information yielded by A2 costs $1 000, while that given by A1 costs nothing. The value of the information in either case is zero, since it cannot change one's choice. A1 emerges as the only experiment whose cost does not exceed its value.
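The cost and value figures in this argument can be checked with a short calculation. This is only a sketch: it reads the "cost" of an act-experiment as the pay-off forgone in each state relative to the best act in that state, and it computes the conventional expected value of perfect information under fixed state probabilities. Both readings, and all function names, are assumptions made for illustration.

```python
ACTS = ("A1", "A2")
STATES = ("S1", "S2")

# Pay-offs in dollars, taken from the pay-off matrix for DM.
PAYOFF = {
    ("A1", "S1"): 1_000, ("A1", "S2"): 1_001_000,
    ("A2", "S1"): 0,     ("A2", "S2"): 1_000_000,
}


def best_payoff(state):
    """Highest pay-off available if the state were known."""
    return max(PAYOFF[(act, state)] for act in ACTS)


def experiment_cost(act):
    """Pay-off forgone in each state by performing this act-experiment."""
    return {s: best_payoff(s) - PAYOFF[(act, s)] for s in STATES}


def value_of_perfect_information(q_s2):
    """Conventional EVPI with P(S2) fixed at q_s2 (utility linear in dollars)."""
    prior = {"S1": 1 - q_s2, "S2": q_s2}
    informed = sum(prior[s] * best_payoff(s) for s in STATES)
    uninformed = max(
        sum(prior[s] * PAYOFF[(act, s)] for s in STATES) for act in ACTS
    )
    return informed - uninformed


if __name__ == "__main__":
    for act in ACTS:
        print(act, experiment_cost(act))
    print(value_of_perfect_information(0.5))
```

Because A1 is the best act in both states, knowing the state never changes the choice, so the computed value of information is zero for every prior, while A2 forgoes $1 000 in each state.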

Alternatively, we note that DM would presumably be indifferent between cost-free perfect information and A1. DM would also be indifferent between perfect information at a cost of $1 000 (inducing DM to select A1³) and A2. Since perfect information for free is preferred to perfect information at the cost of a thousand dollars, A1 is preferred to A2 by transitivity.⁴

In ordinary experience with games against nature, an experiment can increase the expected utility value of a choice only if the experiment opens the possibility of choosing differently than one would without the experiment. The heuristic "Don't pay for information unless it could change your act" is thus usually consistent with the expected utility principle, and is itself a special kind of dominance principle.

Newcomb's problem, through its element of prophecy, presents a case where the heuristic and the expected utility principle conflict. Paying for good news does increase the expected utility of one's prospects, but it does not lead to a different decision. On the contrary, it leads to an intransitive ordering among the acts and perfect information at selected costs. Axiomatic expected utility depends on transitivity; it cannot be applied in a way that yields an intransitive ordering.

Although Newcomb's Problem relies on a prophet, an unusual element for a decision problem, a related puzzle arises in real life. Lewis (1979), among others, notes that the classic Prisoners' Dilemma two-person co-operative game presents to each player a pay-off matrix structurally similar to Newcomb's.

For each player, S1 can be "The other player plays A1", and S2 can be "The other player plays A2". Each player moves in ignorance of the other's act.

A1 still dominates A2. If both players pursue their dominant strategy, then both get $1 000. If neither plays the dominant A1, however, they both do better, and get $1 000 000 apiece.⁵
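The dominance claim and the two symmetric outcomes can be verified mechanically. The sketch below assumes that each player faces exactly the Newcomb pay-off matrix, with the other player's act in the role of the state; the text states only the two symmetric outcomes, so the off-diagonal figures (0 and 1 001 000) are carried over from that matrix as an assumption. The function names are illustrative.

```python
ACTS = ("A1", "A2")

# PAYOFF[(own_act, other_act)] is one player's pay-off in dollars.
# The game is assumed symmetric, so the same table serves both players.
PAYOFF = {
    ("A1", "A1"): 1_000, ("A1", "A2"): 1_001_000,
    ("A2", "A1"): 0,     ("A2", "A2"): 1_000_000,
}


def dominates(a, b):
    """True if act a pays strictly more than act b whatever the other plays."""
    return all(PAYOFF[(a, other)] > PAYOFF[(b, other)] for other in ACTS)


def outcome(first, second):
    """Pay-offs to (first player, second player) for a pair of acts."""
    return PAYOFF[(first, second)], PAYOFF[(second, first)]


if __name__ == "__main__":
    print("A1 dominates A2:", dominates("A1", "A2"))
    for pair in (("A1", "A1"), ("A2", "A2"), ("A2", "A1")):
        print(pair, outcome(*pair))
```

The mixed pair shows the risk run by a player who chooses A2 while the other plays the dominant A1.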

It is sometimes argued that one ought to play A2 in such a circumstance. One rationale is that DM's play of A2 leads to a high probability assessment that the other player will make the same play.⁶ One's post-experiment expected utility analysis, then, may lead to a choice of A2 instead of the dominant A1.

Once again, the experiment is ganged to the act, so whatever information one obtains cannot change one's action. The value of the information is therefore zero, and only A1 offers its information for free. Preference for A2 over A1 will entail an intransitive ordering among the acts and perfect information at selected prices.

Both the Prisoners' Dilemma and Newcomb's Problem can be analyzed on grounds that go beyond the decision theorist's exclusive concern with the explicit pay-offs and consistent probability assessments. For example, DM might be concerned with magnanimity or the fear of losing a big prize while displaying an excess of cleverness (even if that isn't why one lost the prize).

Nevertheless, both problems do present puzzles within the conventional theory. Nozick's original point in posing Newcomb's Problem seems justified. The expected utility principle cannot be applied uncritically. On some occasions, it will conflict with robust principles of decision making.

Fortunately, the difficulty that has been revealed seems to require neither extensive revision to the serviceable expected utility principle nor ingenious ad hoc analysis⁷ of the particular puzzle problem. One need only be careful that costly information pay for itself in possibilities for more profitable actions. The required care does not exceed the application of a heuristic routinely invoked in ordinary purchase-of-information problems.

NOTES

¹ In Nozick's version, this is because the prophet has faultlessly made predictions in the past. A rational DM is not obliged to infer that the prophet has any powers whatsoever on such evidence. Perhaps it isn't all that difficult to guess which way most people will behave in decision problems. Take Newcomb's Problem itself. As a thought experiment, how bad would your record of "prophecies" be if you had the opportunity to chat socially with people before they were presented with the problem? What if you had the benefit of a year's apprenticeship with a carnival fortune-teller, so that your attention to subtle cues was sharp and shrewd?

² "Certain" is used in the sense that the combinations (A1, S2) and (A2, S1) would be impossible, and therefore there would be no problem.

³ If S1 is known to hold, then the best act is A1; so, too, if S2 is known to hold.

⁴ Using similar arguments, one can derive Nozick's conjecture that the usual dominance principle, when there is a dominant act, should prevail over expected utility when the states are probabilistically but not causally dependent on DM's acts; that is, when an experiment is involved.

⁵ Some people who endorse the dominant strategy find that disturbing. A pair of "irrational" players will each do better than either member of a "rational" pair. The observation is correct, but it fails to account for the substantial risk taken by an irrational player. The superior result is obtained only if one is fortunate enough to avoid the inferior result that occurs if the other player chooses the dominant play while one is caught with the non-dominant. It is neither unusual nor disturbing that great wealth can accrue to those who successfully run risks that daunt others.

⁶ The idea is that the players are in symmetrical circumstances. If one is willing to assume that the players are likely to use similar approaches to the problem, then what one player does provides information about what the other does.

⁷ See, for example, Sorensen (1983).

REFERENCES

Eels, E.: 1982, Rational Decisions and Causality, Cambridge University Press, Cambridge, U.K.

Levi, I.: 1975, 'Newcomb's Many Problems', Theory and Decision 6, 161-175.

Lewis, D.: 1979, 'Prisoners' Dilemma Is a Newcomb's Problem', Philosophy & Public Affairs 8, 235-240.

Nozick, R.: 1969, 'Newcomb's Problem and Two Principles of Choice', in Essays in Honor of Carl G. Hempel, Reidel, Dordrecht, pp. 114-146.

Sorensen, R. A.: 1983, 'Newcomb's Problem: Recalculations for the One-boxer', Theory and Decision 15, 399-404.

University of New Hampshire, Department of Mechanical Engineering, Durham, NH 03824, U.S.A.