Meeting 8: Decision theory, Utility
(732A66/info/Meeting6_20.pdf)

Page 1

Meeting 8:

Decision theory, Utility

Page 2

Exercise 5.6

Components of the decision problem:

Actions (equal to procedures here): Run the campaign (a1) or not (a2).

States of the world: Gain 10 percent (θ1), 20 percent (θ2) or 30 percent (θ3).

Consequences: Payoffs: with a1: 150000 · (θ/10) − 200000, with a2: 0

Note! One could consider adding a fourth state of the world: "gain 0%". This would be the only
state possible with action a2 but impossible with action a1. However, since in this state the
maximum payoff (and hence the loss) will be 0, it will have no effect on the decision problem.

Page 3: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Payoff table

                            θ = 10%    θ = 20%    θ = 30%
Run campaign (a1)           −50000     100000     250000
Do not run campaign (a2)         0          0          0

L_ij = max_k R_kj − R_ij

⇒ L_i1 = 0 − R_i1 ;  L_i2 = 100000 − R_i2 ;  L_i3 = 250000 − R_i3

Loss table

                            θ = 10%    θ = 20%    θ = 30%
Run campaign (a1)            50000          0          0
Do not run campaign (a2)         0     100000     250000
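The payoff-to-loss conversion above can be checked with a few lines of Python (a minimal sketch; the dictionary layout and names are my own, the numbers come from the slide):

```python
# Opportunity-loss table from a payoff table: L_ij = max_k R_kj - R_ij
payoff = {
    "run campaign":        [-50000, 100000, 250000],  # theta = 10%, 20%, 30%
    "do not run campaign": [0, 0, 0],
}

# Column-wise maxima: the best achievable payoff in each state of the world
col_max = [max(col) for col in zip(*payoff.values())]  # [0, 100000, 250000]

loss = {action: [m - r for m, r in zip(col_max, row)]
        for action, row in payoff.items()}
print(loss)
# {'run campaign': [50000, 0, 0], 'do not run campaign': [0, 100000, 250000]}
```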

Page 4: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Exercise 5.7

Firm B is assumed to run a campaign only if Firm A does so.

Firm A cannot lose any market share, but the gain of new market share may be
zero ⇒ one new state of the world: "gain is 0%".

Modified payoff table

                       θ = 0%     θ = 10%    θ = 20%    θ = 30%
Run campaign          −200000     −50000     100000     250000
Do not run campaign         0          0          0          0

Page 5: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Exercise 5.17

The payoff table is

                            θ = 10%    θ = 20%    θ = 30%
Run campaign (δ1)           −50000     100000     250000
Do not run campaign (δ2)         0          0          0

P(θ̃ = 20%) = P(θ̃ = 30%) = 3 · P(θ̃ = 10%)
P(θ̃ = 10%) + P(θ̃ = 20%) + P(θ̃ = 30%) = 1  ⇒  7 · P(θ̃ = 10%) = 1
⇒ P(θ̃ = 10%) = 1/7 and P(θ̃ = 20%) = P(θ̃ = 30%) = 3/7

Hence, ER(δ1) = −50000 · (1/7) + 100000 · (3/7) + 250000 · (3/7) = 1000000/7 ≈ 143000

ER(δ2) = 0

Run campaign!

To formally show this, we could have introduced the state "θ = 0%" and put a point mass
of probability 1 on this state when the decision is δ2.
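The elicited prior and the expected payoffs in Exercise 5.17 can be reproduced exactly with rational arithmetic (a sketch; variable names are my own):

```python
from fractions import Fraction as F

# Prior from the slide: P(20%) = P(30%) = 3 * P(10%), probabilities sum to 1
p10 = F(1, 7)
prior = {10: p10, 20: 3 * p10, 30: 3 * p10}
assert sum(prior.values()) == 1

payoff_run = {10: -50000, 20: 100000, 30: 250000}

er_run = sum(payoff_run[t] * prior[t] for t in prior)  # expected payoff of delta_1
er_not = 0                                             # "do not run" pays 0 in every state
print(er_run)           # 1000000/7, i.e. about 143000
print(er_run > er_not)  # True -> run the campaign
```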

Page 6: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Exercise 5.18

P(θ̃ = 10% | B not advertising) = 1/7 and
P(θ̃ = 20% | B not advertising) = P(θ̃ = 30% | B not advertising) = 3/7

P(θ̃ = 0% | B advertising) = 1

P(B advertising | A advertising) = 2/3

P(θ̃ = 10%) = P(θ̃ = 10% | B not advertising) · P(B not advertising)
            + P(θ̃ = 10% | B advertising) · P(B advertising) = (1/7) · (1/3) + 0 · (2/3) = 1/21

P(θ̃ = 20%) = P(θ̃ = 20% | B not advertising) · P(B not advertising)
            + P(θ̃ = 20% | B advertising) · P(B advertising) = (3/7) · (1/3) + 0 · (2/3) = 3/21

P(θ̃ = 30%) = P(θ̃ = 20%) = 3/21

P(θ̃ = 0%) = P(θ̃ = 0% | B not advertising) · P(B not advertising)
           + P(θ̃ = 0% | B advertising) · P(B advertising) = 0 · (1/3) + 1 · (2/3) = 2/3

Page 7: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

The payoff table applicable is

                       θ = 0%     θ = 10%    θ = 20%    θ = 30%
Run campaign          −200000     −50000     100000     250000
Do not run campaign         0          0          0          0

Hence,

ER(δ1) = −200000 · (2/3) + (−50000) · (1/21) + 100000 · (3/21) + 250000 · (3/21)
       = −1800000/21 ≈ −86000

ER(δ2) = 0

Do not run campaign!
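The whole of Exercise 5.18, from the law of total probability down to the decision, can be verified with exact fractions (a sketch; names are my own, values from the slides):

```python
from fractions import Fraction as F

# Law of total probability over whether firm B advertises; P(B advertising) = 2/3
p_b_adv = F(2, 3)
cond_not = {0: F(0), 10: F(1, 7), 20: F(3, 7), 30: F(3, 7)}  # given B not advertising
cond_adv = {0: F(1), 10: F(0), 20: F(0), 30: F(0)}           # given B advertising

marginal = {t: cond_not[t] * (1 - p_b_adv) + cond_adv[t] * p_b_adv for t in cond_not}
# marginal: {0: 2/3, 10: 1/21, 20: 1/7, 30: 1/7}   (3/21 reduces to 1/7)

payoff_run = {0: -200000, 10: -50000, 20: 100000, 30: 250000}
er_run = sum(payoff_run[t] * marginal[t] for t in marginal)
print(er_run)      # -600000/7, equal to -1800000/21, about -86000
print(er_run < 0)  # True -> do not run the campaign
```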

Page 8: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Utility

What do we mean by utility?

Example

Consider the following lottery:

The price of a lottery ticket is SEK 50. If you win the lottery you win SEK 10000.
The probability of winning is 0.05.

Hence, the expected payoff for buying a ticket is SEK (−50) · 0.95 + 9950 · 0.05 = 450.

The expected payoff for not buying a ticket is SEK 0.

According to the ER criterion (maximise the expected payoff) you should buy a ticket.

Now, the following three persons are all considering buying a ticket:

• Martin, who has total assets of SEK 150 000 once he has paid his monthly bills

• Zoran, who has total assets of SEK 2 100 once he has paid his monthly bills

• Sarah, who has total assets of SEK 1 450 000 once she has paid her monthly bills

Do you think all three would make their decisions according to the ER criterion?
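The expected-payoff arithmetic for the lottery is quickly confirmed (note the net gain on a win is 10000 − 50 = 9950):

```python
price, prize, p_win = 50, 10000, 0.05

# Net payoff: lose the ticket price with prob 0.95, win prize minus price with prob 0.05
er_buy = (-price) * (1 - p_win) + (prize - price) * p_win
er_not = 0.0
print(er_buy)  # 450.0
```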

Page 9: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Temporarily we can work with consequences of making a decision (taking an

action). Such consequences depend also on the state of the world and a

consequence can thus be written

c = c(δ(x), θ)

In the rest of the course we will downplay the need to define a decision
procedure and to separate procedure from action. When working with non-
probabilistic criteria this separation is necessary, but with probabilistic criteria like
max(ER) and min(EL) it suffices to work with actions.

To be consistent with the notation used in Winkler we will from now on use a to
denote a specific action (and not δ(x)).

Hence, we will write c = c(a, θ)

Consequences can often be expressed in monetary terms, i.e. using payoff functions
we may have c(a, θ) = R(a, θ).

Cash equivalents may be used to transform non-monetary consequences into

monetary ones.

Page 10: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

The following notation is used for comparison of consequences, when the
consequences are such that it is not possible to use a simple numerical ordering:

c_i ≺ c_j   means that consequence c_j is preferred to consequence c_i

c_i ~ c_j   means that c_i and c_j are equally preferred

c_i ≾ c_j   means that c_i is not preferred to c_j

Back to the example with the lottery:

• Martin has total assets of SEK 150 000 once he has paid his monthly bills

• Zoran has total assets of SEK 2 100 once he has paid his monthly bills

• Sarah has total assets of SEK 1 450 000 once she has paid her monthly bills

Could the preferences be like the following?

Martin: not buying a ticket ≺ buying a ticket

Zoran: buying a ticket ≺ not buying a ticket

Sarah: buying a ticket ~ not buying a ticket

Page 11: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Clearly, the ER criterion (maximising the expected payoff) is not always the
obvious probabilistic criterion. A consequence of an action can be preferred to
another consequence by one decision-maker, while the opposite can hold for
another decision-maker.

Another example

Assume that when the temperature is above 25 °C and you have decided to wear
long trousers and a long-sleeved shirt, you will as a consequence feel unusually hot:

c1 = c(a = "longs", θ > 25 °C)

Moreover, assume that when the temperature is below 15 °C and you have decided
to wear shorts and a t-shirt, you will as a consequence feel unusually cold:

c2 = c(a = "shorts", θ < 15 °C)

Your preference order would be one of c1 ≺ c2, c2 ≺ c1 and c1 ~ c2.

Page 12: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

If you think it is always better to feel warm than cold, your preference order will be

c2 ≺ c1

c1 = c(a = "longs", θ > 25 °C)
c2 = c(a = "shorts", θ < 15 °C)

Another person, feeling the same as you, may really dislike feeling too warm and
hence has the preference order

c1 ≺ c2

A third person, also feeling the same as you, may be someone who would always
complain as soon as weather conditions and choice of garments do not "fit" well,
and probably has the preference order

c1 ~ c2

Page 13: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

To allow for a relative desirability that deviates from the linear
comparability of monetary consequences, we introduce a so-called utility function:

U(c) = U(c(a, θ)) = U(a, θ)

If the difference in payoff between two pairs of action and state of the world is dR, i.e.

dR = R(a1, θ1) − R(a2, θ2)

then any of the following three differences in utility may hold:

U(a1, θ1) − U(a2, θ2) < k · dR
U(a1, θ1) − U(a2, θ2) = k · dR
U(a1, θ1) − U(a2, θ2) > k · dR

where k is any constant > 0 that accounts for utility and payoff possibly being
given on different scales.

Page 14: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Two axioms of utility:

1. If c1 ≺ c2 then U(c1) < U(c2), and if c1 ~ c2 then U(c1) = U(c2).

2. If

   • O1 = obtaining consequence c1 for certain

   • O2 = obtaining consequence c2 with probability p and consequence c3 with
     probability 1−p (a "p-mixture")

   • O1 ~ O2

   then U(c1) = p · U(c2) + (1−p) · U(c3)

Hence, it is not necessary to work with preferences and their notations (≺, ~, ≾).
All preferences can be expressed in terms of the utility function:

c1 ≺ c2 ⇔ U(c1) < U(c2)
c1 ~ c2 ⇔ U(c1) = U(c2)
c1 ≾ c2 ⇔ U(c1) ≤ U(c2)

Page 15: Meeting 8: Decision theory, Utility732A66/info/Meeting6_20.pdf · Decision theory, Utility. Exercise 5.6 Components of the decision problem: Actions (equal procedures here): Run the

Now, assume U(a, θ) is a utility function and let W(a, θ) = c + d · U(a, θ), where c
and d are constants with d > 0.

If U(ai, θk) < U(aj, θl) [where i ≠ j or k ≠ l or both; c(ai, θk) ≺ c(aj, θl)], then

W(aj, θl) − W(ai, θk) = (c + d · U(aj, θl)) − (c + d · U(ai, θk))
                      = d · (U(aj, θl) − U(ai, θk)) > 0

since d > 0 and the difference in utilities is > 0.

If U(ai, θk) = U(aj, θl) [where i ≠ j or k ≠ l or both], then

W(aj, θl) − W(ai, θk) = (c + d · U(aj, θl)) − (c + d · U(ai, θk))
                      = d · (U(aj, θl) − U(ai, θk)) = 0

If U(ai, θk) = p · U(aj1, θl1) + (1−p) · U(aj2, θl2) [utilities for 3 different consequences], then

W(ai, θk) = c + d · U(ai, θk) = c + d · (p · U(aj1, θl1) + (1−p) · U(aj2, θl2))
          = p · c + (1−p) · c + d · p · U(aj1, θl1) + d · (1−p) · U(aj2, θl2)
          = p · (c + d · U(aj1, θl1)) + (1−p) · (c + d · U(aj2, θl2))
          = p · W(aj1, θl1) + (1−p) · W(aj2, θl2)


Hence W satisfies the axioms whenever U does: a utility function is only unique up to a
positive linear transformation (W = c + d · U with d > 0).

Page 17

The expected utility of an action a with respect to a probability distribution of the
states of the world is obtained – analogously to how expected payoff and expected
loss are obtained – by integrating the utility function against the distribution's
probability density (or mass) function g(θ):

EU = E_g[U(a, θ̃)] = ∫_θ U(a, θ) · g(θ) dθ

When data x are taken into account, g(θ) is the posterior pdf/pmf f''(θ | x) = q(θ | x):

EU = ∫_θ U(a, θ) · f''(θ | x) dθ = ∫_θ U(a, θ) · q(θ | x) dθ

When data are not taken into account, g(θ) is the prior pdf/pmf f'(θ) = p(θ):

EU = ∫_θ U(a, θ) · f'(θ) dθ = ∫_θ U(a, θ) · p(θ) dθ
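For a continuous state, the integral EU = ∫ U(a, θ) g(θ) dθ can be approximated by simple numerical quadrature. A sketch using the midpoint rule, with an assumed illustrative utility U(a, θ) = θ and prior g(θ) = 2θ on [0, 1] (both chosen only so the exact answer, 2/3, is easy to verify):

```python
# EU = integral of U(a, theta) * g(theta) over theta, by the midpoint rule
def expected_utility(utility, pdf, lo, hi, n=10000):
    h = (hi - lo) / n
    return sum(utility(lo + (i + 0.5) * h) * pdf(lo + (i + 0.5) * h)
               for i in range(n)) * h

# Assumed example: g(theta) = 2*theta on [0, 1] (a valid pdf), U(a, theta) = theta
eu = expected_utility(lambda t: t, lambda t: 2 * t, 0.0, 1.0)
print(round(eu, 4))  # close to 2/3, the exact value of the integral of 2*theta^2
```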

Page 18

Assessing/finding a utility function

For a particular state of nature let

c1 be the worst consequence and c2 be the best consequence

Normalise – without loss of generality – the utility function U(c(a, θ)) such that

U(c1) = 0 and U(c2) = 1

For a particular action with consequence c it must then always hold that

0 ≤ U(c(a, θ)) ≤ 1

Now, assume a gamble in which you should choose between

1. Obtaining consequence c for certain

2. Obtaining consequence c1 with probability 1−p and consequence c2 with
   probability p

With the first choice the expected utility is U(c(a, θ)) = U(a, θ).

With the second choice the expected utility is

U(c1) · (1−p) + U(c2) · p = 0 · (1−p) + 1 · p = p

Page 19

For a certain value of p, p0 say, you will be indifferent between

1. Obtaining consequence c for certain

2. Obtaining consequence c1 with probability 1−p0 and
   consequence c2 with probability p0

Hence U(a, θ) = (1−p0) · U(c1) + p0 · U(c2) = (1−p0) · 0 + p0 · 1 = p0

This means that U(a, θ) can be seen as proportional to the probability of obtaining
the best consequence.
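The indifference point p0 gives a direct elicitation procedure: repeatedly adjust p until the decision maker is indifferent, and read off U(c) = p0. A sketch simulating such an elicitation by bisection, where the decision maker's answers come from an assumed square-root utility (the utility form, amounts, and function names are all hypothetical):

```python
import math

worst, best = 0.0, 10000.0  # reference consequences with U(worst)=0, U(best)=1

def prefers_gamble(c, p):
    """Simulated decision maker with an assumed sqrt utility, normalised to [0, 1]."""
    u_certain = (math.sqrt(c) - math.sqrt(worst)) / (math.sqrt(best) - math.sqrt(worst))
    return p > u_certain  # the gamble's expected utility is exactly p

def elicit_utility(c, iters=50):
    """Bisect on p until indifference: the limit p0 equals U(c)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if prefers_gamble(c, mid):
            hi = mid   # gamble preferred -> indifference point is below mid
        else:
            lo = mid   # certainty preferred -> indifference point is above mid
    return (lo + hi) / 2

print(round(elicit_utility(2500.0), 3))  # 0.5, since sqrt(2500)/sqrt(10000) = 0.5
```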

Page 20

Pr(Best consequence | a, θ) ∝ U(a, θ)

Pr(Best consequence | a) ∝ ∫_θ U(a, θ) · g(θ) dθ = U(a, g)

a_optimal = argmax_{a ∈ A} U(a, g)     (A is the set of possible actions)

The Bayes action (decision) is

a_B = argmax_{a ∈ A} U(a, p)       when no data are used
a_B = argmax_{a ∈ A} U(a, q, x)    when data x are used

Hence, the optimal action is the action that maximises the expected utility under
the probability distribution that rules the state of nature.

Page 21

Example

Assume you are choosing between fixing the interest rate of your mortgage loan for

one year or keeping the floating interest rate for this period.

Let us say that the floating rate for the moment is 4 % and the fixed rate is 5 %.

The floating rate may however increase during the period and we may approximately

assume that with probability g1 = 0.10 the average floating rate will be 7 %, with

probability g2 = 0.20 the average floating rate will be 6 % and with probability g3 =

0.70 the floating rate will stay at 4 %.

Let a1 = Fix the interest rate and a2 = Keep the floating interest rate

Let θ̃ = average floating rate (in %) for the coming period

U(a1, θ) = 4 − 5 = −1 if θ = 4;   6 − 5 = 1 if θ = 6;   7 − 5 = 2 if θ = 7
U(a2, θ) = 5 − 4 = 1 if θ = 4;   5 − 6 = −1 if θ = 6;   5 − 7 = −2 if θ = 7

U(a1, g) = (−1) · 0.7 + 1 · 0.2 + 2 · 0.1 = −0.3

U(a2, g) = 1 · 0.7 + (−1) · 0.2 + (−2) · 0.1 = 0.3  ⇒  a_B = a2
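The interest-rate example can be checked line by line (a sketch; the dictionary names are my own, the utilities and prior come from the slide):

```python
# States: average floating rate theta in {4, 6, 7} (%), with prior g
g = {4: 0.70, 6: 0.20, 7: 0.10}

# Utilities measured as interest-rate differences, following the slide's convention
U = {
    "a1_fix":   {4: 4 - 5, 6: 6 - 5, 7: 7 - 5},  # fix the rate at 5%
    "a2_float": {4: 5 - 4, 6: 5 - 6, 7: 5 - 7},  # keep the floating rate
}

# Expected utilities: about -0.3 for a1 and 0.3 for a2
EU = {a: sum(U[a][t] * g[t] for t in g) for a in U}
print(max(EU, key=EU.get))  # a2_float is the Bayes action
```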

Page 22

Loss function

When utilities are all non-desirable it is common to describe the decision
problem in terms of losses rather than utilities. The loss function in Bayesian
decision theory is defined as

LS(a, θ) = max_{a' ∈ A} U(a', θ) − U(a, θ)

Then, the Bayes action with the use of data can be written

a_B = argmax_{a ∈ A} ∫_θ U(a, θ) · q(θ | x) dθ
    = argmax_{a ∈ A} ∫_θ (max_{a' ∈ A} U(a', θ) − LS(a, θ)) · q(θ | x) dθ
    = argmin_{a ∈ A} ∫_θ LS(a, θ) · q(θ | x) dθ = argmin_{a ∈ A} LS(a, q, x)

i.e. the action that minimises the expected posterior loss. (The last step holds
because ∫_θ max_{a'} U(a', θ) · q(θ | x) dθ does not depend on a.)
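The identity argmax EU = argmin E[LS] can be checked on the interest-rate example from the previous slide (a sketch; LS is computed column-wise from the utility table):

```python
g = {4: 0.70, 6: 0.20, 7: 0.10}
U = {"a1": {4: -1, 6: 1, 7: 2}, "a2": {4: 1, 6: -1, 7: -2}}

# LS(a, theta) = max over actions of U(., theta) minus U(a, theta)
best_per_state = {t: max(U[a][t] for a in U) for t in g}
LS = {a: {t: best_per_state[t] - U[a][t] for t in g} for a in U}

EU = {a: sum(U[a][t] * g[t] for t in g) for a in U}
EL = {a: sum(LS[a][t] * g[t] for t in g) for a in U}

# Maximising expected utility and minimising expected loss pick the same action
print(max(EU, key=EU.get) == min(EL, key=EL.get))  # True
```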

Page 23

Example

A person asking for medical care has some symptoms that may be connected with
two different diseases, A and B. But the symptoms could also be temporary and
disappear within a reasonable time.

For A there is a therapy that cures the disease if it is present and hence
removes the symptoms. If, however, the disease is not present, the treatment will
lead to the symptoms remaining with the same intensity.

For B there is a therapy that generally "reduces" the intensity of the
symptoms by 10% regardless of whether B is present or not. If B is present the
reduction is 40%.

Assume that A is present with probability 0.3 and that B is present with probability
0.4. Assume further that A and B cannot be present at the same time, and therefore
that the probability of the symptoms being just temporary is 0.3.

What is the Bayes action in this case: treatment for A, treatment for B or no
treatment?

Page 24

Use normalised utilities:

U(Decision, State of nature) = 0 : symptoms remain with the same intensity
U(Decision, State of nature) = 1 : symptoms disappear

Treatment for A (TA):  U(TA, A) = 1    U(TA, B) = 0    U(TA, neither A nor B) = 0
Treatment for B (TB):  U(TB, A) = 0.1  U(TB, B) = 0.4  U(TB, neither A nor B) = 1
No treatment (NT):     U(NT, A) = 0    U(NT, B) = 0    U(NT, neither A nor B) = 1

Pr(A) = 0.3,  Pr(B) = 0.4
Pr(A ∩ B) = 0  ⇒  Pr(neither A nor B) = 1 − Pr(A) − Pr(B) = 0.3

⇒ U(TA) = 1 · 0.3 + 0 · 0.4 + 0 · 0.3 = 0.30
  U(TB) = 0.1 · 0.3 + 0.4 · 0.4 + 1 · 0.3 = 0.49
  U(NT) = 0 · 0.3 + 0 · 0.4 + 1 · 0.3 = 0.30

The Bayes action is therapy treatment for B.
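The expected normalised utilities above can be reproduced directly (a sketch; state and action labels are my own shorthand):

```python
# States: disease A, disease B, or neither (symptoms temporary); A and B exclusive
prob = {"A": 0.3, "B": 0.4, "neither": 0.3}

# Normalised utilities from the slide: 0 = symptoms remain, 1 = symptoms disappear
U = {
    "TA": {"A": 1.0, "B": 0.0, "neither": 0.0},  # treating A blocks natural recovery
    "TB": {"A": 0.1, "B": 0.4, "neither": 1.0},  # 10% reduction, 40% if B present
    "NT": {"A": 0.0, "B": 0.0, "neither": 1.0},
}

EU = {a: sum(U[a][s] * prob[s] for s in prob) for a in U}
print(max(EU, key=EU.get))  # TB: expected utilities are 0.30, 0.49 and 0.30
```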