
Lecture 8, 11/08/07 Information, Control & Games, Fall 07, Copyright P. B. Luh, S.-C. Chang

Information, Control & Games: Lecture #8
Hierarchical Games

Last Two Times:
• Finite Nash Games
– Feedback Games and Behavior Strategies
• Infinite Nash Games
– Open-Loop, Feedback, Closed-Loop Nash Equilibria
• Cooperative games
– Coalitional games
– Redistribution of payoffs
– Unanimity game and the core
– Majority, vote trading, Landowner & workers game
– Shapley Value under Differential Marginal Contributions
– Cooperative Game and Risk

Next Time
• 11/15 No Class
• 11/22 Midterm exam

Today
• Finite Hierarchical Games
– Motivating Examples
– Solution Concept
– Examples and Results on Finite Games
– An Example of Single-Act Infinite Games
• The Inducible Region Approach
– Approach for Single-Stage Problems
– Principle of Optimality
– Multi-Stage Games
• Team Decision Theory
– A Motivating Example
– A Formal Model and Solution Methodology
– A Canonical Example


• Reading Assignments for Today

1. Text 6.2

2. T. S. Chang, P. B. Luh, “Derivation of Necessary and Sufficient Conditions for Single-Stage Stackelberg Games via the Inducible Region Concept,” IEEE Transactions on Automatic Control, Vol. AC-29, No. 1, Jan. 1984, pp. 63-66.

3. P. B. Luh, S. C. Chang, T. S. Chang, “Solutions and Properties of Multi-Stage Stackelberg Games,” Automatica, Vol. 20, No. 2, March 1984, pp. 251-256.

4. Relevant sections of T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory

5. Y. C. Ho, “Team Decision Theory and Information Structures,” Proceedings of IEEE, Vol. 68, No. 6, June 1980, pp. 644-654.


Hierarchical Games: Motivating Examples

• Consider the Dating Game discussed previously, with costs (to be minimized) listed as (Gentleman, Lady):

Gentleman \ Lady | Opera    | Football
Opera            | (-1, -2) | (0, 0)
Football         | (0, 0)   | (-2, -1)

– There are two Nash solutions, which are not equivalent (they do not have the same pair of costs) and not interchangeable (mixing the choices of different Nash solutions may not end up with a Nash solution)

Q. Suppose the lady is in a dominating position and can announce and then impose her strategy, knowing that the gentleman will react “rationally.” What should she announce? Why?

The lady would announce and impose Opera: the gentleman's rational reaction is then Opera, giving the costs (-1, -2)

– The DM who holds the powerful position to announce and then impose his/her strategy is the Leader

– Followers then react rationally to the Leader’s strategy

– Hierarchical, Leader-Follower, or Stackelberg Game

Q. Who is the leader? For example, after marriage, would the lady remain the leader?
– One earns the position of leader, or holds it by the authority of position (still earned)

Q. Other examples of hierarchical games?
– Grading for this course

• Allocation of 100 points: What is important

• Homework: Grading 2 randomly selected problems

• Term Project: Extra credit for cross-discipline teaming

– What would happen if a student said “I am not going to hand in any homework assignment for this course”?

Other examples:

Seat belt law

Speed limits

Solution Concept
• Consider a two-person problem with 1 leader (DM1) and 1 follower (DM2)
– DM1: Strategy $\gamma_1 \in \Gamma_1$, cost function $J_1(\gamma_1, \gamma_2)$
– DM2: Strategy $\gamma_2 \in \Gamma_2$, cost function $J_2(\gamma_1, \gamma_2)$

SGD. How to describe the hierarchical game concept?

• For a given $\gamma_1$, DM2 reacts rationally by minimizing his/her cost, i.e., $\min_{\gamma_2 \in \Gamma_2} J_2(\gamma_1, \gamma_2)$
– DM2’s rational reaction set:
$R_2(\gamma_1) \equiv \{\xi \in \Gamma_2 \mid J_2(\gamma_1, \xi) \le J_2(\gamma_1, \gamma_2) \ \forall \gamma_2 \in \Gamma_2\}$

• The leader needs to find the best strategy $\gamma_1^S$ to minimize $J_1(\gamma_1, \gamma_2)$, taking into account the follower’s reactions:
$\min_{\gamma_1 \in \Gamma_1} J_1(\gamma_1, \gamma_2)$, s.t. $\gamma_2 \in R_2(\gamma_1)$

Q. If you were DM1, what would you do?
– Need to have some behavioral assumptions
– The leader is assumed to be conservative, safeguarding against the worst case

Q. What if there are multiple elements in $R_2(\gamma_1)$?

Example, with costs $(J_1, J_2)$:

DM1 \ DM2 | L       | M      | R
L         | (0, 0)  | (1, 0) | (3, 1)
R         | (2, -1) | (2, 0) | (-1, -1)

Selecting L for the costs (1, 0). Mathematically?

$\min_{\gamma_1 \in \Gamma_1} \max_{\gamma_2 \in R_2(\gamma_1)} J_1(\gamma_1, \gamma_2)$

$\gamma_1^S$ ~ The Stackelberg strategy for the leader

$J_1^S(\gamma_1^S, \gamma_2^S)$ ~ The Stackelberg cost
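The min-max rule can be checked on the 2 x 3 example with a short script. This is only a sketch: the function name `stackelberg_leader_rows` and the list-of-rows layout are choices made here for illustration, not part of the lecture.

```python
# Costs are (J1, J2); both DMs minimize. Rows are DM1's choices (the leader).
def stackelberg_leader_rows(costs):
    """Return (row index, guaranteed leader cost) under the min-max rule."""
    best_row, best_cost = None, float("inf")
    for i, row in enumerate(costs):
        j2_min = min(j2 for _, j2 in row)
        # Rational reaction set R2: every column attaining the follower's minimum.
        reaction = [k for k, (_, j2) in enumerate(row) if j2 == j2_min]
        # Conservative leader: assume the reaction that is worst for J1.
        worst_j1 = max(row[k][0] for k in reaction)
        if worst_j1 < best_cost:
            best_row, best_cost = i, worst_j1
    return best_row, best_cost


example = [
    [(0, 0), (1, 0), (3, 1)],     # DM1 plays L
    [(2, -1), (2, 0), (-1, -1)],  # DM1 plays R
]
row, cost = stackelberg_leader_rows(example)
print(row, cost)
```

Row index 0 corresponds to L. The guaranteed cost 1 comes from the follower being indifferent between L and M after DM1 plays L.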

Q. Is the problem easier or more difficult compared to Nash?

• The problem is difficult even if $R_2(\gamma_1)$ is a singleton for all $\gamma_1$
– Difficult to characterize the reaction set $R_2(\gamma_1)$
– Difficult to optimize $J_1(\gamma_1, \gamma_2)$ s.t. $\gamma_2 \in R_2(\gamma_1)$

Graphical Interpretation

Q. What is $R_2(u_1)$? What is the solution with DM1 as the leader (S1)? What is $R_1(u_2)$? Solution with DM2 as the leader (S2)? Nash equilibrium (N)?

[Figure: level curves for DM1 and for DM2 in the plane of the two decisions, with the reaction curves R1 and R2, the Stackelberg solutions S1 and S2, and the Nash equilibrium N marked]

Q. How do we compare S1 with N for DM1? Why?

S1 is at least as good as N for DM1 if $R_2(\gamma_1)$ is a singleton for each $\gamma_1$

N is the intersection of R1 and R2, and S1 has the best $J_1$ on R2

The same is true for DM2

Examples

Example 1. A Matrix Game

Q. What is the solution when DM1 is the leader? With DM2 as the leader? The Nash solutions?

DM1 \ DM2 | $\gamma_{21}$ | $\gamma_{22}$ | $\gamma_{23}$ | $\gamma_{24}$
L         | (-4, -1)      | (2, 0)        | (0, 1)        | (2, -1)
M         | (-3, -2)      | (0, 3)        | (0, -3)       | (-3, -2)
R         | (4, -1)       | (1, 0)        | (1, 0)        | (-2, -1)

$\gamma_1^S$ = M with costs (0, -3)

$\gamma_2^S = \gamma_{24}$ with costs (-3, -2)

– The Nash equilibrium points are: (L, $\gamma_{21}$) with costs (-4, -1) and (M, $\gamma_{23}$) with costs (0, -3) (intersection of R1 and R2)
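The three answers of Example 1 can be reproduced by brute force. The sketch below assumes the conservative (min-max) rule for the leader whenever the follower is indifferent; the helper names `leader_choice` and `pure_nash` are illustrative only.

```python
# Costs (J1, J2) of Example 1; rows L, M, R for DM1, columns gamma_21..gamma_24 for DM2.
COSTS = [
    [(-4, -1), (2, 0), (0, 1), (2, -1)],    # L
    [(-3, -2), (0, 3), (0, -3), (-3, -2)],  # M
    [(4, -1), (1, 0), (1, 0), (-2, -1)],    # R
]


def leader_choice(costs, leader):
    """leader = 0: DM1 picks a row; leader = 1: DM2 picks a column.

    Returns (leader's choice index, leader's guaranteed cost).
    """
    follower = 1 - leader
    n_rows, n_cols = len(costs), len(costs[0])
    own = range(n_rows) if leader == 0 else range(n_cols)
    other = range(n_cols) if leader == 0 else range(n_rows)

    def cell(a, b):
        # a is the leader's choice, b the follower's choice.
        return costs[a][b] if leader == 0 else costs[b][a]

    best = None
    for a in own:
        f_min = min(cell(a, b)[follower] for b in other)
        reaction = [b for b in other if cell(a, b)[follower] == f_min]
        worst = max(cell(a, b)[leader] for b in reaction)
        if best is None or worst < best[1]:
            best = (a, worst)
    return best


def pure_nash(costs):
    """All cells where neither DM can lower his/her own cost unilaterally."""
    n_rows, n_cols = len(costs), len(costs[0])
    points = []
    for i in range(n_rows):
        for j in range(n_cols):
            row_ok = costs[i][j][0] == min(costs[r][j][0] for r in range(n_rows))
            col_ok = costs[i][j][1] == min(costs[i][c][1] for c in range(n_cols))
            if row_ok and col_ok:
                points.append((i, j))
    return points


print(leader_choice(COSTS, 0), leader_choice(COSTS, 1), pure_nash(COSTS))
```

With zero-based indices, row 1 is M and column 3 is $\gamma_{24}$, so the script agrees with the slide.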

Example 2. A Game in Extensive Form

[Figure: game tree. DM1 moves first (L or R), then DM2 moves (L or R) after observing DM1's move. Leaf costs $(J_1, J_2)$:]

DM1 | DM2 | Costs
L   | L   | (3, -7)
L   | R   | (6, -6)
R   | L   | (4, -10)
R   | R   | (10, -5)

Q. Who is the leader?

What is the solution when DM1 is the leader? When DM2 is the leader? Nash?

– DM1 as the leader: $\gamma_1^S$ = L with costs (3, -7) ~ Easy

– DM2 can be the leader ~ The leader doesn’t have to move first. He has to announce his strategy first and then impose it

– DM2: $2^2 = 4$ strategies

DM2 strategy | $\gamma_{21}$ | $\gamma_{22}$ | $\gamma_{23}$ | $\gamma_{24}$
If DM1 = L   | L             | L             | R             | R
If DM1 = R   | L             | R             | L             | R
Outcome      | (3, -7)       | (3, -7)       | (4, -10)      | (6, -6)

Nash (backward induction):
$\gamma_2^N$: DM1 = L ~ L; DM1 = R ~ L
$\gamma_1^N$: L with costs (3, -7)
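The normal-form conversion with DM2 as the leader can be sketched as follows. The leaf labels are an assumption: they are the assignment consistent with the outcome row (3, -7), (3, -7), (4, -10), (6, -6) of the strategy table. The function name `dm2_as_leader` is illustrative.

```python
from itertools import product

# Leaf costs (J1, J2), indexed by (DM1 move, DM2 move). Assumed labeling,
# chosen to be consistent with the outcome row of the strategy table.
LEAVES = {
    ("L", "L"): (3, -7), ("L", "R"): (6, -6),
    ("R", "L"): (4, -10), ("R", "R"): (10, -5),
}
MOVES = ("L", "R")


def dm2_as_leader(leaves):
    """Enumerate DM2's announced strategies and return the best (strategy, outcome)."""
    best = None
    # A DM2 strategy maps DM1's observed move to DM2's move.
    for replies in product(MOVES, repeat=len(MOVES)):
        strategy = dict(zip(MOVES, replies))
        # DM1 reacts by minimizing J1 against the announced strategy.
        j1_min = min(leaves[(m, strategy[m])][0] for m in MOVES)
        reaction = [m for m in MOVES if leaves[(m, strategy[m])][0] == j1_min]
        # Conservative leader: assume the reaction that is worst for J2.
        worst = max((leaves[(m, strategy[m])] for m in reaction), key=lambda c: c[1])
        if best is None or worst[1] < best[1][1]:
            best = (strategy, worst)
    return best


strategy, outcome = dm2_as_leader(LEAVES)
print(strategy, outcome)
```

Under these assumptions the best announcement is $\gamma_{23}$ (play R if DM1 plays L, play L if DM1 plays R), which gives DM2 the cost -10, better than the -7 obtained by backward induction.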


Example 2. (Continued)


SGD. The problem with DM2 as the leader was solved by converting it to normal form. Can it be directly solved in extensive form? How?

• Will come back to this at the next hour

Relevant Results on Finite Games

Theorem 3.3. Every two-person finite game admits a pure Stackelberg strategy for the leader

Proof. Intuitively clear from the finiteness of $\Gamma_1$ and $\Gamma_2$

Proposition 3.16. For a given two-person finite game, if
– a pure Nash solution $(\gamma_1, \gamma_2)$ exists, and
– $R_2(\gamma_1)$ is a singleton for every $\gamma_1 \in \Gamma_1$
then $J_1^S \le J_1^N$

Proof. By contradiction

An Example of Single-Act Infinite Games

Problem Formulation

• Two manufacturers, M1 and M2, produce a single product type with the same manufacturing technology
– Mi produces quantity $u_i$ at the cost $C_i = c u_i + d$
– d > 0 is the setup cost, and c > 0 is the unit production cost

• Price is determined by the demand-supply relationship: $p = a - b(u_1 + u_2)$, with a > 0 and b > 0

• Profit for M1: $p u_1 - C_1 = -J_1(u_1, u_2)$, or
$J_1(u_1, u_2) = [b(u_1 + u_2) - a]u_1 + c u_1 + d$
– Similarly, $J_2(u_1, u_2) = [b(u_1 + u_2) - a]u_2 + c u_2 + d$

• Several cases will be examined:
– Single manufacturer (monopoly case with $u_2 \equiv 0$)
– Nash equilibrium
– Stackelberg solution with M1 as the leader

Case 1. Single Manufacturer

Q. How to solve the problem?

• Necessary condition: $dJ_1/du_1 = 2bu_1 - a + c = 0 \Rightarrow u_1 = (a - c)/2b$

• Suppose a = 50, b = 1, c = 2, and d = 10, then
$u_1^M = (a - c)/2b = 24$, $p^M = a - b u_1^M = 50 - 24 = 26$
$C_1^M = c u_1^M + d = 2 \times 24 + 10 = 58$
$J_1^M = -p^M u_1^M + C_1^M = -26 \times 24 + 58 = -566$

Case 2. Nash equilibrium

Q. How to solve the problem?
• Necessary conditions, with $J_1(u_1, u_2) = [b(u_1 + u_2) - a]u_1 + c u_1 + d$ and the analogous $J_2$:
$\partial J_1/\partial u_1 = 2bu_1 + bu_2 - a + c = 0$
$\partial J_2/\partial u_2 = bu_1 + 2bu_2 - a + c = 0$
$\Rightarrow u_1^N = u_2^N = (a - c)/3b$ ~ Symmetric

• With the same a = 50, b = 1, c = 2, and d = 10, then what?
$u_1^N = u_2^N = (a - c)/3b = 16$, $u_1^N + u_2^N = 32 > u_1^M = 24$
$p^N = a - b(u_1^N + u_2^N) = 50 - 32 = 18 < p^M = 26$
$C_1^N = C_2^N = c u_1^N + d = 2 \times 16 + 10 = 42$, $C_1^N + C_2^N = 84 > 58$
$J_1^N = J_2^N = -18 \times 16 + 42 = -246$
$-(J_1^N + J_2^N) = 492 < -J_1^M = 566$

Case 3. Stackelberg solution with M1 as the leader

SGD. How to solve the problem?
• Reaction of M2:
$\partial J_2/\partial u_2 = bu_1 + 2bu_2 - a + c = 0 \Rightarrow u_2 = (a - c - bu_1)/2b$

• M1’s problem: $\min J_1(u_1, u_2)$, s.t. $u_2 = (a - c - bu_1)/2b$. How to solve it?
$J_1(u_1, u_2) = [b(u_1 + u_2) - a]u_1 + c u_1 + d$
$= (bu_1 - a - c)u_1/2 + c u_1 + d$
$dJ_1/du_1 = bu_1 - 0.5a + 0.5c = 0 \Rightarrow u_1^S = (a - c)/2b$

• From here, one can get
$u_2^S = (a - c - bu_1^S)/2b = (a - c)/4b = u_1^S/2$
$p^S = a - b(u_1^S + u_2^S) = (a + 3c)/4$

• With the same a = 50, b = 1, c = 2, and d = 10, then what?
$u_1^S = (a - c)/2b = 24 = u_1^M > u_1^N = 16$
$u_2^S = 0.5 u_1^S = 12 < u_2^N$
$u_1^S + u_2^S = 36 > u_1^N + u_2^N = 32 > u_1^M = 24$
$p^S = (a + 3c)/4 = 56/4 = 14 < p^N = 18 < p^M = 26$
$C_1^S = c u_1^S + d = 2 \times 24 + 10 = 58$
$C_2^S = c u_2^S + d = 2 \times 12 + 10 = 34$
$J_1^S = -p^S u_1^S + C_1^S = -14 \times 24 + 58 = -278$
$-J_1^S = 278 > -J_1^N = 246$; $-J_1^S = 278 < -J_1^M = 566$
$J_2^S = -p^S u_2^S + C_2^S = -14 \times 12 + 34 = -134$
$-J_2^S = 134 < -J_2^N = 246$
$-(J_1^S + J_2^S) = 412 < -(J_1^N + J_2^N) = 492 < -J_1^M = 566$
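The numbers for the three cases can be recomputed from the closed-form solutions above. A minimal sketch, with the function names `cost` and `solve_cases` chosen here for illustration:

```python
def cost(u_own, u_other, a, b, c, d):
    """J_i = [b(u_1 + u_2) - a] u_i + c u_i + d, i.e., the negative of the profit."""
    return (b * (u_own + u_other) - a) * u_own + c * u_own + d


def solve_cases(a, b, c, d):
    """Monopoly, Nash, and Stackelberg (M1 leads) quantities and costs."""
    u_m = (a - c) / (2 * b)
    monopoly = {"u1": u_m, "J1": cost(u_m, 0.0, a, b, c, d)}

    u_n = (a - c) / (3 * b)
    nash = {"u1": u_n, "u2": u_n,
            "J1": cost(u_n, u_n, a, b, c, d), "J2": cost(u_n, u_n, a, b, c, d)}

    u1_s = (a - c) / (2 * b)
    u2_s = (a - c - b * u1_s) / (2 * b)  # M2's rational reaction to u1_s
    stackelberg = {"u1": u1_s, "u2": u2_s,
                   "J1": cost(u1_s, u2_s, a, b, c, d),
                   "J2": cost(u2_s, u1_s, a, b, c, d)}
    return monopoly, nash, stackelberg


monopoly, nash, stackelberg = solve_cases(a=50, b=1, c=2, d=10)
print(monopoly, nash, stackelberg)
```

The leader's profit rises from 246 to 278 relative to Nash, while the follower's falls from 246 to 134, matching the comparison on the slide.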


The Inducible Region Approach

• DM2 is renamed DM0, with costs stated as $(J_0, J_1)$

• If DM1 is the leader, the problem is quite simple

• The best DM0 can do depends on his/her ability to influence DM1’s cost

Q. What is the worst that DM0 can penalize DM1?

[Figure: game tree. DM1 moves first (L or R), then DM0 moves (L or R). Leaf costs $(J_0, J_1)$:]

DM1 | DM0 | $(J_0, J_1)$
L   | L   | (-3, 7)
L   | R   | (3, 8)
R   | L   | (-9, 10)
R   | R   | (10, 3)

SGD. Now with DM0 as the leader, we want to solve the problem directly in extensive form. How?

$\min_{u_1} \max_{u_0} J_1(u_0, u_1) \equiv M$, with the maximum penalizing strategy and the resulting $(u_0^M, u_1^M)$

– M is the worst value that DM1 could ever get. Implications?
– Any outcome with $J_1 > M$ cannot be achieved
– Conversely, any outcome with $J_1(u_0, u_1) < M$ is “inducible,” i.e., there exists a strategy $\gamma_0$ so that $(u_0, u_1)$ is the resulting outcome

[Figure: the game tree with leaf costs $(J_0, J_1)$: (3, 8), (-3, 7), (-9, 10), (10, 3)]

– On the boundary $J_1(u_0, u_1) = M$:
• $(u_0^M, u_1^M)$ is inducible
• Outcomes with $J_0(u_0, u_1) < J_0(u_0^M, u_1^M)$ are not inducible (behavior assumption)
• Outcomes with $J_0(u_0, u_1) > J_0(u_0^M, u_1^M)$ are okay but not worthwhile to consider

[Figure: the same tree with the leaf (10, 3) replaced by (-9, 8); the slide also marks the value (6, 8)]

DM0’s Optimization Problem

• Select the minimal cost within the “inducible region”: $IR = \{(u_0, u_1) \mid J_1(u_0, u_1) < (\le)\ M\}$

[Figure: the game tree with leaf costs $(J_0, J_1)$: (3, 8), (-3, 7), (-9, 10), (10, 3)]

$\min_{(u_0, u_1) \in IR} J_0(u_0, u_1) \Rightarrow (u_0^S, u_1^S) = (L, L)$, with $(J_0^S, J_1^S) = (-3, 7)$

• To construct DM0’s strategy $\gamma_0^S$:
– If $u_1 = u_1^S$, then $u_0 = u_0^S$
– If $u_1 \ne u_1^S$, make the resulting $J_1 > 7$ to induce DM1 to select $u_1^S$
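The three steps (find M, pick the best point in the interior of IR, build the strategy) can be sketched for this small tree. Two assumptions are made here: the branch labels of the leaves are a reconstruction consistent with the stated solution (L, L) with costs (-3, 7), and boundary outcomes with $J_1 = M$ are simply left out. The function name is illustrative.

```python
# Leaf costs (J0, J1), indexed by (DM1 move, DM0 move). The branch labels are an
# assumption, consistent with the stated solution (u0, u1) = (L, L) at (-3, 7).
LEAVES = {
    ("L", "L"): (-3, 7), ("L", "R"): (3, 8),
    ("R", "L"): (-9, 10), ("R", "R"): (10, 3),
}
MOVES = ("L", "R")


def inducible_region_solution(leaves):
    """Return (M, target outcome key, announced DM0 strategy)."""
    # Worst J1 that DM0 can impose after each DM1 move.
    worst = {u1: max(leaves[(u1, u0)][1] for u0 in MOVES) for u1 in MOVES}
    m = min(worst.values())
    # Interior of the inducible region: J1 strictly below M.
    region = [key for key, (_, j1) in leaves.items() if j1 < m]
    target = min(region, key=lambda key: leaves[key][0])
    u1_s, u0_s = target
    # On the desired path play u0_s; elsewhere use the maximum penalizing move.
    strategy = {}
    for u1 in MOVES:
        if u1 == u1_s:
            strategy[u1] = u0_s
        else:
            strategy[u1] = max(MOVES, key=lambda u0: leaves[(u1, u0)][1])
    return m, target, strategy


m, target, strategy = inducible_region_solution(LEAVES)
print(m, target, LEAVES[target], strategy)
```

Against the returned strategy DM1 compares $J_1 = 7$ (after L) with $J_1 = 10$ (after R) and selects L, so the outcome (-3, 7) is indeed induced.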

Q. How to interpret the approach graphically?

[Figure: the $(u_0, u_1)$ plane, showing the maximum penalizing curve $\gamma_0^M(u_1)$ obtained from $\max_{u_0} J_1(u_0, u_1)$, the level curves $J_1(u_0, u_1) = M$ and $J_1(u_0, u_1) = J_1^S$, and the point $(u_0^S, u_1^S)$ attaining $\min_{IR} J_0(u_0, u_1)$]

Q. How to construct $\gamma_0^S$?

– As long as the curve $u_0 = \gamma_0^S(u_1)$ lies outside the level curve $J_1(u_0, u_1) = J_1^S$ but is tangent to it at $(u_0^S, u_1^S)$
– Could be linear, nonlinear, or even discontinuous

Example

Problem Formulation
$J_0 = \tfrac{1}{2} u_0^2 - \tfrac{1}{2} u_0 u_1 + u_1^2 + a u_1$
$J_1 = (u_0 - \tfrac{1}{2})^2 + 4(u_1 - \tfrac{1}{2})^2$
– with $u_0, u_1 \in [0, 1]$, and “a” is a parameter to be varied

[Figure: the unit square in the $(u_0, u_1)$ plane with the inducible region IR marked]

Delineate IR, find $(u_0^S, u_1^S)$, and construct a $\gamma_0^S$

Delineation of IR
$M \equiv \min_{u_1} \max_{u_0} J_1(u_0, u_1)$

As an intermediate step: $\max_{u_0} J_1(u_0, u_1) \Rightarrow \gamma_0^M(u_1)$. What is it?
– It is easy to see that $\gamma_0^M(u_1) = 0$ or 1, and $u_1^M = \tfrac{1}{2}$
– Consequently, $M = (u_0^M - \tfrac{1}{2})^2 + 4(u_1^M - \tfrac{1}{2})^2 = (\tfrac{1}{2})^2 = \tfrac{1}{4}$, and
$IR = \{(u_0, u_1) \mid (u_0 - \tfrac{1}{2})^2 + 4(u_1 - \tfrac{1}{2})^2 < (\le)\ \tfrac{1}{4}\}$

Finding $(u_0^S, u_1^S)$
– $\min_{(u_0, u_1) \in IR} J_0(u_0, u_1)$ ~ May not be an easy task

Q. Any short cut?
– Can find the “team solution,” i.e., selecting both $u_0$ and $u_1$ to minimize $J_0$ (DM0 and DM1 act as a team, the best possible)
– If the team solution is in IR, then we are almost done
– Otherwise, have to perform the “hard” optimization

• First, find the “team solution”
$J_0 = \tfrac{1}{2} u_0^2 - \tfrac{1}{2} u_0 u_1 + u_1^2 + a u_1$
$\partial J_0/\partial u_0 = u_0 - \tfrac{1}{2} u_1 = 0 \Rightarrow u_0 = \tfrac{1}{2} u_1$
$\partial J_0/\partial u_1 = -\tfrac{1}{2} u_0 + 2u_1 + a = 0 \Rightarrow \tfrac{7}{4} u_1 + a = 0$
– Consequently, $u_0 = -\tfrac{2}{7} a$, and $u_1 = -\tfrac{4}{7} a$
– If $(-\tfrac{2}{7} a, -\tfrac{4}{7} a) \in IR$, then it is $(u_0^S, u_1^S)$

Q. Is $(-\tfrac{2}{7} a, -\tfrac{4}{7} a)$ in IR?
$J_1 = (u_0 - \tfrac{1}{2})^2 + 4(u_1 - \tfrac{1}{2})^2$
$= (-\tfrac{2}{7} a - \tfrac{1}{2})^2 + 4(-\tfrac{4}{7} a - \tfrac{1}{2})^2$
$= (\tfrac{4}{49} a^2 + \tfrac{2}{7} a + \tfrac{1}{4}) + 4(\tfrac{16}{49} a^2 + \tfrac{4}{7} a + \tfrac{1}{4})$
$= \tfrac{68}{49} a^2 + \tfrac{18}{7} a + \tfrac{5}{4} < (\le)\ \tfrac{1}{4}$

– Therefore, $(-\tfrac{2}{7} a, -\tfrac{4}{7} a) \in IR$ if
$\tfrac{68}{49} a^2 + \tfrac{18}{7} a + 1 < (\le)\ 0$, or
$a^2 + \tfrac{63}{34} a + \tfrac{49}{68} < (\le)\ 0$, or
$a \in \left[\tfrac{-63 - \sqrt{637}}{68}, \tfrac{-63 + \sqrt{637}}{68}\right]$, i.e., $a \in [-1.298, -0.555] \equiv A$

• If $a \in A$, then $u_0^S = -\tfrac{2}{7} a$, and $u_1^S = -\tfrac{4}{7} a$

• A possible $\gamma_0^S$:
– If $u_1 = -\tfrac{4}{7} a$, then $u_0 = -\tfrac{2}{7} a$
– Otherwise, $u_0 = 0$
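The value M = 1/4, the interval A, and the membership test for the team solution can be checked numerically. A minimal sketch: the grid resolution and all function names are choices made here, and the closed boundary ($\le$) is used in the membership test.

```python
import math


def j1(u0, u1):
    """Follower's cost J1 = (u0 - 1/2)^2 + 4 (u1 - 1/2)^2."""
    return (u0 - 0.5) ** 2 + 4 * (u1 - 0.5) ** 2


def max_penalty_level(n=101):
    """M = min over u1 of max over u0 of J1, evaluated on an n-point grid of [0, 1]."""
    grid = [k / (n - 1) for k in range(n)]
    return min(max(j1(u0, u1) for u0 in grid) for u1 in grid)


def team_solution(a):
    """Unconstrained minimizer of J0: (u0, u1) = (-2a/7, -4a/7)."""
    return -2 * a / 7, -4 * a / 7


def interval_a():
    """Roots of a^2 + (63/34) a + 49/68 = 0, the end points of A."""
    root = math.sqrt(637)
    return (-63 - root) / 68, (-63 + root) / 68


def team_solution_in_ir(a, m=0.25):
    """True if the team solution is admissible and has J1 <= M."""
    u0, u1 = team_solution(a)
    in_box = 0 <= u0 <= 1 and 0 <= u1 <= 1
    return in_box and j1(u0, u1) <= m


print(max_penalty_level(), interval_a(), team_solution_in_ir(-1.0))
```

For a = -1 the team solution (2/7, 4/7) has $J_1 = 13/196 < 1/4$, so it lies inside IR.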

[Figure: the unit square in the $(u_0, u_1)$ plane with the inducible region IR marked]

• Suppose a = -1, and we want to find a linear $\gamma_0^S$
$u_0^S = -\tfrac{2}{7} a = \tfrac{2}{7}$; $u_1^S = -\tfrac{4}{7} a = \tfrac{4}{7}$

Q. How to find a linear $\gamma_0^S$?
– Find the level curve of $J_1$ through $(u_0^S, u_1^S)$:
$J_1 = (u_0 - \tfrac{1}{2})^2 + 4(u_1 - \tfrac{1}{2})^2 = \tfrac{13}{196}$
– Find the tangent line of the level curve at $(u_0^S, u_1^S)$. Differentiating along the level curve:
$2(u_0 - \tfrac{1}{2})(du_0/du_1) + 8(u_1 - \tfrac{1}{2}) = 0$
$du_0/du_1|_{(u_0^S, u_1^S)} = -8(u_1 - \tfrac{1}{2})/[2(u_0 - \tfrac{1}{2})] = \tfrac{4}{3}$
– Then the tangent line at $(u_0^S, u_1^S)$ is described by
$(u_0 - \tfrac{2}{7}) = \tfrac{4}{3}(u_1 - \tfrac{4}{7})$, or
$u_0 = \tfrac{4}{3} u_1 - \tfrac{10}{21}$ for $u_1 \in [\tfrac{5}{14}, 1]$, and $u_0 = 0$ otherwise
– Or, $u_0 = \tfrac{2}{7}$ for $u_1 = \tfrac{4}{7}$, and $u_0 = 0$ otherwise. Credible?
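One way to check the linear strategy is to let DM1 minimize $J_1$ against it and confirm that the best response is $u_1 = 4/7$. The sketch below uses exact rational arithmetic on a grid; the grid size 1400 is chosen only so that 4/7 is a grid point, and the function names are illustrative.

```python
from fractions import Fraction as F


def gamma0_linear(u1):
    """Announced strategy for a = -1: tangent line on [5/14, 1], zero elsewhere."""
    if u1 >= F(5, 14):
        return F(4, 3) * u1 - F(10, 21)
    return F(0)


def j1(u0, u1):
    """Follower's cost J1 = (u0 - 1/2)^2 + 4 (u1 - 1/2)^2."""
    return (u0 - F(1, 2)) ** 2 + 4 * (u1 - F(1, 2)) ** 2


def dm1_best_response(n=1400):
    """DM1's minimizer of J1(gamma0(u1), u1) over the grid {0, 1/n, ..., 1}."""
    grid = [F(k, n) for k in range(n + 1)]
    return min(grid, key=lambda u1: j1(gamma0_linear(u1), u1))


u1_star = dm1_best_response()
print(u1_star, gamma0_linear(u1_star), j1(gamma0_linear(u1_star), u1_star))
```

Because the line touches the level curve $J_1 = 13/196$ only at the tangency point, every other $u_1$ gives DM1 a strictly larger cost under this announcement.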

Q. What to do if $a \notin A$?

• Have to solve the problem:
Min $J_0 = \tfrac{1}{2} u_0^2 - \tfrac{1}{2} u_0 u_1 + u_1^2 + a u_1$
s.t. $(u_0 - \tfrac{1}{2})^2 + 4(u_1 - \tfrac{1}{2})^2 < (\le)\ \tfrac{1}{4}$
– With a nonlinear constraint, not an easy problem
– Left as an exercise

Principle of Optimality

The Issue

• Nash games in extensive form are solved by backward induction. This is based on the Principle of Optimality
– An optimal strategy has the property that whatever the initial state and time are, all remaining decisions must also constitute an optimal strategy

• For hierarchical games, we have not been able to use backward induction

Q. Why? Does the Principle of Optimality still hold here?

[Figure: timeline. DM1 acts first with $\gamma_1 = u_1$; DM0 then acts with $u_0 = \gamma_0(u_1)$]

– Once $\gamma_0^S$ has been announced and $u_1$ observed, DM0 should set $u_0 = \gamma_0^S(u_1)$. Is there an incentive for DM0 to deviate from this?

Example (Slightly modified)

[Figure: game tree. DM1 moves first (L or R), then DM0 moves (L or R). Leaf costs $(J_0, J_1)$:]

DM1 | DM0 | $(J_0, J_1)$
L   | L   | (-3, 5)
L   | R   | (-10, 9)
R   | L   | (-2, -2)
R   | R   | (4, 7)

Q. Now suppose we are the leader DM0. We announced the strategy, and DM1 selected L. What should we do?
– There is strong incentive for DM0 to select R instead of carrying out what was announced
– Along the optimal path, there is incentive for DM0 to deviate

Q. Suppose for some unknown reason DM1 selects R. What should we do?
– There is also strong incentive for DM0 to select L instead of carrying out what was announced
– Off the optimal path, there is incentive for DM0 to deviate
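The incentive to deviate can be made concrete by comparing the announced move with DM0's best move after the fact. Two assumptions in this sketch: the branch labels of the leaves are a reconstruction consistent with the two answers above, and the announced strategy (L after L, R after R) is the one the inducible region reasoning would give for this tree.

```python
# Leaf costs (J0, J1), indexed by (DM1 move, DM0 move). Assumed labeling,
# consistent with "select R after L" and "select L after R" being the deviations.
LEAVES = {
    ("L", "L"): (-3, 5), ("L", "R"): (-10, 9),
    ("R", "L"): (-2, -2), ("R", "R"): (4, 7),
}
MOVES = ("L", "R")
# Assumed announcement: reward u1 = L with L, penalize u1 = R with R (J1 = 7 > 5).
ANNOUNCED = {"L": "L", "R": "R"}


def dm1_reaction(strategy):
    """DM1's move minimizing J1 if DM0 carries out the announced strategy."""
    return min(MOVES, key=lambda u1: LEAVES[(u1, strategy[u1])][1])


def ex_post_best(u1):
    """DM0's cost-minimizing move once u1 has already been played."""
    return min(MOVES, key=lambda u0: LEAVES[(u1, u0)][0])


def deviation_gain(u1, strategy):
    """Reduction in J0 from abandoning the announcement after observing u1."""
    announced_cost = LEAVES[(u1, strategy[u1])][0]
    best_cost = LEAVES[(u1, ex_post_best(u1))][0]
    return announced_cost - best_cost


for u1 in MOVES:
    print(u1, ANNOUNCED[u1], ex_post_best(u1), deviation_gain(u1, ANNOUNCED))
```

Both gains are positive, on and off the optimal path, which is the sense in which the announced strategy is not optimal stage by stage.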

• The Principle of Optimality does not hold for hierarchical games

• Multi-stage games cannot be solved one stage at a time backward from the terminal stage, as in backward induction

• There exists an inherent incentive for the leader to deviate from what was announced, even in the absence of uncertainties or any unforeseeable interruptions

• Government/CEO credibility is at risk, a very familiar phenomenon

Multi-Stage Hierarchical Games

• Since the Principle of Optimality does not hold, a multi-stage problem cannot be solved by backward induction

Q. What to do?
– The inducible region approach can be extended. How?
– The basic ideas of IR presented earlier still hold here

The General Approach
– Delineate IR
• Worst case analysis for the follower
– $\min_{(u_0, u_1) \in IR} J_0(u_0, u_1)$
• A parameter optimization problem that may not be easy to solve
– Construct $\gamma_0^S$
• Theoretically not hard, but may have practical difficulties

[Figure: two-stage game tree with moves U and D, played in the order DM1, DM0, DM1, DM0. The 16 leaves carry costs $(J_0, J_1)$: (2, 4.5), (3, 9), (4, 10), (5, 7), (-100, 7), (3, 3), (5, 1), (-2, 9), (0, 5), (-100, 10), (3, 5), (7, 6), (4, 2), (5, 3), (9, 5), (-100, 4). Interior nodes are annotated with the min-max values of $J_1$ obtained by backward induction.]

SGD. What is the worst that DM0 can penalize DM1?

$\min_{u_{11}} \max_{u_{01}} [\min_{u_{12}} \max_{u_{02}} J_1(u_0, u_1)] = \min_{u_{11}} \max_{u_{01}} [M_2]$

M = 6

$\gamma_{0t}^M(u_{1t})$ ~ “The maximum penalizing strategy”
• Obtained by backward induction

Q. Which outcomes are inducible, which are not?
– For any $(u_0, u_1)$, if $J_1(u_0, u_1) > M_t$ for any t (any stage) along the path, $u_1$ will never be selected

– Any $(u_0, u_1)$ such that $J_1(u_0, u_1) < M_t$ for all t (for all the stages) along the path is “inducible”
– The boundary $J_1(u_0, u_1) = M_t$: Analyzed carefully as before

$IR = \{(u_0, u_1) \mid J_1(u_0, u_1) < (\le)\ M_t$ for all t along the path$\}$

Finding $(u_0^S, u_1^S)$
– $\min_{(u_0, u_1) \in IR} J_0(u_0, u_1)$
• A parameter optimization problem with outcomes (0, 5)

Constructing $\gamma_0^S$
– If $u_{1t} = u_{1t}^S$, then $u_{0t} = u_{0t}^S$
– Otherwise, try to make the resulting $J_1$ greater than $J_1^S$
• One way is to use the maximum penalizing strategy $\gamma_{0t}^M(u_{1t})$
• Others might be better in case of deviation by DM1
• E.g., (U, D, U, U) would result in $(J_0, J_1)$ = (-100, 7)!


Team Decision Theory

– Decentralized decision-making where DMs have access to different information and are responsible for different decisions, but share the same objective

A Motivating Example

• Problem Context
– Back to the old days with no radio or telephone
– Mr. B of Boston and Mr. N of NYC, knowing their local weather, have to decide whether to go to Hartford today

[Figure: map showing Boston, Hartford, and New York]

– The meeting requires good weather at Hartford. If it rains, then they waste the trip

Q. Should they go or not?

• Mr. B and Mr. N share the same objective function

– Shine at Hartford:

B \ N | Go  | Don’t
Go    | -10 | 3
Don’t | 3   | 0

– Rain at Hartford:

B \ N | Go | Don’t
Go    | 4  | 2
Don’t | 2  | -5

• The only information available is the local weather: B at Boston and N at NYC. B, N, and H are correlated:

B   | R    | R    | R   | R   | S   | S   | S    | S
N   | R    | R    | S   | S   | R   | R   | S    | S
H   | R    | S    | R   | S   | R   | S   | R    | S
Pr. | 0.25 | 0.05 | 0.1 | 0.1 | 0.1 | 0.1 | 0.05 | 0.25

Q. What are possible strategies for Mr. B? Mr. N?

B: 4 strategies

       | $\gamma_{B1}$ | $\gamma_{B2}$ | $\gamma_{B3}$ | $\gamma_{B4}$
S in B | G             | G             | D             | D
R in B | G             | D             | G             | D

N: 4 strategies

       | $\gamma_{N1}$ | $\gamma_{N2}$ | $\gamma_{N3}$ | $\gamma_{N4}$
S in N | G             | G             | D             | D
R in N | G             | D             | G             | D

• Off-line coordination ~ Want to find the best pair of strategies. How?

• Game in normal form: Compute the expected payoff for each pair of strategies (16 of them), and select the best pair
$J(\gamma_B, \gamma_N) = E[J(\gamma_B(B), \gamma_N(N), H)]$
$= \sum_{B, N, H} J(\gamma_B(B), \gamma_N(N), H) \Pr(B, N, H)$

– For example,
$J(\gamma_{B1}, \gamma_{N1}) = 0.25 \times 4 + 0.05 \times (-10) + 0.1 \times 4 + 0.1 \times (-10) + 0.1 \times 4 + 0.1 \times (-10) + 0.05 \times 4 + 0.25 \times (-10) = -3$
$J(\gamma_{B1}, \gamma_{N2}) = 0.25 \times 2 + 0.05 \times 3 + 0.1 \times 4 + 0.1 \times (-10) + 0.1 \times 2 + 0.1 \times 3 + 0.05 \times 4 + 0.25 \times (-10) = -1.75$

• The one with the minimum cost is the solution: $(\gamma_B^*, \gamma_N^*) = (\gamma_{B1}, \gamma_{N1})$, with $J^* = -3$
– With pre-game coordination, there is no difficulty associated with the non-uniqueness of solutions
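The exhaustive search over the 16 strategy pairs can be sketched directly from the two cost tables and the probability table. The data layout and function names below are choices made here for illustration; strategies are indexed 0 to 3 in the same order as $\gamma_{B1}, .., \gamma_{B4}$.

```python
from itertools import product

# Common cost by Hartford weather H and the pair (B's action, N's action).
COST = {
    "S": {("G", "G"): -10, ("G", "D"): 3, ("D", "G"): 3, ("D", "D"): 0},
    "R": {("G", "G"): 4, ("G", "D"): 2, ("D", "G"): 2, ("D", "D"): -5},
}
# Joint probability of the weather at (Boston, NYC, Hartford).
PROB = {
    ("R", "R", "R"): 0.25, ("R", "R", "S"): 0.05,
    ("R", "S", "R"): 0.1, ("R", "S", "S"): 0.1,
    ("S", "R", "R"): 0.1, ("S", "R", "S"): 0.1,
    ("S", "S", "R"): 0.05, ("S", "S", "S"): 0.25,
}
# Strategy k maps local weather to an action: (if shine, if rain) runs through
# (G, G), (G, D), (D, G), (D, D), the same order as on the slide.
STRATEGIES = [dict(zip("SR", acts)) for acts in product("GD", repeat=2)]


def expected_cost(gamma_b, gamma_n):
    """E[J] for a pair of strategies, summing over all weather triples."""
    return sum(p * COST[h][(gamma_b[b], gamma_n[n])] for (b, n, h), p in PROB.items())


def best_pair():
    """Indices of the strategy pair with the minimum expected cost."""
    pairs = product(range(len(STRATEGIES)), repeat=2)
    return min(pairs, key=lambda ij: expected_cost(STRATEGIES[ij[0]], STRATEGIES[ij[1]]))


i, j = best_pair()
print(i, j, expected_cost(STRATEGIES[i], STRATEGIES[j]))
```

The search returns the pair with both indices 0, i.e., both always go, in agreement with the slide.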

Q. Is there a systematic way to find the optimal solution?
– Not an easy task. Shall present a formal model and then go over several examples

A Formal Model and Solution Methodology

A Formal Model

• A set of DMs: {1, 2, .., N}
– Without loss of generality, assume that each DM observes one measurement, makes one decision, and then leaves

• A set of uncertainties $\xi = \{\xi_1, \xi_2, .., \xi_m\}$, with $\xi_j \in \Xi_j$
– Nature’s decisions

• A set of observations: $z = \{z_1, z_2, .., z_N\}$
– $z_n = \eta_n(\xi, u) \in Z_n$ ~ DM n’s observation, subject to causality
– If $z_n = \eta_n(\xi)$ for all n, then static information

• A set of decision variables: $u = \{u_1, u_2, .., u_N\}$
– $u_n \in U_n$ ~ DM n’s decision

• $\gamma_n: Z_n \to U_n$ ~ DM n’s strategy
– $\gamma_n \in \Gamma_n$ ~ The set of admissible strategies, e.g., linear strategies
– $u_n = \gamma_n(z_n)$

• The cost function $J: U \times \Xi \to \mathbb{R}$
– For a given realization of $\xi$:
$J(u, \xi) = J(u_1, u_2, .., u_N, \xi)$ ~ Extensive form
$= J(\gamma_1(z_1), \gamma_2(z_2), .., \gamma_N(z_N), \xi)$
$= J(\gamma, \xi)$ ~ Normal form
$J(\gamma_1, \gamma_2, .., \gamma_N) = E[J(\gamma, \xi)]$ ~ Expected cost function

• The problem: $\min_{\gamma_1, \gamma_2, .., \gamma_N} J(\gamma_1, \gamma_2, .., \gamma_N)$

Q. How to solve the problem?

Solution Methodology
• Functional (as opposed to parameter) optimization
• No systematic way to solve it, except under special conditions
• Possible methods:
– Brute force exhaustive search
– Impose more structures, e.g., best linear/quadratic strategies
– Relax the conditions, e.g., person-by-person optimal

              | $\gamma_{21}$ | $\gamma_{22}$ | $\gamma_{23}$
$\gamma_{11}$ | 0             | 0             | -1
$\gamma_{12}$ | 0             | -3            | 0
$\gamma_{13}$ | -1            | 0             | -2

$J(\gamma_{12}, \gamma_{22}) = -3$ ~ Optimal team solution
$J(\gamma_{13}, \gamma_{23}) = -2$ ~ Person-by-person optimal ~ Nash solution

• Team optimality $\Rightarrow$ person-by-person optimality, but not vice versa
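The distinction can be seen by enumerating the 3 x 3 cost matrix. A short sketch, with illustrative function names; indices are zero-based, so (1, 1) is $(\gamma_{12}, \gamma_{22})$ and (2, 2) is $(\gamma_{13}, \gamma_{23})$.

```python
# Common cost J(gamma_1i, gamma_2j); rows are DM1's strategies, columns DM2's.
J = [
    [0, 0, -1],
    [0, -3, 0],
    [-1, 0, -2],
]


def team_optimal(cost):
    """Cell with the smallest common cost over all strategy pairs."""
    cells = [(i, j) for i in range(len(cost)) for j in range(len(cost[0]))]
    return min(cells, key=lambda ij: cost[ij[0]][ij[1]])


def person_by_person_optimal(cost):
    """Cells that neither DM can improve by changing only his/her own strategy."""
    found = []
    for i in range(len(cost)):
        for j in range(len(cost[0])):
            best_in_col = min(cost[r][j] for r in range(len(cost)))
            best_in_row = min(cost[i])
            if cost[i][j] == best_in_col and cost[i][j] == best_in_row:
                found.append((i, j))
    return found


print(team_optimal(J), person_by_person_optimal(J))
```

The team optimum appears in the person-by-person list, but so does (2, 2) with cost -2, which is not team optimal.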

• $(\gamma_1^*, \gamma_2^*, .., \gamma_N^*)$ is a person-by-person optimal solution iff
$J(\gamma_1^*, \gamma_2^*, .., \gamma_N^*) \le J(\gamma_1^*, \gamma_2^*, .., \gamma_n, \gamma_{n+1}^*, .., \gamma_N^*)$ $\forall \gamma_n \in \Gamma_n$ and $\forall n$
– Equivalently, $\min_{\gamma_n} J(\gamma_1^*, \gamma_2^*, .., \gamma_n, \gamma_{n+1}^*, .., \gamma_N^*) \Rightarrow \gamma_n^*$
– An optimal control problem, and may be solvable
– DMs could coordinate off-line, and no difficulty on the non-uniqueness of solutions

• Consequently, one way to find $(\gamma_1^*, \gamma_2^*, .., \gamma_N^*)$ is:

[Flowchart:]
1. Start: guess $(\gamma_1^g, .., \gamma_N^g)$
2. For each n, solve $\min_{\gamma_n} J(\gamma_1^g, .., \gamma_n, .., \gamma_N^g) \Rightarrow \gamma_n^*$
3. Is $\gamma_n^* = \gamma_n^g$ for all n?
Yes: a person-by-person optimal solution (PBPOS) has been found
No: revise the guess and go back to step 2

– There is no systematic way to revise the guess
– To prove the convergence is quite difficult
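The loop can be tried on the 3 x 3 cost matrix of the previous slide. The revision rule used below, replacing the guess by the current best responses, is only one possible choice and is an assumption of this sketch, since the slide notes that no systematic rule exists.

```python
# Common cost matrix from the earlier 3 x 3 example (rows: DM1, columns: DM2).
J = [
    [0, 0, -1],
    [0, -3, 0],
    [-1, 0, -2],
]


def best_row(cost, col):
    """DM1's best strategy against a fixed strategy of DM2."""
    return min(range(len(cost)), key=lambda r: cost[r][col])


def best_col(cost, row):
    """DM2's best strategy against a fixed strategy of DM1."""
    return min(range(len(cost[0])), key=lambda c: cost[row][c])


def iterate(cost, guess, max_iter=20):
    """Return (final pair, converged flag) for the guess-and-check loop."""
    row, col = guess
    for _ in range(max_iter):
        new_row, new_col = best_row(cost, col), best_col(cost, row)
        if (new_row, new_col) == (row, col):
            return (row, col), True   # guess reproduces itself: PBPOS
        row, col = new_row, new_col   # assumed revision rule
    return (row, col), False


print(iterate(J, (0, 0)), iterate(J, (1, 1)), iterate(J, (0, 1)))
```

Starting from (0, 0) the loop stops at the person-by-person optimum (2, 2) rather than the team optimum (1, 1), and starting from (0, 1) it cycles between (1, 2) and (2, 1) without converging, which illustrates both remarks on the slide.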

Today
– Approach for Single-Stage Problems

– Principle of Optimality

– Multi-Stage Games

• Team Decision Theory
– A Motivating Example

– A Formal Model and Solution Methodology

– A Canonical Example

• Paper Review by Priscillia Hunt, Selini Katsaiti, Dinesh Padmanabhan, Ivailo Kotzev

A Canonical Example

Problem Formulation

• Two DMs. DM0 with control $u_0$, and DM1 with $u_1$
$y_0 = x_0 + b v_0$, b > 0
$y_1 = x_0 + c u_0 + d v_1$, $c \ge 0$, d > 0
– $x_0$, $v_0$, $v_1$ are independent random variables with $x_0 \sim N(0, \sigma^2)$, $v_0 \sim N(0, 1)$, and $v_1 \sim N(0, 1)$
– Information structure yet to be specified
– The cost function:
$J = E\{\tfrac{1}{2}(x_0 + a u_0 + u_1)^2 + h u_0^2 + g u_1^2\}$, with $a, h, g \ge 0$

• By appropriately assigning values to parameters, the above can represent different problems. We shall consider a few

Page 46: Lecture 8, 11/08/07Information, Control & Games, Fall 07, Copyright P. B. Luh, S.-C. Chang 1 Information, Control & Games: Lecture #8 Hierarchical Games

The Static Case – Let a = b = d = 1, c = 0, and h = g = ½

J = E{½ (x0 + u0 + u1)2 + ½u0

2 + ½u12}

0 = {y0}, with y0 = x0 + v0

1 = {y1}, with y1 = x0 + v1

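– With a random initial state $x_0$, two noisy measurements are made: $y_0$ by DM0, and $y_1$ by DM1

[Figure: timeline along $t$ showing the initial state $x_0$, the measurements $y_0$ and $y_1$, and the decisions $u_0(y_0)$ and $u_1(y_1)$]

– Static information structure, since both $y_0$ and $y_1$ are independent of the decisions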

Q. How to solve it?


Linear Person-by-Person Optimal Solution

• We first find the linear person-by-person optimal solution, and then show that it is also team optimal

– Assume $u_0^* = k_0 y_0$ and $u_1^* = k_1 y_1$, with $k_0$ and $k_1$ yet to be determined

DM0’s perspective

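• If $u_1^* = k_1 y_1 = k_1(x_0 + v_1)$, then $\min_{\gamma_0} J(\gamma_0, \gamma_1^*) \Rightarrow \gamma_0^*$

$J = E\{\tfrac{1}{2}(x_0 + u_0 + u_1)^2 + \tfrac{1}{2}u_0^2 + \tfrac{1}{2}u_1^2\}$

$\quad = E\{E\{\tfrac{1}{2}[x_0 + u_0 + k_1(x_0 + v_1)]^2 + \tfrac{1}{2}u_0^2 + \tfrac{1}{2}[k_1(x_0 + v_1)]^2 \mid y_0\}\}$

$\partial J/\partial u_0 = 0 = 2u_0 + E[(k_1 + 1)x_0 + k_1 v_1 \mid y_0]$

$\quad = 2u_0 + (k_1 + 1)E[x_0 \mid y_0]$, since $v_1$ is zero-mean and independent of $y_0$

$\Rightarrow u_0^* = -\tfrac{1}{2}(k_1 + 1)E[x_0 \mid y_0]$

~ What is $E[x_0 \mid y_0]$?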


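$\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} x_0 \\ x_0 + v_0 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma^2 & \sigma^2 \\ \sigma^2 & \sigma^2 + 1 \end{pmatrix} \right)$

$E[x_0 \mid y_0] = \bar{x}_0 + \Sigma_{x_0 y_0} \Sigma_{y_0 y_0}^{-1} (y_0 - \bar{y}_0)$

$\quad = 0 + \sigma^2(\sigma^2 + 1)^{-1} y_0 = \dfrac{\sigma^2 y_0}{\sigma^2 + 1}$

– Consequently,

$u_0^* = -\tfrac{1}{2}(k_1 + 1)E[x_0 \mid y_0] = -\dfrac{(k_1 + 1)\sigma^2 y_0}{2(\sigma^2 + 1)} = k_0 y_0$

$\Rightarrow 2(\sigma^2 + 1)k_0 + \sigma^2 k_1 = -\sigma^2$

DM1's perspective

– Similarly, if $u_0^* = k_0 y_0$, then $\min_{\gamma_1} J(\gamma_0^*, \gamma_1) \Rightarrow \gamma_1^*$

$u_1^* = -\tfrac{1}{2}(k_0 + 1)E[x_0 \mid y_1] = -\dfrac{(k_0 + 1)\sigma^2 y_1}{2(\sigma^2 + 1)} = k_1 y_1$

$\Rightarrow \sigma^2 k_0 + 2(\sigma^2 + 1)k_1 = -\sigma^2$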
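The conditional-mean coefficient σ²/(σ² + 1) can be checked by simulation. The sketch below is an illustration only: the function name, sample size, and seed are arbitrary choices. It relies on the fact that for zero-mean jointly Gaussian variables E[x0 | y0] is linear in y0 with slope cov(x0, y0)/var(y0), and it estimates that slope from samples of y0 = x0 + v0.

```python
import random


def estimate_regression_slope(sigma, num_samples=200_000, seed=0):
    """Estimate cov(x0, y0) / var(y0) from samples of y0 = x0 + v0."""
    rng = random.Random(seed)
    sum_xy = 0.0
    sum_yy = 0.0
    for _ in range(num_samples):
        x0 = rng.gauss(0.0, sigma)  # x0 ~ N(0, sigma^2)
        v0 = rng.gauss(0.0, 1.0)    # v0 ~ N(0, 1)
        y0 = x0 + v0
        sum_xy += x0 * y0
        sum_yy += y0 * y0
    # Both variables are zero-mean, so raw second moments estimate the covariances.
    return sum_xy / sum_yy


sigma = 2.0
estimated = estimate_regression_slope(sigma)
predicted = sigma**2 / (sigma**2 + 1.0)
print(estimated, predicted)
```

With σ = 2 the predicted slope is 0.8, and the sample estimate should agree up to Monte Carlo error.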


Determining k0 and k1

– Two unknowns k0 and k1, and two linear conditions:

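$\begin{pmatrix} 1 & \dfrac{\sigma^2}{2(\sigma^2 + 1)} \\ \dfrac{\sigma^2}{2(\sigma^2 + 1)} & 1 \end{pmatrix} \begin{pmatrix} k_0 \\ k_1 \end{pmatrix} = \begin{pmatrix} -\dfrac{\sigma^2}{2(\sigma^2 + 1)} \\ -\dfrac{\sigma^2}{2(\sigma^2 + 1)} \end{pmatrix}$

• The determinant is not zero $\Rightarrow$ a unique solution always exists

• Once $k_0$ and $k_1$ are solved, the solution is obtained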
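As a small worked illustration, the two linear conditions 2(σ² + 1)k0 + σ²k1 = -σ² and σ²k0 + 2(σ² + 1)k1 = -σ² can be solved exactly with Cramer's rule. The function name `solve_gains` is a hypothetical helper, and exact rational arithmetic is used only to make the result easy to read.

```python
from fractions import Fraction


def solve_gains(sigma_sq):
    """Solve the two linear conditions for (k0, k1) by Cramer's rule.

        2(s+1) k0 +      s k1 = -s
             s k0 + 2(s+1) k1 = -s,    with s = sigma^2
    """
    s = Fraction(sigma_sq)
    a, b = 2 * (s + 1), s      # diagonal and off-diagonal coefficients
    det = a * a - b * b        # = (s + 2)(3s + 2), positive for s >= 0
    k0 = (-s * a - b * (-s)) / det
    k1 = (a * (-s) - b * (-s)) / det
    return k0, k1


print(solve_gains(4))  # sigma = 2, i.e. sigma^2 = 4
```

Because the system is symmetric, both gains equal -σ²/(3σ² + 2). The determinant (σ² + 2)(3σ² + 2) is strictly positive, which is why a unique solution always exists.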

Q. Is the solution team optimal? Why or why not?


Team Optimal Solution

• The above solution is also team optimal in view of the strict convexity of J

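– For a given realization of $\xi$ (including $x_0$, $v_0$ and $v_1$), $y_0$ and $y_1$ are determined, and so are $u_0^* = k_0 y_0$ and $u_1^* = k_1 y_1$

[Figure: $J$ versus $u$, with $u^*$ marked]

$J(u_0, u_1, \xi) = \tfrac{1}{2}(x_0 + u_0 + u_1)^2 + \tfrac{1}{2}u_0^2 + \tfrac{1}{2}u_1^2$

$\quad > J(u_0^*, u_1^*, \xi) + \dfrac{\partial J}{\partial u_0}(u_0 - u_0^*) + \dfrac{\partial J}{\partial u_1}(u_1 - u_1^*)$

for all $(\gamma_0, \gamma_1) \ne (\gamma_0^*, \gamma_1^*)$, with the partial derivatives evaluated at $(u_0^*, u_1^*)$

– Taking expectations, the first-order terms vanish: person-by-person optimality gives $E[\partial J/\partial u_n \mid y_n] = 0$, and $u_n - u_n^*$ depends only on $y_n$

$\Rightarrow E[J(u_0 = \gamma_0(y_0), u_1 = \gamma_1(y_1), \xi)] > E[J(u_0 = \gamma_0^*(y_0), u_1 = \gamma_1^*(y_1), \xi)]$

$\Rightarrow (\gamma_0^*, \gamma_1^*)$ is team optimal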


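Special Case When $\sigma = 2$

• In this case, the equations for $k_0$ and $k_1$ become

$\begin{pmatrix} 1 & 0.4 \\ 0.4 & 1 \end{pmatrix} \begin{pmatrix} k_0 \\ k_1 \end{pmatrix} = \begin{pmatrix} -0.4 \\ -0.4 \end{pmatrix}; \quad \begin{pmatrix} k_0 \\ k_1 \end{pmatrix} = \dfrac{1}{0.84} \begin{pmatrix} 1 & -0.4 \\ -0.4 & 1 \end{pmatrix} \begin{pmatrix} -0.4 \\ -0.4 \end{pmatrix} = \begin{pmatrix} -2/7 \\ -2/7 \end{pmatrix}$

$\Rightarrow u_0^* = -\tfrac{2}{7} y_0$, and $u_1^* = -\tfrac{2}{7} y_1$

$J_{Static} = E\{\tfrac{1}{2}(x_0 + u_0 + u_1)^2 + \tfrac{1}{2}u_0^2 + \tfrac{1}{2}u_1^2\}$

$\quad = \tfrac{1}{2} E\{[x_0 - \tfrac{2}{7}(x_0 + v_0) - \tfrac{2}{7}(x_0 + v_1)]^2 + [\tfrac{2}{7}(x_0 + v_0)]^2 + [\tfrac{2}{7}(x_0 + v_1)]^2\}$

$\quad = \tfrac{1}{2} E\{[\tfrac{3}{7}x_0 - \tfrac{2}{7}v_0 - \tfrac{2}{7}v_1]^2 + \tfrac{4}{49}(x_0 + v_0)^2 + \tfrac{4}{49}(x_0 + v_1)^2\}$

$\quad = \tfrac{1}{2}\{[\tfrac{36}{49} + \tfrac{4}{49} + \tfrac{4}{49}] + 10 \cdot \tfrac{4}{49}\}$

$\quad = 42/49 = 6/7 = 0.857$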
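The value 6/7 can be reproduced exactly. The sketch below assumes linear strategies u0 = k0·y0 and u1 = k1·y1 and uses the independence of x0, v0, v1 to write every expectation as a weighted sum of variances. The function name `static_team_cost` is a hypothetical helper, not something from the lecture.

```python
from fractions import Fraction


def static_team_cost(k0, k1, sigma_sq):
    """Exact J for linear strategies u0 = k0*y0, u1 = k1*y1.

    x0 + u0 + u1 = (1 + k0 + k1)*x0 + k0*v0 + k1*v1, and x0, v0, v1 are
    independent with variances sigma_sq, 1, 1, so each expectation is a
    weighted sum of variances.
    """
    tracking = (1 + k0 + k1) ** 2 * sigma_sq + k0**2 + k1**2
    effort = (k0**2 + k1**2) * (sigma_sq + 1)  # E[y_n^2] = sigma_sq + 1
    return Fraction(1, 2) * tracking + Fraction(1, 2) * effort


k_star = Fraction(-2, 7)
j_static = static_team_cost(k_star, k_star, Fraction(4))
j_nearby = static_team_cost(k_star + Fraction(1, 100), k_star, Fraction(4))
print(j_static, j_nearby)
```

The first value is the cost at the gains -2/7. The second shows that shifting one gain slightly raises the cost, as expected at a minimum over linear strategies.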


General Results

Proposition 1

• For an LQG static team decision problem with $J = \tfrac{1}{2} u^T Q u + u^T S \xi$, $Q > 0$, $\xi \sim N(0, \Sigma)$, and $y = H\xi$, the team optimal solution exists, is unique, is linear in measurements, and can be obtained by solving the person-by-person optimal problem
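A simulation can illustrate, though not prove, the claim that the optimum is linear in the measurements. The sketch below uses the canonical static example with σ = 2 and the gains -2/7, and compares the linear rule with a hypothetical nonlinear deviation u0 = k·y0 + ε·sin(y0) on the same random samples. The deviation, the sample size, and the seed are arbitrary assumptions made for illustration.

```python
import math
import random


def cost(x0, u0, u1):
    """Realized cost 0.5*(x0 + u0 + u1)^2 + 0.5*u0^2 + 0.5*u1^2."""
    return 0.5 * (x0 + u0 + u1) ** 2 + 0.5 * u0**2 + 0.5 * u1**2


def monte_carlo_costs(epsilon, sigma=2.0, num_samples=50_000, seed=1):
    """Average cost of the linear rule and of a perturbed rule on shared samples."""
    rng = random.Random(seed)
    k = -2.0 / 7.0  # gains of the linear solution for sigma = 2
    total_linear = 0.0
    total_perturbed = 0.0
    for _ in range(num_samples):
        x0 = rng.gauss(0.0, sigma)
        y0 = x0 + rng.gauss(0.0, 1.0)
        y1 = x0 + rng.gauss(0.0, 1.0)
        u1 = k * y1
        u0_linear = k * y0
        # Hypothetical nonlinear deviation of DM0's rule.
        u0_perturbed = k * y0 + epsilon * math.sin(y0)
        total_linear += cost(x0, u0_linear, u1)
        total_perturbed += cost(x0, u0_perturbed, u1)
    return total_linear / num_samples, total_perturbed / num_samples


j_linear, j_perturbed = monte_carlo_costs(0.5)
print(j_linear, j_perturbed)
```

The linear rule's average cost should sit near 6/7, and the perturbed rule should come out more expensive. Trying other deviations in place of `math.sin` gives a feel for why no nonlinear rule helps in the LQG static case.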


Today

• Finite Hierarchical Games

– Motivating Examples

– Solution Concept

– Examples and Results on Finite Games

– An Example of Single-Act Infinite Games

• The Inducible Region Approach

– Approach for Single-Stage Problems

– Principle of Optimality

– Multi-Stage Games

• Team Decision Theory

– A Motivating Example

– A Formal Model and Solution Methodology

– A Canonical Example

Next Time

• 11/14 No Class

• 11/21 Midterm exam