Tutorial: Introduction to Game Theory

Jesus Rios
IBM T.J. Watson Research Center, USA
[email protected]

July, 2013

© 2013 IBM Corporation


Approaches to decision analysis

 Descriptive – Understanding of how decisions are made

 Normative – Models of how decisions should be made

 Prescriptive – Helping DM make smart decisions – Use of normative theory to support DM – Elicit inputs of normative models

•  DM preferences and beliefs (psycho-analysis) •  use of experts

– Role of descriptive theories of DM behavior


Game theory arena

 Non-cooperative games – More than one intelligent player –  Individual action spaces –  Interdependent consequences

•  Players’ consequences depend on their own and other players’ actions

 Cooperative game theory – Normative bargaining models

•  Joint decision making -  Binding agreements on what to play

•  Given players’ preferences and the solution space, find a fair, jointly satisfying, Pareto-optimal agreement/solution

– Group decision making on a common action space (Social choice) •  Preference aggregation •  Voting rules

-  Arrow’s theorem – Coalition games


Cooperative game theory: Bargaining solution concepts

•  Disagreement point: BATNA, status quo •  Feasible solutions: ZOPA •  Pareto-efficiency •  Aspiration levels •  Fairness:

K-S, Nash, maxmin solutions

Example: Juan earns $10 working alone and Maria $20; working together they can earn $100. How should they distribute the profits of the cooperation?

– Disagreement point: (10, 20); feasible agreements: x + y = 100, with x ≥ 10 (Juan) and y ≥ 20 (Maria)
– Bliss (ideal) points: 80 for Juan, 90 for Maria
– Fair K-S solution: x = 45 for Juan, y = 55 for Maria
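The split above can be reproduced with the symmetric Nash bargaining solution, which here coincides with the K-S point: maximizing (x − 10)(y − 20) subject to x + y = 100 gives each player their disagreement payoff plus half the excess. A minimal sketch (function name is illustrative):

```python
def nash_bargaining_split(total, d1, d2):
    """Symmetric Nash bargaining solution for splitting `total`
    between two players with disagreement payoffs d1 and d2:
    maximizing (x - d1)*(y - d2) subject to x + y = total gives
    each player their disagreement payoff plus half the excess."""
    excess = total - d1 - d2
    return d1 + excess / 2, d2 + excess / 2

# Juan ($10 alone) and Maria ($20 alone) splitting $100:
x, y = nash_bargaining_split(100, 10, 20)
print(x, y)  # 45.0 55.0
```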


Normative models of decision making under uncertainty

 Models for a unitary DM – vN-M expected utility

•  Objective probability distributions – Subjective expected utility (SEU)

•  Subjective probability distributions

 Example: investment decision problem – One decision variable with two alternatives

•  In what to invest? -  Treasury bonds -  IBM shares

– One uncertainty with two possible states •  IBM share price at the end of the year

-  High -  Low

– One evaluation criterion for consequences •  Profit from investment

 The simplest decision problem under uncertainty


Decision Table

  DM chooses a row without knowing which column will occur (Shares pay $2,000 if High and −$1,000 if Low; Bonds pay $500 either way)

  Choice depends on the relative likelihood of High and Low –  If DM is sure that IBM share price will be High,

best choice is to buy Shares –  If DM is sure that IBM share price will be Low,

best choice is to buy Bonds → Elicit the DM’s beliefs about which column will occur

  Choice depends on the value of money –  Expected return is not a good measure of decision preferences

•  The two alternatives give the same expected return, but most DMs would not feel indifferent between them → Elicit the risk attitude of the DM


Decision tree representation

  What does the choice depend upon? –  relative likelihood of H vs L –  strength of preferences for money

What to buy:
–  IBM Shares → price: High → $2,000; Low → −$1,000 (uncertainty)
–  Bonds → $500 (certainty)


Subjective expected utility solution

  If the DM’s decision behavior is consistent with some set of “rational” desiderata (axioms), the DM decides as if he has

–  probabilities to represent his beliefs about the future price of IBM shares –  “utilities” to represent his preferences and risk attitude towards money

and chooses the alternative of maximum expected utility

  The subjective expected utility model balances in a “rational” manner –  the DM’s beliefs and risk attitudes

  Application requires –  knowing the DM’s beliefs and “utilities”

•  Different elicitation methods –  computing the expected utility of each decision strategy

•  It may require approximation in non-simple problems


  The Basic Canonical Reference Lottery ticket: p-BCRL

A p-BCRL pays $2,000 with canonical probability p and −$1,000 with probability 1 − p.

Preferences over BCRLs: p-BCRL > q-BCRL iff p > q

where p and q are canonical probabilities

A constructive definition of “utility”


Elicit prob. of the price of IBM shares

  Event H –  IBM price High

  Event L –  IBM price Low

  Pr( H ) + Pr( L ) = 1

Compare holding IBM shares (price: H → $2,000, L → −$1,000) against a p-BCRL (p → $2,000, 1 − p → −$1,000).

  Move p from 1 to 0   Which alternative is preferred by the DM?

–  IBM shares –  p-BCRL

  There exists a breakeven canonical prob. such that the DM is indifferent –  pH-BCRL ~ IBM shares

–  The judgmental probability of H is pH


Elicit the utility of $500

  U( $500 )? Compare Bonds ($500 for certain) against a p-BCRL (p → $2,000, 1 − p → −$1,000).

  Move p from 1 to 0   Which alternative is preferred by the DM?

p-BCRL vs. Bonds   There exists a breakeven canonical prob. such that the DM is indifferent

–  u-BCRL ~ Bonds

–  This scales the value of $500 between the values of $2,000 and −$1,000: U($500) = u

  What is then U($500)? –  The probability of a BCRL between $2,000 and - $1,000 that is indifferent (for the DM) to getting

$500 with certainty


Comparison of alternatives

–  IBM shares ~ pH-BCRL (price: H → $2,000, L → −$1,000)

–  Bonds ($500) ~ U($500)-BCRL

The DM prefers to invest in “IBM Shares” iff pH > U($500)


Solving the tree: backward induction

 Utility scaling 0 = U( - $1,000 ) < U( $500 ) = u < U( $2,000 ) = 1

What to buy:
–  IBM Shares → price: High (prob pH) → $2,000 (utility 1); Low (prob 1 − pH) → −$1,000 (utility 0)
–  Bonds → $500 (utility u)
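Backward induction on this tree reduces to one comparison: EU(Shares) = pH·1 + (1 − pH)·0 = pH versus EU(Bonds) = u, so the DM buys shares iff pH > u. A minimal sketch (names are illustrative):

```python
def best_buy(p_high, u_500):
    """Backward induction on the investment tree with utilities
    scaled so that U(-$1,000) = 0, U($500) = u_500, U($2,000) = 1."""
    eu_shares = p_high * 1.0 + (1.0 - p_high) * 0.0  # = p_high
    eu_bonds = u_500
    return "IBM Shares" if eu_shares > eu_bonds else "Bonds"

print(best_buy(0.7, 0.6))  # IBM Shares
print(best_buy(0.5, 0.6))  # Bonds
```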


Preferences: value vs. utility

 Value function – measures the desirability (intensity of preferences) of money gained, – but does not measure risk attitude

 Utility function – Measures risk attitude – but not the intensity of preferences over sure consequences

 Many methods to elicit a utility function – Qualitative analysis of risk attitude leads to parametric utility functions – Ask quantitative indifference questions between deals (one of which must be an uncertain lottery) to assess the parameters of the utility function – Consistency checks and sensitivity analysis


The Bayesian process of inference and evaluation with several stakeholders and decision makers (Group decision making)


Disagreements in group decision making

 Group decision making assumes – Group value/utility function – Group probabilities on the uncertainties

  If our experts disagree on the science (Expert problem) – How to draw together and learn from conflicting probabilistic judgements – Mathematical aggregation

•  Bayesian approach •  Opinion pools

-  No opinion pool satisfies a minimal consensus set of “good” probabilistic properties •  Issues

-  How do we model knowledge overlap/correlation? -  Expertise evaluation

– Behavioural aggregation – The textbook problem

•  If we do not have access to experts we need to develop meta-analytical methodologies for drawing together expert judgment studies
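The simplest mathematical aggregation rule is the linear opinion pool, a weighted average of the experts’ distributions. A minimal sketch (weights and expert probabilities are illustrative):

```python
def linear_opinion_pool(expert_probs, weights):
    """Linear opinion pool: weighted average of the experts'
    probability distributions over the same events.
    weights must sum to 1."""
    n_events = len(expert_probs[0])
    return [sum(w * p[i] for w, p in zip(weights, expert_probs))
            for i in range(n_events)]

# Two experts' probabilities for (High, Low), equal weights:
pooled = linear_opinion_pool([[0.8, 0.2], [0.6, 0.4]], [0.5, 0.5])
print(pooled)  # approximately [0.7, 0.3]
```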


Disagreements in group decision making

  If group members disagree on the values –  How to combine different individuals’ rankings of options into a group ranking? –  Arbitration/voting

•  Ordinal rankings -  Arrow impossibility results.

•  Cardinal ranking (values and not utilities -- Decisions without uncertainty) -  Interpersonal comparison of preferences’ strengths -  Supra decision maker approach (MAUT)

•  Issues: manipulation and truthful reporting of rankings

  Disagreement on the values and the science –  Combining

•  individual probabilities and utilities •  into group probabilities and utilities, respectively, •  to form the corresponding group expected utilities and choosing accordingly

–  Impossibility of being Bayesian and Paretian at the same time •  No aggregation method (of probabilities and utilities) exists that is compatible with the Pareto order

–  Behavioral approaches •  Consensus on group probabilities and utilities via sensitivity analysis. •  Agreement on what to do via negotiation


Decision analysis in the presence of intelligent others

 Matrix games against nature – One player: R (Row)

•  Two choices: U (Up) and D (Down) – Payoff matrix

            Nature
             L    R
   R    U    0    5
        D   10    3

If you were R, what would you do? D > U against L; U > D against R


Games against nature

 Do we know which column Nature will choose? – We know our best responses to Nature’s moves, but not which move Nature will choose

 Do we know the (objective) probabilities of Nature’s possible moves? – YES

With Pr(L) = p and Pr(R) = 1 − p (payoffs = vNM utils):

            Nature
             L    R      Expected payoff
   R    U    0    5      0·p + 5·(1 − p)
        D   10    3      10·p + 3·(1 − p)

U > D iff p < 1/6
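The threshold p = 1/6 follows directly from the expected payoffs; a minimal sketch:

```python
def expected_payoffs(p):
    """Expected payoffs of U and D when Nature plays L with
    probability p (payoff matrix: U -> (0, 5), D -> (10, 3))."""
    eu_up = 0 * p + 5 * (1 - p)
    eu_down = 10 * p + 3 * (1 - p)
    return eu_up, eu_down

# U beats D iff 5 - 5p > 3 + 7p, i.e. p < 1/6:
print(expected_payoffs(0.0))  # (5.0, 3.0) -> U preferred
print(expected_payoffs(0.5))  # (2.5, 6.5) -> D preferred
```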


Games against nature and the SEU criteria

 Do we know the (objective) probabilities of Nature’s possible moves? – No

•  Variety of decision criteria -  Maximin (pessimistic), maximax (optimistic), Hurwicz, minimax regret,…

            Nature
             L    R   |  Min   Max   Max regret
   R    U    0    5   |   0     5       10
        D   10    3   |   3    10        2

Maximin: D        Maximax: D        Minimax regret: D

SEU criterion: elicit the DM’s subjective probabilistic beliefs about Nature’s move (p) and compute the SEU of each alternative: D > U iff p > 1/6
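The criteria on this slide can be checked mechanically; a minimal sketch comparing maximin, maximax and minimax regret on the same payoff table:

```python
def decision_criteria(payoffs):
    """Compare the maximin, maximax and minimax-regret choices for a
    payoff table {action: [payoff per state of Nature]}."""
    actions = list(payoffs)
    n = len(next(iter(payoffs.values())))
    col_max = [max(payoffs[a][j] for a in actions) for j in range(n)]
    maximin = max(actions, key=lambda a: min(payoffs[a]))
    maximax = max(actions, key=lambda a: max(payoffs[a]))
    minimax_regret = min(
        actions,
        key=lambda a: max(col_max[j] - payoffs[a][j] for j in range(n)))
    return maximin, maximax, minimax_regret

print(decision_criteria({"U": [0, 5], "D": [10, 3]}))  # ('D', 'D', 'D')
```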


Games against other intelligent players

 Bimatrix (simultaneous) games – Second intelligent player: C (Column)

•  Two choices: L (Left) and R (Right) – Payoff bimatrix

•  we know C’s payoffs and that he will try to maximize them – As R, what would you do?

              C
             L         R
   R    U   (0, 2)    (5, 4)
        D  (10, 3)    (3, 8)

– Knowledge of C’s payoffs and rationality allows us to predict with certitude C’s move (R); knowing this, Row’s best choice is U (payoff 5)
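The prediction works in two steps: find Column’s dominant action, then best-respond to it. A minimal sketch (function name is illustrative):

```python
def dominant_action(payoffs_by_action):
    """Return the action that strictly dominates every other one,
    or None if no dominant action exists.
    payoffs_by_action: {action: [payoff per opponent action]}."""
    for a, pa in payoffs_by_action.items():
        if all(a == b or all(x > y for x, y in zip(pa, pb))
               for b, pb in payoffs_by_action.items()):
            return a
    return None

# Column's payoffs against Row's moves (U, D): R strictly dominates L.
c_payoffs = {"L": [2, 3], "R": [4, 8]}
print(dominant_action(c_payoffs))  # R
# Row best-responds to the predicted move R (second column):
r_payoffs = {"U": [0, 5], "D": [10, 3]}
print(max(r_payoffs, key=lambda a: r_payoffs[a][1]))  # U
```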


One-shot simultaneous bi-matrix games

 Two players – Trying to maximize their payoffs

 Players must choose one out of two fixed alternatives – Row player chooses a row – Column player chooses a column

 Payoffs depend on both players’ moves  Simultaneous move game

– Players must act without knowing what the other player does – Play once

 No other uncertainties involved  Players have full and common knowledge of

– choice spaces – bi-matrix payoffs

 No cooperation allowed

              C
             L                       R
   R    U   (uR(U,L), uC(U,L))     (uR(U,R), uC(U,R))
        D   (uR(D,L), uC(D,L))     (uR(D,R), uC(D,R))


Dominant alternatives and social dilemmas

 Prisoner’s dilemma –  (NC,NC) is mutually dominant

•  Players’ choices are independent of information regarding the other player’s move

–  (NC,NC) is socially dominated by (C,C)

 Airport network security

              C
             C          NC
   R    C   (5, 5)    (−5, 10)
        NC (10, −5)   (−2, −2)*


Iterative dominance

 No dominant strategy for either player; however – There are iteratively dominated strategies

•  L > R •  Now M is dominant in the restricted game

-  M > U and M > D •  Now L > C in the restricted game

-  20 > − 10 –  (M,L) solution by iterative elimination of (strictly) dominated strategies

•  Common knowledge and rationality assumptions

 Exercise – Find if there is a solution by iteratively eliminating dominated strategies

Solution: (D,C)
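The 3×3 matrices on this slide were lost with the original figure, so the following sketch runs iterated elimination of strictly dominated strategies on a hypothetical bimatrix (all payoffs invented for illustration):

```python
def iterated_dominance(row_u, col_u):
    """Iteratively eliminate strictly dominated pure strategies.
    row_u[i][j] / col_u[i][j]: payoffs when row plays i, column plays j.
    Returns the surviving row and column indices."""
    rows = list(range(len(row_u)))
    cols = list(range(len(row_u[0])))
    changed = True
    while changed:
        changed = False
        for i in list(rows):  # drop rows strictly dominated by another row
            if any(all(row_u[k][j] > row_u[i][j] for j in cols)
                   for k in rows if k != i):
                rows.remove(i)
                changed = True
        for j in list(cols):  # drop dominated columns (column's payoffs)
            if any(all(col_u[i][k] > col_u[i][j] for i in rows)
                   for k in cols if k != j):
                cols.remove(j)
                changed = True
    return rows, cols

# Hypothetical bimatrix, rows (U, M, D) x columns (L, C, R):
row_u = [[3, 5, 9], [4, 6, 0], [1, 2, 8]]
col_u = [[2, 6, 1], [5, 3, 1], [7, 4, 3]]
rows_left, cols_left = iterated_dominance(row_u, col_u)
print([["U", "M", "D"][i] for i in rows_left],
      [["L", "C", "R"][j] for j in cols_left])  # ['M'] ['L']
```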


Nash equilibrium

 Games without – Dominant solution – Solution by iterative elimination of dominated alternatives

Battle of the sexes:

             Concert     Ballet
  Ballet     (0, 0)      (2, 1)*
  Concert    (1, 2)*     (0, 0)

Matching pennies (no pure-strategy equilibrium):

             Heads       Tails
  Heads     (1, −1)     (−1, 1)
  Tails    (−1, 1)      (1, −1)


Existence of Nash equilibrium (Nash)

  Every finite game has a NE in mixed strategies –  Requires extending the original set of alternatives of each player

  Consider the matching pennies game – Mixed strategies

•  Choosing a lottery of certain probabilities over Heads and Tails –  Players’ choice sets defined by the lottery’s probability

•  Row: p in [0,1] •  Column: q in [0,1]

–  Payoff associated with a pair of strategies (p,q) is •  uR(p,q) = (p, 1−p) P (q, 1−q)^T

where P is the payoff matrix for the original game in pure strategies •  Payoffs need to be vNM utilities

–  Nash equilibrium •  Intersection of players’ best-response correspondences: (p*,q*) such that

uR(p*,q*) ≥ uR(p,q*) for all p, and uC(p*,q*) ≥ uC(p*,q) for all q
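The bilinear payoff (p, 1−p) P (q, 1−q)^T is easy to compute directly; in matching pennies, at the mixed equilibrium q* = 1/2 the row player is indifferent among all p. A minimal sketch:

```python
def mixed_payoff(P, p, q):
    """Row's expected payoff (p, 1-p) P (q, 1-q)^T in a 2x2 game."""
    probs_r = [p, 1 - p]
    probs_c = [q, 1 - q]
    return sum(probs_r[i] * P[i][j] * probs_c[j]
               for i in range(2) for j in range(2))

# Matching pennies, row player's payoff matrix P:
P = [[1, -1], [-1, 1]]
# At the mixed NE q* = 1/2, the row player is indifferent over p:
print(mixed_payoff(P, 1.0, 0.5), mixed_payoff(P, 0.0, 0.5))  # 0.0 0.0
```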


Nash equilibria concept as predictive tool

 Supporting the row player against the column player  Games with multiple NEs

             L           R
   U     (4, −100)    (10, 6)*
   D     (12, 8)*      (5, 4)

 Two NEs: (D,L) and (U,R)
 (D,L) > (U,R), since 12 > 10 and 8 > 6
 C may prefer to play R

– To protect himself against -100  Knowing this, R would prefer to play U

– ending up at the inferior NE (U,R)  How can we model C’s behavior?

– Bayesian K-level thinking


K-level thinking

 Row is not sure about Column’s move – p: Row’s beliefs about C moving L – Row’s SEU

•  U: 4 p + 10 (1-p) •  D: 12 p + 5 (1-p)

– U > D iff p < 5/13 ≈ 0.38  How to elicit p?

– Row’s analysis of Column’s decision •  Assuming C behaves as an SEU maximizer •  q: C’s beliefs about whether Row is smart enough to choose D (best NE) •  L SEU: -100 (1-q) + 8 q

R SEU: 6 (1-q) + 4 q •  L > R iff q > 53/55 ≈ 0.96 •  Since Row does not know q, his beliefs about q are represented by a CDF F •  p = Pr( q > 53/55 ) = 1 − F(53/55)

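These thresholds can be checked numerically. Purely for illustration, suppose Row’s beliefs F about q are uniform on [0, 1] (the slides leave F unspecified):

```python
def row_seu(p):
    """Row's SEU of U and D given belief p that Column plays L."""
    return 4 * p + 10 * (1 - p), 12 * p + 5 * (1 - p)

# Column plays L iff his belief q (that Row will pick D) exceeds 53/55.
# Illustrative assumption: Row's beliefs about q are uniform on [0, 1],
# so p = Pr(q > 53/55) = 1 - 53/55 = 2/55.
p = 1 - 53 / 55
eu_u, eu_d = row_seu(p)
print(p < 5 / 13, eu_u > eu_d)  # True True -> Row plays U
```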


Simultaneous vs sequential games

  First mover advantage –  Both players want to move first

•  Credible commitment/threat

  Second mover advantage –  Players want to observe their opponent’s move

before acting –  Both players try not to disclose their moves

Game of Chicken Matching pennies game


Dynamic games: backward induction

  Sequential Defend-Attack games – Two intelligent players

•  Defender and Attacker – Sequential moves

•  First Defender, afterwards Attacker knowing Defender’s decision


Standard Game Theoretic Analysis

Solution:

Expected utilities at node S

Best Attacker’s decision at node A

Assuming the Defender knows the Attacker’s analysis, the Defender’s best decision at node D


Supporting a SEU maximizer Defender

Defender’s problem Defender’s solution of maximum SEU

Modeling input: ??


Example: Banks-Anderson (2006)

 Exploring how to defend US against a possible smallpox attack – Random costs (payoffs)

– Conditional probabilities of each kind of smallpox attack given terrorists know what defence has been adopted

– Compute expected cost of each defence strategy

 Solution: defence of minimum expected cost – This is the problematic step of the analysis


Predicting the Attacker’s decision:

Defender problem Defender’s view of Attacker problem


Solving the assessment problem

Defender’s view of Attacker problem

Elicitation of

A is an EU maximizer

D’s beliefs about

MC simulation


Bayesian decision solution for the sequential Defend–Attack model


Standard Game Theory vs. Bayesian Decision Analysis

 Decision Analysis (unitary DM) – Use of decision trees – Opponents’ actions treated as random variables

•  How to elicit probs on opponents’ decisions?? •  Sensitivity analysis on (problematic) probabilities

 Game theory (multiple DMs) – Use of game trees – Opponents’ actions treated as decision variables – All players are EU maximizers

•  Do we really know the utilities our opponents try to maximize?


Bayesian decision analysis approach to games

 One-sided prescriptive support – Use a prescriptive model (SEU) for supporting one of the DMs – Treat opponent's decisions as uncertainties – Assess probs over opponent's possible actions – Compute action of maximum expected utility

 The ‘real’ Bayesian approach to games (Kadane & Larkey 1982) – Weaken common (prior) knowledge assumption

 How to assess a prob distribution over actions of intelligent others?? –  “Adversarial Risk Analysis” (DRI, DB and JR) – Development of new methods for the elicitation of probs on adversary’s actions

•  by modeling the adversary’s decision reasoning -  Descriptive decision models


Relevance to counterbioterrorism

 Biological Threat Risk Assessment for DHS (Battelle, 2006) – Based on Probability Event Trees (PET)

•  Government & Terrorists’ decisions treated as random events

 Methodological improvements study (NRC committee) – PET appropriate for risk assessment of

•  Random failures in engineering systems, but not for adversarial risk assessment

•  Terrorists are intelligent adversaries trying to achieve their own objectives

•  Their decisions (if rational) can be somehow anticipated

– PET cannot be used for a full risk management analysis •  Government is a decision maker, not a random variable


Methodological improvement recommendations

 Distinction between risks from – Nature/Accidents vs. – Actions of intelligent adversaries

 Need for models to predict Terrorists’ behavior – Red team role playing (simulations of adversaries’ thinking)

– Attack-preference models •  Examine decision from Attacker viewpoint (T as DM)

– Decision analytic approaches •  Transform the PET into a decision tree (G as DM)

-  How to elicit probs on terrorist decisions?? -  Sensitivity analysis on (problematic) probabilities -  Von Winterfeldt and O’Sullivan (2006)

– Game theoretic approaches •  Transform the PET into a game tree (G & T as DMs)


Models to predict opponents’ behavior

 Role playing (simulations of adversaries’ thinking)

 Opponent-preference models – Examine decision from the opponent viewpoint

•  Elicit opponent’s probs and utilities from our viewpoint (point estimates) – Treat the opponent as an EU maximizer ( = rationality?)

•  Solve opponent’s decision problem by finding his action of max. EU

– Assuming we know the opponent’s true probs and utilities •  We can anticipate with certitude what the opponent will do

 Probabilistic prediction models – Acknowledge our uncertainty on opponent’s thinking


Opponent-preference models

 Von Winterfeldt and O’Sullivan (2006) – Should We Protect Commercial Airplanes Against Surface-to-Air Missile Attacks by Terrorists?

Decision tree + sensitivity analysis on probs


Parnell (2007)

  Elicit Terrorist’s probs and utilities from our viewpoint –  Point estimates

  Solve Terrorist’s decision problem –  Finding Terrorist’s action that gives him max. expected utility

  Assuming we know the Terrorist’s true probs and utilities –  We can anticipate with certitude what the terrorist will do


Parnell (2007)

 Terrorist decision tree


Paté-Cornell & Guikema (2002)

Attacker Defender


Paté-Cornell & Guikema (2002)

 Assessing probabilities of terrorist’s actions – From the Defender viewpoint

•  Model the Attacker’s decision problem •  Estimate Attacker’s probs and utilities (point estimates) •  Calculate expected utilities of attacker’s actions

– Prob of attacker’s actions proportional to their perceived EU

 Feed these probs into the Defender’s decision problem – Uncertainty of Attacker’s decisions has been quantified – Choose defense of maximum expected utility

 Shortcoming –  If the (idealized) adversary is an EU maximizer, he would certainly choose the attack of max expected utility


How to assess probabilities over the actions of an intelligent adversary??

 Raiffa (2002): Asymmetric prescriptive/descriptive approach – Prescriptive advice to one party conditional on a (probabilistic) description of how others will behave – Assess probability distribution from experimental data

•  Lab role simulation experiments

 Rios Insua, Rios & Banks (2009) – Assessment based on an analysis of the adversary’s rational behavior

•  Assuming the opponent is a SEU maximizer -  Model his decision problem -  Assess his probabilities and utilities -  Find his action of maximum expected utility

– Uncertainty in the Attacker’s decision stems from •  our uncertainty about his probabilities and utilities

– Sources of information •  Available past statistical data of Attacker’s decision behavior •  Expert knowledge / Intelligence


The Defend–Attack–Defend model

  Two intelligent players –  Defender and Attacker

  Sequential moves –  First, Defender moves –  Afterwards, Attacker knowing Defender’s move –  Afterwards, Defender again responding to attack

  Infinite regress


Standard Game Theory Analysis

  Under common knowledge of utilities and probs   At node

  Expected utilities at node S

  Best Attacker’s decision at node A

  Best Defender’s decision at node

  Nash Solution:


Supporting the Defender against the Attacker

  At node

  Expected utilities at node S

  At node A

  Best Defender’s decision at node

  ??


Predicting

  Attacker’s problem as seen by the Defender


Given Assessing


Monte-Carlo approximation of

 Drawn

 Generate by

 Approximate
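The formulas on this slide were lost in extraction, but the Monte-Carlo idea can be sketched: draw the Attacker’s probabilities and utilities from the Defender’s uncertainty distributions, solve the Attacker’s max-EU problem for each draw, and use the frequencies of his optimal actions as the predictive distribution. All distributions below are illustrative assumptions:

```python
import random

def predict_attacker(n_draws=10_000, seed=0):
    """Monte-Carlo predictive distribution over the Attacker's actions.
    Each draw samples the Attacker's success probability and utilities
    (illustrative distributions), solves his max-EU problem, and
    records the optimal action."""
    rng = random.Random(seed)
    counts = {"attack": 0, "no attack": 0}
    for _ in range(n_draws):
        p_success = rng.uniform(0.1, 0.6)   # Defender's beliefs about
        u_success = rng.uniform(0.8, 1.0)   # the Attacker's probs/utils
        u_failure = rng.uniform(0.0, 0.3)   # (all assumed for this sketch)
        u_no_attack = 0.5
        eu_attack = p_success * u_success + (1 - p_success) * u_failure
        best = "attack" if eu_attack > u_no_attack else "no attack"
        counts[best] += 1
    return {a: c / n_draws for a, c in counts.items()}

print(predict_attacker())  # relative frequencies of each action
```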


The assessment of

 The Defender may want to exploit information about how the Attacker analyzes her problem

 Hierarchy of recursive analysis –  Infinite regress – Stop when there is no more information to elicit


Games with private information

 Example: – Consider the following two-person simultaneous game with asymmetric information

•  Player 1 (Row) knows whether he is stronger than Player 2 (Column), but Player 2 does not know this

•  A player’s type is used to represent information privately known by that player


Bayes Nash Equilibrium

 Assumption – common prior over the row player's type:

•  Column’s beliefs about the row player’s type are common knowledge •  Why would Column disclose this information? •  Why would Row believe that Column is disclosing her true beliefs about his type?

 Row’s strategy function


Bayes Nash Equilibrium


Is the common knowledge assumption realistic?

  – Column is better off reporting that



Modeling opponents' learning of private information

 Simultaneous decisions – Bayes Nash Equilibrium – No opportunity to learn about this information

 Sequential decisions •  Perfect Bayesian equilibrium/Sequential rationality •  Opportunity to learn from the observed decision behavior

-  Signaling games

 Models of adversaries' thinking to anticipate their decision behavior – need to model opponents' learning of private information we want to keep secret – how would this lead to a predictive probability distribution?


Sequential Defend-Attack model with Defender’s private information

 Two intelligent players – Defender and Attacker

 Sequential moves – First Defender, afterwards Attacker knowing Defender’s decision

 Defender’s decision takes into account her private information – The vulnerabilities and importance of sites she wants to protect – The position of ground soldiers in the data ferry control problem (ITA)

 Attacker observes Defender’s decision – Attacker can infer/learn about information she wants to keep secret

 How to model the Attacker’s learning?


Influence diagram vs. game tree representation


A game theoretic analysis


A game theoretic analysis


A game theoretic solution


Supporting the Defender

 We weaken the common knowledge assumption  The Defender’s decision problem

[Influence diagram: Defender’s decision D, private information V, outcome S, Attacker’s decision A (??)]


Defender’s solution


Predicting the Attacker’s move:


Attacker action of MEU


Assessing


How to stop this hierarchy of recursive analysis?

 Potentially infinite analysis of nested decision models – where to stop?

•  Accommodate as much information as we can •  Stop when the Defender has no more information •  Non-informative or reference model •  Sensitivity analysis test