crt information game theory in the analysis...

1

Cognitive Radio Technologies, 2007

1Cognitive RadioTechnologiesCCognitiveognitive RRadioadioTTechnologiesechnologies

CRTCRTCRTCRTCRTCRTCRTCRTCRTCRTCRTCRT

James Neel

[email protected]

(540) 230-6012www.crtwireless.com

DySPAN 2007

April 17, 2007

Game Theory in the Analysis and Design of Cognitive Radio Networks


2

Small business officially incorporated in Feb 2007 to commercialize cognitive radio research

Email: [email protected]

[email protected]

Website: crtwireless.com

Tel: 540-230-6012

Mailing Address:

Cognitive Radio Technologies, LLC

147 Mill Ridge Rd, Suite 119

Lynchburg, VA 24502

CRT Information


3

0 50 100 150 200 250 300

40

60

80

100

120

140

Channel

0 50 100 150 200 250 300-90

-80

-70

-60

I i(f)

(dB

m)

0 50 100 150 200 250 300-80

-75

-70

-65

-60

-55

iteration

Φ(f

) (d

Bm

)

CRT’s Strengths• Analysis of networked

cognitive radio algorithms (game theory)

• Design of low complexity, low overhead (scalable), convergent and stable cognitive radio algorithms

– Infrastructure, mesh, and ad-

hoc networks

– DFS, TPC, AIA, beamforming, routing, topology formation

Net Interference

Average LinkInterference

FrequencyAdjustments

Statistics from yet to be published DFS algorithm for ad-hoc networks P2P 802.11a links in 0.5kmx0.5km area

(discussed in tomorrow’s slides) 0 10 20 30 40 50 60 70 80 90 100-90

-85

-80

-75

-70

-65

-60

-55

-50

-45

Ste

ady-s

tate

In

terf

ere

nce

le

vels

(d

Bm

)

Number Links

Typical Worst Case Without Algorithm

Average Without AlgorithmTypical Worst Case With Algorithm

Average With Algorithm

Colission Threshold


4

Tutorial Background

• Most material from my three week defense– Very understanding committee– Dissertation online @

http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/

– Original defense slides @ http://www.mprg.org/people/gametheory/Meetings.shtml

• Other material from training short course I gave in summer 2003– http://www.mprg.org/people/gametheory/Class.shtml

• Eventually will be formalized into a book


5

Approximate Schedule

BreakBreak~20min1000-1020

Break3:15-3:30

Performance Metrics (11)3:00-3:15

Steady-state Solution Concepts (38)2:15-3:00

Cognitive Radio and Game Theory (51)1:30-2:15

Notion of Time and Imperfections in Games (34)3:30-4:10

Using Game Theory to Design Cognitive Radio Networks (28)4:10-4:45

Summary (14)4:45-5:00

MaterialTime


6

General Comments on Tutorial

• More leisurely sources of information:– D. Fudenberg, J. Tirole, Game Theory,

MIT Press 1991.– R. Myerson, Game Theory: Analysis of

Conflict, Harvard University Press, 1991.

– M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.

– J. Neel. J. Reed, A. MacKenzie, Cognitive Radio Network Performance Analysis in Cognitive Radio Technology, B. Fette, ed., Elsevier August 2006.

• “This talk is intended to provide attendees with knowledge of themost important game theoretic concepts employed in state-of-the-art dynamic spectrum access networks.”

• Lots of concepts, no proofs – cramming 2-3 semesters of game theory into 3.5 hours

• Tutorial can provide quick reference for concepts discussed at conference

Image modified from http://hacks.mit.edu/Hacks/by_year/1991/fire_hydrant/

2


7

Cognitive Radio and Game TheoryCognitive Radio,

Game Theory,

Relationship Between the

Two


8

Basic Game Concepts and Cognitive Radio Networks

• Assumptions about Cognitive Radios and Cognitive Radio Networks– Definition and concept of cognitive radio as used in this

presentation

– Design Challenges Posed by Cognitive Radio Networks– A Model of a Cognitive Radio Network

• High Level View of Game Theory– Common Components– Common Models

• Relationship between Game Theory and Cognitive Radio Networks– Modeling a Generic Cognitive Radio Network as a Game– Differences in Typical Assumptions

– Limitations of Application


9

Cognitive Radio: Basic Idea

• Cognitive radios enhance the control process by adding– Intelligent, autonomous control of the

radio– An ability to sense the environment

– Goal driven operation

– Processes for learning about environmental parameters

– Awareness of its environment• Signals

• Channels

– Awareness of capabilities of the radio

– An ability to negotiate waveforms with other radios

Board package (RF, processors)

Board APIs

OS

Software ArchServices

Waveform Software

Co

ntr

ol P

lan

e

Software radios permit network or user to control the operation of a software radio


10

OODA Loop: (continuously)• Observe outside world• Orient to infer meaning of

observations• Adjust waveform as

needed to achieve goal• Implement processes

needed to change waveform

Other processes: (as needed)

• Adjust goals (Plan)• Learn about the outside

world, needs of user,…

Urgent

Allocate Resources

Initiate Processes

Negotiate Protocols

OrientInfer from Context

Select Alternate

Goals

Plan

Normal

Immediate

LearnNew

States

Observe

Outside

World

Decide

Act

User Driven

(Buttons)Autonomous

Infer from Radio Model

StatesGenerate “Best”

Waveform

Establish Priority

Parse Stimuli

Pre-process

Cognition cycle

Conceptual Operation

Figure adapted From Mitola, “Cognitive Radio for Flexible Mobile

Multimedia Communications ”, IEEE Mobile Multimedia Conference, 1999,

pp 3-10.


11

Implementation Classes

• Weak cognitive radio

– Radio’s adaptations

determined by hard coded

algorithms and informed by observations

– Many may not consider this

to be cognitive (see discussion related to Fig 6

in 1900.1 draft)

• Strong cognitive radio

– Radio’s adaptations determined by conscious

reasoning

– Closest approximation is the ontology reasoning

cognitive radios

In general, strong cognitive radios have potential to achieve both much better and much worse behavior in a network, but may not be realizable. Cognitive Radio Technologies, 2007

12

Brilliant Algorithms and Cognitive Engines

• Most research focuses on development of algorithms for:– Observation– Decision processes

– Learning– Policy

– Context Awareness

• Some complete OODA loop algorithms

• In general different algorithms will perform better in different situations

• Cognitive engine can be viewed as a software architecture

• Provides structure for incorporating and interfacing different algorithms

• Mechanism for sharing information across algorithms

• No current implementation standard

3


13

Security

User Model

Policy Model

Evolver

|(Simulated Meters) – (Actual Meters)| Simulated

Meters

Actual Meters

Cognitive System Module

Cognitive System Controller

Chob

Uob

Reg

Knowledge BaseShort Term Memory

Long Term Memory

WSGA Parameter Set

Regulatory Information

Initial Chromosomes

WSGA Parameters

Objectives and weights

System Chromosome

max

max

UUU

CHCHCH

USD

USD

•=

•=

Decision Maker

User Domain

User preference

Local service facility

Policy Domain

User preference

Local service facility

Security

User data security

System/Network security

X86/Unix

Terminal

Radio-domain cognitionRadio

Resource

Monitor

Performance API Hardware/platform API

Radio

Radio

Performance

Monitor

WMS

CE-Radio Interface

Search

Space

ConfigChannel

Identifier

Waveform

Recognizer

Observation

Orientation

Action

Example Architecture from CWT

Decision

Learning

Models Cognitive Radio Technologies, 2007

14

DFS in 802.16h• Drafts of 802.16h

defined a generic DFS algorithm which implements

observation, decision, action, and learning

processes

• Very simple

implementation

Modified from Figure h1 IEEE 802.16h-06/010 Draft IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems Amendment for Improved Coexistence Mechanisms for License-Exempt Operation, 2006-03-29

Channel AvailabilityCheck on next channel

Available?

Choose Different Channel

Log of Channel Availability

Stop Transmission

Detection?

Select and change to new available channel in a defined time with a max. transmission time

In service monitoring of operating channel

Channel unavailable for Channel Exclusion time

Available?

Background In service monitoring (on non-

operational channels)

Service in function

No

No

No

Yes

Yes

Start Channel Exclusion timer

Yes

Learning

Observation

Decision,

Action

Decision, Action

Observation


15

Used cognitive radio definition

• A cognitive radio is a radio whose control processes permit the radio to leverage situational knowledge and intelligent processing to autonomously adapt towards some goal.

• Intelligence as defined by [American Heritage_00] as “The capacity to acquire and apply knowledge, especially toward a purposeful goal.”

– To eliminate some of the mess, I would love to just call cognitive radio, “intelligent” radio, i.e.,

– a radio with the capacity to acquire and apply knowledge

especially toward a purposeful goal


16

The Interaction Problem

• Outside world is determined by the interaction of numerous cognitive radios

• Adaptations spawn adaptations

OutsideWorld


17

Dynamic Spectrum Access Pitfall• Suppose

– g31>g21; g12>g32 ;g23>g13

• Without loss of generality– g31, g12, g23 = 1

– g21, g32, g13 = 0.5

• Infinite Loop!– 4,5,1,3,2,6,4,…

Interf.

Chan.

(1.5,1.5,1.5)(0.5,1,0)(1,0,0.5)(0,0.5,1)(0,0.5,1)(1,0,0.5)(0.5,1,0)(1.5,1.5,1.5)

(1,1,1)(1,1,0)(1,0,1)(1,0,0)(0,1,1)(0,1,0)(0,0,1)(0,0,0)

Interference Characterization

0 1 2 3 4 5 6 7

1

2

3


18

Implications

• In one out every four deployments, the example system will enter into an infinite loop

• As network scales, probability of entering an infinite loop goes to 1:

– 2 channels

– k channels

• Even for apparently simple algorithms,

ensuring convergence and stability will be nontrivial

( ) ( ) 31 3 / 4 n Cp loop ≥ −

( ) ( )111 1 2

n kCk

p loop+− +≥ − −

4


19

Locally optimal decisions that lead

to globally undesirable networks

• Scenario: Distributed SINR maximizing power control in a single cluster

• For each link, it is desirable to increase transmit power in response to increased interference

• Steady state of network is all nodes transmitting at maximum power

Power

SINR

Insufficient to consider only a

single link, must consider

interaction Cognitive Radio Technologies, 2007

20

Potential Problems with Networked Cognitive Radios

Distributed

• Infinite recursions

• Instability (chaos)

• Vicious cycles

• Adaptation collisions

• Equitable distribution of resources

• Byzantine failure

• Information distribution

Centralized

• Signaling Overhead

• Complexity

• Responsiveness

• Single point of failure


21

Cognitive Networks• Rather than having

intelligence reside in a single device, intelligence can reside in the network

• Effectively the same as a centralized approach

• Gives greater scope to the available adaptations– Topology, routing

– Conceptually permits adaptation of core and edge devices

• Can be combined with cognitive radio for mix of capabilities

• Focus of E2R program

R. Thomas et al., “Cognitive networks: adaptation and learning to achieve end-to-end performance objectives,” IEEE Communications Magazine, Dec. 2006


22

1. Steady state characterization

2. Steady state optimality

3. Convergence

4. Stability/Noise

5. Scalabilitya1

a2

NE1

NE2

NE3

a1

a2

NE1

NE2

NE3

a1

a2

NE1

NE2

NE3

a1

a2

NE1

NE2

NE3

a3

Steady State CharacterizationIs it possible to predict behavior in the system?How many different outcomes are possible?

OptimalityAre these outcomes desirable?Do these outcomes maximize the system target parameters?

ConvergenceHow do initial conditions impact the system steady state?What processes will lead to steady state conditions?How long does it take to reach the steady state?

Stability/NoiseHow do system variations/noise impact the system?Do the steady states change with small variations/noise?Is convergence affected by system variations/noise?

ScalabilityAs the number of devices increases,

How is the system impacted?Do previously optimal steady states remain optimal?

Network Analysis Objectives

(Radio 1’s available actions)

(Ra

dio

2’s

ava

ilab

le a

ctio

ns)

foc

us


23

General Model (Focus on OODA Loop Interactions)

• Cognitive Radios • Set N

• Particular radios, i, j

OutsideWorld


24


Actions• Different radios may

have different capabilities

• May be constrained by policy

• Should specify each radio’s available actions to account for variations

• Actions for radio i – Ai Act

5


25


Decision Rules

• Maps observations to actions– di:O→Ai

• Intelligence implies that these actions further the radio’s goal– ui:O→R

• Interesting problem: simultaneously modeling behavior of ontological and procedural radios

Decide

Implies very simple,

deterministic function, e.g., standard

interference function


26

Cognitive Radio Network Modeling Summary

• Decision making radios

• Actions for each radio

• Observed Outcome Space

• Goals

• Decision Rules

• Timing

• i,j ∈N, |N| = n

• A=A1×A2×⋅⋅⋅×An

• O

• uj:O→R (uj:A→R)

• dj:O→Ai (dj:A→ Ai)

• T=T1∪T2∪⋅⋅⋅∪Tn


27

Comments on Timing• When decisions are

made also matters and different radios will likely make decisions at different time

• Tj – when radio j makes its adaptations– Generally assumed to be

an infinite set

– Assumed to occur at discrete time

• Consistent with DSP implementation

• T=T1∪T2∪⋅⋅⋅∪Tn

• t ∈ T

Decision timing classes

• Synchronous– All at once

• Round-robin– One at a time in order

– Used in a lot of analysis

• Random– One at a time in no order

• Asynchronous– Random subset at a time

– Least overhead for a network


28

Basic Game Components

1. A (well-defined) set of 2 or more players

2. A set of actions for each player.3. A set of preference relationships for each

player for each possible action tuple.

• More elaborate games exist with more components but these

three must always be there.

• Some also introduce an outcome function which maps action

tuples to outcomes which are then valued by the preference

relations.

• Games with just these three components (or a variation on

the preference relationships) are said to be in Normal form

or Strategic Form


29

Set of Players (decision makers)

• N – set of n players consisting of players “named” 1, 2, 3,…,i, j,…,n

• Note the n does not mean that there are 14 players in every game.

• Other components of the game that “belong”to a particular player are normally indicated by a subscript.

• Generic players are most commonly written as i or j.

• Usage: N is the SET of players, n is the number of players.

• N \ i = 1,2,…,i-1, i+1 ,…, n All players in Nexcept for i


30

Actions

Ai – Set of available actions for player i

ai – A particular action chosen by i, ai ∈ Ai

A – Action Space, Cartesian product of all Ai

A=A1× A2×· · · × An

a – Action tuple – a point in the Action Space

A-i – Another action space A formed from

A-i =A1× A2×· · · ×Ai-1 × Ai+1 × · · · × An

a-i – A point from the space A-i

A = Ai × A-i

A1= A-2

A2 = A-1

a

a1 = a-2

a2 = a-1

Example Two Player

Action Space

A1 = A2 = [0 ∞)

A=A1× A2

b

b1 = b-2

b2 = b-1

6


31

Preference Relation expresses an individual player’s desirability of

one outcome over another (A binary relationship)

*

io o

o is preferred at least as much as o* by player i

i

Preference Relationship (prefers at least as much as)

i Strict Preference Relationship (prefers strictly more than)

~i “Indifference” Relationship (prefers equally)

*

io o *

io o

iff *

io o

but not

*~io o *

io o

iff *

io o

and

Preference Relations (1/2)


32

Preference Relations (2/2)

• Games generally assume the relationship

between actions and outcomes is invertible so preferences can be

expressed over action vectors.

• Preferences are really an ordinal

relationship

– Know that player prefers one outcome to

another, but quantifying by how much introduces difficulties


33

Preference Relation then defined as

*

ia a

Maps action space to set of real numbers.

iff ( ) ( )*

i iu a u a≥

*

ia a iff ( ) ( )*

i iu a u a>

*~ia a iff ( ) ( )*

i iu a u a=

:i

u A →R

A mathematical description of preference relationships.

Utility Functions (1/2)(Objective Fcns, Payoff Fcns)


34

Utility Functions (2/2)

By quantifying preference relationships all sorts of valuable

mathematical operations can be introduced.

Also note that the quantification operation is not unique as

long as relationships are preserved. Many map preference

relationships to [0,1].

Example

Jack prefers Apples to Oranges

JackApples Oranges ( ) ( )Jack Jacku Apples u Oranges>

a) uJack(Apples) = 1, uJack(Oranges) = 0

b) uJack(Apples) = -1, uJack(Oranges) = -7.5


35

Normal Form Games(Strategic Form Games)

, , iG N A u=

In normal form, a game consists of three primary

components

N – Set of Players

Ai – Set of Actions Available to Player i

A – Action Space

ui – Set of Individual Objective Functions

:i

u A →R

1 2 nA A A A= × × ×


36

Normal Formal Games in Matrix Representation

a1

b1

a2 b2

u1(a1, a2), u2(a1, a2) u1(a1, b2), u2(a1, b2)

u1(b1, b2), u2(b1, b2)u1(b1, a2), u2(b1, a2)

A2 = a2,b2A1 = a1,b1N = 1,2

Useful for representing 2 player games with finite action sets.

Player 1’s actions are indexed by rows.

Player 2’s actions are indexed by columns.

Each entry is the payoff vector, (u1, u2), corresponding to the action tuple

7


37

NormalUrgent

Allocate Resources

Initiate Processes


Establish Priority

PlanNormal

Negotiate

Immediate

LearnNew

States

Goal

Adapted From Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications ”, IEEE Mobile Multimedia Conference, 1999, pp 3-10.

Observe

Outside

World

Decide

Act

Autonomous


States

\

Utility function

Arguments

Utility Function

Outcome Space

Action Sets

Decision

Rules

Cognitive radios are naturally modeled as players in a game


38

Radio 2

Actions

Radio 1

Actions

Action Space

u2u1

Decision

Rules

Decision

Rules

Outcome Space

:f A O→Informed by

Communications

Theory

( )1 2ˆ ˆ,γ γ

( )1 1u γ ( )2 2ˆu γ

Interaction is naturally modeled as a game


39

NormalUrgent

Allocate Resources

Initiate Processes


Select Alternate

Goals

Establish Priority

PlanNormal

Negotiate

Immediate

LearnNew

States

Negotiate Protocols

Generate Alternate

Goals

Observe

Outside

World

Decide

Act

User Driven

(Buttons)Autonomous Determine “Best”

Plan


States

Determine “Best”

Known WaveformGenerate “Best”

Waveform

Level0 SDR

1 Goal Driven

2 Context Aware

3 Radio Aware

4 Planning

5 Negotiating

6 Learns Environment

7 Adapts Plans

8 Adapts Protocols

If distributed adaptation doesn’t occur,

it’s not a game

Parse Stimuli

Pre-process

Suitable for game theory analysis

Unconstrained action sets (radios can

make up new adaptations) or

undefined goals (utility functions)

make analysis impractical

When Game Theory can be Applied

Game Theory applies to: 1. Adaptive aware radios

2. Cognitive radios that learn about

their environment Cognitive Radio Technologies, 2007

40

Conditions for Applying Game Theory to CRNs

• Conditions for rationality

– Well defined decision making processes

– Expectation of how changes impacts performance

• Conditions for a nontrivial game

– Multiple interactive decision makers

– Nonsingleton action sets

• Conditions generally satisfied by distributed dynamic CRN schemes


41

• Inappropriate applications– Cellular Downlink power control (single cell)– Site Planning– A single cognitive network

• Appropriate applications– Multiple interactive cognitive networks– Distributed power control on non-orthogonal

waveforms• Ad-hoc power control

• Cell breathing

– Adaptive MAC– Distributed Dynamic Frequency Selection– Network formation (localized objectives)

Example Application Appropriateness


42

Some differences between game models and cognitive radio network model

Can learn O (may know or learn A)Knows AKnowledge

Not invertible (noise)

May change over time (though relatively

fixed for short periods)

Has to learn

Invertible

Constant

Knownf : A →O

Cardinal (goals)OrdinalPreferences

Cognitive RadioPlayer

• Assuming numerous iterations, normal form game only has a single stage.– Useful for compactly capturing modeling components at a single

stage

– Normal form game properties will be exploited in the analysis ofother games

– Other game models discussed throughout this presentation

8


43

Summary• Adaptations of cognitive radios interact

– Adaptations can have unexpected negative results

• Infinite recursions, vicious cycles

– Insufficient to consider behavior of only a single link in the design

• Behavior of collection of radios can be modeled as a game

• Some differences in models and assumptions but high level mapping is fairly close

• As we look at convergence, performance, collaboration, and stability, we’ll extend the model

1


1

Equilibrium Concepts

Nash Equilibria, Mixed Strategy Equilbria, Coalitional Games, the Core, Shapley Value, Nash Bargaining,

DySPAN 2007 April 17, 2007 Cognitive Radio Technologies, 2007

2

Steady-states• Recall model of <N,A,di,T> which we characterize with

the evolution function d

• Steady-state is a point where a*= d(a*) for all t ≥t *

• Obvious solution: solve for fixed points of d.

• For non-cooperative radios, if a* is a fixed point under synchronous timing, then it is under the other three timings.

• Works well for convex action spaces– Not always guaranteed to exist

– Value of fixed point theorems

• Not so well for finite spaces– Generally requires exhaustive search


3

An action vector from which no player can profitably unilaterally deviate.

( ) ( ), ,i i i i i iu a a u b a− −≥An action tuple a is a NE if for every i ∈ N

for all bi ∈Ai.

Definition

Note showing that a point is a NE says nothing about the process by which the steady state is reached. Nor anything about its uniqueness.Also note that we are implicitly assuming that only pure strategies are possible in this case.

“A steady-state where each player holds a correct expectation of the other players’ behavior and acts rationally.” - Osborne

Nash Equilibrium


4

Examples• Cognitive Radios’

Dilemma

– Two radios have two signals

to choose between n,w and N,W

– n and N do not overlap

– Higher throughput from

operating as a high power wideband signal when other

is narrowband

• Jamming Avoidance

– Two channels

– No NE

(-1,1)(1,-1)1

(1,-1)(-1,1)0

10

Jammer

Transmitter


5

How do the players find the Nash Equilibrium?• Preplay Communication

– Before the game, discuss their options. Note only NE are suitable candidates for coordination as one player could profitably violate any agreement.

• Rational Introspection– Based on what each player knows about the other players,

reason what the other players would do in its own best interest.(Best Response - tomorrow) Points where everyone would be playing “correctly” are the NE.

• Focal Point– Some distinguishing characteristic of the tuple causes it to stand

out. The NE stands out because it’s every player’s best response.

• Trial and Error– Starting on some tuple which is not a NE a player “discovers”

that deviating improves its payoff. This continues until no player can improve by deviating. Only guaranteed to work for PotentialGames (couple weeks) Cognitive Radio Technologies, 2007

6

Nash Equilibrium as a Fixed Point

• Individual Best Response

• Synchronous Best Response

• Nash Equilibrium as a fixed point

• Fixed point theorems can be used to establish existence of NE (see dissertation)

• NE can be solved by implied system of equations

( ) ( ) ( ) ˆ : , ,i i i i i i i i i i iB a b A u b a u a a a A− −= ∈ ≥ ∀ ∈

( ) ( )ˆ ˆi

i NB a B a

∈= ×

( )* *ˆa B a=

2


7

Example solution for Fixed Point by Solving for Best Response Fixed Point

• Bandwidth Allocation Game

– Five cognitive radios with each radio, i, free to determine the number of simultaneous frequency hopping channels the radio implements, ci ∈[0,∞).

– Goal

– P(c) fraction of symbols that are not interfered with (making P(c)ci the goodput for radio i)

– Ci(ci) is radio i’s cost for supporting ci simultaneous channels.

( )i k i i

k N

u c B c c Kc∈

= − −

∑

( ) ( ) ( )i i i iu c P c c C c= −


8

Best Response Analysis

( )\

ˆ / 2i i k

k N i

c B c B K c∈

= = − −

∑

( )ˆ / 6ic B K i N= − ∀ ∈

( ) ( )ˆ / 1ic B K N i N= − + ∀ ∈

( )i k i i

k N

u c B c c Kc∈

= − −

∑Goal

Best Response

Simultaneous System of Equations

Solution

Generalization


9

Significance of NE for CRNs

• Why not “if and only if”?– Consider a self-motivated game with a local maximum and a

hill-climbing algorithm.

– For many decision rules, NE do capture all fixed points (see dissertation)

• Identifies steady-states for all “intelligent” decision rules with the same goal.

• Implies a mechanism for policy design while accommodating differing implementations– Verify goals result in desired performance– Verify radios act intelligently Cognitive Radio Technologies, 2007

10

Key Theorem for NE Existence

• Kakutani’s Fixed Point Theorem

– Let f :X→ X be an upper semi-continuous convex valued correspondence from a non-empty compact convex set X ⊂ Rn,

then there is some

x*∈X such that x* ∈f(x*)

x

f(x)

x

f(x)


11

Nash Equilibrium Existence

( ) ( ) * *:U a a A f a a= ∈ ≥

a2a1

U(a1)

a

f (a)

a0 a2a1

U(a1)

a

f (a)

a0

Visualizable Definition of Quasi-Concavity

All upper-level sets are convex


12

My Favorite Mixed Strategy StoryPure Strategies in an Extended GameConsider an extensive form game where each stage is a strategic

form game and the action space remains the same at each stage.

Before play begins, each player chooses a probabilistic strategy

that assigns a probability to each action in his action set. At each

stage, the player chooses an action from his action set according

to the probabilities he assigned before play began.

Example

Consider a video football game which will be simulated. Before the

game begins two players assign probabilities of calling running

plays or passing plays for both offense or defense. In the simulation,

for each down the kind of play chosen by each team is based on the

initial probabilities assigned to kinds of plays. (Play NCAA2003)

3


13

Example Mixed Strategy Game

Jamming game

a1

b1

a2 b2

1,-1 -1, 1

1, -1-1, 1

p

(1-p)

q (1-q)Action Tuples Probabilities

(a1,a2)

(a1,b2)

(b1,a2)

(b1,b2)

pq

p(1-q)

(1-p)(1-q)

(1-p)q

Expected Utilities

( ) ( ) ( )( )

( ) ( ) ( )( )( )1 , 1 1 1

1 1 1 1 1

U p q pq p q

p q p q

= + − − +

+ − − + − −

( ) ( ) ( )( )

( ) ( ) ( )( )( )2 , 1 1 1

1 1 1 1 1

U p q pq p q

p q p q

= − + − +

+ − + − − −

∆(A1)=p,(1-p): ∀p∈[0,1]

∆(A2)=q,(1-q): ∀q∈[0,1]

Sets of probability distributions


14

Nash Equilibrium in a Mixed Strategy Game

Definition Mixed Strategy Nash Equilibrium

A mixed strategy profile α* is a NE iff ∀i∈N

( ) ( ) ( )* * *, ,i i i i i i i i

U U Aα α β α β− −≥ ∀ ∈ ∆

Best Response Correspondence

( )( )

( )arg max ,i i

i i i i iA

BR Uα

α α α− −∈∆

=

Alternate NE Definition

Consider ( ) ( )i N iB BRα α∈= ×

A mixed strategy profile α* is a NE iff

( )* *Bα α∈


15

0 1

1

Best Response

( )1 4 3u

q qp

∂= −

∂

( )2 4 3u

p pq

∂= −

∂

( )1

1 3/ 4

[0,1] 3/ 4

0 3 / 4

q

BR q q

q

>

= = <

( ) ( ) ( )( )( )2 , 1 1 1 3U p q pq p q= + − −

( ) ( ) ( )( )( )1 , 1 1 1 3U p q pq p q= + − −

( )2

1 3/ 4

[0,1] 3/ 4

0 3/ 4

p

BR p p

p

>

= = <

0.75

0.75

BR1(q)

BR2(p)

Best Response Correspondences

Note: both NE of the strategic game

are NE in its mixed extension

p(a1)

p(a2)

Nash Equilibrium


16

Interesting Properties of Mixed Strategy Games

1. Every Mixed Extension of a Strategic Game has an NE.

2. A mixed strategy αi is a best response to α-i iff every action in the support of αi is itself a best response to α-i.

3. Every action in the support of any player’s equilibrium mixed strategy yields the same payoff to that player.


17

Coalitional Game (with transferable payoff)• Concept: groups of players (called coalitions) conspire together to

implement actions which yields a result for the coalition. The value received by the coalition is then distributed among the coalition members.

• Where do radios collaborate and distribute value?– 802.16h interference groups – allocation of bandwidth– Distribution of frequencies/spreading codes among cells

– File sharing in P2P network

• Transferable utility refers to existence of some commodity for which a player’s utility increases by one unit for every unit of the commodity it receives

• Game Components, ⟨N,v⟩– N set of players

– Characteristic function– Coalition, S⊆N

• How is this value distributed?– Payoff vector, (xi)i∈S

• Payoff vector is said to be S-feasible if x(S) ≤ v(S)

: 2 \Nv ∅ →

( ) i

i S

x S x∈

=∑


18

The Core (Transferrable)

• The Core– For ⟨N,v⟩, the set of feasible payoff profiles,

(xi)i∈S for which there is no coalition S and S-feasible payoff vector (yi)i∈S for which yi > xi

for all i∈S.

• General principles of the NE also apply to the Core:– Number of solutions for a game may be

anywhere from 0 to ∞

– May be stable or unstable.

4


19

Example

• Suppose three radios, N = 1,2,3, can choose to participate in a peer-to-peer network.

• Characteristic Function– v(N) = 1

– v(1,2)= v(1,3)= v(2,3)=α∈[0,1]

– v(1)= v(2)= v(3) = 0

• Loosely, α indicates # of duplicated files

• If α>2/3, Core is empty

x = (2/5, 2/5,0)Example adaptations for

α=4/5x = (0, 3/5, 1/5) x = (2/5, 0, 2/5)

x = (1/3, 1/3, 1/3) x = (2/5, 2/5,0)


20

Comments on the Core

• Possibility of empty core implies that even when radios can freely negotiate and form arbitrary coalitions, no steady-state may exist

• Frequently very large (infinite) number of steady-states, e.g., α<2/3 – Makes it impossible to predict exact behavior

• Existence conditions for the Core, but would need to cover some linear programming concepts

• Related (but not addressed today) concepts:– Bargaining Sets, Kernel, Nucleolus


21

Strong NE

• Concept: Assume radios are able to collaborate, but utilities aren’t necessarily transferrable

• An action tuple a* such that

( ) ( )* *, ,i i S S S ii S

u a u a a S N a A−∈

≥ ∀ ⊆ ∈ ×

No Strong NE

(22, 22)(21, 9.6)w

(9.6, 21)(9.6,9.6)n

WN

Unique Strong NE


22

Motivation for Shapley value

• Core was generally either empty or very large. Want a “good” single solution.

• Terminology

– Marginal Contribution of i

– Interchangeability of i, j

– Dummy player (no synergy)

( ) ( ) ( )i S v S i v S∆ = ∪ −

( ) ( ) \i

S v i S N i∆ = ∀ ⊆

( ) ( ) \ , i j

S S S N i j∆ = ∆ ∀ ⊆


23

Axioms for Shapley Value

• Let ψ be some distribution of value for a TU coalition game

• Symmetry:– If i and j are interchangeable, then ψi(v)=ψj(v)

• Dummy:– If i is a dummy, then ψi(v) = v(i)

• Additivity:– Given ⟨N,v⟩ and ⟨N,w⟩, ψi(v + w)= ψi(v)+ ψi(w) for all

i∈N, where v+w = v(S) + w(S)

• Balanced Contributions– Given ⟨N,v⟩, ( ) ( ) ( ) ( )\ \

, \ . , \ .N j N i

i i j jN v N j v N v N i vψ ψ ψ ψ− = −


24

Shapley Value

Only assignment (value) that satisfies balanced contributions; only assignment that simultaneously satisfies symmetry, dummy, and additivity axioms

( )( )

( ) ( )( )\

! 1 !

!i

S N i

S N SS v S i v S

Nψ

⊆

− −= −∑ ∪

Marginal Value Contributed by i

Probability that i will be next one invited to the grand coalition given that coalition S is already part of the coalition assuming random ordering.

5


25

Implications of Shapley Value

• One form of a fair allocation– What you receive is based on the value you add

– Independent of order of arrival

– I liken it to setting salaries according to the Value Over Replacement Player concept in Baseball Statistics

• “Better” solution concept than the core as it’s a single payoff as opposed to a potentially infinite number

• Allows for analysis of relative “power” of different players in the system


26

Bargaining Problem• Components: ⟨F, v⟩

– Feasible payoffs F, closed convex subset of Rn

– Disagreement Point v = (v1, v2)• What 1 or 2 could achieve without bargaining

• Example:– Even if system is jammed, still gets some throughput

– Member of 802.16h interference group and try its luck

• F is said to be essential if there is some y∈F such that y1>v1 and y2>v2

• If contracts are “binding” then F could be the payoffs corresponding to entire original action space

• Otherwise, F may need to be drawn from the set of NE or from enforceable set (see punishment in repeated games)

• A particular solution is referred to by φ(F, v) ∈Rn


27

Desirable Bargaining Axioms about a Solution

• Strong Efficiency

– φ(F, v) is Pareto Efficient

• Individually Rational

– φ(F, v) ≥v

• Scale Covariance

– For any λ1, λ2, γ1, γ2∈R, λ1, λ2 >0, if

then

• Independence of Irrelevant Alternatives

– If G ⊆F and G is closed and convex

and φ(F, v)∈G, then φ(G, v)=φ(F, v)

• Symmetry

– If v1=v2 and

(x1,x2)|(x2,x1)∈F=F, then φ1(F, v)= φ2(F, v)

( ) ( ) 1 1 1 2 2 2 1 2, | ,G x x x x Fλ γ λ γ= + + ∈

( ) ( ) ( )( )1 1 1 2 2 2, , , ,G w F v F vφ λ φ γ λ φ γ= + +

( )1 1 1 2 2 2,w v vλ γ λ γ= + + Cognitive Radio Technologies, 2007

28

Nash Bargaining Solution

• NBS

• Interestingly, this is the only bargaining solution which simultaneously satisfies the

preceding 5 axioms

( ) ( ),

, arg maxi i

i Nx F x vF v x vφ

∈∈ ≥∈ Π −


29

GT framework for BW allocation

[Yaiche]: System Model

• N users

• L links

• Users compete for the total link capacity

• Each user has a minimum rate MRi and peak rate PRi

• Admissible rate vector is given by,

C : vector of link capacities

A L*N: alp = 1 if link belongs to path p, else 0.

0 | , , andNX x x MR x PR Ax C= ∈ ≥ ≤ ≤

Scenario given in H. Yaiche, R. Mazumdar, C. Rosenberg, “A game theoretic framework for bandwidth allocation and pricing in broadband networks”, IEEE/ACM Transactions on Networking, Volume: 8 , Issue: 5 , Oct. 2000, pp. 667-678. Cognitive Radio Technologies, 2007

30

Centralized Optimization Problem

( )

( ) ( )

1

: 1

1

1

N

i ix

i

i i

i i

l l

Max x MR

st x MR i N

x PR i N

Ax C l L

=

−

≥ ∈

≤ ∈

≥ ∈

∏

…

…

…

•

• Unique NBS exists

v

F

6


31

Steady-State Summary• Not every game has a steady-state• NE are analogous to fixed points of self-interested

decision processes• NE can be applied to procedural and ontological radios

– Don’t need to know decision rule, only goals, actions, and assumption that radios act in their own interest

• A game (network) may have 0, 1, or many steady-states• All finite normal form games have an NE in its mixed

extension– Over multiple iterations, implies constant adaptation

• More complex game models yield more complex steady-state concepts

• Can define steady-states concepts for coalitional games– Frequently so broad that specific solutions are used

1


1

Evaluating Equilibria

Objective Function

Maximization, Pareto Efficiency, Notions

of Fairness


2

Optimality

• In general we assume the existence of some design objective function J:A→R

• The desirableness of a network state, a, is the value of J(a).

• In general maximizers of J are unrelated to fixed points of d.

Figure from Fig 2.6 in I. Akbar, “Statistical Analysis of Wireless Systems Using Markov Models,” PhD Dissertation, Virginia Tech, January 2007


3

Example Functions• Utilitarian

– Sum of all players’ utilities

– Product of all players’utilities

• Practical– Total system throughput

– Average SINR

– Maximum End-to-End Latency

– Minimal sum system interference

• Objective can be unrelated to utilities

Utilitarian Maximizers

System Throughput Maximizers

Interference Minimization


4

Price of Anarchy (Factor)

• Centralized solution always at least as good as distributed solution– Like ASIC is always at least as good as

DSP

• Ignores costs of implementing algorithms– Sometimes centralized is infeasible (e.g.,

routing the Internet)

– Distributed can sometimes (but not generally) be more costly than centralized

Performance of Centralized Algorithm Solution

Performance of Distributed Algorithm Solution

≥≥≥≥ 1

9.6

7


5

Implications• Best of All Possible Worlds

– Low complexity distributed algorithms with low anarchy factors

• Reality implies mix of methods– Hodgepodge of mixed solutions

• Policy – bounds the price of anarchy

• Utility adjustments – align distributed solution with centralized solution

• Market methods – sometimes distributed, sometimes centralized• Punishment – sometimes centralized, sometimes distributed,

sometimes both• Radio environment maps –”centralized” information for distributed

decision processes

– Fully distributed• Potential game design – really, the panglossian solution, but only

applies to particular problems


6

Pareto efficiency (optimality)• Formal definition: An action vector a* is

Pareto efficient if there exists no other action vector a, such that every radio’s valuation of the network is at least as good and at least one radio assigns a higher valuation

• Informal definition: An action tuple is Pareto

efficient if some radios must be hurt in order to improve the payoff of other radios.

• Important note– Like design objective function, unrelated to fixed

points (NE)

– Generally less specific than evaluating design objective function

2


7

Example Games

a1

b1

a2 b2

1,1 -5,5

-1,-15,-5

a1

b1

a2 b2

1,1 -5,5

3, 35,-5

Legend Pareto Efficient

NE NE + PE


8

Notions of Fairness

• What is “Fair”?

– Abstractly “fair” means different things to different analysts

– In every day life, really just short hand for “I deserve more than I got”

• Nonetheless is used to evaluate how equitably radio resources are distributed


9

Gini Coefficient• Basic concept:

– Order players by utility. – Form CDF for sorted utility

distribution (Lorenz curve)

– Integrate (sum) the difference between perfect equality (of outcome) and CDF

– Divide result by sum of all players’utilities

• Formula

• Used in a lot of macro-economic comparisons of income distributions

• Relatively simple, independent of scale, independent of size of N, anonymity

• Radically different outcomes can give the same result

00.37w

0.370n

WNG

Player #

Ag

gre

ga

te U

tilit

y

Lorenz curvePerfe

ct E

quality

( )( ) ( )

( )

11

1 2

i

i N

i

i N

n i u a

G a nn u a

∈

∈

+ −

= + −

∑

∑


10

Other Metrics of Fairness

• Theill Index

• Atkinson Index, ε is income inequality aversion

( ) ( )( )

[ )1 1

11 11 , 0,1

i

i N

T a u au n

ε

εε

−

−

∈

= − ∈

∑

( )( ) ( )1

lni i

i N

u a u aT a

n u u∈

=

∑ ( ) ( )

1i

i N

u a u an ∈

= ∑

( ) ( )1/

1 11 , 1

n

i

i N

T a u au n

ε∈

= − =

∑


11

Summary of EquilibriaEvaluation

• Lots of different ways which a point can be evaluated

• Many are contradictory

• Loosely, any point could be said to be optimal given the right objective function

• Insufficient to say that a point is optimal

• Must describe the metric in use

• Suggestion: use whatever metric makes sense to you as a network designer

1


1

The Notion of Time and Imperfections in Games and NetworksExtensive Form Games, Repeated Games, Convergence Concepts in Normal Form Games, Trembling Hand Games, Noisy Observations


2/67

Model Timing Review• When decisions are

made also matters and different radios will likely make decisions at different time

• Tj – when radio j makes its adaptations– Generally assumed to be

an infinite set

– Assumed to occur at discrete time

• Consistent with DSP implementation

• T=T1∪T2∪⋅⋅⋅∪Tn

• t ∈ T

Decision timing classes

• Synchronous– All at once

• Round-robin– One at a time in order

– Used in a lot of analysis

• Random– One at a time in no order

• Asynchronous– Random subset at a time

– Least overhead for a network


3/67

Motivating Extensive Form Games (1/2)

Models attempt to capture as much information as possible in as

simple a format as possible. Often this requires abstracting away

certain system details that are deemed unimportant to the analysis.

To this point, we have only considered systems where all

decision-making occurs simultaneously and where action sets are

defined independently of the decision making process.

Although many systems can be abstracted to fit these conventions,

it is sometimes clumsy and if the analyst is not careful, may

eliminate information that would alter the expected outcome.


4/67

Extensive Form Game Components

1. A set of players.

2. The actions available to each player at each decision moment (state).

3. A way of deciding who is the current decision maker.

4. Outcomes on the sequence of actions.

5. Preferences over all outcomes.

Components

A1

2

1

2

1’

2’

B

B

1,-1

-1,1

-1,1

1,-1

1

2

1,1’ 1,2’ 2,1’ 2,2’

1,-1

1,-11,-1

1,-1 -1,1 -1,1

-1,1 -1,1

Strategic Form Equivalence

Strategies for A

1,2

Strategies for B

(1,1’),(1,2’),

(2,1’),(2,2’)

Game Tree Representation

A Silly Jammer Avoidance Game


5/67

Backwards Induction• Concept

– Reason backwards based on what each player would rationally play

– Predicated on Sequential Rationality– Sequential Rationality – if starting at any decision point for a

player in the game, his strategy from that point on represents abest response to the strategies of the other players

– Subgame Perfect Nash Equilibrium is a key concept (not formally discussed today).

C

S S

1 1

1,0

C2

0,2

C

S

3,1

S

C2

2,4

S

1

5,3

C C

S

2

4,6

7,54,65,32,43,10,2

Alternating Packet Forwarding Game


6/67

Comments on Extensive Form Games

• Actions will generally not be directly observable

• However, likely that cognitive radios will build up histories

• Ability to apply backwards induction is predicated on knowing other radio’s objectives, actions, observations and what they know they know…– Likely not practical

• Really the best choice for modeling notion of time when actions available to radios change with history

2


7/67

Repeated GamesStage 1

Stage 2

Stage k

Stage 1

Stage 2

Stage k

• Same game is repeated

– Indefinitely

– Finitely

• Players consider discounted payoffs

across multiple stages

– Stage k

– Expected value over all future stages

( ) ( )k k k

i iu a u aδ=

( )( ) ( )0

k k k

i i

k

u a u aδ∞

=

=∑


8/67

Lesser Rationality: Myopic Processes

• Players have no knowledge about utility functions, or expectations about future play,

typically can observe or infer current actions

• Best response dynamic – maximize individual

performance presuming other players’ actions are fixed

• Better response dynamic – improve individual performance presuming other players’ actions are fixed

• Interesting convergence results can be established


9/67

Paths and Convergence• Path [Monderer_96]

– A path in Γ is a sequence γ = (a0, a1,…) such that for every k ≥ 1 there exists a unique player such that the strategy combinations (ak-1, ak) differs in exactly one coordinate.

– Equivalently, a path is a sequence of unilateral deviations. When discussing paths, we make use of the following conventions.

– Each element of γ is called a step.

– a0 is referred to as the initial or starting point of γ.

– Assuming γ is finite with m steps, am is called the terminal point or ending point of γ and say that γ has length m.

• Cycle [Voorneveld_96]– A finite path γ = (a0, a1,…,ak) where ak = a0


10/67

Improvement Paths

• Improvement Path– A path γ = (a0, a1,…) where for all k≥1,

ui(ak)>ui(a

k-1) where i is the unique deviator at k

• Improvement Cycle– An improvement path that is also a cycle

– See the DFS example

γ2

γ1

γ3

γ4γ5

γ6γ2

γ1

γ3

γ4γ5

γ6


11/67

Convergence Properties

• Finite Improvement Property (FIP)

– All improvement paths in a game are finite

• Weak Finite Improvement Property (weak

FIP)

– From every action tuple, there exists an

improvement path that terminates in an NE.

• FIP implies weak FIP

• FIP implies lack of improvement cycles

• Weak FIP implies existence of an NE Cognitive Radio Technologies, 2007

12/67

Examples

a

b

A B

1,-1

-1,1

0,2

2,2

Game with FIP

a

b

A B

1,-1 -1,1

1,-1-1,1

C

0,2

1,2

c 2,12,0 2,2

Weak FIP but not FIP

3


13/67

Implications of FIP and weak FIP• Assumes radios are incapable of reasoning ahead and

must react to internal states and current observations

• Unless the game model of a CRN has weak FIP, then no autonomously rational decision rule can be guaranteed to converge from all initial states under random and round-robin timing (Theorem 4.10 in dissertation).

• If the game model of a CRN has FIP, then ALL autonomously rational decision rules are guaranteed to converge from all initial states under random and round-robin timing.– And asynchronous timings, but not immediate from definition

• More insights possible by considering more refined classes of decision rules and timings


14/67

Decision Rules


15/67

Markov Chains• Describes adaptations

as probabilistic transitions between network states.– d is nondeterministic

• Sources of randomness:– Nondeterministic timing

– Noise

• Frequently depicted as a weighted digraph or as a transition matrix


16/67

General Insights ([Stewart_94])• Probability of occupying a state

after two iterations.– Form PP.

– Now entry pmn in the mth row and nth

column of PP represents the probability that system is in state an

two iterations after being in state am.

• Consider Pk.– Then entry pmn in the mth row and nth

column of represents the probability that system is in state an two iterations after being in state am.


17/67

Steady-states of Markov chains

• May be inaccurate to consider a Markov chain to have a fixed point– Actually ok for absorbing Markov chains

• Stationary Distribution– A probability distribution such that ππππ* such that

ππππ*T P =ππππ*T is said to be a stationary distribution for the Markov chain defined by P.

• Limiting distribution– Given initial distribution ππππ0 and transition

matrix P, the limiting distribution is the distribution that results from evaluating

0lim T k

kπ

→∞P


18/67

Ergodic Markov Chain

• [Stewart_94] states that a Markov chain is ergodic if it is a Markov chain if it is a) irreducible, b) positive recurrent, and c)

aperiodic.

• Easier to identify rule:

– For some k Pk has only nonzero entries

• (Convergence, steady-state) If ergodic, then chain has a unique limiting stationary distribution.

4


19/67

Absorbing Markov Chains

• Absorbing state

– Given a Markov chain with transition matrix P, a state am is said to be an absorbing state if pmm=1.

• Absorbing Markov Chain

– A Markov chain is said to be an absorbing Markov chain if

• it has at least one absorbing state and

• from every state in the Markov chain there exists a sequence

of state transitions with nonzero probability that leads to an absorbing state. These nonabsorbing states are called

transient states.

a0 a1 a2 a3 a4 a5


20/67

Absorbing Markov Chain Insights

• Canonical Form

• Fundamental Matrix

• Expected number of times that the system will pass through state am given that the system starts in state ak.– nkm

• (Convergence Rate) Expected number of iterations before the system ends in an absorbing state starting in state am is given by tm where 1 is a ones vector– t=N1

• (Final distribution) Probability of ending up in absorbing state am given that the system started in ak is bkm where

' ab

=

Q RP

0 I

( )1−

= −N I Q

=B NR


21/67

Absorbing Markov Chains and Improvement Paths

• Sources of randomness– Timing (Random, Asynchronous)

– Decision rule (random decision rule)– Corrupted observations

• An NE is an absorbing state for autonomously rational decision rules.

• Weak FIP implies that the game is an absorbing Markov chain as long as the NE terminating improvement path always has a nonzero probability of being implemented.

• This then allows us to characterize – convergence rate, – probability of ending up in a particular NE,

– expected number of times a particular transient state will be visited


22/67

Connecting Markov models, improvement paths, and decision rules

• Suppose we need the path γ = (a0, a1,…am) for convergence by weak FIP.

• Must get right sequence of players and right sequence of adaptations.

• Friedman Random Better Response– Random or Asynchronous

• Every sequence of players have a chance to occur• Random decision rule means that all improvements have a chance to

be chosen

– Synchronous not guaranteed

• Alternate random better response (chance of choosing same action)– Because of chance to choose same action, every sequence of

players can result from every decision timing.

– Because of random choice, every improvement path has a chance of occurring


23/67

Convergence Results (Finite Games)

• If a decision rule converges under round-robin, random, or synchronous timing, then it also converges under asynchronous timing.

• Random better responses converge for the most decision timings and the most surveyed game conditions.– Implies that non-deterministic procedural cognitive radio

implementations are a good approach if you don’t know much about the network.


24/67

Impact of Noise

• Noise impacts the mapping from actions to outcomes, f :A→O

• Same action tuple can lead to different outcomes

• Most noise encountered in wireless systems is theoretically unbounded.

• Implies that every outcome has a nonzero chance of being observed for a particular action tuple.

• Some outcomes are more likely to be observed than others (and some outcomes may have a very small chance of occurring)

5


25/67

DFS Example• Consider a radio observing the spectral

energy across the bands defined by the set C where each radio k is choosing its band of operation fk.

• Noiseless observation of channel ck

• Noisy observation

• If radio is attempting to minimize inbandinterference, then noise can lead a radio to believe that a band has lower or higher interference than it does

( ) ( ),i k ki k k k

k N

o c g p c fθ∈

=∑( ) ( ) ( ), ,i k ki k k k i k

k N

o c g p c f n c tθ∈

= +∑


26/67

Trembling Hand (“Noise” in Games)

• Assumes players have a nonzero chance of making an error implementing their action.

– Who has not accidentally handed over the wrong amount of cash at a restaurant?

– Who has not accidentally written a “tpyo”?

• Related to errors in observation as erroneous observations cause errors in implementation

(from an outside observer’s perspective).


27/67

Noisy decision rules

• Noisy utility ( ) ( ) ( ), ,i i iu a t u a n a t= +

Trembling Hand

ObservationErrors


28/67

Implications of noise• For random timing, [Friedman] shows game with noisy

random better response is an ergodic Markov chain.

• Likewise other observation based noisy decision rules are ergodic Markov chains– Unbounded noise implies chance of adapting (or not adapting) to

any action

– If coupled with random, synchronous, or asynchronous timings, then CRNs with corrupted observation can be modeled as ergodic Makov chains.

– Not so for round-robin (violates aperiodicity)

• Somewhat disappointing– No real steady-state (though unique limiting stationary

distribution)


29/67

DFS Example with three access points

• 3 access nodes, 3 channels, attempting to operate in band with least spectral energy.

• Constant power• Link gain matrix

• Noiseless observations

• Random timing

12

3


30/67

Trembling Hand

• Transition Matrix, p=0.1

• Limiting distribution

6


31/67

Noisy Best Response

• Transition Matrix, N(0,1) Gaussian Noise

• Limiting stationary distributions


32/67

Comment on Noise and Observations

• Cardinality of goals makes a difference for cognitive radios– Probability of making an error is a function of the difference

in utilities– With ordinal preferences, utility functions are just useful

fictions• Might as well assume a trembling hand

• Unboundedness of noise implies that no state can be absorbing for most decision rules

• NE retains significant predictive power– While CRN is an ergodic Markov chain, NE (and the

adjacent states) remain most likely states to visit– Stronger prediction with less noise

– Also stronger when network has a Lyapunov function

– Exception - elusive equilibria ([Hicks_04])


33/67

Stability

x

y

x

y

Stable, but not attractive

x

y

Attractive, but not stable Cognitive Radio Technologies, 2007

34/67

Lyapunov’s Direct Method

Left unanswered: where does L come from?

Can it be inferred from radio goals?


35/67

Summary• Given a set of goals, an NE is a fixed point for all radios with those

goals for all autonomously rational decision processes• Traditional engineering analysis techniques can be applied in a

game theoretic setting– Markov chains to improvement paths

• Network must have weak FIP for autonomously rational radios to converge– Weak FIP implies existence of absorbing Markov chain for many

decision rules/timings

• In practical system, network has a theoretically nonzero chance of visiting every possible state (ergodicity), but does have uniquelimiting stationary distribution– Specific distribution function of decision rules, goals

– Will be important to show Lyapunov stability

• Shortly, we’ll cover potential games and supermodular games which can be shown to have FIP or weak FIP. Further potential games have a Lyapunov function!

1


1

Designing Cognitive Radio Networks to Yield Desired Behavior

Policy, Cost Functions,

Global Altruism, SupermodularGames, Potential

Games


2

Policy• Concept: Constrain the

available actions so the worst cases of distributed decision making can be avoided

• Not a new concept –– Policy has been used since

there’s been an FCC

• What’s new is assuming decision makers are the radios instead of the people controlling the radios


3

Policy applied to radios instead of humans

• Need a language to convey policy– Learn what it is– Expand upon policy later

• How do radios interpret policy– Policy engine?

• Need an enforcement mechanism– Might need to tie in to humans

• Need a source for policy– Who sets it?– Who resolves disputes?

• Logical extreme can be quite complex, but logical extreme may not be necessary.

Policies

frequency

mask


4

• Detection

– Digital TV: -116 dBm over a 6 MHz channel

– Analog TV: -94 dBm at the peak of the NTSC (National Television System Committee) picture carrier

– Wireless microphone: -107 dBm in a 200 kHz bandwidth.

• Transmitted Signal

– 4 W Effective Isotropic Radiated Power (EIRP)

– Specific spectral masks

– Channel vacation timesC. Cordeiro, L. Challapali, D. Birru, S. Shankar, “IEEE 802.22: The First Worldwide Wireless Standard based on Cognitive Radios,”

IEEE DySPAN2005, Nov 8-11, 2005 Baltimore, MD.

802.22 Example Policies


5

Cost Adjustments

• Concept: Centralized unit dynamically adjusts costs in radios’ objective functions to ensure

radios operate on desired point

• Example: Add -12 to use of wideband waveform

( ) ( ) ( )i i iu a u a c a= +


6

Comments on Cost Adjustments

• Permits more flexibility than policy

– If a radio really needs to deviate, then it can

• Easy to turn off and on as a policy tool

– Example: protected user shows up in a channel, cost to use that channel goes up

– Example: prioritized user requests channel, other users’ cost to use prioritized user’s channel goes up (down if when done)

2


7

Global Altruism: distributed, but more costly• Concept: All radios distributed all relevant information to all other

radios and then each independently computes jointly optimal solution– Proposed for spreading code allocation in Popescu04, Sung03– Used in xG Program (Comments of G. Denker, SDR Forum Panel

Session on “A Policy Engine Framework”) Overhead ranges from 5%-27%

• C = cost of computation

• I = cost of information transfer from node to node

• n = number of nodes• Distributed

– nC + n(n-1)I/2

• Centralized (election)– C + 2(n-1)I

• Price of anarchy = 1

• May differ if I is asymmetric


8

Improving Global Altruism• Global altruism is clearly inferior to a centralized solution

for a single problem. • However, suppose radios reported information to and

used information from a common database– n(n-1)I/2 => 2nI

• And suppose different radios are concerned with different problems with costs C1,…,Cn

• Centralized– Resources = 2(n-1)I + sum(C1,…,Cn)– Time = 2(n-1)I + sum(C1,…,Cn)

• Distributed– Resources = 2nI + sum(C1,…,Cn)– Time = 2I + max (C1,…,Cn)


9

Example Application: • Overlay network of secondary

users (SU) free to adapt power, transmit time, and channel

• Without REM:– Decisions solely based on link

SINR

• With REM– Radios effectively know everything

Upshot: A little gain for the secondary users; big gain for primary users

From: Y. Zhao, J. Gaeddert, K. Bae, J. Reed, “Radio Environment Map Enabled Situation-Aware Cognitive Radio Learning Algorithms,” SDR Forum Technical Conference 2006.


10

Comments on Radio Environment Map

• Local altruism also possible

– Less information transfer

• Like policy, effectively needs a common

language

• Nominally could be centralized or distributed

database

• Read more:

– Y. Zhao, B. Le, J. Reed, “Infrastructure Support – The Radio Environment MAP,” in Cognitive Radio Technology, B. Fette, ed., Elsevier 2006.


11

Supermodular Games

2

0, , ,i

i j

ui j N a A

a a

∂≥ ∀ ∈ ∈

∂ ∂

• A game such that– Action space is a lattice– Utility functions are supermodular

• Identification• NE Properties

– NE Existence: All supermodular games have a NE– NE Identification: NE form a lattice

• Convergence– Has weak FIP– Best response algorithms converge

• Stability– Unique NE is an attractive fixed point for best

response Cognitive Radio Technologies, 2007

12

Ad-hoc power control

• Network description

• Each radio attempts to achieve a target SINR at the receiving

end of its link.

• System objective is

ensuring every radio achieves its target SINR

Gateway

Cluster

Head

Cluster

Head

Gateway

Cluster

Head

Cluster

Head

( ) ( )2

ˆk k

k N

J γ γ∈

= − −∑p

3


13

Generalized repeated gamestage game

• Players – N

• Actions –

• Utility function

• Action space formulation

( ) ( )2

ˆj j ju o γ γ= − −

max0,j jP p =

( ) ( )2

10 10

\

ˆ 10log 10 logj j jj j kj k j

k N j

u p g p g p Nγ∈

= − − + +

∑

gjk fraction of power transmitted by j that can’t be removed by receiving end of

radio j’s link

Nj noise power at receiving end of radio j’s link


14

Model identification & analysis

• Supermodular game– Action space is a lattice

– Implications• NE exists

• Best response converges

• Stable if discrete action space

• Best response is also standard– Unique NE

– Solvable (see prelim report)

– Stable (pseudo-contraction) for infinite action spaces

( )

( )

2

\

2000

ln 20

j kj

j k

j kj k j

k N j

u p g

p pp g p N

∈

∂= >

∂ ∂ +

∑

( )ˆ

ˆ jk

j j

j

B pγ

γ=p


15

Validation

Noiseless Best Response Noisy Best Response

Implies all radios achieved target SINR


16

Comments on Designing Networks

with Supermodular Games

• Scales well

– Sum of supermodular functions is a supermodularfunction

– Add additional action types, e.g., power, frequency, routing,..., as long as action space remains a lattice and utilities are supermodular

• Says nothing about desirability or stability of

equilibria

• Convergence is sensitive to the specific decision

rule and the ability of the radios to implement it


17

Potential Games

time

Φ(ω

)

• Existence of a function (called the potential function, V), that reflects the change in utility seen by a unilaterally deviating player.

• Cognitive radio interpretation:– Every time a cognitive radio

unilaterally adapts in a way that furthers its own goal, some real-valued function increases.


18

Exact Potential Game Forms

• Many exact potential games can be recognized by the form of the utility function

4


19

Implications of Monotonicity• Monotonicity implies

– Existence of steady-states (maximizers of V)

– Convergence to maximizers of V for numerous combinations of decision timings decision rules – all self-interested adaptations

• Does not mean that that we get good performance– Only if V is a function we want to maximize


20

Other Potential Game Properties

• All finite potential games have FIP• All finite games with FIP are potential games

– Very important for ensuring convergence of distributed cognitive radio networks

• -V is a is a Lyapunov function for isolated maximizers

• Stable NE solvable by maximizers of V• Linear combination of exact potential games is

an exact potential game• Maximizer of potential game need not maximize

your objective function– Cognitive Radios’ Dilemma is a potential game


21

Interference Reducing Networks

• Concept– Cognitive radio network is a potential game with a potential

function that is negation of observed network interference

• Definition– A network of cognitive radios where each adaptation

decreases the sum of each radio’s observed interference is an IRN

• Implementation:– Design DFS algorithms such that network is a potential game

with Φ ∝ -V

( ) ( )i

i N

Iω ω∈

Φ =∑

time

Φ(ω

)


22

Bilateral Symmetric Interference

• Two cognitive radios, j,k∈N, exhibit bilateral symmetric interference if

Source: http://radio.weblogs.com/0120124/Graphics/geese2.jpg

What’s good for the goose, isgood for the gander…

( ) ( ), ,jk j j k kj k k jg p g pρ ω ω ρ ω ω= ,j j k kω ω∀ ∈ Ω ∀ ∈ Ω

• ωk – waveform of radio k

• pk - the transmission power of radio k’s waveform

• gkj - link gain from the transmission source of radio k’ssignal to the point where radio jmeasures its interference,

• - the fraction of radio k’s signal that radio j cannot exclude via processing (perhaps via filtering, despreading, or MUD techniques).

( ),k jρ ω ω


23

Bilateral Symmetric Interference Implies an Interference Reducing Network

• Cognitive Radio Goal:

• By bilateral symmetric interference

• Rewrite goal

• Therefore a BSI game (Si =0)

• Interference Function

• Therefore profitable unilateral deviations increase V and decrease Φ(ω) – an IRN

( ) ( ) ( )∑∈

−=−=iNj

jijjiii pgIu\

,ωωρωω

( ) ( )\

,i ik i k

k N i

u bω ω ω∈

= −∑

( ) ( )1

1

,i

ki k k i

i N k

V g pω ρ ω ω−

∈ =

= −∑∑

( ) ( )2Vω ωΦ = −

( ) ( ) ( ) ( )kiikikkikiiikikkki bbpgpg ωωωωωωρωωρ ,,,, ===


24

An IRN 802.11 DFS Algorithm• Suppose each access node

measures the received signal power and frequency of the RTS/CTS (or BSSID) messages sent by observable access nodes in the network.

• Assumed out-of-channel interference is negligible and RTS/CTS transmitted at same power

( ) ( )jkkkjkjjjk ffpgffpg ,, σσ =

( ) ( ) ( )\

,i i ki k i k

k N i

u f I f g p f fσ∈

= − = −∑

( )1

,0

i k

i k

i k

f ff f

f fσ

==

≠

Listen on

Channel LC

RTS/CTS

energy detected? Measure power

of access node

in message, p

Note address of access

node, a

Update

interference table

Time for decision?Apply decision criteria for new

operating

channel, OCUse 802.11h

to signal change

in OC to clients

yn

Pick channel tolisten on, LC

y

n

Start

5


25

Statistics• 30 cognitive access nodes in European UNII

bands

• Choose channel with lowest interference• Random timing

• n=3• Random initial channels

• Randomly distributed positions over 1 km2

0 10 20 30 40 50 60 70 80 90 1000

10

20

30

40

50

60

70

Number of Access Nodes

Reduction in N

et

Inte

rfere

nce (

dB

)

Round-robin Asynchronous Legacy Devices

Reduction in Net Interference

Reduction in Net Interference


26

Ad-hoc Network

• Possible to adjust previous algorithm to not favor access nodes over clients

• Suitable for ad-hoc networks• CRT has IRN based

distributed low-complexity algorithms for – Spreading codes

– Power variations

– Subcarrier allocation– Bandwidth variations

– Activity levels weighted by interference

– Noninteractive terms –modulation, FEC, interleaving

– Beamforming

– And combinations of the above


27

Comments on Potential Games• All networks for which there is not a better response interaction loop

is a potential game• Before implementing fully distributed GA, SA, or most CBR decision

rules, important to show that goals and action satisfy potential game model

• Sum of exact potential games is itself an exact potential game– Permits (with a little work) scaling up of algorithms that adjust single

parameters to multiple parameters

• Possible to combine with other techniques– Policy restricts action space, but subset of action space remains a

potential game (see J. Neel, J. Reed, “Performance of Distributed Dynamic Frequency Selection Schemes for Interference Reducing Networks,” Milcom 2006)

– As a self-interested additive cost function is also a potential game, easyto combine with additive cost approaches (see J. Neel, J. Reed, R. Gilles, “The Role of Game Theory in the Analysis of Software Radio Networks,” SDR Forum02)

• Read more on potential games:– Chapter 5 in Dissertation of J. Neel, Available at

http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/ Cognitive Radio Technologies, 2007

28

Token Economies

• Pairs of cognitive radios exchange tokens for services rendered or bandwidth rented

• Example: – Primary users leasing spectrum to secondary users

• D. Grandblaise, K. Moessner, G. Vivier and R. Tafazolli,“Credit Token based Rental Protocol for Dynamic Channel Allocation,” CrownCom06.

– Node participation in peer-to-peer networks• T. Moreton, “Trading in Trust, Tokens, and Stamps,”

Workshop on the Economics of Peer-to-Peer Systems, Berkeley, CA June 2003.

• Why it works – it’s a potential game when there’s no externality to the trade– Ordinal potential function given by sum of utilities


29

Comments on Network Options

• Approaches can be combined– Policy + potential

– Punishment + cost adjustment

– Cost adjustment + token economies

• Mix of centralized and distributed is likely best approach

• Potential game approach has lowest complexity, but cannot be extended to every problem

• Token economies requires strong property rights to ensure proper behavior

• Punishment can also be implemented at a choke point in the network

1


1

Summary and Conclusions

• Summary of Critical Concepts

• The Future Role of Game Theory in the Design and Regulation of Dynamic Spectrum Access Networks

• Topics for Further Study and Research


2

What does game theory bring to the design of cognitive radio networks? (1/2)

• A natural “language” for modeling cognitive radio networks

• Permits analysis of ontological radios– Only know goals and that radios will adapt towards its goal

• Simplifies analysis of random procedural radios

• Permits simultaneous analysis of multiple decision rules – only need goal

• Provides condition to be assured of possibility of convergence for all autonomously myopic cognitive radios (weak FIP)


3

What does game theory bring to the design of cognitive radio networks? (2/2)

• Provides condition to be assured of convergence for all autonomously myopic cognitive radios (FIP, not synchronous timing)

• Rapid analysis– Verify goals and actions satisfy a single model, and

steady-states, convergence, and stability

• An intuition as to what conditions will be needed to field successful cognitive radio decision rules.

• A natural understanding of distributed interactive behavior which simplifies the design of low complexity distributed algorithms


4

Game Models of Cognitive Radio Networks

• Almost as many models as there are algorithms

• Normal form game excellent for capturing single iteration of a complex system

• Most other models add features to this model – Time, decision rules, noisy

observations, Natural states

• Some can be recast as a normal form game– Extensive form game

• Normal Form Game– ⟨N, A, ui⟩

• Supermodular Game–

• Potential Game–

• Repeated Game– ⟨N, A, ui, di⟩

• Asynchronous Game– ⟨N, A, ui, di, T⟩

• Extensive Form Game– ⟨N, A, ui, di, T⟩

• TU Game– ⟨N, v⟩

• Bargaining Game– ⟨N, v⟩

( )2

0i

i j

u a

a a

∂≥

∂ ∂

( ) ( )22

ji

i j j i

u au a

a a a a

∂∂=

∂ ∂ ∂ ∂


5

Steady-states• Different game models have

different steady-state concepts

• Games can have many, one, or no steady-states

• Nash equilibrium (and its variants) is most commonly applied concept– Excellent for distributed

noncollaborative algorithms

• Games with punishment and Coalitional games tend to have a very large number of equilibria

• Game theory permits analysis of steady-states without knowing specific decision rules

• Nash Equilibrium• Strong Nash

Equilibrium

• Core• Shapley Value• Nash Bargaining

Solution


6

Optimality

• Numerous different notions of optimality

• Many are contradictory

• Use whatever metric makes sense

• Pareto Efficiency

• Objective Maximization

• Gini Index

• Shapley Value

• Nash Bargaining Solution

2


7

Convergence• Showing existence

of steady-states is insufficient; need to know if radios can reach those states

• FIP (potential games) gives the broadest convergence conditions

• Random timing actually helps convergence


8

Noise

• Unbounded noise causes all networks to theoretically behave as ergodic Markov chains

• Important to show Lyapunov stabiltiy

• Noisy observations cause noisy implementation to an outside observer

– Trembling hand


9

Game Theory and Design• Numerous techniques for

improving the behavior of cognitive radio networks

• Techniques can be combined• Potential games yield lowest

complexity implementations– Judicious design of goals,

actions,

• Practical limitations limit effectiveness of punishment– Observing actions

– Likely best when a referee exists

• Policy can limit the worst effects, doesn’t really address optimality or convergence issues

• Supermodular games– Steady state exists– Best response convergence

• Potential games– Identifiable steady-states– All self-interested decision rules

converge– Lyapunov function exists for isolated

equilibria

• Punishment– Can enforce any action tuple– Can be brittle when distributed

• Policy– Limits worst case performance

• Cost function– Reshapes preferences– Could damage underlying structure if

not a self-interested cost

• Centralized– Can theoretically realize any result

– Consumes overhead– Slower reactions


10

Future Directions in Game Theory and Design

• Integrate policy and potential games

• Integration of coalitional and distributed

forms

• Increasing dimensionality of action sets

– Cross-layer

• Integration of dynamic and hierarchical policies and games


11

Future Direction in Regulation

• Can incorporate optimization into policy by specifying goals

• In theory, correctly implementing goals, correctly implementing actions, and exhaustive self-interested adaptation is enough to predict behavior (at least for potential games)– Simpler policy certification

• Provable network behavior


12

Avenues for Future Research on Game Theory and CRNs• Integration of bargaining,

centralized, and distributed algorithms into a common framework

• Cross-layer algorithms

• Better incorporating performance of classification techniques into behavior

• Asymmetric potential games

• Bargaining algorithms for cognitive radio

• Improving the brittleness of punishment in distributed implementations with imperfect observations

• Imperfection in observations in general

• Time varying game models while inferring convergence, stability…

• Combination of policy, potential games, coalition formation, and token economies

• Can be modeled as a game with to types of players

– Distributed cognitive radios

– Dynamic policy provider

3


13

Questions?

Cognitive RadioTechnologiesCCognitiveognitive RRadioadioTTechnologiesechnologies

CRTCRTCRTCRTCRTCRTCRTCRTCRTCRTCRTCRT

www.crtwireless.com

crt information game theory in the analysis...

Documents