crt information game theory in the analysis...
TRANSCRIPT
Cognitive Radio Technologies, 2007
James Neel
(540) 230-6012
www.crtwireless.com
DySPAN 2007
April 17, 2007
Game Theory in the Analysis and Design of Cognitive Radio Networks
Small business officially incorporated in Feb 2007 to commercialize cognitive radio research
Email: [email protected]
Website: crtwireless.com
Tel: 540-230-6012
Mailing Address:
Cognitive Radio Technologies, LLC
147 Mill Ridge Rd, Suite 119
Lynchburg, VA 24502
CRT Information
[Figure: three plots versus iteration (0 to 300): Channel, I_i(f) (dBm), and Φ(f) (dBm)]
CRT’s Strengths
• Analysis of networked cognitive radio algorithms (game theory)
• Design of low complexity, low overhead (scalable), convergent and stable cognitive radio algorithms
– Infrastructure, mesh, and ad-hoc networks
– DFS, TPC, AIA, beamforming, routing, topology formation
Net Interference
Average Link Interference
Frequency Adjustments
Statistics from a yet-to-be-published DFS algorithm for ad-hoc networks: P2P 802.11a links in a 0.5 km x 0.5 km area (discussed in tomorrow’s slides)
[Figure: steady-state interference levels (dBm) versus number of links (0 to 100). Curves: Typical Worst Case Without Algorithm, Average Without Algorithm, Typical Worst Case With Algorithm, Average With Algorithm, Collision Threshold]
Tutorial Background
• Most material from my three-week defense
– Very understanding committee
– Dissertation online @ http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/
– Original defense slides @ http://www.mprg.org/people/gametheory/Meetings.shtml
• Other material from a training short course I gave in summer 2003
– http://www.mprg.org/people/gametheory/Class.shtml
• Eventually will be formalized into a book
Approximate Schedule
Time         Material
1:30-2:15    Cognitive Radio and Game Theory (51)
2:15-3:00    Steady-state Solution Concepts (38)
3:00-3:15    Performance Metrics (11)
3:15-3:30    Break
3:30-4:10    Notion of Time and Imperfections in Games (34)
4:10-4:45    Using Game Theory to Design Cognitive Radio Networks (28)
4:45-5:00    Summary (14)
General Comments on Tutorial
• More leisurely sources of information:
– D. Fudenberg, J. Tirole, Game Theory, MIT Press, 1991.
– R. Myerson, Game Theory: Analysis of Conflict, Harvard University Press, 1991.
– M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
– J. Neel, J. Reed, A. MacKenzie, Cognitive Radio Network Performance Analysis, in Cognitive Radio Technology, B. Fette, ed., Elsevier, August 2006.
• “This talk is intended to provide attendees with knowledge of the most important game theoretic concepts employed in state-of-the-art dynamic spectrum access networks.”
• Lots of concepts, no proofs: cramming 2-3 semesters of game theory into 3.5 hours
• Tutorial can provide a quick reference for concepts discussed at the conference
Image modified from http://hacks.mit.edu/Hacks/by_year/1991/fire_hydrant/
Cognitive Radio and Game Theory
Cognitive Radio, Game Theory, Relationship Between the Two
Basic Game Concepts and Cognitive Radio Networks
• Assumptions about Cognitive Radios and Cognitive Radio Networks
– Definition and concept of cognitive radio as used in this presentation
– Design Challenges Posed by Cognitive Radio Networks
– A Model of a Cognitive Radio Network
• High Level View of Game Theory
– Common Components
– Common Models
• Relationship between Game Theory and Cognitive Radio Networks
– Modeling a Generic Cognitive Radio Network as a Game
– Differences in Typical Assumptions
– Limitations of Application
Cognitive Radio: Basic Idea
• Cognitive radios enhance the control process by adding
– Intelligent, autonomous control of the radio
– An ability to sense the environment
– Goal driven operation
– Processes for learning about environmental parameters
– Awareness of its environment
• Signals
• Channels
– Awareness of capabilities of the radio
– An ability to negotiate waveforms with other radios
[Figure: software radio stack with layers Board package (RF, processors), Board APIs, OS, Software Arch Services, Waveform Software, and a Control Plane alongside]
Software radios permit network or user to control the operation of a software radio
OODA Loop: (continuously)
• Observe outside world
• Orient to infer meaning of observations
• Adjust waveform as needed to achieve goal
• Implement processes needed to change waveform
Other processes: (as needed)
• Adjust goals (Plan)
• Learn about the outside world, needs of user, …
Conceptual Operation
[Figure: cognition cycle. Labels: Observe, Orient, Plan, Decide, Act, Learn, Outside World, New States, States; Pre-process, Parse Stimuli, Establish Priority, Infer from Context, Infer from Radio Model, Select Alternate Goals, Negotiate Protocols, Generate “Best” Waveform, Allocate Resources, Initiate Processes; paths marked Immediate, Urgent, Normal; inputs marked User Driven (Buttons) and Autonomous]
Figure adapted from Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp. 3-10.
Implementation Classes
• Weak cognitive radio
– Radio’s adaptations determined by hard coded algorithms and informed by observations
– Many may not consider this to be cognitive (see discussion related to Fig 6 in 1900.1 draft)
• Strong cognitive radio
– Radio’s adaptations determined by conscious reasoning
– Closest approximation is the ontology reasoning cognitive radios
In general, strong cognitive radios have potential to achieve both much better and much worse behavior in a network, but may not be realizable.
Brilliant Algorithms and Cognitive Engines
• Most research focuses on development of algorithms for:
– Observation
– Decision processes
– Learning
– Policy
– Context Awareness
• Some complete OODA loop algorithms
• In general different algorithms will perform better in different situations
• Cognitive engine can be viewed as a software architecture
• Provides structure for incorporating and interfacing different algorithms
• Mechanism for sharing information across algorithms
• No current implementation standard
Example Architecture from CWT
[Figure: cognitive engine block diagram. Labeled blocks: Cognitive System Controller; Cognitive System Module; Knowledge Base (Short Term Memory, Long Term Memory); Evolver; WSGA (Parameter Set, Initial Chromosomes, Objectives and weights, System Chromosome); Simulated Meters and Actual Meters, compared as |(Simulated Meters) – (Actual Meters)|; Decision Maker; User Domain and User Model (User preference, Local service facility); Policy Domain and Policy Model (Regulatory Information); Security (User data security, System/Network security); Radio Resource Monitor; Radio Performance Monitor; Channel Identifier; Waveform Recognizer; Search Space Config; CE-Radio Interface; Performance API; Hardware/platform API; Radio; WMS; X86/Unix Terminal; Radio-domain cognition. The diagram is annotated with Observation, Orientation, Decision, Action, Learning, and Models]
DFS in 802.16h
• Drafts of 802.16h defined a generic DFS algorithm which implements observation, decision, action, and learning processes
• Very simple implementation
Modified from Figure h1 IEEE 802.16h-06/010 Draft IEEE Standard for Local and metropolitan area networks Part 16: Air Interface for Fixed Broadband Wireless Access Systems Amendment for Improved Coexistence Mechanisms for License-Exempt Operation, 2006-03-29
[Figure: DFS flowchart. Boxes: Channel Availability Check on next channel; Available? (Yes/No); Choose Different Channel; Log of Channel Availability; Service in function; In service monitoring of operating channel; Detection? (Yes/No); Stop Transmission; Start Channel Exclusion timer; Channel unavailable for Channel Exclusion time; Select and change to new available channel in a defined time with a max. transmission time; Background in service monitoring (on non-operational channels); Available? (Yes/No). Boxes are annotated as Observation, Decision, Action, or Learning]
Used cognitive radio definition
• A cognitive radio is a radio whose control processes permit the radio to leverage situational knowledge and intelligent processing to autonomously adapt towards some goal.
• Intelligence as defined by [American Heritage_00] as “The capacity to acquire and apply knowledge, especially toward a purposeful goal.”
– To eliminate some of the mess, I would love to just call cognitive radio “intelligent” radio, i.e., a radio with the capacity to acquire and apply knowledge, especially toward a purposeful goal
The Interaction Problem
• Outside world is determined by the interaction of numerous cognitive radios
• Adaptations spawn adaptations
[Figure label: Outside World]
Dynamic Spectrum Access Pitfall
• Suppose
– g31 > g21; g12 > g32; g23 > g13
• Without loss of generality
– g31, g12, g23 = 1
– g21, g32, g13 = 0.5
• Infinite Loop!
– 4, 5, 1, 3, 2, 6, 4, …
Interference Characterization
State   Chan.      Interf.
0       (0,0,0)    (1.5,1.5,1.5)
1       (0,0,1)    (0.5,1,0)
2       (0,1,0)    (1,0,0.5)
3       (0,1,1)    (0,0.5,1)
4       (1,0,0)    (0,0.5,1)
5       (1,0,1)    (1,0,0.5)
6       (1,1,0)    (0.5,1,0)
7       (1,1,1)    (1.5,1.5,1.5)
[Figure: three radios labeled 1, 2, 3]
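The loop on this slide can be reproduced with a short simulation. This is a minimal sketch, assuming each radio switches channel only when that strictly lowers its own interference and that radios are polled one at a time in index order; the function names are illustrative and not from the tutorial.

```python
# G[(j, i)] is the interference radio j causes at radio i (radios are 0-indexed here).
G = {(2, 0): 1.0, (0, 1): 1.0, (1, 2): 1.0,   # g31, g12, g23 = 1
     (1, 0): 0.5, (2, 1): 0.5, (0, 2): 0.5}   # g21, g32, g13 = 0.5

def interference(i, channels):
    """Interference seen by radio i from radios sharing its channel."""
    return sum(G[(j, i)] for j in range(3) if j != i and channels[j] == channels[i])

def state_index(channels):
    """Binary index of the channel vector, matching the 0..7 states on the slide."""
    c1, c2, c3 = channels
    return 4 * c1 + 2 * c2 + c3

def step(channels):
    """The first radio (in index order) that can lower its interference switches."""
    for i in range(3):
        flipped = list(channels)
        flipped[i] = 1 - flipped[i]
        if interference(i, flipped) < interference(i, channels):
            return tuple(flipped)
    return channels  # no radio wants to move: steady state

def trajectory(start, max_steps=10):
    """Sequence of state indices until a steady state or a return to the start."""
    seq, cur = [state_index(start)], start
    for _ in range(max_steps):
        nxt = step(cur)
        if nxt == cur:
            break
        cur = nxt
        seq.append(state_index(cur))
        if seq[-1] == seq[0]:
            break
    return seq

print(trajectory((1, 0, 0)))
```

Starting from state 4, exactly one radio has an improving move in every state it visits, so the cycle does not depend on the polling order assumed here.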
Implications
• In one out of every four deployments, the example system will enter into an infinite loop
• As network scales, probability of entering an infinite loop goes to 1:
– 2 channels
– k channels
• Even for apparently simple algorithms,
ensuring convergence and stability will be nontrivial
$p(\text{loop}) \ge 1 - (3/4)^{\binom{n}{3}}$ (2 channels)
( ) ( )111 1 2
n kCk
p loop+− +≥ − −
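To get a feel for how quickly the two-channel bound approaches 1, it can be evaluated for a few network sizes. A minimal sketch, assuming the bound 1 - (3/4)^C(n,3) for n radios, where C(n,3) is the number of radio triples; the function name is illustrative.

```python
from math import comb

def loop_probability_bound(n):
    """Lower bound on p(loop) for n radios sharing 2 channels."""
    return 1.0 - 0.75 ** comb(n, 3)

for n in (3, 5, 10):
    print(n, round(loop_probability_bound(n), 4))
```

With n = 3 there is a single triple, which recovers the one-in-four figure quoted for the example system.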
Locally optimal decisions that lead
to globally undesirable networks
• Scenario: Distributed SINR maximizing power control in a single cluster
• For each link, it is desirable to increase transmit power in response to increased interference
• Steady state of network is all nodes transmitting at maximum power
[Figure: Power and SINR plots]
Insufficient to consider only a single link; must consider interaction
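The power control scenario can be sketched numerically. This is an illustrative model only: it assumes unit link gains to a common receiver, a hypothetical noise level, and three hypothetical power settings, none of which come from the slides.

```python
P_LEVELS = [0.1, 0.5, 1.0]   # hypothetical transmit power options
NOISE = 0.01                 # hypothetical noise power at the common receiver

def sinr(i, powers):
    """SINR of radio i with unit gains: own power over noise plus others' power."""
    interference = sum(p for j, p in enumerate(powers) if j != i)
    return powers[i] / (NOISE + interference)

def best_response(i, powers):
    """Power level that maximizes radio i's own SINR with the others held fixed."""
    return max(P_LEVELS, key=lambda p: sinr(i, powers[:i] + [p] + powers[i + 1:]))

def run(powers, sweeps=5):
    """Round-robin adaptation: each radio in turn plays its best response."""
    powers = list(powers)
    for _ in range(sweeps):
        for i in range(len(powers)):
            powers[i] = best_response(i, powers)
    return powers

steady = run([0.1, 0.1, 0.1, 0.1])
print(steady, sinr(0, steady), sinr(0, [0.1] * 4))
```

Under these assumptions every radio ends at maximum power, while the SINR each one obtains is barely better than when all radios used the lowest setting.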
Potential Problems with Networked Cognitive Radios
Distributed
• Infinite recursions
• Instability (chaos)
• Vicious cycles
• Adaptation collisions
• Equitable distribution of resources
• Byzantine failure
• Information distribution
Centralized
• Signaling Overhead
• Complexity
• Responsiveness
• Single point of failure
Cognitive Networks
• Rather than having intelligence reside in a single device, intelligence can reside in the network
• Effectively the same as a centralized approach
• Gives greater scope to the available adaptations
– Topology, routing
– Conceptually permits adaptation of core and edge devices
• Can be combined with cognitive radio for mix of capabilities
• Focus of E2R program
R. Thomas et al., “Cognitive networks: adaptation and learning to achieve end-to-end performance objectives,” IEEE Communications Magazine, Dec. 2006
Network Analysis Objectives
1. Steady State Characterization
– Is it possible to predict behavior in the system?
– How many different outcomes are possible?
2. Steady State Optimality
– Are these outcomes desirable?
– Do these outcomes maximize the system target parameters?
3. Convergence
– How do initial conditions impact the system steady state?
– What processes will lead to steady state conditions?
– How long does it take to reach the steady state?
4. Stability/Noise
– How do system variations/noise impact the system?
– Do the steady states change with small variations/noise?
– Is convergence affected by system variations/noise?
5. Scalability
– As the number of devices increases, how is the system impacted?
– Do previously optimal steady states remain optimal?
[Figure: one panel per objective, each plotting Radio 1’s available actions (a1) against Radio 2’s available actions (a2) with NE1, NE2, and NE3 marked; the last panel adds a3. The figure also carries a “focus” label]
General Model (Focus on OODA Loop Interactions)
• Cognitive Radios
• Set N
• Particular radios, i, j
[Figure label: Outside World]
General Model (Focus on OODA Loop Interactions)
Actions
• Different radios may have different capabilities
• May be constrained by policy
• Should specify each radio’s available actions to account for variations
• Actions for radio i: Ai
[Figure label: Act]
General Model (Focus on OODA Loop Interactions)
Decision Rules
• Maps observations to actions
– di: O → Ai
• Intelligence implies that these actions further the radio’s goal
– ui: O → R
• Interesting problem: simultaneously modeling behavior of ontological and procedural radios
[Figure label: Decide]
[Callout: Implies very simple, deterministic function, e.g., standard interference function]
Cognitive Radio Network Modeling Summary
• Decision making radios: i, j ∈ N, |N| = n
• Actions for each radio: A = A1 × A2 × ⋯ × An
• Observed Outcome Space: O
• Goals: uj: O → R (uj: A → R)
• Decision Rules: dj: O → Aj (dj: A → Aj)
• Timing: T = T1 ∪ T2 ∪ ⋯ ∪ Tn
Comments on Timing
• When decisions are made also matters, and different radios will likely make decisions at different times
• Tj: when radio j makes its adaptations
– Generally assumed to be an infinite set
– Assumed to occur at discrete times
• Consistent with DSP implementation
• T = T1 ∪ T2 ∪ ⋯ ∪ Tn
• t ∈ T
Decision timing classes
• Synchronous
– All at once
• Round-robin
– One at a time in order
– Used in a lot of analysis
• Random
– One at a time in no order
• Asynchronous
– Random subset at a time
– Least overhead for a network
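The four timing classes can be made concrete by listing which radios adapt at each discrete time step. A minimal sketch: the function name, the seed, and the 0.5 inclusion probability used for the asynchronous case are assumptions for illustration.

```python
import random

def update_sets(n, timing, steps, seed=0):
    """Return, for each time step, the set of radios (0..n-1) that adapt."""
    rng = random.Random(seed)
    radios = list(range(n))
    schedule = []
    for t in range(steps):
        if timing == "synchronous":        # all at once
            active = set(radios)
        elif timing == "round-robin":      # one at a time, in order
            active = {radios[t % n]}
        elif timing == "random":           # one at a time, in no order
            active = {rng.choice(radios)}
        elif timing == "asynchronous":     # random subset (possibly empty)
            active = {r for r in radios if rng.random() < 0.5}
        else:
            raise ValueError(f"unknown timing class: {timing}")
        schedule.append(active)
    return schedule

print(update_sets(3, "round-robin", 4))
```

Synchronous, round-robin, and random timing are all special cases of choosing a subset of radios per step, which is why asynchronous timing is the least restrictive of the four.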
Basic Game Components
1. A (well-defined) set of 2 or more players
2. A set of actions for each player.
3. A set of preference relationships for each player for each possible action tuple.
• More elaborate games exist with more components, but these three must always be there.
• Some also introduce an outcome function which maps action tuples to outcomes, which are then valued by the preference relations.
• Games with just these three components (or a variation on the preference relationships) are said to be in Normal Form or Strategic Form
Set of Players (decision makers)
• N – set of n players consisting of players “named” 1, 2, 3,…,i, j,…,n
• Note the n does not mean that there are 14 players in every game.
• Other components of the game that “belong” to a particular player are normally indicated by a subscript.
• Generic players are most commonly written as i or j.
• Usage: N is the SET of players, n is the number of players.
• N \ i = {1, 2, …, i-1, i+1, …, n}: all players in N except for i
Actions
Ai – Set of available actions for player i
ai – A particular action chosen by i, ai ∈ Ai
A – Action Space, Cartesian product of all Ai
A=A1× A2×· · · × An
a – Action tuple – a point in the Action Space
A-i – Another action space A formed from
A-i =A1× A2×· · · ×Ai-1 × Ai+1 × · · · × An
a-i – A point from the space A-i
A = Ai × A-i
Example Two Player Action Space
A1 = A2 = [0, ∞)
A = A1 × A2
[Figure: plane with axes A1 = A-2 and A2 = A-1, showing a point a with a1 = a-2 and a2 = a-1, and a point b with b1 = b-2 and b2 = b-1]
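For finite action sets, the notation maps directly onto tuples. A minimal sketch with three hypothetical action sets (the labels are placeholders, not from the slides), showing A as a Cartesian product and how a point a splits into a_i and a_-i.

```python
from itertools import product

# Hypothetical finite action sets for three players.
A1 = ["low", "high"]
A2 = ["ch1", "ch2", "ch3"]
A3 = ["on", "off"]

A = list(product(A1, A2, A3))        # A = A1 x A2 x A3

def minus(i, a):
    """a_-i: the tuple a with player i's component removed (i is 0-indexed)."""
    return a[:i] + a[i + 1:]

def join(i, a_i, a_minus_i):
    """Rebuild the full tuple a from the pair (a_i, a_-i)."""
    return a_minus_i[:i] + (a_i,) + a_minus_i[i:]

a = ("high", "ch2", "off")
print(len(A), minus(1, a), join(1, "ch2", minus(1, a)))
```

The round trip through minus and join mirrors the identity A = Ai × A-i.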
Preference Relations (1/2)
A preference relation expresses an individual player’s desirability of one outcome over another (a binary relationship).
$o \succeq_i o^*$: o is preferred at least as much as o* by player i
$\succeq_i$ Preference Relationship (prefers at least as much as)
$\succ_i$ Strict Preference Relationship (prefers strictly more than)
$\sim_i$ “Indifference” Relationship (prefers equally)
$o \succ_i o^*$ iff $o \succeq_i o^*$ but not $o^* \succeq_i o$
$o \sim_i o^*$ iff $o \succeq_i o^*$ and $o^* \succeq_i o$
Preference Relations (2/2)
• Games generally assume the relationship between actions and outcomes is invertible, so preferences can be expressed over action vectors.
• Preferences are really an ordinal relationship
– Know that a player prefers one outcome to another, but quantifying by how much introduces difficulties
Utility Functions (1/2) (Objective Fcns, Payoff Fcns)
A mathematical description of preference relationships.
$u_i : A \to \mathbb{R}$
Maps action space to set of real numbers.
Preference Relation then defined as
$a \succeq_i a^*$ iff $u_i(a) \ge u_i(a^*)$
$a \succ_i a^*$ iff $u_i(a) > u_i(a^*)$
$a \sim_i a^*$ iff $u_i(a) = u_i(a^*)$
Utility Functions (2/2)
By quantifying preference relationships, all sorts of valuable mathematical operations can be introduced.
Also note that the quantification operation is not unique as long as relationships are preserved. Many map preference relationships to [0,1].
Example
Jack prefers Apples to Oranges
$\text{Apples} \succ_{\text{Jack}} \text{Oranges}$, so $u_{\text{Jack}}(\text{Apples}) > u_{\text{Jack}}(\text{Oranges})$
a) uJack(Apples) = 1, uJack(Oranges) = 0
b) uJack(Apples) = -1, uJack(Oranges) = -7.5
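Both assignments in the example represent the same preference because only the ordering matters. A minimal sketch of that check; the helper name and the third, order-reversing utility are assumptions added for contrast.

```python
def same_ordering(u, v, outcomes):
    """True if utilities u and v rank every pair of outcomes identically."""
    def sign(x):
        return (x > 0) - (x < 0)
    return all(sign(u[a] - u[b]) == sign(v[a] - v[b])
               for a in outcomes for b in outcomes)

outcomes = ["Apples", "Oranges"]
u_a = {"Apples": 1, "Oranges": 0}          # option (a) from the slide
u_b = {"Apples": -1, "Oranges": -7.5}      # option (b) from the slide
u_c = {"Apples": 0, "Oranges": 1}          # hypothetical: reverses the preference

print(same_ordering(u_a, u_b, outcomes), same_ordering(u_a, u_c, outcomes))
```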
Normal Form Games (Strategic Form Games)
$G = \langle N, A, \{u_i\} \rangle$
In normal form, a game consists of three primary components
N: Set of Players
Ai: Set of Actions Available to Player i
A: Action Space, $A = A_1 \times A_2 \times \cdots \times A_n$
{ui}: Set of Individual Objective Functions, $u_i : A \to \mathbb{R}$
Normal Form Games in Matrix Representation
N = {1,2}, A1 = {a1,b1}, A2 = {a2,b2}
        a2                          b2
a1      u1(a1, a2), u2(a1, a2)      u1(a1, b2), u2(a1, b2)
b1      u1(b1, a2), u2(b1, a2)      u1(b1, b2), u2(b1, b2)
Useful for representing 2 player games with finite action sets.
Player 1’s actions are indexed by rows.
Player 2’s actions are indexed by columns.
Each entry is the payoff vector, (u1, u2), corresponding to the action tuple
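In code, the matrix representation is naturally a mapping from action tuples to payoff vectors. A minimal sketch; the payoff numbers are placeholders chosen for illustration and do not come from the slides.

```python
# Hypothetical 2x2 game: payoff[(row action, column action)] = (u1, u2).
A1 = ["a1", "b1"]   # player 1's actions index the rows
A2 = ["a2", "b2"]   # player 2's actions index the columns
payoff = {
    ("a1", "a2"): (2, 1), ("a1", "b2"): (0, 0),
    ("b1", "a2"): (0, 0), ("b1", "b2"): (1, 2),
}

def u(i, a):
    """Utility of player i (1 or 2) at action tuple a = (row, column)."""
    return payoff[a][i - 1]

def as_rows():
    """Matrix layout: one list of (u1, u2) entries per row action."""
    return [[payoff[(r, c)] for c in A2] for r in A1]

for label, row in zip(A1, as_rows()):
    print(label, row)
```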
Cognitive radios are naturally modeled as players in a game
[Figure: cognition cycle (Observe, Orient, Plan, Decide, Act, Learn, Outside World) annotated with game components: Goal, Utility Function, Utility Function Arguments, Outcome Space, Action Sets, Decision Rules]
Adapted from Mitola, “Cognitive Radio for Flexible Mobile Multimedia Communications”, IEEE Mobile Multimedia Conference, 1999, pp. 3-10.
Interaction is naturally modeled as a game
[Figure: Radio 1 and Radio 2 each choose Actions through their Decision Rules; the Action Space maps to the Outcome Space through $f : A \to O$ (informed by communications theory), giving outcomes $(\hat{\gamma}_1, \hat{\gamma}_2)$ that are valued by $u_1(\hat{\gamma}_1)$ and $u_2(\hat{\gamma}_2)$]
When Game Theory can be Applied
[Figure: cognition cycle annotated with the capability levels listed here]
Level
0 SDR
1 Goal Driven
2 Context Aware
3 Radio Aware
4 Planning
5 Negotiating
6 Learns Environment
7 Adapts Plans
8 Adapts Protocols
If distributed adaptation doesn’t occur, it’s not a game
Unconstrained action sets (radios can make up new adaptations) or undefined goals (utility functions) make analysis impractical
Suitable for game theory analysis
Game Theory applies to:
1. Adaptive aware radios
2. Cognitive radios that learn about their environment
Conditions for Applying Game Theory to CRNs
• Conditions for rationality
– Well defined decision making processes
– Expectation of how changes impact performance
• Conditions for a nontrivial game
– Multiple interactive decision makers
– Nonsingleton action sets
• Conditions generally satisfied by distributed dynamic CRN schemes
Example Application Appropriateness
• Inappropriate applications
– Cellular Downlink power control (single cell)
– Site Planning
– A single cognitive network
• Appropriate applications
– Multiple interactive cognitive networks
– Distributed power control on non-orthogonal waveforms
• Ad-hoc power control
• Cell breathing
– Adaptive MAC
– Distributed Dynamic Frequency Selection
– Network formation (localized objectives)
Some differences between game models and cognitive radio network model
              Player        Cognitive Radio
Knowledge     Knows A       Can learn O (may know or learn A)
f : A → O     Invertible    Not invertible (noise)
              Constant      May change over time (though relatively fixed for short periods)
              Known         Has to learn
Preferences   Ordinal       Cardinal (goals)
• Assuming numerous iterations, normal form game only has a single stage.
– Useful for compactly capturing modeling components at a single stage
– Normal form game properties will be exploited in the analysis of other games
– Other game models discussed throughout this presentation
Summary
• Adaptations of cognitive radios interact
– Adaptations can have unexpected negative results
• Infinite recursions, vicious cycles
– Insufficient to consider behavior of only a single link in the design
• Behavior of collection of radios can be modeled as a game
• Some differences in models and assumptions but high level mapping is fairly close
• As we look at convergence, performance, collaboration, and stability, we’ll extend the model
Equilibrium Concepts
Nash Equilibria, Mixed Strategy Equilibria, Coalitional Games, the Core, Shapley Value, Nash Bargaining
Steady-states
• Recall model of ⟨N, A, {di}, T⟩, which we characterize with the evolution function d
• Steady-state is a point where a* = d(a*) for all t ≥ t*
• Obvious solution: solve for fixed points of d.
• For non-cooperative radios, if a* is a fixed point under synchronous timing, then it is also a fixed point under the other three timings.
• Works well for convex action spaces
– Not always guaranteed to exist
– Value of fixed point theorems
• Not so well for finite spaces
– Generally requires exhaustive search
Nash Equilibrium
“A steady-state where each player holds a correct expectation of the other players’ behavior and acts rationally.” - Osborne
An action vector from which no player can profitably unilaterally deviate.
Definition
An action tuple a is a NE if for every i ∈ N
$u_i(a_i, a_{-i}) \ge u_i(b_i, a_{-i})$
for all bi ∈ Ai.
Note: showing that a point is a NE says nothing about the process by which the steady state is reached, nor anything about its uniqueness. Also note that we are implicitly assuming that only pure strategies are possible in this case.
Examples
• Cognitive Radios’ Dilemma
– Two radios have two signals to choose between, {n, w} and {N, W}
– n and N do not overlap
– Higher throughput from operating as a high power wideband signal when other is narrowband
• Jamming Avoidance
– Two channels
– No NE
                 Jammer 0    Jammer 1
Transmitter 0    (-1,1)      (1,-1)
Transmitter 1    (1,-1)      (-1,1)
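For small finite games, the NE definition can be checked by exhaustive search. A minimal sketch: the jamming payoffs assume the transmitter loses when both pick the same channel, and the coordination game is a hypothetical contrast case that is not on this slide.

```python
from itertools import product

def pure_nash_equilibria(payoffs, actions1, actions2):
    """All action tuples from which neither player can profitably deviate alone."""
    equilibria = []
    for a1, a2 in product(actions1, actions2):
        u1, u2 = payoffs[(a1, a2)]
        no_gain_1 = all(u1 >= payoffs[(b1, a2)][0] for b1 in actions1)
        no_gain_2 = all(u2 >= payoffs[(a1, b2)][1] for b2 in actions2)
        if no_gain_1 and no_gain_2:
            equilibria.append((a1, a2))
    return equilibria

# (transmitter channel, jammer channel) -> (transmitter payoff, jammer payoff)
jamming = {(0, 0): (-1, 1), (0, 1): (1, -1),
           (1, 0): (1, -1), (1, 1): (-1, 1)}

# Hypothetical coordination game, used only to show a case with pure NE.
coordination = {(0, 0): (1, 1), (0, 1): (0, 0),
                (1, 0): (0, 0), (1, 1): (3, 3)}

print(pure_nash_equilibria(jamming, [0, 1], [0, 1]))
print(pure_nash_equilibria(coordination, [0, 1], [0, 1]))
```

The search cost grows with the size of the action space, which is the exhaustive-search burden that finite games generally carry.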
How do the players find the Nash Equilibrium?
• Preplay Communication
– Before the game, discuss their options. Note only NE are suitable candidates for coordination, as one player could profitably violate any other agreement.
• Rational Introspection
– Based on what each player knows about the other players, reason what the other players would do in their own best interest. (Best Response - tomorrow) Points where everyone would be playing “correctly” are the NE.
• Focal Point
– Some distinguishing characteristic of the tuple causes it to stand out. The NE stands out because it’s every player’s best response.
• Trial and Error
– Starting on some tuple which is not a NE, a player “discovers” that deviating improves its payoff. This continues until no player can improve by deviating. Only guaranteed to work for Potential Games (couple weeks)
Nash Equilibrium as a Fixed Point
• Individual Best Response: $\hat{B}_i(a) = \{ b_i \in A_i : u_i(b_i, a_{-i}) \ge u_i(a_i, a_{-i}) \ \forall a_i \in A_i \}$
• Synchronous Best Response: $\hat{B}(a) = \times_{i \in N} \hat{B}_i(a)$
• Nash Equilibrium as a fixed point: $a^* = \hat{B}(a^*)$
• Fixed point theorems can be used to establish existence of NE (see dissertation)
• NE can be solved by implied system of equations
Example solution for Fixed Point by Solving for Best Response Fixed Point
• Bandwidth Allocation Game
– Five cognitive radios with each radio, i, free to determine the number of simultaneous frequency hopping channels the radio implements, ci ∈ [0,∞).
– Goal: $u_i(c) = P(c)\,c_i - C_i(c_i)$, here $u_i(c) = \left(B - \sum_{k \in N} c_k\right) c_i - K c_i$
– P(c): fraction of symbols that are not interfered with (making P(c)ci the goodput for radio i)
– Ci(ci): radio i’s cost for supporting ci simultaneous channels.
Best Response Analysis
Goal: $u_i(c) = \left(B - \sum_{k \in N} c_k\right) c_i - K c_i$
Best Response: $\hat{c}_i = B_i(c) = \left(B - K - \sum_{k \in N \setminus i} c_k\right) / 2$
Simultaneous System of Equations Solution: $\hat{c}_i = (B - K)/6 \quad \forall i \in N$
Generalization: $\hat{c}_i = (B - K)/(|N| + 1) \quad \forall i \in N$
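The closed-form solution can be checked numerically by iterating the best response. A minimal sketch, assuming the best response (B - K - sum of the other radios' c_k)/2 clipped at zero, round-robin updates, and hypothetical values for B and K.

```python
def best_response(i, c, B, K):
    """Radio i's best response to the other radios' channel counts."""
    others = sum(c) - c[i]
    return max(0.0, (B - K - others) / 2.0)

def round_robin(n, B, K, sweeps=200):
    """Each radio in turn replaces its choice with its best response."""
    c = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            c[i] = best_response(i, c, B, K)
    return c

B, K, n = 10.0, 1.0, 5                 # hypothetical parameter values
c = round_robin(n, B, K)
closed_form = (B - K) / (n + 1)        # (B - K)/6 for five radios
print([round(x, 6) for x in c], closed_form)
```

Round-robin timing is used here on purpose: for five radios, updating everyone simultaneously with this best response overshoots and does not settle, which illustrates why decision timing matters.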
Significance of NE for CRNs
• Why not “if and only if”?
– Consider a self-motivated game with a local maximum and a hill-climbing algorithm.
– For many decision rules, NE do capture all fixed points (see dissertation)
• Identifies steady-states for all “intelligent” decision rules with the same goal.
• Implies a mechanism for policy design while accommodating differing implementations
– Verify goals result in desired performance
– Verify radios act intelligently
Key Theorem for NE Existence
• Kakutani’s Fixed Point Theorem
– Let f : X → X be an upper semi-continuous, convex-valued correspondence from a non-empty compact convex set X ⊂ Rn; then there is some x* ∈ X such that x* ∈ f(x*)
[Figure: two plots of f(x) versus x]
Nash Equilibrium Existence
Visualizable Definition of Quasi-Concavity: all upper-level sets are convex
$U(a^*) = \{ a \in A : f(a) \ge f(a^*) \}$
[Figure: two plots of f(a) versus a with points a0, a1, a2 and the upper-level set U(a1) marked]
My Favorite Mixed Strategy Story
Pure Strategies in an Extended Game
Consider an extensive form game where each stage is a strategic form game and the action space remains the same at each stage. Before play begins, each player chooses a probabilistic strategy that assigns a probability to each action in his action set. At each stage, the player chooses an action from his action set according to the probabilities he assigned before play began.
Example
Consider a video football game which will be simulated. Before the game begins, two players assign probabilities of calling running plays or passing plays for both offense and defense. In the simulation, for each down the kind of play chosen by each team is based on the initial probabilities assigned to kinds of plays. (Play NCAA2003)
Example Mixed Strategy Game
Jamming game
              a2 (q)     b2 (1-q)
a1 (p)        1, -1      -1, 1
b1 (1-p)      -1, 1      1, -1
Action Tuples    Probabilities
(a1,a2)          pq
(a1,b2)          p(1-q)
(b1,a2)          (1-p)q
(b1,b2)          (1-p)(1-q)
Expected Utilities
$U_1(p,q) = pq(1) + p(1-q)(-1) + (1-p)q(-1) + (1-p)(1-q)(1)$
$U_2(p,q) = pq(-1) + p(1-q)(1) + (1-p)q(1) + (1-p)(1-q)(-1)$
Δ(A1) = {(p, 1-p) : p ∈ [0,1]}
Δ(A2) = {(q, 1-q) : q ∈ [0,1]}
Sets of probability distributions
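The expected utilities of the jamming game are easy to evaluate directly. A minimal sketch, assuming p is the probability that player 1 plays a1, q is the probability that player 2 plays a2, and player 1 receives +1 at (a1,a2) and (b1,b2) and -1 otherwise; the function name is illustrative.

```python
def expected_utilities(p, q):
    """Expected utilities (U1, U2) of the zero-sum jamming game."""
    u1 = (p * q * 1 + p * (1 - q) * (-1)
          + (1 - p) * q * (-1) + (1 - p) * (1 - q) * 1)
    return u1, -u1   # player 2's payoff is the negative of player 1's

print(expected_utilities(0.5, 0.5))
print(expected_utilities(1.0, 0.5), expected_utilities(0.0, 0.5))
```

When q = 1/2, player 1's expected utility is zero whatever p is, and the same holds for player 2 when p = 1/2, so neither can gain by deviating from (1/2, 1/2).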
Nash Equilibrium in a Mixed Strategy Game
Definition: Mixed Strategy Nash Equilibrium
A mixed strategy profile α* is a NE iff ∀i∈N
$U_i(\alpha_i^*, \alpha_{-i}^*) \ge U_i(\beta_i, \alpha_{-i}^*) \quad \forall \beta_i \in \Delta(A_i)$
Best Response Correspondence
$BR_i(\alpha_{-i}) = \arg\max_{\alpha_i \in \Delta(A_i)} U_i(\alpha_i, \alpha_{-i})$
Alternate NE Definition
Consider $B(\alpha) = \times_{i \in N} BR_i(\alpha)$
A mixed strategy profile α* is a NE iff
$\alpha^* \in B(\alpha^*)$
Best Response Correspondences
$U_1(p,q) = pq(1) + (1-p)(1-q)(3)$
$U_2(p,q) = pq(1) + (1-p)(1-q)(3)$
$\partial u_1 / \partial p = 4q - 3$
$\partial u_2 / \partial q = 4p - 3$
$BR_1(q) = 1$ if $q > 3/4$; $[0,1]$ if $q = 3/4$; $0$ if $q < 3/4$
$BR_2(p) = 1$ if $p > 3/4$; $[0,1]$ if $p = 3/4$; $0$ if $p < 3/4$
[Figure: BR1(q) and BR2(p) plotted on the unit square with axes p(a1) and p(a2); both correspondences switch at 0.75, and their intersections are marked as Nash Equilibria]
Note: both NE of the strategic game are NE in its mixed extension
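The best response correspondences translate into a direct check of whether a profile (p, q) is a mixed strategy NE. A minimal sketch, assuming both players share the utility pq(1) + (1-p)(1-q)(3) and representing each best response set as an interval; the function names are illustrative.

```python
def U(p, q):
    """Common expected utility: pq(1) + (1-p)(1-q)(3)."""
    return p * q * 1 + (1 - p) * (1 - q) * 3

def best_response(other):
    """Best response set, as an interval (lo, hi), to the other player's probability."""
    if other > 0.75:
        return (1.0, 1.0)
    if other < 0.75:
        return (0.0, 0.0)
    return (0.0, 1.0)     # indifferent at exactly 3/4

def is_mixed_ne(p, q):
    """True if p is in BR1(q) and q is in BR2(p)."""
    lo1, hi1 = best_response(q)
    lo2, hi2 = best_response(p)
    return lo1 <= p <= hi1 and lo2 <= q <= hi2

for profile in [(1.0, 1.0), (0.0, 0.0), (0.75, 0.75), (0.5, 0.5)]:
    print(profile, is_mixed_ne(*profile))
```

The two pure NE appear as the corner profiles (1, 1) and (0, 0), and the mixed extension adds the interior equilibrium at (3/4, 3/4).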
Interesting Properties of Mixed Strategy Games
1. Every Mixed Extension of a Strategic Game has an NE.
2. A mixed strategy αi is a best response to α-i iff every action in the support of αi is itself a best response to α-i.
3. Every action in the support of any player’s equilibrium mixed strategy yields the same payoff to that player.
Coalitional Game (with transferable payoff)
• Concept: groups of players (called coalitions) conspire together to implement actions which yield a result for the coalition. The value received by the coalition is then distributed among the coalition members.
• Where do radios collaborate and distribute value?
– 802.16h interference groups: allocation of bandwidth
– Distribution of frequencies/spreading codes among cells
– File sharing in P2P network
• Transferable utility refers to the existence of some commodity for which a player’s utility increases by one unit for every unit of the commodity it receives
• Game Components, ⟨N,v⟩
– N: set of players
– Characteristic function $v : 2^N \setminus \emptyset \to \mathbb{R}$
– Coalition, S⊆N
• How is this value distributed?
– Payoff vector, (xi)i∈S
• Payoff vector is said to be S-feasible if x(S) ≤ v(S), where $x(S) = \sum_{i \in S} x_i$
The Core (Transferable)
• The Core
– For ⟨N,v⟩, the set of feasible payoff profiles (xi)i∈N for which there is no coalition S and S-feasible payoff vector (yi)i∈S for which yi > xi for all i∈S.
• General principles of the NE also apply to the Core:
– Number of solutions for a game may be anywhere from 0 to ∞
– May be stable or unstable.
Example
• Suppose three radios, N = {1,2,3}, can choose to participate in a peer-to-peer network.
• Characteristic Function
– v(N) = 1
– v({1,2}) = v({1,3}) = v({2,3}) = α ∈ [0,1]
– v({1}) = v({2}) = v({3}) = 0
• Loosely, α indicates # of duplicated files
• If α > 2/3, Core is empty
[Figure: example adaptations for α = 4/5, showing the payoff vectors x = (1/3, 1/3, 1/3), x = (2/5, 2/5, 0), x = (0, 3/5, 1/5), and x = (2/5, 0, 2/5)]
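Core membership for this three-radio example can be tested directly. A minimal sketch, assuming the standard transferable-payoff characterization of the core (the payoffs sum to v(N) and every coalition S receives at least v(S)); the function names are illustrative.

```python
from itertools import combinations

def v(S, alpha):
    """Characteristic function of the three-radio example, by coalition size."""
    return {0: 0.0, 1: 0.0, 2: alpha, 3: 1.0}[len(S)]

def in_core(x, alpha, tol=1e-9):
    """True if payoff vector x cannot be improved upon by any coalition."""
    players = range(3)
    if abs(sum(x) - v(players, alpha)) > tol:      # must distribute exactly v(N)
        return False
    for size in (1, 2):
        for S in combinations(players, size):
            if sum(x[i] for i in S) < v(S, alpha) - tol:
                return False                       # coalition S would break away
    return True

equal_split = (1 / 3, 1 / 3, 1 / 3)
print(in_core(equal_split, 0.6), in_core(equal_split, 0.8))
```

The three pair constraints add up to 2 x(N) ≥ 3α, so with x(N) = 1 no payoff vector can satisfy them all once α exceeds 2/3.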
Comments on the Core
• Possibility of empty core implies that even when radios can freely negotiate and form arbitrary coalitions, no steady-state may exist
• Frequently very large (infinite) number of steady-states, e.g., α < 2/3
– Makes it impossible to predict exact behavior
• Existence conditions for the Core exist, but covering them would require some linear programming concepts
• Related (but not addressed today) concepts:
– Bargaining Sets, Kernel, Nucleolus
Strong NE
• Concept: Assume radios are able to collaborate, but utilities aren’t necessarily transferrable
• An action tuple a* such that
$u_i(a^*) \ge u_i(a_S, a^*_{-S}) \quad \forall S \subseteq N,\ a_S \in \times_{i \in S} A_i$
[Figure: payoff matrices captioned “No Strong NE” and “Unique Strong NE”; the matrix recoverable from the slide is:]
        N             W
n       (9.6, 9.6)    (9.6, 21)
w       (21, 9.6)     (22, 22)
Motivation for Shapley value
• Core was generally either empty or very large. Want a “good” single solution.
• Terminology
– Marginal Contribution of i: $\Delta_i(S) = v(S \cup i) - v(S)$
– Interchangeability of i, j: $\Delta_i(S) = \Delta_j(S) \quad \forall S \subseteq N \setminus \{i, j\}$
– Dummy player (no synergy): $\Delta_i(S) = v(i) \quad \forall S \subseteq N \setminus i$
Axioms for Shapley Value
• Let ψ be some distribution of value for a TU coalition game
• Symmetry:– If i and j are interchangeable, then ψi(v)=ψj(v)
• Dummy:– If i is a dummy, then ψi(v) = v(i)
• Additivity:
– Given ⟨N,v⟩ and ⟨N,w⟩, ψi(v + w)= ψi(v)+ ψi(w) for all i∈N, where (v+w)(S) = v(S) + w(S)
• Balanced Contributions
– Given ⟨N,v⟩, $\psi_i(N, v) - \psi_i(N \setminus j, v^{N \setminus j}) = \psi_j(N, v) - \psi_j(N \setminus i, v^{N \setminus i})$
Cognitive Radio Technologies, 2007
24
Shapley Value
Only assignment (value) that satisfies balanced contributions; only assignment that simultaneously satisfies symmetry, dummy, and additivity axioms
$\psi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)$
Marginal Value Contributed by i
Probability that i will be next one invited to the grand coalition given that coalition S is already part of the coalition assuming random ordering.
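A short sketch of the Shapley value formula, applied to the three-radio peer-to-peer example from the Core slides. The helper names `shapley` and `make_v` are assumptions for illustration. Each term is the random-ordering weight times the marginal contribution of i.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley(players, v):
    """Shapley value of each player for characteristic function v(frozenset)."""
    n = len(players)
    values = {}
    for i in players:
        others = [p for p in players if p != i]
        total = Fraction(0)
        for size in range(n):
            # Probability that exactly this coalition precedes i in a random ordering
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for coalition in combinations(others, size):
                S = frozenset(coalition)
                total += weight * (v(S | {i}) - v(S))   # marginal contribution of i
        values[i] = total
    return values


def make_v(alpha):
    """Characteristic function of the three-radio example: value depends on |S| only."""
    table = {0: Fraction(0), 1: Fraction(0), 2: Fraction(alpha), 3: Fraction(1)}
    return lambda S: table[len(S)]


phi = shapley([1, 2, 3], make_v(Fraction(4, 5)))
print(phi)   # every radio receives 1/3, even though the Core is empty for alpha = 4/5
```

Because the three radios are interchangeable, the symmetry axiom alone already forces the equal split; the code only confirms it.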
5
Cognitive Radio Technologies, 2007
25
Implications of Shapley Value
• One form of a fair allocation– What you receive is based on the value you add
– Independent of order of arrival
– I liken it to setting salaries according to the Value Over Replacement Player concept in Baseball Statistics
• “Better” solution concept than the core as it’s a single payoff as opposed to a potentially infinite number
• Allows for analysis of relative “power” of different players in the system
Cognitive Radio Technologies, 2007
26
Bargaining Problem
• Components: ⟨F, v⟩
– Feasible payoffs F, closed convex subset of R^n
– Disagreement Point v = (v1, v2)
• What 1 or 2 could achieve without bargaining
• Example:– Even if system is jammed, still gets some throughput
– Member of 802.16h interference group and try its luck
• F is said to be essential if there is some y∈F such that y1>v1 and y2>v2
• If contracts are “binding” then F could be the payoffs corresponding to entire original action space
• Otherwise, F may need to be drawn from the set of NE or from enforceable set (see punishment in repeated games)
• A particular solution is referred to by φ(F, v) ∈Rn
Cognitive Radio Technologies, 2007
27
Desirable Bargaining Axioms about a Solution
• Strong Efficiency
– φ(F, v) is Pareto Efficient
• Individually Rational
– φ(F, v) ≥v
• Scale Covariance
– For any λ1, λ2, γ1, γ2∈R with λ1, λ2 >0, if $G = \{(\lambda_1 x_1 + \gamma_1, \lambda_2 x_2 + \gamma_2) \mid (x_1, x_2) \in F\}$ and $w = (\lambda_1 v_1 + \gamma_1, \lambda_2 v_2 + \gamma_2)$, then $\phi(G, w) = (\lambda_1 \phi_1(F, v) + \gamma_1, \lambda_2 \phi_2(F, v) + \gamma_2)$
• Independence of Irrelevant Alternatives
– If G ⊆F, G is closed and convex, and φ(F, v)∈G, then φ(G, v)=φ(F, v)
• Symmetry
– If v1=v2 and {(x1,x2)|(x2,x1)∈F} = F, then φ1(F, v)= φ2(F, v)
Cognitive Radio Technologies, 2007
28
Nash Bargaining Solution
• NBS
$\phi(F, v) \in \arg\max_{x \in F,\ x \ge v} \prod_{i \in N} (x_i - v_i)$
• Interestingly, this is the only bargaining solution which simultaneously satisfies the preceding 5 axioms
Cognitive Radio Technologies, 2007
29
GT framework for BW allocation
[Yaiche]: System Model
• N users
• L links
• Users compete for the total link capacity
• Each user has a minimum rate MRi and peak rate PRi
• Admissible rate vectors are given by $X_0 = \{x \in \mathbb{R}^N \mid x \ge MR,\ x \le PR,\ \text{and } Ax \le C\}$
C: vector of link capacities
A: L×N matrix with a_lp = 1 if link l belongs to path p, else 0.
Scenario given in H. Yaiche, R. Mazumdar, C. Rosenberg, “A game theoretic framework for bandwidth allocation and pricing in broadband networks”, IEEE/ACM Transactions on Networking, Volume: 8 , Issue: 5 , Oct. 2000, pp. 667-678.
Cognitive Radio Technologies, 2007
30
Centralized Optimization Problem
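• $\max_x \prod_{i=1}^{N} (x_i - MR_i)$
subject to
$x_i \ge MR_i, \quad i \in \{1, \ldots, N\}$
$x_i \le PR_i, \quad i \in \{1, \ldots, N\}$
$(Ax)_l \le C_l, \quad l \in \{1, \ldots, L\}$
• Unique NBS exists
(Slide annotations: v marks the disagreement point and F the feasible set.)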
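A hedged sketch of the centralized problem for the simplest case of a single shared link (L = 1), which is an assumption made here to keep the example small. Maximizing the product of (x_i - MR_i) is the same as maximizing the sum of logs, and the optimality conditions then give x_i = min(PR_i, MR_i + t) for a common surplus t that fills the link. The function name and the bisection search are illustrative, not taken from [Yaiche].

```python
def nbs_single_link(mr, pr, capacity, iters=100):
    """Nash bargaining rates on one link: x_i = min(PR_i, MR_i + t), t by bisection."""
    if sum(mr) > capacity:
        raise ValueError("minimum rates exceed the link capacity")
    if sum(pr) <= capacity:
        return list(pr)                     # capacity is not binding
    lo, hi = 0.0, float(capacity)
    for _ in range(iters):
        t = (lo + hi) / 2
        load = sum(min(p, m + t) for m, p in zip(mr, pr))
        if load > capacity:
            hi = t                          # too much surplus handed out
        else:
            lo = t
    return [min(p, m + lo) for m, p in zip(mr, pr)]


# Two users, MR = (1, 2), capacity 7: the surplus of 4 is split equally.
x_uncapped = nbs_single_link([1, 2], [10, 10], 7)
# Same users, but user 1 is capped at PR = 2.5, so user 2 absorbs the remainder.
x_capped = nbs_single_link([1, 2], [2.5, 10], 7)
print(x_uncapped, x_capped)
```

With several links the same product objective applies, but the solution generally needs a convex solver or the pricing scheme of the cited paper rather than this one-dimensional search.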
Cognitive Radio Technologies, 2007
31
Steady-State Summary
• Not every game has a steady-state
• NE are analogous to fixed points of self-interested decision processes
• NE can be applied to procedural and ontological radios
– Don’t need to know decision rule, only goals, actions, and assumption that radios act in their own interest
• A game (network) may have 0, 1, or many steady-states
• Every finite normal form game has an NE in its mixed extension
– Over multiple iterations, implies constant adaptation
• More complex game models yield more complex steady-state concepts
• Can define steady-state concepts for coalitional games
– Frequently so broad that specific solutions are used
1
Cognitive Radio Technologies, 2007
1
Evaluating Equilibria
Objective Function
Maximization, Pareto Efficiency, Notions
of Fairness
DySPAN 2007 April 17, 2007 Cognitive Radio Technologies, 2007
2
Optimality
• In general we assume the existence of some design objective function J:A→R
• The desirableness of a network state, a, is the value of J(a).
• In general maximizers of J are unrelated to fixed points of d.
Figure from Fig 2.6 in I. Akbar, “Statistical Analysis of Wireless Systems Using Markov Models,” PhD Dissertation, Virginia Tech, January 2007
Cognitive Radio Technologies, 2007
3
Example Functions• Utilitarian
– Sum of all players’ utilities
– Product of all players’utilities
• Practical– Total system throughput
– Average SINR
– Maximum End-to-End Latency
– Minimal sum system interference
• Objective can be unrelated to utilities
Utilitarian Maximizers
System Throughput Maximizers
Interference Minimization
Cognitive Radio Technologies, 2007
4
Price of Anarchy (Factor)
• Centralized solution always at least as good as distributed solution
– Like ASIC is always at least as good as DSP
• Ignores costs of implementing algorithms
– Sometimes centralized is infeasible (e.g., routing the Internet)
– Distributed can sometimes (but not generally) be more costly than centralized
• Price of Anarchy = (Performance of Centralized Algorithm Solution) / (Performance of Distributed Algorithm Solution) ≥ 1
• Example from the slide figure: 9.6 / 7 ≈ 1.37
Cognitive Radio Technologies, 2007
5
Implications
• Best of All Possible Worlds
– Low complexity distributed algorithms with low anarchy factors
• Reality implies mix of methods
– Hodgepodge of mixed solutions
• Policy – bounds the price of anarchy
• Utility adjustments – align distributed solution with centralized solution
• Market methods – sometimes distributed, sometimes centralized
• Punishment – sometimes centralized, sometimes distributed, sometimes both
• Radio environment maps – “centralized” information for distributed decision processes
– Fully distributed
• Potential game design – really, the Panglossian solution, but only applies to particular problems
Cognitive Radio Technologies, 2007
6
Pareto efficiency (optimality)
• Formal definition: An action vector a* is Pareto efficient if there exists no other action vector a such that every radio’s valuation of the network is at least as good and at least one radio assigns a higher valuation
• Informal definition: An action tuple is Pareto efficient if some radios must be hurt in order to improve the payoff of other radios.
• Important note
– Like design objective function, unrelated to fixed points (NE)
– Generally less specific than evaluating design objective function
2
Cognitive Radio Technologies, 2007
7
Example Games
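Game 1 (rows: a1, b1; columns: a2, b2)
      a2      b2
a1   1,1    -5,5
b1   5,-5   -1,-1
Game 2 (rows: a1, b1; columns: a2, b2)
      a2      b2
a1   1,1    -5,5
b1   5,-5    3,3
Legend (cell shading in the original figure): Pareto Efficient; NE; NE + PE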
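A small sketch that classifies the cells of the two example games as NE and/or Pareto efficient. The payoff tables below are read from the slide; the placement of the two cells in row b1 is an assumption, since the extracted text does not preserve the layout. Function names are illustrative.

```python
from itertools import product


def pure_ne(game):
    """game maps (row, col) -> (u1, u2). Returns the pure-strategy NE."""
    rows = sorted({r for r, _ in game})
    cols = sorted({c for _, c in game})
    return [
        (r, c)
        for r, c in product(rows, cols)
        if all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in rows)   # row player
        and all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in cols)  # column player
    ]


def pareto_efficient(game):
    """Action tuples whose payoff vector is not Pareto dominated by another cell."""
    def dominated(a):
        ua = game[a]
        return any(
            all(x >= y for x, y in zip(ub, ua)) and any(x > y for x, y in zip(ub, ua))
            for ub in game.values()
        )
    return sorted(a for a in game if not dominated(a))


game1 = {
    ("a1", "a2"): (1, 1), ("a1", "b2"): (-5, 5),
    ("b1", "a2"): (5, -5), ("b1", "b2"): (-1, -1),
}
game2 = dict(game1)
game2[("b1", "b2")] = (3, 3)

print(pure_ne(game1), pareto_efficient(game1))   # NE is the one cell that is not PE
print(pure_ne(game2), pareto_efficient(game2))   # NE is also PE
```

Under this reading, game 1 is a prisoners' dilemma style game in which the only NE is the only inefficient outcome, while in game 2 the NE is also Pareto efficient.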
Cognitive Radio Technologies, 2007
8
Notions of Fairness
• What is “Fair”?
– Abstractly “fair” means different things to different analysts
– In everyday life, really just shorthand for “I deserve more than I got”
• Nonetheless is used to evaluate how equitably radio resources are distributed
Cognitive Radio Technologies, 2007
9
Gini Coefficient
• Basic concept:
– Order players by utility.
– Form CDF for sorted utility distribution (Lorenz curve)
– Integrate (sum) the difference between perfect equality (of outcome) and CDF
– Divide result by sum of all players’ utilities
• Formula (players indexed in order of nondecreasing utility)
$G(a) = \frac{1}{n}\left(n + 1 - 2\,\frac{\sum_{i \in N} (n + 1 - i)\,u_i(a)}{\sum_{i \in N} u_i(a)}\right)$
• Used in a lot of macro-economic comparisons of income distributions
• Relatively simple, independent of scale, independent of size of N, anonymity
• Radically different outcomes can give the same result
Example Gini values G for a 2×2 game (rows: n, w; columns: N, W):
G    N      W
n    0      0.37
w    0.37   0
Figure: Lorenz curve versus the perfect equality line (x-axis: Player #; y-axis: Aggregate Utility)
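A minimal sketch of the Gini formula, with players sorted into nondecreasing utility order before the weighted sum is taken. The utility pairs (3.2, 21) and (7, 7) are illustrative assumptions; they are used here only because they reproduce the 0.37 and 0 entries of the example table.

```python
def gini(utilities):
    """Gini coefficient: 0 for perfect equality, (n - 1) / n when one player has all."""
    u = sorted(utilities)                    # nondecreasing order
    n = len(u)
    total = sum(u)
    weighted = sum((n + 1 - i) * ui for i, ui in enumerate(u, start=1))
    return (n + 1 - 2 * weighted / total) / n


print(round(gini([3.2, 21]), 2))   # unequal split between two radios
print(gini([7, 7]))                # equal split
print(gini([0, 0, 0, 10]))         # one radio receives everything
```

Note that (7, 7) and (9.6, 9.6) both score 0, which is one instance of the slide's point that radically different outcomes can give the same result.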
Cognitive Radio Technologies, 2007
10
Other Metrics of Fairness
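• Theil Index
$T(a) = \frac{1}{n} \sum_{i \in N} \frac{u_i(a)}{\bar{u}} \ln \frac{u_i(a)}{\bar{u}}$, where $\bar{u} = \frac{1}{n} \sum_{i \in N} u_i(a)$
• Atkinson Index, ε is income inequality aversion
$A_\varepsilon(a) = 1 - \frac{1}{\bar{u}} \left( \frac{1}{n} \sum_{i \in N} u_i(a)^{1 - \varepsilon} \right)^{1/(1 - \varepsilon)}, \quad \varepsilon \in [0, 1)$
$A_1(a) = 1 - \frac{1}{\bar{u}} \left( \prod_{i \in N} u_i(a) \right)^{1/n}, \quad \varepsilon = 1$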
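A short sketch of the Theil and Atkinson indices using their standard textbook definitions, which is an assumption about what the slide's formulas showed. Utilities must be strictly positive for the logarithm and the geometric mean to be defined.

```python
from math import log, prod


def theil(u):
    """Theil index: mean of (u_i / mean) * ln(u_i / mean); 0 under perfect equality."""
    n = len(u)
    mean = sum(u) / n
    return sum((ui / mean) * log(ui / mean) for ui in u) / n


def atkinson(u, eps):
    """Atkinson index for inequality aversion eps in [0, 1]."""
    n = len(u)
    mean = sum(u) / n
    if eps == 1:
        return 1 - prod(u) ** (1 / n) / mean          # geometric mean over mean
    return 1 - (sum(ui ** (1 - eps) for ui in u) / n) ** (1 / (1 - eps)) / mean


print(theil([5, 5]), theil([1, 3]))
print(atkinson([1, 4], 0.5), atkinson([1, 4], 1))
```

Raising ε makes the Atkinson index weight the worst-off radio more heavily, so the same utility vector scores as more unequal.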
Cognitive Radio Technologies, 2007
11
Summary of Equilibria Evaluation
• Lots of different ways which a point can be evaluated
• Many are contradictory
• Loosely, any point could be said to be optimal given the right objective function
• Insufficient to say that a point is optimal
• Must describe the metric in use
• Suggestion: use whatever metric makes sense to you as a network designer
1
Cognitive Radio Technologies, 2007
1
The Notion of Time and Imperfections in Games and Networks
Extensive Form Games, Repeated Games, Convergence Concepts in Normal Form Games, Trembling Hand Games, Noisy Observations
DySPAN 2007 April 17, 2007 Cognitive Radio Technologies, 2007
2/67
Model Timing Review
• When decisions are made also matters, and different radios will likely make decisions at different times
• Tj – when radio j makes its adaptations
– Generally assumed to be an infinite set
– Assumed to occur at discrete time
• Consistent with DSP implementation
• T=T1∪T2∪⋅⋅⋅∪Tn
• t ∈ T
Decision timing classes
• Synchronous
– All at once
• Round-robin
– One at a time in order
– Used in a lot of analysis
• Random
– One at a time in no order
• Asynchronous
– Random subset at a time
– Least overhead for a network
Cognitive Radio Technologies, 2007
3/67
Motivating Extensive Form Games (1/2)
Models attempt to capture as much information as possible in as
simple a format as possible. Often this requires abstracting away
certain system details that are deemed unimportant to the analysis.
To this point, we have only considered systems where all
decision-making occurs simultaneously and where action sets are
defined independently of the decision making process.
Although many systems can be abstracted to fit these conventions,
it is sometimes clumsy and if the analyst is not careful, may
eliminate information that would alter the expected outcome.
Cognitive Radio Technologies, 2007
4/67
Extensive Form Game Components
1. A set of players.
2. The actions available to each player at each decision moment (state).
3. A way of deciding who is the current decision maker.
4. Outcomes on the sequence of actions.
5. Preferences over all outcomes.
Figure: A Silly Jammer Avoidance Game, shown as a game tree (A moves first, choosing 1 or 2; B then responds at one of two decision nodes, choosing 1 or 2 at one and 1’ or 2’ at the other; terminal payoffs are 1,-1 or -1,1) and as its strategic form equivalent.
Strategies for A: {1, 2}
Strategies for B: {(1,1’), (1,2’), (2,1’), (2,2’)}
Cognitive Radio Technologies, 2007
5/67
Backwards Induction
• Concept
– Reason backwards based on what each player would rationally play
– Predicated on Sequential Rationality
– Sequential Rationality – if starting at any decision point for a player in the game, his strategy from that point on represents a best response to the strategies of the other players
– Subgame Perfect Nash Equilibrium is a key concept (not formally discussed today).
Figure: Alternating Packet Forwarding Game (game tree in which players 1 and 2 alternate between C and S; payoffs shown: 1,0; 0,2; 3,1; 2,4; 5,3; 4,6; 7,5)
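A hedged sketch of backwards induction on the Alternating Packet Forwarding Game. The tree structure is an assumption reconstructed from the payoffs on the slide: players 1 and 2 alternate, S ends the game with the listed payoff, C passes the move on, and (7, 5) is paid if both always continue.

```python
def backward_induction(stop_payoffs, final_payoff):
    """Solve an alternating stop/continue game from the last decision node backwards.

    stop_payoffs[k] is the payoff pair if the mover at node k plays S.
    Movers alternate: index 0 (player 1) at even nodes, index 1 (player 2) at odd nodes.
    """
    value = final_payoff          # outcome if every remaining node plays C
    plan = []
    for k in reversed(range(len(stop_payoffs))):
        mover = k % 2
        if stop_payoffs[k][mover] >= value[mover]:
            value, choice = stop_payoffs[k], "S"   # stopping is at least as good
        else:
            choice = "C"
        plan.append(choice)
    plan.reverse()
    return plan, value


stops = [(1, 0), (0, 2), (3, 1), (2, 4), (5, 3), (4, 6)]
plan, outcome = backward_induction(stops, (7, 5))
print(plan, outcome)   # every node stops, so play ends at the first node
```

Under these assumptions the sequentially rational outcome is (1, 0), even though continuing to the end would give both radios more.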
Cognitive Radio Technologies, 2007
6/67
Comments on Extensive Form Games
• Actions will generally not be directly observable
• However, likely that cognitive radios will build up histories
• Ability to apply backwards induction is predicated on knowing other radios’ objectives, actions, observations and what they know they know…
– Likely not practical
• Really the best choice for modeling notion of time when actions available to radios change with history
2
Cognitive Radio Technologies, 2007
7/67
Repeated Games
• Same game is repeated
– Indefinitely
– Finitely
• Players consider discounted payoffs across multiple stages
– Stage k: $\tilde{u}_i(a^k) = \delta^k u_i(a^k)$
– Expected value over all future stages: $\tilde{u}_i\bigl((a^k)\bigr) = \sum_{k=0}^{\infty} \delta^k u_i(a^k)$
Figure: sequence of stage games (Stage 1, Stage 2, …, Stage k)
Cognitive Radio Technologies, 2007
8/67
Lesser Rationality: Myopic Processes
• Players have no knowledge about utility functions, or expectations about future play,
typically can observe or infer current actions
• Best response dynamic – maximize individual
performance presuming other players’ actions are fixed
• Better response dynamic – improve individual performance presuming other players’ actions are fixed
• Interesting convergence results can be established
Cognitive Radio Technologies, 2007
9/67
Paths and Convergence
• Path [Monderer_96]
– A path in Γ is a sequence $\gamma = (a^0, a^1, \ldots)$ such that for every k ≥ 1 there exists a unique player such that the strategy combinations $(a^{k-1}, a^k)$ differ in exactly one coordinate.
– Equivalently, a path is a sequence of unilateral deviations. When discussing paths, we make use of the following conventions.
– Each element of γ is called a step.
– $a^0$ is referred to as the initial or starting point of γ.
– Assuming γ is finite with m steps, $a^m$ is called the terminal point or ending point of γ, and γ is said to have length m.
• Cycle [Voorneveld_96]
– A finite path $\gamma = (a^0, a^1, \ldots, a^k)$ where $a^k = a^0$
Cognitive Radio Technologies, 2007
10/67
Improvement Paths
• Improvement Path
– A path $\gamma = (a^0, a^1, \ldots)$ where for all k≥1, $u_i(a^k) > u_i(a^{k-1})$ where i is the unique deviator at k
• Improvement Cycle
– An improvement path that is also a cycle
– See the DFS example
Figure: improvement cycle through steps γ1, γ2, γ3, γ4, γ5, γ6
Cognitive Radio Technologies, 2007
11/67
Convergence Properties
• Finite Improvement Property (FIP)
– All improvement paths in a game are finite
• Weak Finite Improvement Property (weak
FIP)
– From every action tuple, there exists an
improvement path that terminates in an NE.
• FIP implies weak FIP
• FIP implies lack of improvement cycles
• Weak FIP implies existence of an NE
Cognitive Radio Technologies, 2007
12/67
Examples
Game with FIP (rows: a, b; columns: A, B)
      A       B
a    1,-1    0,2
b   -1,1     2,2
Weak FIP but not FIP (rows: a, b, c; columns: A, B, C)
      A       B       C
a    1,-1   -1,1     0,2
b   -1,1     1,-1    1,2
c    2,0     2,1     2,2
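A sketch that checks FIP and weak FIP for two-player finite games by building the graph of improving unilateral deviations. In a finite game, FIP is equivalent to this graph having no cycle, and weak FIP means every state can reach a state with no improving deviation (an NE). The two payoff tables are read from the slide; the cell placement is an assumption because the extraction scrambled the layout.

```python
from itertools import product


def improvement_edges(game):
    """game[r][c] = (u_row, u_col). Map each state to its improving unilateral deviations."""
    n_rows, n_cols = len(game), len(game[0])
    edges = {s: [] for s in product(range(n_rows), range(n_cols))}
    for r, c in edges:
        for r2 in range(n_rows):
            if game[r2][c][0] > game[r][c][0]:      # row player improves
                edges[(r, c)].append((r2, c))
        for c2 in range(n_cols):
            if game[r][c2][1] > game[r][c][1]:      # column player improves
                edges[(r, c)].append((r, c2))
    return edges


def has_fip(edges):
    """True if the improvement graph is acyclic (depth-first search for a back edge)."""
    state = {}

    def acyclic_from(s):
        state[s] = "open"
        for t in edges[s]:
            if state.get(t) == "open":
                return False                        # found an improvement cycle
            if t not in state and not acyclic_from(t):
                return False
        state[s] = "done"
        return True

    return all(acyclic_from(s) for s in edges if s not in state)


def has_weak_fip(edges):
    """True if every state reaches a sink (an NE) along some improvement path."""
    reaches = {s for s in edges if not edges[s]}
    changed = True
    while changed:
        changed = False
        for s in edges:
            if s not in reaches and any(t in reaches for t in edges[s]):
                reaches.add(s)
                changed = True
    return len(reaches) == len(edges)


fip_game = [[(1, -1), (0, 2)],
            [(-1, 1), (2, 2)]]
weak_fip_game = [[(1, -1), (-1, 1), (0, 2)],
                 [(-1, 1), (1, -1), (1, 2)],
                 [(2, 0), (2, 1), (2, 2)]]

print(has_fip(improvement_edges(fip_game)))
print(has_fip(improvement_edges(weak_fip_game)),
      has_weak_fip(improvement_edges(weak_fip_game)))
```

In the second game the upper-left 2×2 block behaves like matching pennies and produces an improvement cycle, but either radio can always escape toward (c, C).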
3
Cognitive Radio Technologies, 2007
13/67
Implications of FIP and weak FIP
• Assumes radios are incapable of reasoning ahead and must react to internal states and current observations
• Unless the game model of a CRN has weak FIP, no autonomously rational decision rule can be guaranteed to converge from all initial states under random and round-robin timing (Theorem 4.10 in dissertation).
• If the game model of a CRN has FIP, then ALL autonomously rational decision rules are guaranteed to converge from all initial states under random and round-robin timing.
– And asynchronous timings, but not immediate from definition
• More insights possible by considering more refined classes of decision rules and timings
Cognitive Radio Technologies, 2007
14/67
Decision Rules
Cognitive Radio Technologies, 2007
15/67
Markov Chains
• Describes adaptations as probabilistic transitions between network states.
– d is nondeterministic
• Sources of randomness:
– Nondeterministic timing
– Noise
• Frequently depicted as a weighted digraph or as a transition matrix
Cognitive Radio Technologies, 2007
16/67
General Insights ([Stewart_94])
• Probability of occupying a state after two iterations.
– Form PP.
– Now entry $p_{mn}$ in the mth row and nth column of PP represents the probability that system is in state $a_n$ two iterations after being in state $a_m$.
• Consider $P^k$.
– Then entry $p_{mn}$ in the mth row and nth column of $P^k$ represents the probability that system is in state $a_n$ k iterations after being in state $a_m$.
Cognitive Radio Technologies, 2007
17/67
Steady-states of Markov chains
• May be inaccurate to consider a Markov chain to have a fixed point
– Actually ok for absorbing Markov chains
• Stationary Distribution
– A probability distribution $\pi^*$ such that $\pi^{*T} P = \pi^{*T}$ is said to be a stationary distribution for the Markov chain defined by P.
• Limiting distribution
– Given initial distribution $\pi_0$ and transition matrix P, the limiting distribution is the distribution that results from evaluating $\lim_{k \to \infty} \pi_0^T P^k$
Cognitive Radio Technologies, 2007
18/67
Ergodic Markov Chain
• [Stewart_94] states that a Markov chain is ergodic if it is a) irreducible, b) positive recurrent, and c) aperiodic.
• Easier to identify rule:
– For some k, $P^k$ has only nonzero entries
• (Convergence, steady-state) If ergodic, then chain has a unique limiting stationary distribution.
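A minimal pure-Python sketch of the "easier to identify" rule and of the limiting distribution. The two-state transition matrix is an illustrative assumption. The search over powers stops at (n-1)^2 + 1, which is the known bound (Wielandt) on the first all-positive power of a primitive n×n matrix.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_ergodic(P):
    """True if some power P^k has only nonzero entries."""
    n = len(P)
    Pk = [row[:] for row in P]
    for _ in range((n - 1) ** 2 + 1):
        if all(x > 0 for row in Pk for x in row):
            return True
        Pk = matmul(Pk, P)
    return False


def limiting_distribution(pi0, P, k=200):
    """Approximate lim pi0^T P^k by repeated multiplication."""
    dist = [list(pi0)]
    for _ in range(k):
        dist = matmul(dist, P)
    return dist[0]


P = [[0.9, 0.1],
     [0.5, 0.5]]
pi_a = limiting_distribution([1.0, 0.0], P)
pi_b = limiting_distribution([0.0, 1.0], P)
print(is_ergodic(P), pi_a, pi_b)            # same limit from either start
print(is_ergodic([[0.0, 1.0], [1.0, 0.0]])) # periodic chain fails the rule
```

For this P the stationary distribution solves 0.1·π1 = 0.5·π2, giving (5/6, 1/6), and both starting distributions converge to it.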
4
Cognitive Radio Technologies, 2007
19/67
Absorbing Markov Chains
• Absorbing state
– Given a Markov chain with transition matrix P, a state am is said to be an absorbing state if pmm=1.
• Absorbing Markov Chain
– A Markov chain is said to be an absorbing Markov chain if
• it has at least one absorbing state and
• from every state in the Markov chain there exists a sequence
of state transitions with nonzero probability that leads to an absorbing state. These nonabsorbing states are called
transient states.
a0 a1 a2 a3 a4 a5
Cognitive Radio Technologies, 2007
20/67
Absorbing Markov Chain Insights
• Canonical Form: $P' = \begin{bmatrix} Q & R \\ 0 & I \end{bmatrix}$
• Fundamental Matrix: $N = (I - Q)^{-1}$
• Expected number of times that the system will pass through state $a_m$ given that the system starts in state $a_k$.
– $n_{km}$
• (Convergence Rate) Expected number of iterations before the system ends in an absorbing state starting in state $a_m$ is given by $t_m$, where 1 is a ones vector
– t = N1
• (Final distribution) Probability of ending up in absorbing state $a_m$ given that the system started in $a_k$ is $b_{km}$, where B = NR
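A sketch of the three absorbing-chain quantities, using exact fractions and a small Gauss-Jordan inverse so that no external library is needed. The example chain is an assumption chosen for illustration: a four-state random walk with two transient states in the middle and an absorbing state at each end.

```python
from fractions import Fraction


def inverse(M):
    """Gauss-Jordan inverse of a square, nonsingular matrix of Fractions."""
    n = len(M)
    aug = [list(map(Fraction, row)) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def absorbing_analysis(Q, R):
    """Return (N, t, B) with N = (I - Q)^-1, t = N 1, B = N R."""
    n = len(Q)
    i_minus_q = [[Fraction(int(i == j)) - Q[i][j] for j in range(n)] for i in range(n)]
    N = inverse(i_minus_q)
    t = [sum(row) for row in N]
    B = [[sum(N[i][k] * R[k][j] for k in range(n)) for j in range(len(R[0]))]
         for i in range(n)]
    return N, t, B


half = Fraction(1, 2)
Q = [[Fraction(0), half], [half, Fraction(0)]]   # transient -> transient
R = [[half, Fraction(0)], [Fraction(0), half]]   # transient -> absorbing
N, t, B = absorbing_analysis(Q, R)
print(N, t, B)
```

Here each transient state is expected to be absorbed after 2 iterations, and the nearer absorbing state is reached with probability 2/3.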
Cognitive Radio Technologies, 2007
21/67
Absorbing Markov Chains and Improvement Paths
• Sources of randomness
– Timing (Random, Asynchronous)
– Decision rule (random decision rule)
– Corrupted observations
• An NE is an absorbing state for autonomously rational decision rules.
• Weak FIP implies that the game is an absorbing Markov chain as long as the NE-terminating improvement path always has a nonzero probability of being implemented.
• This then allows us to characterize
– convergence rate,
– probability of ending up in a particular NE,
– expected number of times a particular transient state will be visited
Cognitive Radio Technologies, 2007
22/67
Connecting Markov models, improvement paths, and decision rules
• Suppose we need the path γ = (a0, a1,…am) for convergence by weak FIP.
• Must get right sequence of players and right sequence of adaptations.
• Friedman Random Better Response
– Random or Asynchronous
• Every sequence of players has a chance to occur
• Random decision rule means that all improvements have a chance to be chosen
– Synchronous not guaranteed
• Alternate random better response (chance of choosing same action)
– Because of chance to choose same action, every sequence of players can result from every decision timing.
– Because of random choice, every improvement path has a chance of occurring
Cognitive Radio Technologies, 2007
23/67
Convergence Results (Finite Games)
• If a decision rule converges under round-robin, random, or synchronous timing, then it also converges under asynchronous timing.
• Random better responses converge for the most decision timings and the most surveyed game conditions.– Implies that non-deterministic procedural cognitive radio
implementations are a good approach if you don’t know much about the network.
Cognitive Radio Technologies, 2007
24/67
Impact of Noise
• Noise impacts the mapping from actions to outcomes, f :A→O
• Same action tuple can lead to different outcomes
• Most noise encountered in wireless systems is theoretically unbounded.
• Implies that every outcome has a nonzero chance of being observed for a particular action tuple.
• Some outcomes are more likely to be observed than others (and some outcomes may have a very small chance of occurring)
5
Cognitive Radio Technologies, 2007
25/67
DFS Example
• Consider a radio observing the spectral energy across the bands defined by the set C where each radio k is choosing its band of operation $f_k$.
• Noiseless observation of channel $c_k$: $o_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k)$
• Noisy observation: $o_i(c_k) = \sum_{k \in N} g_{ki}\, p_k\, \theta(c_k, f_k) + n_i(c_k, t)$
• If radio is attempting to minimize in-band interference, then noise can lead a radio to believe that a band has lower or higher interference than it does
Cognitive Radio Technologies, 2007
26/67
Trembling Hand (“Noise” in Games)
• Assumes players have a nonzero chance of making an error implementing their action.
– Who has not accidentally handed over the wrong amount of cash at a restaurant?
– Who has not accidentally written a “tpyo”?
• Related to errors in observation as erroneous observations cause errors in implementation
(from an outside observer’s perspective).
Cognitive Radio Technologies, 2007
27/67
Noisy decision rules
• Noisy utility: $\tilde{u}_i(a, t) = u_i(a) + n_i(a, t)$
Figure labels: Trembling Hand; Observation Errors
Cognitive Radio Technologies, 2007
28/67
Implications of noise
• For random timing, [Friedman] shows game with noisy random better response is an ergodic Markov chain.
• Likewise other observation based noisy decision rules are ergodic Markov chains
– Unbounded noise implies chance of adapting (or not adapting) to any action
– If coupled with random, synchronous, or asynchronous timings, then CRNs with corrupted observation can be modeled as ergodic Markov chains.
– Not so for round-robin (violates aperiodicity)
• Somewhat disappointing
– No real steady-state (though unique limiting stationary distribution)
Cognitive Radio Technologies, 2007
29/67
DFS Example with three access points
• 3 access nodes, 3 channels, attempting to operate in band with least spectral energy.
• Constant power• Link gain matrix
• Noiseless observations
• Random timing
12
3
Cognitive Radio Technologies, 2007
30/67
Trembling Hand
• Transition Matrix, p=0.1
• Limiting distribution
6
Cognitive Radio Technologies, 2007
31/67
Noisy Best Response
• Transition Matrix, N(0,1) Gaussian Noise
• Limiting stationary distributions
Cognitive Radio Technologies, 2007
32/67
Comment on Noise and Observations
• Cardinality of goals makes a difference for cognitive radios
– Probability of making an error is a function of the difference in utilities
– With ordinal preferences, utility functions are just useful fictions
• Might as well assume a trembling hand
• Unboundedness of noise implies that no state can be absorbing for most decision rules
• NE retains significant predictive power
– While CRN is an ergodic Markov chain, NE (and the adjacent states) remain most likely states to visit
– Stronger prediction with less noise
– Also stronger when network has a Lyapunov function
– Exception - elusive equilibria ([Hicks_04])
Cognitive Radio Technologies, 2007
33/67
Stability
Figure panels (trajectory sketches on x-y axes): Stable, but not attractive; Attractive, but not stable
Cognitive Radio Technologies, 2007
34/67
Lyapunov’s Direct Method
Left unanswered: where does L come from?
Can it be inferred from radio goals?
Cognitive Radio Technologies, 2007
35/67
Summary
• Given a set of goals, an NE is a fixed point for all radios with those goals for all autonomously rational decision processes
• Traditional engineering analysis techniques can be applied in a game theoretic setting
– Markov chains to improvement paths
• Network must have weak FIP for autonomously rational radios to converge
– Weak FIP implies existence of absorbing Markov chain for many decision rules/timings
• In practical system, network has a theoretically nonzero chance of visiting every possible state (ergodicity), but does have unique limiting stationary distribution
– Specific distribution function of decision rules, goals
– Will be important to show Lyapunov stability
• Shortly, we’ll cover potential games and supermodular games which can be shown to have FIP or weak FIP. Furthermore, potential games have a Lyapunov function!
1
Cognitive Radio Technologies, 2007
1
Designing Cognitive Radio Networks to Yield Desired Behavior
Policy, Cost Functions, Global Altruism, Supermodular Games, Potential Games
DySPAN 2007 April 17, 2007 Cognitive Radio Technologies, 2007
2
Policy
• Concept: Constrain the available actions so the worst cases of distributed decision making can be avoided
• Not a new concept
– Policy has been used since there’s been an FCC
• What’s new is assuming decision makers are the radios instead of the people controlling the radios
Cognitive Radio Technologies, 2007
3
Policy applied to radios instead of humans
• Need a language to convey policy
– Learn what it is
– Expand upon policy later
• How do radios interpret policy
– Policy engine?
• Need an enforcement mechanism
– Might need to tie in to humans
• Need a source for policy
– Who sets it?
– Who resolves disputes?
• Logical extreme can be quite complex, but logical extreme may not be necessary.
Figure: policies expressed as a frequency mask
Cognitive Radio Technologies, 2007
4
• Detection
– Digital TV: -116 dBm over a 6 MHz channel
– Analog TV: -94 dBm at the peak of the NTSC (National Television System Committee) picture carrier
– Wireless microphone: -107 dBm in a 200 kHz bandwidth.
• Transmitted Signal
– 4 W Effective Isotropic Radiated Power (EIRP)
– Specific spectral masks
– Channel vacation times
C. Cordeiro, L. Challapali, D. Birru, S. Shankar, “IEEE 802.22: The First Worldwide Wireless Standard based on Cognitive Radios,” IEEE DySPAN2005, Nov 8-11, 2005 Baltimore, MD.
802.22 Example Policies
Cognitive Radio Technologies, 2007
5
Cost Adjustments
• Concept: Centralized unit dynamically adjusts costs in radios’ objective functions to ensure radios operate on desired point: $\tilde{u}_i(a) = u_i(a) + c_i(a)$
• Example: Add -12 to use of wideband waveform
Cognitive Radio Technologies, 2007
6
Comments on Cost Adjustments
• Permits more flexibility than policy
– If a radio really needs to deviate, then it can
• Easy to turn off and on as a policy tool
– Example: protected user shows up in a channel, cost to use that channel goes up
– Example: prioritized user requests channel, other users’ cost to use prioritized user’s channel goes up (and back down when done)
2
Cognitive Radio Technologies, 2007
7
Global Altruism: distributed, but more costly
• Concept: All radios distribute all relevant information to all other radios and then each independently computes jointly optimal solution
– Proposed for spreading code allocation in Popescu04, Sung03
– Used in xG Program (Comments of G. Denker, SDR Forum Panel Session on “A Policy Engine Framework”); overhead ranges from 5%-27%
• C = cost of computation
• I = cost of information transfer from node to node
• n = number of nodes
• Distributed
– nC + n(n-1)I/2
• Centralized (election)
– C + 2(n-1)I
• Price of anarchy = 1
• May differ if I is asymmetric
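A tiny sketch that evaluates the two cost expressions side by side. The unit costs C = I = 1 are arbitrary assumptions chosen only to show how the gap grows with n.

```python
def distributed_cost(n, C, I):
    """Global altruism: every node computes, every pair exchanges information."""
    return n * C + n * (n - 1) * I / 2


def centralized_cost(n, C, I):
    """Elected controller: one computation, reports in and decisions out."""
    return C + 2 * (n - 1) * I


for n in (2, 10, 100):
    print(n, distributed_cost(n, 1.0, 1.0), centralized_cost(n, 1.0, 1.0))
```

The distributed cost grows quadratically in n through the pairwise information term, while the centralized cost grows linearly, even though both reach the same solution (price of anarchy = 1).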
Cognitive Radio Technologies, 2007
8
Improving Global Altruism
• Global altruism is clearly inferior to a centralized solution for a single problem.
• However, suppose radios reported information to and used information from a common database
– n(n-1)I/2 => 2nI
• And suppose different radios are concerned with different problems with costs C1,…,Cn
• Centralized
– Resources = 2(n-1)I + sum(C1,…,Cn)
– Time = 2(n-1)I + sum(C1,…,Cn)
• Distributed
– Resources = 2nI + sum(C1,…,Cn)
– Time = 2I + max (C1,…,Cn)
Cognitive Radio Technologies, 2007
9
Example Application:
• Overlay network of secondary users (SU) free to adapt power, transmit time, and channel
• Without REM:
– Decisions solely based on link SINR
• With REM
– Radios effectively know everything
Upshot: A little gain for the secondary users; big gain for primary users
From: Y. Zhao, J. Gaeddert, K. Bae, J. Reed, “Radio Environment Map Enabled Situation-Aware Cognitive Radio Learning Algorithms,” SDR Forum Technical Conference 2006.
Cognitive Radio Technologies, 2007
10
Comments on Radio Environment Map
• Local altruism also possible
– Less information transfer
• Like policy, effectively needs a common
language
• Nominally could be centralized or distributed
database
• Read more:
– Y. Zhao, B. Le, J. Reed, “Infrastructure Support – The Radio Environment MAP,” in Cognitive Radio Technology, B. Fette, ed., Elsevier 2006.
Cognitive Radio Technologies, 2007
11
Supermodular Games
• A game such that
– Action space is a lattice
– Utility functions are supermodular
• Identification: $\frac{\partial^2 u_i}{\partial a_i\, \partial a_j} \ge 0 \quad \forall i, j \in N,\ a \in A$
• NE Properties
– NE Existence: All supermodular games have a NE
– NE Identification: NE form a lattice
• Convergence
– Has weak FIP
– Best response algorithms converge
• Stability
– Unique NE is an attractive fixed point for best response
Cognitive Radio Technologies, 2007
12
Ad-hoc power control
• Network description
Figure: clustered ad-hoc network (cluster heads and gateways)
• Each radio attempts to achieve a target SINR at the receiving end of its link.
• System objective is ensuring every radio achieves its target SINR: $J(\mathbf{p}) = -\sum_{k \in N} (\hat{\gamma}_k - \gamma_k)^2$
3
Cognitive Radio Technologies, 2007
13
Generalized repeated game stage game
• Players – N
• Actions – $P_j = [0, p_j^{\max}]$
• Utility function: $u_j(o) = -(\hat{\gamma}_j - \gamma_j)^2$
• Action space formulation: $u_j(\mathbf{p}) = -\Bigl(\hat{\gamma}_j - 10\log_{10}(g_{jj}\, p_j) + 10\log_{10}\bigl(\sum_{k \in N \setminus j} g_{kj}\, p_k + N_j\bigr)\Bigr)^2$
$g_{kj}$ – fraction of power transmitted by k that can’t be removed by receiving end of radio j’s link
$N_j$ – noise power at receiving end of radio j’s link
Cognitive Radio Technologies, 2007
14
Model identification & analysis
• Supermodular game
– Action space is a lattice
– $\frac{\partial^2 u_j(\mathbf{p})}{\partial p_j\, \partial p_k} = \frac{200\, g_{kj}}{\ln^2(10)\; p_j \bigl(\sum_{m \in N \setminus j} g_{mj}\, p_m + N_j\bigr)} > 0$
– Implications
• NE exists
• Best response converges
• Stable if discrete action space
• Best response is also standard: $\hat{B}_j(\mathbf{p}) = \frac{\hat{\gamma}_j}{\gamma_j}\, p_j$
– Unique NE
– Solvable (see prelim report)
– Stable (pseudo-contraction) for infinite action spaces
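A hedged sketch of the best response iteration for target-SINR power control, with SINRs in linear units (an assumption; the stage game above is written in dB). Each radio scales its power by the ratio of target to measured SINR, capped at its maximum power. The two-link gains, noise levels, and targets are illustrative values, and all radios update synchronously here.

```python
def sinr(p, g, noise, j):
    """Linear SINR of link j; g[k][j] is the gain from transmitter k to receiver j."""
    interference = sum(g[k][j] * p[k] for k in range(len(p)) if k != j) + noise[j]
    return g[j][j] * p[j] / interference


def best_response_power_control(g, noise, target, p_max, iters=200):
    """Iterate p_j <- min(p_max_j, (target_j / sinr_j) * p_j) for all radios at once."""
    p = list(p_max)                                   # start from maximum power
    for _ in range(iters):
        gammas = [sinr(p, g, noise, j) for j in range(len(p))]
        p = [min(p_max[j], target[j] / gammas[j] * p[j]) for j in range(len(p))]
    return p


g = [[1.0, 0.1],
     [0.1, 1.0]]
noise = [0.1, 0.1]
target = [2.0, 2.0]
p_star = best_response_power_control(g, noise, target, p_max=[1.0, 1.0])
print(p_star, [sinr(p_star, g, noise, j) for j in range(2)])
```

For this symmetric example the fixed point solves p = 2(0.1p + 0.1), so both radios settle at p = 0.25 and meet the target exactly. If the targets were infeasible within the power caps, the iteration would instead pin radios at their maximum power.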
Cognitive Radio Technologies, 2007
15
Validation
Noiseless Best Response Noisy Best Response
Implies all radios achieved target SINR
Cognitive Radio Technologies, 2007
16
Comments on Designing Networks with Supermodular Games
• Scales well
– Sum of supermodular functions is a supermodular function
– Add additional action types, e.g., power, frequency, routing,..., as long as action space remains a lattice and utilities are supermodular
• Says nothing about desirability or stability of
equilibria
• Convergence is sensitive to the specific decision
rule and the ability of the radios to implement it
Cognitive Radio Technologies, 2007
17
Potential Games
• Existence of a function (called the potential function, V) that reflects the change in utility seen by a unilaterally deviating player.
• Cognitive radio interpretation:
– Every time a cognitive radio unilaterally adapts in a way that furthers its own goal, some real-valued function increases.
Figure: Φ(ω) versus time
Cognitive Radio Technologies, 2007
18
Exact Potential Game Forms
• Many exact potential games can be recognized by the form of the utility function
4
Cognitive Radio Technologies, 2007
19
Implications of Monotonicity
• Monotonicity implies
– Existence of steady-states (maximizers of V)
– Convergence to maximizers of V for numerous combinations of decision timings and decision rules – all self-interested adaptations
• Does not mean that we get good performance
– Only if V is a function we want to maximize
Cognitive Radio Technologies, 2007
20
Other Potential Game Properties
• All finite potential games have FIP
• All finite games with FIP are (generalized ordinal) potential games
– Very important for ensuring convergence of distributed cognitive radio networks
• -V is a Lyapunov function for isolated maximizers
• Stable NE solvable by maximizers of V
• Linear combination of exact potential games is an exact potential game
• Maximizer of potential game need not maximize your objective function
– Cognitive Radios’ Dilemma is a potential game
Cognitive Radio Technologies, 2007
21
Interference Reducing Networks
• Concept
– Cognitive radio network is a potential game with a potential function that is negation of observed network interference
• Definition
– A network of cognitive radios where each adaptation decreases the sum of each radio’s observed interference is an IRN: $\Phi(\omega) = \sum_{i \in N} I_i(\omega)$
• Implementation:
– Design DFS algorithms such that network is a potential game with Φ ∝ -V
Figure: Φ(ω) versus time
Bilateral Symmetric Interference
• Two cognitive radios, j,k∈N, exhibit bilateral symmetric interference if
$g_{jk}\, p_j\, \rho(\omega_j, \omega_k) = g_{kj}\, p_k\, \rho(\omega_k, \omega_j) \quad \forall \omega_j \in \Omega_j,\ \forall \omega_k \in \Omega_k$
• ωk – waveform of radio k
• pk – the transmission power of radio k’s waveform
• gkj – link gain from the transmission source of radio k’s signal to the point where radio j measures its interference
• ρ(ωk, ωj) – the fraction of radio k’s signal that radio j cannot exclude via processing (perhaps via filtering, despreading, or MUD techniques)
What’s good for the goose is good for the gander…
Source: http://radio.weblogs.com/0120124/Graphics/geese2.jpg
Bilateral Symmetric Interference Implies an Interference Reducing Network
• Cognitive Radio Goal:
$u_i(\omega) = -I_i(\omega) = -\sum_{j \in N \setminus i} g_{ji}\, p_j\, \rho(\omega_j, \omega_i)$
• By bilateral symmetric interference
$g_{ki}\, p_k\, \rho(\omega_k, \omega_i) = g_{ik}\, p_i\, \rho(\omega_i, \omega_k) = b_{ik}(\omega_i, \omega_k) = b_{ki}(\omega_k, \omega_i)$
• Rewrite goal
$u_i(\omega) = -\sum_{k \in N \setminus i} b_{ik}(\omega_i, \omega_k)$
• Therefore a BSI game (Si = 0)
$V(\omega) = -\sum_{i \in N} \sum_{k=1}^{i-1} g_{ki}\, p_k\, \rho(\omega_k, \omega_i)$
• Interference Function
$\Phi(\omega) = -2V(\omega)$
• Therefore profitable unilateral deviations increase V and decrease Φ(ω) – an IRN
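The chain of steps on this slide can be spot-checked numerically. The following is a minimal sketch under stated assumptions: bilateral symmetric interference is obtained here by making both the gain-power product and ρ symmetric (a sufficient condition chosen for convenience, not the only way to satisfy it), all numbers are random, and the names `c`, `rho`, `b_ik`, `u`, `V`, `Phi`, and `max_gap` are hypothetical. It checks that a unilateral waveform change alters u_i and V by the same amount and that Φ = -2V.

```python
import random

random.seed(0)
N, W = 5, 3  # number of radios and waveforms per radio (hypothetical sizes)

# c[i][k] stands for g_ki * p_k; it is made symmetric (an assumption).
c = [[0.0] * N for _ in range(N)]
for i in range(N):
    for k in range(i):
        c[i][k] = c[k][i] = random.random()

# rho[x][y]: unrejected fraction between waveforms x and y, also symmetric.
rho = [[0.0] * W for _ in range(W)]
for x in range(W):
    for y in range(x + 1):
        rho[x][y] = rho[y][x] = random.random()


def b_ik(i, k, w):
    """Pairwise interference term b_ik(w_i, w_k); symmetric in (i, k)."""
    return c[i][k] * rho[w[i]][w[k]]


def u(i, w):
    """Goal of radio i: the negative of its observed interference."""
    return -sum(b_ik(i, k, w) for k in range(N) if k != i)


def V(w):
    """Potential: negative sum of each pairwise term, counted once."""
    return -sum(b_ik(i, k, w) for i in range(N) for k in range(i))


def Phi(w):
    """Network interference: sum of every radio's observed interference."""
    return sum(-u(i, w) for i in range(N))


def max_gap(trials=200):
    """Largest |delta u_i - delta V| over random unilateral deviations."""
    worst = 0.0
    for _ in range(trials):
        w = [random.randrange(W) for _ in range(N)]
        i = random.randrange(N)
        w2 = list(w)
        w2[i] = random.randrange(W)
        worst = max(worst, abs((u(i, w2) - u(i, w)) - (V(w2) - V(w))))
    return worst
```

A gap at the level of floating-point rounding is what the derivation predicts; a large gap would indicate that the symmetry assumption has been broken.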
An IRN 802.11 DFS Algorithm
• Suppose each access node measures the received signal power and frequency of the RTS/CTS (or BSSID) messages sent by observable access nodes in the network.
• Assume out-of-channel interference is negligible and RTS/CTS messages are transmitted at the same power
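$g_{jk}\, p_j\, \sigma(f_j, f_k) = g_{kj}\, p_k\, \sigma(f_k, f_j)$
$u_i(f) = -I_i(f) = -\sum_{k \in N \setminus i} g_{ki}\, p_k\, \sigma(f_i, f_k)$
$\sigma(f_i, f_k) = \begin{cases} 1, & f_i = f_k \\ 0, & f_i \neq f_k \end{cases}$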
[Flowchart: Start → Pick channel to listen on, LC → Listen on channel LC → RTS/CTS energy detected? (y: measure power of access node in message, p; note address of access node, a; update interference table) → Time for decision? (y: apply decision criteria for new operating channel, OC; use 802.11h to signal change in OC to clients; n: pick channel to listen on, LC)]
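The decision step of this algorithm (move to the channel with the lowest measured interference) can be simulated in a few lines to watch net interference fall. This is only a sketch under assumptions that are not in the slides: link gain follows a simple distance power law with exponent 3, all nodes transmit at unit power, there are four channels, nodes adapt in round-robin order, and measurements are perfect. The function name `simulate` and all parameters are hypothetical.

```python
import math
import random


def simulate(num_nodes=30, num_channels=4, rounds=20, seed=1):
    """Round-robin lowest-interference channel selection; returns net interference history."""
    rng = random.Random(seed)
    # Random positions in a 1000 m x 1000 m square (assumed layout).
    pos = [(rng.uniform(0, 1000), rng.uniform(0, 1000)) for _ in range(num_nodes)]

    def gain(i, k):
        # Symmetric gain from distance with an assumed path-loss exponent of 3.
        d = max(math.dist(pos[i], pos[k]), 1.0)
        return d ** -3

    g = [[gain(i, k) if i != k else 0.0 for k in range(num_nodes)]
         for i in range(num_nodes)]
    f = [rng.randrange(num_channels) for _ in range(num_nodes)]  # random initial channels

    def interference(i, ch):
        # Co-channel interference node i would observe on channel ch (unit power).
        return sum(g[k][i] for k in range(num_nodes) if k != i and f[k] == ch)

    def net():
        # Net interference: sum of every node's observed interference.
        return sum(interference(i, f[i]) for i in range(num_nodes))

    history = [net()]
    for _ in range(rounds):
        for i in range(num_nodes):  # round-robin decision timing
            best = min(range(num_channels), key=lambda ch: interference(i, ch))
            if interference(i, best) < interference(i, f[i]):
                f[i] = best  # adapt only when strictly better
            history.append(net())
    return history
```

Because the gains are symmetric and powers are equal, each strict improvement by one node lowers the net interference by twice that node's own reduction, so the returned history should never rise.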
Statistics
• 30 cognitive access nodes in European UNII bands
• Choose channel with lowest interference
• Random timing
• n=3
• Random initial channels
• Randomly distributed positions over 1 km²
[Figure: Reduction in Net Interference (dB) versus Number of Access Nodes; curves: Round-robin, Asynchronous, Legacy Devices]
Ad-hoc Network
• Possible to adjust previous algorithm to not favor access nodes over clients
• Suitable for ad-hoc networks
• CRT has IRN based distributed low-complexity algorithms for
– Spreading codes
– Power variations
– Subcarrier allocation
– Bandwidth variations
– Activity levels weighted by interference
– Noninteractive terms – modulation, FEC, interleaving
– Beamforming
– And combinations of the above
Comments on Potential Games
• Every network that has no better response interaction loop is a potential game
• Before implementing fully distributed GA, SA, or most CBR decision rules, important to show that goals and actions satisfy potential game model
• Sum of exact potential games is itself an exact potential game
– Permits (with a little work) scaling up of algorithms that adjust single parameters to multiple parameters
• Possible to combine with other techniques
– Policy restricts action space, but subset of action space remains a potential game (see J. Neel, J. Reed, “Performance of Distributed Dynamic Frequency Selection Schemes for Interference Reducing Networks,” Milcom 2006)
– As a self-interested additive cost function is also a potential game, easy to combine with additive cost approaches (see J. Neel, J. Reed, R. Gilles, “The Role of Game Theory in the Analysis of Software Radio Networks,” SDR Forum02)
• Read more on potential games:
– Chapter 5 in Dissertation of J. Neel, Available at http://scholar.lib.vt.edu/theses/available/etd-12082006-141855/
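The sum property on this slide follows directly from the definition of an exact potential. As a sketch, assume two games share the same players N and action space A, with utilities $u_i^1$, $u_i^2$ and exact potentials $V^1$, $V^2$ (the superscripted names are introduced here for illustration only). For any unilateral deviation from $a_i$ to $b_i$:

```latex
\begin{align*}
u_i(a) &= u_i^1(a) + u_i^2(a), \qquad V(a) = V^1(a) + V^2(a) \\
u_i(b_i, a_{-i}) - u_i(a_i, a_{-i})
  &= \bigl[u_i^1(b_i, a_{-i}) - u_i^1(a_i, a_{-i})\bigr]
   + \bigl[u_i^2(b_i, a_{-i}) - u_i^2(a_i, a_{-i})\bigr] \\
  &= \bigl[V^1(b_i, a_{-i}) - V^1(a_i, a_{-i})\bigr]
   + \bigl[V^2(b_i, a_{-i}) - V^2(a_i, a_{-i})\bigr] \\
  &= V(b_i, a_{-i}) - V(a_i, a_{-i})
\end{align*}
```

So the summed game has exact potential $V^1 + V^2$; the same argument with weights gives the linear-combination case.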
Token Economies
• Pairs of cognitive radios exchange tokens for services rendered or bandwidth rented
• Example:
– Primary users leasing spectrum to secondary users
• D. Grandblaise, K. Moessner, G. Vivier and R. Tafazolli, “Credit Token based Rental Protocol for Dynamic Channel Allocation,” CrownCom06.
– Node participation in peer-to-peer networks
• T. Moreton, “Trading in Trust, Tokens, and Stamps,” Workshop on the Economics of Peer-to-Peer Systems, Berkeley, CA June 2003.
• Why it works – it’s a potential game when there’s no externality to the trade
– Ordinal potential function given by sum of utilities
Comments on Network Options
• Approaches can be combined
– Policy + potential
– Punishment + cost adjustment
– Cost adjustment + token economies
• Mix of centralized and distributed is likely best approach
• Potential game approach has lowest complexity, but cannot be extended to every problem
• Token economies require strong property rights to ensure proper behavior
• Punishment can also be implemented at a choke point in the network
Summary and Conclusions
• Summary of Critical Concepts
• The Future Role of Game Theory in the Design and Regulation of Dynamic Spectrum Access Networks
• Topics for Further Study and Research
What does game theory bring to the design of cognitive radio networks? (1/2)
• A natural “language” for modeling cognitive radio networks
• Permits analysis of ontological radios
– Only know goals and that radios will adapt towards their goals
• Simplifies analysis of random procedural radios
• Permits simultaneous analysis of multiple decision rules – only need goal
• Provides condition to be assured of possibility of convergence for all autonomously myopic cognitive radios (weak FIP)
What does game theory bring to the design of cognitive radio networks? (2/2)
• Provides condition to be assured of convergence for all autonomously myopic cognitive radios (FIP, not synchronous timing)
• Rapid analysis
– Verify that goals and actions satisfy a single model, and steady-states, convergence, and stability follow
• An intuition as to what conditions will be needed to field successful cognitive radio decision rules.
• A natural understanding of distributed interactive behavior which simplifies the design of low complexity distributed algorithms
Game Models of Cognitive Radio Networks
• Almost as many models as there are algorithms
• Normal form game excellent for capturing single iteration of a complex system
• Most other models add features to this model
– Time, decision rules, noisy observations, natural states
• Some can be recast as a normal form game
– Extensive form game
• Normal Form Game
– ⟨N, A, ui⟩
• Supermodular Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} \geq 0$
• Potential Game
– $\frac{\partial^2 u_i(a)}{\partial a_i \partial a_j} = \frac{\partial^2 u_j(a)}{\partial a_j \partial a_i}$
• Repeated Game
– ⟨N, A, ui, di⟩
• Asynchronous Game
– ⟨N, A, ui, di, T⟩
• Extensive Form Game
– ⟨N, A, ui, di, T⟩
• TU Game
– ⟨N, v⟩
• Bargaining Game
– ⟨N, v⟩
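The two second-derivative conditions above can be spot-checked numerically for a candidate pair of utility functions. The sketch below is illustrative only: the utilities `u0` and `u1` are hypothetical, the helper names are introduced here, and a finite-difference check at one point is a sanity test, not a proof that the condition holds over the whole action space.

```python
def cross_partial(u, a, i, j, h=1e-3):
    """Central-difference estimate of d^2 u / (da_i da_j) at action vector a (i != j)."""
    def shifted(di, dj):
        b = list(a)
        b[i] += di
        b[j] += dj
        return u(b)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)


# Hypothetical two-player utilities with continuous scalar actions.
def u0(a):
    return a[0] * (1.0 - a[0] - a[1])


def u1(a):
    return a[1] * (2.0 - a[1] - a[0])


def satisfies_potential_condition(utils, a, tol=1e-6):
    """Cross partials of u_i and u_j agree for every pair of players at point a."""
    n = len(utils)
    return all(
        abs(cross_partial(utils[i], a, i, j) - cross_partial(utils[j], a, j, i)) <= tol
        for i in range(n) for j in range(n) if i != j
    )


def satisfies_supermodular_condition(utils, a, tol=1e-6):
    """Every cross partial of u_i with respect to (a_i, a_j) is nonnegative at point a."""
    n = len(utils)
    return all(
        cross_partial(utils[i], a, i, j) >= -tol
        for i in range(n) for j in range(n) if i != j
    )
```

For these example utilities both cross partials equal -1, so the pair passes the potential-game condition at any test point while failing the supermodular condition, which shows that the two conditions are independent.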
Steady-states
• Different game models have different steady-state concepts
• Games can have many, one, or no steady-states
• Nash equilibrium (and its variants) is most commonly applied concept
– Excellent for distributed noncollaborative algorithms
• Games with punishment and coalitional games tend to have a very large number of equilibria
• Game theory permits analysis of steady-states without knowing specific decision rules
• Nash Equilibrium
• Strong Nash Equilibrium
• Core
• Shapley Value
• Nash Bargaining Solution
Optimality
• Numerous different notions of optimality
• Many are contradictory
• Use whatever metric makes sense
• Pareto Efficiency
• Objective Maximization
• Gini Index
• Shapley Value
• Nash Bargaining Solution
Convergence
• Showing existence of steady-states is insufficient; need to know if radios can reach those states
• FIP (potential games) gives the broadest convergence conditions
• Random timing actually helps convergence
Noise
• Unbounded noise causes all networks to theoretically behave as ergodic Markov chains
• Important to show Lyapunov stability
• Noisy observations cause noisy implementation to an outside observer
– Trembling hand
Game Theory and Design
• Numerous techniques for improving the behavior of cognitive radio networks
• Techniques can be combined
• Potential games yield lowest complexity implementations
– Judicious design of goals, actions,
• Practical limitations limit effectiveness of punishment
– Observing actions
– Likely best when a referee exists
• Policy can limit the worst effects, but doesn’t really address optimality or convergence issues
• Supermodular games
– Steady state exists
– Best response convergence
• Potential games
– Identifiable steady-states
– All self-interested decision rules converge
– Lyapunov function exists for isolated equilibria
• Punishment
– Can enforce any action tuple
– Can be brittle when distributed
• Policy
– Limits worst case performance
• Cost function
– Reshapes preferences
– Could damage underlying structure if not a self-interested cost
• Centralized
– Can theoretically realize any result
– Consumes overhead
– Slower reactions
Future Directions in Game Theory and Design
• Integrate policy and potential games
• Integration of coalitional and distributed forms
• Increasing dimensionality of action sets
– Cross-layer
• Integration of dynamic and hierarchical policies and games
Future Direction in Regulation
• Can incorporate optimization into policy by specifying goals
• In theory, correctly implementing goals, correctly implementing actions, and exhaustive self-interested adaptation is enough to predict behavior (at least for potential games)
– Simpler policy certification
• Provable network behavior
Avenues for Future Research on Game Theory and CRNs
• Integration of bargaining, centralized, and distributed algorithms into a common framework
• Cross-layer algorithms
• Better incorporating performance of classification techniques into behavior
• Asymmetric potential games
• Bargaining algorithms for cognitive radio
• Improving the brittleness of punishment in distributed implementations with imperfect observations
• Imperfection in observations in general
• Time varying game models while inferring convergence, stability…
• Combination of policy, potential games, coalition formation, and token economies
• Can be modeled as a game with two types of players
– Distributed cognitive radios
– Dynamic policy provider
Questions?
www.crtwireless.com