Bayesian and non-Bayesian Learning in Games
Ehud Lehrer
Tel Aviv University, School of Mathematical Sciences
Including joint works with: Ehud Kalai, Rann Smorodinsky, Eilon Solan.
Learning in Games
Informal definition of learning: a decentralized process that converges (in some sense) to (some) equilibrium.
Non-Bayesian learning: Players
• don’t have any initial belief about other players’ strategies
• don’t maximize their payoffs
• don’t take into account future payoffs
Convergence (of the empirical frequency) to an equilibrium of the ONE-SHOT GAME
Bayesian (rational) learning: Players do not start in equilibrium, but
• they have some initial belief about other players’ strategies
• they are rational: they maximize their payoffs
• they take into account future payoffs
Convergence in REPEATED GAME
Bayesian vs. non-Bayesian
Non-Bayesian learning: Players have no idea about other players’ actions. They don’t care to maximize payoffs.
Nature of results: the statistics of past actions looks like an
equilibrium of the one-shot game.
Bayesian learning: Players do not start in equilibrium, but they
start with a “grain” of idea about what other players do.
Nature of results: players eventually play something close to an
equilibrium of the repeated game.
Important tools
Non-Bayesian learning: approachability
Bayesian learning: merging of two probability measures along a filtration (an increasing sequence of σ-fields)
Both were initiated by Blackwell (the second with Dubins)
Repeated Games with Vector Payoffs
• I = finite set of actions of player 1.
• J = finite set of actions of player 2.
• $M = (m_{i,j})$ = a payoff matrix. Entries are vectors in $\mathbb{R}^d$.
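A set F is approachable by player 1 if there is a strategy $\sigma$ s.t.
$$\forall \varepsilon > 0,\ \exists N,\ \forall \tau:\quad P_{\sigma,\tau}\Big(\sup_{n \ge N} d(\bar x_n, F) \ge \varepsilon\Big) < \varepsilon,$$
where $\bar x_n$ is the average payoff vector up to stage n.
A set F is excludable by player 2 if there is a strategy $\tau$ s.t.
$$\exists \delta > 0,\ \forall \varepsilon > 0,\ \exists N,\ \forall \sigma:\quad P_{\sigma,\tau}\Big(\inf_{n \ge N} d(\bar x_n, F) \ge \delta\Big) > 1 - \varepsilon.$$
There are sets which are neither approachable nor excludable.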
Approachability
Applications (a sample):
• No-regret (Hannan)
• Repeated games with incomplete information (Aumann-Maschler)
• Learning (Foster-Vohra, Hart-Mas-Colell)
• Manipulation of calibration tests (Foster-Vohra, Lehrer, Smorodinsky-Sandroni-Vohra)
• Generating generalized normal numbers (Lehrer)
Characterization of Approachable Sets
A closed set $F \subseteq \mathbb{R}^d$ is a B-set if for every $x \notin F$ there is $y \in F$ that satisfies:
1. y is a closest point in F to x.
2. The hyperplane perpendicular to the line xy that passes through y separates between x and H(p), for some $p \in \Delta(I)$.
Here $m_{p,q} = \sum_{i,j} p_i \, m_{i,j} \, q_j$ and $H(p) = \{ m_{p,q} : q \in \Delta(J) \}$.
[Figure: the set F, the points x and y, the line xy, the hyperplane perpendicular to xy that passes through y, and the set H(p0).]
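The definitions of $m_{p,q}$ and $H(p)$ can be made concrete with a small sketch. Because $m_{p,q}$ is linear in q, H(p) is the convex hull of the points obtained when player 2 uses a pure action, so listing those points describes H(p). The function names and the 2x2 game below are illustrative assumptions, not part of the talk.

```python
from itertools import product

def expected_payoff(M, p, q):
    """Return m_{p,q} = sum_{i,j} p_i * m_{i,j} * q_j for vector-valued entries."""
    d = len(M[0][0])
    total = [0.0] * d
    for i, j in product(range(len(p)), range(len(q))):
        for k in range(d):
            total[k] += p[i] * M[i][j][k] * q[j]
    return total

def extreme_points_of_H(M, p):
    """List m_{p,j} for each pure action j of player 2; H(p) is their convex hull."""
    n_cols = len(M[0])
    points = []
    for j in range(n_cols):
        q = [1.0 if c == j else 0.0 for c in range(n_cols)]
        points.append(expected_payoff(M, p, q))
    return points

# Hypothetical 2x2 game with payoffs in R^2, chosen only for illustration.
M = [[(-1.0, 1.0), (1.0, -1.0)],
     [(1.0, 1.0), (-1.0, -1.0)]]
print(extreme_points_of_H(M, [0.5, 0.5]))  # the two endpoints of the segment H(p)
```

With the uniform mixed action, H(p) in this toy game is the segment between (0, 1) and (0, -1).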
Characterization of Approachable Sets
Theorem [Blackwell, 1956]: every B-set F is approachable.
Theorem [Blackwell, 1956]: every convex set is either approachable or excludable.
Theorem [Hou, 1971; Spinat, 2002]: every minimal (w.r.t. set inclusion) approachable set is a B-set.
Or: A set is approachable if and only if it contains a B-set.
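The approaching strategy plays at each stage n a mixed action p such that H(p) and the current average payoff $\bar x_n$ are separated by the hyperplane that passes through a closest point y to $\bar x_n$ in F and is perpendicular to the line $\bar x_n y$. With this strategy:
$$E\big[d(\bar x_n, F)\big] \le \frac{2\|M\|}{\sqrt{n}}$$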
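As an illustration of the approaching strategy, here is a minimal sketch for one special case from the applications list: no-regret, where the target set F is the nonpositive orthant and the vector payoff is the regret vector. In that case the closest point to the average regret is its coordinate-wise negative part, so the separating condition is met by playing proportionally to the positive parts of the average regret. To keep the run deterministic, the sketch tracks the expected regret of the mixed action instead of sampling an action; that simplification, the function names, and the opponent sequence are all assumptions made for the example.

```python
def approach_nonpositive_orthant(u, opponent_actions):
    """Average regret vector when player 1 mixes proportionally to positive regrets.

    u[i][j] is player 1's scalar payoff; the vector payoff at a stage is the
    regret vector (u[k][j] - payoff obtained), one coordinate per action k.
    opponent_actions must be non-empty.
    """
    n_actions = len(u)
    regret_sum = [0.0] * n_actions
    for j in opponent_actions:
        positive = [max(r, 0.0) for r in regret_sum]
        total = sum(positive)
        if total > 0:
            p = [w / total for w in positive]      # direction from F to the average
        else:
            p = [1.0 / n_actions] * n_actions      # already in F: any p will do
        expected = sum(p[i] * u[i][j] for i in range(n_actions))
        for k in range(n_actions):
            regret_sum[k] += u[k][j] - expected
    n = len(opponent_actions)
    return [r / n for r in regret_sum]

def distance_to_nonpositive_orthant(x):
    """Euclidean distance from x to the set of vectors with no positive coordinate."""
    return sum(max(c, 0.0) ** 2 for c in x) ** 0.5

# Matching pennies against a hypothetical, arbitrary opponent sequence.
u = [[1.0, -1.0], [-1.0, 1.0]]
opponent = [(t // 3) % 2 for t in range(10000)]
print(distance_to_nonpositive_orthant(approach_nonpositive_orthant(u, opponent)))
```

In this expected-payoff variant the distance after n stages is at most the largest regret-vector norm divided by $\sqrt{n}$, whatever the opponent does, which mirrors the $1/\sqrt{n}$ rate stated above.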
Bounded Computational Capacity
A strategy is k-bounded-recall if it depends only on the last k pairs of actions (and it does not depend on previously played actions).
A (non-deterministic) automaton is given by:
• A finite state space.
• A probability distribution over states, according to which the initial state is chosen.
• A set of inputs (say, the set I × J of action pairs).
• A set of outputs (say, I, the set of player 1’s actions).
• A rule that assigns to each state a probability distribution over outputs.
• A transition rule that assigns to every state and every input a probability distribution over the next state.
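The components listed above map directly onto a small data structure. The sketch below is one possible encoding, with probability distributions stored as dictionaries; the class name, method names, and the copy-the-opponent example are hypothetical and serve only to illustrate the definition.

```python
import random

class Automaton:
    """A (non-deterministic) automaton for player 1, as defined above."""

    def __init__(self, initial, output, transition, seed=0):
        self.rng = random.Random(seed)
        self.output = output          # state -> {action of player 1: probability}
        self.transition = transition  # (state, (i, j)) -> {next state: probability}
        self.state = self._draw(initial)  # initial: {state: probability}

    def _draw(self, dist):
        outcomes = list(dist)
        weights = [dist[o] for o in outcomes]
        return self.rng.choices(outcomes, weights=weights)[0]

    def act(self):
        """Draw player 1's action from the output rule of the current state."""
        return self._draw(self.output[self.state])

    def update(self, action_pair):
        """Move to the next state given the realized pair (i, j)."""
        self.state = self._draw(self.transition[(self.state, action_pair)])

# Hypothetical example: remember player 2's last action and copy it.
actions = [0, 1]
copycat = Automaton(
    initial={0: 1.0},
    output={s: {s: 1.0} for s in actions},
    transition={(s, (i, j)): {j: 1.0}
                for s in actions for i in actions for j in actions},
)
first = copycat.act()
copycat.update((first, 1))
print(first, copycat.act())
```

This particular automaton is also a 1-bounded-recall strategy, since its state is a function of the last action pair only.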
Approachability and Bounded Capacity
Theorem (w/ Eilon Solan): The following statements are equivalent.
1. The set F is approachable with bounded-recall strategies.
2. The set F is approachable with automata.
3. The set F contains a convex approachable set.
4. The set F is not excludable against bounded-recall strategies.
A set F is approachable with bounded-recall strategies by player 1 if for every $\varepsilon > 0$, the set $B(F, \varepsilon) := \{ y : d(y, F) \le \varepsilon \}$ is approachable by some bounded-recall strategy.
A set F is excludable against bounded-recall strategies by player 2 if player 2 has a strategy $\tau$ such that
$$\exists \delta > 0,\ \forall \varepsilon > 0,\ \forall \text{ bounded-recall } \sigma,\ \exists N:\quad P_{\sigma,\tau}\Big(\inf_{n \ge N} d(\bar x_n, F) \ge \delta\Big) > 1 - \varepsilon.$$
Main Theorem
Theorem: The following statements are equivalent for closed sets.
1. The set F is approachable with bounded-recall strategies.
2. The set F is approachable with automata.
3. The set F contains a convex approachable set.
4. The set F is not excludable against bounded-recall strategies.
4 points to note
1. A set is approachable with automata if and only if it is approachable by bounded-recall strategies.
2. A complete characterization of sets that are approachable with bounded-recall strategies.
3. A set which is not approachable with bounded-recall strategies is excludable against all bounded-recall strategies.
4. We do not know whether the same holds for automata.
Example (on board)
Payoff matrix:
(-1, 1)    (1, -1)
(1, 1)     (-1, -1)
Good news: in applications target sets are convex (a point or a whole orthant, positive or negative).
Advantage: allows for infinitely many constraints
Approachability in Hilbert space
• I = finite set of actions of player 1.
• J = finite set of actions of player 2.
• $M = (m_{i,j})$ = a payoff matrix. Entries are points in a Hilbert space (random variables).
All may change with the stage n.
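A set F is approachable by player 1 if there is a strategy $\sigma$ s.t.
$$\forall \varepsilon > 0,\ \exists N,\ \forall \tau:\quad P_{\sigma,\tau}\Big(\sup_{n \ge N} d(\bar x_n, F) \ge \varepsilon\Big) < \varepsilon.$$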
Theorem: Suppose that at stage n, the average payoff is $\bar x_n$ and y is a closest point in F to $\bar x_n$. If the hyperplane perpendicular to the line $\bar x_n y$ that passes through y separates between $\bar x_n$ and H(p), for some $p \in \Delta(I)$, then F is approachable.
Approachability and law of large numbers
$X_1, X_2, \ldots$ are uncorrelated r.v.’s with $E(X_i) = 0$. $E(X_i X_j)$ is the dot product.
F is $\{0\}$.
At any stage n, $E(\bar X_n X_{n+1}) = 0$.
[Figure: the set F, the average $\bar X_n$, and the next payoff $X_{n+1}$.]
The game: each player has only one action. The payoff at stage n is $X_n$. Thus, F is approachable. This is the strong law of large numbers. (When the payoffs are not uniformly bounded, there is an additional boundedness condition.)
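The orthogonality condition can be turned into a short worked computation. The sketch below assumes $F = \{0\}$ (so the closest point is $y = 0$), writes $\|X\|^2 := E(X^2)$, and assumes a uniform bound $\|X_t\|^2 \le C$; the constant C is introduced here only for illustration. It shows convergence of the average in the $L^2$ norm; the almost-sure statement needs a further argument that is not reproduced here.

```latex
% Assumes F = \{0\}, y = 0, \|X\|^2 := E(X^2), and E(\bar X_n X_{n+1}) = 0.
\bar X_{n+1} = \frac{n \bar X_n + X_{n+1}}{n+1}
\quad\Longrightarrow\quad
(n+1)^2 \|\bar X_{n+1}\|^2
  = n^2 \|\bar X_n\|^2 + 2n\, E(\bar X_n X_{n+1}) + \|X_{n+1}\|^2
  = n^2 \|\bar X_n\|^2 + \|X_{n+1}\|^2 .
% Iterating from n = 1, and assuming \|X_t\|^2 \le C for all t:
\|\bar X_n\|^2 = \frac{1}{n^2} \sum_{t=1}^{n} \|X_t\|^2 \le \frac{C}{n} \longrightarrow 0 .
```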
Problem: Approachability in normed spaces.
Activeness function
H is $L^2$ (even over a finite probability space).
At stage n the characteristic function $K_n$ indicates which coordinates are active and which are not.
The average payoff at stage n is
$$\bar X_n = \frac{\sum_{t=1}^{n} K_t X_t}{\sum_{t=1}^{n} K_t}$$
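A small sketch may help to read the activeness-weighted average: each coordinate is averaged only over the stages at which it was active. It assumes a finite probability space, so payoffs and activeness indicators are plain lists indexed by coordinate. The function name and the convention of returning 0.0 for a coordinate that was never active are assumptions made for the example.

```python
def active_average(payoffs, activeness):
    """Coordinate-wise average of payoffs over the stages where K_t marks them active.

    payoffs[t][c] is coordinate c of X_t; activeness[t][c] is 1 if coordinate c
    is active at stage t and 0 otherwise.
    """
    dim = len(payoffs[0])
    average = []
    for c in range(dim):
        weight = sum(k[c] for k in activeness)
        total = sum(k[c] * x[c] for k, x in zip(activeness, payoffs))
        # Assumed convention: a coordinate that was never active averages to 0.0.
        average.append(total / weight if weight > 0 else 0.0)
    return average

# Hypothetical data: three stages, two coordinates.
X = [[1.0, 4.0], [3.0, 8.0], [5.0, 0.0]]
K = [[1, 1], [1, 0], [0, 1]]
print(active_average(X, K))
```

In this toy data the first coordinate is averaged over stages 1 and 2 and the second over stages 1 and 3.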
Applications:
1. repeated games with incomplete information (different games are active at different times)
2. construction of normal numbers
3. manipulability of many calibration tests
4. general no-regret theorem (against many replacing schemes)
5. convergence to correlated equilibrium along many sequences
Activeness function – cont.
Theorem: Suppose that F is convex. Let $y_n$ be the closest point in F to the average payoff at time n, $\bar X_n$. If the hyperplane perpendicular to the line given by the difference $\bar X_n - y_n$, weighted coordinate-wise by the activeness ratio $K / \sum_t K_t$, that passes through $y_n$ separates between $\bar X_n$ and H(p), for some $p \in \Delta(I)$, then F is approachable.