Nesterov’s excessive gap technique and poker Andrew Gilpin CMU Theory Lunch Feb 28, 2007 Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm

Nesterov’s excessive gap technique and poker

Andrew Gilpin

CMU Theory Lunch

Feb 28, 2007

Joint work with: Samid Hoda, Javier Peña, Troels Sørensen, Tuomas Sandholm

Outline

• Two-person zero-sum sequential games

• First-order methods for convex optimization

• Nesterov’s excessive gap technique (EGT)

• EGT for sequential games

• Heuristics for EGT

• Application to Texas Hold’em poker

We want to solve:

If Q1 and Q2 are simplices, this is the Nash equilibrium problem for two-person zero-sum matrix games

If Q1 and Q2 are complexes, this is the Nash equilibrium problem for two-person zero-sum sequential games
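
The problem referred to on this slide is a bilinear saddle-point problem. A sketch in LaTeX, assuming A denotes the payoff matrix and that the player choosing x minimizes (an orientation picked only to agree with the weak-duality statement f(y) ≤ Φ(x) later in the talk):

```latex
\min_{x \in Q_1} \max_{y \in Q_2} x^{\top} A y
  \;=\; \max_{y \in Q_2} \min_{x \in Q_1} x^{\top} A y
```

The two sides are equal by the minimax theorem, since Q1 and Q2 are compact convex sets and the objective is bilinear.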

What’s a complex?

It’s just like a simplex, but more complex.

Each player’s complex encodes her set of realization plans in the game

In particular, player 1’s complex is

$Q_1 = \{x : Ex = e,\ x \ge 0\}$

where E and e depend on the game…
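
To make the role of E and e concrete, here is a minimal Python sketch for a made-up player who first chooses L or R and, after L, chooses again between LL and LR. The tree, the sequence labels, and the function names are all hypothetical and are not the example from the talk; the sketch only assumes the complex is the set of nonnegative x with Ex = e.

```python
# Sequences of a hypothetical player, in order: [empty, L, R, LL, LR].
# Each row of E says "the children of a decision point sum to their parent".
E = [
    [1, 0, 0, 0, 0],    # x_empty = 1
    [-1, 1, 1, 0, 0],   # x_L + x_R = x_empty
    [0, -1, 0, 1, 1],   # x_LL + x_LR = x_L
]
e = [1, 0, 0]


def in_complex(x, E, e, tol=1e-9):
    """Return True if x >= 0 and E x = e, up to the tolerance tol."""
    if any(xi < -tol for xi in x):
        return False
    return all(
        abs(sum(a * xi for a, xi in zip(row, x)) - rhs) <= tol
        for row, rhs in zip(E, e)
    )


def realization_plan(p_L, p_LL):
    """Turn behavioral probabilities into a realization plan.

    Each entry is the product of the action probabilities along the sequence.
    """
    return [1.0, p_L, 1.0 - p_L, p_L * p_LL, p_L * (1.0 - p_LL)]


print(in_complex(realization_plan(0.3, 0.5), E, e))
```

With a single decision point the constraints reduce to "entries are nonnegative and sum to one", which is the simplex case from the previous slide.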

[Figure: example with labels A, B, C, D, E, F, G, H]

Recall our problem:

where Q1 and Q2 are complexes

Since Q1 and Q2 have a linear description, this problem can be solved as an LP. However, current LP solution methods do not scale

(Un)scalability of LP solvers

• Rhode Island poker [Shi & Littman 01]
  – LP has 91 million rows and columns
  – Applying GameShrink automated abstraction algorithm yields an LP with only 1.2 million rows and columns, and 50 million non-zeros [G. & Sandholm, 06a]
  – Solution requires 25 GB RAM and over a week of CPU time

• Texas Hold’em poker
  – ~10^18 nodes in game tree
  – Lossy abstractions need to be performed
  – Limitations of current solver technology are the primary limitation to achieving expert-level strategies [G. & Sandholm 06b, 07a]

• Instead of standard LP solvers, what about a first-order method?

Convex optimization

Suppose we want to solve

$\min_{x \in \mathbb{R}^n} f(x)$

where f is convex.

For general f, convergence requires O(1/ε²) iterations (e.g., for subgradient methods)

For smooth, strongly convex f with Lipschitz-continuous gradient, this can be done in O(1/√ε) iterations

Note that this formulation captures ALL convex optimization problems (can model feasible space using an indicator function)

Analysis based on black-box oracle access model. Can we do better by looking inside the box?
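
To get a feel for the difference between the two rates, here is a small Python sketch that evaluates them for a given ε. It assumes all hidden constants equal 1, so the numbers are orders of magnitude only; the function name and dictionary keys are made up for this illustration.

```python
import math


def iteration_orders(eps):
    """Order-of-magnitude iteration counts, with every constant set to 1."""
    return {
        "general": 1.0 / eps ** 2,          # O(1/eps^2), e.g. subgradient methods
        "smooth": 1.0 / math.sqrt(eps),     # O(1/sqrt(eps)), smooth case
    }


for eps in (1e-2, 1e-4):
    print(eps, iteration_orders(eps))
```

For ε = 10^-4 the general rate is on the order of 10^8 iterations while the smooth rate is on the order of 10^2.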

Strong convexity

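A function d is strongly convex if there exists σ > 0 such that

$d(\alpha x + (1-\alpha) y) \le \alpha\, d(x) + (1-\alpha)\, d(y) - \tfrac{1}{2}\sigma\,\alpha(1-\alpha)\,\|x - y\|^2$

for all x, y in the domain of d and all α ∈ [0, 1]

σ is the strong convexity parameter of d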

Recall our problem:

where Q1 and Q2 are complexes

Equivalently:

where

and
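
One way to write the equivalent pair of problems, assuming a bilinear objective x^T A y and taking Φ as the upper function and f as the lower one, so that f(y) ≤ Φ(x) holds as stated later in the talk:

```latex
\min_{x \in Q_1} \Phi(x) \;=\; \max_{y \in Q_2} f(y),
\qquad
\Phi(x) = \max_{y \in Q_2} x^{\top} A y,
\qquad
f(y) = \min_{x \in Q_1} x^{\top} A y
```

Under these definitions Φ is a maximum of linear functions and f a minimum of linear functions, so both are piecewise linear.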

Unfortunately, Φ and f are non-smooth

Fortunately, they have a special structure

Let d1, d2 be smooth and strongly convex on Q1, Q2

These are called prox-functions

Now let μ > 0 and consider:

These are well-defined smooth functions
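
A sketch of what the smoothed functions look like, assuming Φ(x) is a maximum over y ∈ Q2 and f(y) a minimum over x ∈ Q1 of a bilinear objective x^T A y, and using a single smoothing parameter μ as on this slide:

```latex
\Phi_\mu(x) = \max_{y \in Q_2} \left\{ x^{\top} A y - \mu\, d_2(y) \right\},
\qquad
f_\mu(y) = \min_{x \in Q_1} \left\{ x^{\top} A y + \mu\, d_1(x) \right\}
```

Strong convexity of d1 and d2 makes each inner optimizer unique, which is what makes Φμ and fμ differentiable.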

Excessive gap condition

From weak duality, we have that f(y) ≤ Φ(x)

The excessive gap condition requires the reverse inequality for the smoothed functions:

fμ(y) ≥ Φμ(x) (EGC)

The algorithm maintains (EGC), and gradually decreases μ

As μ decreases, the smoothed functions approach the non-smooth functions, and thus iterates satisfying (EGC) converge to optimal solutions
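
Why (EGC) controls the duality gap can be sketched in one chain of inequalities. This assumes that Φμ is built by subtracting μ d2 inside a max and fμ by adding μ d1 inside a min, that (EGC) is read as fμ(y) ≥ Φμ(x), and it introduces D_i for the maximum of d_i over Q_i (notation assumed here, not taken from the slide):

```latex
\Phi(x) - \mu D_2 \;\le\; \Phi_\mu(x) \;\le\; f_\mu(y) \;\le\; f(y) + \mu D_1
\quad\Longrightarrow\quad
0 \;\le\; \Phi(x) - f(y) \;\le\; \mu\,(D_1 + D_2)
```

The outer inequalities come from 0 ≤ d_i ≤ D_i, the middle one is (EGC), and the lower bound of 0 is weak duality.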

Nesterov’s main theorem

Theorem [Nesterov 05] There exists an algorithm such that after at most N iterations, the iterates have duality gap at most

Furthermore, each iteration only requires solving three problems of the form

and performing three matrix-vector product operations on A.
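
For reference, the bound and the subproblem shape are usually stated in the following form. The symbols are assumptions not visible in the transcript: ‖A‖ is an operator norm of A, σ_i is the strong convexity parameter of d_i, D_i is the maximum of d_i over Q_i, g is a given vector, and the gap is measured as Φ(x) − f(y).

```latex
0 \;\le\; \Phi(x^N) - f(y^N) \;\le\; \frac{4\,\|A\|}{N+1}\sqrt{\frac{D_1 D_2}{\sigma_1 \sigma_2}},
\qquad
\text{subproblems: } \max_{x \in Q_i} \left\{ g^{\top} x - d_i(x) \right\}
```

Read this way, reaching a gap of ε takes O(1/ε) iterations.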

Nice prox functions

A prox function d for Q is nice if:

1. It is strongly convex, continuous everywhere in Q, and differentiable in the relative interior of Q

2. The min of d over Q is 0

3. The following maps are easily computable:
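
The two maps are presumably the optimal value and the maximizer of a linear function minus d. The name sargmax appears later in the talk; smax is assumed here by analogy:

```latex
\operatorname{smax}(d, s) = \max_{x \in Q} \left\{ s^{\top} x - d(x) \right\},
\qquad
\operatorname{sargmax}(d, s) = \operatorname*{argmax}_{x \in Q} \left\{ s^{\top} x - d(x) \right\}
```

These have exactly the shape of the per-iteration subproblems in Nesterov's theorem, which is why computing them cheaply matters.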

Nice simplex prox function 1: Entropy
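
The entropy prox function on the simplex is commonly written d(x) = ln n + Σ x_i ln x_i, for which the maximizer of s^T x − d(x) is the softmax of s. A minimal Python sketch under that assumption (the function names are made up for this illustration):

```python
import math


def entropy_prox(x):
    """d(x) = ln n + sum_i x_i ln x_i, with 0 ln 0 taken as 0."""
    n = len(x)
    return math.log(n) + sum(xi * math.log(xi) for xi in x if xi > 0.0)


def entropy_sargmax(s):
    """Maximizer of s.x - d(x) over the simplex: the softmax of s."""
    m = max(s)                                  # shift for numerical stability
    w = [math.exp(si - m) for si in s]
    total = sum(w)
    return [wi / total for wi in w]


def entropy_smax(s):
    """Optimal value of s.x - d(x) over the simplex: logsumexp(s) - ln n."""
    m = max(s)
    return m + math.log(sum(math.exp(si - m) for si in s)) - math.log(len(s))


s = [1.0, 2.0, 3.0]
x = entropy_sargmax(s)
print(x, entropy_smax(s))
```

Both maps cost O(n) time, and d attains its minimum value 0 at the uniform distribution, matching condition 2 of a nice prox function.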

Nice simplex prox function 2: Euclidean

sargmax can be computed in O(n log n) time
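
The O(n log n) cost comes from a sort. A minimal Python sketch, assuming the Euclidean prox function is d(x) = ½ Σ (x_i − 1/n)², in which case the maximizer of s^T x − d(x) is the Euclidean projection of s + (1/n, …, 1/n) onto the simplex. The function names are made up, and the sort-based projection is the standard textbook routine, not necessarily the one used in the talk:

```python
def euclidean_prox(x):
    """d(x) = 1/2 * sum_i (x_i - 1/n)^2, which is 0 at the uniform point."""
    n = len(x)
    return 0.5 * sum((xi - 1.0 / n) ** 2 for xi in x)


def project_simplex(v):
    """Euclidean projection of v onto the probability simplex, O(n log n)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:          # holds for a prefix; keep the last such threshold
            theta = t
    return [max(vi - theta, 0.0) for vi in v]


def euclidean_sargmax(s):
    """Maximizer of s.x - d(x) over the simplex."""
    n = len(s)
    return project_simplex([si + 1.0 / n for si in s])


print(euclidean_sargmax([10.0, 0.0, 0.0]))
print(euclidean_sargmax([0.0, 0.0, 0.0]))
```

Unlike the entropy case, the maximizer here can land on the boundary of the simplex, with some coordinates exactly zero.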

From the simplex to the complex

Theorem [Hoda, G., Peña 06] A nice prox function can be constructed for the complex via a recursive application of any nice prox function for the simplex

Prox function example

Given any nice simplex prox function, the prox function for this matrix is:

Solving

(similar to b(i-vii))

Heuristics [G., Hoda, Peña, Sandholm 07]

• Heuristic 1: Aggressive μ reduction
  – The μ given in the previous algorithm is a conservative choice guaranteeing convergence
  – In practice, we can do much better by aggressively pushing μ, while checking that the excessive gap condition is satisfied

• Heuristic 2: Balanced μ reduction
  – To prevent one μ from dominating the other, we also perform periodic adjustments to keep them within a small factor of one another
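
The control flow of the two heuristics can be sketched in Python as below. This is an illustration only, not the authors' procedure: `egc_holds` is a hypothetical callback standing in for "recompute the tentative iterates and test the excessive gap condition", and the shrink factors and the balance ratio are made-up values.

```python
def aggressive_step(mu, egc_holds, factors=(0.5, 0.75, 0.9)):
    """Try the most aggressive decrease of mu first.

    Returns the first candidate mu * factor accepted by egc_holds, or the
    unchanged mu if every candidate would violate the condition.
    """
    for factor in factors:
        candidate = mu * factor
        if egc_holds(candidate):
            return candidate
    return mu


def needs_balancing(mu1, mu2, max_ratio=1.5):
    """True if one smoothing parameter dominates the other."""
    return max(mu1, mu2) > max_ratio * min(mu1, mu2)


# Toy stand-in: pretend the condition holds whenever mu stays at or above 0.6.
print(aggressive_step(1.0, lambda mu: mu >= 0.6))
print(needs_balancing(1.0, 2.0))
```

In a real implementation the check is the expensive part, since evaluating the smoothed functions requires matrix-vector products with A.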

Matrix-vector multiplication in poker [G., Hoda, Peña, Sandholm 07]

• The main time and space bottleneck of the algorithm is the matrix-vector product on A

• Instead of storing the entire matrix, we can represent it as a composition of Kronecker products

• We can also effectively take advantage of parallelization in the matrix-vector product to achieve near-linear speedup
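
The saving from a Kronecker representation can be illustrated with a single product B ⊗ C. The Python sketch below multiplies (B ⊗ C) by a vector without forming the large matrix, using the identity (B ⊗ C) vec(X) = vec(B X Cᵀ) with row-major vec. The actual payoff matrix in the talk is a composition of several such products whose factors are not given in the transcript, so B, C, and the function names are placeholders.

```python
def matmul(P, Q):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def kron_matvec(B, C, x):
    """Compute (B kron C) x using only the small factors B (p x q) and C (r x s)."""
    q, s = len(B[0]), len(C[0])
    X = [x[j * s:(j + 1) * s] for j in range(q)]    # reshape x to q x s, row-major
    Ct = [list(col) for col in zip(*C)]             # transpose of C, s x r
    Y = matmul(matmul(B, X), Ct)                    # p x r
    return [y for row in Y for y in row]            # flatten row-major


def kron_dense(B, C):
    """Explicit Kronecker product, used here only to check the result."""
    return [[b * c for b in brow for c in crow] for brow in B for crow in C]


B = [[1, 2], [3, 4]]
C = [[0, 1], [1, 0]]
x = [1, 2, 3, 4]
print(kron_matvec(B, C, x))
```

Only B and C are stored, so memory grows with the sizes of the factors rather than with their product.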

Memory usage comparison

Instance   CPLEX IPM   CPLEX Simplex   EGT
10k        0.082 GB    >0.051 GB       0.012 GB
160k       2.25 GB     >0.664 GB       0.035 GB
RI         25.2 GB     >3.45 GB        0.15 GB
Texas      >458 GB     >458 GB         2.49 GB

Poker

• Poker is a recognized challenge problem in AI because (among other reasons)
  – the other players’ cards are hidden;
  – bluffing and other deceptive strategies are needed in a good player;
  – there is uncertainty about future events.

• Texas Hold’em: most popular variant of poker

• Two-player game tree has ~10^18 nodes

Potential-aware automated abstraction [G., Sandholm, Sørensen 07]

• Most prior automated abstraction algorithms employ a myopic expected value computation as a similarity metric
  – This ignores hands like flush draws where although the probability of winning is small, the payoff could be high

• Our newest algorithm considers higher-dimensional spaces consisting of histograms over abstracted classes of states from later stages of the game

• This enables our bottom-up abstraction algorithm to automatically take into account positive and negative potential

Solving the four-round model

• Computed abstraction with
  – 20 first-round buckets
  – 800 second-round buckets
  – 4800 third-round buckets
  – 28800 fourth-round buckets

• Algorithm using 30 GB RAM
  – Simply representing as an LP requires 32 TB
  – Outputs new, improved solution every 2.5 days

[G., Sandholm, Sørensen 07]

Future research

• Customizing second-order methods (e.g., interior-point methods) for the equilibrium problem

• Additional heuristics for improving practical performance of EGT algorithm

• Techniques for finding an optimal solution from an ε-solution

Thank you ☺