Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta

Post on 19-Dec-2015


Page 1: Poker for Fun and Profit (and intellectual challenge) Robert Holte Computing Science Dept. University of Alberta

Poker for Fun and Profit(and intellectual challenge)

Robert Holte

Computing Science Dept.

University of Alberta

Page 2

Poker

Page 3

World Series of Poker

Page 4

Poker Research Group - core

• Darse Billings (Ph.D.)
• Aaron Davidson (M.Sc.), Poki
• Neil Burch (P/A), PsOpti
• Terence Schauenberg (M.Sc.), Adapti

• Advisors: J Schaeffer, D Szafron

Page 5

Poker Research Group – new arrivals

• Bret Hoehn (M.Sc.)
• Finnegan Southey (postdoc)

• Michael Bowling
• Dale Schuurmans
• Rich Sutton
• Robert Holte

Page 6

Our Goal

Page 7

PsOpti2 vs. “theCount”

                          

Page 8

Play Us Online

http://games.cs.ualberta.ca/poker/

Page 9

Poki’s Poker Academy

http://poki-poker.com

Page 10

Poker Variants

• Many different variants of poker

• Texas Hold’em the most skill-testing

• No-Limit Texas Hold’em used to determine the world champion

• Our research: Limit Texas Hold’em

• Current focus: 2-player (heads up)

Page 11

[Figure: game tree for 2-player, limit, Texas Hold’em, O(10^18)]

Initial: 2 private cards to each player (1,624,350)
Bet Sequence (9 of 19)
Flop: 3 community cards (17,296)
Bet Sequence (9 of 19)
Turn: 1 community card (45)
Bet Sequence (9 of 19)
River: 1 community card (44)
Bet Sequence (19)
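The chance-node counts in the tree follow from simple card counting. The short check below is a sketch added for illustration (the variable names are not from the talk); it reproduces the 1,624,350 and 17,296 figures and multiplies the four chance layers together. Betting sequences are left out, so the product is smaller than the O(10^18) size of the full tree.

```python
from math import comb

# Ways to deal 2 private cards to the first player, then 2 to the second.
hole_cards = comb(52, 2) * comb(50, 2)
# Ways to deal the 3-card flop from the remaining 48 cards.
flops = comb(48, 3)
# One turn card from the 45 cards left, then one river card from 44.
turns = 45
rivers = 44

# Chance outcomes only; betting sequences multiply this further.
chance_outcomes = hole_cards * flops * turns * rivers

print(hole_cards)  # 1624350
print(flops)       # 17296
print(f"{chance_outcomes:.2e}")
```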

Page 12

Research Issues

1. Chance events
2. Imperfect information
3. Sheer size of the game tree
4. Opponent modelling is crucial
5. How best to use domain knowledge?
6. Experimental method

Variants have even more challenges:
– More than 2 players (up to 10)
– “No limit” (bet any amount)

Page 13

Issues: Chance Events

• Utility of outcomes
– currently just reason about expected payoff
– short-term vs. long-term

• High variance
– was the outcome due to luck or skill?
– experiment design
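To see why high variance makes experiment design hard, here is a rough sample-size sketch. It assumes independent hands and uses the usual standard-error argument: a true per-hand edge only stands out from luck once it exceeds about z standard errors of the mean. The function name and the numbers (an edge of 0.05 small bets per hand, a standard deviation of 6 small bets per hand) are illustrative assumptions, not figures from the talk.

```python
import math

def hands_needed(edge, std_dev, z=1.96):
    """Hands required before a per-hand edge exceeds z standard errors.

    The standard error after n independent hands is std_dev / sqrt(n),
    so we need n >= (z * std_dev / edge) ** 2.
    """
    return math.ceil((z * std_dev / edge) ** 2)

# Hypothetical values, chosen only to show the order of magnitude.
print(hands_needed(0.05, 6.0))
# Halving the edge roughly quadruples the number of hands required.
print(hands_needed(0.025, 6.0))
```

Under these assumed numbers the answer is in the tens of thousands of hands, which fits the slide's concern about separating luck from skill.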

Page 14

Issues: Imperfect Information

• Probabilistic strategies are essential

• Cannot construct your strategy in a bottom-up manner, as is done with perfect information games
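A toy example may help show why probabilistic strategies are essential. The sketch below uses matching pennies, not poker, as a stand-in zero-sum game with hidden simultaneous choices (that substitution is an assumption made for brevity). It computes what the row player earns against an opponent who best-responds to a known mixing probability.

```python
# Row player's payoff in matching pennies: +1 if the choices match, -1 otherwise.
PAYOFF = [[1, -1],
          [-1, 1]]

def worst_case_value(p_heads):
    """Row player's expected payoff against a best-responding column player."""
    row_mix = [p_heads, 1.0 - p_heads]
    # Expected payoff to the row player for each pure column choice.
    column_payoffs = [
        sum(row_mix[r] * PAYOFF[r][c] for r in range(2))
        for c in range(2)
    ]
    # The column player picks whichever choice hurts the row player most.
    return min(column_payoffs)

for p in (1.0, 0.75, 0.5):
    print(p, worst_case_value(p))
```

A deterministic strategy (p = 1.0) loses every time once the opponent knows it; only the 50/50 mix cannot be exploited.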

Page 15

Issues: Size of the game

• 2-player, Limit, Texas Hold’em game tree has about 10^18 states

• Linear Programming can solve games with 10^8 states

Page 16

Issues: Opponent Modelling

• Nash equilibrium not good enough
– Static
– Defensive

• Even the best humans have weaknesses that should be exploited

• How to learn very quickly, with very noisy information?
– Exploitation vs. exploration

• How not to be exploited yourself?

Page 17

Issues: Using Expert Knowledge

• We are fortunate to have unlimited access to a poker-playing expert (Darse)

• How best to use his knowledge?
– Expert system (explicitly encoded knowledge) was not effective
– Used his knowledge to devise abstractions that reduced the game size with minimal impact on strategic aspects of the game
– Use him to evaluate the system

Page 18

Experimental Method

• High variance

• ‘bot play not the same as human play

• Very limited access to expert humans other than our own expert

Page 19

Coping with very large games

[Figure: Full game tree T (too big to solve) → abstraction (lossy) → Abstract game tree T* → Solve (LP) → Strategy for T* → (reverse mapping) → Strategy for T]

Page 20

Abstraction

• Texas Hold'em 2-player game tree is too big for current LP solvers (1,179,000,604,565,715,751)

• Many ways of doing the abstractions
– We require coarse-grained abstractions
– Avoiding a severe loss of accuracy

• Abstract to a set of smaller problems: 10^8 states, 10^6 equations and unknowns

Page 21

Alternate Game Structures

• Truncation of betting rounds
• Bypassing betting rounds
• Models with 3 rounds, 2 rounds, or 1 round

• Many-to-one mapping of game-tree nodes to single nodes in the abstract game tree
– How you do the mapping determines the overall accuracy (few good and many bad mappings)
– This is the limiting factor of the method

Page 22

[Figure: the Texas Hold'em game tree, O(10^18) (Initial 1,624,350; Flop 17,296; Turn 45; River 44; bet sequences 9 of 19 per round, 19 on the last), with a 3-round model marked on it (expected value leaf nodes)]

Page 23

[Figure: the Texas Hold'em game tree, O(10^18) (Initial 1,624,350; Flop 17,296; Turn 45; River 44; bet sequences 9 of 19 per round, 19 on the last), with a 1-round preflop model and a 3-round postflop model (single flop) marked on it]

Page 24

Abstractions

• Board Q – 7 – 2
• Compare 1. A–3  2. A–4  3. A–K
– Suit isomorphism (24X) (exact)
– Rank near-equivalence (small error)

• Bucketing: hands are mapped to a small set of buckets depending on
– Current hand strength
– Potential for improvement in hand strength
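Suit isomorphism can be sketched as a canonicalization over the 4! = 24 ways of relabelling the four suits, which is presumably where the "24X" on the slide comes from. The helper below is a hypothetical illustration, not the group's code; it assumes cards are two-character strings such as "Ah" (rank, then suit).

```python
from itertools import permutations

SUITS = "cdhs"

def canonical(hole, board):
    """Return the smallest relabelling of (hole, board) over all 24 suit permutations."""
    best = None
    for perm in permutations(SUITS):
        relabel = dict(zip(SUITS, perm))
        key = (
            tuple(sorted(rank + relabel[suit] for rank, suit in hole)),
            tuple(sorted(rank + relabel[suit] for rank, suit in board)),
        )
        if best is None or key < best:
            best = key
    return best

# Two deals that differ only in which suits are used.
a = canonical(["Ah", "3h"], ["Qs", "7d", "2c"])
b = canonical(["Ac", "3c"], ["Qd", "7h", "2s"])
# An offsuit A-3 on the same board is strategically different.
c = canonical(["Ah", "3d"], ["Qs", "7d", "2c"])
print(a == b, a == c)
```

Deals that share a canonical form can share one node in the abstract tree with no loss of accuracy, which is why the slide marks this abstraction as exact.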

Page 25

Bucketing

• Reduce branching factor at chance nodes
• Partition hands into six classes per player
• Overlaying strategically similar sub-trees

[Figure: transition probabilities from the original bucketing (pairs 1,1 1,2 1,3 … 6,6) to the next-round bucketing (pairs 1,1 1,2 1,3 … 6,6)]
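A minimal sketch of the bucketing idea follows, under several simplifying assumptions: hands are bucketed for a single player only (the figure uses pairs of buckets, one per player), buckets are equal-width slices of a hand-strength score in [0, 1], and potential for improvement is ignored. The function names and sample values are hypothetical.

```python
from collections import Counter

NUM_BUCKETS = 6  # six classes per player, as on the slide

def bucket(strength):
    """Map a hand strength in [0, 1] to a bucket 1..6 (equal-width; an assumption)."""
    return min(int(strength * NUM_BUCKETS), NUM_BUCKETS - 1) + 1

def transition_probabilities(samples):
    """Estimate P(next bucket | current bucket) from (strength_now, strength_next) pairs."""
    counts = Counter((bucket(now), bucket(nxt)) for now, nxt in samples)
    totals = Counter()
    for (src, _), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}

# Hypothetical samples, not real poker data.
samples = [(0.10, 0.05), (0.12, 0.40), (0.55, 0.60), (0.58, 0.95)]
probs = transition_probabilities(samples)
for (src, dst), p in sorted(probs.items()):
    print(src, "->", dst, p)
```

Replacing the thousands of possible next cards with a handful of bucket-to-bucket transitions is what reduces the branching factor at chance nodes.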

Page 26

[Figure: the Texas Hold'em game tree, O(10^18) (Initial 1,624,350; Flop 17,296; Turn 45; River 44; bet sequences 9 of 19 per round, 19 on the last), beside its abstraction (chance nodes w2 (36), x2 (36), y2 (36), z2 (36); bet sequences 7 of 15 per round, 15 on the last). Abstract postflop model: O(10^7). Abstract preflop model: O(10^7).]

Page 27

Reverse Mapping

• Bucket splitting
– LP solution gives a strategy (recipe)
– Each partition class split strong / weak
– Split the randomized mixed strategy
– {0, 0.2, 0.8} => {0, 0, 1.0} & {0, 0.4, 0.6}

• Better hand selection (with some risk)
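The split in the example can be checked numerically. The sketch below assumes the three entries are probabilities over three actions (the slide does not name them) and that the strong and weak halves of the bucket are equally likely. Under those assumptions, the two split strategies average back to the bucket's mixed strategy from the LP solution.

```python
def average(strategies):
    """Average several mixed strategies, weighting each one equally."""
    n = len(strategies)
    return [sum(s[i] for s in strategies) / n for i in range(len(strategies[0]))]

bucket_strategy = [0.0, 0.2, 0.8]  # from the slide
strong_half = [0.0, 0.0, 1.0]      # from the slide
weak_half = [0.0, 0.4, 0.6]        # from the slide

combined = average([strong_half, weak_half])
matches = all(abs(a - b) < 1e-12 for a, b in zip(combined, bucket_strategy))
print(combined, matches)
```

The overall action frequencies stay the same, but the stronger hands take the third action every time while the weaker hands take it less often, which is the "better hand selection" the slide refers to.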

Page 28

Putting It All Together – PsOpti1

[Figure: PsOpti1. Rows: Preflop, Flop, Turn, River. Preflop: Selby preflop model. Flop through River: four postflop models (Post), one for each of Bets 2, 4, 6, 8.]

Page 29

Putting It All Together – PsOpti2

[Figure: PsOpti2. Rows: Preflop, Flop, Turn, River. Preflop: 3-round preflop model. Flop through River: seven postflop models (Post), labelled Bets + model: 2, 4, 4, 6, 6, 8, 8.]

Page 30

Conclusions

• Game Theory can be applied to large problems and practical systems

• Nash Equilibrium (minimax) too defensive, does not exploit the opponent’s weaknesses

• Current work involves opponent modelling
– Preliminary results are very promising

• We hope to beat the best poker players in the world in the near future