Poker and AI How the most “stable” creature on earth got used to that good old game from the west!

Upload: prosper-cummings

Post on 11-Jan-2016


Page 1:

Poker and AI
How the most “stable” creature on earth got used to that good old game from the west!

Page 2:

A game of (p)luck!

• Cards:
– 2 Blinds
– Flop: 3 community cards
– Turn: 1 more community card
– River: 1 last community card
• Betting rounds after every card deal/flip
• Fold OR Call (Check) OR Raise (Bet)
• Showdown, if you get there

Page 3:

Poker as a non-trivial act of intelligence

Phil Hellmuth

Phil used my knowledge of Phil against me

Mike Matusow

Page 4:

Ain’t this an AI seminar?

• Games have always held an allure for AI theoreticians.
• Poker is a game of incomplete information.
• Several successful implementations: BluffBot (Teppo Salonen), Polaris (Univ. of Alberta), Poki, Casper… we will see some.
• AAAI Annual Poker Competition: http://www.cs.ualberta.ca/~pokert/

Page 5:

The essence of Poker

• Hand Strength & Hand Potential: Assess the strength of the current hand.
– Cards in game
– Number of players in the game
– Position of the player
– History
– Draws
– Risks

Page 6:

The essence of Poker

• Pot Odds
– Pot odds are the ratio of the bet to the total pot, which is then compared with the odds of winning.
– Example: If the cards in hand are A(H)-A(D) and the cards on board are A(C)-2-3-7-?, then the odds of getting a very strong hand after the river are 5:13.
– The pot odds for a $10 bet on a $40 pot are 1:4, while on a $10 pot they are 1:1.
– The first is favorable, the second is not.
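The comparison on this slide can be checked with a few lines of Python. This is a minimal sketch, assuming that "5:13" means 5 chances of hitting for every 13 of missing (a win probability of 5/18) and that hitting wins the whole pot. The function name `call_is_favorable` is illustrative and does not come from the slides.

```python
from fractions import Fraction

def call_is_favorable(bet, pot, odds_for, odds_against):
    """Return True if calling `bet` into `pot` has positive expected value.

    Assumption: we win `pot` with probability
    odds_for / (odds_for + odds_against) and lose `bet` otherwise.
    """
    p_win = Fraction(odds_for, odds_for + odds_against)
    expected_value = p_win * pot - (1 - p_win) * bet
    return expected_value > 0

# The two situations from the slide, with winning odds of 5:13
print(call_is_favorable(10, 40, 5, 13))  # $10 bet on a $40 pot
print(call_is_favorable(10, 10, 5, 13))  # $10 bet on a $10 pot
```

With these numbers the first call has positive expected value and the second does not, which matches the slide.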

Page 7:

The essence of Poker

• Bluffing & Unpredictability
– Different strategies in similar situations
– Element of non-determinism
• Opponent Modeling
– Used to guess the opponents’ cards based on history

Page 8:

LOKI & POKI
A look at how the experts do it!

Page 9:

Encoding the Problem

• Probability triples – simplicity itself

Pr := (f, c, r)

“Marvin thinks for an eon and comes up with the three magic numbers to make tea!”

The output of all analysis at any point in the game is a triple of probabilities with which Poki folds, calls or raises. The final decision is drawn at random according to this triple, so it is non-deterministic and adds natural noise.
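Turning a probability triple into an action can be sketched as follows. This is an illustration only, assuming the triple sums to 1; the function name `choose_action` and the example triple are made up for the demo and are not taken from Poki.

```python
import random

def choose_action(triple, rng=random):
    """Pick 'fold', 'call' or 'raise' with the probabilities in triple = (f, c, r)."""
    f, c, r = triple
    if abs(f + c + r - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    x = rng.random()  # uniform in [0, 1)
    if x < f:
        return "fold"
    if x < f + c:
        return "call"
    return "raise"

rng = random.Random(0)  # seeded only to make the demo repeatable
counts = {"fold": 0, "call": 0, "raise": 0}
for _ in range(10_000):
    counts[choose_action((0.1, 0.6, 0.3), rng)] += 1
print(counts)
```

Over many draws the action frequencies approach the triple, while any single decision stays unpredictable.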

Page 10:

Building the system

• Pre-flop strategies: almost a zero-information guess!
• How do humans start: Sklansky’s rankings
– Starting hands are collected into 8 groups of similar cards (as far as poker is concerned), in decreasing order of strength
– Tuned for 10-player games, not considering opponent characteristics
• A rule-based system built on this information

Page 11:

Man as a hand-wavy standard

• Moving away from external information:
– Eliminate the use of human knowledge whenever possible
– Calculated information may be quantitative rather than qualitative
– The algorithmic approach can be applied to many different specific situations (such as having exactly six players in the game)

Page 12:

Rebuilding the system

• Roll-out simulations
– The pre-flop blinds are called by all players, who then check through to the showdown. The probability of winning with a given pair of cards then gives its income rate.
– Coarse
• Iterated roll-out simulations
– Income rates from the first simulation decide whether a player calls or folds pre-flop in the next one.
– These values stabilize

Page 13:

[Diagram: Hand Strength and Hand Potential → Effective Hand Strength → Think! → Probability Triple → Random Number Generator]

Page 14:

Hand Strength

• Hand Strength is the probability that a given hand is better than that of an active opponent
– How? Enumerate all possible hands the opponent could hold given the cards we can see, and count those that are better than, equal to, or worse than ours
• Extrapolate to n opponents by raising the found probability to the power n

HS_n = (HS_1)^n

Page 15:

Hand Potential

• Positive Potential (PPot): Over all possible ways the game can play out from the current hand, we compute how often Poki is currently behind but ends up winning.
• Negative Potential (NPot): Over all possible ways the game can play out from the current hand, we compute how often Poki is currently ahead but ends up losing.

Page 16:

[Diagram: Hand Strength and Hand Potential → Effective Hand Strength → Probability Triple → Random Number Generator]

Page 17:

Effective Hand Strength

Pr(win)
= Pr(ahead) × Pr(opponent does not improve) + Pr(behind) × Pr(we improve)
= HS × (1 − NPot) + (1 − HS) × PPot

Ignoring the negative potential (NPot = 0) gives the effective hand strength:

EHS = HS + (1 − HS) × PPot
EHS = HS_n + (1 − HS_n) × PPot (multiple opponents)
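The formulas on this slide translate directly into code. The sketch below assumes HS, PPot and NPot have already been computed elsewhere and are given as numbers in [0, 1]; the function names and the sample values are illustrative.

```python
def win_probability(hs, ppot, npot):
    """Pr(win) = HS * (1 - NPot) + (1 - HS) * PPot, against one opponent."""
    return hs * (1.0 - npot) + (1.0 - hs) * ppot

def effective_hand_strength(hs, ppot, n_opponents=1):
    """EHS = HS_n + (1 - HS_n) * PPot, with HS_n = HS ** n and NPot taken as 0."""
    hs_n = hs ** n_opponents
    return hs_n + (1.0 - hs_n) * ppot

# Made-up inputs: 60% hand strength, 20% positive potential
print(effective_hand_strength(0.6, 0.2))                 # one opponent
print(effective_hand_strength(0.6, 0.2, n_opponents=3))  # three opponents
```

The same hand is worth much less against three opponents, because HS_n shrinks quickly as n grows.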

Page 18:

[Diagram: Hand Strength and Hand Potential → Effective Hand Strength → Probability Triple → Random Number Generator]

Page 19:

Adding Sophistication

• Not all card pairs are equally likely at a given point in time
• Maintain a weighting table that stores, for each card pair an opponent may be holding, its probability at the given point in the game, depending on the history.
• Re-weighting: update this table on every move.

EHS_i = HS_i + (1 − HS_i) × PPot_i
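One way to picture re-weighting is a multiplicative update: each candidate holding is scaled by how likely the observed action would be with that holding, and the table is then renormalized. This is a hedged sketch, not Poki's actual update rule; the three holdings, the `toy_action_probs` numbers and the normalization step are all assumptions made for illustration.

```python
def reweight(weights, action, action_probs):
    """Scale each holding's weight by Pr(action | holding), then renormalize.

    weights: {holding: weight}
    action_probs: function holding -> {'fold': p, 'call': p, 'raise': p}
    """
    updated = {h: w * action_probs(h)[action] for h, w in weights.items()}
    total = sum(updated.values())
    if total == 0:
        return dict(weights)  # action was "impossible" for every holding; keep the table
    return {h: w / total for h, w in updated.items()}

def toy_action_probs(holding):
    # Hypothetical probability triples for three example holdings
    table = {
        "AA": {"fold": 0.0, "call": 0.2, "raise": 0.8},
        "KQ": {"fold": 0.1, "call": 0.6, "raise": 0.3},
        "72": {"fold": 0.8, "call": 0.15, "raise": 0.05},
    }
    return table[holding]

weights = {"AA": 1 / 3, "KQ": 1 / 3, "72": 1 / 3}
weights = reweight(weights, "raise", toy_action_probs)
print(weights)
```

After the opponent raises, strong holdings gain weight and weak ones lose it, which is the effect the weighting table is meant to capture.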

Page 20:

“No poker strategy is complete without a good opponent modeling system”

A neural net trained for an opponent is fed 19 game characteristics and outputs a probability triple for the opponent’s next action.

[Diagram: Inputs → Neural Net → Fold, Call, Bet]

Page 21:

There are other ways to make money

Page 22:

CASe based Poker playER

• Stores a large case base obtained through the simulation of other bots (Loki/Poki)
• For a particular situation, calculates a similarity value for each case and sorts the cases (quick sort)
• Takes the cases above a similarity threshold of 97%, or the top 20 (whichever is applicable)
• Finds the probability triple (f, c, r), i.e., the frequency of the various decisions taken in these cases.
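The retrieval steps above can be sketched as follows. The similarity measure, the two-number feature vectors and the reading of "whichever is applicable" (use the top 20 only when no case reaches 97%) are assumptions for illustration; the slides do not specify them. Python's built-in sort is used in place of quick sort.

```python
from collections import Counter

def similarity(a, b):
    """Toy similarity in [0, 1]: 1 minus the mean absolute feature difference.
    Assumption: features are already scaled to [0, 1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def retrieve_triple(case_base, query, threshold=0.97, top_k=20):
    """case_base: list of (features, decision), decision in {'fold', 'call', 'raise'}."""
    if not case_base:
        raise ValueError("case base is empty")
    scored = sorted(((similarity(f, query), d) for f, d in case_base),
                    key=lambda sd: sd[0], reverse=True)
    # Cases above the threshold, otherwise fall back to the top_k most similar
    chosen = [sd for sd in scored if sd[0] >= threshold] or scored[:top_k]
    counts = Counter(d for _, d in chosen)
    n = len(chosen)
    return tuple(counts[a] / n for a in ("fold", "call", "raise"))

# Hypothetical case base with two features per case
cases = [((0.9, 0.8), "raise"), ((0.9, 0.79), "raise"),
         ((0.88, 0.8), "call"), ((0.1, 0.2), "fold")]
print(retrieve_triple(cases, (0.9, 0.8)))   # three close cases pass the threshold
print(retrieve_triple(cases, (0.5, 0.5)))   # nothing passes, so all four cases are used
```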

Page 23:

CASe based Poker playER

• Performs well against other bots and against real opponents in play-money games
• Testing in real-money games was expensive!! Reasons given for this:
– Insufficient real-money cases
– Different strategies adopted by people

Page 24:

Evolving Adaptive Play

[Diagram: playing styles laid out along two axes, Loose–Tight and Passive–Aggressive, with a marker labelled “Evolution starts”]

A particular human trait is represented by a matrix that stores information such as the probability tuple for various game situations.

Page 25:

Evolution

• Matrices for the new generation are formed by randomizing/swapping some values in the matrices of the current one.
• The most promising matrices are selected through multiple game plays.
• The final set of matrices corresponds to the best solution in the current playing environment.
• Can adapt to any change in the strategy of other players
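The loop described above can be sketched as a small evolutionary algorithm. Everything concrete here is an assumption for illustration: each matrix row is taken to be a probability triple, mutation either swaps two values in a row or re-randomizes the row, and `toy_fitness` (distance to a made-up target) stands in for the real fitness, which the slides say comes from multiple game plays.

```python
import random

def mutate(matrix, rng, rate=0.2):
    """Return a copy of matrix in which some rows are swapped or re-randomized."""
    child = [list(row) for row in matrix]
    for row in child:
        if rng.random() >= rate:
            continue
        if rng.random() < 0.5:
            i, j = rng.sample(range(len(row)), 2)  # swap two values in the row
            row[i], row[j] = row[j], row[i]
        else:
            raw = [rng.random() + 1e-9 for _ in row]  # fresh random row
            total = sum(raw)
            row[:] = [x / total for x in raw]
    return child

def evolve(population, fitness, rng, generations=50):
    """Keep the fittest matrices among parents and mutated offspring."""
    size = len(population)
    for _ in range(generations):
        offspring = [mutate(m, rng) for m in population]
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        population = ranked[:size]
    return population

rng = random.Random(1)
target = [[0.1, 0.6, 0.3], [0.7, 0.2, 0.1]]  # made-up "ideal" strategy for the toy fitness

def toy_fitness(m):
    return -sum((a - b) ** 2 for r, t in zip(m, target) for a, b in zip(r, t))

start = [[[1 / 3, 1 / 3, 1 / 3] for _ in target] for _ in range(8)]
best_before = max(map(toy_fitness, start))
final = evolve(start, toy_fitness, rng)
print(best_before, toy_fitness(final[0]))
```

Because parents compete with their offspring, the best fitness in the population never gets worse from one generation to the next.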

Page 26:

Evolution: Martians can’t exist on Earth

W_tight(A_tight) > W_tight(A)
W_loose(A_loose) > W_loose(A)

W_tight(A_tight) > W_tight(A_loose)
W_loose(A_loose) > W_loose(A_tight)

W_x: Performance in environment ‘x’
A_y: Program developed in environment ‘y’

Human traits are generally not fixed, and their domain is not so small

Page 27:

Stereotypes

• People play with certain “prejudiced” strategies. Extensive statistics are collected to jot down possible stereotypes
• Early in a game, lack of data hampers effective opponent modeling: use stereotypes
• Extend the idea to the whole game.

Stereotypes are the game-play styles adopted by different people, recorded by watching a large number of games

Page 28:

A Façade is used to match the decisions taken by the player at each betting round. The stereotype with the least mean square deviation is chosen as the match.

The matched stereotype is then used to guess the player’s future actions.
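Matching by least mean square deviation can be sketched like this. The representation is an assumption: each stereotype, and the observed player, is described by one (fold, call, raise) frequency triple per betting round. The two stereotypes and all numbers are invented for the demo.

```python
def mean_square_deviation(observed, stereotype):
    """Both arguments: list of (f, c, r) triples, one per betting round."""
    diffs = [(o - s) ** 2
             for obs_round, st_round in zip(observed, stereotype)
             for o, s in zip(obs_round, st_round)]
    return sum(diffs) / len(diffs)

def match_stereotype(observed, stereotypes):
    """Return the name of the stereotype closest to the observed play."""
    return min(stereotypes,
               key=lambda name: mean_square_deviation(observed, stereotypes[name]))

# Hypothetical stereotypes: the same triple for each of the four betting rounds
stereotypes = {
    "tight-passive": [(0.7, 0.25, 0.05)] * 4,
    "loose-aggressive": [(0.1, 0.4, 0.5)] * 4,
}
observed = [(0.2, 0.4, 0.4), (0.1, 0.5, 0.4), (0.15, 0.35, 0.5), (0.1, 0.3, 0.6)]
print(match_stereotype(observed, stereotypes))
```

The matched stereotype's triples can then stand in for the player's own statistics until enough data has been collected.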

Page 29:

Poker and Game Theory

How to find the “optimal” strategy in a game of imperfect information such as poker?

Page 30:

Applications of Game Theory

• To mathematically capture behavior in strategic situations, in which an individual's success in making choices depends on the choices of others
• In an equilibrium, each player of the game has adopted a strategy that they have no incentive to change on their own, e.g. Nash Equilibrium applied to Climate Change Models

Page 31:

A One Card Poker

[Diagram: OPENER and DEALER, with the three cards ACE, DEUCE, TREY]

How is the game played?

Page 32:

A One Card Poker

[Diagram: OPENER and DEALER]

1. Dealer deals
2. Each player puts in $100
3. Check or Bet depending on how the other player plays!!

Page 33:

One card poker – decision tree

The tree goes to a maximum depth of 3

Page 34:

A One Card Poker – typical situation

[Diagram: OPENER and DEALER. One player holds the DEUCE; the other says “I Bet!!”, leaving the first to wonder “What to do??? Is he bluffing?”]

Page 35:

Assumption: Obvious Plays and Stupid Mistakes

1. Folding the trey (3)
2. Calling with the ace
3. Checking with the trey “in position”
4. Betting with the deuce

Page 36:

Strategic Plays and Expected Value

Consider the following variables:

p1 = probability the opener bluffs with the ace,

p2 = probability the opener calls with the deuce,

p3 = probability the opener bets with the trey,

q1 = probability the dealer bluffs with the ace,

q2 = probability the dealer calls with the deuce.

Page 37:

Opener’s post-ante expected value

• There are three possible non-zero post-ante results for the opener: he loses $100, wins $200, or wins $300. We will begin by computing the probabilities of each of these outcomes.

Case 1: The opener has the ace, the dealer has the deuce (2)
P(−$100) = p1q2, P($200) = p1(1 − q2), P($300) = 0

Case 2: The opener has the ace, the dealer has the trey (3)
P(−$100) = p1, P($200) = P($300) = 0

Page 38:

Opener’s post-ante expected value

Case 3: The opener has the deuce (2), the dealer has the ace
P(−$100) = 0, P($200) = 1 − q1, P($300) = q1p2

Case 4: The opener has the deuce (2), the dealer has the trey (3)
P(−$100) = p2, P($200) = P($300) = 0

Case 5: The opener has the trey (3), the dealer has the ace
P(−$100) = 0, P($200) = 1 − (1 − p3)q1, P($300) = (1 − p3)q1

Case 6: The opener has the trey (3), the dealer has the deuce (2)
P(−$100) = 0, P($200) = 1 − p3q2, P($300) = p3q2

Page 39:

Game Theoretic Analysis

The opener’s total Expected Value for the entire hand, in units of the $100 ante, is:

EV = [q1(3p2 − p3 − 1) + q2(p3 − 3p1) + (p1 − p2)] / 6

If q1 = q2 = 1/3, then EV = −1/18, and this does not depend on the opener’s choices of p1, p2, and p3.
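The expected value can be re-derived numerically from the six cases. The sketch below assumes each of the six deals is equally likely (probability 1/6), measures money in units of $100, and subtracts the one-unit ante from the post-ante result; the function names are illustrative.

```python
from fractions import Fraction as F

def opener_ev(p1, p2, p3, q1, q2):
    """Opener's EV for the whole hand, in units of $100, summed over the six deals."""
    # Each tuple is (P(lose 1), P(win 2), P(win 3)) for one deal, post-ante
    cases = [
        (p1 * q2, p1 * (1 - q2), 0),            # opener ace, dealer deuce
        (p1, 0, 0),                             # opener ace, dealer trey
        (0, 1 - q1, q1 * p2),                   # opener deuce, dealer ace
        (p2, 0, 0),                             # opener deuce, dealer trey
        (0, 1 - (1 - p3) * q1, (1 - p3) * q1),  # opener trey, dealer ace
        (0, 1 - p3 * q2, p3 * q2),              # opener trey, dealer deuce
    ]
    post_ante = F(1, 6) * sum(-lose + 2 * w2 + 3 * w3 for lose, w2, w3 in cases)
    return post_ante - 1  # subtract the ante of one unit

def closed_form(p1, p2, p3, q1, q2):
    """The formula from the slide."""
    return F(1, 6) * (q1 * (3 * p2 - p3 - 1) + q2 * (p3 - 3 * p1) + (p1 - p2))

third = F(1, 3)
print(opener_ev(F(1, 2), F(1, 4), F(3, 4), third, third))  # -1/18
print(opener_ev(0, 0, 0, third, third))                    # -1/18
```

Both calls give −1/18 even though the opener's strategies differ, which is the indifference property stated above.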

Page 40:

Optimal strategy: Game Theoretic Analysis

• The dealer has an advantage in the game: with q1 = q2 = 1/3 the opener’s EV is −1/18. The only way for the dealer to prevent the opener from seizing back some of this advantage is to play the indifferent strategy, q1 = q2 = 1/3.
• It is for this reason that the indifferent strategy is more commonly referred to as the “optimal” strategy.

Page 41:

Game Theory – How to win?

You cannot win with the optimal strategy, but you can make sure you don’t lose.

Page 42:

Game Theory – How to win?

• So the object of the game is not to play optimally. It is to spot the times when your opponent is not playing optimally, or even to induce him not to play optimally, to recognize the way in which he is deviating from optimality, and then to choose a non-optimal strategy for yourself which capitalizes on his mistakes. You must play non-optimally in order to win. To capitalize on your opponent’s mistakes, you must play in a way that leaves you vulnerable.

Page 43:

Game Theory – to the other games

             Perfect Information      Imperfect Information
No chance    Chess, Go                Inspection Game, Battleships
Chance       Backgammon, Monopoly     Poker

Interesting finds in Game Theoretical Poker Research:

• Gautam Rao, a poker expert, said about PsOpti: “You have a very strong program. Once you add opponent modeling to it, it will kill everyone.”
• In poker, knowing the basic approach of the opponent is essential, since it will dictate how to properly handle many situations that arise. Some players wrongly attributed intelligence where none was present.

Page 44:

References

• Billings, Davidson, Schaeffer, Szafron; The challenge of poker, 2002
• Billings, Davidson, Schaeffer, Szafron; Opponent modeling in poker, 1998
• Luigi Barone, Lyndon While; Evolving Adaptive play for simplified poker, 1998
• Watson, Rubin; Case Based Poker Bot, 2008
• Layton, Vamplew, Turville; Using stereotypes to improve early match poker play, 2008
• Jason Swanson; Game Theory and Poker, 2005
• D. Billings, N. Burch, A. Davidson, R. Holte, J. Schaeffer, T. Schauenberg, D. Szafron; Approximating Game-Theoretic Optimal Strategies for Full-scale Poker