cse 40171: artificial intelligence - wjscheirer.comalpha-beta pruning properties slide credit: dan...
TRANSCRIPT
![Page 1: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/1.jpg)
Adversarial Search: Alpha-Beta Pruning; Imperfect Decisions
CSE 40171: Artificial Intelligence
�32
![Page 2: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/2.jpg)
�33
Homework #3 is due tonight at 11:59PM
![Page 3: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/3.jpg)
What limitations does minimax have?
![Page 4: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/4.jpg)
�35
? ? ?
-1 -2 4 9
4
min
max
-2 4
Image credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 5: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/5.jpg)
Resource Limits
Problem: In realistic games, cannot search to leaves!
Solution: Depth-limited search
‣ Instead, search only to a limited depth in the tree ‣ Replace terminal utilities with an evaluation function for
non-terminal positions
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 6: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/6.jpg)
Resource Limits
Example:
‣ Suppose we have 100 seconds and can explore 10K nodes / sec
‣ This means we can check 1M nodes per move
‣ 𝝰-𝝱 pruning reaches about depth 8 – decent chess program
Guarantee of optimal play is gone
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 7: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/7.jpg)
Depth Matters
Evaluation functions are always imperfect
The deeper in the tree the evaluation function is buried, the less the quality of the evaluation function matters
An important example of the tradeoff between complexity of features and complexity of computation
![Page 8: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/8.jpg)
Game Tree Pruning
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 9: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/9.jpg)
Minimax Example
3 12 8 2 4 6 14 5 2
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 10: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/10.jpg)
Motivating Example
Image credit: Russell and Norvig
![Page 11: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/11.jpg)
Minimax Pruning
3 12 8 2 14 5 2
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 12: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/12.jpg)
Alpha-Beta Pruning
When applied to a standard minimax tree, it returns the same move as minimax would
But always prunes away branches that cannot possibly influence the final decision
![Page 13: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/13.jpg)
What are alpha and beta?
𝝰 = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
𝝱 = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.
![Page 14: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/14.jpg)
General Configuration (MIN Version)
MAX
MIN
MAX
MIN
a
n
• We’re computing the MIN-VALUE at some node n
• We’re looping over n’s children • n’s estimate of the childrens’ min is
dropping • Who cares about n’s value? MAX • Let a be the best value that MAX
can get at any choice point along the current path from the root
• If n becomes worse than a, MAX will avoid it, so we can stop considering n’s other children (it’s already bad enough that it won’t be played)
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 15: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/15.jpg)
General Configuration (MAX Version)
MAX
MIN
MAX
MIN
a
n
The MAX version is simply symmetric
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 16: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/16.jpg)
Alpha-Beta Implementation
α: MAX’s best option on path to root β: MIN’s best option on path to root
def max-value(state, α, β): initialize v = -∞ for each successor of state:
v = max(v, value(successor, α, β)) if v ≥ β return v α = max(α, v)
return v
def min-value(state, α, β): initialize v = +∞ for each successor of state:
v = min(v, value(successor, α, β)) if v ≤ α return v β = max(β, v)
return v
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 17: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/17.jpg)
Alpha-Beta Pruning Properties
10 10 0
max
min
This pruning has no effect on minimax value computed for the root!
Values of intermediate nodes might be wrong ‣ Important: children of the root may have
the wrong value ‣ So the most naive version won’t let you
do action selection
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 18: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/18.jpg)
Demo: Minimax + Alpha-Beta Pruning
https://www.youtube.com/watch?v=_bEQJKXZ1-U
![Page 19: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/19.jpg)
Alpha-Beta Pruning Properties
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
10 10 0
max
min
Good child ordering improves effectiveness of pruning
With “perfect ordering”:
‣ Time complexity drops to O(bm/2) ‣ Doubles solvable depth! ‣ Full search of, e.g., chess, is still hopeless…
This is a simple example of metareasoning (computing about what to compute)
![Page 20: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/20.jpg)
Evaluation Functions
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 21: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/21.jpg)
It turns out that alpha-beta pruning isn’t so good…
It must search all the way to terminal states for at least a portion of the search space
Claude Shannon BY-NC-SA 2.0 tericee
This is usually not practical, because we need to play the game in a reasonable amount of time
Shannon’s suggestion: cutoff earlier via a heuristic evaluation function
![Page 22: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/22.jpg)
Cutoff Test
H-MINIMAX(s, d) =
EVAL(S) if CUTOFF-TEST(s, d)
maxα∈Actions(s) = H-MINMAX(RESULT(s, α), d + 1) if PLAYER(s) = MAX
min∈Actions(s) = H-MINMAX(RESULT(s, α), d + 1) if PLAYER(s) = MIN{
![Page 23: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/23.jpg)
Evaluation Functions
Evaluation functions score non-terminals in depth-limited search
Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188
![Page 24: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/24.jpg)
Evaluation Functions
Ideal function: returns the actual minimax value of the position
In practice: typically weighted linear sum of features:
e.g. f1(s) = (num white queens – num black queens), etc.
![Page 25: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/25.jpg)
Be wary of simple approaches
Heuristic: Material Advantage
Image credit: Russell and Norvig
![Page 26: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/26.jpg)
Be wary of simple approaches
Probable win for black
Image credit: Russell and Norvig
![Page 27: CSE 40171: Artificial Intelligence - wjscheirer.comAlpha-Beta Pruning Properties Slide credit: Dan Klein and Pieter Abbeel, UC Berkeley CS 188 10 10 0 max min Good child ordering](https://reader036.vdocuments.site/reader036/viewer/2022081400/5f0ad3c07e708231d42d890a/html5/thumbnails/27.jpg)
Be wary of simple approaches
Image credit: Russell and Norvig