SlugGo: A Computer Baduk Program

Presenter: Ling Zhao

April 4, 2006

by David G Doshay, Charlie McDowell

Introduction

SlugGo is a parallel version of Gnu Go with many enhancements.

The name comes from the banana slug, the mascot of UCSC.

One of the strongest Go programs in the world.

SlugGo Vs. Gnu Go

Parallel computing.

Whole-board lookahead (Gnu Go uses local lookahead, and only for capture races and ladders).

New algorithm to determine the best move.

Basic Idea

Exhaustive search in Go is not feasible on one CPU. How about 24 CPUs?

The deeper the search is, the more accurate the evaluation will be.

Two-step approach:

1. Generate candidate moves.

2. Do a global search for each move.
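The two steps above can be sketched as follows. This is a minimal illustration, not SlugGo's code: `pick_move`, the `generate` and `evaluate` callables, and the toy game tree are all hypothetical names introduced here. The lookahead is linear, meaning each candidate is followed along the generator's top suggestion only.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: generate(position) returns (move, next_position)
# pairs, best suggestion first; evaluate(position) scores a position for us.
Generate = Callable[[str], List[Tuple[str, str]]]
Evaluate = Callable[[str], float]


def pick_move(position: str, generate: Generate, evaluate: Evaluate,
              depth: int) -> Tuple[str, float]:
    """Step 1: list candidate moves. Step 2: look ahead along each one."""
    best_move, best_score = None, float("-inf")
    for move, pos in generate(position):          # step 1: candidates
        for _ in range(depth - 1):                # step 2: linear lookahead
            replies = generate(pos)
            if not replies:
                break
            pos = replies[0][1]                   # follow the top suggestion
        score = evaluate(pos)                     # static evaluation at the end
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score


# Toy game tree (made up for illustration): "a" looks best immediately,
# but the expected reply makes it worse than "b".
TREE = {"root": [("a", "A"), ("b", "B")], "A": [("x", "A1")], "B": [("y", "B1")]}
VALUE = {"A": 5.0, "B": 3.0, "A1": -5.0, "B1": 2.0}


def toy_generate(position: str) -> List[Tuple[str, str]]:
    return TREE.get(position, [])


def toy_evaluate(position: str) -> float:
    return VALUE[position]
```

With this toy tree, a depth of 1 prefers "a", while a depth of 2 prefers "b", which is the kind of correction deeper lookahead is meant to provide.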

Parallel Computing

A cluster of Apple G4/G5 CPUs (11, 26 or 72 processors).

MPI is used for parallel programming.

Master-slave architecture with a task queue.
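The task-queue arrangement can be sketched roughly as below. SlugGo uses MPI across a cluster; this sketch swaps in Python threads and `queue.Queue` purely to stay self-contained, so it shows the shape of the pattern rather than the actual implementation. The function name `run_master_worker` and the `evaluate` callable are assumptions.

```python
import queue
import threading
from typing import Callable, Dict, Hashable, Iterable


def run_master_worker(tasks: Iterable[Hashable],
                      evaluate: Callable[[Hashable], float],
                      n_workers: int = 4) -> Dict[Hashable, float]:
    """The master fills a queue with tasks; workers pull until it is empty."""
    task_q: "queue.Queue" = queue.Queue()
    for task in tasks:                 # master: enqueue one task per candidate
        task_q.put(task)

    results: Dict[Hashable, float] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                task = task_q.get_nowait()
            except queue.Empty:
                return                 # no work left, this worker stops
            value = evaluate(task)     # stand-in for one lookahead job
            with lock:
                results[task] = value  # report the result back to the master

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()                  # master waits for the slowest worker
    return results
```

Because the master waits on `join()`, the total time is set by the slowest worker, not the average one.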

Move Generation

Gnu Go move generator provides a list of moves with values (ggValue) using pattern matching and local search: these are the possible moves.

SlugGo selects a subset of the list for global search: these are the candidate moves.

Parallel lookahead for each move using Gnu Go move generator.

Static evaluation at the end: influence and territory.

Final value of a move is a function of ggValue, influence, territory, and boostValue.
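The slides do not give the form of this function. One plausible reading, given that the experiments tune weights for the four terms, is a weighted sum. The sketch below assumes exactly that; the weight values and the names `MoveScores` and `final_value` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class MoveScores:
    gg_value: float      # value from the Gnu Go move generator
    boost_value: float   # proximity-based boost
    influence: float     # static influence after lookahead
    territory: float     # static territory after lookahead


# Hypothetical weights, not taken from the presentation.
WEIGHTS = MoveScores(gg_value=1.0, boost_value=0.5, influence=0.25, territory=0.25)


def final_value(s: MoveScores, w: MoveScores = WEIGHTS) -> float:
    """Assumed linear combination of the four (already normalized) terms."""
    return (w.gg_value * s.gg_value
            + w.boost_value * s.boost_value
            + w.influence * s.influence
            + w.territory * s.territory)
```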

Boost Value

Go proverb: Your opponent’s move is your own.

SlugGo: Your best move may be close to your opponent’s best move.

boostValue = ggValue of the opponent’s move * proximity factor

Boost Value

The maximum of the applicable boost values is chosen as boostValue.
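A small sketch of this rule follows. The slides do not define the proximity factor or which opponent moves count as applicable, so both are assumptions here: the factor decays with Manhattan distance, and every opponent move is treated as applicable.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]


def proximity_factor(a: Point, b: Point) -> float:
    """Hypothetical factor: 1 on the same point, smaller farther away."""
    distance = abs(a[0] - b[0]) + abs(a[1] - b[1])
    return 1.0 / (1 + distance)


def boost_value(our_move: Point, opponent_moves: Dict[Point, float]) -> float:
    """Maximum over opponent moves of ggValue * proximity factor."""
    return max(
        (gg * proximity_factor(our_move, point)
         for point, gg in opponent_moves.items()),
        default=0.0,  # no opponent moves means no boost
    )
```

For example, with opponent moves `{(3, 3): 8.0, (10, 10): 20.0}`, our move at `(3, 4)` gets a boost of 4.0 from the nearby move rather than from the larger but distant one.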

Candidate List

One move list for us with boostValue added to ggValue.

One move list for opponent without boostValue.

Combine two lists and select the best N moves.

The best move from opponent’s list is always chosen (if legal).
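The selection steps above might look like the following sketch. How duplicates across the two lists are merged is not stated in the slides; keeping the higher value is an assumption, as are the function name `select_candidates` and the `is_legal` callback.

```python
from typing import Callable, Dict, List


def select_candidates(our_moves: Dict[str, float],
                      their_moves: Dict[str, float],
                      boosts: Dict[str, float],
                      n: int,
                      is_legal: Callable[[str], bool] = lambda m: True) -> List[str]:
    """Merge both lists, keep the best n, and force in the opponent's best move."""
    # Our list: ggValue plus boostValue. The opponent's list: ggValue only.
    combined = dict(their_moves)
    for move, gg in our_moves.items():
        boosted = gg + boosts.get(move, 0.0)
        combined[move] = max(boosted, combined.get(move, float("-inf")))

    ranked = sorted((m for m in combined if is_legal(m)),
                    key=lambda m: combined[m], reverse=True)
    chosen = ranked[:n]

    # The opponent's best move is always a candidate if we may play it.
    if their_moves:
        their_best = max(their_moves, key=their_moves.get)
        if is_legal(their_best) and their_best not in chosen:
            chosen = chosen[:n - 1] + [their_best]
    return chosen
```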

Influence and Territory

Gnu Go influence function: Bouzy’s algorithm using mathematical morphology.

Use the influence function to estimate the probability that each point is controlled by one side; the sum of these probabilities is the territory score.
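The summation step can be illustrated as below. This is not Bouzy's algorithm, and the slides do not say how influence is turned into a probability; the ratio used here is an assumption chosen only to make the sum concrete.

```python
from typing import List

Grid = List[List[float]]


def territory_score(own_influence: Grid, other_influence: Grid) -> float:
    """Sum, over all points, of the assumed probability that we control the point."""
    total = 0.0
    for own_row, other_row in zip(own_influence, other_influence):
        for own, other in zip(own_row, other_row):
            if own + other > 0:
                # Assumed mapping: our share of the total influence on the point.
                total += own / (own + other)
            # Points with no influence from either side contribute nothing.
    return total
```

On a 2x2 toy grid with own influence `[[4, 0], [1, 0]]` and opposing influence `[[0, 2], [1, 0]]`, the points contribute 1, 0, 0.5 and 0.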

Temporal Factors for Influence and Territory

Final Move Selection

Smart Go bases the decision on the state of the final lookahead configuration.

SlugGo considers the initial evaluation as well:

All values have been normalized.

Branching

Original design: linear lookahead.

Linear branching:

24 CPUs: 24 candidate moves for linear lookahead.

4 candidate moves for 6-way branching in one of the lookahead steps.

8 candidate moves for 3-way branching in one of the lookahead steps.
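The three configurations all appear to fill the same 24 CPUs, assuming each resulting lookahead line occupies one processor. That reading is an inference from the numbers, and the helper name below is made up.

```python
import math
from typing import Sequence


def lookahead_lines(candidates: int, branching: Sequence[int] = ()) -> int:
    """Lines explored: candidates times each branching factor along the way."""
    return candidates * math.prod(branching)


# The three setups listed above.
linear = lookahead_lines(24)          # no branching
six_way = lookahead_lines(4, [6])     # one 6-way branching step
three_way = lookahead_lines(8, [3])   # one 3-way branching step
```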

Cache

The cache stores game positions and the moves Gnu Go suggested for them.

Only a 10% increase in performance.

Reason 1: Savings in parallel computing are determined by the weakest link.

Reason 2: Huge game tree.
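A rough sketch of both points follows. `MoveCache` is a hypothetical position-to-moves cache, and `wall_clock` models Reason 1 under the assumption that a parallel step finishes only when its slowest worker does.

```python
from typing import Callable, Dict, Hashable, List, Sequence


class MoveCache:
    """Remembers the moves suggested for positions that were already seen."""

    def __init__(self, generate: Callable[[Hashable], List[str]]) -> None:
        self._generate = generate          # stand-in for the move generator
        self._store: Dict[Hashable, List[str]] = {}
        self.hits = 0

    def suggest(self, position: Hashable) -> List[str]:
        if position in self._store:
            self.hits += 1                 # reuse the earlier suggestion
        else:
            self._store[position] = self._generate(position)
        return self._store[position]


def wall_clock(worker_seconds: Sequence[float]) -> float:
    """A parallel step takes as long as its slowest worker."""
    return max(worker_seconds)
```

If cache hits halve the work on three of four workers, `wall_clock([5, 5, 5, 10])` is still 10, the same as `wall_clock([10, 10, 10, 10])`, so the step gets no faster.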

Experimental Results

Lookahead depth: up to 16.

Candidate moves: up to 26.

Branching during lookahead.

Weights for ggValue, boostValue, influence and territory to compute the final score.

Normalization of ggValue.

Boost factors.

Weights to compute influence and territory.

SlugGo Vs. Gnu Go

91% winning percentage with a 2-stone handicap.

Evil-twin syndrome:

SlugGo knows almost exactly Gnu Go’s response to any move.

SlugGo looks beyond Gnu Go’s horizon.

SlugGo vs. Many Faces of Go

Normalize ggValue

Normalize ggValue(us) and ggValue(them).

Approach 1: Combine and normalize: can be disastrous.

Approach 2: Normalize each list individually.

Hybrid approach: if MAX(ggValue(us)) > MAX(ggValue(them)) use approach 1, otherwise use approach 2.

Hybrid approach is better than approach 2.
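The three approaches can be written out as below. The slides do not say what "normalize" means numerically; dividing by the largest value is an assumption, and the function names are introduced here for illustration. Both lists are assumed to be non-empty.

```python
from typing import Dict, Optional, Tuple

Values = Dict[str, float]


def normalize(values: Values, scale: Optional[float] = None) -> Values:
    """Divide by `scale`, or by the list's own maximum (assumed method)."""
    top = max(values.values()) if scale is None else scale
    if top <= 0:
        return dict(values)                # nothing sensible to scale by
    return {move: v / top for move, v in values.items()}


def approach_1(us: Values, them: Values) -> Tuple[Values, Values]:
    """Combine and normalize: one shared scale for both lists."""
    top = max(max(us.values()), max(them.values()))
    return normalize(us, top), normalize(them, top)


def approach_2(us: Values, them: Values) -> Tuple[Values, Values]:
    """Normalize each list individually."""
    return normalize(us), normalize(them)


def hybrid(us: Values, them: Values) -> Tuple[Values, Values]:
    """Shared scale only when our best move outranks the opponent's best."""
    if max(us.values()) > max(them.values()):
        return approach_1(us, them)
    return approach_2(us, them)
```

Under this assumed method, the hybrid rule keeps our top move at 1.0 in both branches, so our list is never scaled down by a larger opponent value.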

Branching

No branching is best, and there is no need for more than 20 candidate moves.

Lookahead Depth

Deep search may decrease the chance of correct prediction.

Future Work

Machine learning to adjust parameters.

Why does branching reduce strength?

Thinking on the opponent’s time.

Parallelize Gnu Go move generation steps.

Conclusions

Whole-board lookahead based on a Go proverb is successful.

If lookahead explores the path that the game is likely to follow, it can increase strength substantially.

Branching and deep search seem to be of no use.

No need for more than 24 CPUs.

Some experiments are missing.