CP - 2001
Formal Models of Heavy-Tailed Behavior in Combinatorial Search
Hubie Chen, Carla P. Gomes, and Bart Selman
{hubes,gomes,selman}@cs.cornell.edu
Department of Computer Science
Cornell University
Background
Randomized backtrack search methods show high variability in run time, even on a fixed instance:
Heavy-tailed behavior (Gomes et al., CP '97, JAR '00)
New insights into the design of search algorithms: restart strategies
Randomization and restart strategies are now an integral part of state-of-the-art SAT solvers (Chaff, GRASP, RELSAT, SATZ-Rand)
Goals
Our goals:
• Formal analysis of tree search models: show under what conditions heavy-tailed distributions can and cannot arise.
• Understand when restart strategies are and are not effective.
Research on heavy tails in search has thus far been largely based on empirical studies.
Intuition
How does heavy-tailed behavior arise?
• The procedure is characterized by a large variability, which leads to highly different trees from run to run.
• Wrong branching decisions may lead the search procedure to explore exponentially large subtrees of the search space containing no solutions.
• A lucky sequence of good branching decisions may lead the search to find a solution after exploring only a small subtree.
Intuition Pump: Restarts
When are restarts effective?
Suppose a search procedure requires (on inputs of size n):
• Time p(n) (for a polynomial p) with probability ½
• Time 2^n with probability ½
No restarts: expected time is exponential, equal to ½ * (p(n) + 2^n).
Restart with time interval p(n): expected time drops to polynomial, equal to 2*p(n).
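The arithmetic on this slide can be checked with a minimal sketch. The polynomial `poly` (n squared here) and the function names are illustrative assumptions, not taken from the talk; the sketch also assumes that every failed run is cut off after exactly p(n) steps.

```python
from fractions import Fraction


def poly(n):
    """Hypothetical polynomial p(n), chosen only for illustration."""
    return n ** 2


def expected_no_restart(n, p):
    """Half of the runs cost p(n), the other half cost 2^n."""
    return Fraction(1, 2) * (p(n) + 2 ** n)


def expected_with_restart(n, p, success_prob=Fraction(1, 2)):
    """Cutoff p(n): a geometric number of runs, each costing p(n)."""
    # The expected number of runs until the first success is 1 / success_prob.
    return p(n) / success_prob


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, expected_no_restart(n, poly), expected_with_restart(n, poly))
```

The restart figure stays at 2*p(n) because each run independently succeeds within the cutoff with probability ½.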
Outline of Talk
• Empirical evidence of Heavy-Tailed behavior
• Tree Search Models
• Balanced Tree Search Model
• Imbalanced Tree Search Model
• Bounded Heavy-Tailed Behavior: finite distributions
Empirical Evidence of Heavy-Tailed Behavior
Quasigroups or Latin Squares: An Abstraction for Real-World Applications
[Figure: quasigroup or Latin square of order 4; quasigroup with 32% preassignment (Gomes and Selman 96)]
A quasigroup is an n-by-n matrix such that each row and column is a permutation of the same n colors.
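The definition above translates directly into a small check. This is a sketch only; the function name `is_latin_square` and the list-of-lists representation are assumptions made for illustration.

```python
def is_latin_square(grid):
    """Return True if every row and column is a permutation of the same n symbols."""
    n = len(grid)
    if n == 0 or any(len(row) != n for row in grid):
        return False
    symbols = set(grid[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


if __name__ == "__main__":
    # Cyclic square of order 4: row r is the colors 0..3 shifted by r.
    square = [[(r + c) % 4 for c in range(4)] for r in range(4)]
    print(is_latin_square(square))
```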
Randomized Backtrack Search
(*) no solution found - reached cutoff: 2000
Time: (*)3011 (*)7
Easy instance – 15 % preassigned cells
Gomes et al. 97
Erratic Behavior of Search Cost: Quasigroup Completion Problem
[Figure: sample mean of search cost against number of runs. Annotations: median = 1; sample mean = 3500.]
Heavy-Tailed Distributions
• Infinite variance, and possibly infinite mean
• Introduced by Pareto in the 1920s as a “probabilistic curiosity.”
• Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena.
• Examples: stock-market, earthquakes, weather, web traffic...
Decay of Distributions
Standard distribution (finite mean and variance): exponential decay, e.g. Normal:
$\Pr[X > x] \sim C e^{-x^2}$, for some $C > 0$
Heavy-tailed: power-law decay, e.g. Pareto-Levy:
$\Pr[X > x] \sim C x^{-\alpha}$, $x > 0$
[Figure: tail of a standard distribution (exponential decay) compared with a heavy-tailed distribution (power-law decay).]
Visualization of Heavy-Tailed Behavior
A log-log plot of the tail of the distribution should be approximately linear. The slope gives the value of $\alpha$.
$\alpha \leq 1$: infinite mean and infinite variance
$1 < \alpha < 2$: infinite variance
[Figure: log-log plot of the unsolved fraction, 1 - F(x), against the number of backtracks, for curves with $\alpha$ = 0.466, 0.319, and 0.153; $\alpha < 1$ implies infinite mean. Annotations: 18% unsolved; 0.002% unsolved.]
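Reading the slope off a log-log tail plot can be sketched as an ordinary least-squares fit of log(1 - F(x)) against log(x). The least-squares choice and the name `estimate_tail_index` are assumptions for illustration; the talk does not say how its slopes were fitted.

```python
import math


def estimate_tail_index(samples):
    """Estimate alpha as minus the slope of log(1 - F(x)) against log(x).

    Assumes all samples are positive. The empirical survival value at the
    k-th smallest sample is the fraction of samples that are at least as large.
    """
    xs = sorted(samples)
    n = len(xs)
    log_x = [math.log(x) for x in xs]
    log_s = [math.log((n - k) / n) for k in range(n)]
    mean_x = sum(log_x) / n
    mean_s = sum(log_s) / n
    cov = sum((u - mean_x) * (v - mean_s) for u, v in zip(log_x, log_s))
    var = sum((u - mean_x) ** 2 for u in log_x)
    return -cov / var


if __name__ == "__main__":
    # Synthetic data placed on exact Pareto quantiles with alpha = 0.5.
    n, alpha = 1000, 0.5
    data = [((n - k) / n) ** (-1 / alpha) for k in range(n)]
    print(estimate_tail_index(data))
```

On real run-time data the far tail is noisy, so a fit like this is only a rough guide to the regime the data falls in.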
Exploiting Heavy-Tailed Behavior
Heavy-tailed behavior has been observed in several domains: QCP, graph coloring, planning, scheduling, circuit synthesis, decoding, etc.
Consequence for algorithm design: use restarts or parallel / interleaved runs to exploit the extreme variance in performance.
Restarts provably eliminate heavy-tailed behavior (Gomes et al. 2000)
[Figure: unsolved fraction, 1 - F(x), against the number of backtracks (log scale). Annotations: 70% unsolved; 250 (62 restarts); 0.001% unsolved.]
Tree Search Models: Balanced Tree Model
Balanced Tree Model, Described
Trees
• All leaves occur at the same depth
• Branching factor 2
• Exactly one “satisfying” leaf
Search algorithm
• Chronological backtrack search model
• Random child selection with no propagation mechanisms
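The model described above can be simulated with a short sketch: a depth-first search over an implicit binary tree that picks the child order at random and counts the leaves it visits. The tuple encoding of the satisfying leaf and the name `backtrack_leaf_count` are assumptions for illustration.

```python
import random


def backtrack_leaf_count(depth, solution, rng):
    """Chronological backtracking with random child order.

    `solution` is a tuple of `depth` bits naming the single satisfying leaf.
    Returns the number of leaves visited, including the satisfying one.
    """
    visited = 0

    def dfs(prefix):
        nonlocal visited
        if len(prefix) == depth:
            visited += 1
            return prefix == solution
        children = [0, 1]
        rng.shuffle(children)  # random child selection, no propagation
        for bit in children:
            if dfs(prefix + (bit,)):
                return True
        return False

    dfs(())
    return visited


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [backtrack_leaf_count(6, (0, 1, 1, 0, 1, 0), rng) for _ in range(10)]
    print(runs)  # values between 1 and 2**6
```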
Balanced Tree Model: Analysis
Let $T(n)$ denote the runtime: the number of leaf nodes visited (including the “satisfying” leaf) on a tree of depth $n$.
Let $X_i$ denote the choice at the (unique) node above the satisfying leaf at depth $i$: 1 = bad choice, 0 = good choice. Then,
$T(n) = X_1 2^{n-1} + \cdots + X_i 2^{n-i} + \cdots + X_n 2^0 + 1$
There is exactly one zero-one assignment to the variables $X_1, \ldots, X_n$ for each possible value of $T(n)$; any such assignment has probability $1/2^n$.
$T(n)$ has a uniform distribution:
$P[T(n) = i] = \frac{1}{2^n}, \quad i = 1, \ldots, 2^n$
[Figure: two example search trees, one with T = 4 and one with T = 64.]
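The uniformity claim can be checked by brute force for small n: enumerate every zero-one assignment to the branching choices and evaluate the runtime sum. This is an illustrative sketch; the function names are assumptions.

```python
from collections import Counter
from itertools import product


def runtime(bits):
    """T(n) = sum over i of X_i * 2^(n - i), plus 1 for the satisfying leaf."""
    n = len(bits)
    return sum(x * 2 ** (n - i) for i, x in enumerate(bits, start=1)) + 1


def runtime_distribution(n):
    """Count how many assignments of (X_1, ..., X_n) yield each runtime."""
    return Counter(runtime(bits) for bits in product((0, 1), repeat=n))


if __name__ == "__main__":
    dist = runtime_distribution(6)
    # Each value 1..64 should be produced by exactly one assignment.
    print(sorted(dist) == list(range(1, 65)), set(dist.values()))
```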
Balanced Tree Model: Distribution
• The expected run time and variance scale exponentially in the height of the search tree (number of variables):
$E[T(n)] = \frac{1 + 2^n}{2}$
$V[T(n)] = \frac{2^{2n} - 1}{12}$
• The run time distribution is uniform: its shape is not heavy-tailed.
(see paper for formal proofs)
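As a quick sanity check, the mean and variance of a uniform distribution on 1, ..., 2^n can be computed exactly and compared with (1 + 2^n)/2 and (2^(2n) - 1)/12. The helper name below is an assumption for illustration.

```python
from fractions import Fraction


def uniform_runtime_moments(n):
    """Exact mean and variance of T(n), uniform on 1..2^n."""
    size = 2 ** n
    values = range(1, size + 1)
    mean = Fraction(sum(values), size)
    second_moment = Fraction(sum(v * v for v in values), size)
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    for n in (1, 5, 10):
        mean, var = uniform_runtime_moments(n)
        print(n, mean == Fraction(1 + 2 ** n, 2), var == Fraction(2 ** (2 * n) - 1, 12))
```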
Balanced Tree Model: Restarts
Restart strategies are not effective for this model: no restart strategy yields expected polynomial time.
Define a restart strategy to be a sequence of times $t_1(n), t_2(n), t_3(n), \ldots$
It is applied to a search procedure by running the procedure for time $t_1(n)$; restarting and running for time $t_2(n)$; and so on, until a solution is found.
Luby et al. (IPL '93) show that optimal performance (minimum expectation) is obtained by a purely uniform restart strategy:
$t_1(n) = t_2(n) = t_3(n) = \cdots$
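To see why no uniform cutoff helps here, the expected total time can be computed exactly. The sketch assumes that time is measured in leaves visited, that each restart is an independent run with T(n) uniform on 1, ..., 2^n, and that a failed run costs exactly the cutoff; the function name is hypothetical.

```python
from fractions import Fraction


def expected_time_uniform_restart(n, cutoff):
    """Expected leaves visited with a fixed cutoff on the balanced tree model."""
    size = 2 ** n
    if not 1 <= cutoff <= size:
        raise ValueError("cutoff must lie in 1..2^n")
    q = Fraction(cutoff, size)                  # P[a single run succeeds]
    expected_failures = (1 - q) / q             # geometric number of failed runs
    success_cost = Fraction(cutoff + 1, 2)      # E[T | T <= cutoff]
    return expected_failures * cutoff + success_cost


if __name__ == "__main__":
    n = 4
    best = min(expected_time_uniform_restart(n, t) for t in range(1, 2 ** n + 1))
    print(best)  # compare with the no-restart mean (1 + 2^n) / 2
```

Under these assumptions the expression simplifies to 2^n - (cutoff - 1)/2, so every cutoff gives an expected time of at least (1 + 2^n)/2, the no-restart mean.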
Balanced Tree Model
What sort of improvements can be made to an algorithm so that its behavior is not like backtracking in the balanced tree model?
• Very clever search heuristics that lead quickly to the solution node, but that is hard in general.
• A combination of pruning, propagation, and dynamic variable ordering: prune subtrees that do not contain the solution, allowing for runs that are short.
Resulting trees may vary dramatically from run to run.
Tree Search Models: Imbalanced Tree Model
Imbalanced Tree Model
The algorithm requires time $b^i$ with probability $(1-p)p^i$.
Intuition: lower $p$ corresponds to “smarter” search.
Let $T$ denote the runtime of the algorithm: the number of leaf nodes visited up to and including the successful node.
$P[T = b^i] = (1-p)p^i, \quad i \geq 0$
[Figure: imbalanced search tree with b = 2.]
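A runtime can be drawn from this distribution by counting consecutive "bad" events, each occurring with probability p, and raising b to that count. This is a sketch of the distribution only, not of the underlying tree search; `sample_runtime` is a hypothetical name.

```python
import random


def sample_runtime(b, p, rng):
    """Draw T with P[T = b^i] = (1 - p) * p^i for i >= 0."""
    i = 0
    while rng.random() < p:  # continue with probability p
        i += 1
    return b ** i


if __name__ == "__main__":
    rng = random.Random(1)
    samples = [sample_runtime(2, 0.5, rng) for _ in range(20000)]
    print(samples.count(1) / len(samples), max(samples))
```

Most draws are tiny, while the occasional draw is enormous, which is the erratic sample-mean behavior the heavy-tailed regimes describe.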
Imbalanced Tree Model: Three Regimes of Behavior
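Regime 1 ($p < 1/b^2$): finite expected time, finite variance
Regime 2 ($1/b^2 \leq p < 1/b$): finite expected time, infinite variance
Regime 3 ($p \geq 1/b$): infinite expected time, infinite variance
Tail: $P[T \geq L] \geq p^2 L^{\log_b p} = C L^{-\alpha}$, with $C = p^2$ and $\alpha = \log_b(1/p)$; when $p > 1/b^2$ we have $\alpha < 2$.
(see paper for formal proofs)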
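The three regimes follow from two geometric series: E[T] is the sum of (1-p)(pb)^i and E[T^2] is the sum of (1-p)(pb^2)^i, which converge exactly when pb < 1 and pb^2 < 1 respectively. The sketch below evaluates the closed forms; the function name is an assumption, and `math.inf` stands in for a divergent series.

```python
import math
from fractions import Fraction


def runtime_moments(b, p):
    """Return (mean, variance) of T with P[T = b^i] = (1 - p) p^i, i >= 0."""
    # E[T] converges iff p * b < 1.
    mean = (1 - p) / (1 - p * b) if p * b < 1 else math.inf
    # E[T^2] converges iff p * b^2 < 1.
    second = (1 - p) / (1 - p * b * b) if p * b * b < 1 else math.inf
    variance = second - mean ** 2 if second != math.inf else math.inf
    return mean, variance


if __name__ == "__main__":
    for p in (Fraction(1, 8), Fraction(1, 3), Fraction(1, 2)):
        print(p, runtime_moments(2, p))
```

With b = 2 the three sample values of p fall in regimes 1, 2, and 3 respectively.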
Bounded Imbalanced Tree Model
Unbounded model: a single infinite distribution.
$P[T = b^i] = (1-p)p^i, \quad i \geq 0$
Bounded model: an infinite number of distributions, one for each $n$. Arises from truncating successively larger finite segments of the unbounded distribution.
Given that:
$\sum_{i=0}^{n} (1-p)p^i = 1 - p^{n+1}$
We define:
$P[T = b^i] = \frac{(1-p)p^i}{1 - p^{n+1}} = C_n p^i, \quad i = 0, 1, \ldots, n$
with
$C_n = \frac{1-p}{1 - p^{n+1}}$
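The truncation and renormalization can be written out directly. The sketch below builds the bounded distribution for a given n and uses exact fractions so that the probabilities can be checked to sum to 1; `bounded_pmf` is a hypothetical name.

```python
from fractions import Fraction


def bounded_pmf(b, p, n):
    """Map each runtime b^i (i = 0..n) to its probability C_n * p^i."""
    c_n = (1 - p) / (1 - p ** (n + 1))  # normalizing constant C_n
    return {b ** i: c_n * p ** i for i in range(n + 1)}


if __name__ == "__main__":
    pmf = bounded_pmf(2, Fraction(1, 3), 4)
    print(pmf, sum(pmf.values()))
```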
Bounded Imbalanced Tree Model: Three Regimes of Behavior
Regime 1 ($p \leq 1/b^2$): polynomial expected time, polynomial variance
Regime 2 ($1/b^2 < p \leq 1/b$): polynomial expected time, exponential variance
Regime 3 ($p > 1/b$): exponential expected time, exponential variance
Restart strategy: expected polynomial time
(see paper for formal proofs)
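The effect of restarts in the bounded model can be illustrated numerically. The sketch computes the exact expected runtime without restarts and, as one assumed restart strategy (a fixed cutoff of a single leaf, not necessarily the strategy analyzed in the paper), the expected time with restarts. Function names are hypothetical.

```python
from fractions import Fraction


def normalizer(p, n):
    """C_n = (1 - p) / (1 - p^(n+1))."""
    return (1 - p) / (1 - p ** (n + 1))


def bounded_expected_time(b, p, n):
    """E[T] = sum over i = 0..n of C_n * p^i * b^i."""
    c_n = normalizer(p, n)
    return sum(c_n * p ** i * b ** i for i in range(n + 1))


def restart_expected_time(p, n):
    """Cutoff of one leaf: each run costs 1 and succeeds with probability P[T = 1] = C_n."""
    return 1 / normalizer(p, n)


if __name__ == "__main__":
    p = Fraction(3, 4)  # regime 3 for b = 2
    for n in (10, 20, 30):
        print(n, float(bounded_expected_time(2, p, n)), float(restart_expected_time(p, n)))
```

Without restarts the expected time grows exponentially in n for this p, while the restart figure stays below 1/(1 - p).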
Bounded Heavy-Tailed Behavior
Balanced, Unbounded, and Imbalanced Trees
Conclusions
Heavy-tailed behavior yields insight into backtrack search methods, providing an explanation for the effectiveness of restart strategies.
Tree search models can be analyzed rigorously:
• Balanced Tree Search Model: uniform distribution (not heavy-tailed); restarts are not effective
• Imbalanced Tree Search Model (Bounded/Unbounded): heavy-tailed; restarts are effective
Consequence for algorithm design: aim for strategies that have highly asymmetric distributions.
www.cs.cornell.edu/hubes
www.cs.cornell.edu/gomes
Check also:
www.cis.cornell.edu/iisi
Demos, papers, etc.