Empirical Methods for the Analysis of Algorithms (iridia.ulb.ac.be/sls2007/slides/sls-tutorial.pdf)
TRANSCRIPT
Empirical Methods for the Analysis of Algorithms
Marco Chiarandini (1), Luís Paquete (2)
(1) Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
(2) CISUC - Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
September 7, 2007, Workshop on Engineering SLS Algorithms
Based on: M. Chiarandini, L. Paquete, M. Preuss, E. Ridge (2007). Experiments on metaheuristics: Methodological overview and open issues. Tech. Rep. DMF-2007-03-003, The Danish Mathematical Society.
Outlook
The tutorial is about:
Applied Statistics for Engineers and Scientists of SLS Algorithms
We aim at providing:
I basic notions of statistics
I a review of performance measures
I an overview of scenarios with proper tools for
  I exploratory data analysis
  I statistical inference
Definitions and Motivations · Exploratory Data Analysis · Inferential Statistics

Outline
1. Definitions and MotivationsSLS AlgorithmsExperimental AnalysisPerformance Measures
2. Exploratory Data AnalysisRepresentation of Sampled DataRegression AnalysisCharacterization and Model Fitting
3. Inferential Statistics
The Objects of Analysis
We consider search algorithms for solving (mainly) discrete optimization problems.

For each general problem Π (e.g., TSP, GCP) we denote by CΠ a set (or class) of possible instances and by π ∈ CΠ a single instance.

The objects of analysis are SLS algorithms, i.e., randomized search heuristics (with no "known" guarantee of optimality).
Definitions
More precisely:
I single-pass heuristics (denoted A⊥): have an embedded termination, for example, upon reaching a certain state
  (generalized optimization Las Vegas algorithms [Hoos and Stützle, 2004])
I asymptotic heuristics (denoted A∞): do not have an embedded termination and they might improve their solution asymptotically
  (both probabilistically approximately complete and essentially incomplete [Hoos and Stützle, 2004])
Definitions
The most typical scenario considered in research on SLS algorithms:
Asymptotic heuristics with time limit decided a priori
The algorithm A∞ is halted when time expires.

Deterministic case: A∞ on π returns a solution of cost x. The performance of A∞ on π is a scalar y = x.

Randomized case: A∞ on π returns a solution of cost X, where X is a random variable. The performance of A∞ on π is the univariate Y = X.
[This is not the only relevant scenario: to be refined later]
Generalization
On a specific instance, the random variable Y that defines the performance measure of an algorithm is described by its probability distribution/density function

p(y | π)

It is often more interesting to generalize the performance on a class of instances CΠ, that is,

p(y, CΠ) = ∑_{π ∈ CΠ} p(y | π) p(π)
Theory vs Practice
Task: explain the performance of algorithms
Theoretical Analysis:
I worst case analysis: considers all possible problem instances of a problem
I average case analysis: assumes knowledge of the distribution of problem instances.
But:
I results may have low practical relevance
I problems and algorithms are complex
Experimental Analysis:
I it is (often) easy and fast to collect data
I results are fast and have practical relevance
Experimental Algorithmics
Experimental Algorithmics "is concerned with the design, implementation, tuning, debugging and performance analysis of computer programs for solving algorithmic problems" [Demetrescu and Italiano, 2000].

It looks at algorithms as a problem of the natural sciences instead of "only" as a mathematical problem.
Goals
I Defining standard methodologies
I Identifying and collecting problem instances from the real world and instance generators
I Comparing relative performance of algorithms so as to identify the best ones for a given application
I Identifying algorithm separators, i.e., families of problem instances for which the performances differ
I Providing new insights into algorithm design
Experimental Algorithmics
[Diagram: mathematical model (algorithm) → simulation program → experiment; McGeoch, 1996]
Algorithmic models of programs can vary according to their level of instantiation:

I minimally instantiated (algorithmic framework), e.g., simulated annealing
I mildly instantiated: includes implementation strategies (data structures)
I highly instantiated: includes details specific to a particular programming language or computer architecture
Sampling
In experiments,
I we sample the population of instances and
I we sample the performance of the algorithm on each sampled instance

If on an instance π we run the algorithm r times then we have r replicates of the performance measure Y, denoted Y1, . . . , Yr, which are independent and identically distributed (i.i.d.), i.e.
p(y1, . . . , yr | π) = ∏_{j=1}^{r} p(yj | π)

p(y1, . . . , yr) = ∑_{π ∈ CΠ} p(y1, . . . , yr | π) p(π).
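This sampling scheme can be sketched as a short simulation. All names here (`sample_performance`, `run_algorithm`) are illustrative stand-ins introduced for this sketch, not part of the tutorial:

```python
import random

def sample_performance(instances, run_algorithm, r, rng=random):
    """Draw one instance pi uniformly at random (p(pi) = c) and collect
    r i.i.d. replicates Y_1, ..., Y_r of the performance measure on it.
    run_algorithm(pi, rng) stands in for one randomized run of the SLS
    algorithm, returning the observed performance value."""
    pi = rng.choice(instances)
    return pi, [run_algorithm(pi, rng) for _ in range(r)]
```

With a deterministic stand-in algorithm the r replicates coincide; with a randomized one they are i.i.d. draws from p(y | π).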
Test Instance Selection
In real-life applications a simulation of p(π) can be obtained from historical data.
In research studies instances may be:
I real world instances
I random variants of real-world instances
I online libraries
I randomly generated instances
They may be grouped in classes according to some features whose impact may be worth studying:
I type (for features that might impact performance)
I size (for scaling studies)
I hardness (focus on hard instances)
I application (e.g., CSP encodings of scheduling problems), ...
Within the class, instances are drawn with uniform probability p(π) = c
Statistical Methods
The analysis of performance is based on finite-sized sampled data. Statistics provides the methods and the mathematical basis to

I describe and summarize the data (descriptive statistics)
I make inferences from those data (inferential statistics)

In research, statistics helps to

I guarantee replicability
I make results reliable
I extract relevant results from large amounts of data

In the practical context of heuristic design and implementation (i.e., engineering), statistics helps to make sound decisions with the least amount of experimentation.
Objectives of the Experiments
I Characterization:
  Interpolation: fitting models to data
  Extrapolation: building models of data, explaining phenomena
  Standard statistical methods: linear and non-linear regression, model fitting

I Comparison:
  bigger/smaller, same/different, Algorithm Configuration, Component-Based Analysis
  Standard statistical methods: experimental designs, hypothesis testing and estimation
[Plot: run time in seconds vs. size for uniform random graphs with edge density p = 0, 0.1, 0.2, 0.5, 0.9]
[Plot: response densities and per-algorithm boxplots for Alg. 1 – Alg. 5]
Measures and Transformations
On a single instance
Computational effort indicators
I CPU time (real time as measured by OS functions)
I number of elementary operations/algorithmic iterations (e.g., search steps, objective function evaluations, number of visited nodes in the search tree, consistency checks, etc.)
Solution quality indicators
I value returned by the cost function (or error from optimum/reference value)
Measures and Transformations
On a class of instances
Computational effort indicators
I no transformation if the interest is in studying scaling
I standardization if a fixed time limit is used
I otherwise, better to group the instances homogeneously
Solution quality indicators
Different instances imply different scales ⇒ need for an invariant measure
But also, see [McGeoch, 1996].
Measures and Transformations
On a class of instances
Solution quality indicators
I Distance or error from a reference value (assume minimization case):

  e1(x, π) = (x(π) − x̄(π)) / σ̂(π)                      standard score
  e2(x, π) = (x(π) − xopt(π)) / xopt(π)                 relative error
  e3(x, π) = (x(π) − xopt(π)) / (xworst(π) − xopt(π))   invariant [Zemel, 1981]
  I optimal value computed exactly or known by instance construction
  I surrogate value such as bounds or best known values
I Rank (no need for standardization but loss of information)
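The three transformations can be sketched as follows. The function name and its arguments are illustrative; `x_opt` and `x_worst` are the optimal and worst values for the instance, and `sample_costs` the observed costs used to estimate the mean and standard deviation:

```python
import statistics

def error_measures(x, x_opt, x_worst, sample_costs):
    """Compute the three solution-quality transformations for one cost x
    observed on one instance (minimization case)."""
    mean = statistics.mean(sample_costs)
    sd = statistics.stdev(sample_costs)           # sample standard deviation
    e1 = (x - mean) / sd                          # standard score
    e2 = (x - x_opt) / x_opt                      # relative error
    e3 = (x - x_opt) / (x_worst - x_opt)          # invariant error [Zemel, 1981]
    return e1, e2, e3
```

The invariant error e3 always lies in [0, 1], which is what makes it comparable across instances with different scales.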
Scenarios (Refinement)
I Single-pass heuristics
I Asymptotic heuristics: three approaches:
  1. Univariate
     1.a Time as an external parameter decided a priori
     1.b Solution quality as an external parameter decided a priori
  2. Cost dependent on running time
  3. Cost and running time as two minimizing objectives
Definitions
Single-pass heuristics
Deterministic case: A⊥ on π returns a solution of cost x with computational effort t (e.g., running time). The performance of A⊥ on π is the vector ~y = (x, t).

Randomized case: A⊥ on π returns a solution of cost X with computational effort T, where X and T are random variables. The performance of A⊥ on π is the bivariate ~Y = (X, T).
Single-pass Heuristics
Bivariate analysis: Example
Scenario:
B 3 heuristics A⊥1, A⊥2, A⊥3 on class CΠ.
B homogeneous instances or need for data transformation.
B 1 or r runs per instance
I Interest: inspecting solution cost and running time to observe and compare the level of approximation and the speed.
Tools:
I Scatter plots of solution cost and run time

[Plot: cost vs. time scatter for DSATUR, RLF, and ROS]
Definitions
Asymptotic heuristics
There are three approaches:

1.a. Time as an external parameter decided a priori. The algorithm is halted when time expires.
Deterministic case: A∞ on π returns a solution of cost x. The performance of A∞ on π is the scalar y = x.

Randomized case: A∞ on π returns a solution of cost X, where X is a random variable. The performance of A∞ on π is the univariate Y = X.
Asymptotic Heuristics
Approach 1.a, Univariate analysis
Scenario:
B 3 heuristics A∞1, A∞2, A∞3 on class CΠ.
  (Or 3 heuristics A∞1, A∞2, A∞3 on class CΠ without interest in computation time because negligible or comparable)
B homogeneous instances (no data transformation) or heterogeneous (data transformation)
B 1 or r runs per instance
B a priori time limit imposed
I Interest: inspecting solution cost
Tools:
I Histograms (summary measures: mean or median or mode?)
I Boxplots
I Empirical cumulative distribution functions (ECDFs)
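An ECDF is simple enough to compute directly; this sketch (function names are illustrative) returns the step function Fn(x) = (# observations ≤ x)/n used throughout these slides:

```python
def ecdf(sample):
    """Build the empirical cumulative distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)
    def Fn(x):
        # proportion of sampled values not exceeding x
        return sum(1 for v in xs if v <= x) / n
    return Fn
```

Plotting Fn over a grid of x values reproduces the staircase curves shown on the following slides.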
Asymptotic Heuristics
Approach 1.a, Univariate analysis: Example
Data representation

[Plots: histogram with density, boxplot (min, Q1, median, Q3, max; IQR; average; outliers beyond Q1 − 1.5·IQR), and empirical cumulative distribution function Fn(x)]

Measures of central tendency (mean, median, mode) and dispersion (variance, standard deviation, inter-quartile range)
Asymptotic Heuristics
Approach 1.a, Univariate analysis: Example

On a class of instances

[Boxplots of TS1, TS2, TS3 under four measures: standard error (x − x̄)/σ̂, relative error (x − x(opt))/x(opt), invariant error (x − x(opt))/(x(worst) − x(opt)), and ranks]
Asymptotic Heuristics
Approach 1.a, Univariate analysis: Example

On a class of instances

[ECDFs (proportion ≤ x, n = 300) of TS1, TS2, TS3 under the same four measures: standard error, relative error, invariant error, and ranks]
Stochastic Dominance

Definition: Algorithm A1 probabilistically dominates algorithm A2 on a problem instance iff its CDF is always "below" that of A2, i.e.:

F1(x) ≤ F2(x), ∀x ∈ X
[Plots: two example pairs of CDFs F(x), one exhibiting probabilistic dominance and one with crossing CDFs, where the algorithms are incomparable]
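The dominance check from the definition above can be sketched on empirical CDFs. This follows the slide's convention (F1 ≤ F2 everywhere means A1 dominates); the function name and the evaluation grid are choices made for this sketch:

```python
def prob_dominates(sample1, sample2, grid):
    """Empirical check of F1(x) <= F2(x) at every point of `grid`,
    where Fi is the ECDF of algorithm i's sampled performance."""
    def F(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)
    return all(F(sample1, x) <= F(sample2, x) for x in grid)
```

In practice the grid only needs to contain the distinct observed values of both samples, since the ECDFs are constant between them.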
Definitions

Asymptotic heuristics
There are three approaches:

1.b. Solution quality as an external parameter decided a priori. The algorithm is halted when quality is reached.
Deterministic case: A∞ on π finds a solution in running time t. The performance of A∞ on π is the scalar y = t.

Randomized case: A∞ on π finds a solution in running time T, where T is a random variable. The performance of A∞ on π is the univariate Y = T.
Dealing with Censored Data
Asymptotic heuristics, Approach 1.b
B Heuristics A⊥ stopped before completion or A∞ truncated (always the case)
I Interest: determining whether a prefixed goal (optimal/feasible) has been reached

The computational effort to attain the goal can be specified by a cumulative distribution function F(t) = P(T < t) with T in [0, ∞).
If in a run i we stop the algorithm at time Li then we have a Type I right censoring, that is, we know either

I Ti, if Ti ≤ Li
I or that Ti ≥ Li.

Hence, for each run i we need to record min(Ti, Li) and the indicator variable for observed optimal/feasible solution attainment, δi = I(Ti ≤ Li).
Dealing with Censored Data
Asymptotic heuristics, Approach 1.b: Example

B An exact vs. a heuristic algorithm for the 2-edge-connectivity augmentation problem.
I Interest: time to find the optimum on different instances.
[Plot: ECDF of time to find the optimum, Heuristic vs. Exact]

Uncensored:
F̂(t) = (# runs with time < t) / n

Censored:
F̂(t) = 1 − ∏_{j: tj < t} (nj − dj)/nj,   t ∈ [0, L]

(Kaplan-Meier estimation model)
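The Kaplan-Meier estimator can be sketched directly from the recorded pairs (min(Ti, Li), δi). The function name is illustrative; ties between censored and observed times are handled in the simplest way for this sketch:

```python
def kaplan_meier_F(times, observed):
    """Kaplan-Meier estimate of F(t) = 1 - prod_{j: t_j < t} (n_j - d_j)/n_j.
    times[i] = min(T_i, L_i); observed[i] = 1 if the goal was attained
    (T_i <= L_i), 0 if run i was censored at L_i."""
    def F(t):
        surv = 1.0
        for tj in sorted({v for v, d in zip(times, observed) if d}):
            if tj >= t:
                break
            n_j = sum(1 for v in times if v >= tj)   # runs at risk just before tj
            d_j = sum(1 for v, d in zip(times, observed) if v == tj and d)
            surv *= (n_j - d_j) / n_j
        return 1.0 - surv
    return F
```

With no censoring (all δi = 1) the estimate reduces to the uncensored ECDF (# runs < t)/n above.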
Definitions
Asymptotic heuristics
There are three approaches:

2. Cost dependent on running time:
Deterministic case: A∞ on π returns a current best solution x at each observation in t1, . . . , tk. The performance of A∞ on π is the profile indicated by the vector ~y = (x(t1), . . . , x(tk)).

Randomized case: A∞ on π produces a monotone stochastic process in solution cost X(τ), with any element dependent on its predecessors. The performance of A∞ on π is the multivariate ~Y = (X(t1), X(t2), . . . , X(tk)).
Asymptotic Heuristics
Approach 2, Multivariate analysis
Scenario:
B 3 heuristics A∞1, A∞2, A∞3 on instance π.
B single instance, hence no data transformation.
B r runs
I Interest: inspecting solution cost over running time to determine whether the comparison varies over time intervals
Tools:
I Quality profiles
Asymptotic Heuristics
Approach 2, Multivariate analysis: Example
The performance is described by multivariate random variables of the kind ~Y = (Y(t1), Y(t2), . . . , Y(tk)).

Sampled data are of the form ~Yi = (Yi(t1), Yi(t2), . . . , Yi(tk)), i = 1, . . . , 10 (10 runs per algorithm on one instance)
[Plot: cost vs. time profiles of the 10 runs for Novelty and Tabu Search]
Asymptotic Heuristics
Approach 2, Multivariate analysis: Example
[Plot: per-time-occasion boxplots of colors for Novelty and Tabu Search]
Asymptotic Heuristics
Approach 2, Multivariate analysis: Example
[Plot: the median cost vs. time behavior of the two algorithms, Novelty and Tabu Search]
Definitions
Asymptotic heuristics
There are three approaches:

3. Cost and running time as two minimizing objectives [da Fonseca et al., 2001]:
Deterministic case: A∞ on π finds improvements in solution cost which are represented by a set of non-dominated points y = {~yj ∈ R², j = 1, . . . , m} of solution cost and running time.

Randomized case: A∞ on π finds improvements in solution cost which are represented by a random variable, i.e., a random set of non-dominated points Y = {~Yj ∈ R², j = 1, . . . , m}. The performance of A∞ on π is measured by the random set Y.
Asymptotic Heuristics
Approach 3, Bi-objective analysis
Some definitions on dominance relations are needed.

In a run on π, A∞ passes through states, i.e., a set of points y = {~yj ∈ R², j = 1, . . . , m}, each one being a vector of solution cost and running time.

In the Pareto sense, for points in R²:

~y1 ⪯ ~y2   (weakly dominates):   y1i ≤ y2i for all i = 1, 2
~y1 ‖ ~y2   (incomparable):   neither ~y1 ⪯ ~y2 nor ~y2 ⪯ ~y1

Hence we can summarize the run of the algorithm by a set of mutually incomparable points (i.e., weakly non-dominated points).

Extended to sets of points:

A ⪯s B   (weakly dominates):   for all ~b ∈ B there exists ~a ∈ A such that ~a ⪯ ~b
A ‖ B   (incomparable):   neither A ⪯s B nor B ⪯s A
Performance Measures
Multi-Objective Case
How to compare the approximation sets from multiple runs of two or more multiobjective algorithms?

For two approximation sets A, B ∈ Ω (Ω is the set of all approximation sets), dominance relations may indicate:
  A is better than B        A and B are incomparable
  B is better than A        A and B are indifferent
Other Pareto-compliant quantitative indicators have been introduced:

I Unary indicators (I : Ω → R) (need a reference point):
  I Hypervolume
  I Epsilon indicator
  I R indicator
I Binary indicators (I : Ω × Ω → R):
  I Epsilon indicator
I Ranking
I Attainment function

Note that with unary indicators, if I(A) < I(B), we can only say that A is not worse than B (they may be incomparable) [Zitzler et al., 2003].
Asymptotic Heuristics
Approach 3, Bi-objective analysis
Let Y = {~Yj ∈ R², j = 1, . . . , m} be a random set of m mutually independent points of solution cost and run time.

The attainment (or hitting) function is defined (similarly to a cumulative distribution function) as

F(~y) = Pr[Y ⪯s ~y]
      = Pr[~Y1 ⪯ ~y ∨ ~Y2 ⪯ ~y ∨ . . . ∨ ~Ym ⪯ ~y]
      = Pr[optimizer attains the goal ~y in a single run]

Sample data are collections of point sets Y1, . . . , Yn obtained by n independent runs. The corresponding ECDF is defined as:

Fn(~y) = (1/n) ∑_{i=1}^{n} I(Yi ⪯s ~y)

where I(Yi ⪯s ~y) = 1 if ~Yi1 ⪯ ~y or ~Yi2 ⪯ ~y or . . . or ~Yimi ⪯ ~y.
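The empirical attainment function can be sketched in a few lines. Names are illustrative; each run is represented by its set of (cost, time) points, both objectives minimized:

```python
def weakly_dominates(a, b):
    """Point a weakly dominates point b in the Pareto sense (both minimized)."""
    return all(ai <= bi for ai, bi in zip(a, b))

def empirical_attainment(run_sets, goal):
    """Fn(goal): fraction of the n runs whose point set attains the goal,
    i.e. contains at least one point weakly dominating it."""
    n = len(run_sets)
    hits = sum(1 for pts in run_sets
               if any(weakly_dominates(p, goal) for p in pts))
    return hits / n
```

Evaluating this over a grid of goals (cost, time) yields the attainment surfaces plotted on the next slides.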
Asymptotic Heuristics
Approach 3, Bi-objective analysis
[Plot: empirical attainment function of colors vs. iterations, with min (Fn = 1/50), median (Fn = 25/50), and max (Fn = 50/50) attainment surfaces]
Asymptotic Heuristics
Approach 3, Bi-objective analysis: Example
[Plots: cost vs. time non-dominated point sets from runs of Novelty and TSinN1]
Asymptotic Heuristics
Approach 3, Bi-objective analysis
[Plots: empirical attainment functions F of colors vs. iterations]

P(T ≤ t | X ≤ x) = |{i | ∃ (X, T) ∈ Yi : X ≤ x ∧ T ≤ t}| / |{i | ∃ (X, T) ∈ Yi : X ≤ x}|

P(X ≤ x | T ≤ t) = |{i | ∃ (X, T) ∈ Yi : X ≤ x ∧ T ≤ t}| / |{i | ∃ (X, T) ∈ Yi : T ≤ t}|

[Plots: empirical qualified run-time distributions (success probability vs. # iterations) and empirical solution quality distributions (success probability vs. colors)]
Correlation Analysis
Scenario:

B heterogeneous instances, hence data transformation
B 1 or r runs per instance
B consider time to goal or solution quality
I Interest: inspecting whether instances are all equally hard to solve for different algorithms or whether some features make the instances harder

Tools: correlation plots (each point represents an instance), correlation coefficient:

Population:  ρXY = cov(X, Y) / (σX σY)

Sample:  rXY = ∑ (Xi − X̄)(Yi − Ȳ) / ((n − 1) sX sY)

[Hoos and Stützle, 2004]
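The sample correlation coefficient above can be sketched directly (the (n − 1) factors in sX, sY and in the covariance cancel, so they are omitted here; the function name is illustrative):

```python
def sample_correlation(xs, ys):
    """Pearson sample correlation r_XY between two sequences of the same
    length, e.g. per-instance performance of two algorithms."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

A value near +1 indicates that instances hard for one algorithm tend to be hard for the other as well.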
Scaling Analysis
Scenario:
B one heuristic A⊥ or A∞ with a priori quality goal
B data collected on instances of different size or features
I Interest: characterizing the growth of computational effort.
Given a set of data points (Ni, Yi) obtained from an experiment in which Yi = f(Ni), for some unknown function f(n), find growth function classes O(gu(n)) and/or Ω(gl(n)) to which f(n) belongs.

Mix of interpolation of the data trend and extrapolation beyond the range of experimentation.
Tools:
I Heuristic adaptation of linear regression techniques
  I Log-log transformation and linear regression
  I Box-Cox rules [McGeoch et al., 2002]
I Smoothing techniques
Scaling Analysis
Plots and Linear Trends

[Panel of plots: y = e^x, y = x^e, and y = log x drawn with axis scalings log = '', 'x', 'y', 'xy', showing which transformation makes each trend linear]
Linear Regression
Simple linear regression: one dependent variable + one independent variable

Yi = β0 + β1 Xi + εi

Uses the least squares method:

min ∑i εi²,   εi = Yi − β0 − β1 Xi

The indicator of the quality of fit is the coefficient of determination R² (but use with caution).

Multiple linear regression considers multiple predictors:

Yi = β0 + β1 X1i + β2 X2i + εi

The indicator of the quality of fit is the adjusted R² statistic.
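The simple least-squares fit has a closed form; this sketch (names illustrative) returns the coefficients and R²:

```python
def simple_linear_regression(xs, ys):
    """Least-squares fit of Y = b0 + b1*X; returns (b0, b1, R^2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b1 = sxy / sxx                          # slope
    b0 = my - b1 * mx                       # intercept
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return b0, b1, 1 - ss_res / ss_tot      # R^2 = 1 - SS_res/SS_tot
```

For the scaling analysis above, the same routine can be applied to (log Ni, log Yi): a linear log-log fit with slope b1 suggests a polynomial growth of degree about b1.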
Scaling Analysis
Example

Running time of RLF for Graph Coloring, with two regressors: number of vertices and edge density.

[Plot: run time in seconds vs. size for uniform random graphs with edge density p = 0, 0.1, 0.2, 0.5, 0.9]
Characterization of Run-time
Parametric models are used in the analysis of run-times to:
I provide more informative experimental results
I make more statistically rigorous comparisons of algorithms
I exploit the properties of the model (e.g., the character of long tails and the completion rate)
I predict missing data in case of censored distributions
Procedure:

I choose a model

I apply a fitting method, e.g., maximum likelihood estimation:

max_{θ ∈ Θ} log ∏_{i=1}^{n} p(Xi, θ)

I test the model
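For the exponential model the maximum likelihood estimate has a closed form: with density p(x; λ) = λe^(−λx), maximizing the log-likelihood gives λ̂ = n / Σ xi. A sketch on synthetic run-times (the data and all names are ours, for illustration only):

```python
import math
import random

random.seed(1)

# Synthetic "run-times": 1000 draws from an exponential with rate 2.0.
rate = 2.0
runtimes = [random.expovariate(rate) for _ in range(1000)]

# For p(x; lam) = lam * exp(-lam * x), maximizing the log-likelihood
# sum_i log p(x_i; lam) over lam gives the closed form lam_hat = n / sum(x_i).
lam_hat = len(runtimes) / sum(runtimes)

# Log-likelihood at the MLE, usable later when testing the model.
loglik = sum(math.log(lam_hat) - lam_hat * x for x in runtimes)

print(round(lam_hat, 1))  # close to the true rate 2.0
```

For models without a closed form (Weibull, log-normal, gamma) the same maximization is done numerically.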
The distributions used are [Frost et al., 1997; Gomes et al., 2000]:
[Figure: probability density f(x) (top row) and hazard function h(x) (bottom row) of the Exponential, Weibull, Log-normal, and Gamma distributions.]
Motivations for these distributions:

I qualitative information on the completion rate (= hazard function)

I good empirical fit

To check whether a parametric family of models is reasonable, the idea is to make plots that should be linear under the model; departures from linearity are then easy to spot by eye.

Example: for an exponential distribution,

log S(t) = −λt,   where S(t) = 1 − F(t) is the survivor function,

hence the plot of log S(t) against t should be linear.

Similarly, for the Weibull distribution the cumulative hazard function is linear on a log-log plot.
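The exponential check above can be automated: compute the empirical survivor function from sorted run-times and fit a line to log S(t) against t; under the exponential model the negated slope estimates λ. A sketch on synthetic data (names and sample sizes are ours):

```python
import math
import random

random.seed(0)
data = sorted(random.expovariate(1.5) for _ in range(2000))
n = len(data)

# Empirical survivor function S(t) = 1 - F(t) at each sorted observation;
# drop the largest points, where S(t) reaches 0 and its log is undefined.
ts, log_s = [], []
for i, t in enumerate(data[: n - 50]):
    s = 1 - (i + 1) / n
    ts.append(t)
    log_s.append(math.log(s))

# Under an exponential model log S(t) = -lambda * t: fit a line through
# the points; the negated slope estimates lambda.
m_t = sum(ts) / len(ts)
m_y = sum(log_s) / len(log_s)
slope = sum((t - m_t) * (y - m_y) for t, y in zip(ts, log_s)) / \
        sum((t - m_t) ** 2 for t in ts)
print(round(-slope, 1))  # close to the true rate 1.5
```

The Weibull check is analogous, fitting log H(t) against log t, where H(t) = −log S(t) is the cumulative hazard.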
Example
Graphical inspection for the two censored distributions from the previous example on 2-edge-connectivity.
[Figure: three diagnostic panels for the two algorithms (Heuristic, Exact): the empirical CDF of run-times; log S(t) against t (linear ⇒ exponential); and log H(t) against log t (linear ⇒ Weibull).]
Example
[Figure: empirical CDFs of the time to find the optimum, with fitted models: Heuristic – Log-normal, Exact – Exponential.]
Extreme Value Statistics
I Extreme value statistics focuses on characteristics related to the tails of a distribution function:

1. extreme quantiles (e.g., minima)
2. indices describing tail decay

I 'Classical' statistical theory: analysis of means. Central limit theorem: X1, . . . , Xn i.i.d. with distribution FX:

√n (X̄ − µ) / √Var(X)  →D  N(0, 1),   as n → ∞

Heavy-tailed distributions: the mean and/or the variance may not be finite!
Heavy Tails
Gomes et al. [2000] analyze the mean computational cost to find a solution on a single instance.

On the left, the observed behavior calculated over an increasing number of runs; on the right, the case of data drawn from normal or gamma distributions.

I The use of the median instead of the mean is recommended

I The existence of the moments (e.g., mean, variance) is determined by the tail behavior: a case like the one on the left arises in the presence of long tails
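The phenomenon is easy to reproduce: for a distribution with finite mean the running mean settles down as runs accumulate, while for a heavy-tailed distribution with tail index α ≤ 1 (so the mean does not exist) it keeps jumping whenever a rare, huge value arrives. A sketch with synthetic samples (distribution choices and names are ours):

```python
import random

random.seed(42)

def running_means(samples):
    """Mean over the first k samples, for k = 1..n."""
    total, out = 0.0, []
    for k, x in enumerate(samples, start=1):
        total += x
        out.append(total / k)
    return out

n = 100_000
# Exponential samples: finite mean (1.0), so the running mean stabilizes.
light = running_means(random.expovariate(1.0) for _ in range(n))
# Pareto samples with tail index alpha = 0.9 < 1: the mean is infinite,
# so the running mean never converges.
heavy = running_means(random.paretovariate(0.9) for _ in range(n))

print(round(light[-1], 2))  # stabilizes close to the true mean 1.0
print(heavy[-1])            # erratic: dominated by a few extreme draws
```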
Inferential Statistics
I We work with samples (instances, solution quality)
I But we want sound conclusions: generalization over a given population (all possible instances)
I Thus we need statistical inference
[Diagram: from a random sample Xn we compute a statistical estimator θ̂; inference relates it to the parameter θ of the population P(x, θ).]
Since the analysis is based on finite-sized sampled data, statements like
“the cost of solutions returned by algorithm A is smaller thanthat of algorithm B”
must be completed by
“at a level of significance of 5%”.
A Motivating Example
I There is a competition, and two algorithms A1 and A2 are submitted.

I We run both algorithms once on n instances. On each instance either A1 wins (+), or A2 wins (−), or they tie (=).

Questions:

1. If we have only 10 instances and algorithm A1 wins 7 times, how confident are we in claiming that algorithm A1 is the best?

2. How many instances and how many wins should we observe to gain a confidence of 95% that algorithm A1 is the best?
I p: probability that A1 wins on each instance (+)

I n: number of runs without ties

I Y: number of wins of algorithm A1

If each run is independent and consistent:

Y ∼ b(n, p):   Pr[Y = y] = (n choose y) p^y (1 − p)^(n−y)
Under the conditions of question 1, we can check how unlikely the situation is if it were p(+) ≤ p(−).

If p = 0.5, then the chance that algorithm A1 wins 7 or more times out of 10 is 17.2%: quite high!
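The 17.2% figure is the upper tail of the binomial distribution b(10, 0.5) and can be verified directly (the helper name is ours):

```python
from math import comb

# Probability that A1 wins y or more of n tie-free runs when the win
# probability is p: the upper tail of the binomial distribution b(n, p).
def binom_tail(n, y, p=0.5):
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(y, n + 1))

# Chance of 7 or more wins out of 10 under the "no difference" hypothesis:
# 176/1024 = 0.171875, i.e., about 17.2%.
print(round(binom_tail(10, 7), 3))  # -> 0.172
```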
[Figure: probability mass function of b(10, 0.5).]
To answer question 2 we compute the 95% quantile, i.e., the smallest y such that Pr[Y ≥ y] < 0.05 with p = 0.5, at different values of n:
n: 10 11 12 13 14 15 16 17 18 19 20
y:  9  9 10 10 11 12 12 13 13 14 15
Summary
What we saw:
I Simple Comparative Analysis
I Characterization
What remains:
I Methods for statistical inference
I Experimental design techniques for component-based analysis

I Advanced designs for tuning and configuring
(This will be covered in the next talks and in Ruben’s tutorial)
But also:
I Reactive/Learning approaches
I Phase transitions
I Landscape analysis
I ...
References
H. Hoos, T. Stützle (2004). Stochastic Local Search: Foundations and Applications. Morgan Kaufmann Publishers, San Francisco, CA, USA.

C. Demetrescu, G. F. Italiano (2000). What do we learn from experimental algorithmics? In M. Nielsen, B. Rovan (Eds.), MFCS, vol. 1893 of Lecture Notes in Computer Science, pp. 36–51, Springer Verlag, Berlin, Germany.

C. C. McGeoch (1996). Toward an experimental method for algorithm simulation. INFORMS Journal on Computing, vol. 8, no. 1, pp. 1–15.

E. Zemel (1981). Measuring the quality of approximate solutions to zero-one programming problems. Mathematics of Operations Research, vol. 6, no. 3, pp. 319–332.

V. G. da Fonseca, C. M. Fonseca, A. O. Hall (2001). Inferential performance assessment of stochastic optimisers and the attainment function. In C. Coello, D. Corne, K. Deb, L. Thiele, E. Zitzler (Eds.), Proceedings of Evolutionary Multi-Criterion Optimization, First International Conference, EMO 2001, vol. 1993 of Lecture Notes in Computer Science, pp. 213–224, Springer Verlag, Berlin, Germany.

E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca, V. G. da Fonseca (2003). Performance assessment of multiobjective optimizers: an analysis and review. IEEE Transactions on Evolutionary Computation, vol. 7, no. 2, pp. 117–132.

C. McGeoch, P. Sanders, R. Fleischer, P. R. Cohen, D. Precup (2002). Using finite experiments to study asymptotic performance. In R. Fleischer, B. Moret, E. M. Schmidt (Eds.), Experimental Algorithmics: From Algorithm Design to Robust and Efficient Software, vol. 2547 of Lecture Notes in Computer Science, pp. 93–126, Springer Verlag, Berlin, Germany.

D. Frost, I. Rish, L. Vila (1997). Summarizing CSP hardness with continuous probability distributions. In AAAI/IAAI, pp. 327–333.

C. Gomes, B. Selman, N. Crato, H. Kautz (2000). Heavy-tailed phenomena in satisfiability and constraint satisfaction problems. Journal of Automated Reasoning, vol. 24, no. 1–2, pp. 67–100.

I. M. Ovacik, S. Rajagopalan, R. Uzsoy (2000). Integrating interval estimates of global optima and local search methods for combinatorial optimization problems. Journal of Heuristics, vol. 6, no. 4, pp. 481–500.

J. Hüsler, P. Cruz, A. Hall, C. M. Fonseca (2003). On optimization and extreme value theory. Methodology and Computing in Applied Probability, vol. 5, pp. 183–195.

M. Birattari (2005). The Problem of Tuning Metaheuristics as Seen from a Machine Learning Perspective. No. DISKI 292, Infix/Aka, Berlin, Germany.

J. D. Petruccelli, B. Nandram, M. Chen (1999). Applied Statistics for Engineers and Scientists. Prentice Hall, Englewood Cliffs, NJ, USA.

W. Conover (1999). Practical Nonparametric Statistics. John Wiley & Sons, New York, NY, USA, 3rd edn.

D. C. Montgomery, G. C. Runger (2007). Applied Statistics and Probability for Engineers. John Wiley & Sons, 4th edn.

J. F. Lawless (1982). Statistical Models and Methods for Lifetime Data. Wiley Series in Probability and Mathematical Statistics, John Wiley & Sons.

G. Seber (2004). Multivariate Observations. Wiley Series in Probability and Statistics, John Wiley & Sons.

M. H. Kutner, C. J. Nachtsheim, J. Neter, W. Li (2005). Applied Linear Statistical Models. McGraw Hill, 5th edn.