evolving efficient list search algorithms

Evolving Efficient List Search Algorithms Kfir Wolfson Moshe Sipper Department of Computer Science Ben Gurion University 2009


Page 1: Evolving Efficient List Search Algorithms

Evolving Efficient List Search Algorithms

Kfir Wolfson, Moshe Sipper

Department of Computer Science, Ben Gurion University

2009

Page 2: Evolving Efficient List Search Algorithms

Evolving Efficient List Search Algorithms - ECAL – 09.05.2010 2

Agenda

Introduction
Evolutionary Setup
Results
Less Knowledge – More Automation
Related Work
Conclusions and Future Work
Post-Evolutionary Analysis

Page 3: Evolving Efficient List Search Algorithms

Introduction

Evolutionary algorithms have been applied to many areas, but there has been limited research on software engineering and algorithmic design.

We introduce: algorithmic design through Darwinian evolution.

We begin with a benchmark case, list search algorithms:

1. Can evolution be applied to finding a search algorithm?

2. Can evolution be applied to finding an efficient search algorithm?

Yes to both questions, using GP (genetic programming).

Only related work

Page 4: Evolving Efficient List Search Algorithms

Binary Search

Knuth (The Art of Computer Programming):

“Although the basic idea of binary search is comparatively straightforward, the details can be somewhat tricky, and many good programmers have done it wrong the first few times they tried.”


Page 6: Evolving Efficient List Search Algorithms

Representation

Genotype: Koza-style GP
  Evaluation trees
  Strongly typed
  More understandable algorithms

Function and terminal sets
  The same for evolving both linear and sublinear search algorithms

[Figure: example evaluation tree with the nodes If, =, Array[INDEX], KEY, NOP, INDEX:=, ITER]

Page 7: Evolving Efficient List Search Algorithms

Representation

Phenotype

Array search algorithm: searches for a key in a 1-dimensional array

Java function:

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;

    for (int ITER = 0; ITER < iterations; ITER++) {
        -> PLUG IN EVOLVING GENOTYPE HERE <-
    }
    return INDEX;
}

Example: Array = 17 18 23 34 60 (indices 0 1 2 3 4), KEY = 18


Page 9: Evolving Efficient List Search Algorithms

Representation

Notes on the search frame:
  iterations is set to n for linear search and to log2 n for sublinear search
  INDEX is the returned variable (its value might be “illegal”)
  The variables declared before the loop are global variables
  The evolving genotype is plugged into the loop body

Page 10: Evolving Efficient List Search Algorithms

Representation

Functions and terminals that can appear in the evolving genotype:

General purpose     Problem related
If                  INDEX, INDEX:=
>, <, =             Array[INDEX]
TRUE, FALSE         KEY, ITER
PROGN2              M0, M1, M0:=, M1:=
NOP                 [M0+M1]/2 (later replaced with an ADF)

Page 11: Evolving Efficient List Search Algorithms

Representation - Example

An example of a correct solution to the linear search problem:

LISP:
(If (= Array[INDEX] KEY) NOP (INDEX:= ITER))

[Figure: the same expression drawn as an evaluation tree]

Equivalent Java:
if (arr[INDEX] == KEY)
    ;
else
    INDEX = ITER;

Let’s plug it into the phenotype frame…

Page 12: Evolving Efficient List Search Algorithms

Representation - Example

An example of a correct solution to the linear search problem:

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;
    for (int ITER = 0; ITER < iterations; ITER++) {
        if (arr[INDEX] == KEY)
            ;
        else
            INDEX = ITER;
    }
    return INDEX;
}

Page 13: Evolving Efficient List Search Algorithms

Representation

A search call always halts:
  No loop functions
  Only read access to ITER
  The number of iterations is limited

It inherently deals with keys that are not in the array (with a wrapper function).

No early termination when the key is found
  Harder problem: the evolved algorithm will have to learn to retain the correct index.

int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;
    for (int ITER=0; ITER < iterations; ITER++) {
        -> PLUG IN GENOTYPE HERE <-
    }
    return INDEX;
}

Why?
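The frame and the wrapper idea can be tried out with a small runnable sketch. It plugs the example linear-search genotype into the frame with iterations = n. The class name LinearFrameDemo, the wrapper find and the constant NOT_FOUND are illustrative assumptions; the slides mention a wrapper function but do not show it.

```java
// Sketch only: LinearFrameDemo, find and NOT_FOUND are hypothetical names.
class LinearFrameDemo {
    static final int NOT_FOUND = -1;

    // The frame from the slides, with iterations set to n (the linear case).
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int M0 = 0;      // not used by this genotype, kept to mirror the frame
        int M1 = n - 1;  // not used by this genotype, kept to mirror the frame
        int INDEX = 0;
        int iterations = n;
        for (int ITER = 0; ITER < iterations; ITER++) {
            // Genotype: (If (= Array[INDEX] KEY) NOP (INDEX:= ITER))
            if (arr[INDEX] == KEY) {
                // NOP: INDEX already points at the key, so it is retained
            } else {
                INDEX = ITER;
            }
        }
        return INDEX;
    }

    // Assumed wrapper: checks the returned index, so absent keys are reported.
    static int find(int[] arr, int key) {
        int idx = search(arr, key);
        boolean valid = idx >= 0 && idx < arr.length && arr[idx] == key;
        return valid ? idx : NOT_FOUND;
    }

    public static void main(String[] args) {
        int[] arr = {17, 18, 23, 34, 60};
        System.out.println(find(arr, 18)); // key present
        System.out.println(find(arr, 42)); // key absent
    }
}
```

Because the loop never terminates early, the genotype itself has to stop updating INDEX once the key is found; the wrapper only validates the final answer.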

Page 14: Evolving Efficient List Search Algorithms

Fitness Function

How do we rate a solution?
  Many random arrays, of varying array lengths
  Search for all keys in all arrays
  Reward the individual for closeness of the returned indexes
  Key range disjoint from index range, to discourage “cheating”
  Sorted and unsorted arrays of positive integers

[Figure: sample random arrays of increasing length, e.g. 14 17; 13 47 50; 14 29 45 48; 15 21 35 45 70; with minN=2 and maxN=100]

Page 15: Evolving Efficient List Search Algorithms

Fitness Function - Definitions

Error per single key search: the distance between the correct index of KEY and the index returned
  Elements are unique, so there is no ambiguity

Hit = finding the precise location of KEY (error = 0)

[Figure: Array = 17 18 23 34 60 (indices 0 1 2 3 4), KEY = 18. When search(arr,key) returns an index two positions away from the correct one, error = 2. When it returns the correct index, error = 0: a hit.]

Page 16: Evolving Efficient List Search Algorithms

Fitness Function

The fitness value of an individual is defined as:

$$\text{fitness} = \text{average\_error} \times \left(1 - 0.5 \cdot \frac{\text{hits}}{\text{calls}}\right)$$

This gives a 0.5% bonus reduction for every 1% of correct hits. For example, if an individual scored 300 hits in 1000 search calls, its fitness will be the average error per call, reduced by 15%.

This bonus encourages perfect answers (“almost” is bad…) and increases fitness variation in the population.

Page 17: Evolving Efficient List Search Algorithms

Generality Test

The best solution of each run was subjected to a stringent generality test, by running it on random arrays of all lengths in the range [2, 5000] ([2, 500] for the linear case).

Kinnear (1993) noted that: “For any algorithm... that operates on an infinite domain of data, no amount of testing can ever establish generality. Testing can only increase confidence.”

We included analysis by hand for selected solutions.

Page 18: Evolving Efficient List Search Algorithms


GP Operators and Parameters


Page 20: Evolving Efficient List Search Algorithms

Results - Linear

It turned out that evolving a linear-time search algorithm was quite easy with the function and terminal sets we designed.

46 out of 50 runs (92%) produced perfect solutions, passing the generality test on arrays up to length 500.

Our representation rendered the problem easy enough for a perfect individual to appear in the randomly generated generation 0 in three of the runs.

The search space was small enough for random search.

Page 21: Evolving Efficient List Search Algorithms

Results - Linear

An example evolved solution:

LISP:
(If (= Array[INDEX] KEY) (M1:= [M0+M1]/2) (INDEX:= ITER))

[Figure: the same expression drawn as an evaluation tree]

Equivalent Java:
if (arr[INDEX] == KEY)
    M1 = (M0+M1)/2;
else
    INDEX = ITER;

Page 22: Evolving Efficient List Search Algorithms

Results - Linear

An example evolved solution:

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;
    for (int ITER = 0; ITER < iterations; ITER++) {
        if (arr[INDEX] == KEY)
            M1 = (M0+M1)/2;
        else
            INDEX = ITER;
    }
    return INDEX;
}

The assignment to M1 is irrelevant: it does not affect the output index.

Page 23: Evolving Efficient List Search Algorithms

Sublinear Search

We set iterations to log2 n, and proceeded to evolve sublinear search algorithms.

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;

    for (int ITER = 0; ITER < iterations; ITER++) {
        -> PLUG IN EVOLVING GENOTYPE HERE <-
    }
    return INDEX;
}

Page 24: Evolving Efficient List Search Algorithms

Results - Sublinear

Unsurprisingly, this case proved a harder problem, but it was also solved by evolution.

35 out of 50 runs (70%) produced perfect solutions, passing the generality test on arrays up to length 5,000.

7 runs (14%) produced near-perfect solutions, which failed on a single key in the input arrays (99.96% hits on the generality test).

Page 25: Evolving Efficient List Search Algorithms

Results – Sublinear

An example simplified evolved solution (simplified by hand from a tree of 50 nodes down to 14):

LISP:
(PROGN2 (INDEX:= [M0+M1]/2)
        (If (> KEY Array[INDEX])
            (PROGN2 (M0:= [M0+M1]/2) (INDEX:= M1))
            (M1:= [M0+M1]/2)))

Equivalent Java:
INDEX = (M0+M1)/2;
if (KEY > arr[INDEX]) {
    M0 = (M0+M1)/2;
    INDEX = M1;
}
else
    M1 = (M0+M1)/2;

Page 26: Evolving Efficient List Search Algorithms

Results - Sublinear

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;
    for (int ITER = 0; ITER < iterations; ITER++) {
        INDEX = (M0+M1)/2;
        if (KEY > arr[INDEX]) {
            M0 = (M0+M1)/2;
            INDEX = M1;
        }
        else
            M1 = (M0+M1)/2;
    }
    return INDEX;
}

This is a form of binary search (with a small twist).
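A small generality check in the spirit of the one described earlier can be sketched as follows. It runs the evolved code on sorted arrays with unique elements of every length from 2 to a chosen maximum, and counts the keys it fails to locate. The slides give the iteration budget only as "log2 n"; rounding it up to ⌈log2 n⌉ is an assumption here, as are the names SublinearDemo, ceilLog2 and countMisses and the array contents.

```java
// Sketch only: SublinearDemo, ceilLog2 and countMisses are hypothetical names.
class SublinearDemo {
    // Assumption: iterations = ceil(log2 n); the slides only say "log2 n".
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    // The evolved sublinear solution, plugged into the frame.
    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int M0 = 0;
        int M1 = n - 1;
        int INDEX = 0;
        int iterations = ceilLog2(n);
        for (int ITER = 0; ITER < iterations; ITER++) {
            INDEX = (M0 + M1) / 2;
            if (KEY > arr[INDEX]) {
                M0 = (M0 + M1) / 2;
                INDEX = M1;
            } else {
                M1 = (M0 + M1) / 2;
            }
        }
        return INDEX;
    }

    // Searches for every key in sorted arrays of lengths 2..maxN, counting misses.
    static int countMisses(int maxN) {
        int misses = 0;
        for (int n = 2; n <= maxN; n++) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++) {
                arr[i] = 1000 + 3 * i; // sorted, unique, outside the index range
            }
            for (int i = 0; i < n; i++) {
                if (search(arr, arr[i]) != i) {
                    misses++;
                }
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        System.out.println("misses: " + countMisses(500));
    }
}
```

The twist is visible in the code: INDEX is set to M1 rather than to the midpoint whenever the key lies in the upper half, so the returned index is always the upper end of the remaining candidate range.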


Page 28: Evolving Efficient List Search Algorithms

Less Knowledge – More Automation

Re-examining the representation: most terminals and functions are either general-purpose or problem-specific.

However, one terminal stands out: [M0+M1]/2 is solution-specific.

We proceed to:
  Remove the [M0+M1]/2 terminal
  Add an automatically defined function (ADF)

Page 29: Evolving Efficient List Search Algorithms

Adding ADF

[Figure: the main-tree functions and terminals (If, PROGN2, >, <, =, TRUE, FALSE, NOP, INDEX, INDEX:=, Array[INDEX], KEY, ITER, M0, M1, M0:=, M1:=), with the [M0+M1]/2 terminal being replaced by ADF0]

Page 30: Evolving Efficient List Search Algorithms

Adding ADF

[Figure: the main-tree functions and terminals (If, PROGN2, >, <, =, TRUE, FALSE, NOP, INDEX, INDEX:=, Array[INDEX], KEY, ITER, M0, M1, M0:=, M1:=), now including ADF0, next to an example ADF0 tree]

ADF functions and terminals: +, -, *, /, 0, 1, 2, M0, M1

Page 31: Evolving Efficient List Search Algorithms

Results – Sublinear with ADF

The sublinear search problem with an ADF naturally proved more difficult than with the [M0+M1]/2 terminal.

12 out of 50 runs (24%) produced perfect solutions, passing the generality test on arrays up to length 5,000 (later increased to 20,000).

In a set of 50 additional runs, without the “2” terminal, the success rate was lower: 8%.

Page 32: Evolving Efficient List Search Algorithms

Results – Sublinear with ADF

Analysis revealed all perfect solutions to be variations of binary search.

The algorithmic idea can be deduced by inspecting the ADFs, all of which turned out to be equivalent to one of the following (all fractions truncated):

(M0+M1)/2
(M0+M1+1)/2
M0/2+(M1+1)/2

These are reminiscent of the [M0+M1]/2 terminal we dropped.
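To see how close the three ADF forms are, the following sketch evaluates them with Java's truncating integer division and checks that the third form always equals either the rounded-down or the rounded-up midpoint. The class and method names are illustrative, and the check covers only non-negative M0 ≤ M1 up to a chosen limit.

```java
// Sketch only: MidpointDemo and its method names are hypothetical.
class MidpointDemo {
    static int floorMid(int m0, int m1) { return (m0 + m1) / 2; }
    static int ceilMid(int m0, int m1)  { return (m0 + m1 + 1) / 2; }
    static int splitMid(int m0, int m1) { return m0 / 2 + (m1 + 1) / 2; }

    // True if, for all 0 <= m0 <= m1 <= limit, the rounded-up midpoint exceeds
    // the rounded-down one by at most 1 and the third form equals one of them.
    static boolean allAgree(int limit) {
        for (int m0 = 0; m0 <= limit; m0++) {
            for (int m1 = m0; m1 <= limit; m1++) {
                int lo = floorMid(m0, m1);
                int hi = ceilMid(m0, m1);
                int s = splitMid(m0, m1);
                if (hi - lo > 1 || (s != lo && s != hi)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(floorMid(1, 4) + " " + ceilMid(1, 4) + " " + splitMid(1, 4));
        System.out.println(allAgree(200));
    }
}
```

All three are midpoint computations that differ only in how they round, which is why each can drive a binary search.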

Page 33: Evolving Efficient List Search Algorithms

Results – Sublinear with ADF

An example simplified evolved solution (simplified by hand from a total of 58 nodes down to 26):

LISP:
(PROGN2 (PROGN2 (if (< Array[INDEX] KEY) (INDEX:= ADF0) NOP)
                (if (< Array[INDEX] KEY) (M0:= INDEX) (M1:= INDEX)))
        (INDEX:= ADF0))

ADF0:
(/ (+ (+ 1 M0) M1) 2)

Equivalent Java:
if (arr[INDEX] < KEY)
    INDEX = ((1+M0)+M1)/2;
if (arr[INDEX] < KEY)
    M0 = INDEX;
else
    M1 = INDEX;
INDEX = ((1+M0)+M1)/2;

Page 34: Evolving Efficient List Search Algorithms

Results – Sublinear with ADF

public static int search(int[] arr, int KEY) {
    int n = arr.length;
    int M0 = 0;
    int M1 = n-1;
    int INDEX = 0;
    for (int ITER = 0; ITER < iterations; ITER++) {
        if (arr[INDEX] < KEY)
            INDEX = ((1+M0)+M1)/2;
        if (arr[INDEX] < KEY)
            M0 = INDEX;
        else
            M1 = INDEX;
        INDEX = ((1+M0)+M1)/2;
    }
    return INDEX;
}

This is another form of binary search (with a different twist).
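The ADF-based solution can be exercised the same way. The sketch below again assumes an iteration budget of ⌈log2 n⌉ and uses hypothetical names (AdfSearchDemo, adf0, countMisses); it searches for every key of sorted, unique-element arrays of lengths 2 to a chosen maximum.

```java
// Sketch only: AdfSearchDemo, adf0 and countMisses are hypothetical names.
class AdfSearchDemo {
    // ADF0 from the slide: (/ (+ (+ 1 M0) M1) 2), the midpoint rounded up.
    static int adf0(int M0, int M1) {
        return ((1 + M0) + M1) / 2;
    }

    static int search(int[] arr, int KEY) {
        int n = arr.length;
        int M0 = 0;
        int M1 = n - 1;
        int INDEX = 0;
        // Assumption: iterations = ceil(log2 n); the slides only say "log2 n".
        int iterations = 32 - Integer.numberOfLeadingZeros(n - 1);
        for (int ITER = 0; ITER < iterations; ITER++) {
            if (arr[INDEX] < KEY) {
                INDEX = adf0(M0, M1);
            }
            if (arr[INDEX] < KEY) {
                M0 = INDEX;
            } else {
                M1 = INDEX;
            }
            INDEX = adf0(M0, M1);
        }
        return INDEX;
    }

    // Searches for every key in sorted arrays of lengths 2..maxN, counting misses.
    static int countMisses(int maxN) {
        int misses = 0;
        for (int n = 2; n <= maxN; n++) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++) {
                arr[i] = 1000 + 3 * i; // sorted, unique, outside the index range
            }
            for (int i = 0; i < n; i++) {
                if (search(arr, arr[i]) != i) {
                    misses++;
                }
            }
        }
        return misses;
    }

    public static void main(String[] args) {
        System.out.println("misses: " + countMisses(500));
    }
}
```

Here the twist is that INDEX is recomputed at the end of each iteration, so from the second iteration on the first conditional only re-assigns the value INDEX already holds.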

Page 35: Evolving Efficient List Search Algorithms

Interesting Results

Some of the other evolved solutions are interesting to mention.

With minN=2, maxN=100 and main-tree max-depth = 17, linear search algorithms evolved, failing on longer arrays.

How is this possible (in log2 n iterations)?
  An O(log n) solution has a constant factor, i.e., the algorithm does k·log n operations.
  We set a limit on the number of iterations, where in each iteration the full genotype code is executed.
  A linear search could evolve by taking advantage of the constant factor k.

[Figure: sorted array 17 18 23 34 60 63 67 75 79 82 87 88 92 96]

Page 36: Evolving Efficient List Search Algorithms

Interesting Results

Linear solution ADF: ADF0 = (M0+1)

The main tree included 16 occurrences of:

(If (= Array[INDEX] KEY) NOP (PROGN2 (M0:= ADF0) (INDEX:= M0)))

That is: if the key is found, do nothing; else increment INDEX by 1.

For an array of size n=100: log n = 7, and for k=16: k·log n = 16 × 7 = 112 > 100 (enough to traverse the whole array).

We proceeded to increase minN and maxN (to 200 and 300), and to decrease the maximum k by lowering max-depth to 10.

[Figure: sorted array 17 18 23 34 60 63 67 75 79 82 87 88 92 96, traversed one element at a time]
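The constant-factor effect can be reproduced with a short sketch. An inner loop stands in for the k repeated copies of the subtree (the evolved genotype itself contains no loops), and the iteration budget is assumed to be ⌈log2 n⌉, which gives 7 for n=100 as on the slide. The names DisguisedLinearDemo and sortedArray are illustrative.

```java
// Sketch only: DisguisedLinearDemo and sortedArray are hypothetical names.
class DisguisedLinearDemo {
    // k copies of the snippet run in each of the ceil(log2 n) iterations.
    static int search(int[] arr, int KEY, int k) {
        int n = arr.length;
        int M0 = 0;
        int INDEX = 0;
        int iterations = 32 - Integer.numberOfLeadingZeros(n - 1); // assumed ceil(log2 n)
        for (int ITER = 0; ITER < iterations; ITER++) {
            for (int copy = 0; copy < k; copy++) { // stands in for k repeated subtrees
                // (If (= Array[INDEX] KEY) NOP (PROGN2 (M0:= ADF0) (INDEX:= M0))), ADF0 = M0+1
                if (arr[INDEX] != KEY) {
                    M0 = M0 + 1;
                    INDEX = M0;
                }
            }
        }
        return INDEX;
    }

    // Builds a sorted array of n unique values outside the index range.
    static int[] sortedArray(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) {
            arr[i] = 1000 + 3 * i;
        }
        return arr;
    }

    public static void main(String[] args) {
        int[] small = sortedArray(100);
        int[] large = sortedArray(200);
        // n=100: 16 * 7 = 112 steps, enough to reach the last index (99).
        System.out.println(search(small, small[99], 16));
        // n=200: 16 * 8 = 128 steps, so the search stops short of index 199.
        System.out.println(search(large, large[199], 16));
    }
}
```

Raising minN and maxN removes the advantage: with k fixed, k·log n eventually falls below n, and a solution that walks the array one element at a time can no longer reach every key.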

Page 37: Evolving Efficient List Search Algorithms

Interesting Results

An intriguing solution evolved:
  It gains perfect scores (100% hits) up to array length 6,643
  But its ADF is: ADF0 = 2*M1-M0-1

Analyzing it revealed an interesting algorithm, which makes a series of jumps of exponentially increasing size (1, 2, 4, 8, …, 256), restarting them when the next element is too small.

It was thus able to handle array sizes n such that n ≤ 511 × log2 n, i.e., n ≤ 6643.

[Figure: sorted array 17 18 23 34 60 63 67 75 79 82 87 88 92 96, with jumps of size 1, 2, 4, 8, …, 256]
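The 6,643 limit can be checked numerically. The sketch below looks for the largest n satisfying n ≤ 511 × ⌈log2 n⌉. Reading the slide's "log2 n" as rounded up is an assumption, as is reading 511 as the total reach of one pass of jumps (1 + 2 + 4 + … + 256); the class and method names are hypothetical.

```java
// Sketch only: JumpBoundDemo and its method names are hypothetical.
class JumpBoundDemo {
    static int ceilLog2(int n) {
        return 32 - Integer.numberOfLeadingZeros(n - 1);
    }

    // Total distance covered by jumps of size 1, 2, 4, ..., largestJump.
    static int reach(int largestJump) {
        int total = 0;
        for (int jump = 1; jump <= largestJump; jump *= 2) {
            total += jump;
        }
        return total;
    }

    // Largest n in [2, limit] with n <= reachPerIteration * ceil(log2 n).
    static int largestSolvable(int reachPerIteration, int limit) {
        int best = 0;
        for (int n = 2; n <= limit; n++) {
            if ((long) n <= (long) reachPerIteration * ceilLog2(n)) {
                best = n;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int r = reach(256);
        System.out.println(r);
        System.out.println(largestSolvable(r, 1_000_000));
    }
}
```

Under these assumptions the computed limit agrees with the array length reported on the slide: ⌈log2 6643⌉ = 13 and 511 × 13 = 6643.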

Page 38: Evolving Efficient List Search Algorithms

Exponential Jumps

ADF0 = 2*M1-M0-1

The main tree included 8 occurrences similar to:

(PROGN2 (if (> Array[INDEX] KEY) (M1:= ADF0) NOP) (INDEX:= ADF0))

The difference grows by a factor of 2:

M1’  = 2*M1  - M0 - 1
M1’’ = 2*M1’ - M0 - 1
------------------------
M1’’ - M1’ = 2(M1’ - M1)

Notice that INDEX is one step ahead of M1, which allows backtracking.

[Figure: sorted array 17 18 23 34 60 63 67 75 79 82 87 88 92 96, with jumps of size 1, 2, 4, 8, …, 256]


Page 40: Evolving Efficient List Search Algorithms

Related Work

No previous work on evolving list search algorithms.

“Closest”: sorting algorithms
  Loosely related: in both cases, solutions have to be 100% correct
  We found 10-15 works on evolving sorting algorithms
  In some of these works, evolving sorting algorithms was not the goal, but just an example problem

Koza’s ADIs (1999) and Kirshenbaum’s iteration schema (2001) are not good for sublinear search: they are inherently O(n)

Loop constructs, such as Koza’s ADL (1999)
  In search, as opposed to sorting, the heart of the algorithm is the loop contents and not the fact that there is a loop, so the loop is defined outside the genotype


Page 42: Evolving Efficient List Search Algorithms

Conclusions

We showed that algorithmic design of efficient list search algorithms is possible, using a high-level fitness function that encourages correct answers within a given number of iterations.

Linear search was very simple with our setup. Sublinear search was much more challenging.

Evolution produced:
  many variations of correct binary search,
  some nearly-correct solutions erring on a mere handful of extreme cases (which one might expect, according to Knuth), and
  interesting solutions with innovative algorithmic ideas.

Page 43: Evolving Efficient List Search Algorithms

Future Work

Post-evolutionary analysis
  Tree alignment algorithms from bioinformatics
  Joint work with M. Ziv-Ukelson and S. Zakov, BGU, Israel; submitted as a journal paper

Coevolution of individual main trees and ADFs, as in Ahluwalia (2001)

Turing-complete representation (e.g., current phenotypes always halt)

More algorithms, like interpolation search

Ultimately, we wish to find an algorithmic innovation not yet invented by humans.


Page 45: Evolving Efficient List Search Algorithms

Have Your Spaghetti and Eat it Too: Evolutionary Algorithmics and Post-Evolutionary Analysis

Kfir Wolfson, Shay Zakov, Moshe Sipper, Michal Ziv-Ukelson

Department of Computer Science, Ben Gurion University

2009

Page 46: Evolving Efficient List Search Algorithms

Post-Evolutionary Analysis

GP solutions tend to be bloated, and are arduous to analyze and comprehend.

We turn to bioinformatics to design a methodology for analyzing and comprehending our GP programs.

Redefine building blocks (BBs) based on semantics rather than syntax (phenotype instead of genotype).

Take a task-oriented analysis approach to code reasoning: identify semantic BBs and analyze them as a step towards understanding the entire evolved algorithm.

Employ a new 3-step analysis tool: G-PEA (GP Post-Evolutionary Analysis)
  Use the Array Search problem as an example

Page 47: Evolving Efficient List Search Algorithms

Post-Evolutionary Analysis

Intuitions used as guidelines in this work:

The task, or function, performed by a GP expression is used to find correlation between sub-trees in the search for building blocks.

The standard search for identical structural or syntactical motifs is too strict for code created by evolution.
  We employ a similarity-based measurement in the analysis.

As in nature, the repetition of similar fragments in a number of evolved individuals suggests the importance of these fragments.
  (Syntactic) expressions with no observed similar instances are less likely to play a significant role.

Take advantage of the multitude of separately evolved solutions (common in GP) to understand how each of them works.

Page 48: Evolving Efficient List Search Algorithms


Novelty

System based on:

1) Similarity (of sub-expressions), not identity

2) Semantics, not syntax

3) Multiple solutions (trees), not single individual

Page 49: Evolving Efficient List Search Algorithms

G-PEA Methodology

[Figure: three-step pipeline (I, II, III). Labels: all sub-expressions; pairwise functional similarity between sub-expressions; bottom-up, O(n^2).]

For each cluster, try to deduce a common task for the expressions within the cluster. This is a candidate semantic building block.

Page 50: Evolving Efficient List Search Algorithms

Measuring Expression Similarity

Distance metric
  Semantically oriented
  Language-specific (not problem-specific!)
  For tree-based GP: tree edit distance

Tree edit distance
  Edit operations: replace node, remove node, add subtree, etc.
  A cost for each operation
  Edit script: the cost of a script is made up of all its operation costs
  Edit distance = cost of the minimum-cost script that transforms T1 to T2
  Normalize to [0,1]: 0 = equivalent, 1 = no similarity detected
  Think recursively, implement iteratively: dynamic programming

Page 51: Evolving Efficient List Search Algorithms

Tree Edit Distance – Example Edit Script

Transforming an If expression (condition B, branches X and Y) into an expression Z:
  1. X → Z
  2. Y → Z
  3. If → Z

Edit script cost: Distance(X,Z) + Distance(Y,Z) + 0

Edit script cost (fixed): p × Distance(X,Z) + (1-p) × Distance(Y,Z) + 0

p = estimated probability that B is true (0.5 if unknown)

Distance(If,Z) = cost of the minimum-cost edit script

Page 52: Evolving Efficient List Search Algorithms

Measuring Expression Similarity

The operations and costs are chosen so as to express semantic similarity. For instance, the use of the probability “p” in:

p × Distance(X,Z) + (1-p) × Distance(Y,Z) + 0

The Boolean value of “B” defines which branch will be executed.
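The probability-weighted cost can be written as a one-line function. This is only a sketch of the formula above, with hypothetical names; the two distances are assumed to be already computed and normalized to [0,1].

```java
// Sketch only: IfCostDemo and ifCost are hypothetical names.
class IfCostDemo {
    // Cost of editing an If expression (condition B, branches X and Y) into Z:
    // p * Distance(X,Z) + (1 - p) * Distance(Y,Z) + 0, where p estimates P(B is true).
    static double ifCost(double p, double distXZ, double distYZ) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        return p * distXZ + (1.0 - p) * distYZ;
    }

    public static void main(String[] args) {
        // Unknown condition: both branches weigh equally.
        System.out.println(ifCost(0.5, 0.2, 0.6));
        // Condition known to be true: only the X branch matters.
        System.out.println(ifCost(1.0, 0.2, 0.6));
    }
}
```

Weighting by p means a branch that is rarely executed contributes little to the distance, which is what makes the measure semantic rather than purely syntactic.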

The Array Search problem* aims at evolving binary search algorithms and has the following language:

General purpose     Problem related
If                  INDEX, INDEX:=
>, <, =             Array[INDEX]
TRUE, FALSE         KEY, ITER
PROGN2              M0, M1, M0:=, M1:=
NOP                 [M0+M1]/2

* Wolfson, Sipper / Evolving efficient list search algorithms, LNCS Evolution Artificielle, 2009

Page 53: Evolving Efficient List Search Algorithms

Results - Similarity

We defined operations for each function and terminal pair, and ran the sub-tree similarity calculation for 35 separately evolved Array Search solutions.

Example of detected similar sub-expressions:
  Size = 10 nodes each. Calculated distance = 0.024
  Notice these are almost equivalent: they differ only when KEY = Array[INDEX]

Page 54: Evolving Efficient List Search Algorithms

Results - Similarity

Another example:
  Sizes = 23 and 7 nodes. Calculated distance = 0.0625
  Notice they are actually equivalent
  We are currently not employing context analysis, which may have detected this
  They would be very similar even if Left’s conditionals were different

Page 55: Evolving Efficient List Search Algorithms


Results – Clustering 1

Page 56: Evolving Efficient List Search Algorithms


Results – Clustering 2

Page 57: Evolving Efficient List Search Algorithms


Results – Clustering 2

Page 58: Evolving Efficient List Search Algorithms

Summary and Future Work

We defined a methodology for post-evolutionary analysis of GP solutions, named G-PEA.

It searches for semantically similar expressions in a host of (probably bloated) separately evolved GP individuals, e.g., best-of-runs, and clusters similar expressions, which are used to find the crucial tasks (‘semantic building blocks’) in the evolved algorithms.

These can be used, in turn, to understand how the evolved solutions work.

The ability to assess expression similarity can also be used for code simplification, or for during-evolution similarity analysis (e.g., building-block-preserving XO, block-level mutations).

The results can probably be improved by using standard tools like axiomatic semantics and context analysis tools.

Page 59: Evolving Efficient List Search Algorithms


Thank You!

Page 60: Evolving Efficient List Search Algorithms


Previous Work? ?

Page 61: Evolving Efficient List Search Algorithms


BACKUP

Page 62: Evolving Efficient List Search Algorithms

Related Work – Evolving Sorting Algorithms

Kinnear (1993) evolved an O(n^2) bubble sort using GP.
  Problem-specific vs. less specific functions.
  Parsimony pressure improved generality.

Withall et al. (2009) developed a new GP representation.
  Fixed-length blocks of genes, representing single program statements.
  Similarity between child and parent, propagation of building blocks.
  Developed some list algorithms, including bubble sort, with a specific function set for each problem.

Agapitos et al. (2006-7) used OOGP for sorting algorithms.
  Compared five different fitness functions of array disorder
  O(n log n) solution with a hand-tailored filter function in the function set
  O(n^2) solution with an ADF (runtime measured by method invocations)
  O(n^2) solution with Evolvable Methods that can call each other