

550 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 5, OCTOBER 2006

Graph-Based Evolutionary Algorithms

Kenneth Mark Bryden, Daniel A. Ashlock, Steven Corns, and Stephen J. Willson

Abstract—Evolutionary algorithms use crossover to combine information from pairs of solutions and use selection to retain the best solutions. Ideally, crossover takes distinct good features from each of the two structures involved. This process creates a conflict: progress results from crossing over structures with different features, but crossover produces new structures that are like their parents and so reduces the diversity on which it depends. As evolution continues, the algorithm searches a smaller and smaller portion of the search space. Mutation can help maintain diversity but is not a panacea for diversity loss. This paper explores evolutionary algorithms that use combinatorial graphs to limit possible crossover partners. These graphs limit the speed and manner in which information can spread, giving competing solutions time to mature. This use of graphs is a computationally inexpensive method of picking a global level of tradeoff between exploration and exploitation. The results of using 26 graphs with a diverse collection of graphical properties are presented. The test problems used are: one-max, the De Jong functions, the Griewangk function in three to seven dimensions, the self-avoiding random walk problem in 9, 12, 16, 20, 25, 30, and 36 dimensions, the plus-one-recall-store (PORS) problem with $n = 15$, $16$, and $17$, location of length-six one-error-correcting DNA barcodes, and solving a simple differential equation semi-symbolically.

The choice of combinatorial graph has a significant effect on the time-to-solution. In the cases studied, the optimal choice of graph improved solution time as much as 63-fold with typical impact being in the range of 15% to 100% variation. The graph yielding superior performance is found to be problem dependent. In general, the optimal graph diameter increases and the optimal average degree decreases with the complexity and difficulty of the fitness landscape. The use of diverse graphs as population structures for a collection of problems also permits a classification of the problems. A phylogenetic analysis of the problems using normalized time to solution on each graph groups the numerical problems as a clade together with one-max; self-avoiding walks form a clade with the semisymbolic differential equation solution; and the PORS and DNA barcode problems form a superclade with the numerical problems but are substantially distinct from them. This novel form of analysis has the potential to aid researchers choosing problems for a test suite.

Index Terms—Evolutionary algorithm, graph-based algorithms, population structure, test suite.

Manuscript received October 18, 2004; revised March 14, 2005. This work was supported in part by a Grant from the National Energy Technology Laboratory, U.S. Department of Energy.

K. M. Bryden and S. Corns are with the Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA (e-mail: kmbryden@iastate.edu; [email protected]).

D. A. Ashlock is with the Department of Mathematics and Statistics, University of Guelph, Guelph, ON N1G 2R4, Canada (e-mail: dashlock@uoguelph.ca).

S. J. Willson is with the Department of Mathematics, Iowa State University, Ames, IA 50011 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TEVC.2005.863128

I. INTRODUCTION

IN NATURE, constraints such as geography, mutual infertility, or partner selection mechanisms are imposed on an individual’s ability to reproduce sexually with other individuals. In the simple genetic algorithm (SGA) [19], the only constraint on reproduction is that fitter individuals have a higher probability of being selected to participate. In nature, individuals separated by great distances, no matter what their respective fitnesses, have a very low probability of reproducing with each other. Within many species, one also finds cultural or behavioral constraints on the probability of two individuals reproducing. Birds have complex mating dances that help to identify good partners; frogs use distinctive calls for the same purpose; insects employ pheromones; and human partner selection techniques are complex and variable. Examples of this kind of premating nongeographic isolation can be found in [2]. Any widespread biological phenomenon that appears over and over in populations subject to natural selection probably conveys a selective advantage. Limiting mate choice is thus likely to be desirable in an evolutionary algorithm. In a complex polymodal fitness landscape, it can prevent so-called premature convergence. As we will see subsequently, it can be counterproductive in simple, unimodal fitness landscapes.

One of the standard issues in population genetics is explaining why there are not greater problems with loss of diversity in natural populations even though simple mathematical models show that diversity should vanish rapidly. The theory of isolation by distance [44] gives one reason why diversity loss is lower than expected; the separation imposed by the geography slows the spread of genetic information. Kimura and Crow [24] examined the rate at which populations on different graphical structures lose their genetic diversity under simple reproduction without selection.

Analogously, one of the fundamental problems in evolutionary algorithms is maintaining useful diversity in the population as the algorithm progresses. It is important to note that for some problems the useful level of diversity is almost nil, while in others rich diversity prevents convergence to an undesirable local optimum. During reproduction, individuals in the population are replaced by individuals with parts copied from a stochastically restricted subset of the population, and so diversity loss is acute if not carefully managed. Currently, the primary tool for such management is setting the rate of application of mutation operators. Imposing geography on the algorithm is another management tool. Implemented properly, such geography can have a very low runtime cost.

Except possibly for raising the mutation rate, imposing a geography is the least computationally intensive of the extant diversity preservation techniques. If diversity preservation is required and the randomness of preserving it with a high mutation rate is undesirable, then imposing a geography may be a good choice. The geography is selected as part of the algorithm design and need not impact runtime significantly. This paper explores this option by imposing various geographical structures coded as combinatorial graphs on an evolutionary algorithm. We call the result a “graph-based evolutionary algorithm” (GBEA). A unique feature of this paper is the exploration of many different geographic structures rather than a small group of highly related geographies. A small, initial study in this area appears in [8].

1089-778X/$20.00 © 2006 IEEE

Various approaches to managing diversity loss appear in the literature. These include using a high mutation rate, reducing the fitness of organisms in proportion to the number of organisms representing similar solutions (niche specialization), directly rejecting duplicate solutions (e.g., taboo search), and attempting to intelligently manage diversity loss. Many of these methods suffer from requiring the ability to compute the degree to which creatures are similar. With a plain string representation in which each character has meaning, this is easy. More complex representations such as finite state machines [18], parse trees (with their potential for bloat) [9], or GP-automata [3], all of which permit a single solution to have multiple encodings, render this kind of distance computation challenging. An example of this type of distance computation appears in [39], in which the authors make diversity part of a multiobjective optimizer.

Intelligent management of diversity loss is a potentially valuable approach. Intelligent management removes diversity rapidly when it is not required and conserves it when it is. An example of this type of technique appears in [21]. As presented, the technique requires the ability to estimate which building blocks within population members are better or worse than others. This restricts its utility to problems that are represented in a fashion such that i) there are identifiable building blocks for which ii) meaningful estimates of relative worth can be made. As with other schemes for managing diversity, it comes with some degree of computational overhead. In [21], two variations of the technique are compared on a variety of parameter estimation problems and are shown to enhance performance. This success relies in part on incorporating domain knowledge into the representation of the parameter estimation problems so that the building blocks are transparently available to the algorithms.

Another approach to diversity management is to impose a geography upon the population. In [1] a population is placed on each processor of a multiprocessor machine with occasional migration. This differs from the work presented here in that each vertex of the graph contains an entire population rather than a single population member. It also uses a single graph, the connection topology wired into the multiprocessor machine on which the work was performed. The current paper generalizes this work in that it considers many different graphs and differs from it in its choice of what to place at a vertex. Placing whole populations on a vertex is an option. It is also possible to create graphs that simulate placing a population at each vertex. The graph $Z$, defined subsequently, is an example of this type of simulation.

In [29], a version of Darwin’s ideas about the origin of diversity on islands and its later winnowing on continents appears. That paper used a much smaller collection of graphs than the current one and evaluated population members on their competitive ability to play the iterated prisoner’s dilemma rather than on optimization problems. The only real point of commonality is the recognition that graphs may be valuable as geographic structures. In spite of this lack of commonality, there are ideas in [29] that may be valuable for extending and improving GBEAs. The idea of the continent/island interaction suggests the use of graphs not in this paper, and the notion of training competitive agents is a potentially interesting application.

One of a large series of investigations by Whitley [31] explores island model algorithms. Distinct populations are placed on islands, and migration rates and population sizes are tuned, with the resulting performance enhancement at least partially attributable to the geographic preservation of diversity. This work is the most similar to GBEAs of which we are aware. As with the continent/island cycle, the island model can be approximated by choosing the correct graph.

Davidor et al. [15] tried using a steady-state ecological model on a grid called the ECOlogical framework. In this work, an individual could breed with its eight neighbors in the grid. Davidor demonstrated improved performance over a baseline algorithm for job shop scheduling with geographically constrained mating. In all the ECOlogical studies, the geography corresponded to an 8-neighbor toroidal graph with size to be chosen as 32 × 32, 45 × 45, or 71 × 71 depending on a heuristic estimate of the correct population size.

Here is an outline of the remainder of this paper. Section II gives the background mathematical definitions including the choice of graphs used in the experiments. In Section III, graph-based evolutionary algorithms are defined, and the 23 test problems are described. Section IV gives the precise design of the experiments. Section V describes the outcomes of the experiments and discusses the results. Section VI provides the taxonomic analysis of the results and discusses their significance. Section VII draws overall conclusions for this paper. Section VIII discusses what directions might be valuable for additional study.

II. MATHEMATICAL BACKGROUND

We assume some familiarity with graph theory [41] in this paper. A combinatorial graph or graph $G$ is a collection $V(G)$ of vertices and $E(G)$ of edges where $E(G)$ is a set of unordered pairs from $V(G)$. Two distinct vertices of the graph are neighbors if they are members of the same edge. The number of edges containing a vertex is the degree of that vertex. If all vertices in a graph have the same degree, then the graph is said to be regular. If the common degree of a regular graph is $k$, then the graph is said to be $k$-regular. A graph is connected if one can go from any vertex to any other vertex by traversing a sequence of vertices and edges. The diameter of a graph is the largest number of edges in a shortest path between any two of the vertices. The diameter is, in some sense, the shortest path across the graph.
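The definitions above can be made concrete with a small sketch. The adjacency-set representation and every function name below are illustrative assumptions, not taken from the paper; the diameter is computed by breadth-first search from each vertex.

```python
from collections import deque

def degrees(adj):
    """Map each vertex to its degree (its number of neighbors)."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def is_regular(adj):
    """True when every vertex has the same degree."""
    return len(set(degrees(adj).values())) <= 1

def bfs_distances(adj, source):
    """Edge-count distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_connected(adj):
    """True when every vertex is reachable from an arbitrary start vertex."""
    start = next(iter(adj))
    return len(bfs_distances(adj, start)) == len(adj)

def diameter(adj):
    """Largest shortest-path length; assumes the graph is connected."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)

# Hypothetical example: a ring of six vertices, each with two neighbors.
cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
```

On the six-vertex ring every vertex has degree two, so the graph is 2-regular, and the farthest pair of vertices is three edges apart.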

In this paper, a graph used to constrain mating in a population will be called the population structure. The general strategy is to use the graph to specify the geography on which a population lives, permitting mating only between neighbors, and finding graphs that can preserve diversity without hindering any potential progress due to heterogeneous crossover.

This paper utilizes a nonstandard operation on graphs called simplexification. Simplexification at a vertex $v$ replaces $v$ with a cluster of vertices, one for each neighbor of $v$, so that all the new vertices are neighbors of one another and each is a neighbor of exactly one of $v$’s former neighbors. Simplexification of a vertex with four neighbors is shown in Fig. 1. The effect of simplexification is to create small groups of vertices that are closely coupled to one another but less closely coupled to the rest of the graph. This creates an analog of a biological refuge in the graphical connection topology. By simplexification of a graph, we mean simultaneous simplexification of all the graph’s vertices.

Fig. 1. Simplexification of a vertex with four neighbors.
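Simultaneous simplexification of a whole graph, as described above, can be sketched as follows. The adjacency-set representation, the pair labels for new vertices, and the name `simplexify` are assumptions made for illustration only.

```python
def simplexify(adj):
    """Simultaneously simplexify every vertex of a graph.

    adj maps each vertex to the set of its neighbors. Each vertex v is
    replaced by one new vertex (v, u) per neighbor u of v.
    """
    new_adj = {}
    for v, nbrs in adj.items():
        for u in nbrs:
            # (v, u) is joined to the rest of the cluster that replaces v ...
            cluster = {(v, w) for w in nbrs if w != u}
            # ... and to exactly one vertex of the cluster that replaces u.
            new_adj[(v, u)] = cluster | {(u, v)}
    return new_adj

# Hypothetical base graph: a 4-regular bipartite graph on eight vertices,
# chosen here only for illustration.
base = {i: set(range(4, 8)) for i in range(4)}
base.update({j: set(range(4)) for j in range(4, 8)})

graph = base
for _ in range(3):
    graph = simplexify(graph)
```

Each pass multiplies the vertex count of a $k$-regular graph by $k$ and keeps it $k$-regular, so the eight-vertex example grows to 32, 128, and then 512 vertices.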

A. List of Graphs

This section provides some necessary mathematical definitions and describes the combinatorial graphs used in this paper.

Definition 1: The complete graph on $n$ vertices, denoted $K_n$, has $n$ vertices and all possible edges. An example of a complete graph is shown in Fig. 2.

Definition 2: The complete bipartite graph with $n$ and $m$ vertices, denoted $K_{n,m}$, has $n + m$ vertices divided into disjoint sets of $n$ and $m$ vertices and all possible edges that have one end in each of the two disjoint sets. The three-pre-$Z$ graph shown in Fig. 2 is the complete bipartite graph $K_{4,4}$.

Definition 3: The $n$-cycle, denoted $C_n$, has vertex set $\mathbb{Z}_n$. Edges join pairs of vertices that differ by 1 (mod $n$) so that the vertices form a ring with each vertex having two neighbors.

Definition 4: The $n$-hypercube, denoted $H_n$, has the set of all $n$-character binary strings as its set of vertices. Edges consist of pairs of strings that differ in exactly one position. A 4-hypercube is shown in Fig. 2.

Definition 5: The $n \times m$-torus, denoted $T_{n,m}$, has vertex set $\mathbb{Z}_n \times \mathbb{Z}_m$. Edges are pairs of vertices that differ either by 1 (mod $n$) in their first coordinate or by 1 (mod $m$) in their second coordinate but not both. These graphs are $n \times m$ grids that wrap (as tori) at the edges. A 12 × 6-torus is shown in Fig. 2.

Definition 6: The generalized Petersen graph with parameters $n$ and $k$, with $k$ relatively prime to $n$, is denoted $P_{n,k}$ and has vertex set $\{0, 1, \ldots, 2n-1\}$. The vertices $0, 1, \ldots, n-1$ are connected in a standard $n$-cycle. The vertices $n, n+1, \ldots, 2n-1$ are also connected in an $n$-cycle but with the $i$th vertex connected to the $(i+k)$th vertex. Finally, pairs of vertices $i$, $i+n$ are connected. The graph is shown in Fig. 2.
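The cycle, hypercube, torus, and generalized Petersen families defined above can be generated with a few lines each. This is a minimal sketch: the adjacency-set representation, the function names, and the vertex labelling of the Petersen graph (outer ring `0..n-1`, inner ring `n..2n-1`) are assumptions made for illustration.

```python
from itertools import product

def cycle(n):
    """n-cycle: vertex i is joined to i - 1 and i + 1 modulo n."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def hypercube(n):
    """n-hypercube: binary strings joined when they differ in one position."""
    flip = {"0": "1", "1": "0"}
    verts = ["".join(bits) for bits in product("01", repeat=n)]
    return {s: {s[:i] + flip[s[i]] + s[i + 1:] for i in range(n)}
            for s in verts}

def torus(n, m):
    """n x m grid that wraps at the edges (assumes n, m >= 3)."""
    return {(a, b): {((a + 1) % n, b), ((a - 1) % n, b),
                     (a, (b + 1) % m), (a, (b - 1) % m)}
            for a in range(n) for b in range(m)}

def petersen(n, k):
    """Generalized Petersen graph on 2n vertices with inner step k."""
    adj = {v: set() for v in range(2 * n)}

    def join(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i in range(n):
        join(i, (i + 1) % n)           # outer n-cycle
        join(n + i, n + (i + k) % n)   # inner cycle taken in steps of k
        join(i, n + i)                 # spoke between the two rings
    return adj
```

For example, `hypercube(9)` has 512 vertices, each of degree nine, and `torus(12, 6)` matches the size of the torus mentioned for Fig. 2.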

Definition 7: A tree is a connected graph with no cycles. Degree zero or one vertices are termed leaves of the tree. A regular balanced tree of degree $k$ is a tree constructed in the following manner. Begin with a single vertex. Attach $k$ neighbors to that vertex and place these neighbors in a queue. Processing the queue in order, add $k - 1$ neighbors to the vertex most recently removed from the queue and add these neighbors to the end of the queue. Continue in this fashion until the tree has the desired number of vertices. The resulting graph is a tree in which all nonleaves have degree $k$ and which has, constructively, the smallest possible diameter among trees with all nonleaves having degree $k$. We denote these graphs , where $n$ is the number of vertices. Notice that not all $n$ are possible for a given $k$.
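The queue-based construction in Definition 7 translates almost directly into code. The sketch below uses an assumed adjacency-set representation and a hypothetical function name; it raises an error when the requested size cannot be reached, which happens unless $n = 1 + k + j(k - 1)$ for some integer $j \ge 0$.

```python
from collections import deque

def regular_balanced_tree(n, k):
    """Build a regular balanced tree with n vertices and nonleaf degree k.

    Raises ValueError when no such tree exists for this n and k.
    """
    if k < 2:
        raise ValueError("degree k must be at least 2")
    adj = {0: set()}   # begin with a single vertex
    queue = deque()

    def attach(parent, count):
        # Add `count` new leaves below `parent` and enqueue them in order.
        for _ in range(count):
            child = len(adj)
            adj[child] = {parent}
            adj[parent].add(child)
            queue.append(child)

    attach(0, k)                        # the first vertex gets k neighbors
    while len(adj) < n:
        attach(queue.popleft(), k - 1)  # later vertices get k - 1 more
    if len(adj) != n:
        raise ValueError(f"no regular balanced tree with n={n}, k={k}")
    return adj
```

With $k = 5$ the reachable sizes are $6, 10, 14, \ldots$, so 510 vertices are possible and 512 are not; with $k = 3$ the size 512 is reachable.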

Definition 8: The graph $Z$ is created by starting with $K_{4,4}$ and then simplexifying the entire graph three times. Two of the steps leading to the graph $Z$ are shown in Fig. 2.

In addition, four classes of random graphs are used in this paper. A random graph is specified by the algorithm used to create it. Three instances from each class of random graph are used.

Definition 9: An edge move is performed as follows. Two edges $\{a, b\}$ and $\{c, d\}$ are found that have the property that none of $\{a, c\}$, $\{a, d\}$, $\{b, c\}$, or $\{b, d\}$ are themselves edges. The edges $\{a, b\}$ and $\{c, d\}$ are deleted from the graph, and the edges $\{a, c\}$ and $\{b, d\}$ are added. Notice that edge moves preserve the regularity of a graph if it is regular.

Definition 10: Regular random graphs are generated by the following algorithm. Start with a regular graph (recall that a regular graph has all vertices of the same degree) and repeatedly perform 3000 edge moves on edges selected uniformly at random from those that are valid for edge moves. For 3-regular random graphs, use as the starting point. For 4-regular random graphs, use as the starting point. For 9-regular random graphs, use as the starting point. These graphs are denoted , where is the number of vertices, is the regular degree, and , is the instance of the graph in this paper.
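The edge move and its repeated application can be sketched as below. This is an illustration under stated assumptions: vertices are integers, the graph is stored as adjacency sets, valid edge pairs are found by rejection sampling (which is uniform over the valid pairs), and the names `edge_move` and `randomize` are hypothetical.

```python
import random

def edge_move(adj, rng):
    """Attempt one edge move in place; return True if it was performed."""
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    (a, b), (c, d) = rng.sample(edges, 2)
    if rng.random() < 0.5:
        c, d = d, c                      # choose the pairing at random
    if len({a, b, c, d}) < 4:
        return False                     # the two edges share a vertex
    if c in adj[a] or d in adj[a] or c in adj[b] or d in adj[b]:
        return False                     # a cross pair is already an edge
    adj[a].remove(b)
    adj[b].remove(a)
    adj[c].remove(d)
    adj[d].remove(c)
    adj[a].add(c)
    adj[c].add(a)
    adj[b].add(d)
    adj[d].add(b)
    return True

def randomize(adj, moves, seed=0):
    """Perform `moves` successful edge moves in place.

    Loops forever if the graph admits no valid edge move at all.
    """
    rng = random.Random(seed)
    done = 0
    while done < moves:
        done += edge_move(adj, rng)
    return adj
```

Every vertex loses one edge and gains one edge in a successful move, which is why the degree of each vertex, and hence regularity, is preserved.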

Definition 11: Generate random toroidal graphs as follows. A set of 512 points is randomly placed onto the unit torus (the unit square wrapped at the edges, not the torus graph) and edges are created between those at distance 0.07 or less from one another. This distance was chosen to give an average degree of about six. After generation, the graph is checked to see if it is connected. Graphs that are not connected are rejected. These graphs are denoted , where is the radius for edge creation, and is the instance of the graph in this paper.
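A sketch of the generate-and-reject procedure in Definition 11 follows. The key detail is that distance is measured on the wrapped unit square. The function names, the seeding, and the `max_tries` cap are assumptions added for illustration.

```python
import random
from collections import deque

def torus_distance(p, q):
    """Euclidean distance on the unit square with wrap-around edges."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, 1.0 - dx)   # going around the edge may be shorter
    dy = min(dy, 1.0 - dy)
    return (dx * dx + dy * dy) ** 0.5

def connected(adj):
    """Breadth-first search reachability check from vertex 0."""
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)

def random_toroidal_graph(n=512, radius=0.07, seed=0, max_tries=1000):
    """Sample points until the resulting graph is connected."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        adj = {i: set() for i in range(n)}
        for i in range(n):
            for j in range(i + 1, n):
                if torus_distance(pts[i], pts[j]) <= radius:
                    adj[i].add(j)
                    adj[j].add(i)
        if connected(adj):
            return adj            # disconnected samples are rejected
    raise RuntimeError("no connected graph found within max_tries")
```

Two points near opposite edges of the square, such as `(0.02, 0.5)` and `(0.98, 0.5)`, are close on the torus and would be joined at radius 0.07.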

See Table I for a list of the graphs used in the work reported in this paper. It should be noted that all graphs used, including the random graphs but excluding , have 512 vertices so as to control for population size. The one off-size graph has 510 vertices since a 5-regular balanced tree cannot have 512 vertices. Exploration of the tradeoffs involved in varying the number of vertices more than a tiny amount is a topic for future research. The complete graph $K_{512}$ is included as a baseline. Graph-based evolutionary algorithms become equivalent to standard evolutionary algorithms when the graph used is $K_{512}$.

III. GRAPH-BASED EVOLUTIONARY ALGORITHMS

This section defines a GBEA as it is used in this paper. (Clearly, many other methods of incorporating graphs into evolutionary algorithms are possible.) Choose a graph $G$ with vertex set $V(G)$ and edge set $E(G)$ to use as a population structure. Place one individual on each vertex of $G$. Then, use a steady-state evolutionary algorithm [32], [37], [42] in which evolution proceeds one mating event at a time. A mating event is performed as follows. Pick a vertex $v \in V(G)$ uniformly at random. A neighbor of $v$ is then chosen for mating. The variation operators, crossover and mutation, are used to produce a single new individual that may or may not be used to replace the individual on vertex $v$. The details of how the neighbor is picked for mating and how to decide if the new individual replaces the individual on $v$ are together called the local mating rule of the GBEA. This research used local mating rules that pick a neighbor in direct proportion to its fitness (local roulette selection) and permit the new individual to replace the old either automatically or only if it is at least as fit. These local mating rules are called local roulette mating and local elite roulette mating, respectively. Section IV will specify which local mating rule is used for each test problem.

Fig. 2. Examples of complete, Petersen, torus, and hypercube graphs, and some of the steps leading to the $Z$ graph. These examples are all smaller than the graphs actually used but are members of the same families of graphs.

TABLE I. GRAPHS USED AND THEIR INDEX NAMES. INDEX NAMES ARE USED TO INDEX THE GRAPHS IN FIGURES
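A single mating event under the two local mating rules described above might look like the following sketch. The function signature, the callables passed in for fitness and the variation operators, and the uniform fallback used when all neighbor fitnesses are zero are assumptions, not details from the paper.

```python
import random

def mating_event(adj, pop, fitness, crossover, mutate, rng, elite=False):
    """Perform one steady-state mating event on population structure adj.

    adj maps each vertex to its set of neighbors and pop maps each vertex
    to the individual living there. Fitness values are assumed to be
    non-negative so that they can serve as roulette weights.
    """
    v = rng.choice(list(adj))                 # vertex picked uniformly
    nbrs = list(adj[v])
    weights = [fitness(pop[u]) for u in nbrs]
    if sum(weights) > 0:
        # Local roulette selection: probability proportional to fitness.
        mate = rng.choices(nbrs, weights=weights, k=1)[0]
    else:
        mate = rng.choice(nbrs)               # fallback, an assumption
    child = mutate(crossover(pop[v], pop[mate], rng), rng)
    # Local roulette mating always replaces; the elite variant replaces
    # only when the child is at least as fit as the current occupant.
    if not elite or fitness(child) >= fitness(pop[v]):
        pop[v] = child
    return v
```

Setting `elite=False` corresponds to local roulette mating and `elite=True` to local elite roulette mating; under the elite rule the fitness stored at any vertex never decreases.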

A graph-based evolutionary algorithm need not be steady state. Its steady-state character in this paper is a choice. A generational graph-based algorithm could be implemented in a number of ways. For example, roulette-select a neighbor for each vertex to be the coparent based on fitness. Run some form of reproduction on the population member at the vertex and the coparent to obtain the structure that will occupy the vertex in the next generation. The use of a generational form of GBEAs may be desirable in the following circumstances. Suppose that the fitness evaluation has a variable component, in that it changes either with the population or as new cases of the problem being solved are generated. An example of the former would be the training of agents to play the iterated prisoner’s dilemma [7], [16], [17].


Fig. 3. Examples of two optimal and two suboptimal walks for the 4 × 4 instance of the SAW problem. The fitnesses of the examples are 16, 16, 14, and 15, corresponding to the number of squares visited.

Fig. 4. An optimal PORS tree located by a GBEA with graph C for n = 16 nodes shown in LISP-like notation.

An example of the latter would be the Tartarus task [5], [38]. A generational algorithm evaluates fitness across the population against the same opponents or test cases, yielding a fairness unavailable in a steady-state algorithm.

A recent paper [14] by Choi and Moon uses the term “graph-based” in a different sense. In that paper, an analysis of the graph theory underlying the sorting network problem is used to obtain substantial performance improvement. Other than the chance similarity of terminology, it is a distinct type of research.

IV. EXPERIMENTAL DESIGN

The test problems were chosen because they represent different classes of problems that have been well studied and have known solutions. For evolutionary algorithms, one-max is a standard test problem. The De Jong functions are well known and permit comparison with other work using those functions, although they do not meet the criteria given in [43] to be a test suite. The lower-dimensional cases of the Griewangk function are difficult functions for optimization. Plus-one-recall-store (PORS) is a test problem with an exceptionally well-characterized fitness landscape for genetic programming. The case is a deceptive problem, containing a unique and narrow global optimum and many broader local optima, while the and cases are not. The DNA barcode problem is a new problem, included as an applied problem with the parameters that have been most studied. The ordinary differential equation solution is a precursor to many applied problems including heat transfer, fluid flow, and combustion [10], [13], [33].

Simulations were performed for 23 test problems on eachof the 26 graphs given in Table I. For 22 of the problems,5000 independent evolutionary simulations were performed,and for one problem (differential equation solution), 10,000simulations were performed to obtain tighter confidence inter-vals. The number of mating events required to find a correctsolution to the problem was saved for each of these 3,120,000simulations. If more than 1,000,000 mating events were re-quired, the simulation was recorded as having failed to find ananswer. For each graph and problem, the mean and standarddeviation of the number of mating events to solution were usedto construct 95% confidence intervals for the mean time tosolution. These are displayed in Figs. 5–12. The test problems,

Page 6: 550 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, …eldar.mathstat.uoguelph.ca/dashlock/eprints/GBEA1.pdf · Davidor et al. [15] tried using a steady-state ecological model on a

BRYDEN et al.: GRAPH-BASED EVOLUTIONARY ALGORITHMS 555

Fig. 5. Mean mating events to solution with 95% confidence intervals for theone-max problem.

the local mating rule used with each test problem, and the exact character and rate of the variation operators used are described in the following sections.
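The confidence-interval construction described above can be sketched as follows. This is a minimal illustration, assuming a normal approximation with the multiplier 1.96; the source states only that the mean and standard deviation were used, so the exact formula and the function name `mean_confidence_interval` are assumptions.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (low, high) bounds of an approximate 95% CI for the mean.

    Assumption: normal approximation, mean +/- z * s / sqrt(n), where s is
    the sample standard deviation. The paper does not state its formula.
    """
    n = len(samples)
    mean = statistics.fmean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical mating-event counts from five runs on one graph.
low, high = mean_confidence_interval([1, 2, 3, 4, 5])
```

With 5000 runs per graph, the half-width shrinks by a factor of sqrt(5000), which is why the intervals in the figures can separate graphs whose mean times are close.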

A. One-Max

The one-max problem uses a string of bits for a chromosome. In this paper, we used 20-bit strings. The fitness of a string is its weight (the number of ones in it). For the one-max study, we used local roulette mating. The crossover operator was two-point crossover. The mutation operator flipped one bit selected uniformly at random. The choice not to use elite replacement on the one-max problem reflected the essentially trivial character of the problem. The search problem is harder without elite replacement and so more likely to yield information about the relative merit of graphs.
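The one-max fitness function and the two variation operators named above can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the way the two crossover points are drawn is an assumption. The local roulette mating rule is defined elsewhere in the paper and is not reproduced here.

```python
import random


def onemax_fitness(bits):
    """Fitness is the weight of the string: the number of ones in it."""
    return sum(bits)


def two_point_crossover(a, b, rng=random):
    """Swap the middle segment between two distinct cut points (assumed scheme)."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def flip_one_bit(bits, rng=random):
    """Flip exactly one bit chosen uniformly at random."""
    k = rng.randrange(len(bits))
    return bits[:k] + [1 - bits[k]] + bits[k + 1:]


rng = random.Random(0)
parent_a = [rng.randint(0, 1) for _ in range(20)]  # 20-bit strings, as in the paper
parent_b = [rng.randint(0, 1) for _ in range(20)]
child_a, child_b = two_point_crossover(parent_a, parent_b, rng)
mutant = flip_one_bit(child_a, rng)
```

Crossover only redistributes bits between the two children, so the combined weight of the children equals that of the parents; new ones can enter a locus fixed at zero only through mutation.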

B. De Jong Functions

The De Jong functions $F_1$–$F_5$ are described in detail in [22]. $F_1$ is a three-dimensional bowl. $F_2$ is a fourth-degree bivariate polynomial surface featuring a broad suboptimal peak. $F_3$ is a sum of integer parts of five independent variables creating a function that is flat where it is not discontinuous, a kind of six-dimensional ziggurat. $F_4$ is a fourth-order paraboloid in 30 dimensions with distinct diameters in different numbers of dimensions made more complex by adding Gaussian noise. $F_5$ is the so-called "foxhole" function with many narrow local optima placed on a grid. These functions are traditional test problems in function optimization but do not serve as a complete test suite. See [43] for incisive comments.

C. The Griewangk Function

The Griewangk function is a sum of quadratic bowls, one per dimension, with cosine terms added to them, subsequently translated to yield a positive function. It has a plethora of local optima and is a natural member of a test suite. As the dimension of the Griewangk function increases, it approaches a unimodal bowl [43]. For this reason, we include this function in five cases of relatively low dimension, $n = 3, 4, 5, 6, 7$.
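For orientation, the commonly cited form of the Griewangk (Griewank) function is sketched below. This is an assumption: the paper does not print its formula in this section, and the scaling constant 4000 and the product-of-cosines term come from the standard literature definition, which may differ in detail from the variant the authors used.

```python
import math


def griewank(x):
    """Standard literature form (assumed, not taken from the paper):

    f(x) = 1 + sum(x_i^2) / 4000 - prod(cos(x_i / sqrt(i))),  i = 1..n.
    The leading 1 translates the function so its global minimum is 0.
    """
    bowl = sum(xi * xi for xi in x) / 4000.0
    ripple = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + bowl - ripple


# The five low-dimensional cases used in the study, evaluated at the origin.
values_at_origin = [griewank([0.0] * n) for n in range(3, 8)]
```

The cosine term is what creates the many local optima; as the dimension grows, the product of cosines shrinks in influence relative to the quadratic bowl, which is consistent with the remark above that the function approaches a unimodal bowl.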

D. Self-Avoiding Walks

The self-avoiding walk (SAW) problem uses a string as its chromosome. The string is over the alphabet {U, D, L, R} with the letters corresponding to up, down, left, and right moves on a grid, respectively. The cases of the SAW problem on grids of size 3 × 3, 3 × 4, 4 × 4, 4 × 5, 5 × 5, 5 × 6, and 6 × 6 are used. The length of a SAW chromosome is equal to the number of cells in the grid minus one. Fitness is evaluated by starting in the lower left corner of the grid and then making the moves specified by the chromosome. The sequence of moves made is referred to as the walk. If a move is made that would cause the walk to leave the grid, then that move is ignored. The walk can also revisit cells of the grid. Fitness is equal to the number of squares visited when the walk is completed. The problem is called the self-avoiding walk problem because optimal solutions do not revisit squares; they are self-avoiding walks. Examples of SAW chromosomes and their fitness evaluations are given in Fig. 3.
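The fitness evaluation just described can be sketched as follows. The function name, the coordinate convention, and the choice of which grid dimension is the width are assumptions made for illustration.

```python
# Moves as (dx, dy) offsets; (0, 0) is the lower left corner of the grid.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def saw_fitness(chromosome, width, height):
    """Count the distinct cells visited by the walk.

    Moves that would leave the grid are ignored; revisits are allowed
    but do not add to the fitness.
    """
    x, y = 0, 0
    visited = {(x, y)}
    for move in chromosome:
        dx, dy = MOVES[move]
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height:
            x, y = nx, ny
            visited.add((x, y))
    return len(visited)


# A length-8 chromosome on the 3 x 3 grid that snakes through every cell.
best = saw_fitness("RRULLURR", 3, 3)
```

A chromosome reaches the maximum fitness, the number of cells in the grid, only if none of its moves is wasted on leaving the grid or revisiting a cell.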

The self-avoiding walk functions fill a role similar to those of NK-landscapes [23]. Both types of problem are scalable with a large degree of epistasis, and both possess many global and local optima. The fourth example given in Fig. 3 has fitness 15 but no near neighbors (in the Hamming metric) with fitness 16. It is an example of a local optimum. The SAW problem differs from the NK-landscape problems in several ways. Every instance of the SAW problem has a known best fitness; it is possible to know when you have succeeded. This makes the collection of statistics on algorithm behavior easier. The walk for a given SAW chromosome yields a simple and intuitive visualization that can be used to help in analysis.

As SAW problems are a new type of test problem, they should be checked against the list of criteria for good test suite problems given in [43].

Criterion 1): SAW problems are quite resistant to hill climbing. Testing with a single mutation hill climber using the mutation operator of this paper showed that the ratio of local to global optima located explodes combinatorially as the problem size increases.

Criterion 2): The SAW problem is constructively nonlinear, nonseparable, and asymmetric. If we permute the order of moves made, the fitness of a given chromosome varies substantially. A perfect walk's moves can be reordered so that the majority of moves are made off the grid, reducing its fitness substantially. Since sequences of moves are good only from a particular starting position, the SAW problem is quite nonseparable. Loci near the beginning of the chromosome have fitness independent of later loci, but the fitness of later loci deeply depends on the values of earlier loci; the fitness is thus not even close to additive and the problem is nonlinear.


Fig. 6. Mean mating events to solution with 95% confidence intervals for the De Jong test suite, functions $F_1$–$F_5$.

Criterion 3): The SAW problem is scalable. The SAW problem contains an infinite number of cases that can be scaled from trivial to hard.

Criterion 4): This is the sole criterion that the SAW problem fails to satisfy. The evaluation cost of a SAW problem is small when its size is such that there is any hope of solving it.


Fig. 7. Mean mating events to solution with 95% confidence intervals for the Griewangk function in three to seven dimensions.

Criterion 5): The SAW problem uses a canonical representation, a string over a four-letter alphabet. The SAW problem thus satisfies four of the five criteria needed for members of a good test suite and so, paired with a problem that has scalable


Fig. 8. Mean mating events to solution with 95% confidence intervals for self-avoiding walks of size 3 × 3, 3 × 4, 4 × 4, and 4 × 5.

cost, yields an acceptable test suite. Work on using GBEAs with a high evaluation cost problem appears in [12] and [40].

E. Plus-One-Recall-Store

The PORS problem is described in detail in [6]. It is a type of maximum problem within the domain of genetic programming [9], [25], [26] with a small operation set and a calculator-style memory. The goal of the test problem, called the PORS efficient node use problem, is to find parse trees that evaluate to the largest integer result possible given a fixed maximum number of parse tree nodes. The language has two operations: integer addition and a store operation that places its argument in an external memory location. The language also has two terminals: the integer 1 and recall from an external memory. The difficulty of the PORS efficient node use problem varies strongly according to the congruence class (mod 3) of the number of nodes permitted. We ran experiments on $n = 15$, $n = 16$, and $n = 17$ nodes representing all three classes. The hardest case is $n = 15$; the easiest is $n = 16$. An example of a located solution is given in Fig. 4. Fitness for a given parse tree was the size of the number it produced when evaluated. In this set of experiments, the initial population was composed of randomly generated trees with exactly $n$ nodes. A successful individual was defined to be a tree that produced the largest possible number (these numbers are computed in [6]). Crossover was performed by the usual subtree exchange [26]. If this produced a tree with more than $n$ nodes, then a subtree of the root node iteratively replaced the root node until the tree had no more than $n$ nodes. This operation is called chopping. Mutation was performed by replacing a subtree picked uniformly at random


Fig. 9. Mean mating events to solution with 95% confidence intervals for self-avoiding walks of size 5 × 5, 5 × 6, and 6 × 6.

with a new random subtree of the same size for each new tree produced. For the PORS experiments, local elite roulette mating was used.
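The PORS language and the chopping operation can be sketched as below. Several details are assumptions made for illustration and are not specified in this section: trees are encoded as nested tuples, the memory starts at 0, arguments are evaluated left to right, the store operation returns the value it stores, and chopping picks a child of the root uniformly at random.

```python
import random


def evaluate(tree, memory=None):
    """Evaluate a PORS parse tree. Terminals are 1 and "rcl";
    operations are ("+", left, right) and ("sto", argument)."""
    if memory is None:
        memory = [0]  # assumed initial memory content
    if tree == 1:
        return 1
    if tree == "rcl":
        return memory[0]
    if tree[0] == "+":
        left = evaluate(tree[1], memory)   # left argument first (assumed order)
        right = evaluate(tree[2], memory)
        return left + right
    if tree[0] == "sto":
        memory[0] = evaluate(tree[1], memory)
        return memory[0]                   # assumed: store returns its argument
    raise ValueError(f"unknown node: {tree!r}")


def node_count(tree):
    """Total number of operations and terminals in the tree."""
    if tree == 1 or tree == "rcl":
        return 1
    return 1 + sum(node_count(child) for child in tree[1:])


def chop(tree, max_nodes, rng=random):
    """Replace the root by one of its subtrees until the tree fits."""
    while node_count(tree) > max_nodes:
        tree = rng.choice(tree[1:])
    return tree


# Six nodes: store 1 + 1 = 2, then add the recalled 2, giving 4.
example = ("+", ("sto", ("+", 1, 1)), "rcl")
value = evaluate(example)
```

The example shows why the store and recall operations matter: reusing a stored subtotal costs one node, whereas rebuilding it by addition costs several.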

F. DNA Barcodes

DNA barcodes [4] are error correcting codes [28] over the DNA alphabet {C, G, A, T} which are able to correct errors relative to the edit metric [20]. They are used as embedded markers in genetic constructs to permit retention of source information when sequencing pooled genetic libraries. An example of their successful use to retrieve sequence source information appears in [30].

Unlike binary error correcting codes over the Hamming metric, edit metric codes lack a beautiful algebraic theory. Those used were located with a greedy closure evolutionary algorithm [4]. This type of evolutionary algorithm uses a representation consisting of a partial structure. The fitness of an individual partial structure is the quality (in this case size) of its completion by a greedy algorithm. When searching for DNA barcodes, the partial structure is a choice of three random DNA codewords, and the greedy algorithm is Conway's lexicode algorithm [4]. Fitness is simply the size of the code located by Conway's algorithm. The DNA barcode search problem exhibits a high degree of epistasis, and work thus far suggests it has an exceedingly rugged fitness landscape.

The algorithm in this paper searches for six-letter DNA words that are at a mutual distance of at least three. These are the parameters used for the wet lab testing of the technique in [30]. Barcodes of this size and distance can correct one sequencing (edit) error.
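A greedy-closure fitness evaluation of this kind can be sketched as follows. This is a hedged illustration rather than the implementation of [4]: the function names, the alphabetical scanning order, and the handling of seed words that conflict with each other (later seeds are simply skipped) are assumptions.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def greedy_closure(seeds, length=6, min_dist=3, alphabet="ACGT"):
    """Start from the seed words, then scan all words in lexicographic
    order, keeping each word that is at least min_dist from every word
    kept so far (a lexicode-style greedy completion)."""
    code = []
    candidates = list(seeds) + ["".join(w) for w in product(alphabet, repeat=length)]
    for word in candidates:
        if all(edit_distance(word, kept) >= min_dist for kept in code):
            code.append(word)
    return code


def barcode_fitness(seeds, length=6, min_dist=3):
    """Fitness of a partial structure is the size of its greedy completion."""
    return len(greedy_closure(seeds, length, min_dist))


# Small demonstration with three-letter words to keep the run short.
small_code = greedy_closure(["ACG"], length=3, min_dist=3)
```

Because the completion depends on which words the seeds block early in the scan, a small change to one seed can change the final code size considerably, which is consistent with the rugged landscape described above.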

G. Differential Equation Solution

Solving differential equations is a common genetic programming problem. Modifying the usual technique, the algorithm in


Fig. 10. Mean mating events to solution with 95% confidence intervals for the PORS problem with n = 15 (top left), n = 16 (top right), and n = 17 (bottom).

this paper when computing fitness extracts the derivatives symbolically. Code for solving differential equations with symbolic derivatives used in the fitness function is available by contacting the second author.

We solve the differential equation

(1)

a simple homogeneous equation with a two-dimensional solution space

(2)

for any constants , .

The parse tree language used has operations and terminals given in Table II. Trees were initialized to have six total operations and terminals. Fitness for a parse tree coding a function was computed as the sum of the error function over 100 equally spaced sample points in the range . This is the squared deviation from agreement with the differential equation. This function is to be minimized, and the algorithm continues until 1,000,000 mating events have taken place (this did not happen in practice), or until the fitness function summed over all 100 sample points drops below 0.001. A filter was included to prevent trivial solutions, e.g., , and trivial solutions were assigned a fitness of when they were detected.

Crossover and chopping were performed as in the PORS experiments; trees were chopped if they had in excess of 22 total operations and terminals. In addition to subtree mutation of the sort used in the PORS experiments, a constant mutation was applied to each new parse tree. Constant mutation has no effect on


TABLE II
OPERATIONS AND TERMINALS FOR THE DIFFERENTIAL EQUATION PARSE TREE LANGUAGE. THE SYMBOL r DENOTES A REAL NUMBER

parse trees that do not contain ephemeral real constants. For a tree that does contain such constants, either as a terminal or as part of a unary scaling operation, one of the constants is selected uniformly at random, and a number uniformly distributed in the region [−0.1, 0.1] is added to the constant. Ephemeral constants are initialized in the range [−1, 1] but may be taken outside of this range by constant mutation. Local elite roulette mating was used. Each new tree produced resulted from a subtree crossover and was subjected to both a subtree mutation and a constant mutation.
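The constant mutation can be sketched as below. For simplicity the sketch assumes the ephemeral constants of a tree have been collected into a flat list; that representation and the function name are hypothetical.

```python
import random


def mutate_constant(constants, rng=random, step=0.1):
    """Add a uniform perturbation from [-step, step] to one constant chosen
    uniformly at random. A tree without constants is returned unchanged."""
    mutated = list(constants)
    if mutated:
        k = rng.randrange(len(mutated))
        mutated[k] += rng.uniform(-step, step)
    return mutated


rng = random.Random(0)
# Ephemeral constants are initialized in [-1, 1].
initial = [rng.uniform(-1.0, 1.0) for _ in range(4)]
after = mutate_constant(initial, rng)
```

Since the perturbation is added rather than resampled, repeated mutations let a constant drift outside the initial range, as noted in the text.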

Equations (3)–(5) are examples of solutions found by a GBEA on the graph . All of these are in fact analytical solutions to the equation as were the majority of solutions located

(3)

(4)

(5)

V. RESULTS

The primary objective of this paper was to determine the potential impact of population structure in the form of combinatorial graphs on solution speed. It also sought to document which graphs yield superior performance for a specific problem. The major result can be summarized by saying that choice of graph substantially impacts solution time and that the correct choice of graph varies from problem to problem.

For each graph and test problem, 5000 tests were completed, except in the case of the differential equation where 10,000 tests were completed for each graph. In each case, time-to-solution numbers were saved with time measured in mating events. For the one-max, SAW, and PORS problems, the solution consisted of the appearance of the first instance of the known correct solution. In the function optimization problems, a simulation was said to have found the solution when it obtained a value within 0.001 of the known optimal value. DNA barcodes were evolved until they achieved the size of the current best known solution. For the differential equation problem, the correct solution was taken to be a total squared error over all 100 sample points of at

Fig. 11. Mean mating events to solution with 95% confidence intervals for the differential equation solution problem.

Fig. 12. Mean mating events to solution with 95% confidence intervals for the DNA barcode problem.

most 0.001. Figs. 5–12 show the relative performance of each graph as scatter plots with 95% confidence intervals and the graphs sorted in increasing order of time to solution.

As used in this discussion, "performance" refers to the number of mating events required to find an acceptable solution to the problem. The top-to-bottom impact that the choice of graph has on problems is shown in Table III. An initial examination of the confidence intervals shows that performance varies from graph to graph, often significantly. Also, the degree to which performance varies is problem dependent. This indicates that graph-based evolutionary algorithms have the potential to significantly reduce convergence time for many classes of challenging problems. It is important to remember that the


one-max problem as presented here uses a nonelitist algorithm, increasing its difficulty. The other 20 sets of simulations use elitist algorithms.

The test problems can be divided into the following groups.

1) Problems with a simple fitness landscape (one-max, $F_1$, $F_3$, and $F_4$). The fitness landscape for these problems is a single hill that fills the entire landscape, adjusted in the case of $F_4$ by noise. The landscapes for one-max and $F_3$ are very similar (discontinuous pyramids), though they use different representations. These are relatively straightforward optimization problems.

2) Problems in which the fitness landscape has many local optima and several global optima (PORS16, PORS17, differential equation, SAW). Although the PORS16 problem has multiple optimal hills in its fitness landscape, each of these hills has the same fitness value. Additionally, many of the suboptimal solutions for the PORS16 problem contain "building blocks" that are tree fragments that create the numbers 2 or 3 or multiply a single argument by those numbers. The optimal answer requires two fragments creating or multiplying by two and two fragments creating or multiplying by three. The effect of this is that the majority of the local optima in the search space contain the tree fragments needed in each of the 24 optimal answers. These optimal answers differ only in the details of how they use the building blocks. See [6] for details.

The fitness landscape for the differential equation problem is the most intricate of these test problems. It is far larger and weirder than landscapes for the other test problems. As a search problem, it is dense with small correct answers [e.g., (3) and (5)], so much of the space is not involved in most searches. Unlike the PORS16 problem, the majority of these solutions cannot be built from fragments of each other.

3) Mildly deceptive or difficult landscapes with a global optimum hidden by larger local optima ($F_2$ and some of the lower-dimensional Griewangk functions). This class of problems had the hypercube or one of its random variants as their best graph, with a modest improvement in performance from 2% to 16%.

4) Problems with very difficult, possibly deceptive landscapes (PORS15, $F_5$, DNA barcodes). The PORS15 problem is the hardest search problem among the test problems. The difficulty arises because the correct solution is a unique tree that computes 32 ($2^5$), and because trees that generate threes are local optima that use large (five node) tree fragments that are of no use at all in an optimal solution. See [6] for details. The foxhole function $F_5$ also has a large number of traps.

From Table III, it is clear that the use of graphs has a substantial impact on the difficult or deceptive problems in the test set. However, for the three hardest problems ($F_5$, PORS15, DNA barcodes), the best graph to use was very different. To compare $F_5$ and PORS15, examine Fig. 13. The baseline evolutionary algorithm, the GBEA with the complete graph, is the lowest diameter graph in both plots, with Log Diameter $= 0$. For the foxhole function $F_5$, the complete graph is an outlier, whereas

TABLE III
IMPACT OF CHOICE OF GRAPH ON SOLUTION TIME. FOR EACH PROBLEM THE MINIMUM AND MAXIMUM MEAN TIME TO SOLUTION FOR ANY OF THE 26 GRAPHS USED IS GIVEN TOGETHER WITH THEIR RATIO. THE BENEFIT COLUMN GIVES THE IMPROVEMENT OVER THE BASELINE STANDARD EVOLUTIONARY ALGORITHM

for PORS15, the complete graph is part of a smooth inverse correlation between diameter and time to solution.

A. Performance of the Complete Graph

The complete graph, a GBEA configured to run as a standard evolutionary algorithm, yielded the best results for the following problems: one-max, the noisy unimodal function $F_4$, the simplest of the three PORS problems, all cases of the SAW problem, and the differential equation problem. This last had the shortest time to solution on average of any of the problems checked. These problems include both unimodal and the most highly polymodal problems in the test set. They do not include the difficult problems: PORS15, DNA barcodes, and the foxhole function $F_5$.

B. Performance of Degree-9 Graphs

The hypercube or one of the three random graphs derived from it was the best graph for $F_1$, $F_2$, $F_3$, $F_5$, and all instances of the Griewangk function. In most of these cases, the hypercube and its random analogs outperformed the complete graph by a modest margin. For the deceptive function $F_5$, the difference was quite large. If only one graph must be chosen, then the suite of problems used in this paper suggests the hypercube is a good compromise choice. It did, however, perform poorly on both PORS15 and DNA barcodes.

C. Performance on the Hardest Problem

If we rate problem difficulty by time to solution, the PORS15 problem is the most difficult problem in the test suite. The worst graph for this problem is the complete graph. The second worst are the four degree-9 graphs. Thus, the two types of graph that between them are the best for all other problems in the test suite


Fig. 13. A plot of graph diameter versus time to solution for $F_5$ and PORS15.

are the worst for PORS15. The best graphs for this problem are, , and .

The graphs , are the two highest diameter graphs. The graph is the only one specifically designed for GBEAs. It is essentially fractal in character, with closely coupled smaller groups organized into less closely coupled larger groups across three levels of scale; it is intended to create local demes. Having a high diameter is another method of creating disparate demes through isolation by distance.

D. The Impact of Randomization

For degree 3, 4, and 9, three randomized graphs each were included. The experiments demonstrate very little impact of this randomization. In most experiments, the randomized graphs group with the nonrandom graphs of the same degree. Even when a statistically significant separation appeared, e.g., degree-4 graphs for PORS15, the graphs were in the middle of the distribution of performances. There are combinatorially huge numbers of randomized versions of a given type of regular graph. Some of these may in fact exhibit significantly superior performance; nevertheless, randomly sampled graphs within a degree family used in this paper do not exhibit useful levels of enhanced performance.

E. The Very Worst Type of Graph

The regular trees were extraordinary in having only one problem in which any of them performed well, the DNA barcode problem. For the SAW problems they were in the middle of the pack. The SAW problems were best solved with a standard algorithm, i.e., a GBEA using the complete graph. For PORS15, they were in the bottom half but beat the complete graph and the degree-9 (hypercube) family. For all other problems they were the worst, often by a large margin. The current test suite of problems gives no reason to think these graphs should ever be used.

F. The Deceptive Functions

The De Jong function $F_5$ and PORS15 are the deceptive problems in the test suite. Fig. 13, which displays time to solution versus the log of graph diameter, shows that the behavior of the graphs on these problems is very different. For $F_5$, there is a rough correlation of log diameter and time to solution, with the complete graph and the regular trees behaving as outliers. For PORS15, there is a fairly strict inverse correlation of log diameter with time to solution.

The behavior of and on $F_5$ demonstrates that diameter and degree do not tell the whole story. If we dismiss the behavior of the regular trees as pathological, the two graphs with extreme degree and diameter have almost the same average time to solution on $F_5$. The way that beats on many problems is additional evidence that graph structure beyond degree and diameter impacts performance.

PORS15 has a simpler behavior than $F_5$. It was hypothesized that high diameter graphs act like island models. The water between the islands is made of majority low fitness members of the initial population. The islands form around distinct higher fitness individuals. Each island is a chance not to fall into one of the local optima of the fitness landscape. In order to check this hypothesis, a set of runs was performed with 128 disjoint copies of . The time to solution was comparable to that of .

VI. PHYLOGENETIC ANALYSIS

The data available after performing the 26-graph 23-problem comparison permit a novel sort of analysis of the problems. A


taxonomy is a hierarchical classification of a set. Linnaeus established the first definite hierarchy used to classify living organisms. Each organism was assigned a kingdom, phylum, class, order, family, genus, and species. This hierarchy gave a tree structure to the taxonomy for all living creatures. Modern taxonomy has nineteen levels of classification, extending Linnaeus' original seven. A cladogram is a tree diagram showing the evolutionary relationship among various taxonomic groups. The reader should see [27] for details on modern taxonomic procedures for living organisms. The data gathered in the GBEA study were used to create a taxonomy of test problems. Making such a cladogram required extracting taxonomic characters from the collection of problems. A taxonomic character is simply a measurable or computable quantity such as number of legs or maximum number of teeth in a healthy adult. Using the taxonomic characters, hierarchical clustering produced a cladogram that classified the problems as more or less similar. Hierarchical clustering starts with the members of a set (thought of as singleton subsets), finds the two closest, and replaces them with their union or average, repeating until all members are merged.

The choice of taxonomic characters used for clustering is critical. They must avoid bias; they must vary across the set of problems; and they must avoid arbitrary judgments to the greatest degree possible. Using color in a numerical tree-building algorithm, for instance, requires numbers be assigned to colors in a fashion that arbitrarily ranks some colors as closer to one another than others. The preceding brief discussion gives only a taste of the difficulty of choosing good taxonomic characters. Readers familiar with choosing decision variables for automatic classification, decision trees, and related branches of machine learning will recognize the issues. Any taxonomic character or decision variable must be relevant to the decision being made, vary across the set of objects being classified, and be cleanly computable for all members of the set of objects being classified.

GBEAs provide a source of taxonomic characters that are computable for any evolutionary computation problem that has a detectable solution or end point. The time to solution for a problem varies in a complex manner with the choice of graphical connection topology. This complexity is itself the genesis of the taxonomic characters. The taxonomic characters used to describe a problem are the normalized mean solution times for the problem on each graph. These characters are purely numerical. They are objective in the sense that they do not favor any particular choice of representation or parameter setting. This gives each of our 23 problems a set of 26 taxonomic characters. The resulting taxonomy is given in Fig. 14.

A. Details of the Taxonomic Technique

For each of the 23 problems $P$, a real vector $m(P)$ with 26 entries corresponding to the normalized mean solution time in each of the 26 graphs was created. The entry $m(P)_G$ of $m(P)$ corresponding to graph $G$ was the normalized mean number of mating events required to solve problem $P$ on graph $G$. The linear normalization was set so that the solution of $P$ on the graph which required the largest mean number of mating events among the 26 different graphs received the score $m(P)_G = 1$, and the graph which required the smallest mean number of mating events received the score $m(P)_G = 0$.

Fig. 14. Results of taxonomic analysis of the test problems.

For each pair of problems $P$ and $Q$, the Euclidean distance $d(P, Q)$ between the vectors $m(P)$ and $m(Q)$ was then computed by the formula

$$d(P, Q) = \sqrt{\sum_{G} \left( m(P)_G - m(Q)_G \right)^2}.$$

$d(P, Q)$ was interpreted as the distance between the problems $P$ and $Q$. An "UPGMA" tree was used to describe the taxonomic relationships among the 23 problems.

UPGMA is a clustering method commonly used to transform distance data into a tree. It received attention in [34], and a good recent description may be found in [36]. It is especially reliable if the distances have a uniform meaning. Normalization of the mean solution times makes the widely different rates of convergence comparable so that the inferred distances are appropriate for analysis by UPGMA.
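The normalization and distance computation described in this subsection can be sketched as follows. The function names and the sample numbers are hypothetical; the sketch assumes that the mean times for a problem are not all equal, so the linear rescaling is well defined.

```python
import math


def normalize(mean_times):
    """Linearly rescale one problem's mean solution times so the slowest
    graph scores 1 and the fastest graph scores 0."""
    lo, hi = min(mean_times), max(mean_times)
    return [(t - lo) / (hi - lo) for t in mean_times]


def problem_distance(u, v):
    """Euclidean distance between two normalized character vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical mean mating events for two problems on the same three graphs.
problem_p = normalize([100.0, 300.0, 200.0])      # -> [0.0, 1.0, 0.5]
problem_q = normalize([9000.0, 1000.0, 5000.0])   # -> [1.0, 0.0, 0.5]
distance = problem_distance(problem_p, problem_q)
```

In this example the two problems differ by almost two orders of magnitude in raw solution time, yet the distance reflects only that they rank the graphs in opposite order, which is the property the taxonomy relies on.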

UPGMA is an acronym for "unweighted pair group method with arithmetic mean." Given a collection of taxa and distance $d_{ij}$ between taxa $i$ and $j$, the method first links the two taxa $x$ and $y$ that are least distant. The taxa $x$ and $y$ are merged into a new unit $z$. For all taxa $w$ other than $x$ and $y$, a new distance $d_{zw}$ is computed as the average of $d_{xw}$ and $d_{yw}$, and it is noted that the new taxon $z$ really represents the average of two original taxa. Henceforth, $x$ and $y$ are ignored, and the procedure is repeated to find the next pair of taxa that are least distant. When two taxa $x$ and $y$ are combined into a new taxon $z$, the new distance $d_{zw}$ is the average of $d_{xw}$ and $d_{yw}$, weighted according to the number of original taxa in $x$ and $y$, respectively; $z$ contains all the original taxa in both $x$ and $y$. The procedure ends when the last two taxa are merged.
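The merging procedure can be sketched in a few lines. This is a minimal illustration, not the PAUP implementation: the data layout (distances keyed by `frozenset` pairs, clusters labeled by nested tuples) and the tie-breaking rule (first minimum found) are assumptions.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    dist maps frozenset({a, b}) to the distance between taxa a and b.
    Returns the list of merges as (cluster_label, merge_distance) pairs,
    where a cluster label is a nested tuple of the taxa it contains.
    """
    size = {name: 1 for name in names}  # original taxa per active cluster
    d = dict(dist)
    merges = []
    while len(size) > 1:
        # Link the two active clusters that are least distant.
        x, y = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        z = (x, y)
        merges.append((z, d[frozenset((x, y))]))
        for w in size:
            if w in (x, y):
                continue
            # Average weighted by the number of original taxa in x and y.
            d[frozenset((z, w))] = (
                size[x] * d[frozenset((x, w))] + size[y] * d[frozenset((y, w))]
            ) / (size[x] + size[y])
        size[z] = size[x] + size[y]
        del size[x], size[y]
    return merges


pairwise = {
    frozenset(("A", "B")): 2.0, frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 8.0, frozenset(("A", "D")): 20.0,
    frozenset(("B", "D")): 22.0, frozenset(("C", "D")): 30.0,
}
merge_history = upgma(["A", "B", "C", "D"], pairwise)
```

In this hypothetical example, the last merge joins D to a cluster built from C (one taxon) and the pair A, B (two taxa), so its distance is (1 × 30 + 2 × 21) / 3 = 24, not the unweighted 25.5. The size weighting makes every original taxon count equally.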

The UPGMA tree was computed using the standard software package PAUP [35]. The tree is shown in Fig. 14. Horizontal distances are proportional to the edge lengths, while vertical distances are arbitrary and selected for legibility. Problems separated by a small horizontal distance (such as Griewangk5, 6, and 7) should be regarded as very similar. Wide separations should be regarded as significant.

B. Discussion of the Taxonomic Results

The tree given in Fig. 14 has several striking features.

1) All the numerical problems are grouped into a single clade with the one-max problem.

2) All the SAW problems are grouped into a single clade. Moreover, the SAW problems break into two subclades (one of which is 3 × 4, 3 × 4, and 4 × 4) of comparable horizontal extent as the numerical problems and hence of comparable diversity of problem type as the numerical problems.

3) The three PORS functions appear substantially different both from each other and from the numerical and SAW clades, as indicated both by their large horizontal extent and their placement so as not to form a clade.

The utility of the taxonomy is demonstrated by the two-member clade containing the PORS15 and DNA barcode problems, the most difficult problems tested here. Suppose that PORS15 were part of a standard test suite of problems, and the DNA barcode problem was regarded as a new practical problem, not as part of the test suite. Taxonomic analysis places the DNA barcode problem with PORS15, which suggests that the graphs which worked well on PORS15 would be most likely to work well on the DNA barcode problem. In fact, this expectation is realized in this case. Examining Figs. 10 and 12, we see that these two problems perform best on the same three graphs ( , , and ) and also perform worst on the same five graphs ( (512,9,1), (512,9,2), (512,9,3), , and complete). The good performance is on comparatively sparse graphs, and the poor performance is on graphs of high regular degree. This suggests that future searches for better DNA barcodes should use GBEAs with sparse graphs (and avoid graphs of high regular degree). This information is of substantial worth in an ongoing project to create DNA barcode sets for new sequencing projects.

The SAW and PORS problems demonstrate their worth as test suite problems by exhibiting substantial diversity in problem characteristics (horizontal extent in the tree). These results confirm the mathematical analysis in [6] that suggests that the three PORS problems have substantially different characteristics. The placement of PORS17 between PORS15 and PORS16 confirms that PORS17 is of an intermediate nature compared to the other two problems. The SAW problem set generates substantial diversity by simply varying its parameter. By contrast, the numerical problems generate less diversity, and an effective test suite might omit some of the problems as being redundant.

It is important to note that the taxonomy reflects relative performance on different graphs and not problem difficulty. The normalization of mean times into the range [0,1] before use in comparing the problems eliminates all information about the amount of time required to solve the problem. This explains why the semisymbolic differential equation problem ended up as a sister group to the SAW clade even though it is enormously easier than most of the SAW problems. This comparative simplicity is shown by the small number of mating events required for solution in Fig. 11 compared with Fig. 9.
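The loss of difficulty information under normalization can be illustrated with a minimal sketch. It assumes min-max scaling and a Euclidean distance between the normalized vectors; this section does not specify the exact normalization or distance used, so both are assumptions, and the function names and timing values are hypothetical.

```python
def normalize_times(mean_times):
    """Min-max scale mean times-to-solution into [0, 1] (assumed scheme)."""
    lo, hi = min(mean_times), max(mean_times)
    if hi == lo:
        # All graphs performed identically; no relative information remains.
        return [0.0 for _ in mean_times]
    return [(t - lo) / (hi - lo) for t in mean_times]


def distance(a, b):
    """Euclidean distance between two normalized performance vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical mean mating events on four graphs for two problems.
# The second problem is 1000 times harder but ranks the graphs identically.
easy_problem = [100.0, 150.0, 200.0, 400.0]
hard_problem = [100000.0, 150000.0, 200000.0, 400000.0]

easy_profile = normalize_times(easy_problem)
hard_profile = normalize_times(hard_problem)
gap = distance(easy_profile, hard_profile)
```

Under these assumptions the two profiles coincide, so a taxonomy built from them would place an easy problem and a hard one side by side, as happens with the differential equation problem and the SAW clade.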

Overall, the taxonomy in Fig. 14 is plausible and agrees with what the authors know of the test problems. The technique shows promise for helping to decide which problems are similar. It may also help to winnow large test problem suites by picking representatives from groups of similar problems (such as selecting only a few representative numerical problems rather than including all of them).

VII. CONCLUSION

Graph-based evolutionary algorithms can improve performance on some problems. Among the problems used in this paper, performance gain was the greatest on the hardest problems. The largest improvement in performance was in excess of 1200%, but roughly half of all test problems showed no improvement from using a GBEA. The choice of correct graph for a GBEA is clearly problem dependent. The taxonomy given in Fig. 14 gives some guidance as to which problems are similar, at least in the sense of being solved quickly or slowly on the same graphs. As a rule of thumb, difficult and deceptive problems work best with sparser graphs.

The additional runtime cost of using a GBEA, over that of a standard evolutionary algorithm for the same problem, is very low. If a good graph for the problem can be located, then there is potential for substantial benefit at very low cost. For the Griewangk functions and the SAW problems, the performance ordering of the graphs was robust as the dimension of the problem changed. This suggests that locating a “correct” graph for a problem could be done on lower dimensional or smaller problem cases and then scaled. This notion requires additional study.

The behavior of a problem on a suite of graphs forms an interesting description of the problem itself. By looking at which graphs work well or badly with a given problem, the problem can be characterized. This gives an objective taxonomic tool which could be quite useful for classifying problems. It is worth noting that the taxonomy, as presented here, is an essentially exploratory technique for data analysis.

GBEAs have been applied to the problem of designing a wood-burning stove for use in Nicaragua. The goal was to decide where to place baffles in the flow of combustion gases to make the temperature of the stove top as even as possible. The design and deployment of these stoves is described in [40]. The details of the thermal systems engineering of the stoves appear in [12]. A description of the impact of using GBEAs on the problem is given in [11]. To summarize the results: it was found that preserving diversity uniformly hindered progress. The better diversity was maintained, the worse the average value of the objective function for the stove.

The application of placing baffles in a wood-burning stove is an example of a problem for which the GBEA technique is not required. The evidence inversely correlating diversity preservation with performance suggests that the baffle placement problem is best run on a highly connected graph if a GBEA is used at all. The evidence also suggests that a standard EA is a better choice than a GBEA for this problem. At present there are no obvious features of this thermal systems problem that suggest ab initio that it was a problem for which diversity preservation was bad.

VIII. FUTURE DIRECTIONS

The graphs used in this paper are highly symmetric graphs, regular graphs, regular trees, or random regular graphs. The sole exception is the random toroidal graphs which, while not regular, are isotropic in the sense that the neighborhood of any vertex is generated with the same kind of random process as all the other vertex neighborhoods. The idea of interactions between a continent and nearby islands which motivated the work in [29] suggests the use of a different sort of graph. This conjectural class of graphs would have a highly connected core with sparse connections outward to other connected regions with fewer vertices than the core. The core area would serve as the continent and the smaller connected regions as the islands. The graph is somewhat similar in its connectivity to an archipelago, at best a distant approach to a continent/island graph.
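One way such a continent/island graph might be built is sketched below. This is an illustrative assumption, not a construction from this paper: the core is taken to be a complete graph, each island a smaller complete graph, and each island is joined to the core by a single edge. The function name and parameters are hypothetical.

```python
from itertools import combinations


def continent_island_graph(core_size, island_sizes):
    """Return an adjacency dict for a hypothetical continent/island graph.

    Vertices 0..core_size-1 form a complete core (the continent).
    Each island is a smaller complete graph attached to the core
    by exactly one edge (a sparse outward connection).
    """
    total = core_size + sum(island_sizes)
    adj = {v: set() for v in range(total)}

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)

    # Highly connected core.
    for u, v in combinations(range(core_size), 2):
        add_edge(u, v)

    next_vertex = core_size
    for i, size in enumerate(island_sizes):
        island = list(range(next_vertex, next_vertex + size))
        next_vertex += size
        # Connect the island internally.
        for u, v in combinations(island, 2):
            add_edge(u, v)
        # One bridge from the island to a core vertex.
        add_edge(island[0], i % core_size)
    return adj


# A six-vertex continent with two three-vertex islands.
graph = continent_island_graph(6, [3, 3])
```

Other choices, such as cycles for the islands or several bridges per island, fit the same description. Which variant, if any, benefits a GBEA is an open question.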

During the review procedure, it was pointed out that examination of GBEA behavior without crossover would be valuable. In such a GBEA, a population member would be selected at random, a neighbor selected in a fitness-proportional manner, and that neighbor copied over the selected population member and then mutated. A comparison of the work presented here with data from a crossover-free version would permit examination of the utility of crossover. It would help to distinguish between two different explanations for the observed changes in performance. Are they due to the effects of geographic isolation or to heterogeneous crossover? It is hypothesized in this paper that enabling the maximum number of crossovers between dissimilar parents will enhance performance on at least some of the test problems. It is just as plausible that isolation, enabled by the connectivity of the various graphs, is varying the effective number of subpopulations exploring distinct solutions.
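The crossover-free update described above can be sketched as follows. This is a minimal sketch under stated assumptions: fitness values are nonnegative and maximized, the graph is given as a dict of neighbor lists, and the one-max fitness, one-bit mutation, and cycle graph in the demonstration are illustrative choices. All names are hypothetical.

```python
import random


def crossover_free_step(population, neighbors, fitness, mutate, rng):
    """Perform one crossover-free GBEA update in place.

    A vertex is chosen uniformly at random, one of its graph neighbors
    is chosen with probability proportional to fitness, and a mutated
    copy of that neighbor overwrites the chosen vertex.
    """
    target = rng.randrange(len(population))
    nbrs = neighbors[target]
    weights = [fitness(population[n]) for n in nbrs]
    if sum(weights) > 0:
        donor = rng.choices(nbrs, weights=weights, k=1)[0]
    else:
        # All neighbors have zero fitness; fall back to a uniform choice.
        donor = rng.choice(nbrs)
    population[target] = mutate(list(population[donor]), rng)
    return target, donor


def flip_one_bit(genome, rng):
    """Flip a single randomly chosen bit of a binary genome."""
    i = rng.randrange(len(genome))
    genome[i] = 1 - genome[i]
    return genome


# Demonstration on an 8-vertex cycle with 10-bit one-max genomes.
demo_rng = random.Random(0)
size = 8
cycle = {v: [(v - 1) % size, (v + 1) % size] for v in range(size)}
pop = [[demo_rng.randint(0, 1) for _ in range(10)] for _ in range(size)]
for _ in range(200):
    crossover_free_step(pop, cycle, sum, flip_one_bit, demo_rng)
```

Running the same graph and problem with and without crossover would give the comparison the reviewers asked for.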

In addition to crossover, a number of other standard evolutionary algorithm parameters have not been tested for sensitivity. Additional work has already demonstrated that population size has a substantial impact on the performance of GBEAs. In this paper, the degree of a graph was strongly predictive of performance on a problem. Often graphs of the same degree sorted together in the ordering from best to worst performance. This ordering by degree changes when the number of vertices in the graph is varied, and a manuscript addressing this feature of GBEAs is in preparation.

The taxonomic analysis technique can benefit from winnowing the list of graphs. In many cases several graphs yield essentially the same taxonomic information. Using the time-to-solution data, a smaller set of graphs has been selected that is conjectured to yield similar taxonomic information. In particular, random graphs derived from the same regular graph seem to yield similar performance to their progenitor on all problems and hence provide no additional taxonomic information. The reduced list of graphs recommended is: (510,5), , (512,3), , , , , ,, , , , (0.07,1), , and . Reducing the list of graphs from 23 to 15 permits generation of the taxonomic characters with 40 000 fewer evolutionary algorithm runs with the current experimental design. Researchers wanting to apply GBEAs to their problems on these graphs in a manner that can be incorporated into the current taxonomic effort may contact the second author for exact descriptions of the graphs, especially (0.07,1), which is an instance of running an algorithm for generating graphs and so not completely specified here.

In this paper, the taxonomic technique is used to compare 26 different problems, each of which appears with exactly one representation and exactly one setting of the possible evolutionary algorithm parameters. A distinct application of the technique would be to taxonomize the impact of changing representation and evolutionary algorithm parameters within a problem. This would be a step toward understanding which versions of a problem (encompassing both representation and algorithm parameter settings) are substantially different from one another and which are essentially the same.

ACKNOWLEDGMENT

The authors would like to thank the members of the Iowa State Complex Adaptive Systems Program for helpful comments and discussions.

REFERENCES

[1] D. L. Ackley and M. L. Littman, “A case for distributed Lamarckian evolution,” in Artificial Life III: Santa Fe Institute Studies in the Sciences of Complexity, C. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, Eds. Redwood City, CA: Addison-Wesley, 1993, vol. 10.

[2] J. Alcock, Animal Behavior, an Evolutionary Approach, 7th ed. Sunderland, MA: Sinauer Associates, 2003.

[3] D. Ashlock, “GP-automata for dividing the dollar,” in Proc. 2nd Annu. Conf. Genetic Programming, San Francisco, CA, 1997, pp. 18–26.

[4] D. Ashlock, L. Guo, and F. Qiu, “Greedy closure genetic algorithms,” in Proc. Congr. Evol. Comput., Piscataway, NJ, 2002, pp. 1296–1301.

[5] D. Ashlock and M. Joenks, “ISAc lists, a different representation for program induction,” in Proc. 3rd Annu. Genetic Programming Conf., San Francisco, CA, 1998, pp. 3–10.

[6] D. Ashlock and J. I. Lathrop, “A fully characterized test suite for genetic programming,” in Evolutionary Programming VII. New York: Springer-Verlag, 1998, pp. 537–546.

[7] D. Ashlock, M. D. Smucker, E. A. Stanley, and L. Tesfatsion, “Preferential partner selection in an evolutionary study of prisoner’s dilemma,” Biosystems, vol. 37, pp. 99–125, 1996.

[8] D. Ashlock, J. Walker, and M. Smucker, “Graph based genetic algorithms,” in Proc. Congr. Evol. Comput., San Francisco, CA, 1999, pp. 1362–1368.

[9] W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone, Genetic Programming: An Introduction. San Francisco, CA: Morgan Kaufmann, 1998.

[10] J. Bebernes and D. Eberly, Mathematical Problems From Combustion Theory. New York: Springer-Verlag, 1989.

[11] K. M. Bryden, D. Ashlock, and D. McCorkle, “An application of graph based evolutionary algorithms for diversity preservation,” in Proc. 2004 Congr. Evol. Comput., vol. 1, 2004, pp. 419–426.

[12] K. M. Bryden, D. A. Ashlock, D. S. McCorkle, and G. L. Urban, “Optimization of heat transfer utilizing graph based evolutionary algorithms,” Int. J. Heat Fluid Flow, vol. 24, no. 2, pp. 267–277, 2003.

[13] H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, 2nd ed. London, U.K.: Oxford Univ. Press, 1959.

[14] S. Choi and B. Moon, “A graph-based Lamarckian-Baldwinian hybrid for the sorting network problem,” IEEE Trans. Evol. Comput., vol. 9, no. 1, pp. 105–114, 2005.

[15] Y. Davidor, T. Yamada, and R. Nakano, “The ECOlogical framework II: Improving GA performance at virtually zero cost,” in Proc. 5th Int. Conf. Genetic Algorithms, San Mateo, CA, 1993, pp. 171–176.

[16] D. B. Fogel, “Evolving behaviors in the iterated prisoner’s dilemma,” Evol. Comput., vol. 1, no. 1, pp. 77–97, 1993.

[17] D. B. Fogel, “On the relationship between the duration of an encounter and the evolution of cooperation in the iterated prisoner’s dilemma,” Working Paper, Jul. 1994.

[18] L. J. Fogel, A. J. Owens, and M. J. Walsh, “Intelligent decision making through a simulation of evolution,” Behav. Sci., vol. 11, no. 4, pp. 253–272, 1965.

[19] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.

[20] D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge, U.K.: Cambridge Univ. Press, 1997.

[21] S. Ho, L. Shu, and J. Chen, “Intelligent evolutionary algorithms for large parameter optimization problems,” IEEE Trans. Evol. Comput., vol. 8, no. 6, pp. 522–541, 2004.

[22] K. A. De Jong, “An analysis of the behavior of a class of genetic adaptive systems,” Ph.D. dissertation, Univ. of Michigan, Ann Arbor, MI, 1975.

[23] S. A. Kauffman, The Origins of Order. New York: Oxford Univ. Press, 1993.

[24] M. Kimura and J. Crow, “On the maximum avoidance of inbreeding,” Genet. Res., vol. 4, pp. 399–415, 1963.

[25] K. Kinnear, Advances in Genetic Programming. Cambridge, MA: MIT Press, 1994.

[26] J. R. Koza, Genetic Programming. Cambridge, MA: MIT Press, 1992.

[27] E. Mayr and P. D. Ashlock, Principles of Systematic Zoology. New York: McGraw-Hill, 1991.

[28] R. McEliece, The Theory of Information and Coding. Reading, MA: Addison-Wesley, 1977.

[29] H. Mühlenbein, “Darwin’s continent cycle theory and its simulation by the prisoner’s dilemma,” Complex Syst., vol. 5, pp. 459–478, 1991.

[30] F. Qiu, L. Guo, T. J. Wen, D. A. Ashlock, and P. S. Schnable, “DNA sequence-based bar-codes for tracking the origins of ESTs from a maize cDNA library constructed using multiple mRNA sources,” Plant Physiology, vol. 133, pp. 475–481, 2003.

[31] D. Whitley, S. Rana, and R. Heckendorn, “Island model genetic algorithms and linearly separable problems,” in Proc. AISB Workshop Evol. Comput., D. Corne and J. Shapiro, Eds., New York, 1997, pp. 109–125.

[32] C. Reynolds, “An evolved, vision-based behavioral model of coordinated group motion,” in From Animals to Animats 2, J.-A. Meyer, H. L. Roitblat, and S. Wilson, Eds. Cambridge, MA: MIT Press, 1992, pp. 384–392.

[33] H. Schlichting, Boundary Layer Theory, 7th ed. New York: McGraw-Hill, 1979.

[34] P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy; the Principles and Practice of Numerical Classification. San Francisco, CA: Freeman, 1973.

[35] D. L. Swofford, PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sunderland, MA: Sinauer, 2002.

[36] D. L. Swofford, G. J. Olsen, P. J. Waddell, and D. M. Hillis, “Phylogenetic inference,” in Molecular Systematics, 2nd ed., D. Hillis, C. Moritz, and B. Mable, Eds. Sunderland, MA: Sinauer, 1996, pp. 407–514.

[37] G. Syswerda, “A study of reproduction in generational and steady state genetic algorithms,” in Foundations of Genetic Algorithms. San Mateo, CA: Morgan Kaufmann, 1991, pp. 94–101.

[38] A. Teller, “The evolution of mental models,” in Advances in Genetic Programming, K. Kinnear, Ed. Cambridge, MA: MIT Press, 1994, ch. 9.

[39] A. Toffolo and E. Benini, “Genetic diversity as an objective in multi-objective evolutionary algorithms,” Evol. Comput., vol. 11, no. 2, pp. 151–167, 2004.

[40] G. L. Urban, K. M. Bryden, and D. Ashlock, “Engineering optimization of an improved plancha stove,” Energy Sustain. Develop., vol. 6, no. 2, pp. 5–15, 2002.

[41] D. B. West, Introduction to Graph Theory. Upper Saddle River, NJ: Prentice-Hall, 1996.

[42] D. Whitley, “The genitor algorithm and selection pressure: Why rank based allocation of reproductive trials is best,” in Proc. 3rd Int. Conf. Genetic Algorithms, 1989, pp. 116–121.

[43] D. Whitley, K. Mathias, and R. J. Dzubera, “Evaluating evolutionary algorithms,” Artif. Intell., vol. 85, pp. 245–276, 1996.

[44] S. Wright, Evolution, W. B. Provine, Ed. Chicago, IL: Univ. of Chicago Press, 1986.

Kenneth Mark Bryden is an Associate Professor and Associate Chair of the Mechanical Engineering Department, Iowa State University (ISU), Ames, IA. He currently heads the Virtual Engineering Research Laboratory with the Virtual Reality Applications Center. The Virtual Engineering Research Group focuses on integration of information technologies and cognition into the engineering process to support decision making for and realization of complex systems. Prior to his arrival at ISU, he worked 14 years in a wide range of engineering positions with Westinghouse Electric Corporation within the Naval Reactors Program. This included eight years in power plant operations and testing and six years in engineering support. His primary research interests are in the integration of virtual reality, high-performance computing, and new computational algorithms to solve complex, tightly coupled engineering and decision analysis problems.

Daniel A. Ashlock received the doctoral degree from the California Institute of Technology, Pasadena.

He is a Researcher with interests in bioinformatics and the theory and practice of evolutionary computation. His doctoral work was in combinatorics. During 13 years at Iowa State University, he was Head of the Complex Adaptive Systems Program and developed courses in both evolutionary computation and bioinformatics. Joining the faculty of the Department of Mathematics and Statistics in the University of Guelph as their Bioinformatics Chair, he continues to work in both evolutionary computation and bioinformatics. This work appears in more than 50 peer-reviewed scientific publications with topics as diverse as corn genomics, automatic programming, and the design of efficient wood-burning stoves for use in the third world.

Steven Corns received the B.S. and M.S. degrees in mechanical engineering from Iowa State University, Ames, in 2001 and 2003, respectively. He is currently working towards the Ph.D. degree in mechanical engineering with the Virtual Engineering Research Group.

His main research interests are in the area of evolutionary computation applied to biological systems and the mechanics of information transfer in evolutionary algorithms.

Stephen J. Willson received the A.B. degree from Harvard University, Cambridge, MA, in 1968, and the M.A. and Ph.D. degrees from the University of Michigan, Ann Arbor, in 1970 and 1973, respectively, all in mathematics.

His dissertation was in algebraic topology under the supervision of A. G. Wasserman. He joined Iowa State University, Ames, in 1973, where he is currently a Professor of mathematics. His research interests include computational biology (especially phylogenetics), fractals, cellular automata, and game theory.