Guided Genetic Algorithm
Lau Tung Leng
A thesis submitted for the degree of Ph.D in Computer Science
Department of Computer Science
University of Essex
Wivenhoe Park, Colchester
Essex CO4 3SQ, United Kingdom
ABSTRACT
A finite Constraint Satisfaction Problem (CSP) is a problem with a finite set of variables, where each variable is associated with a finite domain. Relationships between variables constrain the possible instantiations they can take at the same time. A solution to a CSP is an instantiation of all the variables with values from their respective domains, satisfying all the constraints. In many CSPs, such as resource planning and timetable scheduling, some solutions are better than others. The task is to find optimal or near-optimal solutions. These are called Constraint Satisfaction Optimization Problems (CSOPs).

Guided Local Search (GLS) is a meta-heuristic search algorithm which sits on top of local search algorithms. When the local search algorithm settles in a local optimum, GLS imposes penalties to modify the objective function used by the local search. This brings the search out of the local optimum.

Genetic Algorithms (GA) are stochastic search algorithms that borrow some concepts from nature. The hope is to evolve good solutions by manipulating a set of candidate solutions. They have been successfully applied to optimization problems. Difficulties abound when one tries to solve CSOPs using a GA, due to the epistasis problem: a GA assumes that variables in a candidate solution have no relationship to each other, which is contrary to CSOPs.

This thesis reports the design of, and experiments with, a new strain of GA, called the Guided Genetic Algorithm (GGA), that can handle CSOPs. GGA is a hybrid between GA and GLS. The novelty in GGA is its use of the penalties of GLS by its specialized genetic operators. By doing so, it improves on the effectiveness and robustness of both GA and GLS. GGA found world-class results on the Royal Road Function, the Radio Link Frequency Assignment Problem, the Processor Configuration Problem and the Generalized Assignment Problem.
ACKNOWLEDGEMENTS
Dedicated to Dr Edward Tsang, for whom this would never have been possible.
Dedicated to Ma, Pa and Sis, for their support.
Dedicated to Yu-Tzu, for her love.
Contents
1 Introduction . . . 1
1.1 Constraint Satisfaction Problem . . . 2
1.2 Stochastic Search . . . 4
1.3 Genetic Algorithms . . . 6
1.4 Using GA for CSP . . . 9
1.4.1 The Preparations Needed . . . 9
1.4.2 Difficulties . . . 11
1.4.3 A Suitable GA . . . 12
1.5 Current Research in GA for CP . . . 12
1.5.1 Classifying Current Research . . . 14
1.6 Goals . . . 20
1.7 Thesis Overview . . . 22

2 Guided Genetic Algorithm . . . 26
2.1 Background . . . 26
2.2 Guided Local Search . . . 27
2.3 Overview of GGA . . . 29
2.4 Features and Penalties in GGA . . . 30
2.5 Fitness Template . . . 33
2.6 GGA Operators and Population Dynamics . . . 33
2.6.1 Specialized Cross-over Operator . . . 35
2.6.2 Specialized Mutation Operator . . . 37
2.6.3 The Control Flow in GGA . . . 38
2.7 Experimentation Environment . . . 41
2.7.1 Problem Sets . . . 42
2.7.2 Testing Guidelines . . . 44

3 Royal Road Functions . . . 47
3.1 Understanding the Royal Road functions . . . 48
3.2 Preparing GGA to solve RRf . . . 53
3.2.1 Mutation Operator . . . 53
3.2.2 Defining solution features for the RRf . . . 53
3.2.3 Fitness Function . . . 54
3.3 Benchmark . . . 54
3.3.1 Indices . . . 55
3.3.2 Test Environment . . . 55
3.3.3 Results . . . 56
3.4 Summary . . . 59

4 Generalized Assignment Problem . . . 61
4.1 Expressing the GAP as a CSOP . . . 64
4.1.1 Variables and Domains . . . 64
4.1.2 Constraints and the Objective Function . . . 65
4.2 Preparing GGA to solve GAP . . . 66
4.2.1 Features for Constraint Satisfaction . . . 66
4.2.2 Features for Optimization . . . 67
4.3 Benchmark . . . 69
4.3.1 Indices . . . 70
4.3.2 Test Environment . . . 71
4.3.3 Results . . . 72
4.4 Varying Tightness . . . 77
4.4.1 Results . . . 80
4.5 Summary . . . 81

5 Radio Link Frequency Assignment Problem . . . 83
5.1 The RLFAP in PCSP Expression . . . 86
5.1.1 Variables and Domains . . . 86
5.1.2 Constraints . . . 87
5.1.3 Objective Function . . . 88
5.2 Preparing GGA to solve RLFAP . . . 88
5.3 Benchmark . . . 90
5.3.1 Test Environment . . . 90
5.3.2 Performance Evaluation . . . 91
5.4 Summary . . . 96

6 Processor Configuration Problem . . . 98
6.1 Applications . . . 99
6.2 Understanding the PCP . . . 102
6.3 Expressing PCP as a CSOP . . . 106
6.3.1 Solution representation . . . 106
6.4 Preparing GGA to solve PCP . . . 107
6.4.1 Setting up GGA's genetic operators . . . 107
6.4.2 Defining solution features of the PCP . . . 109
6.4.3 Transforming the Representation . . . 111
6.5 Benchmark . . . 111
6.5.1 Test Environment . . . 112
6.5.2 Benchmark One . . . 112
6.5.3 Benchmark Two . . . 117
6.6 Summary . . . 120

7 Conclusion . . . 123
7.1 Completed Tasks . . . 123
7.1.1 Designing the GA . . . 123
7.1.2 Test Suite Results . . . 124
7.2 Contributions . . . 126
7.2.1 Robustness . . . 126
7.2.2 Effectiveness . . . 127
7.2.3 Alleviate Epistasis . . . 128
7.2.4 General Framework . . . 129
7.3 Identifying GGA's Niche . . . 130
7.4 Future Work . . . 131

A Terms, Variables and Concepts of GGA . . . 132

B Publications . . . 136
B.1 Journal Papers . . . 136
B.2 Conference Papers . . . 137
List of Figures
1.1 A canonical Genetic Algorithm . . . 7
1.2 Classification of current research . . . 15

2.1 The Guided Genetic Algorithm . . . 31
2.2 Algorithm for the Distribution of Penalty . . . 34
2.3 Algorithm of the GGA Cross-over Operator . . . 36
2.4 An example of the GGA's Cross-over Operator in action . . . 37
2.5 Algorithm of the GGA Mutation Operator . . . 39

3.1 A 64-bit optimum string, and the set of derived building blocks . . . 49
3.2 Algorithm for computing Schemas . . . 51
3.3 An example of the GGA's Cross-over Operator on the RRf . . . 58

4.1 Pictorial view of the Generalized Assignment Problem . . . 62

6.1 A 12-processor ring . . . 100
6.2 A 32-processor torus . . . 101
6.3 Sample graph showing paths from node 0 to node 8 . . . 102
6.4 Processor configuration without System Controller . . . 104
6.5 Processor configuration with System Controller (SC) . . . 104
6.6 A sample network . . . 106
6.7 Comparing best davg produced by GGA and other algorithms . . . 114
6.8 Comparing average davg produced by GGA and other algorithms . . . 115
6.9 Comparing best Nover produced by GGA and other algorithms . . . 116
6.10 Comparing average Nover produced by various algorithms . . . 117
6.11 Comparing best execution time . . . 118
6.12 Comparing average execution time . . . 119
6.13 Comparing GGA against popular configurations . . . 121
6.14 Comparing GGA against other algorithms . . . 122
List of Tables
3.1 Comparing results by GGA with others . . . 57

4.1 Summary of all runs on GAP by GGA . . . 73
4.2 Summary of all runs on GAP by PGLS . . . 74
4.3 Recorded runs on GAP8-3 and GAP10-3 by GGA . . . 75
4.4 Recorded runs on GAP8-1, GAP8-3, GAP8-5, GAP10-3 and GAP12-3 by PGLS . . . 76
4.5 Comparing results by GGA with others . . . 78
4.6 Comparing results by GGA using feature sets of different tightness . . . 80

5.1 Characteristics of RLFAP instances . . . 86
5.2 Comparison of GGA with GLS and the CALMA project algorithms . . . 93
5.3 Comparing Robustness between GGA, GLS and PGLS . . . 97

6.1 Time (in CPU seconds) for 32-processor configurations . . . 101
6.2 Time (in CPU seconds) for 63-processor configurations . . . 102
6.3 Important terms in the PCP . . . 103
6.4 davg produced by GGA and other algorithms (dmax = 3) . . . 114
6.5 Nover produced by GGA and other algorithms (dmax = 3) . . . 116
6.6 Execution time (in CPU seconds) . . . 118
6.7 Comparing davg, the average interprocessor distances . . . 120

A.1 Components of GGA . . . 133
A.2 Data Types of GGA . . . 134
A.3 Inputs and Parameters to GGA . . . 135
Chapter 1
Introduction
Many tasks we encounter in arti�cial intelligence research can be conceptual-
ized as a Constraint Satisfaction Problem (CSP) [1]. Such is the extent and
usefulness of CSP, that it has spawned a wealth of algorithms and techniques
for a range of CSP types. In this thesis, we are particularly interested in two
classes of CSP: the Constraint Satisfaction Optimization Problem (CSOP),
and the Partial Constraint Satisfaction Problem (PCSP).
CSPs (and CSOPs and PCSPs) are generally NP-hard [1] and although
heuristics have been found useful in solving them, most systematic search al-
gorithms are deterministic and constructive [2], and would thereby be limited
by the combinatorial explosion problem. Systematic methods include search
and inference techniques. These search methods are complete and will leave
no stone unturned, so they are able to guarantee a solution or to prove that
one does not exist. Thus systematic techniques will, if necessary, search the
entire problem space for the solution [3].
The combinatorial explosion is an obstacle faced by systematic search
methods in solving realistic CSPs, and in looking for optimal and/or near-
optimal solutions in CSOPs and PCSPs. In optimization, to ensure that the
solution found is the optimal one, systematic search algorithms would need to
exhaust the entire problem space to establish that fact.
Stochastic search methods are usually incomplete. They are not able to
guarantee that a solution can be found, and neither can they prove that
a solution does not exist. They forgo completeness for efficiency. Often,
stochastic search methods can be faster in solving CSOPs than systematic
methods [4]. Many publications, such as [5-7], demonstrated on several large
problems that what systematic search algorithms fail to solve, stochastic
alternatives efficiently conquer.
The objective of this research is to tackle constraint optimization prob-
lems using stochastic methods.
1.1 Constraint Satisfaction Problem
A finite Constraint Satisfaction Problem can be described as a problem with a
finite set of variables, where each variable is associated with a finite domain.
Relationships between variables constrain the possible instantiations they
can take at the same time [1, 8]. To solve a CSP, one must find a solution
tuple that instantiates variables with values from their respective domains,
such that these instantiations do not violate any of the constraints. Our area
of research is in the Constraint Satisfaction Optimization Problem and the Partial
Constraint Satisfaction Problem, two variations of the CSP.
In the realms of CSP, the instantiation of a variable with a value from its
domain is called a label. A simultaneous instantiation of a set of variables is
called a compound label, which is a set of labels. A complete compound label
is one that assigns values, from the respective domains, to all the variables
in the CSP.
A CSOP is a CSP with an objective function f that maps every complete
compound label to a numerical value. The goal is to find a complete compound
label S such that f(S) gives an optimal (maximal or minimal) value,
and that no constraint is violated. A PCSP is similar to a CSOP, except that
the complete compound label may have variable instantiations that violate
some of its constraints. Violation is unavoidable because the constraints are
so tight that a satisfiable solution does not exist, or cannot be found [1, 9].
Deciding which constraint to violate is influenced by its cost and type. Hard
constraints are constraints that must not be violated, whereas soft constraints
may be. The sum cost of all violated constraints is reflected in the
objective function, further to other optimization criteria.
To facilitate matters, the term Constrained Problems (CP) shall be used
to collectively refer to CSP, CSOP and PCSP in this document.
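For illustration, the search for an optimal complete compound label in a small CSOP can be sketched by exhaustive enumeration. The two-variable instance below (its domains, constraint and objective) is an assumption made for this sketch, not an example drawn from this thesis:

```python
from itertools import product

# A toy CSOP (an illustrative assumption): variables x and y with
# domain {1, 2, 3}, one hard constraint x != y, and an objective
# f = x + y to be minimised over all complete compound labels
# that satisfy the constraint.
domains = {"x": [1, 2, 3], "y": [1, 2, 3]}

def satisfies(label):
    return label["x"] != label["y"]          # the single hard constraint

def objective(label):
    return label["x"] + label["y"]           # value of f to minimise

names = list(domains)
best = None
for values in product(*(domains[n] for n in names)):
    label = dict(zip(names, values))         # one complete compound label
    if satisfies(label) and (best is None or objective(label) < objective(best)):
        best = label

print(best)  # -> {'x': 1, 'y': 2}
```

Real CSOPs are far too large for such enumeration, which is precisely why the stochastic methods discussed next are of interest.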
1.2 Stochastic Search
Stochastic search is a class of search algorithms that uses heuristics to guide
its nondeterministic traversal of the search space. The search is nondeterministic
because it includes an element of randomness. This randomness can
take several forms: when choosing a starting point, when selecting a subset
of the search space, or when choosing which neighboring point (in the search
space) to proceed to.
The search space of a problem is determined by its structure. For instance,
we can perceive the search space as the set of all feasible and non-feasible
candidate solutions. Therefore, a point in this search space is a candidate
solution to the problem, be it feasible or otherwise. Any two points are
neighbors if they are next to each other in the search space (proximity is
measurable through a metric defined within the problem's structure). A
neighbouring point (solution) represents a minor change to the current solution.
The set of neighbors of a point forms its neighborhood, which is the set of
points that a stochastic search can jump to (from the current point). A
stochastic search is an iterative process of choosing and moving to a neighboring
point, usually guided by heuristics that can determine the most beneficial
point in the current neighborhood. Thus, stochastic search is local in nature;
the search progresses by making small changes to the solution, without
awareness of where these changes will eventually lead.
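The move-to-a-beneficial-neighbour iteration described above can be sketched as a simple hill climber over bit strings. The bit-flip neighbourhood and the one-max objective below are assumptions chosen for illustration:

```python
import random

# Greedy local search (hill climbing) over bit strings: a neighbouring
# point differs from the current one in exactly one bit.
def hill_climb(objective, start, max_steps=1000):
    current = list(start)
    for _ in range(max_steps):
        neighbours = []
        for i in range(len(current)):
            n = current[:]
            n[i] = 1 - n[i]                 # a minor change: flip one bit
            neighbours.append(n)
        best = max(neighbours, key=objective)
        if objective(best) <= objective(current):
            return current                  # no better neighbour: a local optimum
        current = best
    return current

random.seed(0)
ones = sum                                  # toy objective: number of 1-bits
start = [random.randint(0, 1) for _ in range(8)]
result = hill_climb(ones, start)
print(result)                               # -> [1, 1, 1, 1, 1, 1, 1, 1]
```

When no neighbour improves on the current point, the climber has settled in a local optimum; escaping such optima is what motivates the penalty and population mechanisms discussed in this thesis.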
Hill climbing, Simulated Annealing [10-12], Tabu Search [13-15], Neural
Networks [16-18] and Genetic Algorithms [19-21] are just some stochastic
search algorithms that attempt to contain the combinatorial explosion prob-
lem at the price of completeness. Hill climbing, Simulated Annealing, Tabu
Search and Neural Networks all have the same characteristic of maintaining
only one solution throughout the search process. This solution is repeatedly
improved by exploitation of the search space; making small beneficial jumps
in the search space.
Genetic Algorithms (detailed in the next section) differ by maintaining several
solutions in parallel. In addition to progressing the search by exploitation,
Genetic Algorithms offer exploration of the search space: large and undetermined
jumps in the search space. Exploration is achieved by combining elements from
two existing solutions to create a third, which may represent a new point in
the search space. This new point opens a door to a part of the search space
that may otherwise be left untouched if the search method is purely exploitative
in operation.

By maintaining a collection of solutions in parallel, and through the explorative
mode of Genetic Algorithms, they are able to mine the search space in several
locations at the same time. This makes Genetic Algorithms less local in nature,
giving them an edge in robustness. Robustness is defined as consistency in
finding optimal (or near-optimal) solutions, and consistency in the quality of
these solutions. It is precisely this reason that has encouraged our research
into Genetic Algorithms.
1.3 Genetic Algorithms
Genetic Algorithms (GA) are stochastic search algorithms that borrow some
concepts from nature [20-22]. A GA maintains a population pool of candidate
solutions called strings or chromosomes. Each chromosome p is a collection
of building blocks known as genes, which are instantiated with values from
a finite domain; the value of gene q in chromosome p in the population is
written v(p, q).
Associated with each chromosome is a fitness value which is determined
by a user-defined function, called the fitness function. The function returns
a magnitude that is proportional to the candidate solution's suitability and/or
optimality. Fig. 1.1 shows the control and data flow of a canonical GA. At the
start of the algorithm, an initial population is generated. Initial members of
the population may be randomly generated, or generated according to some
rules. The reproduction operator selects chromosomes from the population
to be parents for a new chromosome and enters them into the mating pool.
Selection of a chromosome for parenthood can range from a totally random
process to one that is biased by the chromosome's fitness.
The cross-over operator oversees the mating process of two chromosomes.
Two parent chromosomes are selected from the mating pool randomly, and
the cross-over rate, which is a real number between zero and one, determines
the probability of producing a new chromosome from the parents. If mating
is performed, a child chromosome is created which inherits complementing
genetic material from its parents. The cross-over operator decides
[Figure: flowchart of a canonical GA. Control flows from Start through generation of the initial population, then loops through the reproduction operator, cross-over operator, mutation operator and re-assessment of the population until the stopping criteria are met. Data flows between the population pool, mating pool and offspring pool.]

Figure 1.1: A canonical Genetic Algorithm
what genetic material from each parent is passed on to the child chromosome.
The new chromosome produced is entered into the offspring pool. This new
chromosome may represent an unexplored point in the search space.

The mutation operator takes each chromosome in the offspring pool and
randomly changes part of its genetic make-up, i.e. its content. The probability
of mutation occurring on any chromosome is determined by the user-specified
mutation rate. Chromosomes, mutated or otherwise, are put back into the
offspring pool after the mutation process.
Thus each new generation of chromosomes is formed by the action of
genetic operators (reproduction, cross-over and mutation) on the older population.
Finally, the members of the population pool are compared with
those of the offspring pool. The chromosomes are compared via their fitness
values to derive a new population, where the weaker chromosomes may be
eliminated. Specifically, weaker members in the population pool are replaced
by the fitter child chromosomes from the offspring pool. The heuristic for
assessing the survival of each chromosome into the next generation is called
the replacement strategy.
The process of reproduction, cross-over, mutation and formation of a new
population completes one generation cycle. A GA is left to progress through
generations until certain criteria (such as a fixed number of generations, or a
time limit) are met. GAs were initially used for machine learning systems, but
it was soon realised that GAs have great potential in function optimization
[20, 21, 23].
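The generation cycle just described can be sketched as follows. The one-max fitness, tournament reproduction, one-point cross-over and elitist replacement strategy below are illustrative assumptions; the canonical GA fixes none of these choices:

```python
import random

# A canonical GA on the one-max problem (illustrative assumptions:
# one-max fitness, tournament reproduction, one-point cross-over,
# elitist replacement).
random.seed(1)
L, POP, GENS = 12, 20, 60
CROSS_RATE, MUT_RATE = 0.9, 0.05
fitness = sum                                    # count the 1-genes

def tournament(pop):                             # fitness-biased reproduction
    return max(random.sample(pop, 2), key=fitness)

def crossover(a, b):                             # one-point cross-over
    if random.random() < CROSS_RATE:
        cut = random.randrange(1, L)
        return a[:cut] + b[cut:]
    return a[:]

def mutate(child):                               # flip each gene with prob MUT_RATE
    return [1 - g if random.random() < MUT_RATE else g for g in child]

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
for _ in range(GENS):
    offspring = [mutate(crossover(tournament(pop), tournament(pop)))
                 for _ in range(POP)]
    # replacement strategy: the POP fittest of parents and children survive
    pop = sorted(pop + offspring, key=fitness, reverse=True)[:POP]

best = max(fitness(c) for c in pop)
print(best)
```

The elitist replacement used here is only one of many possible replacement strategies; purely generational replacement, where the offspring pool simply becomes the next population, is equally canonical.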
1.4 Using Genetic Algorithms for Constrained Problems
1.4.1 The Preparations Needed
In using a GA to solve any CP, one needs to think about representing the
candidate solutions, the constraints, and the fitness function (especially for
CSOP and PCSP). From these designs, the next step would be to choose a
GA suitable for the task.
Representing Solutions
In any local search algorithm, the way candidate solutions of a problem are
represented usually shapes the type of neighbourhoods that the search will
navigate in. Similarly for GA, the representation of the candidate solutions
affects the choice of cross-over and mutation operators we use. A
property of any realistic CP is the non-trivial number of variables it has.
Another property is that the domains of these variables are varied, and may
be mixed; for example, real-number, boolean or class domains. A candidate
solution to any CP would be a chromosome, where each gene represents a
variable in the CP. The representation used in each gene is real coded (each
gene represents one variable, and encodes values as they are in the variable's
domain), since binary coding (each gene representing one bit of a variable)
may not be feasible due to the large number of variables and the different
domains. In [24], Smith et al. showed that a binary representation
of a set of realistic CPs has no performance or quality advantages over real
coding. This view is also shared in several publications, such as [25-27],
where the researchers preferred real-coded representation, arguing that
the number and nature of variables in a realistic CP would only complicate
binary representations.
Constraints
Constraints in CP are treated in a variety of ways. Eiben et al. [28-30]
suggested that some problems can have solution representations that eliminate
constraints. Warwick and Tsang [31, 32] and Paredis [33] use heuristics that
repair candidate solutions to conform to the constraints. Riff Rojas [25]
suggested a fitness function that favours constraint satisfaction.
Fitness Function
The fitness function for CSP would be to minimize the number of constraints
violated, and to minimize the total cost of constraints violated in the case of
PCSP. As for CSOP, one not only needs to prevent constraint violation, but
also to search for an optimal solution. As such, the fitness function would
have multiple objectives, consisting of more than one optimization objective:
one objective for measuring solution optimality, and another to assess
constraint violation. Alternatively, Warwick and Tsang [31] suggested a hybrid
algorithm that has the GA concentrate on just generating good solutions,
relying on a heuristic repair mechanism to correct the constraints that
were violated.
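One common way of combining the two objectives, shown here purely as an illustrative assumption rather than as this thesis's own formulation, is a weighted sum in which constraint violations are penalised:

```python
# A weighted-sum fitness for a minimisation CSOP (an illustrative
# assumption; the weight, the names and the toy instance below are
# not taken from the thesis).
def make_fitness(objective, constraints, penalty_weight=100.0):
    def fitness(solution):
        violations = sum(1 for c in constraints if not c(solution))
        return objective(solution) + penalty_weight * violations
    return fitness

# Toy instance: minimise x + y subject to the hard constraint x != y.
fitness = make_fitness(
    objective=lambda s: s["x"] + s["y"],
    constraints=[lambda s: s["x"] != s["y"]],
)
print(fitness({"x": 1, "y": 2}))   # feasible:  -> 3.0
print(fitness({"x": 1, "y": 1}))   # violated:  -> 102.0
```

A drawback of a fixed weight is that it freezes the trade-off between the two objectives for the whole run, which is part of what motivates the more adaptive schemes surveyed later in this chapter.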
Summary
The preparation process is a delicate task and there are many avenues open
to us. As discussed, the representation of candidate solutions, the constraints
and the fitness function are all closely related to each other. Whatever the
decision, the preparation is just as important, if not more so in the case of CP,
as choosing the search algorithm. It could easily affect the success of any
given GA.
1.4.2 Difficulties
A basic assumption of the canonical GA is that genes are independent of each
other, so that the value taken by one gene will not influence the instantiation
of any other gene in the same chromosome. Constrained problems are highly
epistatic in nature. Epistasis is the interaction between different genes in a
chromosome. As discussed previously, a candidate solution to a typical CP
is often represented as a chromosome, where each gene in the chromosome
describes a variable in the CP. Constraints influence both the values that sets
of genes can take simultaneously and the overall fitness of that chromosome.
Goldberg suggested high epistasis as an explanation for GAs' failure in certain
tasks [20]. Eiben et al. [30] agree that the canonical GA will not perform
well on CP for precisely this reason.
CSOP requires further sophistication in the fitness function. As discussed,
not only must we reduce the number of constraints violated, we must also look
for the optimal solution at the same time. This is called multi-objective
(or multi-criteria) optimization. Schaffer's Vector Evaluated Genetic Algorithm
(VEGA) [34] extends Grefenstette's GENESIS program [35, 36] (a generic GA
environment that is publicly available for researchers to customize and
experiment with) to provide multi-objective optimization. VEGA creates
sub-populations of equal sizes for each different optimization criterion.
Chromosomes in one sub-population evolve independently of each other, except
when mating is required; a chromosome in one sub-population is allowed to
mate with a chromosome from another. (Of note, a study by Richardson et
al. [37] suggested that the effect of VEGA is no different from a linear
combination of all the optimization objectives.) However, CSOPs are more
dynamic, since different sets of constraints may need to be violated (increasing
the violation count in the constraint-violation optimization objective; an
undesired action) at different periods of the search, so that the search can
reach different (and more preferable) parts of the search space.
1.4.3 A Suitable GA
From the previous sections (sections 1.4.1 and 1.4.2), we can derive an out-
line of the GA we want: A GA that is comfortable with real-coded represen-
tations, is capable of dealing with highly epistatic problems, and supports
multi-objective optimization (for CSOPs).
1.5 Current Research in GA for CP
Examples of applications wherein GAs (or variations) have been successful
with CPs are limited to a small number. Perhaps this is due to the
difficulties discussed in the previous section.
Research into GAs for CPs can generally be classified into two sets: GAs
that incorporate constraint knowledge, and those that do not. By constraint
knowledge, we refer to awareness of the properties of a typical CP. Thus, a GA
that incorporates constraint knowledge would make use of properties that a
CP exhibits to enhance its operation.
GAs that do otherwise may involve representation strategies, heuristic
repairs, or other established search algorithms. Representation strategies
require careful analysis of the CP and, through clever representation of
candidate solutions, eliminate most of the constraints in the problem. Heuristic
repair algorithms are usually specific to the CP at hand, and their role is to
repair violated constraints in candidate solutions generated by the GA. Hybrid
algorithms often involve GAs as a seeding program which generates good
starting solutions for the other search algorithm to improve upon.
These two sets of GAs can be further classified into three categories by
identifying, temporally, whether the changes to the canonical GA are operative
before, during or after the GA has completed its run:
• Before GA: Representation strategies and problem reduction techniques
(such as maintaining node and arc consistency) are examples of changes
that are operative before the GA is executed.

• During GA: Heuristic repair, and changes to the fitness function or the
genetic operators, are modifications that come into effect during the
execution cycle of the GA. We also classify approaches that rely
on very large population sizes into this category.

• After GA: Hybrid algorithms that use the GA as a seeding program
belong to the last group (after the GA has completed its run), since they
use the solution at the end of the GA run as input to the next search
algorithm.
Finally, each algorithm or modification may be applicable to CSP,
CSOP, PCSP, or all three. Although algorithms designed for just one of these
classes may not be directly helpful to our cause, they nonetheless represent a
significant amount of research that would definitely be beneficial to our goal.
1.5.1 Classifying Current Research
In Fig. 1.2, we classify publications relating to research in this field into
the groups discussed in the previous section. First, they are divided into
two groups, according to whether the algorithm uses constraint knowledge or
not. Within each group, the algorithms are classified into three categories:
before, during or after. Each category depicts, temporally, when the changes
made to the canonical GA become operative.
Efforts by Eiben et al. are concentrated on CSPs [28-30]. They improved
the GA's performance yield by using a combination of a representation strategy
and a hybrid of GA and classical search. However, the true impact of their
approach is difficult to appreciate, since the problems they demonstrated on
[Figure: a tree classifying current research. Genetic Algorithms divide into approaches with constraint knowledge and approaches without; within each, changes are grouped as operative before, during or after the GA. The classified works are Riff Rojas [25, 38, 39]; Kolen [24]; Warwick and Tsang [31, 32]; Bowen and Gerry [40]; Davis [41]; Paredis [33]; Michalewicz et al. [42-44]; Chu and Beasley [45, 46]; Crompton et al. [47]; Lau and Tsang [48, 49]; and Eiben et al. [28-30].]

Figure 1.2: Classification of current research
were small when compared to real-life CSPs.
Riff Rojas' first idea was to incorporate a constraint network into the
fitness function [25, 38]. With this, the fitness function would be able to "look
ahead" and perceive constraint violation; thus it is able to rate a candidate
solution poorly if its current state will lead to violation of constraints. Riff
Rojas' later works [39] expanded on this theme, and included arc-mutation
and arc-crossover: genetic operators that take advantage of the constraint
network and use it to guide their operations. Her works are limited to
CSPs, with experimentation on random graph colouring problems.
Chu and Beasley's focus is on well-known Operations Research (OR)
problems [45, 46]. Their GA is a combination of a representation strategy,
modifications to the genetic operators, and heuristic repair to handle the
constraints in these problems. All these OR problems can be mapped into
CSOPs, but the modifications to the operators and the heuristic repair are
specific to each problem.
The Radio Link Frequency Assignment Problem (RLFAP) [50] consists
of scenarios that are CSOP or PCSP in nature. Smith et al. [24] meticu-
lously investigated a variety of ways to simplify the problem set, represent
the candidate solutions and adapt the canonical GA for optimum perfor-
mance. The end product used a real-coded representation for the candidate
solutions, and the mutation operator was modified to operate on it. In
addition, the mutation operator propagates equality constraints; after
choosing a value for a variable (represented by a gene), all variables that
are related to it by equality constraints are instantiated with the same value.
For CSOP scenarios, the fitness function manages a composition of minimis-
ing constraint violations and solution optimality. The proportion of these
two objectives changes as the search progresses; the contribution from constraint
violations decreases while that from solution optimality increases.
Kolen's GA [24] was another that attempted the PCSP instances of the
RLFAP. The GA uses a massive population running on parallel hardware.
The cross-over operator uses domain knowledge to create new chromosomes,
and the mutation operator is restrained from causing genetic drift. Smith
et al. suggested that by preventing genetic drift, Kolen's GA was suited
to highly constrained problems, thus explaining its excellent results on this
subset of the RLFAP.
Crompton et al. also used a massively parallel GA approach [47]. The GA
is of the traditional variety, but uses a very large population of chromosomes
running on a computer that does symmetrical multi-processing on an array
of microprocessors.
Warwick and Tsang's interests are in CSOPs. Their GAcSP [31, 32] is a
hybrid algorithm that marries the canonical GA with Tabu Search, with
heuristic repair within the GA to fix constraint violations. Heuristic
repair guarantees that all chromosomes are valid solutions (no constraint vio-
lated), so the fitness function measures only solution optimality. The GA is
used as a seeding program to provide a good starting point for Tabu Search
to improve on.
Similarly, Davis [41] and Bowen and Gerry [40] used hybrid approaches
in which a canonical GA seeds another search algorithm with good starting
solutions. For Davis, that second algorithm is Simulated Annealing, whereas
Bowen and Gerry used a systematic search.
LPGA (Lau and Tsang [48, 49]) was designed specifically for generating
optimum plans to interconnect processors into a network. Their approach
includes a representation strategy, modification to the mutation operator and
the inclusion of an application-specific penalty algorithm. When the pop-
ulation is trapped in a local optimum, the penalty algorithm analyses the
fittest chromosome to formulate a map. This map indicates to the mutation
operator a set of gene positions that must not be mutated, since these genes
(variables) are deemed, based on the analysis, to be at their optimum
instantiation.
Michalewicz's GENOCOP series [42-44] is a GA for numerical constrained
optimization. The intended use is in the field of linear programming. Ver-
sion one of this algorithm [42] is limited to linear constraints only, and uses
the Newton-Raphson method to guide the search. Versions two and three wrap
around version one, and support both linear and non-linear constraints.
They operate by calling GENOCOP I repeatedly with a varying temperature
(as a penalty coefficient in the fitness function). The GENOCOP series is
computationally expensive; it takes about 1000 generations to optimize a six-
variable problem with two constraints.
Paredis' approach [33] to handling constraints is unique. In his GA,
candidate solutions and constraints each form a sub-population. The fittest
candidate solution is the one with the least constraint violation, whereas the
fittest constraint is the one that the most candidate solutions could not satisfy.
The constraint sub-population behaves like a penalty function; the fitness of a
constraint is a penalty for its violation in a candidate solution. Paredis' GA,
however, requires expensive computational power, and there is a possibility
that both sub-populations may get trapped in an endless feedback loop.
From the discussion of existing publications, we can observe that there
are a number of ways to use GAs to solve constrained problems. Most re-
searchers agree that a good representation can reduce the effect of epistasis,
but there is no universal representation strategy for all our problems. From
the classification chart, there is a concentration of methods that operate within
the execution cycle of a GA. Methods such as those of Eiben et al. [28, 29] and
Chu and Beasley [45, 46] used specialised genetic operators or repair operators
for their applications. However, these methods are not generic and therefore
are limited to the applications they were designed for.
Overall, the chart in Fig. 1.2 shows that most existing methods belong
to the two aforementioned categories, and that a great majority of these
methods are isolated in application to the specific problems they were written
for.
Of the methods discussed, only Smith et al. [24] came up with a GA
that does CSP (CSP can be easily mapped onto CSOP), CSOP and PCSP.
Most other algorithms were specific to only one class of constrained problems.
Kolen [24] and Crompton et al. [47] used GAs with very large population
sizes running on massively parallel machines, but not everyone has access to
such computational power. Only Riff Rojas [25, 38, 39] and Kolen [24] touched
on adding constraint knowledge to evolutionary algorithms.
None of the algorithms really addressed the inherent GA problem of epis-
tasis, which is especially prominent when dealing with CPs. If this root problem
were eliminated or suppressed, we would no longer need to rely on problem-
specific methods such as those in the articles we discussed. Of note, Lau and
Tsang provided a novel approach in which the contribution of each gene to its
chromosome's fitness can be approximated using domain knowledge [48, 49].
This information is used to tag gene positions as mutatable (ie. allowed to
have their values changed) or otherwise, since undesirable genes should change
while desirable genes must be preserved. Although there is no way to measure
the exact degree of influence that a gene imposes, this approach has nonetheless
eased the problem of epistasis.
1.6 Goals
From the previous sections, we identified the necessities that GAs should
have in order to ensure success in solving CPs, and the pitfalls of GAs that
may dampen success. To summarize, the suitable and generic GA that we aim
to create should provide the following tools for problem definition:
• Representation: The proposed GA should cater for real-coded repre-
sentation since most CPs have domains that are more demanding than
what binary representation can provide.
• Constraints: Constraints should be represented with ease. Although
constraint knowledge can be exploited to reduce the set of constraints,
this may make our GA non-generic. In addition, the GA must also have
the necessary mechanisms (such as genetic operators) to process these
constraints.
• Fitness function: For CSOPs, the proposed GA must support multi-
objective optimization, and for PCSPs, it should be aware of the cost
of violated constraints. Overall, our GA needs a fitness function that
can react to the search, and relax certain subsets of the constraints in
order to progress the search.
A problem that is properly defined using our GA's facilities should be
able to make the fullest use of its abilities. One should therefore see the
following advantages:
• Robustness: Robustness is a property of GAs. This is a much-desired
quality that could bolster confidence in the use of Artificial Intelligence
in the real world. A method that is robust should be consistent in
recommending good solutions, and the quality of these solutions should
not vary too much among runs. It would be difficult to entrust
a mission-critical problem to a method that gives solutions with a high
degree of variance.
• Effectiveness: The definition of effectiveness varies with the problem at
hand. It may mean that the algorithm should provide robust answers
(which we covered above), or that it should return the best solutions. Given
a suitably defined problem, our GA should be able to guarantee (conser-
vatively) above-average effectiveness.
In building on the foundation of the classical GA, we have borne in mind
the major contributing factor that hinders GAs in solving CPs, and many
other types of problems: the effects of epistasis. It is our aim to:
• Alleviate the effects of epistasis: All CPs have this problem, and
though using a representation strategy may ease it somewhat,
there is no generic representation strategy for all CPs. Epistasis
has plagued GAs since their inception, and the successful suppression of its
effects would be significant not only to the current goal (that of using
GAs for Constrained Problems), but also to the field of GAs. This will
involve fundamental changes to the canonical GA architecture.
1.7 Thesis Overview
In this chapter, we provided the motivation for our research. We identified
the key areas that a suitable and generic GA should provide if it is to be
successful. We looked at the existing research publications and discovered
a fundamental flaw that has plagued GAs since their inception. Finally, we
outlined the goals of this project.
In chapter 2, we will discuss our proposed GA, the Guided Genetic Al-
gorithm. First, we give a glimpse of its ancestor, followed by a complete
description of the algorithm. We will also set up the environment for experi-
mentation, which includes the choice of problems and the testing guidelines.
Chapter 3 describes a classical GA problem, whilst chapters 4, 5 and 6
cover the sets of practical problems that we used to test the effectiveness of our
algorithm. Each chapter follows a similar organisation:
• Introduction of the problem.
• Problem definition as a constrained problem.
• Instantiation of the Guided Genetic Algorithm to solve it.
• Details of the testing environment.
• Benchmarks and experimental results.
• Discussion and conclusion.
Chapter 3 contains the classical and theoretical GA problem, the Royal
Road Functions. This is the original class of problems that stemmed from
John Holland's Building Block Hypothesis [19]. It was devised to show the
strength of GAs over hill climbing search, but it turned out otherwise. The
royal road functions are a set of four objective functions, where each function
has a set of bit strings that represents the optimum instantiation for a part
of the solution. We will also investigate what Mitchell, Holland and Forrest
[51] called the Idealised Genetic Algorithm, and discuss in what way
GGA approximates it.
In chapter 4, we examine the Generalised Assignment Problem. This
is a well-explored Operations Research problem. The Generalised Assignment
Problem is an NP-hard problem that has practical instances in the real world.
One has to find the optimum assignment of a set of jobs to a group of agents.
However, each job can only be performed by one agent, and each agent has
a work capacity. Further, assigning different jobs to different agents involves
different utilities and resource requirements. These are constraints that limit
the choice of job allocation.
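The constraints above can be made concrete with a small feasibility and utility check for a toy instance; the data and function names below are invented for illustration, not taken from the thesis.

```python
# Toy Generalised Assignment Problem check (illustrative data only).
# assignment[j] = agent chosen for job j, so "each job to exactly one
# agent" holds by construction.

capacity = [10, 8]                       # work capacity per agent
resource = [[4, 6], [5, 3], [7, 2]]      # resource[j][a]: job j on agent a
utility  = [[3, 5], [4, 4], [6, 1]]      # utility[j][a]

def feasible(assignment):
    """True iff no agent's work capacity is exceeded."""
    load = [0] * len(capacity)
    for job, agent in enumerate(assignment):
        load[agent] += resource[job][agent]
    return all(l <= c for l, c in zip(load, capacity))

def total_utility(assignment):
    """Objective to maximize: the summed utility of the allocation."""
    return sum(utility[job][agent] for job, agent in enumerate(assignment))

plan = [0, 1, 1]   # job 0 -> agent 0, jobs 1 and 2 -> agent 1
```

A candidate plan is thus scored only if `feasible` holds, mirroring the capacity constraint in the text.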
The Radio Link Frequency Assignment Problem is our focus in chapter 5.
It is an abstraction of a real-life military application that involves the
assignment of frequencies to radio links. This problem set consists of eleven
instances that are classed as either a CSOP or a PCSP. Each problem has
different optimization and constraint requirements, and can have up to 916
variables and up to 5548 constraints.
In chapter 6, we set our GA on the Processor Configuration Problem.
This is an NP-hard real-life problem. The goal involves designing a network
from a finite set of processors, such that the maximum distance that a parcel
of data needs to travel between any two processors is kept to a minimum. Since
each processor has a limited number of communication channels, a carefully
designed layout would assist in reducing the overhead for message switching
in the entire network.
In the final chapter, we summarize the results from these experiments
and attempt to discover traits of our GA. From these traits, we will try
to define a niche where our method would enjoy preference. We will also be
able to define the limitations of the algorithm. This is followed by a section
discussing points raised during our research, before ending with a
grand conclusion.
Chapter 2
Guided Genetic Algorithm
2.1 Background
Among our earlier work on CSOPs, we looked at the Processor Configuration
Problem (PCP) [26, 52-54]. Briefly, the PCP is a real-life CSOP where the
task is to link up a finite set of processors into a network, whilst minimizing
the maximum distance between these processors. Since each processor has a
limited number of communication channels, a carefully planned layout will
help reduce the overhead for message switching.
We developed a GA called the Lower Power Genetic Algorithm (LPGA)
[48, 49, 55] specifically for solving the PCP. LPGA is a two-phase GA ap-
proach: in the first phase, we run LPGA until a local optimum has
been determined. The best chromosome from this run is analysed and used
to construct a fitness template for use in the next phase. This fitness tem-
plate is a map that defines undesirable genes, so influencing LPGA to change
their contents. By insisting that crucial genes do not change, the evolution
in the second phase shifts its focus onto other parts of the string, resulting in a
more compact search space.
LPGA found solutions better than any results published so far for the PCP.
Its success could be attributed to the use of an effective data representation
and, more importantly, the presence of an application-specific penalty algorithm.
In our effort to generalize LPGA, we sought to develop a GA that utilizes a
dynamic fitness template constructed by a general penalty algorithm. The
Guided Genetic Algorithm (GGA) reported in this thesis is the result of
this effort.
2.2 Guided Local Search
The Guided Local Search (GLS) algorithm developed by Voudouris and Tsang
[56, 57] is an intelligent search scheme for combinatorial optimization. It is a
meta-heuristic search that is conceptually simple, and has been shown to be
effective for a range of problems: function optimization [58], Partial Constraint
Satisfaction Problems [59], vehicle routing problems [60], a real-life scheduling
problem [61], and the travelling salesman problem [57]. With its proven track
record, it was only natural that we selected GLS as the penalty algorithm
around which GGA is built.
GLS exploits information gathered from various sources to guide local
search to promising parts of the search space. Information exists in candidate
solutions as features. Thus we can say that solutions are characterized by
a set of features Φ, where a solution feature φ_i is any non-trivial¹ property
exhibited by the solution (Eq. 2.1).

Φ = {φ_1, φ_2, ..., φ_m}    (2.1)
Research on GLS has indicated that feature definition is not difficult, since
the domain often suggests features that one could use [56]. Each feature φ_i
is represented by an indicator function I_i, which tests the existence of that
feature in the candidate solution p (Eq. 2.2).

I_i(p) = { 1, if solution p exhibits feature φ_i; 0, otherwise }    (2.2)
Given an objective function f that maps every candidate solution p to
a numerical value, a local search which makes use of GLS has its
objective function f replaced by g (Eq. 2.3). The variable π_i is the penalty
counter for feature φ_i, which holds the number of times φ_i has been penalized.
All penalty counters are initialized to zero at the beginning of the search. The
variable λ is a parameter to GLS that measures the impact penalties have
with respect to f.

g(p) = f(p) + λ · Σ_{i=1..m} (π_i · I_i(p))    (2.3)
¹A property not exhibited by all candidate solutions.
The strategy that GLS uses to select and penalize feature(s) is paramount
to its effectiveness. For each feature φ_i, we attach a cost c_i that rates its
significance relative to the other features in the set Φ. When a search is trapped
in a local optimum, we want to systematically fade out (in subsequent
generations) the features that contributed most² to the current state by
penalizing them. However, to avoid constantly penalizing the most significant
features, we must also consider the number of times each present feature has
been penalized in the past. The function util (Eq. 2.4) models the requirements
above. The features that maximize util are selected to be penalized. We
penalize a feature φ_i by adding one to its penalty counter π_i.

util(p, φ_i) = I_i(p) · c_i / (1 + π_i)    (2.4)
By penalizing key features present in a solution, the function g is modified,
and hopefully this will guide the search towards other candidate solutions. Over
iterations, GLS methodically manages the feature composition of a solution.
In so doing, GLS helps the search to escape from local optima without limiting
the search space.
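As an illustration of the mechanism just described (Eqs. 2.1-2.4), the following Python sketch implements indicator functions, the augmented objective g and the util-based penalization for a toy problem; the features, costs and objective used here are invented for the example and are not from the thesis.

```python
# Illustrative sketch of the GLS penalty mechanism (Eqs. 2.1-2.4).
# Features, costs and the objective are toy inventions: a feature here
# is "variable var takes value val" over a solution p (a list of values).

lam = 0.5                            # lambda: impact of penalties w.r.t. f

features = [
    {"cost": 3.0, "var": 0, "val": 1},   # c_1 = 3.0
    {"cost": 1.0, "var": 1, "val": 0},   # c_2 = 1.0
]
penalties = [0] * len(features)          # pi_i, all initialized to zero

def indicator(ft, p):
    """I_i(p): 1 if solution p exhibits the feature, else 0."""
    return 1 if p[ft["var"]] == ft["val"] else 0

def f(p):
    """Toy objective function."""
    return sum(p)

def g(p):
    """Augmented objective: g(p) = f(p) + lambda * sum(pi_i * I_i(p))."""
    return f(p) + lam * sum(pi * indicator(ft, p)
                            for ft, pi in zip(features, penalties))

def penalize(p):
    """At a local optimum, increment the counter of the feature(s)
    maximizing util(p, phi_i) = I_i(p) * c_i / (1 + pi_i)."""
    utils = [indicator(ft, p) * ft["cost"] / (1 + pi)
             for ft, pi in zip(features, penalties)]
    best = max(utils)
    for i, u in enumerate(utils):
        if u == best:
            penalties[i] += 1    # add one to the penalty counter pi_i

p = [1, 0, 1]
penalize(p)    # the present feature with the highest util is penalized
```

After the call, g(p) exceeds f(p) by λ times the new penalty, which is exactly what pushes the local search away from this optimum.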
2.3 Overview of GGA
GGA is a hybrid between GA and GLS. It can be seen as a GA with GLS to
bring it out of local optimum: if no progress has been made after a number
²Has a higher cost c_i for its presence.
of iterations (this number is a parameter to GGA), GLS modifies the fitness
function (which is, or contains, the objective function) by means of penalties.
The GA then uses the modified fitness function in future generations. The
penalties are also used to bias cross-over and mutation in the GA; genes that are
involved in more penalties are made more susceptible to change by these two
GGA operators. This allows GGA to be more focused in its search.
On the other hand, GGA can roughly be perceived as a number of GLS runs
from different starting points running in parallel, exchanging material in a
GA manner. The difference is that only one set of penalties is used in GGA,
whereas parallel GLS would have used one independent set of penalties per
run. Besides, learning in GGA is more selective than in GLS; the updating
of penalties is based only on the best chromosome found at the point of
penalization.
In the following sections, the details of GGA will be explained. An overview of
the algorithm is shown in Fig. 2.1. Like an ordinary GA, GGA uses a number of
parameters. Appendix A (on page 132) contains three tables (Tables
A.1, A.2 and A.3), summarizing for the reader's convenience the terms and
terminology introduced henceforth.
2.4 Features and Penalties in GGA
As in GLS, one needs to define the feature set Φ in GGA. In GGA, a feature
φ_i is defined by the assignment of a fixed set of variables to specific values.
We denote the set of variables that define φ_i as in Eq. 2.5.
[Figure 2.1 (flowchart): Start → generate initial population → reproduction
operator (forming the mating pool) → cross-over operator → mutation
operator (forming the offspring pool) → re-assess population → local
optimum? (yes: penalty operator, which updates the fitness templates) →
criteria met? (no: next cycle; yes: End). Control-flow and data-flow arrows
connect the operators to the population pool, mating pool, offspring pool and
fitness templates.]
Figure 2.1: The Guided Genetic Algorithm.
Var(φ_i) = {x_{i1}, x_{i2}, ..., x_{in}}    (2.5)
Similar to GLS, we define an indicator function I_i for each feature φ_i as
in Eq. 2.2.
Following GLS, GGA requires the user to define a cost c_i for each feature
φ_i in the feature set. Internally, GGA associates one penalty counter π_i with
each feature φ_i. Costs remain constant, but penalties may be updated during
the run. Penalties are all initialized to zero. Penalties are global, meaning
that only one set of penalty values is used for all chromosomes in GGA.
Penalties are used to change the objective function used by the GA in order
to bring it out of local optima. They are also used to bias the cross-over and
mutation operations, as will be described later.
GGA invokes the penalty operator whenever the best chromosome in the
population remains unchanged for k GA generations, where k is a parameter
to GGA. The value of k is small; for instance, we used a k between 3 and 5
for all applications reported in this thesis. The penalty operator is a special
operator in GGA. It selects features present in the best chromosome p in
the current population in the same way as GLS does, ie. the feature(s) which
maximize the function util (Eq. 2.4).
2.5 Fitness Templates, the Key to GGA's Performance
One novel feature of GGA is the fitness template. It is the bridge between
GLS and GA, enabling penalties to affect GA's cross-over and mutation
operators. Each chromosome p is associated with a fitness template Δ_p,
which is made up of smaller units called weights. One weight δ_{p,q}
corresponds to each gene γ_{p,q}. Each weight δ_{p,q} defines how susceptible
to change the corresponding gene γ_{p,q} is during cross-over and mutation;
the greater the value of δ_{p,q} (the "heavier" it is), the more likely that a
change will occur to the value of gene γ_{p,q}. This will be elaborated later.
The fitness template is computed from the penalties when a chromosome
is created, and updated whenever the penalties change. The weights of
a fitness template are calculated in the following way: all weights are initialized
to zero. Then, for each feature φ_i that is exhibited by the chromosome, the
weight δ_{p,q} is increased by the value of the penalty counter π_i, provided
that gene γ_{p,q} represents a variable that is a member of Var(φ_i). The
pseudo code of this procedure is documented in Fig. 2.2.
2.6 GGA Operators and Population Dynamics
In this section, we shall explain the operations of a generic GGA. It can
be specialized to handle individual problems. First, we shall explain the
FUNCTION DistributePenalty( chromosome p ) {
    FOR EACH weight δ_{p,q} RELATED TO chromosome p {
        δ_{p,q} ← 0
    }
    FOR EACH solution feature φ_i IN feature set Φ {
        IF I_i(p) = 1 THEN {
            FOR EACH gene position q that represents a variable in Var(φ_i) {
                δ_{p,q} ← δ_{p,q} + π_i
            }
        }
    }
}
Figure 2.2: Algorithm for the Distribution of Penalty.
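For readers who prefer executable code, the procedure in Fig. 2.2 might be transcribed into Python roughly as follows; the data layout (a chromosome as a list of gene values, and each feature as a dictionary holding the variable positions of Var(φ_i) and their defining values) is our own assumption, not the thesis's.

```python
# Sketch of DistributePenalty from Fig. 2.2 (the data layout is our own).
# A chromosome is a list of gene values; each feature is a dict with
# "vars" (the variable positions Var(phi_i)) and "vals" (their values).

def indicator(ft, chromosome):
    """I_i(p): here, 1 iff every variable in Var(phi_i) takes the
    corresponding value that defines the feature."""
    return 1 if all(chromosome[q] == v
                    for q, v in zip(ft["vars"], ft["vals"])) else 0

def distribute_penalty(chromosome, features, penalties):
    """Recompute the fitness template (one weight delta_{p,q} per gene)
    from the global penalty counters pi_i."""
    weights = [0] * len(chromosome)              # delta_{p,q} <- 0
    for ft, pi in zip(features, penalties):
        if indicator(ft, chromosome) == 1:       # feature present in p
            for q in ft["vars"]:                 # q represents a variable
                weights[q] += pi                 #   in Var(phi_i)
    return weights
```

A gene covered by several exhibited features accumulates all of their penalty counters, which is what makes heavily penalized genes "heavier".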
special operators in GGA. Then we shall explain the control flow, which is
summarized in Fig. 2.1.
2.6.1 Specialized Cross-over Operator
In GAs, cross-over operators differ from each other primarily in the way they
choose the genes from the parents (chromosomes) to form the offspring. In the
canonical GA, one of the simplest forms of cross-over is the one-point cross-
over [20], where gene selection is purely random. GGA's cross-over operator
makes use of the fitness templates of the parents.
Two parent chromosomes p and p' are selected to produce one offspring
p''. Each gene γ_{p,q} in parent p competes against the corresponding gene
γ_{p',q} in parent p' for a place in the child. This competition is in inverse
proportion to the corresponding weights δ_{p,q} and δ_{p',q}. In other words,
the probabilities of genes γ_{p,q} and γ_{p',q} being selected are
δ_{p',q} / (δ_{p,q} + δ_{p',q}) and δ_{p,q} / (δ_{p,q} + δ_{p',q}) respectively.
Thus, the "lighter" gene (the one with the smaller weight) has a better chance
of being selected to become a gene of the offspring. The selection of each gene
is independent of the others during cross-over. The pseudo code for the
cross-over operation is shown in Fig. 2.3. Fig. 2.4 shows an example of a
cross-over operation, where each square represents a variable, and each
corresponding circle represents its weight.
Since the offspring may exhibit features different from those present in
the parents, its fitness template is computed afresh, rather than inherited.
FUNCTION CrossOver( parent chromosomes p and p' ) {
    FOR EACH gene position q IN the chromosome {
        sum ← δ_{p,q} + δ_{p',q}
        point ← random integer from {0, ..., sum − 1}
        IF point < δ_{p,q} THEN {
            gene ← γ_{p',q}
        } ELSE {
            gene ← γ_{p,q}
        }
        RETURN gene as γ_{p'',q} for the offspring
    }
}
Figure 2.3: Algorithm of the GGA Cross-over Operator.
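A rough Python transcription of Fig. 2.3 follows; the list-based representation is our own, and the guard for the case where both weights are zero (which the pseudo code leaves undefined) is an added assumption.

```python
import random

# Sketch of the GGA cross-over operator (Fig. 2.3).
# p, pp are parent gene lists; p_weights, pp_weights their templates.

def cross_over(p, p_weights, pp, pp_weights, rng=random):
    """Build one offspring; each gene is drawn from a parent with
    probability inversely proportional to that parent's weight."""
    child = []
    for q in range(len(p)):
        total = p_weights[q] + pp_weights[q]
        if total == 0:
            # Both weights zero: the pseudo code leaves this undefined,
            # so we pick between the parents fairly (our assumption).
            child.append(rng.choice([p[q], pp[q]]))
            continue
        point = rng.randrange(total)       # integer in {0, ..., total-1}
        # point < delta_{p,q} happens with probability delta_{p,q}/total,
        # in which case the gene comes from the OTHER parent p'.
        child.append(pp[q] if point < p_weights[q] else p[q])
    return child
```

Note how a gene with weight zero in one parent and a positive weight in the other is always taken from the zero-weight ("lighter") parent.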
[Figure 2.4 (diagram): two parent bit strings are shown with a weight
attached to each gene; the child string is formed gene by gene, favouring the
gene with the smaller weight.]
Figure 2.4: An example of the GGA's Cross-over Operator in action.
(Weights for the child are calculated afresh, not inherited.)
2.6.2 Specialized Mutation Operator
Mutation produces variation in the population by altering the infor-
mation that genes carry. For canonical GAs, the mutation rate determines
the probability that mutation will take place on one gene in a chromosome. In
GGA, the mutation rate is defined as a fraction of the length of each chro-
mosome: the number of genes in a chromosome to mutate is the product of
the mutation rate and the length of that chromosome.
For every chromosome, GGA selects the required number of genes to
mutate. The selection is done using the roulette wheel method. In this
selection method, the probability for each gene γ_{p,q} to be picked is directly
proportional to its weight δ_{p,q}. In other words, the "heavier" gene γ_{p,q} is (ie.
the greater the value of δ_{p,q}), the more chance it has of being selected for
mutation.
Having picked a gene γ_{p,q} for mutation, the next step is to seek a value
to replace its current one. In GGA, the value that yields the best fitness (ie.
the best local improvement) will be selected, with ties broken randomly. However,
in some applications, random selection of a value (for the gene) may be
more beneficial (see section 3.2.1 on page 53 for such an example). This
choice is usually influenced by the problem's objective function, and only
experimentation will give the best advice.
After a gene γ_{p,q} has been mutated, its weight δ_{p,q} may optionally be
reduced by one unit. This reduces the chance of this gene being selected
for mutation again. GGA allows the same gene to mutate more than once,
because GGA understands that a gene may have relationship(s) with one or
more other genes (defined through the feature set Φ). Mutating one gene may
void the best value for a related gene that was previously mutated.
The pseudo code of the mutation operation is shown in Fig. 2.5.
2.6.3 The Control Flow in GGA
Having described the components of GGA, we are now in a position to de-
scribe its control flow. Like any standard GA, given a problem, one needs
to define a representation for candidate solutions in GGA. One also needs to
define the features and their costs. To start, an initial population is gener-
FUNCTION Mutation( chromosome p'' ) {
    i ← 0
    WHILE i < mutation rate × (length of chromosome) {
        q ← RouletteWheel( p'' )
        best ← g( p'' )
        list ← { γ_{p'',q} }
        FOR EACH value x_j IN domain D(q) {
            γ_{p'',q} ← x_j
            z ← g( p'' )
            IF z ≥ best THEN {
                IF z > best THEN {
                    best ← z
                    list ← { }
                }
                list ← list + x_j
            }
        }
        γ_{p'',q} ← random value in list
        i ← i + 1
    }
    RETURN the mutated chromosome p''
}
Figure 2.5: Algorithm of the GGA Mutation Operator.
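The mutation operator of Fig. 2.5 might be transcribed into Python as follows; the roulette-wheel helper, the domain layout and the use of a fitness g to be maximized are our own stand-ins for the thesis machinery, chosen only to make the sketch self-contained.

```python
import random

# Sketch of the GGA mutation operator (Fig. 2.5); the helpers below are
# invented stand-ins, not the thesis implementation.

def roulette_wheel(weights, rng=random):
    """Pick a gene position with probability proportional to its weight;
    a uniform pick when all weights are zero is our own assumption."""
    total = sum(weights)
    if total == 0:
        return rng.randrange(len(weights))
    point = rng.uniform(0, total)
    acc = 0.0
    for q, w in enumerate(weights):
        acc += w
        if point <= acc:
            return q
    return len(weights) - 1

def mutate(chromosome, weights, domains, g, mutation_rate, rng=random):
    """Mutate mutation_rate * len(chromosome) genes; each mutated gene
    takes a value maximizing the fitness g, ties broken randomly."""
    n_mutations = int(mutation_rate * len(chromosome))
    for _ in range(n_mutations):
        q = roulette_wheel(weights, rng)
        best = g(chromosome)                 # current fitness
        candidates = [chromosome[q]]         # list <- { current gene }
        for x in domains[q]:
            chromosome[q] = x
            z = g(chromosome)
            if z >= best:
                if z > best:
                    best = z
                    candidates = []          # strictly better: reset ties
                candidates.append(x)
        chromosome[q] = rng.choice(candidates)   # random value in list
    return chromosome
```

As in the figure, the same gene may be drawn more than once, since mutating one gene can invalidate the previously best value of a related gene.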
ated randomly. A fitness template is computed for each chromosome in the
population. Then the GGA cycles ensue.
In each GGA cycle, the reproduction operator picks chromosomes from
the population to form the mating pool³. This is performed using the roulette
wheel method; the fitter the chromosome, the greater its chance of being
picked. Pairs of chromosomes are then picked from the mating pool at random
to form parents. They generate offspring using the cross-over operator
described above; whether cross-over is performed is determined by the
cross-over rate, which is a GGA parameter. Offspring are potentially
mutated using the mutation operator described above, depending on the
mutation rate.
The mutated offspring are then given a chance to join the population in the
next generation. The old population and the offspring are ranked by their
fitness. The fittest n chromosomes in the ranking form the popula-
tion of the next generation, where n is the population size (and a parameter
to GGA).
New elements of GGA come into play at this point. The new population
is surveyed for the possibility of being trapped in a local optimum. If the best
solution in the population remains unchanged for k iterations (see section
2.4 on page 30), then GGA concludes that the search is trapped in a local
optimum. This invokes the penalty operator, which modifies the objective
function in exactly the way that GLS does. To reflect the change of the
³Depending on the selection method used, the mating pool could exist as a logical store; ie. it is not represented by any data structure, but is used to represent a concept.
objective function, the fitness templates of the individual chromosomes are up-
dated. This will affect the cross-over and mutation operators in the next
GGA cycle.
GGA cycles until it reaches the pre-defined termination conditions, which
could be defined in terms of a maximum number of cycles or of computation
time.
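The control flow just described can be condensed into a schematic main loop; every helper named below is a placeholder standing in for the operators of Figs. 2.2-2.5, and the sketch assumes a fitness to be maximized.

```python
import random

# Schematic GGA main loop (section 2.6.3). All helpers are placeholders:
# select_pair, cross_over, mutate and penalty_operator stand in for the
# operators described in the text, and fitness is assumed maximized.

def gga(init_population, fitness, select_pair, cross_over, mutate,
        penalty_operator, n, k, max_cycles, crossover_rate, rng=random):
    population = init_population()
    best, stagnation = None, 0
    for _ in range(max_cycles):
        # Reproduction: pick parents, cross over (subject to the
        # cross-over rate), then potentially mutate the offspring.
        offspring = []
        for _ in range(n):
            p, pp = select_pair(population)
            if rng.random() < crossover_rate:
                child = cross_over(p, pp)
            else:
                child = rng.choice([p, pp])
            offspring.append(mutate(child))
        # The fittest n of the old population plus offspring survive.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:n]
        # Local-optimum check: best unchanged for k cycles -> penalize,
        # which in full GGA also refreshes the fitness templates.
        if population[0] == best:
            stagnation += 1
            if stagnation >= k:
                penalty_operator(population[0])
                stagnation = 0
        else:
            best, stagnation = population[0], 0
    return best
```

The loop terminates here on a cycle count; a computation-time budget, as mentioned above, would work equally well.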
2.7 Experimentation Environment
In this section, we explain the reasoning behind our choice of the four sets of
problems that comprise our test set. A mixed set of well-chosen problems should
enable us to test the boundaries of GGA. The information that we can glean from
carrying out these experiments will assist us in determining GGA's:
• Flexibility: The ease with which GGA can be re-coded to solve a
new problem set. Given a good set of tools (see section 1.6 on page
20), a problem should be expressible without altering GGA's core logic.
• Performance: The quality of the solutions given by GGA and the time
needed to reach these solutions, compared to existing published results.
• Robustness: The consistency of solution quality, which is usually mea-
sured by the mean and standard deviation of the solutions from repeated
runs. A robust algorithm should give solutions whose mean is at or near
the global optimum, with a very small deviation.
• Advantage over GLS: Based on performance and robustness, we can
decide whether GGA shows any improvement over algorithms based on
GLS.
• Advantage over GA: Similarly, using information on the performance and
robustness of GGA, we can conclude the magnitude of GGA's advan-
tage or disadvantage over the canonical GA.
• Niche: From the information above, we can try to identify a niche area
where GGA will be more useful than existing algorithms.
The items above are declared as our testing objectives, and are used
to select and build the test set. We also outline the guidelines for carrying
out these tests so as to derive meaningful information relating to the items in
the aforementioned list. These guidelines are to be observed for all the problems
in our test set.
2.7.1 Problem Sets
Following the items listed in the previous section, we chose one classical GA
problem and three practical problems that were abstracted from their real-
world counterparts. Next, we give a brief description of these problems.
The Royal Road Functions are a set of objective functions that have a
reputation as a classical GA problem. This is the original class of problems
that stemmed from John Holland's Building Block Hypothesis [19]. It was
devised to show the strength of GAs over hill-climbing search, but it became
otherwise when a simple hill-climbing algorithm easily out-performed
the canonical GA on the simplest problem of the set [51]. The royal road
functions are a set of four objective functions, where each function has a
set of bit strings that represent the optimum instantiation for a part of the
solution. This is a good problem for understanding how and what GGA offers
above its canonical ancestor and the basic GLS algorithm.
The Processor Configuration Problem is an NP-hard real-life problem
[31, 54]. The goal is to design a network over a finite set of processors
such that the maximum distance a parcel of data needs to travel between
any two processors is kept to a minimum. Since each processor has a limited
number of communication channels, a carefully designed layout helps
reduce the overhead of message switching in the entire network. We use
this problem to compare the performance of GGA against its predecessor
LPGA4 [48, 49]. Further, we examine how GGA functions without the
standard GGA genetic operators.
The Radio Link Frequency Assignment Problem is an abstraction of a
real-life military application that involves assigning frequencies to
radio links [50]. This problem set consists of eleven instances that are classed
as either a CSOP or a PCSP. Each problem has different optimization and
constraint requirements, and can have up to 916 variables and up to 5548
constraints. It is therefore a good problem set for testing GGA's
flexibility, given the diverse objectives. Also, the size of each problem in the
set makes it a non-trivial challenge for any algorithm.
4LPGA was a GA specialised for this problem.
The Generalised Assignment Problem is a well-explored Operations Research
problem [45, 62]. It is an NP-hard problem with practical instances in the
real world. One has to find the optimum assignment of a set of jobs to a
group of agents. However, each job can only be performed by one agent, and
each agent has a finite work capacity. Further, assigning different jobs to
different agents involves different utilities and resource requirements.
These are constraints that limit the choice of job allocation. The problem
has attracted several local search algorithms [27, 63-68], giving us a good
benchmark to compare GGA against other non-CP type algorithms.
2.7.2 Testing Guidelines
Again, adhering to the features we are interested in evaluating, as defined
in section 2.7, we designed the following guidelines:
• Platform: We standardised on the IBM PC Linux platform for our
development and testing. The Linux operating system is free and publicly
available, and has a full complement of development tools. Our choice
of this platform is due to its low cost, wide availability and realistic
computation power, and also to ease re-implementation by third parties.
To demonstrate portability, the PCP set of benchmarks (section
6.5 on page 111) shall be implemented on a SUN 3/50.
• Report GGA settings: Not only is this information useful for recreating
experiments, it also allows us to understand the processing requirements
of GGA. For example, we would tend to favour a GA that uses a
smaller population size to achieve the same or better performance than
its peers, since it would have lower resource requirements.
• Repeated runs: GGA is a stochastic algorithm that has a different
starting point (in the search space) each time it is run, and therefore
each answer that GGA finds may be different. So for each instance
in each problem set, GGA (and the competing algorithm, if provided
by its respective authors) is run n times to collect n solutions. We
used a value of 50 for n for all benchmarks in our test set, except in
the Generalised Assignment Problem (chapter 4 on page 61), due to
benchmarking rules. Applying statistical tools to these results shows
the relative robustness of GGA.
• PGLS: To provide a level playing field between GGA and GLS, we
introduce Parallel Guided Local Search (PGLS). PGLS is similar to
running GLS n times and picking the best result from the runs.
The value of n is equal to the population size GGA used5. PGLS
therefore has n sets of penalty counters, compared to just one
in GGA. By comparing the results between GGA and PGLS, we hope
to establish that GGA is more than a parallel version of GLS.
• Measurements: For each run, we collect the fitness of the solution and
the time (or the number of steps, where applicable) taken to reach this
solution. In our test set, we calculate fitness to at least two decimal
digits where it is a real number. Run time is usually measured in CPU
seconds.
5Population sizes are varied for different instances of each problem set in our test set.
• Analysis: By calculating the mean and standard deviation of the
measurements for each instance for an algorithm, we can form a comparison.
Such a comparison would indicate the relative performance advantage
(such as robustness, quality and speed) over its peers. Further,
different problem sets may have their own unique analysis procedures,
which we will follow and report.
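The repeated-runs and analysis guidelines above amount to simple statistics over the n solutions collected per instance; a minimal sketch (the run values here are invented for illustration):

```python
import statistics

def robustness_summary(fitnesses):
    """Mean and standard deviation of solution quality over repeated runs."""
    return statistics.mean(fitnesses), statistics.stdev(fitnesses)

# e.g. five runs of a stochastic solver on one instance:
runs = [98.0, 100.0, 99.0, 100.0, 98.0]
mean, sd = robustness_summary(runs)   # mean 99.0, standard deviation 1.0
```

A mean near the known optimum together with a small deviation is what the robustness objective above asks for.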
Chapter 3
Royal Road Functions
In [51], Mitchell, Forrest and Holland attempted to understand the mechanics
of the canonical GA. In doing so, they also hoped to identify the
characteristics of the fitness landscapes for which the GA performs well,
notably when compared with other search algorithms. To aid their study,
they defined a set of features of fitness landscapes that are particularly
relevant to GAs. Various configurations of these features were then applied
to the GA to study their effects. The initial set of such features discussed
in [51] resulted in the "Royal Road" functions (RRf); functions that
generate simple landscapes making use of those features. The RRf is a set
of four functions which the authors believed were easy for GAs to solve,
hence the name "Royal Road".
The motivation behind the Royal Road functions was based on research
into the building block hypothesis [19, 20]. The building block hypothesis
states that GAs work well when short, low-order, highly fit schemas1
recombine to form longer schemas that are of a higher order and are fitter.
1Schemas are building blocks, which are partial solutions.
This notion of producing ever-improving sequences of new schemas through
recombination of existing schemas is believed to be central to the GA's
success. Mitchell et al. were interested in isolating landscape features
implied by the building block hypothesis. These features can then be
embedded into simple landscapes to study their effects on the GA.
Controversy arose when, in [69], Mitchell et al. found that a simple hill
climbing algorithm could easily outperform the canonical GA.
3.1 Understanding the Royal Road functions
Mitchell et al. built the RRf by first selecting an optimum string [51, 69, 70].
This string is broken down into smaller building blocks. An optimum string
is denoted by Opt in Fig. 3.1, which is broken down into a set of building
blocks S. The smallest entity of an optimum string is a bit. Therefore we
can express Opt = {Opt1, Opt2, ..., Optm}, where m defines the length of
this string. The set of building blocks S = {S1, S2, ..., Sn}, where n varies
depending on the function chosen (more on that later). Each bit within a
schema can take three states: 02, 1 or *. A schema bit that has the value of
0 or 1 requires the corresponding bit in the candidate solution to be of the
same value, in contrast to the * (wildcard) state. A * state indicates that
the corresponding bit in the candidate solution can take any value.
From Fig. 3.1, the schemas S1 to S8 are of the lowest order, for they
are the most basic schemas. These basic schemas are also called level one
2The state 0 is not used by Mitchell et al.
Level One:
S1  = 11111111********************************************************
S2  = ********11111111************************************************
S3  = ****************11111111****************************************
S4  = ************************11111111********************************
S5  = ********************************11111111************************
S6  = ****************************************11111111****************
S7  = ************************************************11111111********
S8  = ********************************************************11111111

Level Two:
S9  = 1111111111111111************************************************
S10 = ****************1111111111111111********************************
S11 = ********************************1111111111111111****************
S12 = ************************************************1111111111111111

Level Three:
S13 = 11111111111111111111111111111111********************************
S14 = ********************************11111111111111111111111111111111

Opt = 1111111111111111111111111111111111111111111111111111111111111111

Figure 3.1: A 64-bit optimum string, and the set of derived building blocks.
schemas. In [51, 69, 70], each of the basic schemas usually has eight
consecutive bits representing a non-overlapping portion of the optimum string.
These basic schemas are used to build higher order schemas, such as S9,
which is a combination of S1 and S2. Both S9 and S10 are in turn used to
build an even higher schema, S13.
Given that each schema Si is a set of bits, where i denotes the index
of the schema in set S, we can compute the content of this schema using the
algorithm in Fig. 3.2. Optlength denotes the total length of the optimum
string, and Slength holds the length of the schemas of the lowest order. These
are parameters given by the user. The algorithm first calculates the level
that this schema should belong to, and stores it in the variable level. Using
the variable level, we can compute the variables pos and len. The variable
pos contains the index of the bit that is the start of schema Si, and the
variable len holds the number of bits in the schema that should have the
state 1. With these two variables, we can set the respective bits in Si
to state 1 (instead of the initial * state).
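The procedure of Fig. 3.2 can also be rendered in Python. This is our own sketch, reconstructed so that its output agrees with the schema levels shown in Fig. 3.1; the function name and parameters are ours:

```python
def compute_schema(i, opt_length=64, s_length=8):
    """Return schema S_i (1-based index i) as a string over {'1', '*'}."""
    schema = ['*'] * opt_length
    # Find the level of schema i: level one has opt_length/s_length basic
    # schemas, and each level above it has half as many.
    level = 1
    first = 1                        # index of the first schema at this level
    count = opt_length // s_length   # number of schemas at this level
    while i >= first + count:
        first += count
        count //= 2
        level += 1
    ones = s_length * 2 ** (level - 1)   # the 1-bits double at each level
    pos = (i - first) * ones             # 0-based start of the run of 1s
    schema[pos:pos + ones] = ['1'] * ones
    return ''.join(schema)
```

For example, compute_schema(9) yields the level-two schema S9 of Fig. 3.1 (sixteen 1s followed by 48 wildcards).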
The RRf is a set of four functions to be optimized. They are denoted
R1, R2, R3 and R4, in order of increasing complexity. Each function has
the following properties:
• R1: A 64-bit optimum string with schema set S = {S1, S2, ..., S8}.
• R2: A 64-bit optimum string with schema set S = {S1, S2, ..., S14}.
• R3: A 256-bit optimum string with schema set S = {S1, S2, ..., S8},
FUNCTION ComputeSchema(Schema index i) {
    FOR EACH bit S_i,j IN S_i {
        S_i,j ← *                        // start from all wildcards
    }
    level ← 0
    j ← 1
    k ← Optlength / Slength              // number of level-one schemas
    DO {
        j ← j + k                        // j: one past the last index of this level
        k ← k / 2                        // half as many schemas per level up
        level ← level + 1
    } WHILE i ≥ j
    len ← Slength × 2^(level − 1)        // number of 1-bits at this level
    pos ← (i − (j − 2·k)) × len          // 0-based start of the run of 1s
    WHILE len > 0 {
        S_i,pos ← 1
        pos ← pos + 1
        len ← len − 1
    }
    RETURN the schema S_i
}

Figure 3.2: Algorithm for computing Schemas.
but with introns. An intron is a bit whose value does not count towards
the objective function. The basic schemas in R3 are separated from
each other by introns of 24 bits. Note that results for the R3 were
never discussed [51, 69, 70].
• R4: A 128-bit optimum string with schema set S = {S1, S2, ..., S30}.
All four objective functions in the RRf are of the form in Eq. 3.1. Given
a candidate solution a, function Rz accumulates the cost of each schema
in set S that matches a exactly3. The function M (Eq. 3.2) gives a 1 when
solution a matches schema Si exactly, and 0 otherwise. The cost of each
schema Si is the number of its bits that are in state 1, given by the
function C (Eq. 3.3). The functions in the RRf are step-like in behaviour;
i.e. to induce even the smallest improvement in the fitness of a solution,
a (previously unmatched) subset of bits in the solution must match one of
the basic schemas.
Rz(a) = Σ_{i=1}^{n} M(a, Si) · C(Si)                                    (3.1)

where M(a, Si) ≡ { 1, if a matches Si exactly; 0, otherwise }           (3.2)

C(Si) ≡ the number of bits in Si that are in state 1                    (3.3)
3Each bit in the solution a must match the corresponding bit in Si, where the bit in Si is a 1. We can also say that schema Si exists in solution a.
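Eq. 3.1-3.3 translate directly into code; a small sketch of our own, with schemas written as strings over '1' and '*':

```python
def royal_road(a, schemas):
    """R_z(a): sum C(S_i) over every schema S_i that solution a matches."""
    total = 0
    for s in schemas:
        # M(a, S_i): every non-wildcard schema bit must equal the solution bit.
        if all(sb == '*' or sb == ab for sb, ab in zip(s, a)):
            total += s.count('1')   # C(S_i): number of 1-bits in the schema
    return total
```

On a tiny 4-bit example with schemas '11**' and '**11', the string '1100' scores 2 and '1111' scores 4, showing the step-like behaviour described above.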
3.2 Preparing GGA to solve RRf
3.2.1 Mutation Operator
Due to the nature of the RRf, GGA's mutation operator has to function
slightly differently in this application than in the others in this thesis.
In the introduction to GGA's mutation operator (see section 2.6.2 on page
37), we wrote that the operator is greedy; it chooses the value that gives
the greatest improvement for the gene it is mutating. In the RRf, the
probability that changing a single gene will result in a change in fitness
(in either direction) would be 1 out of 8^2, which is 0.016 (where eight is
the number of bits in the state of 1 in the basic schema, and two is the
number of states a bit can take). This is too small a probability for a
greedy mutation to work effectively; it would be no different from random
mutation. So for the RRf, GGA's mutation operator chooses values randomly
for each gene it is to mutate.
3.2.2 Defining solution features for the RRf
Before we start to run GGA on the RRf, we need to define a set of features
that will help guide GGA in its search. This set of features should help
GGA, and not confuse it. Our natural choice for the feature set would be
the members of the schema set S. In Eq. 3.4, we define our feature set,
where the indicator function τi returns a one if the schema Si does not
exist in chromosome p.
∀i : τi(p) ≡ { 1, if p does not match schema Si; 0, otherwise }         (3.4)
However, only the basic schemas are eligible, because higher order schemas
(which are composites of the basic schemas) overlap the schemas below
them. When these higher order schemas are penalized, GGA would
induce weight gain on all the genes defined by them. But this would also
affect some basic schemas that do exist throughout the population, because
they are covered by these penalized higher order schemas. This would cause
genes defined by basic schemas that were previously in the correct state to
be open to mutation again.
3.2.3 Fitness Function
Each objective function (Eq. 3.1) in the RRf is incorporated into GGA's
fitness function g as in Eq. 3.5.
g(p) = Rz(p) + α · Σ_i (λi · τi(p))                                     (3.5)
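Under our reading of the symbols in Eq. 3.5 (a cost λi per feature, an indicator τi, and a single multiplier α), the augmented fitness can be sketched as:

```python
def gga_fitness(p, objective, indicators, costs, alpha=1.0):
    """g(p) = objective(p) + alpha * sum_i(cost_i * tau_i(p)).

    indicators: list of functions tau_i returning 1 if feature i is
    exhibited by chromosome p, else 0; costs: the matching costs.
    """
    penalty = sum(cost * tau(p) for cost, tau in zip(costs, indicators))
    return objective(p) + alpha * penalty
```

With a feature cost of 100 and α = 1 (the values used in section 3.3.2), a chromosome exhibiting one penalized feature has its fitness shifted by 100.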
3.3 Benchmark
The benchmark in this chapter follows the work by Mitchell et al. in [51, 69,
70]. Since results for R3 were never reported in these papers, we test
only the functions R1, R2 and R4.
3.3.1 Indices
In the RRf, we are interested in testing the efficiency of the algorithm.
This efficiency is measured by the number of calls to the fitness function.
Mitchell et al. [51] suggested that the resources consumed by the fitness
function (in computing the solution's fitness) during an algorithm's
operation are often much greater than those of the other components of the
algorithm itself. So in the RRf, we record the number of calls made to the
fitness function for the duration of the search. The search terminates when
the optimum string is found, or when the number of calls to the fitness
function exceeds a set limit. We limit the number of fitness function calls
to 256000 for functions R1 and R2, and 10^6 calls for R4.
Each fitness function in the RRf is run 200 times. These results are
used to compute the mean and median for each algorithm under the different
fitness functions.
3.3.2 Test Environment
In our physical environment, GGA was written in C++ and compiled
using GNU GCC version 2.7.1.2. The code runs on an IBM PC compatible
with a Pentium 133 MHz processor, 32MB of RAM and 512KB of Level 2
cache. Both compilation and execution of GGA were performed on the Linux
operating system, using kernel version 2.0.27.
Under GGA's environment, we have a mutation rate and cross-over rate
of 1.0, a population size of 40, and a stopping criterion of 1000 generations.
Each feature has a cost λi of 100. For the fitness function g, α has a value
of one.
3.3.3 Results
A total of six algorithms take part in the benchmark. We have
the canonical genetic algorithm (GA). The Steepest Ascent Hill Climbing
(SAHC) [51, 69] mutates each bit in the string from left to right, recording
the fitness of each resultant string. It then chooses the string with the
greatest improvement in fitness for further hill climbing.
The Next Ascent Hill Climbing (NAHC) [71] also mutates all bits in
the string from left to right, but will stop once a resultant string with any
improvement is found. This new resultant string is used as a starting point
for the next hill climb.
The Random Mutation Hill Climbing (RMHC) [69] mutates a locus of
bits in the string. The process is repeated until an improvement (in fitness)
occurs, at which point the mutated string becomes the new starting point for
further hill climbing. A locus is a randomly chosen consecutive set of bits
in the string.
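For concreteness, RMHC can be sketched as follows; the locus length, evaluation budget and fitness function here are placeholders of our own:

```python
import random

def rmhc(fitness, length, locus_len=1, max_evals=1000, seed=0):
    """Random Mutation Hill Climbing on a bit string of the given length."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(length)]
    best_f = fitness(current)
    for _ in range(max_evals):
        cand = current[:]
        start = rng.randrange(length - locus_len + 1)
        for b in range(start, start + locus_len):
            cand[b] ^= 1                 # flip the bits in the chosen locus
        f = fitness(cand)
        if f >= best_f:                  # accept ties too, to drift on plateaus
            current, best_f = cand, f
    return current, best_f

# e.g. maximise the number of 1s in a 16-bit string:
best, f = rmhc(sum, 16, max_evals=2000)
```

Accepting equal-fitness moves is one common variant; a strict-improvement variant follows the prose above more literally but stalls on plateaus.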
We left out PGLS in this experiment since it would definitely generate
more calls to the fitness function, compared to GLS.
For R1, R2 and R4, only GGA was able to find the optimum string within
Table 3.1: Comparing results by GGA with others.

Function  Algorithm  Mean      Median    Std. Dev.
R1        GA         61334     54208     2304
          SAHC       > 256000  > 256000  n/a
          NAHC       > 256000  > 256000  n/a
          RMHC       6179      5775      186
          GLS        56316     54692     3455
          GGA        4815      4416      2031
R2        GA         62692     56448     2391
          SAHC       > 256000  > 256000  n/a
          NAHC       > 256000  > 256000  n/a
          RMHC       6551      5925      212
          GLS        58944     55793     3862
          GGA        5918      5472      2109
R4        GA         > 10^6    > 10^6    n/a
          SAHC       > 10^6    > 10^6    n/a
          NAHC       > 10^6    > 10^6    n/a
          RMHC       > 10^6    > 10^6    n/a
          GLS        > 10^6    > 10^6    n/a
          GGA        10774     9752      4342

n/a = not applicable

GA:   Genetic Algorithm [19, 20]
SAHC: Steepest Ascent Hill Climbing [51, 69, 70]
NAHC: Next Ascent Hill Climbing [71]
RMHC: Random Mutation Hill Climbing [69, 70]
GLS:  Guided Local Search [56, 57]
GGA:  Guided Genetic Algorithm
the limits we set. Further, GGA was also consistently the best performing
algorithm for the three objective functions. We can also see that in moving
from R2 to R4, GGA took only about twice as many calls to reach the optimum
string, compared to an exponential increase for GA, RMHC and GLS. Also,
compared to the canonical GA, GGA (in R1 and R2) has been shown to be
12 times more efficient.
The impact of GGA's specialized genetic operators on the RRf is best
understood using an example. Fig. 3.3 shows an example of a cross-over
operation on an RRf problem. Using notation similar to Fig. 2.4 (page 37),
each square represents a variable, and each corresponding circle represents
its weight. Fig. 3.3 only shows the first 16 bits of the 64-bit chromosomes.
The depicted states of the parent chromosomes are the result of several
generations of evolution and penalization.
[Diagram: the bit values and per-gene weights of Parent 1, Parent 2 and the resulting Child.]
Figure 3.3: An example of GGA's cross-over operator on the RRf. (Weights for the child are calculated afresh, not inherited.)
A non-zero weight indicates that the corresponding gene belongs to a
subset of genes that does not match a schema in the RRf's schema set. GGA
combines genes from both parents where the weights of these genes are zero
(i.e. these genes match all related schemas in the schema set) to form
the new chromosome. The effect of GGA's cross-over operator is quite
dramatic when one considers that this operator is applied to a whole
population of chromosomes.
Using Fig. 3.3 again, suppose GGA is to mutate the chromosome labelled
Parent 1. With the weights as guides, GGA's specialized mutation operator
concentrates all its efforts on genes whose weights are non-zero. Mutated
genes have their weight decremented. Together, this increases the
probability of a mutation that improves the fitness of the chromosome.
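The two weight-guided operators just described can be sketched for binary genes as follows (our own illustrative code, not the thesis's implementation):

```python
import random

def gga_crossover(p1, w1, p2, w2, rng):
    """Form a child by preferring genes whose weight is zero.

    p1, p2: parent gene lists; w1, w2: their per-gene weights.
    A zero weight means the gene matches all related schemas.
    """
    child = []
    for g1, a, g2, b in zip(p1, w1, p2, w2):
        if a == 0 and b != 0:
            child.append(g1)             # only parent 1's gene is unpenalized
        elif b == 0 and a != 0:
            child.append(g2)             # only parent 2's gene is unpenalized
        else:
            child.append(rng.choice((g1, g2)))   # tie: take either parent's gene
    return child

def gga_mutate(genes, weights, rng):
    """Mutate only genes with non-zero weight; decrement a mutated gene's weight."""
    out, w = genes[:], weights[:]
    for i, wi in enumerate(w):
        if wi > 0:
            out[i] ^= 1                  # random flip for a binary gene
            w[i] = wi - 1
    return out, w
```

The sketch mutates every non-zero-weight gene; how many such genes GGA actually touches per application is a parameter of the real operator.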
With this example, we can see that the combined effects of the two genetic
operators, coupled with the fitness templates, were the main reasons for
GGA's supremacy in the RRf.
3.4 Summary
The results clearly show that GGA is a distinct improvement over its roots,
the canonical GA and GLS. The RRf is a particularly suitable problem
for GGA. The fitness templates of GGA were certainly instrumental in GGA's
success. With the fitness template, the cross-over operator could recombine
intelligently, speeding up the search. Further, the mutation operator knew
approximately which bits to mutate, concentrating the search effort.
In [70], Mitchell et al. suggested the idea of the Idealized Genetic
Algorithm (IGA) when discussing how a GA could outperform hill climbing.
The IGA is a GA that would "know" what the good schemas are, and keep them
in the population. However, Mitchell et al. argued that this would be
cheating.
For any problem, GGA knows the variables and the relationships between
them (provided they are properly defined), since GGA was developed from
a Constrained Problem perspective. But GGA does not know the optimum
values with which these variables should be instantiated; finding them is
its job. A black box algorithm is one that knows nothing of the problem.
GGA is a grey box algorithm, since it has only some information (variables
and relationships) about the problem.
From these points, we can see that GGA approximates Mitchell et al.'s IGA
as closely as possible without cheating.
Chapter 4
Generalized Assignment Problem
A paper based on this chapter was submitted to the journal Computers and
Operations Research [72], and a shorter version of that paper was accepted
for presentation at the 10th IEEE International Conference on Tools with
Artificial Intelligence [73].
The GAP is concerned with the assignment of jobs to agents. Each job
can only be performed by one agent, and each agent has a finite resource
capacity that limits the number of jobs it can be assigned. The GAP
recognizes that different agents perform the same piece of work with varying
degrees of success. Similarly, different agents exhaust different amounts
of resources when doing the same job. One is to strive for the optimal
allocation of jobs to agents that derives the most productivity, while
respecting the resource capacity of the agents [45, 74, 75].
This is an NP-hard combinatorial optimization problem that has been
[Diagram: the set of agents 1..m on one side and the jobs 1..n on the other, with lines linking each job to an agent.]

Figure 4.1: Pictorial view of the Generalized Assignment Problem.
well explored. The importance of the GAP can be appreciated by its
manifestations in real life, such as: the scheduling of jobs on an assembly
line; the assigning of different products to tooling machines, each with a
different production capacity and resource consumption; and the daily
planning of delivery details; to name just three. The essence of the GAP
is to maximize productivity within the capacity constraints of each worker
or machine. In the publications of Cattrysse and Van Wassenhove [76] and
Osman [68], the authors have written further on the applications and
extensions of the GAP. Fisher and Jaikumar [75] and Gheysens et al. [77]
demonstrated that the GAP can exist as a sub-problem of the vehicle routing
problem. It was also shown by Ross and Soland [74] that a class of location
problems can be formulated as GAPs. As above, the GAP is a maximization
problem, and the following is its formal definition.
Let I = {1, 2, ..., m} be a set of m agents, and J = {1, 2, ..., n}
be a set of n jobs. For i ∈ I, j ∈ J, define ci,j as the revenue
produced by agent i after completing job j, ri,j as the resource
required by agent i to perform job j, and bi as the resource
capacity of agent i. We also define xi,j as a boolean variable that
has the value 1 if agent i is assigned job j, and 0 otherwise.
The task is therefore to find an m × n matrix x such
that xi,j ∈ {0, 1}, and Eq. 4.1 is maximized, subject to Eq. 4.2
and 4.3 holding true.
Σ_{i=1}^{m} Σ_{j=1}^{n} ci,j · xi,j                                     (4.1)

∀j ∈ J : Σ_{i=1}^{m} xi,j = 1                                           (4.2)

∀i ∈ I : Σ_{j=1}^{n} ri,j · xi,j ≤ bi                                   (4.3)

∀i ∈ I, ∀j ∈ J : xi,j = { 1, if agent i is assigned job j; 0, otherwise }   (4.4)
4.1 Expressing the GAP as a CSOP
A CSOP is defined as a quadruple (Z, D, C, f), where Z is a finite set of
variables. Relative to Z, D is a function which maps every variable to a set
of values, called its domain. C is a finite set of constraints, each of
which affects a subset of the variables, and f is an objective function that
returns a magnitude based on the instantiation of all the variables.
4.1.1 Variables and Domains
To formulate the GAP as a CSOP, we need to define Z, D, C and f. For
Z, one could choose either agents or jobs as the variables. Choosing one as
the variables sets the other as the domain. From Fig. 4.1 we can see
that, taking jobs as the variables, each job variable will only have one
value: the agent it is assigned to. Having agents as variables would result
in a representation where each variable is a list containing the jobs that
the agent is
given to perform. Thus choosing jobs as the variables results in a more
compact and less ambiguous representation, and also ensures that Eq. 4.2 is
true for all j. These were also the arguments Chu and Beasley [27, 45] gave
for their preference for jobs as variables.
For a GAP with m agents and n jobs, let qj be a variable in Z representing
job j. For each variable qj in Z, there is one associated domain mapped by
the function D, denoted by D(qj), which contains a set of m values, each
representing an agent in the problem.
Z = {q1, q2, ..., qn}                                                   (4.5)

where ∀qj ∈ Z : D(qj) = {1, 2, ..., m}                                  (4.6)
4.1.2 Constraints and the Objective Function
The constraints in the GAP are covered by Eq. 4.2 and 4.3. However, since
we have chosen jobs as the variables in Z, the formulation is guaranteed to
satisfy the constraints covered by Eq. 4.2 (see previous section). Also, we
can be more specific in our definition of each constraint.
C = {k1, k2, ..., km}                                                   (4.7)

where ∀ki ∈ C : ki ≡ Σ_{j=1}^{n} R(i, j) > bi                           (4.8)

R(i, j) ≡ { r_{qj,j}, if qj = i; 0, otherwise }                         (4.9)

ki = { 1, if the constraint is violated; 0, otherwise }                 (4.10)
Eq. 4.8 ensures that the capacity constraint (Eq. 4.3) is satisfied for each
agent. In Eq. 4.1, we have the function whose resulting value we are to
optimize. Rewriting it as our objective function f, we take advantage of Z
to define a more specific version.
f(q) = Σ_{j=1}^{n} c_{qj,j}                                             (4.11)
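The objective f (Eq. 4.11) and the capacity checks behind Eq. 4.3 can be sketched together in Python; a sketch of our own, using 0-based indices rather than the 1-based notation above:

```python
def evaluate_gap(q, c, r, b):
    """Evaluate a GAP assignment q, where q[j] is the agent chosen for job j.

    c[i][j]: revenue of agent i doing job j; r[i][j]: resource needed;
    b[i]: capacity of agent i.  Returns (total revenue, violated agents).
    """
    revenue = sum(c[q[j]][j] for j in range(len(q)))
    load = [0] * len(b)
    for j, i in enumerate(q):
        load[i] += r[i][j]
    violated = [i for i, cap in enumerate(b) if load[i] > cap]
    return revenue, violated
```

Because each job holds exactly one agent, Eq. 4.2 is satisfied by construction, as argued in section 4.1.1; only the capacity constraints need checking.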
4.2 Preparing GGA to solve GAP
4.2.1 Features for Constraint Satisfaction
In preparing GGA for solving the GAP, we need to define the set of solution
features that will guide the search out of local optima. The constraints and
objective function were defined in section 4.1.2, but we need to rewrite them
to suit the GGA context. Therefore, Eq. 4.8, 4.9 and 4.11 are rewritten as
Eq. 4.12, 4.13 and 4.14 respectively.
∀i : τi(p) ≡ Σ_{q=1}^{n} R(i, q) > bi                                   (4.12)

R(i, q) ≡ { r_{pq,q}, if pq = i; 0, otherwise }                         (4.13)

g(p) = Σ_{q=1}^{n} c_{pq,q} + α · Σ_i (λi · τi(p))                      (4.14)
Feature λi is exhibited by chromosome p if the total resource requirements
of the jobs assigned to agent i exceed its capacity bi. Eq. 4.12 and
4.13 define the indicator function τi, which returns a one if λi is present,
and zero otherwise. The function g (Eq. 4.14) is used as the fitness
function for GGA, as it incorporates the penalties for a chromosome (a means
of measuring constraint satisfaction) in addition to evaluating the solution
for optimality. The constraints defined in Eq. 4.12 are for generating valid
solutions. We also require solution features that methodically attempt to
move the search towards the global optimum. These are discussed in the next
section.
4.2.2 Features for Optimization
The GAP has some characteristics that we can use to formulate these features.
We know that for each job there are m agents that can perform it, where each
agent produces a different amount of revenue and consumes a different amount
of resource. A heuristic is to favour the assignment of job j to agent i,
provided that agent i requires the least amount of resource ri,j, produces
the maximum amount of revenue ci,j, and has the largest capacity bi. These
criteria are formulated as the tightening index t (Eq. 4.15). In our
assignments, we would want job
j to be given to an agent that produces revenue greater than or equal to
that of the agent i that maximizes t. So we define the minimum revenue to
produce (MRTP) uj as the value of ci,j for the i that maximizes t(i, j)
(Eq. 4.16). Given this, we can now define the indicator function τ′ as
Eq. 4.17, which returns a one when job j produces below uj, and zero
otherwise.
t(i, j) = ci,j · (max(r1,j, ..., rm,j) − ri,j) · bi                     (4.15)

uj = c_{i*,j}, where i* = argmax_{i∈I} t(i, j)                          (4.16)

τ′j(p) ≡ { 1, if ci,j < uj, where i = pj; 0, otherwise }                (4.17)
However, using the indicator function τ′ above yielded sub-optimal results
for some problem sets in the GAP, as the features it defined were too loose;
i.e. most chromosomes do not exhibit the features. Therefore, we need to
further tighten these solution features by increasing the respective MRTP uj
where possible. We introduce an additional step, the tightening index t′
(Eq. 4.18), which returns a real value representing the gradient for moving
from the current agent (represented by uj and vj) to another, for job j. The
variable vj is defined as the resource consumption of the agent i that
maximizes t for doing job j (Eq. 4.19).
t′(i, j) = (ci,j − uj) / (ri,j − vj)                                    (4.18)

vj = r_{i*,j}, where i* = argmax_{i∈I} t(i, j)                          (4.19)
With t′, we can now define a tighter (i.e. increased) MRTP u′j, using uj as
input (Eq. 4.20). The variable u′j takes the revenue produced, ci,j, of the
agent i that maximizes t′ for doing job j, provided that the calculated t′
is greater than 1.01. Otherwise u′j takes uj, signifying the lack of a
better alternative agent. Given u′j, we can redefine τ′; i.e. replace
Eq. 4.17 by Eq. 4.21.
u′j = { c_{i*,j} with i* = argmax_{i∈I} t′(i, j), if max_{i∈I} t′(i, j) > 1.0; uj, otherwise }   (4.20)

τ′j(p) ≡ { 1, if ci,j < u′j, where i = pj; 0, otherwise }               (4.21)
In summary, we have defined one feature per agent and one per job. The
functions τ and τ′ together form the set of indicator functions for
optimizing the GAP.
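The tightening heuristic above can be sketched end-to-end for a single job; our own illustrative code, again 0-based:

```python
def mrtp(c, r, b, j):
    """u_j, v_j and the tightened u'_j for job j (0-based indices).

    c[i][j]: revenue, r[i][j]: resource use, b[i]: capacity of agent i.
    """
    m = len(b)
    rmax = max(r[i][j] for i in range(m))
    # Tightening index t (Eq. 4.15): favour high revenue, low resource
    # use and large capacity.
    t = [c[i][j] * (rmax - r[i][j]) * b[i] for i in range(m)]
    star = max(range(m), key=lambda i: t[i])
    u, v = c[star][j], r[star][j]          # Eq. 4.16 and 4.19
    # Gradient t' (Eq. 4.18) for moving from the chosen agent to another;
    # agents with the same resource use are skipped to avoid a 0 denominator.
    grads = [(c[i][j] - u) / (r[i][j] - v) if r[i][j] != v else float('-inf')
             for i in range(m)]
    k = max(range(m), key=lambda i: grads[i])
    u_tight = c[k][j] if grads[k] > 1.0 else u   # Eq. 4.20
    return u, v, u_tight
```

A gradient above 1.0 signals an agent whose extra revenue outweighs its extra resource use, so the MRTP is raised; otherwise the original u_j is kept.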
4.3 Benchmark
The benchmark set for the GAP is made up of 60 problems that range from
instances with 5 agents and 15 jobs to 10 agents and 60 jobs. These
instances were compiled from the literature by Osman [68] and Cattrysse
[65, 67], and are publicly available via the internet at the OR-Library
[62]. The set is organized into 12 groups, each consisting of five problems
with the same number of agents and jobs. Optimum results have already been
proven for the problems in the set [65]. The interest is more in an
individual
1A gradient greater than 1.0 indicates an improvement over the current uj and vj.
algorithm's robustness: its ability to consistently return a solution that
is the global optimum. And in [66], Martello remarked that despite the
problems in the GAP set being relatively small, they are still
computationally challenging.
4.3.1 Indices
For benchmarking purposes, we have adopted the style employed by Chu in
his PhD thesis [27]. Ten runs were recorded for each problem in the set.
Using results from the runs, we compute the following indices: the average
percentage deviation of each problem, δi (Eq. 4.22); the average percentage
deviation of all solutions, δall (Eq. 4.23); the average percentage
deviation of the best solutions of each problem, δbest (Eq. 4.24); and the
number of problems where the optimal solution was found, NOPT. We introduce
a new index, TOPT (Eq. 4.27), the total number of optimal results over all
M × N runs, to offer a different (arguably better) perspective on the
robustness of algorithms. In the equations below, we denote by M the number
of problems in the set (which is 60), and by N the number of trial runs for
each problem (which is 10). The variable opti is the known optimal result
for problem i, and runi,j is the result of the jth trial run for problem i.
δ_i = ( Σ_{j=1}^{N} (opt_i − run_{i,j}) / opt_i ) / N × 100          (4.22)

δ_all = ( Σ_{i=1}^{M} δ_i ) / M                                      (4.23)

δ_best = ( Σ_{i=1}^{M} (opt_i − max(run_{i,1}, …, run_{i,N})) / opt_i ) / M × 100   (4.24)

NOPT = Σ_{i=1}^{M} S(i)                                              (4.25)

where S(i) = { 1,   if max(run_{i,1}, …, run_{i,N}) = opt_i
             { 0,   otherwise                                        (4.26)

TOPT = Σ_{i=1}^{M} Σ_{j=1}^{N} S'(i,j)                               (4.27)

where S'(i,j) = { 1,   if run_{i,j} = opt_i
                { 0,   otherwise                                     (4.28)
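As a concrete illustration, the per-problem indices above can be computed as in the following sketch (a minimal rendering; the function and variable names are ours, not the thesis's):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// delta_i (Eq. 4.22): average percentage deviation of the N runs of one
// problem from its known optimum opt_i.
double delta_i(double opt, const std::vector<double>& runs) {
    double sum = 0.0;
    for (double r : runs) sum += (opt - r) / opt;
    return sum / runs.size() * 100.0;
}

// S(i) (Eq. 4.26): 1 if the best of the N runs reached the optimum.
int S(double opt, const std::vector<double>& runs) {
    return *std::max_element(runs.begin(), runs.end()) == opt ? 1 : 0;
}

// Per-problem contribution to TOPT (Eq. 4.27/4.28): how many of the
// individual runs hit the optimum.
int topt_count(double opt, const std::vector<double>& runs) {
    int c = 0;
    for (double r : runs) if (r == opt) ++c;
    return c;
}
```

δ_all and δ_best then follow by averaging, over the M problems, delta_i and the deviation of each problem's best run respectively.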
4.3.2 Test Environment
In our physical environment, GGA was written in C++ and compiled using
GNU GCC version 2.7.1.2. The code runs on an IBM PC compatible with a
Pentium 133 MHz processor, 32MB of RAM and 512KB of Level 2 cache. Both
compilation and execution of GGA were performed on the Linux operating
system, kernel version 2.0.27.
Under GGA's environment, we use a mutation rate and a cross-over rate of
1.0, a population size of 120, and a stopping criterion of 100 generations.
Each satisfaction constraint and each optimization constraint has a penalty
cost of 10000 and 100 respectively.² For the fitness function g, λ has a
value of 10.
PGLS maintains 120 candidate solutions concurrently, with each candidate
solution having a limit of 200 iterations. PGLS also uses the same GLS
parameters as GGA.
² A solution with no violation of satisfaction constraints is preferred over an optimal partial solution, hence the large difference in cost.
4.3.3 Results
For all 60 problems, the results for GGA are summarised in Table 4.1. In
this table, we report the details of each problem: the number of agents and
jobs, and the optimal solution. Against each problem is the computed average
percentage deviation δ from 10 runs of GGA. Results from PGLS are reported
in a similar table (Table 4.2).
From Table 4.1, we see that δ for problems gap8-3 and gap10-3 was non-zero,
indicating that GGA returned some sub-optimal solutions (out of the total of
10). In Table 4.3 we show the recorded results of the 10 runs for these
problems. Likewise for PGLS, Table 4.4 shows the results for each of the
problems gap8-1, gap8-3, gap8-5, gap10-3 and gap12-3, where PGLS returned
some sub-optimal solutions.
In Table 4.5, we compare GGA with other algorithms, which are either
systematic or stochastic. Among the systematic algorithms are branch and
bound methods [64, 66], a constructive heuristic [63], and a set
partitioning heuristic [67]. The stochastic algorithms are tabu search [68],
simulated annealing [65, 68], and genetic algorithms [27]. We also compare
Table 4.1: Summary of all runs on GAP by GGA.
Problem Agents m Jobs n Optimum δ Problem Agents m Jobs n Optimum δ
gap1-1 5 15 336 0 gap7-1 8 40 942 0
gap1-2 327 0 gap7-2 949 0
gap1-3 339 0 gap7-3 968 0
gap1-4 341 0 gap7-4 945 0
gap1-5 326 0 gap7-5 951 0
gap2-1 5 20 434 0 gap8-1 8 48 1133 0
gap2-2 436 0 gap8-2 1134 0
gap2-3 420 0 gap8-3 1141 0.04
gap2-4 419 0 gap8-4 1117 0
gap2-5 428 0 gap8-5 1127 0
gap3-1 5 25 580 0 gap9-1 10 30 709 0
gap3-2 564 0 gap9-2 717 0
gap3-3 573 0 gap9-3 712 0
gap3-4 570 0 gap9-4 723 0
gap3-5 564 0 gap9-5 706 0
gap4-1 5 30 656 0 gap10-1 10 40 958 0
gap4-2 644 0 gap10-2 963 0
gap4-3 673 0 gap10-3 960 0.05
gap4-4 647 0 gap10-4 947 0
gap4-5 664 0 gap10-5 947 0
gap5-1 8 24 563 0 gap11-1 10 50 1139 0
gap5-2 558 0 gap11-2 1178 0
gap5-3 564 0 gap11-3 1195 0
gap5-4 568 0 gap11-4 1171 0
gap5-5 559 0 gap11-5 1171 0
gap6-1 8 32 761 0 gap12-1 10 60 1451 0
gap6-2 759 0 gap12-2 1449 0
gap6-3 758 0 gap12-3 1433 0
gap6-4 752 0 gap12-4 1447 0
gap6-5 747 0 gap12-5 1446 0
Table 4.2: Summary of all runs on GAP by PGLS.
Problem Agents m Jobs n Optimum δ Problem Agents m Jobs n Optimum δ
gap1-1 5 15 336 0 gap7-1 8 40 942 0
gap1-2 327 0 gap7-2 949 0
gap1-3 339 0 gap7-3 968 0
gap1-4 341 0 gap7-4 945 0
gap1-5 326 0 gap7-5 951 0
gap2-1 5 20 434 0 gap8-1 8 48 1133 0.07
gap2-2 436 0 gap8-2 1134 0.10
gap2-3 420 0 gap8-3 1141 0.04
gap2-4 419 0 gap8-4 1117 0
gap2-5 428 0 gap8-5 1127 0.01
gap3-1 5 25 580 0 gap9-1 10 30 709 0
gap3-2 564 0 gap9-2 717 0
gap3-3 573 0 gap9-3 712 0
gap3-4 570 0 gap9-4 723 0
gap3-5 564 0 gap9-5 706 0
gap4-1 5 30 656 0 gap10-1 10 40 958 0
gap4-2 644 0 gap10-2 963 0
gap4-3 673 0 gap10-3 960 0.18
gap4-4 647 0 gap10-4 947 0
gap4-5 664 0 gap10-5 947 0
gap5-1 8 24 563 0 gap11-1 10 50 1139 0
gap5-2 558 0 gap11-2 1178 0
gap5-3 564 0 gap11-3 1195 0
gap5-4 568 0 gap11-4 1171 0
gap5-5 559 0 gap11-5 1171 0
gap6-1 8 32 761 0 gap12-1 10 60 1451 0
gap6-2 759 0 gap12-2 1449 0
gap6-3 758 0 gap12-3 1433 0.03
gap6-4 752 0 gap12-4 1447 0
gap6-5 747 0 gap12-5 1446 0
Table 4.3: Recorded runs on GAP8-3 and GAP10-3 by GGA.
Problem Optimum Runs (10) δ
gap8-3 1141 1140 1140 1141 1141 1140 1141 1141 1141 1140 1141 0.04
gap10-3 960 960 959 960 958 959 960 960 960 959 960 0.05
Table 4.4: Recorded runs on GAP8-1, GAP8-3, GAP8-5, GAP10-3 and GAP12-3 by PGLS.
Problem Optimum Runs (10) δ
gap8-1 1133 1132 1132 1133 1132 1133 1133 1131 1131 1133 1132 0.07
gap8-3 1141 1139 1140 1140 1139 1140 1141 1141 1140 1140 1139 0.10
gap8-5 1127 1127 1127 1127 1126 1127 1127 1127 1127 1127 1127 0.01
gap10-3 960 959 957 957 958 959 959 958 959 959 958 0.18
gap12-3 1433 1433 1432 1432 1433 1433 1433 1432 1433 1433 1432 0.03
GGA to PGLS (section 2.7.2 on page 44).
From the table, we observe that the results obtained by GGA, GA-HEU and
PGLS are better than the others'. Only GGA, GA-HEU and PGLS were able to
return the optimal solution for all 60 problems (based on NOPT), and their
NOPT results are very close. Comparing GGA, GA-HEU and PGLS on TOPT, GGA
achieved 592 optimal solutions from the total of 600 runs, which is about
99% of the runs. GA-HEU found 553 optimal solutions from the 600 runs,
which is 92% of all runs. And PGLS found 571 optimal solutions, giving it a
score of 95%.
4.4 Varying Tightness
When formulating a set of features for the GAP, we realised that, like
constraints, features have tightness too (see section 4.2.2). This led us to
investigate how tightness affects GGA's performance. Before proceeding
further, let us define two useful terms to describe features:
• A "loose" feature, in the most extreme sense, is a property exhibited
by only one candidate solution in the entire search space. A loose
feature will be penalised seldom or never. A problem with a loose
feature set will reach a point where no penalisation occurs, and thus
the search becomes unguided.
• A "tight" feature (in a similarly extreme sense) is a property
displayed by all candidate solutions in the search space. If the feature
Table 4.5: Comparing results by GGA with others.
Problem MTH FJVBB FSA MTBB SPH LT1FA RSSA TS6 TS1 GA-WH GA-HEU PGLS GGA
gap1 5.43 0.00 0.00 0.00 0.08 1.74 0.00 0.00 0.00 0.00 0.00 0.00 0.00
gap2 5.02 0.00 0.19 0.00 0.11 0.89 0.00 0.24 0.10 0.01 0.00 0.00 0.00
gap3 2.14 0.00 0.00 0.00 0.09 1.26 0.00 0.03 0.00 0.01 0.00 0.00 0.00
gap4 2.35 0.83 0.06 0.18 0.04 0.72 0.00 0.03 0.03 0.03 0.00 0.00 0.00
gap5 2.63 0.07 0.11 0.00 0.35 1.42 0.00 0.04 0.00 0.10 0.00 0.00 0.00
gap6 1.67 0.58 0.85 0.52 0.15 0.82 0.05 0.00 0.03 0.08 0.01 0.00 0.00
gap7 2.02 1.58 0.99 1.32 0.00 1.22 0.02 0.02 0.00 0.08 0.00 0.00 0.00
gap8 2.45 2.48 0.41 1.32 0.23 1.13 0.10 0.14 0.09 0.33 0.05 0.04 0.01
gap9 2.18 0.61 1.46 1.06 0.12 1.48 0.08 0.06 0.06 0.17 0.00 0.00 0.00
gap10 1.75 1.29 1.72 1.15 0.25 1.19 0.14 0.15 0.08 0.27 0.04 0.04 0.01
gap11 1.78 1.32 1.10 2.01 0.00 1.17 0.05 0.02 0.02 0.20 0.00 0.00 0.00
gap12 1.37 1.37 1.68 1.55 0.10 0.81 0.11 0.07 0.04 0.17 0.01 0.01 0.00
δ_all n/a n/a n/a n/a n/a 1.15 0.21 0.10 0.07 0.12 0.01 0.01 0.00
δ_best 2.56 0.84 0.72 0.78 0.13 1.15 0.04 0.06 0.03 0.02 0.00 0.00 0.00
NOPT 0 26 n/a 24 40 3 39 40 45 51 60 60 60
TOPT n/a n/a n/a n/a n/a n/a n/a n/a n/a n/a 553 571 592
n/a = results not available
MTH: Constructive heuristic, Martello [63]
FJVBB: Branch and bound procedure with upper CPU limit, Fisher [64]
FSA: Fixing simulated annealing algorithm, Cattrysse [65]
MTBB: Branch and bound procedure with upper CPU limit, Martello [66]
SPH: Set partitioning heuristic, Cattrysse [67]
LT1FA: Long term descent with 1-interchange mechanism and first-admissible selection, Osman [68]
RSSA: Hybrid simulated annealing/tabu search, Osman [68]
TS6: Long term tabu search with first-admissible selection, Osman [68]
TS1: Long term tabu search with best-admissible selection, Osman [68]
GA-WH: GA without the heuristic operator, Chu [27]
GA-HEU: GA with the heuristic operator, Chu [27]
GGA: Guided genetic algorithm
set of a problem is defined too tightly, penalisation will always occur.
The penalisation will rotate among a group of the tightest features, and
the search will degrade into a cyclic process.
We can use the tightening index t' (Eq. 4.18), v_j (Eq. 4.19) and u'_j
(Eq. 4.20) together as a recursive function to vary the tightness of our
feature set. We need three feature sets of different tightness:

• Tightness T0: The least tight of the three feature sets. This feature
set is computed using just t (Eq. 4.15) and u_j (Eq. 4.16) only (i.e.
without passing u_j into Eq. 4.18, 4.19 and 4.20, so we simply assign
u'_j = u_j).

• Tightness T1: The standard feature set that we used in the benchmark.
After computing u_j through Eq. 4.15 and 4.16, we pass it through
Eq. 4.18, 4.19 and 4.20 to obtain u'_j. This is then used to form our
feature set.

• Tightness T2: The tightest feature set of the three, computed as
follows: compute u_j using Eq. 4.15 and 4.16; pass u_j into Eq. 4.18,
4.19 and 4.20 to get u'_j; then pass this u'_j into Eq. 4.18, 4.19 and
4.20 again, in place of u_j, to get the tighter u'_j.
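The relationship between the three feature sets can be sketched as repeated application of a single tightening pass. In the sketch below, `tighten` is a stand-in for the composition of Eq. 4.18-4.20 (whose internals depend on the problem data and are omitted here); the function names are ours:

```cpp
#include <cassert>
#include <functional>

// Derive the utility bound for tightness level T0, T1 or T2 by applying
// the tightening pass zero, one or two times (assumption: `tighten`
// performs one pass of Eq. 4.18-4.20 for a given job).
double tightness(double u_j, int level,
                 const std::function<double(double)>& tighten) {
    double u = u_j;              // T0: u'_j = u_j (no pass)
    for (int k = 0; k < level; ++k)
        u = tighten(u);          // T1: one pass; T2: two passes
    return u;
}
```

The recursion could be continued beyond T2, but as the experiments below show, tighter is not better.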
Adhering to the same GGA parameters and testing environment as the
benchmark, we ran GGA with the three different feature sets. The results
are based on the computed average percentage deviation δ (Eq. 4.22) for 10
runs of each problem. These results are tabulated in Table 4.6, and can be
compared directly with the results in Table 4.5.
Table 4.6: Comparing δ results by GGA using feature sets of different tightness.

Problem   T0    T1    T2
gap1      0.00  0.00  0.00
gap2      0.00  0.00  0.00
gap3      0.00  0.00  0.00
gap4      0.00  0.00  0.00
gap5      0.00  0.00  0.00
gap6      0.00  0.00  0.00
gap7      0.00  0.00  0.00
gap8      0.02  0.01  0.03
gap9      0.00  0.00  0.00
gap10     0.01  0.01  0.02
gap11     0.00  0.00  0.00
gap12     0.00  0.00  0.01
δ_all     0.00  0.00  0.01
δ_best    0.00  0.00  0.00
NOPT      60    60    60
TOPT      586   592   568
4.4.1 Results
Based on the results, we can see that feature sets that are too loose (T0)
or too tight (T2) do not guide GGA as well as T1. Moreover, the tightest
feature set (T2) performed worst of the three.
Our explanation: a tighter feature set implies that features occur more
easily. This may lead to over-penalisation, such that the search space
becomes restricted. Thus, GGA's cross-over operator will not be able to
produce any useful new chromosome, because the new chromosome would
usually contain one or more penalised features. The search will eventually
enter a penalisation cycle, and the solutions produced by GGA will not
improve, and may even degrade.
When a feature set is too loose (such as T0), its members may not provide
enough stimulation to guide the search. The search will reach a point where
no penalisation ever occurs; in such a state, the search becomes unguided.
But GGA's cross-over operator was able to roam the search space more
thoroughly, so GGA was still able to perform relatively well with feature
set T0.
4.5 Summary
The results in Table 4.1 show that GGA returned the optimal result in all
10 trial runs for every problem except two, gap8-3 and gap10-3. The trial
runs for these two problems in Table 4.3 show GGA returning the optimal
solution half the time, and where the result is not the optimum, GGA's
solution is just one step away from it.
Further, GGA has a δ_all value of 0.00150, versus 0.00983 for GA-HEU and
0.0075 for PGLS, which shows that the solutions returned by GGA are closer
to the optimum than GA-HEU's or PGLS's. Also noteworthy is that, of the 60
problems in the GAP, GGA returned the optimal solution in all 10 runs for
58 instances (97% of the 60 problems), versus GA-HEU's 47 (78%) and PGLS's
55 (92%). From these results, one can conclude that GGA, GA-HEU and PGLS
all achieve very good results, but GGA edges out GA-HEU
and PGLS by offering more robustness.
Through varying tightness, we see that the design of the feature set has
some effect on GGA's performance. A tight feature set restricts the
usefulness of GGA's cross-over operator; in such cases, GGA has little
advantage over GLS. On the other hand, a loose feature set needs the
population paradigm of GGA to be solved successfully. Overall, GGA is a
safe choice, generally giving better performance than its peers.
Choosing a good feature set is usually a trial-and-error process, driven
more by experience and a little instinct. But by knowing the tightness of a
problem, and by understanding its effects, we can choose our algorithm in a
more informed way.
Chapter 5
Radio Link Frequency
Assignment Problem
A paper on this chapter was submitted to the Journal of Constraints, special
issue on Biomatics and Biocomputing [78].
The EUCLID CALMA (Combinatorial Algorithms for Military Applications)
consortium is a group of six research bodies in Europe that was
formed to investigate the use of AI techniques to aid military decisions. The
Radio Link Frequency Assignment Problem (RLFAP) is a case study pro-
posed within the group to observe the effectiveness of different approaches.
It is an abstraction of a real world military application that is concerned
with assigning frequencies to radio links. The RLFAP1 benchmark contains
eleven instances with various optimization criteria, made publicly available
through the efforts of the French Centre d'Electronique l'Armament. The
RLFAP is NP-hard and is a variation on the T-graph colouring problem
introduced in [79].
¹ The RLFAP is available from the Centre d'Electronique l'Armament (France), via ftp at ftp.cert.fr/pub/bourret.
Types of Instances
Each instance in the RLFAP benchmark has a set of files that describe its
variables, their domains, the constraints, and the objective. In addition,
we are given information on the respective optimization requirements, based
on the solubility of the problem. The optimization criteria describe the
interpretation of variable instantiations and the means of measuring their
desirability, thus shaping the objective function of our search routine,
whereas the solubility of the problem states whether a solution can be
found without violating any constraint. If an instance can be solved
without constraint violation, then its optimality is defined as either O1
or O2; otherwise it is O3 (see below). For an insoluble instance, we use
the violation cost of each constraint to solve the instance as a PCSP.
Here, regardless of a problem's solubility, all instances in the RLFAP are
solved as PCSPs, since a CSOP can be mapped into a PCSP by giving each
constraint a violation cost; this violation cost is the same for all
constraints in an instance.
O1 - an optimal solution is one with the fewest number of different values
in its variables.
O2 - an optimal solution is one where the largest assigned value is minimal.
O3 - if the problem cannot be solved without violating constraints, find
a solution that minimizes the objective function as follows:
a_1·nc_1 + a_2·nc_2 + a_3·nc_3 + a_4·nc_4
+ b_1·nv_1 + b_2·nv_2 + b_3·nv_3 + b_4·nv_4          (5.1)
where nc_i is the number of violated constraints of priority i, and nv_i
is the number of modified variables with mobility i. The mobility of a
radio link states the cost of changing its frequency from the assigned
default. The values of the weights a_i and b_i are given where necessary.
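Eq. 5.1 is a straightforward weighted sum; a minimal sketch follows (array indices 0-3 stand for the priority/mobility classes 1-4; the names are ours):

```cpp
#include <array>
#include <cassert>

// O3 objective (Eq. 5.1): weighted counts of violated constraints (nc)
// per priority class plus weighted counts of moved variables (nv) per
// mobility class.
double o3_objective(const std::array<double, 4>& a,
                    const std::array<int, 4>& nc,
                    const std::array<double, 4>& b,
                    const std::array<int, 4>& nv) {
    double cost = 0.0;
    for (int i = 0; i < 4; ++i)
        cost += a[i] * nc[i] + b[i] * nv[i];
    return cost;
}
```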
All constraints in the RLFAP benchmark are binary; that is, each constraint
operates on the values of two variables. These constraints test the
absolute difference of two variables in a candidate solution, where this
logical test belongs to one of the two following classes:
C1 - the absolute difference must be less than a constant.
C2 - the absolute difference must be equal to a constant.
Table 5.1 lists the instances, their characteristics and their objectives.
From this table, we can observe that the RLFAP contains instances that are
varied in both their optimization and constraint criteria. Further, the
number of variables, their domain sizes, and the number of constraints on
these variables make the RLFAP a non-trivial problem set for any algorithm.
The RLFAP tests not only the quality and robustness of an algorithm, but
also its flexibility in adapting to the different optimization and
constraint criteria of each instance.
Table 5.1: Characteristics of RLFAP instances.
Instance No. of No. of Souble Minimize Typevariables constraints
scen01 916 5548 Yes Number of di�erent values used O1scen02 200 1235 Yes Number of di�erent values used O1scen03 400 2760 Yes Number of di�erent values used O1scen04 680 3968 Yes Number of di�erent values used O1scen05 400 2598 Yes Number of di�erent values used O1scen06 200 1322 No The maximum value used O2scen07 400 2865 No Weighted constraint violations O3scen08 916 2744 No Weighted constraint violations O3scen09 680 4103 No Weighted constraint violations
and mobility costsO3
scen10 680 4103 No Weighted constraint violationsand mobility costs
O3
scen11 680 4103 Yes Number of di�erent values used O1
5.1 The RLFAP in PCSP Expression
A PCSP is defined as a quadruple {Z, D, C, f}, where Z is a finite set of
variables. With respect to Z, D is a function that maps every variable to a
set of values, called its domain. C is a finite set of constraints, each
affecting a subset of the variables and each carrying a cost for its
violation. The objective function f returns a magnitude based on the
instantiation of the variables and the satisfaction of the constraints. In
the RLFAP, each instance has a set of files that conveniently describe the
respective Z, D and C sets, and the optimization objective f.
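The quadruple {Z, D, C, f} maps naturally onto a small data structure; the sketch below is illustrative only (field names are ours, not the thesis's):

```cpp
#include <cassert>
#include <vector>

// One constraint: the variables it affects plus its violation cost.
struct Constraint {
    std::vector<int> scope;  // indices into Z (all binary in the RLFAP)
    double cost;             // charged when the constraint is violated
};

// The PCSP quadruple {Z, D, C, f}; the objective f is supplied per
// instance (see Table 5.1) and so is kept out of the data record.
struct PCSP {
    int num_variables;                      // |Z|
    std::vector<std::vector<int>> domains;  // D(q_j) for each variable
    std::vector<Constraint> constraints;    // C, with violation costs
};
```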
5.1.1 Variables and Domains
For any RLFAP instance with m variables, let q_j be a variable in Z
representing one radio link. For each variable q_j in Z, there is one
associated domain mapped by the function D, denoted D(q_j), which contains
a set of n values, each value representing a valid frequency that can be
assigned to the variable.

Z = {q_1, q_2, …, q_m}                                        (5.2)

where ∀ q_j ∈ Z : D(q_j) = {freq_1, freq_2, …, freq_n}        (5.3)
5.1.2 Constraints
The constraint set C consists of n elements, representing the n constraints
in the instance. Each element of C consists of the constraint c_i and its
cost cost_i² (Eq. 5.4). As discussed at the start of this chapter (page 84),
there are two types of binary constraints in the RLFAP, C1 and C2, which we
formulate in Eq. 5.5. In that equation, q_a and q_b are two variables from
a candidate solution, and z is a constant. Eq. 5.6 states that constraint
c_i returns a binary value: 1 for a violation and 0 otherwise.

C = { ⟨c_1, cost_1⟩, ⟨c_2, cost_2⟩, …, ⟨c_n, cost_n⟩ }        (5.4)

∀ c_i ∈ C : { c_i ≡ |q_a − q_b| < z,   if c_i is type C1
            { c_i ≡ |q_a − q_b| = z,   if c_i is type C2       (5.5)

where c_i = { 1,   if the constraint is violated
            { 0,   otherwise                                   (5.6)

² The cost for violating the constraint.
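A violation test for the two constraint classes of Eq. 5.5-5.6 might look like this (a sketch; the type and function names are ours):

```cpp
#include <cassert>
#include <cstdlib>

enum class CType { C1, C2 };

// Eq. 5.5-5.6: returns 1 when the binary constraint on frequencies qa and
// qb is violated, 0 otherwise. C1 requires |qa - qb| < z; C2 requires
// |qa - qb| == z.
int violated(CType type, int qa, int qb, int z) {
    int d = std::abs(qa - qb);
    if (type == CType::C1) return d < z ? 0 : 1;
    return d == z ? 0 : 1;
}
```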
5.1.3 Objective Function
The respective objective functions f of the instances in the RLFAP are
stated in Table 5.1. The objective functions are also explained at the
beginning of this chapter (on page 84).
5.2 Preparing GGA to solve RLFAP
In section 5.1, we expressed the RLFAP as a formal PCSP. In this section,
we discuss the steps needed to adapt those definitions into a form that
GGA can use.
In general, for all (O1, O2 and O3) types of instances, the feature set μ
is the union of the feature set of constraints μ_cst and the set of
mobility features of the radio links μ_mbt (Eq. 5.7). Constraint c_i,
defined in Eq. 5.5, is recast as a feature in the set μ_cst (Eq. 5.8),
where a one is returned if the constraint cannot be satisfied, and zero
otherwise. The cost ν_cst_i attached to each constraint c_i depends on the
nature of the instance. If the instance is soluble, then all ν_cst_i are
set to a large value, say 10000, to signify that the constraint must not be
broken (i.e. hard constraints). For insoluble instances, ν_cst_i is set to
the weights given for its priority class (see page 84). As with soluble
instances, hard constraints in insoluble instances have their ν_cst_i set
to a large value. The set μ_cst has n features, where n is the number of
constraints in the instance.
In addition to minimizing constraint violation costs (as above), the O3
objective type of instances requires us also to minimize the mobility cost
of our candidate solution. The set of mobility costs defines our next
feature set, μ_mbt (Eq. 5.9). For each variable in these instances, there
is a mobility cost ν_mbt_i and a default assigned frequency default_i. If,
in our candidate solution, a variable has been assigned a value different
from its default default_i, then a one is returned, and zero otherwise.
The mobility cost ν_mbt_i is set to the weights given for its priority
class (again see page 84). There are radio links whose frequency should
never change, and the mobility cost for these has been set to a large
value. The feature set μ_mbt defines n features, where n is the number of
variables in the instance.
μ = μ_cst ∪ μ_mbt                                              (5.7)

∀ μ_i ∈ μ_cst : μ_cst_i(x_p) ≡ { 1,   if type C1 and |x_{p,a} − x_{p,b}| ≥ z_i
                                { 1,   if type C2 and |x_{p,a} − x_{p,b}| ≠ z_i
                                { 0,   otherwise               (5.8)

∀ μ_i ∈ μ_mbt : μ_mbt_i(x_p) ≡ { 1,   if x_{p,i} ≠ default_i
                                { 0,   otherwise               (5.9)
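The two indicator families can be sketched directly from Eq. 5.8-5.9, treating a chromosome as a vector of assigned frequencies (function names are ours):

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Constraint feature (Eq. 5.8): 1 when the binary constraint between genes
// a and b of chromosome x is violated (C1: |x_a - x_b| >= z; C2: != z).
int mu_cst(const std::vector<int>& x, bool is_c1, int a, int b, int z) {
    int d = std::abs(x[a] - x[b]);
    return is_c1 ? (d >= z ? 1 : 0) : (d != z ? 1 : 0);
}

// Mobility feature (Eq. 5.9): 1 when variable i has been moved away from
// its default frequency.
int mu_mbt(const std::vector<int>& x, int i, int default_i) {
    return x[i] != default_i ? 1 : 0;
}
```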
For all instances in the RLFAP, we seek to minimize the function g
(Eq. 5.10). Within g, the function f depends on the objective type of the
instance (Eq. 5.11). The cost ν_i and the indicator μ_i both refer to the
unified feature set μ; they associate automatically with ν_cst_i and
μ_cst_i, or ν_mbt_i and μ_mbt_i, where applicable. Thus the value of n in
Eq. 5.10 is the sum of the numbers of features in μ_cst and μ_mbt.
g(x_p) = f(x_p) + λ · Σ_{i=1}^{n} (ν_i · μ_i(x_p))             (5.10)

f(x_p) = { number of different values used in x_p,   if O1
         { largest value used in x_p,                if O2
         { a_1·nc_1 + a_2·nc_2 + a_3·nc_3 + a_4·nc_4
           + b_1·nv_1 + b_2·nv_2 + b_3·nv_3 + b_4·nv_4,   if O3   (5.11)
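Putting the pieces together, the augmented fitness of Eq. 5.10 can be sketched as follows, with the instance objective f and the indicator functions passed in as callables (names are ours, not the thesis's):

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Eq. 5.10: g(x) = f(x) + lambda * sum_i (nu_i * mu_i(x)), where each
// mu_i is a 0/1 indicator feature and nu_i its cost.
double g_fitness(
    const std::vector<int>& x,
    const std::function<double(const std::vector<int>&)>& f,
    const std::vector<std::function<int(const std::vector<int>&)>>& mu,
    const std::vector<double>& nu,
    double lambda) {
    double penalty = 0.0;
    for (std::size_t i = 0; i < mu.size(); ++i)
        penalty += nu[i] * mu[i](x);
    return f(x) + lambda * penalty;
}
```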
5.3 Benchmark
The RLFAP benchmark results for the algorithms devised within the CALMA
group were reported by Tiourine et al. in [80]. In this section, we compare
GGA's results with those of the CALMA algorithms, and we also examine the
value that GGA adds to the canonical GLS (both in section 5.3.2).
5.3.1 Test Environment
In our physical environment, GGA was written in C++ and compiled using
GNU GCC version 2.7.1.2. The code runs on an IBM PC compatible with a
Pentium 133 MHz processor, 32MB of RAM and 512KB of Level 2 cache. Both
compilation and execution of GGA were performed on the Linux operating
system, kernel version 2.0.27. Under GGA's environment, we use a mutation
rate and a cross-over rate of 1.0, a population size of 20, and a stopping
criterion of 100 generations. For the fitness function g, λ has a value
of 10.
5.3.2 Performance Evaluation
Comparing all Algorithms
In Table 5.2, we see the published results of all the CALMA algorithms,
GLS and GGA. The algorithms from the CALMA project groups are either
complete or stochastic methods. The results recorded in the table are from
the best solution each algorithm generated. For the soluble instances
(scen01, scen02, scen03, scen04, scen05 and scen11)³, we report the number
of frequencies above the known optimum that each solution (generated by the
respective algorithm) uses. Results for the insoluble instances are
reported as the percentage deviation from the best known reported solution.
The average time taken by each algorithm to arrive at these solutions is
also reported in the table.
For the soluble instances, we observe that only seven of the 13 algorithms⁴
were able to provide a solution to all the instances. Of the six soluble
instances, GGA failed to return an optimal solution only for instance
scen11. This instance has proved difficult for most of the algorithms:
only three (Taboo Search (EUT), Branch and Cut (DUT,EUT) and Constraint
Satisfaction (LU)) were able to return a solution on par with the best
known, and only two (Branch and Cut (DUT,EUT) and Constraint Satisfaction
(LU)) reported optimal results. However, these two algorithms were limited
to solving soluble instances only.
³ Instance scen06 was found to be insoluble, and was thus solved as an O3 problem.
⁴ The Genetic Algorithms from (LU) were designed for PCSPs, and so did not attempt any of the soluble instances.
Looking at the insoluble instances, we see that 11 of the 14 algorithms
were able to tackle these PCSP instances. Of the 11, nine managed to
provide a satisfactory solution. Of note is Genetic Algorithms (LU), which
found the best known solutions.
Overall, we see that the top performers in each category (soluble and
insoluble) are limited in application to that category: Branch and Cut
(DUT,EUT) and Constraint Satisfaction (LU) for soluble instances, and
Genetic Algorithms (LU) for insoluble instances. Of the 14 algorithms in
total, 10 were applicable to both categories, and of these 10, only four
were able to provide a satisfactory solution to all 11 instances. Amongst
these four, GLS and GGA had the better consistency in arriving at solution
quality very close to the best known for each of the 11 instances.
Comparing Genetic Algorithms
Genetic Algorithms (UEA) [24], Genetic Algorithms (LU) [24] and GGA are
the three GAs reported here. Genetic Algorithms (UEA) are a set of GAs
based on the canonical GA but with specialized genetic operators, and are
able to tackle both soluble and insoluble instances. Results-wise, GGA
performed better than Genetic Algorithms (UEA) in all but two instances
(scen06 and scen10). Of note, GGA had a clear edge over the solutions from
Genetic Algorithms (UEA) for instances scen01, scen07, scen08 and scen11.
Genetic Algorithms (LU) has genetic operators that exploit domain knowl-
Table 5.2: Comparison of GGA with GLS and the CALMA project algorithms.
Soluble Instances Insoluble Instances
Instance (scen) 01 02 03 04 05 11 Time 06 07 08 09 10 Time Platform
Simulated Annealing (EUT) 2 0 2 0 0 2 1min 6% 65% 5% 0% 0% 310min SUN Sparc 4
Taboo Search (EUT) 2 0 2 0 - 0 5min - - - - - - SUN Sparc 4
Variable Depth Search (EUT) 2 0 2 0 - 10 6min 3% 0% 14% 0% 0% 85min SUN Sparc 4
Simulated Annealing (CERT) 4 0 0 0 - 10 41min 42% 1299% 70% 2% 0% 42min SUN Sparc 10
Tabu Search (KCL) 2 0 0 0 0 2 40min 167% 1804% 566% 8% 1% 111min DEC Alpha
Extended GENET (KCL) 0 0 0 0 0 2 2min 12% 27% 40% - - 20min DEC Alpha
Genetic Algorithms (UEA) 6 0 2 0 - 10 24min 0% 386% 134% 3% 0% 120min DEC Alpha
Genetic Algorithms (LU) - - - - - - - 0% 0% 0% 0% 0% hours DEC Alpha
Partial Constraint Satisfaction (CERT) 4 0 6 0 0 - 28min 83% 2563% 246% 47% 12% 6min SUN Sparc 10
Potential Reduction (DUT) 0 0 2 0 0 - 3min 27% - - 4% 1% 10min HP 9000/720
Branch and Cut (DUT,EUT) 0 0 0 0 0 0 <10min - - - - - - -
Constraint Satisfaction (LU) 0 0 2 0 0 0 hours - - - - - - PC
Guided Local Search (UE) 0 0 0 0 0 6 20sec 4% 9% 7% 0.7% 0.003% 2.88min DEC Alpha
Guided Genetic Algorithm (UE) 0 0 0 0 0 2 40min 4% 9% 7% 0.7% 0.003% 60min PC Linux
Best known solution 16 14 14 46 792 22 3437 343594 262 15571 31516
Results for the soluble instances are reported as the number of frequencies more than the optimum used.
Results for the insoluble instances are reported as the percentage deviation from the best known solution.
CERT Centre d'Etudes et de Recherces de Toulouse, France
DUT Delft University of Technology, The Netherlands
EUT Eindhoven University of Technology, The Netherlands
KCL King's College London, United Kingdom
LU Limburg University, Maastricht, The Netherlands
UEA University of East Anglia, Norwich, United Kingdom
UE University of Essex, United Kingdom
edge. This GA-type algorithm has been shown to be very effective for the
insoluble instances of the RLFAP benchmark, as it was able to return
solutions that are currently the best known. However, this achievement
comes at the price of expensive computation: Genetic Algorithms (LU)
requires a large population size and a long processing time (compared with
Genetic Algorithms (UEA) and GGA).
Comparing GGA and GLS
Evaluating an algorithm's worth should not be limited to the quality of
the best solution it recommends. Besides solution quality and computation
speed, one other measure is robustness. The robustness of an algorithm
measures the consistency of the solutions it returns: in certain
environments, one may need to be absolutely sure that the solution returned
by an algorithm is, or is very near, the optimum. Unfortunately, we do not
have any information on the robustness of the algorithms in the CALMA
project. In the following, we compare the statistical results of GGA and
GLS.
For this comparison, GGA runs with a population of only five chromosomes.
This is the minimum number of chromosomes needed to maintain an advantage
over GLS, and the minimum population size that achieves acceptable results.
Increasing the population size improves the robustness of GGA, but with
diminishing returns. We introduce a variation of GLS, called PGLS
(section 2.7.2 on page 44), to compete with GGA on even ground. PGLS has
five GLS instances running concurrently and independently of each other,
each maintaining its own candidate solution. The run time and iteration
count for each GLS thread within PGLS
has also been extended to match GGA's; that is, each GLS thread in PGLS
has a limit of 100 iterations to match the 100-generation limit for GGA.
At the end of each run, only the best solution from PGLS was used. The
three algorithms (GGA, GLS and PGLS) were each executed 50 times on each
of the 11 instances.
The results in Table 5.3 show GGA to have better robustness than GLS or
PGLS on the soluble instances; in the case of scen11, GGA also reached a
better solution. Moreover, GGA practically guarantees finding the optimal
solutions for soluble instances scen01, scen02, scen03, scen04 and scen05.
Results for the insoluble instances are mixed, with the algorithms
returning results very close to one another, but overall GGA achieved a
better average cost and smaller standard deviation than PGLS on four of
the five insoluble instances.
In Table 5.2, we see that the solution quality reported by GLS and GGA was
very much the same, except on soluble instance scen11, where GGA managed
to better GLS's result. In the amount of CPU time required to compute these
results, GLS is much superior. However, it is unfair to view GGA as just a
parallel version of GLS.
In section 2 (page 26), we describe the mechanics of GGA. We have
integrated GLS as a component of GGA, i.e. as the penalty operator. The
feedback from GLS is used at two levels within GGA. On one level, GLS
modifies the objective (fitness) function of GGA to influence its search.
On another level, information from GLS is encoded into the template of each
chromosome, which rates the relative fitness of each gene in the
chromosome. The templates are used to influence the cross-over and mutation
processes. And because GGA manages several candidate solutions at the same
time, it has a more complex means of detecting local optimum traps.
5.4 Summary
In section 5.3.2 (page 91), we saw that GGA is one of the few algorithms
that can solve both CSOP and PCSP types of problems. Out of the six
CSOP instances, GGA achieved the published optimum results for five.
This placed GGA as the third most effective algorithm (out
of 13). Further, in Table 5.3, we see that GGA managed to solve five of
the six CSOP instances to optimality in every run.
Although GGA did not reach the published optimum for the five PCSP
instances, it was consistent in finding solutions very close to it. Out of
the 11 algorithms, GGA is the third most effective (on par with GLS). From
Table 5.2, we see that both GGA and GLS achieved the same results. But
in terms of robustness (Table 5.3), GGA held more consistent
results than GLS and PGLS.
This shortfall in performance supports the hypothesis put forward in section
4.4.1 (page 80) and demonstrated through the tightness experiments (section
4.4, page 77). The constraints in the PCSP instances of the RLFAP
form very tight feature sets, and the tightness experiments showed that GGA
has little advantage over GLS when the feature set is tight.
Table 5.3: Comparing Robustness between GGA, GLS and PGLS.
Best Known Average Cost Standard Deviation
Instance Solutions GLS PGLS GGA GLS PGLS GGA
scen01 16 18.6 17.0 16.0 2.3 0.8 0.0
scen02 14 14.0 14.0 14.0 0.0 0.0 0.0
scen03 14 15.4 14.4 14.0 1.3 0.4 0.0
scen04 46 46.0 46.0 46.0 0.0 0.0 0.0
scen05 792 792.0 792.0 792.0 0.0 0.0 0.0
scen11* 22 - 33.9 30.2 - 3.1 1.7
scen06 3437 4333.8 4129.6 4051.6 766.0 538.2 529.3
scen07 343594 530641.1 510532.5 513044.8 79666.7 75149.4 75612.8
scen08 262 335.7 322.6 320.5 34.7 23.1 21.5
scen09 15571 15999.7 15895.0 15889.6 194.7 112.6 109.3
scen10 31516 31686.6 31631.4 31626.1 146.1 108.7 102.8
* For scen11, results for GLS could not be computed because it did not
return a solution that satisfied all constraints for some runs.
Chapter 6
Processor Configuration
Problem
Two papers were written from this chapter. One was accepted by IEEE
Expert (Intelligent Systems and their Applications) [81], and a shorter ver-
sion was accepted by the 10th IEEE International Conference on Tools with
Artificial Intelligence [82].
If a complex problem can be decomposed into smaller parts, then solving
these smaller parts concurrently could potentially mean a shorter solution
time. A distributed-memory multi-processor system would therefore be a
suitable candidate for solving such problems. An example of such a system
may consist of many transputers processing in parallel, each having its own
memory and instructions. These processors work in tandem to solve complex
problems, sharing information where necessary by exchanging messages.
The Processor Configuration Problem (PCP) is an NP-hard real-life problem
that involves designing a network over a finite set of processors, where each
processor has the same finite
number of communication channels [54]. The PCP has a significant role in
the design of an effective system, as the performance of a multiprocessor net-
work largely depends on the efficiency of the message transfer system, which
coordinates the shuttling of message packets between the processors.
Processors not directly connected to each other have to communicate
via intermediate processors. As the number of processors in the network
increases, the average number of intermediate processors a message has
to pass through to arrive at its destination may also increase. Thus careful
planning of the layout of the transputers can reduce the travelling distance
between any two transputers in the network.
6.1 Applications
To better understand the impact that global communication overhead
has on solution time, consider Chalmers' application of image synthesis by
radiosity algorithms [54]. Chalmers compared various multiprocessor layouts
to show the effects of communication overhead on the time taken to generate
the test image.
From the special effects in movies, to the realistic models of buildings
that help architects visualize their designs, to the real-time rendering in virtual
reality systems, realistic image synthesis has become an important part of our
daily lives. Realism in generating these images depends on simulating the
interaction of light as it strikes the surfaces of objects in a complex environment,
which gives a more natural representation of light, shade and texture. The
radiosity method by Goral et al. [83] closely models the physical propaga-
tion of light through an environment, as it is based on the fundamental law
of the conservation of energy within closed systems. However, the radiosity
method is computationally intensive and requires vast amounts
of processing power and memory, which makes it very suitable for
parallel processing.
In comparing different multiprocessor layouts, Chalmers used a graphical
model as the test subject and varied its number of patches1 to create a set of
problems. Chalmers compared the layouts generated by his algorithm, the
AMP, against two classic layouts: the ring and the torus. The ring (Fig. 6.1)
is one of the simplest and most popular configurations, and requires only two
links per processor. The torus configuration (Fig. 6.2) may be constructed
as rings of the basic ring layout. Each processor in the torus
has two links to its neighbours in its ring, and another two links to the
processors in the adjacent rings [84]. The AMP produces layouts that seek
to minimize the average distance between any two processors in the layout.
Figure 6.1: A 12-processor ring.
1An object is represented by a set of patches, where each patch describes one part of its surface. So the greater the number of patches used to model an object, the more realistic it will appear.
Figure 6.2: A 32-processor torus.
Table 6.1: Time (in CPU seconds) for 32-processor configurations.

No. of patches   AMP       Torus      Ring
448              92.863    99.491     124.548
700              159.794   168.202    203.336
1008             268.480   280.929    323.865
1372             437.157   446.410    498.552
1792             676.630   697.977    766.065
2268             997.340   1023.468   1104.618
The model was rendered on 32-processor and 63-processor configurations
in the torus, the ring and the layout produced by the AMP algorithm. Tables
6.1 and 6.2 report the time taken (in CPU seconds) for the rendering pro-
cess with different numbers of patches, using the different multiprocessor layouts.
From the two tables, we can clearly see that the layouts produced by the AMP
algorithm have a consistent advantage in all cases. Rendering photo-realistic
images is one of many complex problems that produce heavy global commu-
nication traffic on a transputer network. Careful design of network layouts
is a practical and important task that helps to reduce solution times
without incurring additional cost.
Table 6.2: Time (in CPU seconds) for 63-processor configurations.

No. of patches   AMP       Torus      Ring
448              87.199    96.947     194.176
700              128.066   140.220    259.962
1008             192.156   209.165    357.064
1372             283.004   302.443    478.847
1792             411.057   434.849    643.296
2268             585.474   614.951    806.150
6.2 Understanding the PCP
To understand the terms used in this chapter, a little introduction is in
order; see Table 6.3. To see these terms in use, examine the graph in
Fig. 6.3, where 1-2, 4-5 and 6-7 are examples of links. Assuming that
we are passing a message from node 0 to node 8, the paths available are
0-1-2-3-8, 0-4-5-8 and 0-6-7-8. However, of the three paths, only
0-4-5-8 and 0-6-7-8 are the shortest (since each has a distance of
three links), and so they are the optimal paths from processor 0 to processor
8.
Figure 6.3: Sample graph showing paths from node 0 to node 8.
Figs. 6.4 and 6.5 are sample layouts of five-processor networks. In both
Table 6.3: Important terms in the PCP.

Term          Explanation
Link          A connection between any two transputers. All links are
              bi-directional.
Path          An ordered set of links, listed in order of their
              progression from one processor to another.
Optimal path  The optimal path between any two processors is the shortest
              path between them. A message travelling from one processor
              to another could have several optimal paths to choose from,
              each equally short and therefore equally desirable.
Distance      The length of a path is the number of links in it. The
              distance between two processors is the length of the optimal
              path between them.
Diameter      The distance between any two processors, used in conjunction
              with a referenced processor to state which nodes fall within
              its range.
layouts, each processor has a total of four communication channels. In Fig.
6.4, the distance between any two processors is one. But in Fig. 6.5, we have
to consider the additional requirement that two arbitrary communication
channels (each belonging to a different processor) are reserved for
interfacing with a system controller. The network in Fig. 6.4 is called a
regular network, as all processors have the same number of communication
channels and all these channels are connected to other processors. In Fig.
6.5, not all the communication channels are connected to other processors,
and so the network is termed irregular. Networks that interface with a system
controller are the subject of interest in this chapter.
The network of processors can be perceived as a graph (from extremal
Figure 6.4: Processor configuration without System Controller.
Figure 6.5: Processor configuration with System Controller (SC). This is the type of network that we are interested in generating.
graph theory), with the processors being the nodes and the communication
channels assuming the role of edges [52]. For the PCP domain, a good
measure of quality is the mean inter-node distance, defined in Eq.
6.1, where N is the number of nodes, Npd is the number of processors
at distance d away from node p (in terms of the number of intermediate nodes),
and dmax represents the maximum range to calculate for (user specified).
d_{avg} = \frac{\sum_{p=1}^{N} \sum_{d=1}^{d_{max}} d \cdot N_{pd}}{N^2}.    (6.1)
Another measurement of quality would be the number of processor pairs
whose shortest path exceeds the maximum range dmax. The measurement
Nover is defined in Eq. 6.2. Depending on an algorithm's
optimization strategy, the index davg can be optimized at the expense of Nover;
an algorithm could shorten a few paths (thereby optimizing davg) by
lengthening a path (beyond dmax) elsewhere.
N_{over} = \sum_{p=1}^{N} \sum_{d=d_{max}+1}^{\infty} N_{pd}.    (6.2)
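As an illustration of Eqs. 6.1 and 6.2, both indices can be computed with a breadth-first search from each node. The following Python sketch is ours, not the thesis's ANSI C implementation, and the function names are hypothetical; Npd is realised implicitly by counting nodes at each BFS distance:

```python
from collections import deque

def distances_from(adj, src):
    """BFS distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def d_avg_and_n_over(adj, d_max):
    """Mean inter-node distance (Eq. 6.1) and overshoot count (Eq. 6.2)."""
    n = len(adj)
    total, n_over = 0, 0
    for p in adj:
        for q, d in distances_from(adj, p).items():
            if q == p:
                continue
            if d <= d_max:
                total += d     # contributes d * N_pd for d <= d_max
            else:
                n_over += 1    # pair whose shortest path exceeds d_max
    return total / (n * n), n_over

# A 4-node ring: every node is 1 or 2 links from every other node.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(d_avg_and_n_over(ring, d_max=2))
```

On the 4-node ring each node sees two nodes at distance 1 and one at distance 2, so the numerator of Eq. 6.1 is 16 and d_avg = 16/16 = 1.0, with no pair overshooting d_max = 2.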
Given that each node has q edges, Chalmers and Gregory [52] defined the
theoretical upper bound2 as the maximum number of nodes nmax that can
form a graph such that the shortest distance between any two nodes will
not exceed dmax (Eq. 6.3). This is used as a guide to set the dmax value in
davg. For example, if q = 4 and dmax = 3, then nmax = 3^0 + 3^1 + 3^2 + 3^3 = 40.
But since we understand that this is a theoretical bound, we need to allow
for some slack, and usually choose a dmax value that is higher by one unit; i.e.,
using the same example: when q = 4 and dmax = 3 we have nmax = 40, so to
calculate davg for a 40-processor configuration, we should use a dmax value of
4.
n_{max} = \sum_{d=0}^{d_{max}} (q - 1)^d.    (6.3)
2This is only theoretical, and there may not be a processor configuration that can satisfy the diameter constraint.
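Equation 6.3 and the worked example in the text can be checked directly; a small sketch of our own (the function name is hypothetical):

```python
def n_max(q, d_max):
    """Theoretical upper bound of Eq. 6.3: sum of (q-1)^d for d = 0..d_max."""
    return sum((q - 1) ** d for d in range(d_max + 1))

# The worked example from the text: q = 4, d_max = 3.
assert n_max(4, 3) == 3**0 + 3**1 + 3**2 + 3**3 == 40
# Per the text's rule of thumb, d_avg for a 40-processor configuration
# would then be calculated with a slack d_max of 3 + 1 = 4.
```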
6.3 Expressing PCP as a CSOP
6.3.1 Solution representation
We chose to represent an irregular network by conceiving it as a regular
network with a missing link. Our approach is to design the layout for a regular
network, and then transform it to the requirement we want (see section
6.4.3). A regular network is symmetrical whenever the sum of communication
channels on each processor is an even number. This property can be exploited
to derive a representation with fewer constraints than otherwise [48, 49].
Figure 6.6: A sample network.
Consider a regular network of three processors, each with four communi-
cation channels (Fig. 6.6). We can represent this network by the chromosome
{<2, 3>, <1, 3>, <2, 1>}, where each number represents the in-
stantiation of a variable. Each variable represents a communication channel,
and its value indicates which processor it is connected to. A processor group3
is a set of two variables that depicts two channels on a processor. In this
representation, only two links are explicitly represented. For example, if we
record that processor 1 uses a link to communicate with processor 2, then it
is understood that processor 2 uses one (the same) link to communicate with
3Each group is denoted by the enclosing angular brackets.
processor 1.
So given N processors, each with q communication channels, the con-
straints of the PCP under this representation are as follows:
(i) There will be N processor groups, with q/2 variables each.
(ii) The domain of each variable will be from 1 to N.
(iii) But a variable must not assume a value equal to the index of its own
processor group (i.e., a processor cannot connect to itself).
(iv) Each value in the domain will appear exactly q/2 times.
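Constraints (i)-(iv) are easy to check mechanically. The following Python sketch is our own illustration of the representation, not thesis code; the chromosome layout (a list of N groups of q/2 values) follows the example above:

```python
from collections import Counter

def is_valid_chromosome(groups, q):
    """Check constraints (i)-(iv) of the permutation representation.

    `groups` is a list of N processor groups (1-based processor ids);
    group p-1 holds the q/2 explicitly represented channels of processor p.
    """
    n = len(groups)
    if any(len(g) != q // 2 for g in groups):          # (i): q/2 vars each
        return False
    for p, g in enumerate(groups, start=1):
        if any(not 1 <= v <= n for v in g):            # (ii): domain 1..N
            return False
        if p in g:                                     # (iii): no self-link
            return False
    values = Counter(v for g in groups for v in g)
    # (iv): every processor id appears exactly q/2 times as a value
    return all(values[p] == q // 2 for p in range(1, n + 1))

# The sample network of Fig. 6.6: {<2,3>, <1,3>, <2,1>}, q = 4.
assert is_valid_chromosome([(2, 3), (1, 3), (2, 1)], q=4)
assert not is_valid_chromosome([(1, 3), (2, 3), (2, 1)], q=4)  # self-link
```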
6.4 Preparing GGA to solve PCP
6.4.1 Setting up GGA's genetic operators
Besides reducing the number of potential constraints, our chosen represen-
tation has the added advantage of casting all valid solutions (i.e., those without
any constraint violation) as a set of permutations. With this advantage,
we can reduce the computational cost of running this application by
simplifying the genetic operators in GGA.
Cross-over operator
The canonical GA's cross-over operator (and for that matter, the generic
GGA's cross-over operator) can be destructive when confronted with chro-
mosomes that are permutations. This is because the crossing over of two
permutations would seldom create a child chromosome that is also a permu-
tation. Researchers such as Goldberg [20] have suggested cross-over operators
that check for alignment of values in both parent chromosomes, or the use of
heuristic repairs to correct the child chromosome produced by any cross-over
operator.
However, using these cross-over operators or heuristic repair algorithms
not only complicates the search (and therefore increases resource require-
ments), but also does not guarantee the exploration effect of cross-over oper-
ators, since the cross-over process is now more restrictive. As for GGA's
cross-over operator, we could just rely on the constraints (defined in sec-
tion 6.3.1) to generate valid solutions, but that would have neutralized any
advantage the representation has given us.
The absence of any advantage and the promise of efficiency (without hav-
ing to perform cross-over) made it logical for us to rely on the mutation4 and
penalty operators only. The cross-over operator is shut off by setting the
cross-over rate to zero, and child chromosomes are produced by duplicat-
ing the chromosomes in the population pool; i.e., the chromosomes in the
population pool are copied into the offspring pool.
4A GA without cross-over is also known as an Evolution Strategy (ES) [85].
Mutation operator
Our modified mutation operator transforms a candidate solution by swapping
the contents of two genes. The two genes are selected by calling the roulette
wheel selection method (see section 2.6.2) twice. Provided that the two
selected genes are not the same and that they observe constraint (iii) (in
section 6.3.1), the values in the two genes are exchanged. Otherwise, the
selection process is repeated.
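A minimal sketch of this swap mutation, in Python rather than the thesis's ANSI C; the roulette-wheel selector of section 2.6.2 is replaced here by a uniform stand-in for brevity, and all names are our own:

```python
import random

def swap_mutate(groups, select, max_tries=100):
    """Swap the contents of two genes chosen by a selection function.

    `select` stands in for GGA's roulette-wheel selection and returns a
    (group, slot) position. The swap is only applied when the two genes
    differ and constraint (iii) (no self-link) still holds afterwards.
    """
    for _ in range(max_tries):
        (g1, s1), (g2, s2) = select(groups), select(groups)
        if (g1, s1) == (g2, s2):
            continue                      # same gene picked twice: retry
        v1, v2 = groups[g1][s1], groups[g2][s2]
        if v2 == g1 + 1 or v1 == g2 + 1:  # swap would create a self-link
            continue
        groups[g1][s1], groups[g2][s2] = v2, v1
        return groups
    return groups                         # give up after too many retries

random.seed(0)
def uniform_select(groups):               # placeholder for roulette wheel
    g = random.randrange(len(groups))
    return g, random.randrange(len(groups[g]))

net = [[2, 3], [1, 3], [2, 1]]            # the Fig. 6.6 network
mutated = swap_mutate(net, uniform_select)
```

Because mutation only ever exchanges two existing values, the multiset of values (and hence the permutation property of the representation) is preserved.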
6.4.2 Defining solution features of the PCP
Solution features help the search to escape from local optima, and for the
PCP we chose links as the solution features. The set of solution features comprises
all the N x (N - 1) possible permutations of links for a pair of processors.
Our indicator function Ii tests whether link i appears in a solution. The cost
of a link is dynamic, and depends on the layout of the network. In the PCP we
favour links that are frequently accessed, so the cost of a link should increase
as the link's utility decreases.
The Critical Path Analysis (CPA) method, designed for LPGA's attempt
on the PCP, is used to calculate the cost of the links present in a processor
configuration [48, 49]; i.e., costs are dynamic in this implementation. When
the trigger function determines that the search is trapped in a local opti-
mum, the penalty operator is called. The penalty operator hands the best
chromosome in the population pool to the CPA. The CPA outputs a hit
rate table that reports the utility of all links in the processor configuration
represented by that chromosome. Using the hit rate table, the cost of link i
is the highest hit rate in the table less the hit rate of link i. In other words,
the more a link is used, the lower its cost, and vice versa.
Critical Path Analysis
The Critical Path Analysis is a heuristic analysis of the utility of each link
in a network. The aim is to produce a hit rate table, which is a collection of
the links used in the network and the estimated frequency of their usage. The
hit rate table is created in the following way:
1. For each pair of processors i and j, find the set of all paths between
them which have the minimum number of intermediate processors; call
this set Sij.
2. For each link between two processors, count the number of paths in all
Sij that contain this link. We collect these counts into a count table.
3. Create a hit rate table in roughly the same way as the count table,
except that only one path is selected from each Sij, as follows:
(a) First, order the sets Sij in ascending order of their cardinality.
(b) Process one Sij at a time according to this ordering. If Sij has only
one path, then all links used by this path have their counts
in the hit rate table incremented by one. If more than one path
exists, then choose the path which contains the link (among all the
links in Sij) that has the greatest value in the count table. Update
the hit rate table accordingly, using the selected path.
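The steps above can be sketched as follows. This Python reconstruction is ours (the thesis's implementation is in ANSI C, and all function names here are hypothetical); it also includes the link-cost rule from section 6.4.2, where the cost of link i is the highest hit rate in the table minus link i's hit rate:

```python
from collections import deque, defaultdict
from itertools import combinations

def all_shortest_paths(adj, src, dst):
    """Enumerate every shortest path between src and dst (BFS + backtrack)."""
    dist, parents = {src: 0}, defaultdict(list)
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
            if dist[v] == dist[u] + 1:
                parents[v].append(u)
    def unwind(v):
        if v == src:
            yield [src]
        for u in parents[v]:
            for path in unwind(u):
                yield path + [v]
    return list(unwind(dst)) if dst in dist else []

def link(u, v):
    return (min(u, v), max(u, v))   # links are bi-directional

def hit_rate_table(adj):
    """Steps 1-3 of the CPA: build the count table, then pick one path per pair."""
    sets = {pair: all_shortest_paths(adj, *pair)
            for pair in combinations(sorted(adj), 2)}
    count = defaultdict(int)
    for paths in sets.values():                 # step 2: count table
        for path in paths:
            for u, v in zip(path, path[1:]):
                count[link(u, v)] += 1
    hits = defaultdict(int)
    for pair in sorted(sets, key=lambda p: len(sets[p])):  # step 3(a)
        paths = sets[pair]
        if not paths:
            continue
        # step 3(b): prefer the path whose best link has the highest count
        best = max(paths, key=lambda p: max(count[link(u, v)]
                                            for u, v in zip(p, p[1:])))
        for u, v in zip(best, best[1:]):
            hits[link(u, v)] += 1
    return hits

def link_costs(hits):
    """Cost of link i = highest hit rate in the table minus i's hit rate."""
    top = max(hits.values())
    return {lnk: top - h for lnk, h in hits.items()}
```

On a simple three-node chain 0-1-2, every pair has a unique shortest path, so both links end up with a hit rate of 2 and a cost of 0.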
6.4.3 Transforming the Representation
As mentioned in section 6.3.1, the candidate solutions that GGA synthesizes
are represented as regular networks. To convert from a regular network to
one which meets the specification (in section 6.2), one needs only to cut a
link. By snipping away a link, we obtain two communication channels
ready to be linked to the system controller.
Choosing the link is an important task, because after a link is removed,
the distance between certain pairs of processors could increase. The
goal is to cut a link that does not substantially increase davg. This module
uses the information provided by the CPA method to guide its decision. We
look for the least popular link in the hit rate table and remove it from the
candidate solution.
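Assuming a hit rate table is available from the CPA, the transformation amounts to removing the link with the lowest hit rate; a hypothetical helper of our own:

```python
def cut_least_popular_link(adj, hits):
    """Remove the link with the lowest hit rate, freeing two channels
    for the system controller (section 6.2). `adj` maps nodes to
    neighbour sets; `hits` maps (u, v) links to their hit rates."""
    u, v = min(hits, key=hits.get)   # least popular link
    adj[u].discard(v)
    adj[v].discard(u)
    return (u, v)

# On a triangle where link (0, 2) is the least used, it is the one cut.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
cut = cut_least_popular_link(adj, {(0, 1): 3, (0, 2): 1, (1, 2): 2})
```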
6.5 Benchmark
There are two sets of benchmarks in this chapter. Benchmark One is based on
Warwick's work on configuring transputers [26, 31], which is an
extension of [52]. The test suite requires an algorithm to return configurations
with a dmax value of 3, where processor sizes range from 32 to the theoretical
upper bound of 40 (per Eq. 6.3). Benchmark Two follows the work
of Sharpe et al. [86] and involves measuring performance for processor
sizes from 8 to 128. This involves calculating the davg of the configurations
generated by the participating algorithms. We used Benchmark One to ob-
serve the effects on GGA of small problem-size increments, and Benchmark
Two to test how scalable it is.
6.5.1 Test Environment
GGA is written in ANSI C for portability, and compiled using GNU GCC
on the Sun 3/50. All results in this chapter were benchmarked on a
Sun 3/50 with SunOS 4.1.
Under GGA's environment, mutation swaps the contents of one pair of
genes for each chromosome in the offspring pool. For the trials, we set a
population size of 50, and a stopping criterion of 200 generations. For the
fitness function h, λ has a value of 10. PGLS maintains 50 candidate
solutions concurrently, with each candidate solution having a limit of 200
iterations. PGLS also uses a λ value of 10.
6.5.2 Benchmark One
Benchmark One is used to see what effects small increments in the PCP size
have on GGA, in terms of solution quality and processing time. Solution
quality is evaluated by measuring a solution's davg and Nover
indices. Processing time is measured in CPU seconds, as the time
taken to arrive at a solution.
The four competing algorithms in this benchmark are AMP [52], GAcSP
[26], LPGA [48, 49] and PGLS (section 2.7.2 on page 44). AMP is an algorithm
that uses depth-first search with hill-climbing. Note that for AMP, no run-
time results were available, and only results for 32- and 40-processor configura-
tions were reported. GAcSP is based on the standard GA, with the addition
of a repairer and an optional hill climber.
In the charts and tables that follow, GAcSP(HC) denotes GAcSP operating
with the optional hill climber active, and LPGA(CPA) denotes LPGA with
second-phase optimization. The averages of the published results for GAcSP
and GAcSP(HC) were computed from five runs of each, while the means of the
results for LPGA, LPGA(CPA), PGLS and GGA were derived from 50 runs
of each.
Results: Comparing Quality via davg
The davg comparison chart shows that GGA, LPGA and GAcSP(HC) have
very close results. GGA edged out the other two algorithms by consistently
producing the lowest results. See Table 6.4 and Figs. 6.7 and 6.8.
Results: Comparing Quality via Nover
Another indicator of a solution's quality is the number of paths that exceed
the distance dmax. Any processor-to-processor pair whose optimal path dis-
tance runs through more than three processors is considered to have "overshot".
So besides trying to minimize davg, we should attempt to minimize Nover.
Table 6.4: davg produced by GGA and other algorithms (dmax = 3).

Best davg
  Procs   AMP     GAcSP   GAcSP(HC)  LPGA    LPGA(CPA)  PGLS    GGA
  32      2.31    2.33    2.29       2.30    2.29       2.30    2.29
  33      -       2.37    2.32       2.33    2.31       2.33    2.30
  34      -       2.40    2.34       2.34    2.34       2.34    2.31
  35      -       2.42    2.37       2.37    2.35       2.37    2.32
  36      -       2.42    2.38       2.42    2.39       2.39    2.33
  37      -       2.45    2.41       2.43    2.41       2.42    2.34
  38      -       2.47    2.43       2.46    2.42       2.44    2.36
  39      -       2.51    2.45       2.48    2.43       2.47    2.38
  40      2.53    2.51    2.47       2.49    2.43       2.50    2.41

Average davg
  Procs   AMP     GAcSP   GAcSP(HC)  LPGA    LPGA(CPA)  PGLS    GGA
  32      -       2.344   2.344      2.312   2.302      2.305   2.300
  33      -       2.380   2.330      2.336   2.322      2.332   2.304
  34      -       2.400   2.400      2.352   2.348      2.356   2.314
  35      -       2.422   2.422      2.384   2.370      2.385   2.328
  36      -       2.446   2.446      2.428   2.406      2.401   2.342
  37      -       2.470   2.470      2.442   2.420      2.436   2.356
  38      -       2.488   2.488      2.472   2.438      2.462   2.378
  39      -       2.516   2.516      2.494   2.448      2.494   2.396
  40      -       2.524   2.524      2.506   2.464      2.511   2.412
Figure 6.7: Comparing best davg produced by GGA and other algorithms.
Figure 6.8: Comparing average davg produced by GGA and other algorithms.
GGA returned results that were generally better than any of the other
algorithms. See Table 6.5 and Figs. 6.9 and 6.10. Together with the results from
the previous section (comparing davg), we can see that GGA was
more successful than the others at finding solutions that optimize both davg and
Nover.
Results: Comparing Speed
For reference, the run times of the compared algorithms are reported in Table 6.6
and Figs. 6.11 and 6.12. In terms of run times, GGA is slower than either
of the LPGA implementations, but faster than GAcSP(HC). GGA loses out
to LPGA because LPGA is a much more specialized algorithm for the PCP.
The more general GAcSP(HC) is about two to four times slower than GGA.
It is also important to note from Figs. 6.11 and 6.12 that processing time
Table 6.5: Nover produced by GGA and other algorithms (dmax = 3).

Best Nover
  Procs   AMP   GAcSP  GAcSP(HC)  LPGA   LPGA(CPA)  PGLS   GGA
  32      -     12     1          10     1          1      0
  33      -     15     4          10     3          4      3
  34      -     24     3          12     4          6      3
  35      -     33     11         19     9          12     8
  36      -     31     8          25     14         18     13
  37      -     47     18         30     16         24     16
  38      -     49     19         34     22         29     20
  39      -     65     25         44     28         35     25
  40      -     62     34         50     29         42     27

Average Nover
  Procs   AMP   GAcSP  GAcSP(HC)  LPGA   LPGA(CPA)  PGLS   GGA
  32      -     -      -          11.3   1.6        1.4    0.2
  33      -     -      -          12.6   4.4        4.6    3.2
  34      -     -      -          14.9   5.4        6.9    3.4
  35      -     -      -          22.4   9.8        13.1   8.5
  36      -     -      -          26.1   15.1       18.8   13.4
  37      -     -      -          32.4   17.0       25.3   16.3
  38      -     -      -          37.5   24.5       30.8   20.8
  39      -     -      -          49.2   31.3       37.7   25.6
  40      -     -      -          58.2   33.2       44.2   28.2
Figure 6.9: Comparing best Nover produced by GGA and other algorithms.
Figure 6.10: Comparing average Nover produced by various algorithms.
grows linearly and gradually with the number of processors in the problem.
This allows GGA to scale gracefully as the problem sizes increase beyond 40.
6.5.3 Benchmark Two
Having seen how GGA behaves under small problem-size increments, we now
test how scalable GGA is, by exposing it to a set of problems
with more processors, where the problem size grows by (approximately) a
power of two. In this benchmark, an algorithm's effectiveness is measured
using the davg index5.
The three main algorithms in this benchmark are AMP(DFS) [52], AMP(GA) [86]
and our GGA. AMP(DFS) is a heuristic depth-first search implemented in
PROLOG. AMP(GA) is a GA with heuristic repair, and is the only GA in
5The davg index is the only published information available for some algorithms.
Table 6.6: Execution time (in CPU seconds).

Best Time
  Procs   AMP   GAcSP  GAcSP(HC)  LPGA   LPGA(CPA)  PGLS   GGA
  32      -     1160   4592       813    1652       1212   2316
  33      -     1101   4784       852    1738       1341   2455
  34      -     1018   5979       875    1903       1455   2563
  35      -     1656   6046       908    2099       1523   2670
  36      -     2095   10135      942    2159       1677   2781
  37      -     2488   7682       995    2214       1823   2891
  38      -     5032   10304      1033   2271       2091   3004
  39      -     3240   11925      1087   2356       2235   3112
  40      -     5432   14908      1131   2420       2542   3231

Average Time
  Procs   AMP   GAcSP  GAcSP(HC)  LPGA   LPGA(CPA)  PGLS   GGA
  32      -     1290   8098       858    1731       1288   2355
  33      -     1450   8784       902    1830       1393   2474
  34      -     1701   8724       937    2042       1501   2592
  35      -     2401   10528      971    2213       1687   2691
  36      -     2328   12050      1015   2286       1832   2809
  37      -     2967   11463      1079   2351       2141   3012
  38      -     3810   17141      1121   2408       2245   3144
  39      -     4408   13314      1196   2509       2457   3200
  40      -     5143   18542      1228   2644       2893   3253
Figure 6.11: Comparing best execution time.
Figure 6.12: Comparing average execution time.
this benchmark to use a binary representation. We also include results for
LPGA(CPA) and GAcSP(HC) from Lau and Tsang [49], and Warwick and Tsang [31]
respectively. LPGA(CPA) is the forefather of GGA, and was also a mutation-
based GA using the Critical Path Analysis heuristic and the same solu-
tion representation. GAcSP(HC) is a GA with heuristic repair that provides
candidate solutions for its Tabu Search routine to improve on. Results for
LPGA(CPA) and GAcSP(HC) are limited to configurations of 32 and 40 pro-
cessors, since they belong to a different set of benchmarks [26] whose
emphasis is to compare generated configurations of every size between 32
and 40 processors.
The dmax (used in the fitness function) for each problem is calculated using
nmax (Eq. 6.3), and these values are also reported in Table 6.7. The run time for
GGA ranged from two seconds for configurations with eight processors to
45 minutes for configurations of 128 processors.
Results
Results from GGA are averages over 50 runs for each processor size. These
results are tabulated in Table 6.7. We have also provided the davg appraisal
of popular configurations: Hypercube, Torus, Ternary Tree and Ring. The
results are also presented visually as charts in Figs. 6.13 and 6.14. Fig. 6.13
compares the results of GGA with the popular configurations, and Fig. 6.14
compares the results of GGA with the other algorithms in this benchmark. In
both charts, GGA consistently held the lowest curve, indicating robustness
in providing the best results.
Table 6.7: Comparing davg, the average interprocessor distances.
No. of processors    8      13     16     23     32     40     53     64     128
dmax values          2      2      3      3      3      4      4      5      6
GGA                  1.28   1.55   1.66   2.02   2.29   2.41   2.52   2.63   3.18
PGLS                 1.28   1.55   1.68   2.04   2.30   2.50   2.61   2.83   3.32
LPGA(CPA)            -      -      -      -      2.29   2.43   -      -      -
AMP(GA)              1.28   1.55   1.66   2.02   2.29   2.46   2.58   2.79   3.41
GAcSP(HC)            -      -      -      -      2.29   2.47   -      -      -
AMP(DFS)             1.28   1.55   1.73   2.05   2.31   2.53   2.76   2.92   3.58
Hypercube            1.50   -      2.00   -      2.50   -      -      3.00   3.50
Torus                1.50   -      2.00   -      3.00   3.50   -      4.00   5.64
Ternary Tree         1.97   2.56   2.91   3.39   3.93   4.25   4.77   5.01   6.25
Ring                 2.00   3.23   4.00   5.74   8.00   10.00  13.24  16.00  32.00
A '-' indicates that result was not available.
6.6 Summary
Results from the two benchmarks combined show GGA to be consistent in
providing good solutions. GGA gave the best solutions for all problems in the
Figure 6.13: Comparing GGA against popular con�gurations.
two benchmarks. It even performed better than LPGA, its predecessor; GGA
is a generalized version of LPGA. In Benchmark One, GGA competed against
LPGA using the same set of strategies, i.e. the same representation and penalization
strategy, relying on mutation only. We can conclude that, despite
its general architecture, GGA has improved over LPGA through the use of
fitness templates and the incorporation of GLS.
Uniquely in this application, GGA worked without its cross-over operator.
GGA sans cross-over closely resembles its cousin, PGLS. However, GGA
still managed to perform better than PGLS. Observing the differences be-
tween the two algorithms, we can again attribute GGA's advantage to the
use of the fitness templates, and to its implementation of GLS. As discussed
before, GGA's implementation of GLS differs from the original: there is only
one set of features for all chromosomes in the population, and GGA only
penalizes feature(s) of the current best chromosome (in the population) at
Figure 6.14: Comparing GGA against other algorithms.
points of penalization. For the PCP, this could be a means of alleviating the
effects of over-penalization.
Chapter 7
Conclusion
7.1 Completed Tasks
7.1.1 Designing the GA
In section 1.6 (page 20), we described the requirements of the GA that we
wanted to create. GGA managed to satisfy these requirements:
- Representation: GGA and its genetic operators use a real-coded repre-
sentation. With this representation scheme, GGA was able to easily
represent any discrete binary CP.
- Constraints: Each constraint in a CP is a feature in GGA, and it
belongs to the feature set. Writing constraints into GGA is easy, taking
the diverse RLFAP as our testimony. The genetic operators in GGA
have also been designed to be aware of the relationships between genes
that a feature imposes.
- Fitness function: GGA supports multi-objective optimization1 through
GLS. In multi-objective optimization, we do not know the magnitude
of each component in the fitness function. Through GLS, GGA learns
from its treks in the search space, modifying the fitness function as it
progresses.
7.1.2 Test Suite Results
We tested GGA against four diverse problems. A brief description of
these problems can be found in section 1.7 (page 22). Below, we summa-
rize the results from these tests:
• Royal Road functions (chapter 3, page 47): GGA was the most efficient
algorithm, coming in first on all functions of the RRf. For the functions
R1 and R2, GGA was 11 times more efficient than the canonical GA,
and 10 times more efficient than GLS. On the R4 function, GGA was
the only algorithm to reach the optimum solution in reasonable time.
The RRf creates a balanced set of features, neither tight nor loose.
• Generalized Assignment Problem (chapter 4, page 61): GGA was the
most robust algorithm in this problem set. GGA managed to find the
optimum solutions 592 times out of 600 runs². Compare this to PGLS,
which found the optimum solutions 571 times out of 600 runs, followed
by GA-HEU, which found them 553 times out of 600.
¹For example, in the CSOP instances of the RLFAP, the objective function includes minimizing the number of violated constraints in the solution while optimizing its quality.
²10 runs per problem, 60 problems in the GAP.
We also conducted an experiment to discover GGA's behaviour towards
feature sets of different tightness. When a feature set is defined
too tightly, GGA usually has little advantage over PGLS. Otherwise,
GGA shows a significant advantage over PGLS, be it in
robustness or effectiveness.
• Radio Link Frequency Assignment Problem (chapter 5, page 83): The
RLFAP is composed of CSOP and PCSP types of problems, with a
similar mix of feature sets (from loose to very tight). On the CSOP
instances, GGA found the optimum solution in five out of six cases, ranking it
third. In terms of robustness, GGA always found the optimum solution
in five of the six instances. For the PCSP instances in the RLFAP,
GGA's results ranked third, with good robustness. Overall,
GGA was the best performing algorithm.
• Processor Configuration Problem (chapter 6, page 98): The PCP consists of
two benchmarks, both of which have tight features. Benchmark
One tests the effect of small problem-size increments on an algorithm,
while Benchmark Two tests how scalable it is. In terms of optimality,
solutions returned by GGA on both benchmarks were better than
those of any other algorithm (including published results). Ranking next are
LPGACPA for Benchmark One, and PGLS for Benchmark Two. In
terms of robustness, GGA was the most robust, followed by LPGACPA
and PGLS respectively.
7.2 Contributions
7.2.1 Robustness
For all problems in the test suite, GGA demonstrated a high level of robustness.
In the RRf (page 47), GGA was the only algorithm that could always
return the optimum solution within the given limits.
Results from the GAP (page 61) showed that GGA had the most robust
performance on that problem. It always returned the optimum solution
for all problems³ except two. On the remaining two instances, GGA found
the optimum solution five out of ten times; where GGA was unable to find
the optimum, its solution was only one step away from the optimum.
For the CSOP instances in the RLFAP (page 83), GGA always found
the optimum solution for five of the six instances. The standard deviation for the
remaining instance was very small. The standard deviation for the PCSP
instances also showed GGA to be quite consistent in the quality of the solutions
it provided, compared to GLS or PGLS.
The standard deviations for all instances in Benchmark One of the PCP (page
98) showed GGA to have the lowest of all participating stochastic algorithms.
Testing the robustness of participating algorithms was limited to Benchmark
One, since those were the only published results available.
³There are 60 problems in the GAP.

The less robust performance of GA and GLS (compared to GGA) on problems
in the test suite indicates that GGA's advantage is not due to mechanisms
from the two respective algorithms alone. GGA's success is an amalgamation
of GA's population paradigm, the GLS penalty algorithm, fitness templates,
and the specialized genetic operators. Overall, GGA offers a level of robustness
unseen in any other stochastic algorithm on these problems.
7.2.2 Effectiveness
Our definition of effectiveness covers both the quality of solutions and the efficiency of
the algorithm. In the RRf (page 47), GGA came in first on all functions of the
RRf. For the functions R1 and R2, GGA was 11 times more efficient than
the canonical GA, and 10 times more efficient than GLS. On the R4 function,
GGA was the only algorithm to reach the optimum solution in reasonable
time.
GAP (page 61) results showed that GGA found the optimum solution to
all 60 problems in the GAP.
GGA also managed to find the optimum solution for five of the six CSOP
instances in the RLFAP (page 83). For the one remaining CSOP instance,
GGA found solutions that were one step from the optimum. As for the PCSP
instances, GGA found solutions that were very close to the published optima.
In the PCP (page 98), GGA found results that were better than existing
published results.
In terms of quality, GGA offers very competent performance. At the
minimum, GGA always found a solution very near the known optimum.
For some instances, GGA found better solutions than published results, such
as in the RRf and PCP. This success is, again, the combined effort of GA's
population dynamics, the GLS penalty algorithm, fitness templates, and the
specialized genetic operators.
The effects of the fitness templates, coupled with the specialized genetic
operators, can be clearly appreciated in the RRf. In this problem, with
the help of the fitness templates, GGA knew what to cross over and where
to mutate. This is the reason GGA managed to find the
optimum solution with so few calls to the fitness function.
7.2.3 Alleviating Epistasis
Epistasis is caused by the GA being ignorant of relationships between genes,
which is problematic when trying to solve CPs. GGA offers the concept
of fitness templates, plus specialized genetic operators (mutation and cross-over)
that use these templates. Templates record the effects of features
(i.e. constraint violation or satisfaction) in solutions while GGA's genetic
operators work on them.
A weight (in the fitness template) indicates the probability that a gene in
a chromosome will change. Modifying the content of a gene affects
the weights of related genes (related through features), making them more
(or less) susceptible to changes by the genetic operators. In other words,
the fitness templates, together with the genetic operators, offer a form of
probabilistic constraint propagation. We say probabilistic because all of
GGA's genetic operators are randomly biased by the weights of the genes.
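The weight-biased gene selection described above can be sketched as a roulette-wheel choice over the fitness template. The roulette-wheel mechanism and the uniform fallback for an all-zero template are our illustrative reading, not necessarily GGA's exact scheme.

```python
import random

def pick_gene(weights, rng=random):
    """Choose a gene index with probability proportional to its weight:
    a higher weight marks a more undesirable instantiation, so that gene
    is more likely to be selected for change by a genetic operator."""
    total = sum(weights)
    if total == 0:
        # No penalised genes yet: fall back to a uniform choice.
        return rng.randrange(len(weights))
    r = rng.uniform(0, total)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

# Gene 1 is most implicated in penalised features, so it is the most
# likely target for mutation.
weights = [0, 5, 0, 1]
```

Because weights are updated whenever related genes change, repeated calls to such a selector naturally concentrate operator effort on the genes currently involved in violated features.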
Our test suite clearly showed that GGA performed significantly better
than the GAs in this thesis. Further, GGA also had better results (more robust,
or better solutions) than PGLS, indicating that GGA adds value on top of
GLS; i.e. GGA's better performance over the canonical GA is not due to
GLS alone. This performance advantage was achieved through the combined
work of the fitness templates and specialized genetic operators, which were
designed to reduce the effect of epistasis.
7.2.4 General Framework
GGA easily adapted to the different problems in the test suite simply through
coding the respective feature set for each problem and, where necessary,
modification of GGA's genetic operators (e.g. the Processor Configuration
Problem (PCP) in chapter 6 on page 98). All this was done without alteration
to GGA's core code.
In all, it took us less than an hour each to adapt and test GGA for the
Royal Road functions (RRf) (chapter 3 on page 47), the Generalised Assignment
Problem (GAP) (chapter 4 on page 61) and the Processor Configuration
Problem (chapter 6 on page 98). The Radio Link Frequency Assignment
Problem (RLFAP) (chapter 5 on page 83) took about three hours because it
has four different types of problems.
More importantly, GGA's open framework means that it need not depend
on GLS as its penalty operator. As long as a penalty algorithm can provide
the information needed to build fitness templates, it should integrate well.
7.3 Identifying GGA's Niche
Summarizing the observations from the previous section, we can say the
following:
Use GGA when robustness is absolutely necessary. Our applications have
shown GGA to be very robust. This is probably because GGA maintains a
population of candidate solutions, and also because GGA is more selective in
penalizing features; i.e. when GGA is trapped, it only penalizes the feature(s)
exhibited by the fittest chromosome in the population, thus avoiding unnecessary
penalizations.
Also, use GGA when the feature set is very loose. By having a population
of solutions, and through its cross-over operator, GGA can still
make progress when the search becomes unguided.
Do not use GGA when you know that the feature set is very tight and
computation time is a major concern. Our tests have shown that GGA
usually takes much longer than GLS to find a satisfactory solution when
the feature set is very tight. In such situations, it would be wiser to use GLS.
7.4 Future Work
A key area where GGA can be improved is the way it detects local optima.
At present, GGA concludes it is trapped in a local optimum when
the fitness of the best chromosome in the population remains unchanged for
a number of generations. This approach carries no guarantee, since we do not
know whether all genes have been attempted; there may exist genes in the
chromosome that could lead to better fitness.
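The stagnation criterion just described can be sketched as follows. The class and the `patience` parameter name are our illustration of the scheme, not GGA's actual code; we assume a minimisation setting.

```python
class StagnationDetector:
    """Declares a local optimum when the best fitness in the population
    has not improved for `patience` consecutive generations, which is
    the detection scheme GGA currently uses."""

    def __init__(self, patience):
        self.patience = patience
        self.best = None
        self.stale = 0

    def update(self, best_fitness):
        # Assumes lower fitness is better (minimisation).
        if self.best is None or best_fitness < self.best:
            self.best = best_fitness
            self.stale = 0
        else:
            self.stale += 1
        # True signals that the penalty operator should be invoked.
        return self.stale >= self.patience

detector = StagnationDetector(patience=3)
```

The weakness noted above is visible here: the detector looks only at one scalar per generation, so it cannot tell whether the plateau reflects a genuine local optimum or simply genes that the operators have not yet tried.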
It is impossible for GGA (or any GA) to monitor which genes have been
attempted. Chromosomes in the population are transient; they are
created and destroyed from generation to generation, so we would lose track
of which genes were tested. Nor would it be possible to keep an individual set
of penalty counters for each chromosome, since this would make the fitness
template meaningless for the cross-over operator: the cross-over operator
chooses a gene for the child from the parent chromosomes by comparing their
weights, and the weights would be incomparable if they were built from different
sets of penalty counters⁴.
The challenge is to create a more robust local optimum trap detection
scheme. A way forward would be to detect the trap based on the population
as a whole, and not on an individual representative.
⁴Chromosomes at different points of the search space would be penalized for different features. The penalty counter of the same feature for different chromosomes would then carry different degrees of penalization.
Appendix A

Terms, Variables and Concepts of GGA
Table A.1: Components of GGA

Cross-over operator: Uses the fitness templates of two parent chromosomes to decide each parent's contribution of genetic material towards creating a child chromosome.
Mutation operator: The fitness template of a chromosome is used to guide the alteration of the chromosome's genetic content.
Penalty operator: Detects and selects undesirable solution features in a chromosome to penalize. Penalization involves incrementing the penalty counters of the associated features.
Local optimum detector: Detects whether the search is trapped in a local optimum. If it is, the penalty operator is called.
Distribute Penalty: If a solution feature is present in a chromosome, the penalty counter associated with this feature is added onto the weights of the genes that are constituents of this feature.
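The penalty operator and the Distribute Penalty step in Table A.1 can be sketched together. The dictionary-based feature records are our assumption, and the cost/(1 + penalty) utility rule for choosing which feature to penalize is GLS's standard selection criterion, which we assume GGA inherits.

```python
def penalize(chromosome, features):
    """Penalty operator sketch: among features present in the chromosome,
    increment the penalty counter(s) of those with maximal utility
    cost / (1 + penalty), the usual GLS selection rule."""
    present = [f for f in features if f["indicator"](chromosome)]
    if not present:
        return
    best = max(f["cost"] / (1 + f["penalty"]) for f in present)
    for f in present:
        if f["cost"] / (1 + f["penalty"]) == best:
            f["penalty"] += 1

def distribute_penalties(template, chromosome, features):
    """Distribute Penalty sketch: add each present feature's counter onto
    the weights of the genes that are constituents of that feature."""
    for f in features:
        if f["indicator"](chromosome):
            for g in f["genes"]:
                template[g] += f["penalty"]
```

Incrementing a counter and then redistributing it onto gene weights is what links the GLS side of GGA (penalties on features) to the GA side (weights biasing the genetic operators).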
Table A.2: Data Types of GGA

Weight: Each gene has one weight. The weight is a measure of the undesirability of the gene's current instantiation, compared to the rest of the chromosome.
Fitness template: A fitness template is a collection of weights. Each chromosome has one fitness template.
Solution feature: Solution features are domain specific and user defined. A feature is exhibited by a set of variable assignments that describes a non-trivial property of a problem.
Penalty counter: Each feature has one penalty counter. A penalty counter keeps count of the number of times its related solution feature has been penalized since the start of the search.
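Using the weight and fitness-template data types above, the cross-over operator of Table A.1 can be sketched position by position. The "copy the gene whose weight is lower" rule and the tie-break toward the first parent are our illustrative reading of how the templates decide each parent's contribution.

```python
def template_crossover(parent1, template1, parent2, template2):
    """Cross-over sketch: for each gene position, copy the gene from the
    parent whose fitness-template weight is lower there, i.e. whose
    current value is judged less undesirable. Ties go to parent 1."""
    return [g1 if w1 <= w2 else g2
            for g1, w1, g2, w2 in zip(parent1, template1,
                                      parent2, template2)]

# Position 0: parent 1 wins (0 <= 2); positions 1 and 2: parent 2 wins.
child = template_crossover([0, 1, 0], [0, 3, 1], [1, 1, 1], [2, 0, 0])
```

Because both templates are built from the single shared set of penalty counters, the weights being compared here are commensurable, which is exactly why per-chromosome counters were ruled out in section 7.4.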
Table A.3: Inputs and Parameters to GGA

Solution feature: See Table A.2.
Cost: Each solution feature has a cost to rate the undesirability of its presence.
Indicator function: Each solution feature has a user-defined indicator function that tests for the feature's presence in a chromosome.
Objective function: A function that maps each solution to a numerical value.
Regularization parameter: A parameter that determines the proportion of contribution that penalties have in an augmented fitness function.
Augmented fitness function g: A function that is the sum of the objective function on a chromosome and the penalties of features that exist in it.
Mutation rate: A fraction that defines the number of genes in the chromosome to mutate.
Cross-over rate: The probability that cross-over will occur between two chromosomes.
Appendix B
Publications
B.1 Journal Papers
1. T. L. Lau and E. P. K. Tsang, "Solving the processor configuration problem with a mutation-based genetic algorithm," International Journal on Artificial Intelligence Tools, vol. 6, no. 4, pp. 567–585, 1997.
2. T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the processor configuration problem," IEEE Expert (Intelligent Systems and Their Applications), 1998.
3. T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the generalized assignment problem," submitted to Computers and Operations Research, 1998.
4. T. L. Lau and E. P. K. Tsang, "Solving the radio link frequency assignment problem with the guided genetic algorithm," submitted to Constraints, 1998.
B.2 Conference Papers
1. T. L. Lau and E. P. K. Tsang, "Applying a mutation-based genetic algorithm to the processor configuration problem," in Proceedings of IEEE 8th International Conference on Tools with Artificial Intelligence, pp. 17–24, 1996.
2. T. L. Lau and E. P. K. Tsang, "Solving the generalized assignment problem with the guided genetic algorithm," in Proceedings of 10th IEEE International Conference on Tools with Artificial Intelligence, 1998.
3. T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the large processor configuration problem," in Proceedings of 10th IEEE International Conference on Tools with Artificial Intelligence, 1998.
Bibliography
[1] E. P. K. Tsang, Foundations of Constraint Satisfaction. Academic Press Limited, 1993.
[2] P. Meseguer, "Constraint satisfaction problems: An overview," AI Communications, vol. 2, pp. 3–17, Mar. 1989.
[3] Kumar, "Algorithms for constraint satisfaction problems: A survey," AI Magazine, vol. 13, no. 1, pp. 32–44, 1992.
[4] E. Freuder, R. Dechter, B. Selman, M. Ginsberg, and E. Tsang, "Systematic versus stochastic constraint satisfaction," in Proceedings of 14th International Joint Conference on Artificial Intelligence, 1995.
[5] S. Minton, M. Johnston, A. Philips, and P. Laird, "Minimizing conflicts: A heuristic repair method for constraint satisfaction and scheduling problems," Artificial Intelligence, vol. 58, pp. 161–205, 1992.
[6] B. Selman, H. Levesque, and D. Mitchell, "A new method for solving hard satisfiability problems," in Proceedings of 10th National Conference on Artificial Intelligence, pp. 440–446, 1992.
[7] P. Langley, "Systematic and non-systematic search strategies," in Artificial Intelligence Planning Systems: Proceedings of the First International Conference, pp. 145–152, 1992.
[8] E. Freuder and A. Mackworth, Constraint-based Reasoning. MIT Press, 1994.
[9] E. Freuder and R. Wallace, "Partial constraint satisfaction," Artificial Intelligence, vol. 58, pp. 21–70, 1992.
[10] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671–680, 1983.
[11] S. Kirkpatrick, "Optimization by simulated annealing: quantitative studies," Statistical Physics, vol. 34, pp. 975–986, 1984.
[12] V. Cerny, "A thermodynamical approach to the traveling salesman problem: an efficient simulation algorithm," Optimization Theory and Applications, vol. 45, pp. 41–51, 1985.
[13] F. Glover, "Future paths for integer programming and links to artificial intelligence," Computers and Operations Research, vol. 13, pp. 533–549, 1986.
[14] F. Glover, "Tabu search: Part I," ORSA Journal on Computing, vol. 1, pp. 190–206, 1989.
[15] F. Glover, "Tabu search: Part II," ORSA Journal on Computing, vol. 2, pp. 4–32, 1990.
[16] T. Kohonen, Self-organization and Associative Memory. Springer, 1984.
[17] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Volume 1: Foundations. Cambridge, Mass.: MIT Press, 1st ed., 1986.
[18] J. L. McClelland and D. E. Rumelhart, Parallel Distributed Processing: Explorations in the Microstructures of Cognition, Volume 2: Psychological and Biological Models. Cambridge, Mass.: MIT Press, 1st ed., 1986.
[19] J. Holland, Adaptation in Natural and Artificial Systems. The University of Michigan Press, 1975.
[20] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Pub. Co., Inc., 1989.
[21] L. Davis, Handbook of Genetic Algorithms. Van Nostrand Reinhold, 1991.
[22] J. Holland, "Some practical aspects of adaptive systems theory," Electronic Information Handling, pp. 209–217, 1965.
[23] A. Bethke, "Genetic algorithms as function optimizers," Tech. Rep. 197, Logic of Computers Group, University of Michigan, USA, 1978.
[24] G. D. Smith, A. Kapsalis, V. J. Rayward-Smith, and A. Kolen, "Radio link frequency assignment problem report 2.1 - implementation and testing of genetic algorithm approaches," tech. rep., School of Information Science, University of East Anglia, UK, 1995.
[25] M. C. R. Rojas, "From quasi-solutions to solutions: An evolutionary algorithm to solve CSP," in Proceedings of 2nd International Conference on Principles and Practice of Constraint Programming, 1996.
[26] T. Warwick, A GA approach to constraint satisfaction problems. PhD thesis, Department of Computer Science, University of Essex, UK, 1995.
[27] P. Chu, Genetic Algorithms for Combinatorial Optimization Problems. PhD thesis, The Management School, Imperial College, University of London, UK, 1997.
[28] A. Eiben, P.-E. Raué, and Z. Ruttkay, "Heuristic genetic algorithms for constrained problems," in Proceedings of Dutch National AI Conference NAIC '93, pp. 241–252, 1993.
[29] A. Eiben, P.-E. Raué, and Z. Ruttkay, "Solving constraint satisfaction problems using genetic algorithms," in Proceedings of 1st IEEE World Conference on Computational Intelligence, pp. 543–547, 1994.
[30] A. Eiben, P.-E. Raué, and Z. Ruttkay, "GA-easy and GA-hard constraint satisfaction problems," in Proceedings of the ECAI-94 Workshop on Constraint Processing, pp. 267–284, 1995.
[31] T. Warwick and E. P. K. Tsang, "Using a genetic algorithm to tackle the processor configuration problem," in Proceedings of Symposium on Applied Computing, pp. 217–221, 1994.
[32] T. Warwick and E. P. K. Tsang, "Tackling car sequencing problems using a generic genetic algorithm," Evolutionary Computation, vol. 3, no. 3, pp. 267–298, 1995.
[33] J. Paredis, "Co-evolutionary constraint satisfaction," in Proceedings of the 3rd Conference on Parallel Problem Solving from Nature, pp. 46–55, 1994.
[34] J. Schaffer, Some experiments in machine learning using vector evaluated genetic algorithms. PhD thesis, Vanderbilt University, Nashville, USA, 1984.
[35] J. Grefenstette, "Genesis: A system for using genetic search procedures," in Proceedings of Conference on Intelligent Systems and Machines, pp. 161–165, 1984.
[36] J. Grefenstette, "A user's guide to Genesis," Tech. Rep. CS-84-11, Vanderbilt University, Nashville, USA, 1984.
[37] J. Richardson, M. Palmer, G. Liepins, and M. Hillard, "Some guidelines for genetic algorithms with penalty functions," in Proceedings of 3rd International Conference on Genetic Algorithms, pp. 191–197, 1989.
[38] M. C. R. Rojas, "Improving fitness function for CSP in an evolutionary algorithm," tech. rep., Rapport de Recherche CERMICS, 1995.
[39] M. C. R. Rojas, "Using the knowledge of the constraints network to design an evolutionary algorithm that solves CSP," in Proceedings of ICEC-96, 1996.
[40] J. Bowen and D. Gerry, "Solving constraint satisfaction problems using a genetic/systematic search hybrid that realizes when to quit," in Proceedings of 6th International Conference on Genetic Algorithms, pp. 122–129, 1995.
[41] L. Davis, "Genetic algorithms and simulated annealing," Research Notes in AI, 1987.
[42] Z. Michalewicz and C. Janikow, "Handling constraints in genetic algorithms," in Proceedings of 4th Conference on Genetic Algorithms, pp. 151–157, 1991.
[43] Z. Michalewicz and N. Attia, "In evolutionary optimization of constrained problems," in Proceedings of 3rd Annual Conference on Evolutionary Programming, pp. 98–108, 1994.
[44] Z. Michalewicz, "Genetic algorithms, numerical optimization, and constraints," in Proceedings of 6th International Conference on Genetic Algorithms, pp. 151–158, 1995.
[45] P. Chu and J. E. Beasley, "Genetic algorithms for the generalized assignment problem," Computers and Operations Research, 1996.
[46] P. Chu and J. E. Beasley, "Constraint handling in genetic algorithms: the set partitioning problem," tech. rep., The Management School, Imperial College, University of London, UK, 1997.
[47] W. Crompton, S. Hurley, and N. Stephens, "Frequency assignment using a parallel genetic algorithm," in Proceedings of Natural Algorithms in Signal Processing, 1993.
[48] T. L. Lau and E. P. K. Tsang, "Applying a mutation-based genetic algorithm to the processor configuration problem," in Proceedings of IEEE 8th International Conference on Tools with Artificial Intelligence, pp. 17–24, 1996.
[49] T. L. Lau and E. P. K. Tsang, "Solving the processor configuration problem with a mutation-based genetic algorithm," International Journal on Artificial Intelligence Tools, vol. 6, no. 4, pp. 567–585, 1997.
[50] Centre d'Electronique de l'Armement, France, ftp://ftp.cert.fr/pub/bourret, Radio Link Frequency Assignment Problem.
[51] M. Mitchell, S. Forrest, and J. Holland, "The royal road for genetic algorithms: Fitness landscapes and GA performance," in Proceedings of 1st European Conference on Artificial Life, pp. 245–254, 1992.
[52] A. Chalmers and S. Gregory, "Constructing minimum path configurations for multiprocessor systems," Parallel Computing, vol. 19, pp. 343–355, 1993.
[53] A. Chalmers, "A minimum-path system: a communication-efficient system for distributed-memory multiprocessors," South African Journal of Science, pp. 175–181, 1993.
[54] A. Chalmers, A minimum path parallel processing environment. Alpha Books, 1994.
[55] T. L. Lau, "Applying low power genetic algorithm to constraint satisfaction optimization problems," Master's thesis, Department of Computer Science, University of Essex, UK, 1995.
[56] C. Voudouris, Guided local search. PhD thesis, Department of Computer Science, University of Essex, UK, 1997.
[57] C. Voudouris and E. P. K. Tsang, "Guided local search and its application to the travelling salesman problem," European Journal of Operational Research, to appear.
[58] C. Voudouris and E. P. K. Tsang, "Function optimization using guided local search," Tech. Rep. CSM-249, Department of Computer Science, University of Essex, UK, Sept. 1995.
[59] C. Voudouris and E. P. K. Tsang, "Partial constraint satisfaction problems and guided local search," in Proceedings of Practical Application of Constraint Technology, pp. 337–356, Apr. 1996.
[60] B. Backer, V. Furnon, P. Kilby, P. Prosser, and P. Shaw, "Solving vehicle routing problems with constraint programming and metaheuristics," tech. rep., 1997.
[61] E. P. K. Tsang and C. Voudouris, "Fast local search and guided local search and their application to British Telecom's workforce scheduling problem," Operations Research Letters, vol. 20, pp. 119–127, Mar. 1997.
[62] J. E. Beasley, OR-Library. http://mscmga.ms.ic.ac.uk/info.html.
[63] S. Martello and P. Toth, "An algorithm for the generalized assignment problem," Operational Research '81, pp. 589–603, 1981.
[64] M. Fisher, R. Jaikumar, and L. N. V. Wassenhove, "A multiplier adjustment method for the generalized assignment problem," Management Science, vol. 32, pp. 1095–1103, 1986.
[65] D. Cattrysse, Set Partitioning Approaches to Combinatorial Optimization Problems. PhD thesis, Katholieke Universiteit Leuven, Centrum Industrieel Beleid, Belgium, 1990.
[66] S. Martello and P. Toth, Knapsack Problems: Algorithms and Computer Implementations. John Wiley & Sons, 1990.
[67] D. Cattrysse, M. Salomon, and L. N. V. Wassenhove, "A set partitioning heuristic for the generalized assignment problem," European Journal of Operational Research, vol. 72, pp. 167–174, 1994.
[68] I. H. Osman, "Heuristics for the generalized assignment problem: Simulated annealing and tabu search approaches," OR Spektrum, vol. 17, pp. 211–225, 1995.
[69] S. Forrest and M. Mitchell, "Relative building-block fitness and the building-block hypothesis," in Foundations of Genetic Algorithms 2, pp. 109–126, 1993.
[70] M. Mitchell, J. Holland, and S. Forrest, "When will a genetic algorithm outperform hill climbing?," in Advances in Neural Information Processing Systems 6, 1994.
[71] M. Mühlenbein, "Evolution in time and space - the parallel genetic algorithm," in Foundations of Genetic Algorithms, pp. 316–337, 1991.
[72] T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the generalized assignment problem," submitted to Computers and Operations Research, 1998.
[73] T. L. Lau and E. P. K. Tsang, "Solving the generalized assignment problem with the guided genetic algorithm," in Proceedings of 10th IEEE International Conference on Tools with Artificial Intelligence, 1998.
[74] G. T. Ross and R. M. Soland, "Modelling facility location problems as generalized assignment problems," Management Science, vol. 24, pp. 345–357, 1977.
[75] M. Fisher and R. Jaikumar, "A generalized assignment heuristic for large scale vehicle routing," Networks, vol. 11, pp. 109–124, 1981.
[76] D. Cattrysse and L. N. V. Wassenhove, "A survey of algorithms for the generalized assignment problem," European Journal of Operational Research, vol. 60, pp. 260–272, 1992.
[77] F. Gheysens, B. Golden, and A. Assad, "A comparison of techniques for solving the fleet size and mix vehicle routing problem," OR Spektrum, vol. 6, pp. 207–216, 1984.
[78] T. L. Lau and E. P. K. Tsang, "Solving the radio link frequency assignment problem with the guided genetic algorithm," submitted to Constraints, 1998.
[79] K. Hale, "Frequency assignment: Theory and applications," Proceedings of the IEEE, vol. 68, pp. 1497–1514, 1980.
[80] S. Tiourine, C. Hurkens, and J. Lenstra, "An overview of algorithmic approaches to frequency assignment problems," in Proceedings of CALMA Symposium, Scheveningen, 1995.
[81] T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the processor configuration problem," IEEE Expert (Intelligent Systems and Their Applications), 1998.
[82] T. L. Lau and E. P. K. Tsang, "Guided genetic algorithm and its application to the large processor configuration problem," in Proceedings of 10th IEEE International Conference on Tools with Artificial Intelligence, 1998.
[83] C. Goral, "Modelling the interaction of light between diffuse surfaces," ACM Computer Graphics, vol. 18, no. 3, pp. 213–222, 1984.
[84] C. V. Canta, "Torus and other networks as communication networks with up to some hundred points," IEEE Transactions on Computers, vol. 33, no. 7, pp. 657–666, 1983.
[85] J. Heitkötter and D. Beasley, The Hitch-Hiker's Guide to Evolutionary Computation. http://www.cs.purdue.edu/coast/archive/clife/FAQ/www/, 1998.
[86] P. Sharpe, A. Chalmers, and A. Greenwood, "Genetic algorithms for generating minimum path configurations," Microprocessors and Microsystems, vol. 19, no. 1, pp. 9–14, 1995.
149