Genetic Algorithms
By Anas Amjad Obeidat
Advanced Algorithms
Semester 2 - 2008/2009
March 18 - 2009
Overview
• Introduction To Genetic Algorithms (GAs)
• GA Operators and Parameters
• Genetic Algorithms To Solve The Traveling Salesman Problem (TSP)
• 8-queens Problem
• Summary
Introduction To Genetic Algorithms (GAs)
History Of Genetic Algorithms
• The idea of “Evolutionary Computing” was introduced in the 1960s by I. Rechenberg.
• John Holland wrote the first book on Genetic Algorithms, ‘Adaptation in Natural and Artificial Systems’, in 1975.
• In 1992 John Koza used genetic algorithms to evolve programs to perform certain tasks. He called his method “Genetic Programming”.
What Are GAs?
Genetic Algorithms are search and optimization techniques based on Darwin’s Principle of Natural Selection.
Principle Of Natural Selection
“Select The Best, Discard The Rest”[1]
GAs Vs other search methods
“Search” for what?
• Data: efficiently retrieving a piece of information (data mining); not AI
• Paths to solutions: a sequence of actions/steps from an initial state to a given goal (AI tree/graph search)
• Solutions: finding a good solution to a problem in a large space (search space) of candidate solutions
– Aggressive methods (e.g. Simulated Annealing, Hill Climbing)
– Non-aggressive methods (e.g. GAs)
Applications of GAs
• Numerical and Combinatorial Optimization: job-shop scheduling, traveling salesman
• Automatic Programming: genetic programming
• Machine Learning: classification, neural network training, prediction
• Economics: bidding strategies, stock trends
• Ecology: host-parasite coevolution, resource flow, biological arms races
• Population Genetics: viability of gene propagation
• Social Systems: evolution of social behavior in insect colonies
Genetic Algorithms Implementation
Computational Model
Main GA algorithm
Working Mechanism Of GAs
1. Begin: T = 0
2. Initialize population
3. Evaluate solutions
4. Optimum solution?
   Y: Stop
   N: Selection, Crossover, Mutation, then T = T + 1 and return to step 3
Simple Genetic Algorithm
function GENETIC-ALGORITHM(population, FITNESS-FN, crossover-rate, mutation-rate) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN (the fitness function)
  repeat
    new_population ← empty set
    calculate the fitness value of each individual
    loop for i from 1 to SIZE(population) do
      x ← RANDOM-SELECTION(population, FITNESS-FN)
      add x to new_population
    loop for i from 1 to SIZE(population) * crossover-rate do
      x ← RANDOM-SELECTION(new_population)
      y ← RANDOM-SELECTION(new_population)
      x, y ← REPRODUCE(x, y)
    loop for i from 1 to SIZE(population) * mutation-rate do
      x ← RANDOM-SELECTION(new_population)
      x ← MUTATE(x)
    population ← new_population
  until the average fitness values are stable, or enough time has elapsed
  return the best individual found in any population
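The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's implementation: the fixed `generations` count stands in for the stopping test, fitness values are assumed to be positive with higher meaning fitter, and the helpers `ones`, `one_point` and `flip_one_bit` are hypothetical names for a toy bit-string problem.

```python
import random

def genetic_algorithm(population, fitness_fn, crossover_rate, mutation_rate,
                      reproduce, mutate, generations=50, rng=None):
    """Generational GA in the spirit of the pseudocode; higher fitness is better."""
    rng = rng or random.Random()
    size = len(population)
    best = max(population, key=fitness_fn)
    for _ in range(generations):  # stand-in for "enough time has elapsed"
        weights = [fitness_fn(ind) for ind in population]
        # Fitness-proportionate selection into the new population.
        new_population = rng.choices(population, weights=weights, k=size)
        # Crossover: replace randomly chosen pairs by their offspring.
        for _ in range(int(size * crossover_rate)):
            i, j = rng.sample(range(size), 2)
            new_population[i], new_population[j] = reproduce(
                new_population[i], new_population[j], rng)
        # Mutation: alter a few randomly chosen individuals.
        for _ in range(int(size * mutation_rate)):
            i = rng.randrange(size)
            new_population[i] = mutate(new_population[i], rng)
        population = new_population
        best = max(population + [best], key=fitness_fn)
    return best

# Toy problem (assumption): maximise the number of 1 bits in a list of bits.
def ones(bits):
    return 1 + sum(bits)  # the +1 keeps every selection weight positive

def one_point(x, y, rng):
    cut = rng.randrange(1, len(x))
    return x[:cut] + y[cut:], y[:cut] + x[cut:]

def flip_one_bit(x, rng):
    i = rng.randrange(len(x))
    return x[:i] + [1 - x[i]] + x[i + 1:]

rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(12)] for _ in range(20)]
best = genetic_algorithm(pop, ones, 0.8, 0.1, one_point, flip_one_bit, rng=rng)
```

Tracking `best` across generations mirrors the last line of the pseudocode, which returns the best individual found in any population rather than in the final one.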
Nature to Computer Mapping
Nature → Computer
Population → Set of solutions.
Individual → Solution to a problem.
Fitness → Quality of a solution.
Chromosome → Encoding for a solution.
Gene → Part of the encoding of a solution.
Reproduction → Crossover
GA Operators and Parameters
Encoding
The process of representing the solution in the form of a string that conveys the necessary information.
• Binary Encoding – Most common method of encoding. Chromosomes are strings of 1s and 0s and each position in the chromosome represents a particular characteristic of the problem.
• Permutation Encoding – Useful in ordering problems such as the Traveling Salesman Problem (TSP). Example: in TSP, every chromosome is a string of numbers, each of which represents a city to be visited.
• Value Encoding – Used in problems where complicated values, such as real numbers, are used and where binary encoding would not suffice.
Fitness Function
A fitness function quantifies the optimality of a solution (chromosome) so that it can be ranked against all the other solutions.
• A fitness value is assigned to each solution depending on how close it is to solving the problem.
• An ideal fitness function correlates closely with the goal and is quick to compute.
• Example: in TSP, f(x) is the sum of the distances between consecutive cities in the solution. The smaller the value, the fitter the solution.
Recombination
The process that determines which solutions are to be preserved and allowed to reproduce and which ones deserve to die out. (Most GA literature calls this operator selection and reserves “recombination” for crossover.)
• The primary objective of the recombination operator is to emphasize the good solutions and eliminate the bad solutions in a population, while keeping the population size constant.
• “Selects The Best, Discards The Rest”.
• In these slides, “Recombination” (selection) is different from “Reproduction” (crossover).
Recombination (contd.)
• Identify the good solutions in a population.
• Make multiple copies of the good solutions.
• Eliminate bad solutions from the population so that multiple copies of good solutions can be placed in the population.
Roulette Wheel Selection
• Each string in the current population is assigned a slot whose size is proportional to its fitness.
• We spin the weighted roulette wheel thus defined n times (where n is the total number of solutions).
• Each time the roulette wheel stops, the string corresponding to that slot is copied into the new population.
Strings that are fitter are assigned a larger slot and hence have a better chance of appearing in the new population.
Example Of Roulette Wheel Selection
No. String Fitness % Of Total
1 01101 169 14.4
2 11000 576 49.2
3 01000 64 5.5
4 10011 361 30.9
Total 1170 100.0
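As a rough sketch, the wheel in this table can be reproduced in Python. The function name `roulette_select` is illustrative; the fitness values in the table happen to equal the square of each string's binary value, which the snippet uses only to regenerate the numbers.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_select(strings, fitnesses, n, rng=random):
    """Spin a fitness-weighted wheel n times and return the selected strings."""
    cumulative = list(accumulate(fitnesses))   # slot boundaries on the wheel
    total = cumulative[-1]
    picks = []
    for _ in range(n):
        spin = rng.random() * total            # where the wheel stops
        slot = min(bisect_right(cumulative, spin), len(strings) - 1)
        picks.append(strings[slot])
    return picks

strings = ["01101", "11000", "01000", "10011"]
fitnesses = [int(s, 2) ** 2 for s in strings]              # 169, 576, 64, 361
shares = [round(100 * f / sum(fitnesses), 1) for f in fitnesses]
print(shares)                                              # [14.4, 49.2, 5.5, 30.9]
print(roulette_select(strings, fitnesses, n=4))            # varies from run to run
```

String 2 owns almost half of the wheel, so it is expected to appear about twice in a new population of four.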
Roulette Wheel For Example
[Figure: roulette wheel for the example above]
Crossover
It is the process in which two chromosomes (strings) combine their genetic material (bits) to produce a new offspring which possesses both their characteristics.
• Two strings are picked from the mating pool at random to cross over.
• The method chosen depends on the Encoding Method.
Crossover Methods
• Single-Point Crossover: A random point is chosen on the individual chromosomes (strings) and the genetic material is exchanged at this point.
Chromosome 1 11011 | 00100110110
Chromosome 2 11011 | 11000011110
Offspring 1 11011 | 11000011110
Offspring 2 11011 | 00100110110
Crossover Methods (contd.)
• Two-Point Crossover: Two random points are chosen on the individual chromosomes (strings) and the genetic material is exchanged at these points.
Chromosome 1 11011 | 00100 | 110110
Chromosome 2 10101 | 11000 | 011110
Offspring 1 10101 | 00100 | 011110
Offspring 2 11011 | 11000 | 110110
NOTE: These chromosomes are different from the last example.
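The single-point and two-point examples can be checked with a short Python sketch. The function names and the explicit cut positions are illustrative; in a real GA the cut points would be drawn at random.

```python
def single_point(a, b, cut):
    """Swap the tails of two equal-length strings after index `cut`."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def two_point(a, b, lo, hi):
    """Swap the middle segment [lo, hi); each child keeps one parent's ends."""
    return a[:lo] + b[lo:hi] + a[hi:], b[:lo] + a[lo:hi] + b[hi:]

# Single-point example: cut after the 5th bit.
o1, o2 = single_point("1101100100110110", "1101111000011110", 5)
print(o1, o2)   # 1101111000011110 1101100100110110

# Two-point example: cuts after the 5th and 10th bits.
c1 = "11011" + "00100" + "110110"
c2 = "10101" + "11000" + "011110"
print(two_point(c1, c2, 5, 10))
# ('1101111000110110', '1010100100011110')
```

The two-point result contains the same two strings as the slide, listed in the opposite order, because this sketch names the child that keeps the first parent's ends first.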
Crossover Methods (contd.)
• Uniform Crossover: Each gene (bit) of the offspring is selected randomly from the corresponding genes of the two parent chromosomes.
Chromosome 1 11011 | 00100 | 110110
Chromosome 2 10101 | 11000 | 011110
Offspring 10111 | 00000 | 110110
NOTE: In this variant, Uniform Crossover yields ONLY 1 offspring.
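A minimal Python sketch of the single-offspring variant shown here, assuming each gene is taken from either parent with equal probability (the slide does not state the mixing ratio):

```python
import random

def uniform_crossover(a, b, rng=random):
    """Build one child by picking each gene from either parent with equal odds."""
    return "".join(rng.choice(pair) for pair in zip(a, b))

parent1 = "1101100100110110"
parent2 = "1010111000011110"
child = uniform_crossover(parent1, parent2)
print(child)  # varies from run to run
```

Wherever the parents agree the child necessarily inherits that bit, so only the positions where they differ are actually randomized.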
Crossover (contd.)
• Crossover between 2 good solutions MAY NOT ALWAYS yield a better or as good a solution.
• Since parents are good, probability of the child being good is high.
• If offspring is not good (poor solution), it will be removed in the next iteration during “Selection”.
Elitism
Elitism is a method that copies the best chromosome (or a few of the best) into the new population before crossover and mutation are applied.
• When creating a new population by crossover or mutation, the best chromosome might be lost.
• Elitism forces the GA to retain some number of the best individuals at each generation.
• Elitism has been found to improve performance significantly in many cases.
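One possible way to express elitism in Python is sketched below. The function name, the `n_elite` parameter and the choice to fill the remaining slots with the fittest offspring are assumptions made for illustration; higher fitness is taken to be better.

```python
def next_generation_with_elitism(population, offspring, fitness_fn, n_elite=1):
    """Carry the n_elite fittest individuals over unchanged (higher is fitter)."""
    elite = sorted(population, key=fitness_fn, reverse=True)[:n_elite]
    # Fill the remaining slots with the best of the new offspring.
    rest = sorted(offspring, key=fitness_fn, reverse=True)[:len(population) - n_elite]
    return elite + rest

# Toy individuals whose fitness is simply their own value.
new_pop = next_generation_with_elitism([3, 9, 1, 5], [2, 4, 6, 8], fitness_fn=lambda x: x)
print(new_pop)  # [9, 8, 6, 4]
```

The population size stays constant, and the best individual of the old generation (9) survives even though no offspring matches it.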
Mutation
Mutation is the process by which a string is deliberately changed so as to maintain diversity in the population.
We saw in the giraffes’ example that mutations can be beneficial.
Mutation Probability: determines how often parts of a chromosome will be mutated.
Example Of Mutation
• For chromosomes using Binary Encoding, randomly selected bits are inverted.
Offspring 11011 00100 110110
Mutated Offspring 11010 00100 100110
NOTE: The number of bits to be inverted depends on the Mutation Probability.
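Bit-flip mutation for binary encoding could be sketched like this, assuming each bit is inverted independently with the mutation probability (the name `bit_flip` and the probability 0.1 are illustrative):

```python
import random

def bit_flip(chromosome, p_mutation, rng=random):
    """Invert each bit independently with probability p_mutation."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < p_mutation else bit
        for bit in chromosome
    )

offspring = "1101100100110110"
print(bit_flip(offspring, 0.1))  # varies; about 1.6 of the 16 bits flip on average
```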
Advantages Of GAs
• Global Search Methods: GAs search for the function optimum starting from a population of points of the function domain, not a single one. This characteristic suggests that GAs are global search methods. They can, in fact, climb many peaks in parallel, reducing the probability of getting trapped at a local optimum, which is one of the drawbacks of traditional optimization methods.
Advantages of GAs (contd.)
• Blind Search Methods: GAs only use the information about the objective function. They do not require knowledge of the first derivative or any other auxiliary information, allowing a number of problems to be solved without the need to formulate restrictive assumptions. For this reason, GAs are often called blind search methods.
Advantages of GAs (contd.)
• GAs use probabilistic transition rules during iterations, unlike the traditional methods that use fixed transition rules.
This makes them more robust and applicable to a large range of problems.
Advantages of GAs (contd.)
• GAs can be easily used on parallel machines: in real-world design optimization problems, most computational time is spent evaluating solutions, so with multiple processors all solutions in a population can be evaluated in a distributed manner. This reduces the overall computational time substantially.
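As an illustration of evaluating a population in parallel, the sketch below uses Python's standard `concurrent.futures` worker pool. The function name and the toy fitness are assumptions; for CPU-bound fitness functions written in pure Python, a process pool would usually be the better fit than the thread pool shown here.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, fitness_fn, workers=4):
    """Return fitness values in population order, computed by a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fitness_fn, population))

# Toy fitness (assumption): number of 1 bits in the string.
scores = evaluate_population(["0110", "1111", "0000"], lambda s: s.count("1"))
print(scores)  # [2, 4, 0]
```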
Genetic Algorithms To Solve The Traveling Salesman Problem (TSP)
The Problem
The Traveling Salesman Problem is defined as:
‘We are given a set of cities and a symmetric distance matrix that indicates the cost of travel from each city to every other city.
The goal is to find the shortest circular tour, visiting every city exactly once, so as to minimize the total travel cost, which includes the cost of traveling from the last city back to the first city’.
Encoding
• We represent every city with an integer.
• Consider 6 Jordanian cities (Amman, Irbid, Al-Mafraq, Al-Salt, Aqabah and Al-Karak) and assign a number to each.
Amman 1
Irbid 2
Al-Mafraq 3
Al-Salt 4
Aqabah 5
Al-Karak 6
Encoding (contd.)
• Thus a path would be represented as a sequence of integers from 1 to 6.
• The path [1 2 3 4 5 6] represents a tour from Amman to Irbid, Irbid to Al-Mafraq, Al-Mafraq to Al-Salt, Al-Salt to Aqabah, Aqabah to Al-Karak, and finally Al-Karak back to Amman.
• This is an example of Permutation Encoding as the position of the elements determines the fitness of the solution.
Fitness Function
• The fitness function will be the total cost of the tour represented by each chromosome.
• This can be calculated as the sum of the distances traversed in each travel segment.
The Smaller The Sum, The Fitter The Solution Represented By That Chromosome.
Distance/Cost Matrix For TSP
                Amman  Irbid  Al-Mafraq  Al-Salt  Aqabah  Al-Karak
                [1]    [2]    [3]        [4]      [5]     [6]
Amman [1]         0     90    100         35      300     200
Irbid [2]        90      0     60        120      400     290
Al-Mafraq [3]   100     60      0         70      480     225
Al-Salt [4]      35    120     70          0      320     150
Aqabah [5]      300    400    480        320        0     290
Al-Karak [6]    200    290    225        150      290       0
Cost matrix for the six-city example. Distances in kilometers.
Fitness Function (contd.)
• So, for a chromosome [4 1 3 2 5 6], the total cost of travel, or fitness, is calculated as shown below.
• Fitness = (4,1) + (1,3) + (3,2) + (2,5) + (5,6) + (6,4) = 35 + 100 + 60 + 400 + 290 + 150 = 1035 km.
• Since our objective is to minimize the distance, the smaller the total distance, the fitter the solution.
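The tour cost can be computed directly from the six-city cost matrix. The sketch below copies that matrix; the names `DIST` and `tour_length` are illustrative.

```python
# Cost matrix from the six-city example (km); row/column 0 is city 1 (Amman).
DIST = [
    [0,   90,  100, 35,  300, 200],
    [90,  0,   60,  120, 400, 290],
    [100, 60,  0,   70,  480, 225],
    [35,  120, 70,  0,   320, 150],
    [300, 400, 480, 320, 0,   290],
    [200, 290, 225, 150, 290, 0],
]

def tour_length(tour, dist=DIST):
    """Total length of a closed tour given as a list of 1-based city numbers."""
    legs = zip(tour, tour[1:] + tour[:1])   # each city paired with its successor, wrapping around
    return sum(dist[a - 1][b - 1] for a, b in legs)

print(tour_length([4, 1, 3, 2, 5, 6]))  # 1035
```

Because the tour is circular and the matrix is symmetric, rotating or reversing a chromosome does not change its cost.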
Selection Operator
Tournament Selection.
As the name suggests tournaments are played between two solutions and the better solution is chosen and placed in the mating pool.
Two other solutions are picked again and another slot in the mating pool is filled up with the better solution.
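A minimal sketch of binary tournament selection for a minimization problem such as TSP follows. The function name is illustrative, and contestants are assumed to be drawn at random without replacement within each tournament.

```python
import random

def tournament_select(population, cost_fn, pool_size, rng=random):
    """Fill a mating pool by repeated two-way tournaments (lower cost wins)."""
    pool = []
    for _ in range(pool_size):
        a, b = rng.sample(population, 2)       # two distinct contestants
        pool.append(a if cost_fn(a) <= cost_fn(b) else b)
    return pool

# Toy costs (assumption): each individual is its own cost.
print(tournament_select([1160, 1300, 1045, 1105], cost_fn=lambda c: c, pool_size=2))
```

With two distinct contestants per tournament, the worst individual in the population can never win a slot in the mating pool.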
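Why we can’t use single-point crossover
• The single-point crossover method randomly selects a crossover point in the string and swaps the substrings.
• This may produce invalid offspring, in which one city is repeated and another is missing, as shown below (crossover point after the third city).
Parent 1: 4 1 3 | 2 5 6
Parent 2: 4 3 2 | 1 5 6
Offspring 1: 4 1 3 | 1 5 6
Offspring 2: 4 3 2 | 2 5 6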
Order 1 crossover
• The idea is to preserve the relative order in which elements occur
• Informal procedure:
1. Choose an arbitrary part from the first parent
2. Copy this part to the first child
3. Copy the numbers that are not in the first part, to the first child:
• starting right from cut point of the copied part,
• using the order of the second parent
• and wrapping around at the end
4. Analogous for the second child, with parent roles reversed
Order 1 crossover example
• Copy randomly selected set from first parent
• Copy rest from second parent in order 1,9,3,8,2
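The informal Order 1 procedure can be sketched in Python as follows. The function name and the explicit cut indices are illustrative (normally the cut points are random), and the two six-city tours used as input are only an example.

```python
def order1_crossover(p1, p2, lo, hi):
    """Child keeps p1[lo:hi]; the other cities follow p2's order from index hi, wrapping."""
    n = len(p1)
    child = [None] * n
    child[lo:hi] = p1[lo:hi]
    kept = set(p1[lo:hi])
    # Cities of p2 read from the second cut point, skipping those already copied.
    fill = [p2[(hi + k) % n] for k in range(n) if p2[(hi + k) % n] not in kept]
    # Free positions of the child, also visited from the second cut point.
    slots = [(hi + k) % n for k in range(n) if not lo <= (hi + k) % n < hi]
    for pos, city in zip(slots, fill):
        child[pos] = city
    return child

a = [2, 1, 3, 4, 5, 6]
b = [1, 4, 3, 2, 6, 5]
print(order1_crossover(a, b, 2, 5))  # [2, 6, 3, 4, 5, 1]
print(order1_crossover(b, a, 2, 5))  # [4, 5, 3, 2, 6, 1]
```

Unlike single-point crossover, every child is a valid permutation: each city appears exactly once.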
Mutation Operator
• The mutation operator induces a change in the solution, so as to maintain diversity in the population and prevent Premature Convergence.
• In our project, we mutate the string by randomly selecting any two cities and interchanging their positions in the solution, thus giving rise to a new tour.
Before mutation: 4 1 3 2 5 6
After mutation: 4 5 3 2 1 6
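Swap mutation as described here could be sketched as follows (the function name is illustrative):

```python
import random

def swap_mutation(tour, rng=random):
    """Return a copy of the tour with two randomly chosen positions exchanged."""
    i, j = rng.sample(range(len(tour)), 2)   # two distinct positions
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

print(swap_mutation([4, 1, 3, 2, 5, 6]))  # varies, e.g. [4, 5, 3, 2, 1, 6]
```

Swapping two positions always yields another valid permutation, so no repair step is needed after mutation.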
TSP Example: details (1)
• Initial Population:
P1: {2,1,3,4,5,6}
P2: {1,2,3,5,4,6}
P3: {1,4,3,2,6,5}
P4: {5,3,2,1,4,6}
• Generation 1:
1- Fitness Function (P1): (2,1) + (1,3) + (3,4) + (4,5) + (5,6) + (6,2) = 90 + 100 + 70 + 320 + 290 + 290 = 1160 km
2- Fitness Function (P2): (1,2) + (2,3) + (3,5) + (5,4) + (4,6) + (6,1) = 90 + 60 + 480 + 320 + 150 + 200 = 1300 km
3- Fitness Function (P3): (1,4) + (4,3) + (3,2) + (2,6) + (6,5) + (5,1) = 35 + 70 + 60 + 290 + 290 + 300 = 1045 km
4- Fitness Function (P4): (5,3) + (3,2) + (2,1) + (1,4) + (4,6) + (6,5) = 480 + 60 + 90 + 35 + 150 + 290 = 1105 km
Fitness Function: Minimum Distance between Cities
Termination Condition: Generation 3
TSP Example: details (2)
• Tournament Selection
P1: 1160 km
P2: 1300 km
P3: 1045 km
P4: 1105 km
The Winners: P1 & P3
• Crossover (Two Points): Order (1)
Nodes   Solution            Notes
P1      2 1 | 3 4 5 | 6
P3      1 4 | 3 2 6 | 5
S1      2 6 | 3 4 5 | 1     5 1 4 3 2 6 (Order 1)
S2      4 5 | 3 2 6 | 1     6 2 1 3 4 5 (Order 1)
Table 1
The Notes column lists the other parent read from the second cut point, wrapping around.
TSP Example: details (3)
• Generation 2
P1: {2,1,3,4,5,6} = 1160 km
P2: {1,4,3,2,6,5} = 1045 km
P3: {2,6,3,4,5,1} = 1295 km
P4: {4,5,3,2,6,1} = 1385 km
• Tournament Selection
P1: 1160 km
P2: 1045 km
P3: 1295 km
P4: 1385 km
The Winners: P1 & P2
• Crossover (Two Points): Order (1)
Nodes   Solution            Notes
P1      2 1 | 3 4 5 | 6
P2      1 4 | 3 2 6 | 5
S1      2 6 | 3 4 5 | 1     1 2 6 (Order 1)
S2      4 5 | 3 2 6 | 1     1 4 5 (Order 1)
Table 2
TSP Example: details (4)
• Generation 3
P1: {2,1,3,4,5,6} = 1160 km
P2: {1,4,3,2,6,5} = 1045 km
P3: {2,6,3,4,5,1} = 1295 km
P4: {4,5,3,2,6,1} = 1385 km
• Tournament Selection
P1: 1160 km
P2: 1045 km
P3: 1295 km
P4: 1385 km
The Winners: P1 & P2
• Crossover (Two Points): Order (1)
The crossover result is the same as in Table 2.
• Mutation
^P1: 2 6 3 4 5 1, Fitness = 1295 km
We used mutation to address the local minimum problem.
After Generation 3 (the termination condition), the best solution found is P2 (1045 km). This is the best tour in the final population rather than a proven optimum: the tour [4 1 3 2 5 6] evaluated earlier costs 1035 km.
8-queens Problem
8-queens
• How to represent the 8-queens problem in GA?
• Remember an individual is a potential solution.
• In the 8-queens problem, it will be a state with 8-queens on the board.
• One way is to specify the position of the 8 queens, each in a column of 8 squares.
• For example, the setting on the right will be specified by this chromosome: (86427531)
• This can be represented by bits or digits.
• Note: this is not an optimization problem.
8-queens: (a) Initialization
• Assume we have the following initial population of 4 individuals:
v1 = (24748552)
v2 = (32752411)
v3 = (24415124)
v4 = (32543213)
8-queens: (b) Fitness Evaluation
• Fitness function: the fewer conflicts (attacking queens), the better
• We can use the number of non-attacking pairs of queens. The highest possible value of the fitness function is 8C2 = 28. Every solution will have a fitness value of 28.
8-queens: (b) Fitness Evaluation
• We calculate the fitness value of each chromosome.
• For example, fitness of the chromosome v1 (24748552) is 28 – 4 = 24
• That is because only 4 pairs of queens attack each other:
– The queens on 1st and 8th column
– The queens on 2nd and 4th column
– The queens on 6th and 7th column
– The queens on 3rd and 8th column
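The fitness calculation can be sketched in Python. The function name is illustrative; each digit is read as the row of the queen in that column, and a pair counts as attacking if it shares a row or a diagonal, whether or not another queen stands between them.

```python
from itertools import combinations

def non_attacking_pairs(chromosome):
    """Fitness of an n-queens chromosome: all pairs minus the attacking pairs."""
    rows = [int(d) for d in chromosome]          # rows[c] = row of the queen in column c
    attacking = sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)   # same row or same diagonal
    )
    n = len(rows)
    return n * (n - 1) // 2 - attacking

print(non_attacking_pairs("24748552"))  # 24
```

Columns never clash in this encoding because each position of the chromosome holds exactly one queen.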
8-queens: (b) Fitness Evaluation
• The fitness values for the chromosomes are calculated as follows:
eval(v1) = 24
eval(v2) = 23
eval(v3) = 20
eval(v4) = 11
• None of the chromosomes is the solution to the problem. If a solution is found, the algorithm stops and returns the solution.
8-queens: (c) Selection
• The total sum of fitness values = 24 + 23 + 20 + 11 = 78
• So, the probability of each chromosome to be selected into the next generation is as follows:
prob(v1) = 24/78 = 31%
prob(v2) = 23/78 = 29%
prob(v3) = 20/78 = 26%
prob(v4) = 11/78 = 14%
8-queens: (c) Selection
• Next, we arrange these probabilities into different ranges from 0 to 1 to facilitate the roulette wheel process:
v1 : 0.00 to 0.31
v2 : 0.31 to 0.60
v3 : 0.60 to 0.86
v4 : 0.86 to 1.00
8-queens: (c) Selection
• Four random numbers are then drawn for the next generation. Suppose we have the following random numbers: 0.4012, 0.1486, 0.5973, 0.8129
• The following individuals will be chosen:
0.4012 → v2 (32752411) → v1'
0.1486 → v1 (24748552) → v2'
0.5973 → v2 (32752411) → v3'
0.8129 → v3 (24415124) → v4'
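The mapping from random numbers to chromosomes can be reproduced with a cumulative-probability lookup. This sketch uses the exact fractions (24/78, 23/78, ...) rather than the rounded percentages; the helper name `pick` is illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

fitness = {"v1": 24, "v2": 23, "v3": 20, "v4": 11}
total = sum(fitness.values())                                    # 78
bounds = list(accumulate(f / total for f in fitness.values()))   # upper end of each range
names = list(fitness)

def pick(r):
    """Name of the chromosome whose range contains the random number r."""
    return names[min(bisect_right(bounds, r), len(names) - 1)]

draws = [0.4012, 0.1486, 0.5973, 0.8129]
print([pick(r) for r in draws])   # ['v2', 'v1', 'v2', 'v3']
```

Note that v2 is selected twice and v4 not at all, which is the expected bias toward fitter chromosomes.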
8-queens: (d) Crossover
• Next, some of these four chromosomes will perform crossover. Suppose the crossover probability is 0.80. All 4 chromosomes are selected for crossover (the number is rounded up to an even number).
• The selected chromosomes are paired up randomly.
• A crossover point is randomly chosen for each crossover.
8-queens: (d) Crossover
• Suppose the 3rd digit in the first pair is chosen as the crossover point.
v1' = (327 | 52411)
v2' = (247 | 48552)
• After crossover, we will have:
v1'' = (327 | 48552)
v2'' = (247 | 52411)
[Figure: crossover of v1' and v2' producing v1'' and v2'']
8-queens: (e) Mutation
• For each gene (digit), there is a small chance that it will be mutated.
• In the 8-queens problem, it means choosing a queen at random and moving it to a random square in its column.
• Suppose the mutation probability is 0.05
• 32 random numbers are generated in total.
• Suppose the 6th, 19th, and 32nd random numbers are smaller than 0.05.
• The three corresponding digits will be mutated:
– 6th digit in v1''
– 3rd digit in v3''
– 8th digit in v4''
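The link between the 32 random numbers and the digits they control is simple index arithmetic, sketched below under the assumption that the numbers are consumed chromosome by chromosome, 8 digits each (the function name is illustrative):

```python
def gene_position(k, genes_per_chromosome=8):
    """Map the k-th random number (1-based) to (chromosome number, digit number)."""
    chrom, digit = divmod(k - 1, genes_per_chromosome)
    return chrom + 1, digit + 1

print([gene_position(k) for k in (6, 19, 32)])  # [(1, 6), (3, 3), (4, 8)]
```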
8-queens: (e) Mutation
• For each digit to be mutated, another random number will be generated to determine where the queen should be moved to.
• For example,
v1'' = (32748552)
• If a random number determines that the 6th digit should be mutated to 1, the new chromosome will become:
v1''' = (32748152)
8-queens: (e) Mutation
• The same process is applied to every gene to be mutated.
• The final chromosomes for the new generation are thus as follows:
v1''' = (32748152)
v2''' = (24752411)
v3''' = (32252124)
v4''' = (24415417)
• The process is then repeated from step (b) until a solution is found.
8-queens: A Summary
Summary
• Genetic Algorithms (GAs) implement optimization strategies based on simulation of the natural law of evolution of a species by natural selection
• The basic GA operators are: Encoding, Recombination, Crossover, Mutation
• GAs have been applied to a variety of function optimization problems, and have been shown to be highly effective in searching a large, poorly defined search space even in the presence of difficulties such as high-dimensionality, multi-modality, discontinuity and noise.
References
1. D. E. Goldberg, ‘Genetic Algorithms in Search, Optimization and Machine Learning’, New York: Addison-Wesley (1989).
2. John H. Holland, ‘Genetic Algorithms’, Scientific American, July 1992.
3. Kalyanmoy Deb, ‘An Introduction To Genetic Algorithms’, Sadhana, Vol. 24, Parts 4 and 5.
4. T. Starkweather, et al., ‘A Comparison Of Genetic Sequencing Operators’, International Conference on Genetic Algorithms (1991).
5. D. Whitley, et al., ‘Traveling Salesman And Sequence Scheduling: Quality Solutions Using Genetic Edge Recombination’, Handbook Of Genetic Algorithms, New York.
References (contd.)
WEBSITES
6. www.iitk.ac.in/kangal
7. www.math.princeton.edu
8. www.genetic-programming.com
9. www.garage.cse.msu.edu
10. www.aic.nre.navy.mie/galist
Questions?