TRANSCRIPT
EE682 Intelligent Control Theory, Prof. J.-H. Kim
Lecture 19
Genetic Algorithms - I
What are they?
How do they work?
Why do they work?
Robot Intelligence Technology Lab.
Genetic Algorithms
Evolutionary Computation: GA, EP, ES, GP, etc.
GAs: What are they? How do they work? Why do they work?
Derivative-Free Optimization
Characteristics: Derivative freeness: instead, repeated evaluations of the objective function
Intuitive guideline: evolution and thermodynamics.
Slowness: generally slower than derivative-based optimization
Flexibility: handles complex objective functions without sacrificing coding/computation time
Randomness: random number generator in deciding next search direction
Analytic opacity: because of randomness and problem-specific nature
Iterative nature: terminal conditions
Computation time
Optimization goal: (the best objective function value fk) < (a certain preset goal value)
Minimal improvement: (fk - fk-1) < (a preset value)
Minimal relative improvement: ((fk - fk-1) / fk-1) < (a preset value)
Genetic Algorithms
Derivative-freeness
Parallel-search procedure: can be implemented on parallel-processing machines for massive speedup
Applicable to both continuous and discrete (combinatorial) optimization
Stochastic: less likely to get trapped in local minima
Flexibility: both structure and parameter identification in complex models
Terminology
Chromosome: a binary bit string
Population (gene pool): a set of chromosomes
Population-based optimization
Generation
GAs: What are they?
What are they? Stochastic algorithms based on natural phenomena:
genetic inheritance and Darwinian strife for survival
C. Darwin’s principle of natural selection
G. Mendel’s basic principles of heredity
GAs: What are they?
Balancing between exploitation and exploration: Exploiting the best solution and exploring the search space
Aimed at complex problems: large-scale combinatorial optimization and
highly constrained engineering problems
Belong to the class of probabilistic algorithms: Directed and stochastic search
Maintain a population of potential solutions
Perform a multi-directional search
Cf. Hill-climbing method and Simulated annealing technique: process a single point in the search space
Crossover: information exchange between different potential solutions
Mutation: introduces some extra variability into the population
GAs: What are they?
Five components:
A genetic representation
A way to create an initial population
An evaluation function considered as an environment
Genetic operators
Values for various parameters
GAs: What are they?
Encoding scheme:
(11, 6, 9): 1011 0110 1001 - binary code
            1110 0101 1101 - Gray code
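As a quick illustration, the binary-to-Gray mapping above can be computed with a single XOR; this Python sketch (the function name is ours) reproduces the (11, 6, 9) example:

```python
def to_gray(n: int) -> int:
    # Standard binary-to-Gray conversion: XOR the value with itself shifted right.
    return n ^ (n >> 1)

# Reproduces the slide's example for the genes (11, 6, 9):
codes = [f"{to_gray(v):04b}" for v in (11, 6, 9)]  # -> ["1110", "0101", "1101"]
```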
Fitness evaluation: f(x) or ranking
Selection: determine which parents participate in producing offspring for the next generation
Selection probability of the i-th chromosome:
    p_i = f_i / Σ_{k=1}^{n} f_k
GAs: What are they?
Crossover operator
Crossover rate
Generates new chromosomes
One-point crossover:
    100|11110   100|10010
    101|10010   101|11110
Two-point crossover:
    1|0011|110   1|0110|110
    1|0110|010   1|0011|010
Mutation operator
Generates new chromosomes
Flip a bit with a probability (mutation rate):
    10011110 → 10011010 (mutated bit: position 6)
Elitism
Keep the best members
GAs: How do they work?
How do they work? Maximize a function of k variables
Each variable x_i is coded as a binary string of length m_i
Suppose six decimal places for the variables’ values is desirable
Construction of a roulette wheel: for the selection process
Calculate the fitness value eval(v_i) for each chromosome v_i (i = 1, …, pop_size)
GAs: How do they work?
Find the total fitness of the population:
    F = Σ_{i=1}^{pop_size} eval(v_i)
Calculate the probability of selection p_i for each chromosome v_i:
    p_i = eval(v_i) / F
Calculate a cumulative probability q_i for each chromosome v_i:
    q_i = Σ_{j=1}^{i} p_j
Selection: spinning the roulette wheel pop_size times
Generate a random (float) number r from the range [0, 1]
If r < q_1, select the first chromosome v_1; otherwise select the i-th chromosome v_i (2 ≤ i ≤ pop_size) such that q_{i-1} < r ≤ q_i.
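The selection step can be sketched in a few lines of Python (a minimal sketch; the function name and sample fitness values are illustrative, not from the lecture):

```python
import random

def roulette_select(fitness, rng=random.random):
    """Spin the roulette wheel once: return the index i such that
    q_{i-1} < r <= q_i, where q_i are cumulative selection probabilities."""
    total = sum(fitness)          # total fitness F of the population
    r = rng()                     # random float from [0, 1]
    q = 0.0
    for i, f in enumerate(fitness):
        q += f / total            # q_i = q_{i-1} + p_i, with p_i = f_i / F
        if r <= q:
            return i
    return len(fitness) - 1       # guard against floating-point round-off

# Spin pop_size times to pick parents (with replacement):
fitness = [5.0, 1.0, 3.0, 1.0]    # illustrative fitness values
parents = [roulette_select(fitness) for _ in range(len(fitness))]
```

Fitter chromosomes occupy larger slices of the wheel, so they are selected more often, but every chromosome retains a nonzero chance.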
GAs: How do they work?
Crossover: crossover probability p_c; expected number of chromosomes undergoing crossover: p_c · pop_size
Generate a random number r from the range [0, 1]
If r < p_c, select the given chromosome for crossover
Mate selected chromosomes randomly: for each pair of coupled chromosomes, generate a random integer number pos from the range [1, m-1], where m is the total length, i.e., the total number of bits in a chromosome
Two chromosomes (b_1 … b_pos b_{pos+1} … b_m) and (c_1 … c_pos c_{pos+1} … c_m) are replaced by a pair of their offspring
(b_1 … b_pos c_{pos+1} … c_m) and (c_1 … c_pos b_{pos+1} … b_m)
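A minimal Python sketch of the one-point operator (names ours; `crossover_at` is a deterministic helper added only to reproduce the earlier 100|11110 × 101|10010 example):

```python
import random

def one_point_crossover(parent1, parent2, rng=random):
    """Swap the tails of two equal-length bit strings at a random
    cut point pos drawn from [1, m-1]."""
    m = len(parent1)
    pos = rng.randint(1, m - 1)
    return parent1[:pos] + parent2[pos:], parent2[:pos] + parent1[pos:]

def crossover_at(parent1, parent2, pos):
    """Deterministic variant, used to reproduce the slide's example."""
    return parent1[:pos] + parent2[pos:], parent2[:pos] + parent1[pos:]

# The slide's one-point example, cut after bit 3:
c1, c2 = crossover_at("10011110", "10110010", 3)  # -> "10010010", "10111110"
```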
Mutation: mutation probability p_m; expected number of mutated bits: p_m · m · pop_size
Generate a random number r from the range [0, 1]
If r < p_m, mutate the bit
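Bit-flip mutation can be sketched likewise (a minimal illustration; `mutate` is our name):

```python
import random

def mutate(chromosome, p_m, rng=random.random):
    """Flip each bit independently with probability p_m (the mutation rate)."""
    return "".join(
        ("1" if bit == "0" else "0") if rng() < p_m else bit
        for bit in chromosome
    )
```

With p_m = 0 the chromosome is returned unchanged; with p_m = 1 every bit is flipped.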
GAs: How do they work?
Ex) The problem: maximize f(x1, x2) = 21.5 + x1·sin(4πx1) + x2·sin(20πx2),
where -3.0 ≤ x1 ≤ 12.1 and 4.1 ≤ x2 ≤ 5.8.
Assume the precision is 4 decimal places for each variable.
x1: the range [-3.0, 12.1], of length 15.1, is divided into 15.1 × 10^4 = 151,000 equal-size ranges. Since 2^17 < 151,000 ≤ 2^18, 18 bits are required to represent the variable x1.
x2: the range [4.1, 5.8], of length 1.7, is divided into 1.7 × 10^4 = 17,000 equal-size ranges. Since 2^14 < 17,000 ≤ 2^15, 15 bits are required to represent the variable x2.
String (000...0) corresponds to a
String (111...1) corresponds to b
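The bit-count calculation generalizes to any range and precision; a small Python check (the helper name is ours):

```python
import math

def bits_required(a: float, b: float, places: int) -> int:
    """Smallest m such that 2^m codes cover (b - a) * 10^places equal-size ranges."""
    ranges = (b - a) * 10 ** places
    return math.ceil(math.log2(ranges))

m1 = bits_required(-3.0, 12.1, 4)   # 151,000 ranges -> 18 bits
m2 = bits_required(4.1, 5.8, 4)     # 17,000 ranges  -> 15 bits
```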
GAs: How do they work?
String (b_{m-1} … b_1 b_0) of length m corresponds to:
    x = a + decimal(b_{m-1} … b_1 b_0) · (b - a) / (2^m - 1)
Let us consider a string of 18 + 15 = 33 bits: (010001001011010000111110010100010)
The first 18 bits, 010001001011010000, represent x1 = -3.0 + 70352 · 15.1/(2^18 - 1) = 1.052426
The next 15 bits, 111110010100010, represent x2 = 4.1 + 31906 · 1.7/(2^15 - 1) = 5.755330
So the chromosome (010001001011010000111110010100010) corresponds to (x1, x2) = (1.052426, 5.755330)
The fitness value for this chromosome is f(1.052426, 5.755330)
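The decoding rule can be verified with a short Python sketch (the function name is ours):

```python
def decode(bits: str, a: float, b: float) -> float:
    """Map a binary string onto [a, b]: (000...0) -> a and (111...1) -> b."""
    m = len(bits)
    return a + int(bits, 2) * (b - a) / (2 ** m - 1)

chromosome = "010001001011010000111110010100010"
x1 = decode(chromosome[:18], -3.0, 12.1)   # first 18 bits
x2 = decode(chromosome[18:], 4.1, 5.8)     # remaining 15 bits
```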
GAs: How do they work?
To optimize the function f using a genetic algorithm, we create a population of pop_size = 20 chromosomes.
All 33 bits in all chromosomes are initialized randomly.
GAs: How do they work?
Evaluation:
GAs: How do they work?
Construction of a roulette wheel: Total fitness: F = Σ_{i=1}^{20} eval(v_i) = 387.776822
Probability of selection p_i = eval(v_i)/F for each chromosome v_i
GAs: How do they work?
Roulette wheel: spun according to the cumulative probabilities q_i.
Generated random numbers:
GAs: How do they work?
New population:
GAs: How do they work?
Crossover: assume p_c = 0.25
For each chromosome in the new population, generate a random number r:
GAs: How do they work?
After crossover
GAs: How do they work?
Mutation: assume p_m = 0.01
Expected # of mutated bits per generation: p_m · m · pop_size = 0.01 × 33 × 20 = 6.6 bits. For every bit in the population, generate a random number r; if r < 0.01, mutate the bit.
GAs: How do they work?
After mutations:
GAs: How do they work?
Evaluation of new population:
GAs: How do they work?
Note that the total fitness of the new population F (just after one generation) is 447.049688, much higher than total fitness of the previous population, 387.776822.
Also, the best chromosome now has a better evaluation (33.351874) than the best chromosome from the previous population (30.060205).
GAs: How do they work?
After 1000 generations...
GAs: How do they work?
Evaluation
Remark: Most values are over 30; the population starts to converge. However, in generation 396 the best chromosome had a value of 38.827553. What happened?
GAs: How do they work?
Classical Genetic Algorithms
Fixed-length binary strings
Two operators: crossover and mutation
Though nicely theorized, failed in many areas
Neatness: inability to deal with nontrivial constraints
Constraints are hard to implement
Too domain-independent to be useful in many applications
GAs: How do they work?
Hybrid GA by Davis: GA + current algorithm
Use the current encoding
Hybridize where possible: Incorporate the positive features of the current algorithm
Adapt the genetic operators: Create crossover and mutation operators for the new type of encoding,
incorporate domain-based heuristics
Similar to Evolution Programs (Z. Michalewicz), except for the assumption that one or more current (traditional) algorithms are available on the problem domain
GAs: Why do they work?
A schema: A template allowing exploration of similarities among chromosomes
Built by introducing a don’t care symbol (★) into the alphabet of genes
A schema:
S = (★★ 1 1 1 ★★★★★)
The schema S matches 2^7 = 128 strings:
(0 0 1 1 1 0 0 0 0 0)
(0 0 1 1 1 0 0 0 0 1)
(0 0 1 1 1 0 0 0 1 0)
(0 0 1 1 1 0 0 0 1 1)
...
(1 1 1 1 1 1 1 1 1 0)
(1 1 1 1 1 1 1 1 1 1)
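Schema matching is easy to check programmatically; in this Python sketch '*' stands in for the don't-care symbol ★ (names ours):

```python
def matches(schema: str, string: str) -> bool:
    """A string matches a schema if they agree at every fixed position
    ('*' plays the role of the don't-care symbol)."""
    return all(s == "*" or s == c for s, c in zip(schema, string))

schema = "**111*****"
# Count how many of the 2^10 binary strings of length 10 the schema matches:
count = sum(matches(schema, format(n, "010b")) for n in range(2 ** 10))
```

With 7 don't-care positions, `count` comes out to 2^7 = 128, as stated above.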
GAs: Why do they work?
Strings and schemata: each string of length m is matched by 2^m schemata.
Ex) Let us consider a string (1001110001). This string is matched by the following 2^10 schemata:
Order of a schema
The order of the schema S (denoted by o(S)) is the number of 0 and 1 positions, i.e., fixed (non-don’t-care) positions, present in the schema.
Ex) The following schema of length 10,
S = (★★★ 0 0 1 ★ 1 1 0),
has the order o(S) = 6.
Length of a schema
The defining length of the schema S (denoted by δ(S)) is the distance between the first and the last fixed string positions. It defines the compactness of information contained in a schema.
In the previous Ex, δ(S) = 10 - 4 = 6.
Note that a schema with a single fixed position has a defining length of zero.
Ex) S1 = (★★★ 0 0 1 ★ 1 1 0)
S2 = (★★★★ 0 0 ★★ 0 ★)
S3 = ( 1 1 1 0 1 ★★ 0 0 1)
o(S1) = 6, o(S2) = 3, and o(S3) = 8;
δ(S1) = 10 - 4 = 6, δ(S2) = 9 - 5 = 4, and δ(S3) = 10 - 1 = 9.
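Both quantities can be computed mechanically; a Python sketch using '*' for ★ (function names ours):

```python
def order(schema: str) -> int:
    """o(S): the number of fixed (non-'*') positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema: str) -> int:
    """delta(S): distance between the first and the last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

# The three schemata from the example, with '*' for the don't-care symbol:
s1, s2, s3 = "***001*110", "****00**0*", "11101**001"
```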
Fitness of a schema
ξ(S, t): the number of strings in a population at time t matched by a schema S.
eval(S, t): fitness of a schema S at time t, defined as the average fitness of all strings in the population matched by the schema S.
Assume there are p strings {v_{i_1}, …, v_{i_p}} in the population matched by a schema S at the time t. Then
    eval(S, t) = (Σ_{j=1}^{p} eval(v_{i_j})) / p.
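A sketch of this definition in Python (names ours; the tiny population and fitness function are illustrative):

```python
def schema_fitness(schema, population, eval_fn):
    """eval(S, t): average fitness of the population strings matched by S."""
    matched = [v for v in population
               if all(s == "*" or s == c for s, c in zip(schema, v))]
    return sum(eval_fn(v) for v in matched) / len(matched) if matched else 0.0

# Illustrative: fitness of schema "11*" over a tiny population,
# with each string's fitness taken as its decimal value.
population = ["111", "110", "000"]
avg = schema_fitness("11*", population, lambda v: int(v, 2))  # (7 + 6) / 2 = 6.5
```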
Schema and selection
After the selection step, we expect to have ξ(S, t+1) strings matched by schema S. Since
(1) for an average string matched by a schema S, the probability of its selection (in a single string selection) is equal to eval(S, t)/F(t),
(2) the number of strings matched by a schema S is ξ(S, t), and
(3) the number of single string selections is pop_size,
it is clear that
    ξ(S, t+1) = ξ(S, t) · pop_size · eval(S, t)/F(t),
where F(t) is the total fitness of the whole population at time t.
Reproductive schema growth equation
Considering the average fitness of the population, F̄(t) = F(t)/pop_size, we can rewrite the formula above as
    ξ(S, t+1) = ξ(S, t) · eval(S, t)/F̄(t).
This means that an “above average” schema receives an increasing number of strings in the next generation, a “below average” schema receives a decreasing number of strings, and an average schema stays on the same level.
If we assume that a schema S remains above average by ε, i.e.,
    ε = (eval(S, t) - F̄(t))/F̄(t),
where ε > 0 for above-average schemata and ε < 0 for below-average schemata, then
    ξ(S, t) = ξ(S, 0) · (1 + ε)^t.
Example
Assume pop_size = 20, the length of a string (of a schema template) m = 33, and the population consists of the following strings:
Example
For a given schema
S0 = (★★★★ 1 1 1 ★★★★★★★★★★★★★★★★★★★★★★★★★★),
ξ(S0, t) = 3, that is, S0 matches {v13, v15, v16}.
The order of the schema: o(S0) = 3. The defining length: δ(S0) = 7 - 5 = 2.
eval(S0, t) = (27.316702 + 30.060205 + 23.867227)/3 = 27.081378
F̄(t) = Σ eval(v_i)/pop_size = 387.776822/20 = 19.388841
eval(S0, t)/F̄(t) = 27.081378/19.388841 = 1.396751
At (t+1), 3 × 1.396751 ≈ 4.19 strings are expected to be matched by S0; at (t+2), 3 × 1.396751² ≈ 5.85 such strings.
Example
New population:
Indeed, the schema S0 at time (t+1) matches 5 strings: v'7, v'11, v'18, v'19, and v'20.
Effect of crossover
A string (1110111100) is matched by 2^10 schemata; in particular, it is matched by
S1 = (★★★★ 1 1 1 ★★★) and
S2 = (1 1 ★★★★★★ 0 0).
After the crossover with (1010111101), none of the offspring matches S2.
In general, a crossover site is selected uniformly among m - 1 possible sites. This implies that the probability of destruction of a schema S is
    p_d(S) = δ(S)/(m - 1),
and consequently, the probability of schema survival is
    p_s(S) ≥ 1 - δ(S)/(m - 1).
The above inequality comes from the fact that even if a crossover site is selected between fixed positions in a schema, there is still a chance for the schema to survive.
Combined effect: selection and crossover
The combined effect of selection and crossover gives us a new form of the reproductive schema growth equation:
    ξ(S, t+1) ≥ ξ(S, t) · eval(S, t)/F̄(t) · [1 - p_c · δ(S)/(m - 1)].
Effect of mutation
Since the probability of the alteration of a single bit is p_m, the probability of a single bit’s survival is 1 - p_m. A single mutation is independent of other mutations, so the probability of a schema S surviving a mutation (i.e., a sequence of one-bit mutations) is
    p_s(S) = (1 - p_m)^{o(S)}.
Since p_m << 1, this probability can be approximated by
    p_s(S) ≈ 1 - o(S) · p_m.
Combined effect of selection, crossover, and mutation
The combined effect of selection, crossover, and mutation gives us a new form of the reproductive schema growth equation:
    ξ(S, t+1) ≥ ξ(S, t) · eval(S, t)/F̄(t) · [1 - p_c · δ(S)/(m - 1) - o(S) · p_m].
Schema Theorem: Short, low-order, above-average schemata receive exponentially increasing trials in subsequent generations of a genetic algorithm.
Building Block Hypothesis: A genetic algorithm seeks near-optimal performance through the juxtaposition of short, low-order, high-performance schemata, called the building blocks.
“Just as a child creates magnificent fortresses through the arrangement of simple blocks of wood, so does a GA seek near optimal performance through the juxtaposition of short, low-order, high performance schemata.”
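The lower bound in the growth equation can be evaluated numerically; this sketch (function name ours) plugs in numbers from the lecture's running example: schema S0 with ξ(S0, t) = 3, δ(S0) = 2, o(S0) = 3, m = 33, and the run's parameters p_c = 0.25, p_m = 0.01:

```python
def schema_growth_bound(xi, eval_s, avg_fitness, delta, order, m, p_c, p_m):
    """Schema-theorem lower bound on xi(S, t+1):
    xi(S,t) * eval(S,t)/F_bar(t) * [1 - p_c*delta(S)/(m-1) - o(S)*p_m]."""
    return xi * (eval_s / avg_fitness) * (1 - p_c * delta / (m - 1) - order * p_m)

bound = schema_growth_bound(
    xi=3, eval_s=27.081378, avg_fitness=19.388841,
    delta=2, order=3, m=33, p_c=0.25, p_m=0.01,
)
```

The bound (≈ 4.0) sits slightly below the selection-only expectation of 4.19, reflecting the possible disruption of S0 by crossover and mutation.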
Implicit Parallelism
Holland showed that at least pop_size³ schemata are processed usefully; he called this property implicit parallelism, as it is obtained without any extra memory/processing requirements. It is interesting to note that in a population of pop_size strings there are many more than pop_size schemata represented. This constitutes possibly the only known example of a combinatorial explosion working to our advantage instead of our disadvantage.
The building block hypothesis is just an article of faith, which for some problems is easily violated: deception.
The phenomenon of deception is strongly connected with the concept of epistasis, which means strong interaction among genes in a chromosome.
Deception
S1 = ( 1 1 1 ★★★★★★★★ )
S2 = (★★★★★★★★★ 1 1 )
Their combination, S3 = ( 1 1 1 ★★★★★★ 1 1 ), is much less fit than
S4 = ( 0 0 0 ★★★★★★ 0 0 ).
Optimal solution (S3 matches it): S0 = ( 1 1 1 1 1 1 1 1 1 1 1 )
Deception: difficulty in converging to S0, since the algorithm tends to converge to S4.
Three approaches were proposed to deal with deception:
1. Prior knowledge of the objective function, to code it in an appropriate way.
2. Use a third operator, inversion: it selects two points within a string and inverts the order of bits between the selected points, but remembering each bit’s ‘meaning.’
3. Use a messy genetic algorithm (mGA).