

FDA- A scalable evolutionary algorithm for the optimization of ADFs

By Hossein Momeni

Page 2

Outline

• Factorization Theorem
• FDA
• Analysis of FDA for large populations
• Boltzmann and truncation selection
• Finite and critical population
• Numerical results
• LFDA

Factorized Distribution Algorithm

Iran University of Science and Technology, November 2006

Page 3

Introduction
• In a deceptive function the global optimum x=(1,…,1) is isolated.
• The neighbors of the second-best point x=(0,…,0) have large fitness values.
• GAs are deceived by the fitness distribution.
• Most GAs will converge to x=(0,…,0).
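To make the deception concrete, here is a small Python sketch of a deceptive sub-function on one block of bits. The slides do not show the exact function, so the shape of `trap` below is an assumption, chosen only so that the all-ones block is the isolated optimum and the all-zeros block is the second-best point.

```python
def trap(block):
    """Deceptive sub-function on a block of k bits (assumed shape, not from the slides)."""
    k = len(block)
    ones = sum(block)
    # All ones is the isolated optimum; otherwise fewer ones means higher fitness.
    return k if ones == k else k - 1 - ones


best = trap([1, 1, 1, 1, 1])         # global optimum
second = trap([0, 0, 0, 0, 0])       # second-best point
near_best = trap([1, 1, 1, 1, 0])    # neighbor of the optimum
near_second = trap([1, 0, 0, 0, 0])  # neighbor of the second-best point
```

In this sketch the neighbor of the optimum has the lowest fitness, while the neighbor of the all-zeros point is still good, so a search that follows fitness is pulled towards x=(0,…,0).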


Page 4

Solutions
• Mathematical methods are suitable for optimizing deceptive functions.
• Consider additively decomposed functions (ADFs).
• The Sj are non-overlapping substrings of X with k elements.
• This class of functions is of great theoretical and practical importance.
• Optimization of an arbitrary function in this space is NP-complete.


Page 5

ADF Optimization Approaches
• Adaptive recombination
• Explicit detection of relations (Kargupta & Goldberg, 97)
• Dependency trees (Baluja & Davies, 97)
• Bivariate marginal distributions (Pelikan & Mühlenbein, 98)
• Estimation of distributions (Mühlenbein et al., 1997)


Page 6

ADF

• Definition: An additively decomposed function (ADF) is defined by
$$f(x) = \sum_{s_i \in S} f_i(x_{s_i}), \qquad S = \{s_1, s_2, \ldots, s_l\}, \quad s_i \subseteq X$$
• For the theoretical analysis, the Boltzmann distribution is used.
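As an illustration of the definition, the following Python sketch evaluates an ADF as a sum of sub-functions, each reading only the variables in its own index set. The helper name `adf`, the representation of the sets as index tuples, and the three example sub-functions are assumptions made for this example.

```python
def adf(x, subfunctions):
    """Evaluate f(x) as the sum of f_i applied to the variables indexed by s_i.

    `subfunctions` is a list of (s_i, f_i) pairs: s_i is a tuple of variable
    indices and f_i maps the tuple of selected values to a number.
    """
    return sum(f_i(tuple(x[j] for j in s_i)) for s_i, f_i in subfunctions)


# Hypothetical example: three overlapping sets on 4 binary variables.
subfunctions = [
    ((0, 1), lambda v: v[0] + v[1]),
    ((1, 2), lambda v: 2 * v[0] * v[1]),
    ((2, 3), lambda v: 1 - abs(v[0] - v[1])),
]

value = adf([1, 1, 0, 0], subfunctions)  # 2 + 0 + 1
```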


Page 7

Gibbs or Boltzmann distribution
• Definition: The Gibbs or Boltzmann distribution of a function f is defined for u ≥ 1 by
$$p(x) := \frac{\mathrm{Exp}_u f(x)}{F_u}, \qquad \mathrm{Exp}_u f(x) := u^{f(x)}$$
• $F_u = \sum_y \mathrm{Exp}_u f(y)$ is the partition function.
• The larger the function value f(x), the larger p(x).
• Such a search distribution is suitable for an optimization problem.
• Computing it directly requires exponential effort.
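The exponential cost is easy to see in a brute-force sketch. The Python below assumes the weight of a point is u raised to f(x) and enumerates all 2^n binary strings to obtain the partition function; the function names and the OneMax test function are chosen for illustration only.

```python
from itertools import product


def boltzmann(f, n, u):
    """Return {x: p(x)} with p(x) = u**f(x) / F_u over all binary strings of length n."""
    weights = {x: u ** f(x) for x in product((0, 1), repeat=n)}
    partition = sum(weights.values())  # F_u: a sum over all 2**n points
    return {x: w / partition for x, w in weights.items()}


def onemax(x):
    return sum(x)


p = boltzmann(onemax, n=3, u=2.0)
uniform = boltzmann(onemax, n=3, u=1.0)  # u = 1 gives equal weights
```

With u=2 the all-ones string gets probability 8/27, and with u=1 every string gets 1/8. The dictionary has 2^n entries, which is why this direct computation does not scale.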


Page 8

Reducing the cost of computing the B.D.


1) Approximate the Boltzmann distribution (simulated annealing).

2) Look for ADFs whose distribution can be computed in polynomial time.

• Factorize the distribution into a product of marginal and conditional probabilities (used by FDA).

Page 9

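Input sets for the Factorization Theorem

Definition: Let S={s1,s2, …, sl}. For i=1, 2,…, l define
$$d_i := \bigcup_{j=1}^{i} s_j, \qquad b_i := s_i \setminus d_{i-1}, \qquad c_i := s_i \cap d_{i-1}, \qquad d_0 := \emptyset$$

In the theory of decomposable graphs:

the d_i are called histories

the b_i are called residuals

the c_i are called separators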
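The three families of sets can be computed in one pass over the ordered sets. The sketch below assumes the usual definitions from decomposable graph theory: the history d_i is the union of s_1 to s_i, the residual b_i is s_i without d_{i-1}, and the separator c_i is the intersection of s_i with d_{i-1}. The function name and the example chain are hypothetical.

```python
def factorization_sets(sets):
    """Return lists (d, b, c) of histories, residuals and separators for ordered sets."""
    d, b, c = [], [], []
    history = frozenset()  # d_0 is the empty set
    for s in sets:
        s = frozenset(s)
        b.append(s - history)   # residual: the new variables introduced by s_i
        c.append(s & history)   # separator: the variables shared with earlier sets
        history = history | s   # history: everything seen up to and including s_i
        d.append(history)
    return d, b, c


# Hypothetical chain of overlapping sets on variables 1..4.
d, b, c = factorization_sets([{1, 2}, {2, 3}, {3, 4}])
```

For this chain the residuals are {1,2}, {3}, {4} and the separators are the empty set, {2}, {3}, so the distribution would be written as p(x1,x2) p(x3|x2) p(x4|x3).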


Page 10

Factorization Theorem

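Theorem 1: Let p(x) be a Boltzmann distribution on X.

If
$$b_i \neq \emptyset \;\; \forall i = 1, \ldots, l; \qquad d_l = X; \qquad \forall i \geq 2 \;\; \exists j < i \text{ such that } c_i \subseteq s_j$$

then
$$p(x) = \prod_{i=1}^{l} p(x_{b_i} \mid x_{c_i})$$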


Page 11

FDAr

S0: Set t=0. Generate (1−r)·N ≫ 0 points randomly and r·N points according to Equation 16.

S1: Selection.

S2: Compute the conditional probabilities $p^s(x_{b_i} \mid x_{c_i}, t)$ using the selected points.

S3: Generate a new population according to
$$p(x, t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$$

S4: If the termination criterion is met, finish.

S5: Add the best point of the previous generation to the generated points (elitist).

S6: Set t=t+1 and go to S2.
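The loop can be sketched in Python for the simplest possible factorization, where every set contains a single variable and there are no separators, so the product reduces to univariate marginals. This is only an illustration under that assumption: it uses truncation selection, a fully random initial population (r=0), and performs selection in every generation. The function name and all default parameters are hypothetical.

```python
import random


def fda_univariate(f, n, pop_size=200, tau=0.3, max_gen=100, seed=0):
    """FDA-style loop for the special case of one variable per set (no separators)."""
    rng = random.Random(seed)
    # Initialization: random population (r = 0 in this sketch).
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(max_gen):
        best = max(pop, key=f)
        # Selection: truncation selection keeps the best tau * N points.
        selected = sorted(pop, key=f, reverse=True)[:max(1, int(tau * pop_size))]
        # Estimate the marginals p^s(x_i = 1, t) from the selected points.
        p = [sum(x[i] for x in selected) / len(selected) for i in range(n)]
        # Sample the new population from the product of the marginals.
        pop = [[1 if rng.random() < q else 0 for q in p] for _ in range(pop_size)]
        # Termination: every marginal is fixed, so the distribution cannot change.
        if all(q in (0.0, 1.0) for q in p):
            break
        # Elitism: keep the best point of the previous generation.
        pop.append(best)
    return max(pop, key=f)


best = fda_univariate(sum, n=20)  # OneMax on 20 bits
```

A general FDA would replace the marginal estimate and the sampling step by conditional probability tables over the residuals given the separators.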


Page 12

Analysis of Factorization Algorithm
• The computational complexity depends on the factorization and the population size N.
• Number of function evaluations: FE = GENe·N, where GENe is the number of generations until convergence, p(x,t+1)=p(x,t).
• The computational complexity of computing N new search points is
$$\mathrm{compl}(N\,\text{points}) \approx l \cdot N$$
• The computational complexity of computing the probabilities is
$$\mathrm{compl}(p) \approx \Big(\sum_{i=1}^{l} 2^{|s_i|}\Big) \cdot M$$


Page 13

Analysis of … (Contd)
• The computational cost of FDA depends on:
1) the number of sub-functions in the decomposition (l)
2) the size of the defining sets (si)
3) the number of selected points (M)
• An infinite population is needed to compute the distribution exactly.
• A numerically efficient FDA should use a minimal population size N*.
• Computing N* is a difficult problem for any search method that uses a population of points.


Page 14

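FDA-FAC
• S0: Set i=1, where $\tilde{s}_i$ is the set of a non-linear sub-function.
• S1: Compute $\tilde{d}_i := \bigcup_{j=1}^{i} \tilde{s}_j$.
• S2: Select the $s_k$ that has maximal overlap with $\tilde{d}_i$ and $s_k \not\subseteq \tilde{d}_i$.
• S3: If no such set is found, go to S5.
• S4: Set $\tilde{s}_{i+1} := s_k$, $i := i+1$. If i < l, go to S1.
• S5: Compute the factorization using Eq. 6 with the sets $\tilde{s}_i$.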
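The greedy ordering performed by FDA-FAC can be sketched as follows. The sketch assumes the sets are given as Python sets of variable indices and that the caller chooses the starting set through the hypothetical `start` parameter; ties in overlap are broken by input order, which the slide does not specify.

```python
def fda_fac(sets, start=0):
    """Order the sets greedily by maximal overlap with the union of the sets chosen so far."""
    sets = [frozenset(s) for s in sets]
    ordered = [sets[start]]  # first set, chosen by the caller
    while len(ordered) < len(sets):
        history = frozenset().union(*ordered)  # union of the sets chosen so far
        # Candidates are the sets not already contained in the history.
        candidates = [s for s in sets if not s <= history]
        if not candidates:  # no set found: stop and use what we have
            break
        # Append the candidate with the largest overlap with the history.
        ordered.append(max(candidates, key=lambda s: len(s & history)))
    return ordered  # these ordered sets are then used in the factorization


order = fda_fac([{1, 2}, {3, 4}, {2, 3}])
```

Here {2,3} is picked before {3,4} because it shares a variable with the history {1,2}, which yields a chain with non-empty separators.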


Page 15

Generation of Initial Population

• Normally the initial population is generated randomly.

• With an ADF, the initial points can be generated using the information in the decomposition.

• Generate subsets with high local fitness values.
• The resulting distribution is an approximation of the Boltzmann distribution.
• The conditional probabilities are computed using the local fitness functions.


Page 16

Generation of Initial Population….

• The larger u, the steeper the distribution.
• If u=1 the distribution is uniform.


Page 17

Generation of Initial Population….
• For OneMax(n)=∑xi, FDA computes span=1 and u=10.
• There will be 10 times more 1s than 0s in the initial population.
• Such an initial population might not give a B.D.
• Only half of the population is generated by this method.
• The other half is generated randomly.
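A sketch of this mixed initialization for OneMax is given below. It assumes that each bit is set to 1 with probability u/(1+u), which is what weights proportional to u raised to the local fitness give for a sub-function that simply returns the bit; with u=10 this produces ten times as many 1s as 0s in the biased half. The function name and parameters are hypothetical.

```python
import random


def initial_population(n, pop_size, u=10.0, seed=0):
    """Half of the points are biased towards 1 with p(x_i = 1) = u / (1 + u), half are uniform."""
    rng = random.Random(seed)
    p_one = u / (1.0 + u)  # for OneMax: u**1 / (u**0 + u**1)
    biased = [[1 if rng.random() < p_one else 0 for _ in range(n)]
              for _ in range(pop_size // 2)]
    uniform = [[rng.randint(0, 1) for _ in range(n)]
               for _ in range(pop_size - pop_size // 2)]
    return biased + uniform


pop = initial_population(n=30, pop_size=100)
ones_in_biased_half = sum(sum(x) for x in pop[:50]) / (50 * 30)
```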


Page 18

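Convergence of FDA
• If points are selected according to the Boltzmann distribution, convergence of FDA can be proved.
• The distribution ps of the selected points is given by
$$p^s(x,t) = \frac{p(x,t)\, v^{f(x)}}{\sum_y p(y,t)\, v^{f(y)}}$$
where v is the basis of Boltzmann selection.
• If p(x,t) is a B.D., then ps(x,t) is a B.D.
• FDA computes new search points according to
$$p(x,t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$$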


Page 19

• Theorem 2: If the initial points are distributed according to $p(x,0) = u^{f(x)}/F_u$ with u ≥ 1, then for FDA the distribution at generation t is given by
$$p(x,t) = \frac{w^{f(x)}}{F_w} \quad \text{with} \quad w = u \cdot v^t$$

Tip: B. selection with fixed basis v>1 defines an annealing schedule with
$$T(t) = \frac{1}{t \ln(v) + \ln(u)}$$
where t is the number of generations.

Theorem 3 remains valid for any annealing schedule with $\lim_{t \to \infty} T(t) = 0$.
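The schedule follows from rewriting a Boltzmann weight with basis w as exp(f(x)/T), which gives T = 1/ln(w); with w = u·v^t this becomes T(t) = 1/(t·ln(v) + ln(u)). The short Python check below assumes exactly this reading; the values of u and v are illustrative and not taken from the slides.

```python
import math


def temperature(t, u, v):
    """T(t) = 1 / (t*ln(v) + ln(u)) for Boltzmann selection with fixed basis v > 1."""
    return 1.0 / (t * math.log(v) + math.log(u))


u, v = 2.0, 1.2
w = u * v ** 5  # basis of the distribution after 5 generations
same = math.isclose(temperature(5, u, v), 1.0 / math.log(w))
cooling = [temperature(t, u, v) for t in range(4)]  # decreasing temperatures
```

Note that u=1 at t=0 corresponds to the uniform distribution and an infinite temperature, so the function is only defined once t·ln(v) + ln(u) is positive.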


Page 20

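• Theorem 3 (Convergence): Let $X_{opt} = \{x_{1opt}, x_{2opt}, \ldots\}$ be the set of optima. Then, based on Theorem 2,
$$\lim_{t \to \infty} p(x,t) = \begin{cases} 1/|X_{opt}| & x \in X_{opt} \\ 0 & \text{otherwise} \end{cases}$$

• FDA with B. selection is an exact simulated annealing algorithm.

• Simulated annealing is controlled by two parameters: N(T) and the annealing schedule.

• N can be called the population size.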


Page 21

Truncation Selection vs. B. Selection

• Numerically, truncation selection is easier to implement.
• With truncation threshold τ, the best τ·N individuals are selected.
• The conditional probabilities of the selected points are $p^s(x_{b_i} \mid x_{c_i}, t)$.
• Based on the factorization theorem, new search points are generated according to
$$p(x, t+1) = \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$$
• Problem: after truncation selection the distribution is not a B.D., therefore
$$p^s(x,t) \neq \prod_{i=1}^{l} p^s(x_{b_i} \mid x_{c_i}, t)$$
• With this inequality, in general
$$p(x_{opt}, t+1) \neq p^s(x_{opt}, t)$$
which makes a convergence proof difficult.


Page 22

Theoretical Analysis for Infinite populations

• For the analysis, two linear functions are investigated:
$$\mathrm{OneMax}_n(x) = \sum_{i=1}^{n} x_i, \qquad \mathrm{Int}_n(x) = \sum_{i=1}^{n} 2^{i-1} x_i$$
• OneMax has n+1 different fitness values, which are multinomially distributed.
• Int has 2^n different fitness values.
• For ADFs the multinomial distribution is typical.
• The distribution generated by Int is more special.
• Both functions are linear, so the following factorization can be used:
$$p(x, t+1) = \prod_{i=1}^{n} p^s(x_i, t)$$


Page 23

• Theorem 4: For B. selection with basis v, the probability distribution for OneMax is given by
$$p(x,t) = \frac{v^{t f(x)}}{(1+v^t)^n}$$
• The number of generations needed to generate the optimum is given by
$$GEN \approx \frac{\ln(n)}{\ln(v)}$$
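The logarithmic growth in n can be checked numerically. The sketch below assumes the OneMax distribution p(x,t) = v^(t·f(x)) / (1+v^t)^n, so the all-ones string has probability (v^t/(1+v^t))^n. The tolerance `eps` and the refined estimate ln(n/eps)/ln(v) are assumptions of this sketch, obtained from the approximation (1+v^(-t))^(-n) ≈ 1 − n·v^(-t); they do not appear on the slide.

```python
import math


def p_opt(n, v, t):
    """Probability of the all-ones string: v**(t*n) / (1 + v**t)**n."""
    return (v ** t / (1.0 + v ** t)) ** n


def gen_estimate(n, v, eps):
    """Approximate number of generations until the optimum has probability 1 - eps."""
    return math.log(n / eps) / math.log(v)


n, v, eps = 32, 1.2, 0.05  # illustrative values
t = gen_estimate(n, v, eps)
prob = p_opt(n, v, t)  # should be close to 1 - eps
```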


Page 24

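• Theorem 5: For truncation selection with threshold τ and selection intensity $I_\tau$, the marginal probability p(t) obeys for OneMax
$$p(t+1) - p(t) = \frac{I_\tau}{n} \sqrt{n\, p(t)\,(1 - p(t))}$$
• The approximate solution of this equation is
$$p(t) = 0.5\left(1 + \sin\left(\frac{I_\tau}{\sqrt{n}}\, t + \arcsin(2p_0 - 1)\right)\right)$$
where $p_0 = p(0)$.
• The number of generations until convergence is given by
$$GEN_e = \left(\frac{\pi}{2} - \arcsin(2p_0 - 1)\right) \frac{\sqrt{n}}{I_\tau}$$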
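The difference equation and its approximate solution can be compared directly. The Python sketch below assumes the recurrence p(t+1) = p(t) + (I/n)·sqrt(n·p(t)·(1−p(t))), the sine-shaped closed form, and the convergence time (π/2 − arcsin(2·p0 − 1))·sqrt(n)/I. The values of n, I and p0 are illustrative and not taken from the slides.

```python
import math


def iterate(n, intensity, p0, steps):
    """Iterate p(t+1) = p(t) + (I/n) * sqrt(n * p(t) * (1 - p(t))), capped at 1."""
    p = p0
    for _ in range(steps):
        p = min(1.0, p + intensity / n * math.sqrt(n * p * (1.0 - p)))
    return p


def closed_form(n, intensity, p0, t):
    """Approximate solution 0.5 * (1 + sin(I*t/sqrt(n) + arcsin(2*p0 - 1)))."""
    phase = intensity * t / math.sqrt(n) + math.asin(2.0 * p0 - 1.0)
    return 0.5 * (1.0 + math.sin(phase))


def generations(n, intensity, p0):
    """Generations until convergence: (pi/2 - arcsin(2*p0 - 1)) * sqrt(n) / I."""
    return (math.pi / 2.0 - math.asin(2.0 * p0 - 1.0)) * math.sqrt(n) / intensity


n, intensity, p0 = 100, 1.0, 0.5  # illustrative values
gen = generations(n, intensity, p0)
gap = abs(iterate(n, intensity, p0, 5) - closed_form(n, intensity, p0, 5))
```

The closed form reaches p=1 exactly at `gen`, and the iterated recurrence stays close to it over the first few generations, which is the sense in which the solution is approximate.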


Page 25


Page 26

Comparison of Truncation and B. Selection

• T.S. needs more generations to converge than B.S.

• GENe is of order O(ln(n)) for B.S. and of order O(√n) for T.S.

• If the basis v is small (e.g. v=1.2), T.S. converges faster.


Page 27

• B.S. with fixed v gives an annealing schedule of $T(t) = 1/(t \ln(v) + \ln(u))$.


Page 28

• FDA with truncation selection generates a B.D. with annealing schedule

• The annealing schedule depends on the average fitness and the variance of the population.


Page 29


Page 30


Page 31


• For Int the B.D. is concentrated around the optimum.
• The selected population has little diversity.
• In a finite population this causes a problem: some genes get fixed to the wrong alleles.

Page 32

Analysis of FDA for Finite Populations


In a finite population, the convergence of FDA can only be probabilistic.

Page 33

Analysis of FDA for Finite Populations


Figure: Cumulative fixation probability for Int(16), truncation selection vs. Boltzmann selection with v=1.01