Introduction to Evolutionary Computation

Genetic algorithms are inspired by the biological processes of reproduction and natural selection. Natural selection determines which members of a population survive to reproduce, and reproduction ensures that the species will continue.

Post on 20-Dec-2015


Introduction to Evolutionary Computation (Applications)

Aircraft Design, Routing in Communications Networks, Game Playing (Checkers [Fogel]), Robotics, Air Traffic Control, Design, Scheduling, Machine Learning, Pattern Recognition

Introduction to Evolutionary Computation (Applications cont.)

Job Shop Scheduling, VLSI Circuit Layout, Market Forecasting, Design of Filters and Barriers, Data-Mining, User-Mining, Resource Allocation, Path Planning, etc.

Evolutionary Database Optimization

Evolutionary splitting in R-Trees for Spatial Databases

Evolutionary Dynamic Clustering for Spatial Databases

Evolutionary Optimization in Distributed Databases

Structure of R-Trees

The Minimum Bounding Rectangle (MBR) of each object is stored in the R-Tree

Leaf entries: (PO, MBR(PO))

Internal node entries: (PC, MBR(PC))

Parameters:

M – the maximum number of entries within a node

m – the minimum number of entries within a node
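The idea of a minimum bounding rectangle can be illustrated with a short sketch. The `Rect` class and the `mbr` helper below are hypothetical names chosen for this example, and the rectangles are assumed to be two-dimensional and axis-aligned; this is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle given by its lower-left and upper-right corners."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def mbr(rects: Iterable[Rect]) -> Rect:
    """Return the smallest rectangle that encloses every rectangle in `rects`."""
    rects = list(rects)
    if not rects:
        raise ValueError("cannot bound an empty collection")
    return Rect(
        min(r.x_min for r in rects),
        min(r.y_min for r in rects),
        max(r.x_max for r in rects),
        max(r.y_max for r in rects),
    )


# Two leaf-level MBRs and the MBR an internal node would store for them.
leaf_entries = [Rect(0, 0, 2, 1), Rect(1, 2, 4, 3)]
parent_mbr = mbr(leaf_entries)
```

An internal node entry would then pair a child pointer with `parent_mbr`, in the same way a leaf entry pairs an object pointer with the object's own MBR.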

Evolutionary node splitting in R-Trees

Generalization of the node splitting

Minimization of the area and the overlapping of MBRs (Minimum Bounding Rectangles)

Evolutionary splitting in R-Trees Algorithm (ESRA)

The M+1 objects that have to be distributed to MBRs are denoted by 1, 2, ..., M, M+1.

A maximum of [M/m] MBRs can be used.

A potential solution of the problem (a chromosome) is a string of constant length $\{x_1, x_2, \ldots, x_{M+1}\}$, $x_i \in \{1, 2, \ldots, [M/m]\}$, where the gene $x_i$ indicates to which MBR the object $i$ belongs.

Fitness function:

$$f : X \to \mathbb{R}, \quad f = T(x) + O(x)$$

Generalization:

$$f_G = \alpha T(x) + \beta O(x), \quad \alpha, \beta \in [0, 1]$$
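A minimal sketch of how a chromosome of this kind could be scored. It assumes that T(x) is the total area of the resulting MBRs and O(x) the total pairwise overlap between them, which matches the stated goal of minimizing area and overlap but is not spelled out in the slides. The function names and the tuple layout `(x_min, y_min, x_max, y_max)` are illustrative.

```python
from itertools import combinations


def bounding_rect(rects):
    """MBR of a list of (x_min, y_min, x_max, y_max) rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def overlap_area(a, b):
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0


def split_fitness(chromosome, objects, alpha=1.0, beta=1.0):
    """alpha * T(x) + beta * O(x); gene i names the MBR that object i joins."""
    groups = {}
    for gene, rect in zip(chromosome, objects):
        groups.setdefault(gene, []).append(rect)
    mbrs = [bounding_rect(g) for g in groups.values()]
    total_area = sum(area(r) for r in mbrs)                       # assumed T(x)
    total_overlap = sum(overlap_area(a, b)
                        for a, b in combinations(mbrs, 2))        # assumed O(x)
    return alpha * total_area + beta * total_overlap


# Four unit squares at the corners of a 3 x 3 region.
objects = [(0, 0, 1, 1), (2, 0, 3, 1), (0, 2, 1, 3), (2, 2, 3, 3)]
row_split = split_fitness([1, 1, 2, 2], objects)       # two disjoint strips
diagonal_split = split_fitness([1, 2, 2, 1], objects)  # two fully overlapping MBRs
```

With the default weights the row-wise split scores lower than the diagonal one, so a minimizing search would prefer it. The sketch does not enforce the minimum occupancy m; a full implementation would have to reject or repair chromosomes that violate it.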

[Figure: average fitness (vertical axis) against the maximum size of the rectangles (horizontal axis), comparing ESRA with the R-tree algorithm]

The fitness function is to be minimized. The fitness value obtained with the R-Tree algorithm is deterministic, while the fitness value for ESRA is taken as the average over 100 executions. The minimum usage of an R-Tree node is m = 16.

Evolutionary Dynamic Clustering for Spatial Databases

Virgo - cluster of galaxies

The proposed algorithm optimizes the cluster centers and determines their number

The population contains the cluster prototypes

The quality of a chromosome is computed by taking into account the distances between prototypes and all the other objects that need to be grouped

Individuals that are very close to each other are merged, so the population size decreases

The final population contains only the cluster centers
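The merging step described above can be sketched as follows. The slides do not say how closeness is measured or how two individuals are combined, so the Euclidean distance, the `threshold` parameter and the replacement of merged prototypes by their mean are all assumptions made for illustration.

```python
import math


def merge_close_prototypes(prototypes, threshold):
    """Greedily merge 2-D prototypes that lie within `threshold` of a group mean."""
    groups = []  # each group is [sum_x, sum_y, count]
    for x, y in prototypes:
        for g in groups:
            cx, cy = g[0] / g[2], g[1] / g[2]  # current mean of the group
            if math.hypot(x - cx, y - cy) < threshold:
                g[0] += x
                g[1] += y
                g[2] += 1
                break
        else:
            # No group is close enough: this prototype starts a new one.
            groups.append([x, y, 1])
    return [(sx / n, sy / n) for sx, sy, n in groups]


population = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.0, 5.5)]
centers = merge_close_prototypes(population, threshold=1.0)
```

Here four prototypes collapse into two, which is how the population size can shrink toward the number of clusters without that number being fixed in advance.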

Evolutionary Dynamic Clustering for Spatial Databases

P – prototype

n – total number of objects

$x_i$ – the i-th object

$$f(P) = \sum_{i=1}^{n} \frac{1}{d(x_i, P) + \alpha}, \quad P \in C$$

Evolutionary Dynamic Clustering for Spatial Databases

Existing   Quadratic Cost      R*-Tree      ESRA                           EDCA
clusters   R-Tree Algorithm    Algorithm

3          63156               63156        22672 (3 detected clusters)    22672 (3 detected clusters)
3          58118               58118        23919 (3 detected clusters)    22427 (4 detected clusters)
4          89736               89736        45151 (4 detected clusters)    45151 (4 detected clusters)
4          87943               87943        31459 (4 detected clusters)    28166 (5 detected clusters)
5          112485              112485       43111 (5 detected clusters)    43111 (5 detected clusters)
5          61846               61846        29684 (5 detected clusters)    29684 (5 detected clusters)
6          113598              127792       59896 (5 detected clusters)    42425 (6 detected clusters)
6          105338              105338       65716 (5 detected clusters)    48245 (6 detected clusters)

The quality of the solutions obtained with the Quadratic Cost Algorithm, R*-Tree, ESRA and EDCA for 8 sets of grouped spatial objects

Evolutionary Optimization in Distributed Databases

The design relies on a graph representation, with system management improved through the use of intelligent agents

an initial estimated cost that the system designer can assign to an edge; this cost is estimated from the network transfer rate, the data access time and the computing power of a site

an up-to-date cost computed by agents that gather statistics on query frequency

a potential solution for the problem (a chromosome) is a string of constant length $\{x_1, x_2, \ldots, x_s\}$, $x_i \in \{1, 2, \ldots, n\}$, where the value of the gene $x_i$ indicates to which node of the graph the group of tuples $i$ belongs ($n$ is the number of nodes, $s$ the number of tuple groups)

$$F : X \to \mathbb{R}, \quad F(x) = \sum_{i=1}^{n} \sum_{j=1}^{s} \mathit{nof}_{ij} \, c_{i N_j}$$
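A minimal sketch of how an allocation chromosome could be scored. It assumes the fitness adds up, for every node and every tuple group, the number of requests issued at that node for that group multiplied by the cost between the requesting node and the node hosting the group. The names `requests`, `cost` and `total_request_cost`, the 0-based node indices and the zero cost for local access are assumptions of this example; the exact cost model on the slide may differ.

```python
def total_request_cost(chromosome, requests, cost):
    """Total cost of serving all requests under a given allocation.

    chromosome[j] -- index of the node that hosts tuple group j
    requests[i][j] -- number of requests issued at node i for tuple group j
    cost[i][k]     -- cost of reaching node k from node i (0 when i == k)
    """
    return sum(
        requests[i][j] * cost[i][host]
        for i in range(len(requests))
        for j, host in enumerate(chromosome)
    )


# Two nodes, two tuple groups, symmetric edge cost of 4.
cost = [[0, 4],
        [4, 0]]
requests = [[10, 1],   # node 0 mostly asks for group 0
            [2, 8]]    # node 1 mostly asks for group 1

far_allocation = total_request_cost([1, 0], requests, cost)   # groups placed away from their users
near_allocation = total_request_cost([0, 1], requests, cost)  # groups placed next to their users
```

Placing each group on the node that requests it most gives the lower total, which is the kind of reallocation the evolutionary search is meant to discover.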

Evolutionary Fragmentation and Allocation in DDBS

Experimental results

The graph and the new costs associated with the edges between nodes are given

Experimental results - contd

TABLE I. INITIAL DATASET FRAGMENTATION AND DISTRIBUTION

Node   Number of tuples   Tuples

n1     170,000            t1 - t170,000
n2     210,000            t170,001 - t380,000
n3     50,000             t380,001 - t430,000
n4     160,000            t430,001 - t590,000
n5     210,000            t590,001 - t800,000

TABLE II. DATASET RE-FRAGMENTATION AND RE-DISTRIBUTION

Node   Number of tuples   Tuples

n1     180,000            t1 - t100,000, t260,001 - t330,000, t480,001 - t490,000
n2     110,000            t100,001 - t110,000, t170,001 - t230,000, t330,001 - t340,000, t600,001 - t630,000
n3     270,000            t110,001 - t140,000, t230,001 - t250,000, t380,001 - t470,000, t490,001 - t520,000, t630,001 - t730,000
n4     160,000            t140,001 - t170,000, t470,001 - t480,000, t520,001 - t590,000, t740,001 - t790,000
n5     80,000             t250,001 - t260,000, t340,001 - t380,000, t590,001 - t600,000, t730,001 - t740,000, t790,001 - t800,000

Experimental results - contd

We compare the total cost of the requests under the initial fragmentation and distribution with the total cost of the requests after the reallocation of tuples in the graph.

For the given example, the total cost of the requests in the graph with the initial distribution is 38.5 mil., while the total cost of the requests after the reallocation made by the proposed algorithm is 22.2 mil.

(47.5% improvement)
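As a check on the figures quoted above, the relative reduction can be computed directly from the two rounded totals:

```latex
\[
\frac{38.5 - 22.2}{38.5} = \frac{16.3}{38.5} \approx 0.423
\]
```

The rounded totals therefore give a reduction of about 42.3%, somewhat below the 47.5% stated on the slide. The stated figure may rest on unrounded costs or on a different baseline that the transcript does not preserve.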

Conclusions

Evolutionary technique for splitting R-Tree nodes for Spatial Databases

Evolutionary Dynamic Clustering for Spatial Databases

Evolutionary re-fragmentation and re-allocation within a distributed system

Collaborative Evolutionary Search

Adaptive Goal Guided Recombination (AGGX)

Two new features:

information sharing mechanism between the individuals of a population (each individual knows the value of the best individual obtained so far - the so called global best)

each individual has memory (each individual knows the value of its best related individual - the so called local best)

the control of the amount of relevant genetic information transferred from the global best and local best to the offspring:

$$\mathit{NoKept} = \mathit{NoEdges} \cdot e^{-\frac{\mathit{NoGen} - \mathit{NoCrt}}{\mathit{NoGen}}}$$

NoKept - the number of genes kept in the offspring

NoEdges - the total number of edges involved in the common sequences of the global and the local best

NoCrt - the number of the current generation

NoGen - the total number of generations of the algorithm

a randomly chosen sequence of one parent is always kept in the offspring (in order to increase the population diversity)
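The adaptive control of the number of kept genes can be sketched in a few lines, assuming the rule reads NoKept = NoEdges * exp(-(NoGen - NoCrt) / NoGen). The function name `genes_to_keep` and the rounding to the nearest integer are assumptions of this example; the slides do not say how a fractional value is handled.

```python
import math


def genes_to_keep(no_edges, no_crt, no_gen):
    """Number of genes copied from the common sequences of global and local best.

    no_edges -- edges in the common sequences of the global and the local best
    no_crt   -- index of the current generation (0 .. no_gen)
    no_gen   -- total number of generations
    """
    if no_gen <= 0 or not 0 <= no_crt <= no_gen:
        raise ValueError("expected 0 <= no_crt <= no_gen and no_gen > 0")
    # Rounding is an assumption; the exponent runs from -1 (start) to 0 (end).
    return round(no_edges * math.exp(-(no_gen - no_crt) / no_gen))


early = genes_to_keep(20, 0, 100)     # first generation
middle = genes_to_keep(20, 50, 100)   # halfway through the run
late = genes_to_keep(20, 100, 100)    # last generation
```

Under this reading the operator transfers roughly a third of the shared edges at the start of the run and all of them at the end, so the search moves gradually from exploration toward exploitation of the best individuals found.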

Adaptive Goal Guided Recombination (AGGX)

Adaptive Goal Guided Recombination

Evolutionary Approach of TSP

k cities

potential solution for the problem - a string of length k that contains a permutation of the set {1, ..., k}

Fitness function f to be minimized:

$$f : S \to \mathbb{R}, \quad f(\pi) = \sum_{i=1}^{k} d\left(c_{\pi(i)}, c_{\pi(i+1)}\right), \quad (k + 1 \equiv 1)$$
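The tour-length fitness can be written compactly as a sketch, assuming the cities are points in the plane and d is the Euclidean distance. The function name `tour_length` and the `coords` dictionary are illustrative.

```python
import math


def tour_length(perm, coords):
    """Length of the closed tour that visits the cities in the order `perm`.

    perm   -- a permutation of the city labels
    coords -- mapping from city label to its (x, y) position
    """
    k = len(perm)
    return sum(
        # (i + 1) % k wraps the last city back to the first one.
        math.dist(coords[perm[i]], coords[perm[(i + 1) % k]])
        for i in range(k)
    )


# Four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}
perimeter_tour = tour_length([1, 2, 3, 4], coords)
crossing_tour = tour_length([1, 3, 2, 4], coords)
```

The tour that follows the perimeter is shorter than the one that crosses both diagonals, so it receives the better (lower) fitness.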

Experimental Results

Standard Genetic Algorithm (SGA) for:

TSP instance with 130 cities

TSP instance with 76 cities

Results obtained after 100 runs of SGA for TSP with 130 cities

Results obtained after 100 runs of SGA for TSP with 76 cities

The convergence process for SGA using OX and AGGX, for TSP with 130 cities.

The convergence process for SGA using OX and AGGX, for TSP with 76 cities.

Conclusions and future work

AGGX is based on two new features:

the social behavior of individuals within a population

the memory of each individual

and on the control of the amount of relevant genetic information transferred from the global best and local best to the offspring

a randomly chosen sequence of one parent is always kept in the offspring

a way to introduce diversity in the population will be pursued