Searching and optimization
◦ Applications and techniques
◦ Branch-and-bound search
◦ Genetic algorithms
◦ Successive refinement
◦ Hill climbing
Searching and optimization problems are characterized by looking for a solution to a problem that has many potential solutions
An exhaustive search is not feasible for many problems
◦ Directed search is used instead
May not necessarily search for the optimal (best) solution
◦ Non-optimal solutions may be acceptable
Examples of problems:
◦ Travelling salesperson problem
◦ 0/1 knapsack problem
◦ n-queens problem
◦ 8-puzzle
◦ 15-puzzle
All of these have an extremely large number of permutations
◦ Searches can be very time-consuming
Examples of real-world applications:
◦ Financial forecasting
◦ Airline fleet and crew assignment
◦ VLSI chip layout
Examples of search and optimization techniques:
◦ Branch-and-bound search
◦ Dynamic programming
◦ Hill climbing
◦ Simulated annealing
◦ Genetic algorithms
Branch-and-bound search uses a state space tree
◦ Root node is the starting point
◦ Transition from one level to the next represents a choice being made towards the solution
◦ Can be considered as dividing the problem into sub-problems
◦ Each node represents a sub-problem
How the tree is constructed depends on the problem
◦ Called a dynamic tree
If the choice at each node just consists of selecting either true or false, a binary tree is created
◦ Called a static tree
The state space tree can be explored using various methods
◦ Depth-first search
◦ Breadth-first search
◦ Best-first search
Searching all nodes is expensive
◦ Consider only exploring paths that will likely lead to the best solution (i.e., best-first search)
◦ Avoiding unpromising paths prunes the state space tree
A bounding (or cut-off) function may be used to determine whether or not a given path is promising
When a node is reached, the bounding function produces an upper or lower bound
This can be compared to the value of the best path that has been found so far
◦ Determine whether it is worth continuing to search along this path
Terminology
◦ Live node: A node that has been reached, but whose children have not all been explored
◦ E-node (expanded node): A live node whose children are currently being explored
◦ Dead node: A node whose children have all been explored
As the search proceeds, a list of live nodes is maintained in a queue
◦ Called an open list
A list of dead nodes may also be maintained
◦ Called a closed list
◦ Used to check for duplicates
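As an illustration of the ideas above, here is a minimal sketch of best-first branch-and-bound applied to the 0/1 knapsack problem. The function name, the choice of the fractional (greedy) upper bound as the bounding function, and the use of a heap as the open list are assumptions made for this example, not details taken from the slides. Weights are assumed to be positive.

```python
import heapq

def knapsack_branch_and_bound(values, weights, capacity):
    """Best-first branch-and-bound for 0/1 knapsack (illustrative sketch)."""
    # Sort items by value density so the fractional bound is valid.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    n = len(items)

    def upper_bound(level, value, weight):
        # Bounding function: fill the remaining room greedily, allowing
        # a fraction of the first item that does not fit.
        bound, room = value, capacity - weight
        for v, w in items[level:]:
            if w <= room:
                bound, room = bound + v, room - w
            else:
                return bound + v * room / w
        return bound

    best = 0
    # Open list of live nodes; heapq is a min-heap, so bounds are negated.
    open_list = [(-upper_bound(0, 0, 0), 0, 0, 0)]
    while open_list:
        neg_bound, level, value, weight = heapq.heappop(open_list)  # the E-node
        if -neg_bound <= best or level == n:
            continue  # pruned: this path cannot beat the best found so far
        v, w = items[level]
        # Child 1: take the item, if it fits.
        if weight + w <= capacity:
            best = max(best, value + v)
            heapq.heappush(open_list, (-upper_bound(level + 1, value + v, weight + w),
                                       level + 1, value + v, weight + w))
        # Child 2: skip the item, if that path is still promising.
        bound = upper_bound(level + 1, value, weight)
        if bound > best:
            heapq.heappush(open_list, (-bound, level + 1, value, weight))
    return best
```

For example, `knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50)` returns 220. Each level of the tree is a true/false choice (take or skip an item), so this is a static tree in the terminology above.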
Parallelization
◦ Simplest approach: allocate a chunk of the state space tree to each processor and let them search their portion independently
◦ Possible to parallelize the evaluation of the bounding function
However, there are some issues
The size of the state space tree may not be known in advance
◦ Issue of load balancing
The current lower or upper bound (generated by the bounding function) should be known by all processors
◦ Must be updated so that all processors can prune their state space tree quickly
Another approach is to allow the open list queue to be accessed concurrently (i.e., in shared memory)
◦ When a processor chooses an item from the queue, that item must be locked so that another processor doesn’t choose it as well
◦ However, this will limit the potential speedup
Speedup anomalies
◦ Acceleration anomaly: Super-linear speedup can be achieved if one processor finds a solution extremely quickly
◦ Deceleration anomaly: A solution may be positioned such that it cannot be found in 1/p of the sequential time (where p is the number of processors)
◦ Detrimental anomaly: There is evidence that shows that the speedup factor could be less than 1
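One way to state the three anomalies more compactly, assuming the usual definition of the speedup factor. The symbols t_s (sequential time), t_p (time on p processors), and S(p) are introduced here for illustration and do not appear in the slides.

```latex
S(p) = \frac{t_s}{t_p}, \qquad
\text{acceleration: } S(p) > p, \qquad
\text{deceleration: } S(p) < p, \qquad
\text{detrimental: } S(p) < 1
```

Under this reading, the detrimental anomaly is the extreme case of deceleration, where the parallel search is slower than the sequential one.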
Genetic algorithms attempt to simulate the natural evolution of a population of individuals
Chromosomes contain information that characterizes a living being
◦ Evolution works at the chromosome level
The chromosomes of offspring are generated from the chromosomes of the offspring’s parents
◦ This blending is called crossover
Sometimes, there may be a random change in an individual’s chromosome pattern
◦ This is called mutation
With a low mutation rate, there is a tendency for fit individuals to pass on their traits to the next generation
◦ Unfit individuals tend to die out
With a high mutation rate, individuals of the next generation differ randomly from the individuals of the previous generation
◦ Can cause instability or non-convergence
Basically, you have a population of solutions. Crossover and mutation occur. A new generation of solutions is produced and this process repeats
Algorithm:
◦ Create an initial population of solutions (individuals)
◦ Evaluate each solution to determine how “fit” it is
◦ Rank the solutions from most fit to least fit
◦ Select a subset of the population (favouring more fit solutions)
◦ Use the subset to produce a new generation of offspring (using crossover)
◦ Apply random mutations to some of the offspring
◦ Repeat this process until some termination condition is satisfied
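The steps above can be sketched as a short loop. This is only a minimal illustration under stated assumptions: the function name `run_ga`, all parameter values, the "keep the fitter half" selection rule, and the fixed generation count as the termination condition are choices made for this example, not taken from the slides.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=30, generations=60,
           mutation_rate=0.01, seed=0):
    """Minimal generational GA over lists of bits (illustrative sketch)."""
    rng = random.Random(seed)
    # Create an initial random population.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):          # termination: fixed generation count
        # Evaluate and rank from most fit to least fit.
        ranked = sorted(population, key=fitness, reverse=True)
        # Select a subset favouring fitter solutions (assumption: top half).
        parents = ranked[:pop_size // 2]
        offspring = []
        while len(offspring) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)      # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)
```

For instance, `run_ga(sum)` searches for the bit list with the most ones, a toy problem often used to check that a genetic algorithm converges at all.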
Representation of an individual (their chromosomes)
◦ Commonly, binary strings (e.g., 000101011110)
◦ More recently, floating-point numbers, Gray codes, and integer strings
◦ Bits in the binary string have some sort of meaning (for example, the first bit might represent the sign of a number and the remaining bits might represent the number itself)
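A small sketch of the sign-and-magnitude interpretation mentioned above. The function name and the convention that a leading 1 means "negative" are assumptions for illustration; the slides do not say which bit value encodes which sign.

```python
def decode_sign_magnitude(chromosome: str) -> int:
    """Decode a binary string: first bit is the sign, the rest is the magnitude."""
    sign = -1 if chromosome[0] == "1" else 1   # assumption: 1 means negative
    magnitude = int(chromosome[1:], 2)         # remaining bits as an unsigned integer
    return sign * magnitude
```

With this convention, the example string 000101011110 decodes to 350, and 100101011110 decodes to -350.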
An initial population is generated using a random number generator
Individuals are evaluated according to some function
◦ Could produce a number, with higher numbers representing fitter individuals
Constraints might be placed on a solution
◦ For example, if an individual falls outside of a range of values, it could be discarded or “repaired” by changing some bits
The number of individuals in a population affects:
◦ How quickly a good solution will be found
◦ How much computation needs to be done per generation
Selection process
◦ In nature, there is a bias towards the most fit individuals
◦ However, there may be problems where there are many locally optimal solutions
◦ Just selecting the most fit individuals could cause the population to gather around a single local optimum rather than the global optimum
◦ One approach is tournament selection
◦ A set of individuals from the population is selected
◦ The most fit individual wins the tournament and is selected as a parent to generate offspring
◦ Many of these tournaments are played until the whole population has participated
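Tournament selection as described above might look like the following sketch. Splitting a shuffled population into groups of size `k` is one assumed way of making sure the whole population participates. The function name and the default `k` are hypothetical.

```python
import random

def tournament_selection(population, fitness, k=4, rng=random):
    """Play tournaments of size k until every individual has participated."""
    shuffled = population[:]
    rng.shuffle(shuffled)
    winners = []
    for start in range(0, len(shuffled), k):
        group = shuffled[start:start + k]           # one tournament
        winners.append(max(group, key=fitness))     # the fittest becomes a parent
    return winners
```

Because each winner only has to beat its own small group, less fit individuals still have some chance of becoming parents, which helps keep the population from collapsing onto one local optimum.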
The individuals chosen by the selection process are paired up as parents and produce offspring, using crossover and mutation
Single-point crossover
◦ Given two parents, A and B, their chromosomes are split into two at some boundary
◦ The rightmost portions of the parents’ chromosomes are then swapped to generate two children
Other types of crossover: multi-point crossover, uniform crossover
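Single-point crossover on binary strings can be sketched in a few lines. The function name and the explicit `point` argument are assumptions for illustration. In practice the crossover point is usually chosen at random.

```python
def single_point_crossover(parent_a: str, parent_b: str, point: int):
    """Swap the rightmost portions of two equal-length chromosomes at `point`."""
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2
```

For example, crossing 11111111 with 00000000 at point 3 gives the children 11100000 and 00011111.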
Mutation
◦ Generally, the mutation rate is kept small
◦ Mutation occurs by simply changing one or more bits in a child’s chromosome
Other variations that can be incorporated into the algorithm:
◦ Carry over some of the most fit individuals from the previous generation into the next generation
◦ Randomly generate new individuals for the next generation
◦ Vary the population size from one generation to the next
Possible termination conditions:
◦ Stop after a number of generations (but you cannot tell if you have an optimal solution or if the population has moved far away from it)
◦ Consider the degree of improvement from one generation to the next
◦ Consider the similarity of all the individuals in the population
Two approaches to parallelization:
◦ Let each processor control its own population and allow some migration to occur between populations
◦ Let each processor do a portion of each step of the algorithm on a shared population
Isolated subpopulations
◦ After a certain number of generations, the processors share their best individuals
◦ Immigrants must be selected, sent, received, then integrated into the population
◦ Sending and receiving is done easily with message passing
There are a couple of models of migration to consider
◦ The island model allows individuals to move to any other subpopulation (models nature better, but requires more communication)
◦ The stepping-stone model only allows moving between neighbouring populations
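The stepping-stone model can be sketched sequentially as one migration step on a ring of subpopulations. This is a single-process illustration only: in a real message-passing program the list copies would be replaced by sends and receives. The function name, the ring topology, and the policy of replacing each population's worst individuals with the immigrants are all assumptions.

```python
def migrate_ring(islands, fitness, n_migrants=1):
    """One stepping-stone migration step: each island sends its best
    individuals to its right-hand neighbour on a ring."""
    # Select emigrants from every island before any population changes.
    emigrants = [sorted(pop, key=fitness, reverse=True)[:n_migrants]
                 for pop in islands]
    new_islands = []
    for i, pop in enumerate(islands):
        incoming = emigrants[i - 1]   # from the left neighbour (index wraps around)
        # Integrate immigrants by dropping the island's worst individuals.
        survivors = sorted(pop, key=fitness, reverse=True)[:len(pop) - n_migrants]
        new_islands.append(survivors + incoming)
    return new_islands
```

An island model would differ only in where `incoming` comes from: any other subpopulation rather than just a neighbour.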
Common population
◦ The steps of the algorithm can easily be parallelized, but under a message passing system, there would be a huge amount of communication
◦ This is more feasible with shared memory
Successive refinement is related to solving problems like finding the maximum value of a function
Consider a graph of a function
◦ A grid is placed with some amount of spacing between grid lines
◦ At each of these grid lines, a point is examined
◦ The k best points are kept
A finer grid spacing is then used
◦ Around each of the chosen best points, points on the grid lines are examined
◦ The best points are kept and the process repeats until some condition is met
This is simple, but requires more computations than genetic algorithms
◦ Still much better than an exhaustive search, though
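A minimal one-dimensional sketch of successive refinement follows. The function name, the default grid size, the refinement factor, and stopping after a fixed number of rounds are assumptions chosen for illustration.

```python
def successive_refinement(f, lo, hi, points=11, k=3, rounds=6):
    """Maximise f on [lo, hi] by refining a grid around the k best points."""
    spacing = (hi - lo) / (points - 1)
    candidates = [lo + i * spacing for i in range(points)]
    best = sorted(candidates, key=f, reverse=True)[:k]   # keep the k best points
    for _ in range(rounds):
        # Finer grid: cover one old grid cell on each side of every best point.
        new_spacing = 2 * spacing / (points - 1)
        candidates = list(best)
        for x in best:
            for i in range(points):
                p = x - spacing + i * new_spacing
                if lo <= p <= hi:
                    candidates.append(p)
        best = sorted(set(candidates), key=f, reverse=True)[:k]
        spacing = new_spacing
    return best[0]
```

For a single-peaked function such as `lambda x: -(x - 1.23) ** 2` on the interval 0 to 5, the result lands very close to 1.23. For functions with many narrow peaks, a coarse initial grid can miss the global maximum entirely.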
Parallelization is straightforward
◦ Divide up the points to be examined amongst all the processors
◦ Each processor sends its best points to the master processor
◦ The master processor determines which of those points are the best, then distributes them to the slave processors
◦ The points to be examined are divided up again and the process repeats
Hill climbing: consider a blindfolded hiker placed in a terrain having many local maximums
The hiker’s task is to find the global maximum
Simple algorithm the hiker uses:
◦ Don’t go downhill
If the hiker always moves uphill, the hiker will always reach the top of some hill (a local maximum)
However, if there are many local maximums, it’s unlikely the hiker will stop on the global maximum
So...randomly place thousands of blindfolded hikers on the terrain
Still no guarantee, but this increases the chances that one of them will stop on the global maximum
With randomly placed starting points, hill climbing is a Monte Carlo search technique
Again, this relates to problems like finding the global maximum of a function
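The hiker analogy can be sketched in one dimension as follows. The function names, the fixed step size, the search interval, and the example `terrain` function are all assumptions made for illustration.

```python
import math
import random

def terrain(x):
    """Example landscape (an assumption): two hills on [0, 10], the right one higher."""
    return math.sin(x) + 0.1 * x

def hill_climb(f, x, step=0.01, lo=0.0, hi=10.0):
    """One hiker: keep stepping uphill until neither neighbour is higher."""
    while True:
        neighbours = [p for p in (x - step, x + step) if lo <= p <= hi]
        best = max(neighbours, key=f)
        if f(best) <= f(x):
            return x              # no uphill move left: a local maximum
        x = best

def many_hikers(f, n_hikers=200, lo=0.0, hi=10.0, seed=0):
    """Drop many hikers at random positions and keep the highest summit."""
    rng = random.Random(seed)
    summits = [hill_climb(f, rng.uniform(lo, hi), lo=lo, hi=hi)
               for _ in range(n_hikers)]
    return max(summits, key=f)
```

A single hiker started at x = 1.0 stops on the lower hill near x = 1.67, while `many_hikers(terrain)` finds the higher one near x = 7.95, because some of the random starting points fall on its slopes.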
Parallelization
◦ The area can be divided up amongst all the processors
◦ Each processor generates a number of random starting points and moves all of those points uphill
◦ Each processor reports the maximum it found to the master processor
◦ The master processor returns the largest maximum from the findings
Some hikers might take longer than others to reach a maximum
◦ A tiny hill versus a steep mountain
◦ Some processors might finish earlier than others
A work-pool approach addresses this
◦ Divide the area up into even smaller segments
◦ As processors finish a segment, they report their maximum to the master processor and choose another segment from the work-pool
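The work-pool idea can be sketched with Python's standard `concurrent.futures` thread pool. This shows the structure only: threads in CPython do not give a real speedup for CPU-bound work, and a process pool or message passing would be used in practice. The function names, the number of segments, the hikers per segment, and the step size are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def climb_segment(task):
    """Worker task: hill-climb from a few random starts inside one segment."""
    f, lo, hi, seed = task
    rng = random.Random(seed)
    step = (hi - lo) / 100
    best = lo
    for _ in range(5):                      # a few hikers per segment
        x = rng.uniform(lo, hi)
        while True:
            uphill = [p for p in (x - step, x + step)
                      if lo <= p <= hi and f(p) > f(x)]
            if not uphill:
                break                       # local maximum within the segment
            x = max(uphill, key=f)
        if f(x) > f(best):
            best = x
    return best

def work_pool_search(f, lo, hi, n_segments=20, n_workers=4):
    """Split [lo, hi] into many small segments and hand them out from a pool."""
    width = (hi - lo) / n_segments
    tasks = [(f, lo + i * width, lo + (i + 1) * width, i)
             for i in range(n_segments)]
    # Workers that finish early pick up the next unprocessed segment.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        segment_maxima = list(pool.map(climb_segment, tasks))
    # The "master" keeps the largest maximum reported.
    return max(segment_maxima, key=f)
```

Using more segments than workers is what provides the load balancing: a worker stuck on a slow segment does not hold up the others, which simply take more of the remaining segments.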