High Performance, Pipelined, FPGA-Based Genetic Algorithm Machine: A Review
Grayden Smith, Ganga Floora
Outline
• Paper Summary
• Background on Genetic Algorithms
• Implementation
• Results
• Critique
• Conclusion
The paper in summary
• Genetic algorithms are a commonly used heuristic technique for discovering solutions
• The goal is to improve the speed at which genetic algorithms run
• The paper explores a pipelined, custom hardware solution for running genetic algorithms quickly
• It makes use of pipelining and specific genetic algorithm technique choices
• It achieved a speedup of up to 9600x over comparable general purpose processors
Genetic algorithms
• Before exploring how to increase the performance of genetic algorithms, it is important to understand how they work
• Genetic algorithms mimic the process of natural selection to create progressively fitter solutions
• There are 4 main components of a genetic algorithm:
  • Population generation
  • Fitness measurement
  • Crossover
  • Mutation
• In short, a pool of possible solutions is created, where the features of each solution are treated as its “genetic code”
• Solutions are bred with other solutions, and superior solutions are added back to the pool for future stages
• The process ends when an acceptable solution is reached
Genetic algorithm, pseudo-code
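The four components listed earlier can be sketched as a small runnable program. This is a generic textbook-style genetic algorithm, not the pseudo-code from the original slide or the paper: the `one_max` fitness function, the keep-the-fitter-half rule, and all parameter values are assumptions chosen only for illustration.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption for illustration): the number of 1 bits."""
    return sum(bits)

def run_ga(n_bits=32, pop_size=20, mutation_rate=0.02, max_generations=500, seed=0):
    rng = random.Random(seed)
    # 1. Population generation: random bit strings act as the "genetic code".
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for generation in range(max_generations):
        # 2. Fitness measurement: rank the pool, best first.
        pop.sort(key=one_max, reverse=True)
        if one_max(pop[0]) == n_bits:
            return pop[0], generation  # acceptable solution reached
        parents = pop[: pop_size // 2]  # superior solutions stay in the pool
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]  # 3. Crossover: single cut point
            # 4. Mutation: flip each bit with a small probability.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    pop.sort(key=one_max, reverse=True)
    return pop[0], max_generations

if __name__ == "__main__":
    best, generations = run_ga()
    print(one_max(best), generations)
```

The loop stops either when a perfect string is found or when the generation budget runs out, which mirrors the "ends when an acceptable solution is reached" step.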
Algorithm design choices
• The type of genetic algorithm used in this design is called a steady-state, survival-driven genetic algorithm
• This type of genetic algorithm was chosen specifically to make it easy to implement in hardware
Population storage:
• The alternative is a generational genetic algorithm
• However, a generational genetic algorithm would require a flexible memory space
Parent Selection:
• The alternative would be to choose parents based on fitness
• Choosing based on fitness would necessitate added hardware in the memory stage
Algorithm design choices
Crossover and mutation:
• Can easily rely on simple bit shift registers
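To make the shift-register idea concrete, here is a minimal software sketch. It assumes a 16-bit Fibonacci linear-feedback shift register (taps 16, 14, 13, 11) as the bit source, a mask-driven per-bit multiplexer for crossover, and an XOR for mutation. The slides do not say which register structure the paper uses, so every name and constant below is hypothetical.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one shift."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def random_bits(state, n):
    """Shift out n pseudo-random bits; returns (bits_as_int, new_state)."""
    value = 0
    for _ in range(n):
        state = lfsr16_step(state)
        value = (value << 1) | (state & 1)
    return value, state

def crossover(parent_a, parent_b, mask):
    """Per-bit multiplexer: parent_a where mask is 1, parent_b where it is 0."""
    return (parent_a & mask) | (parent_b & ~mask)

def mutate(chromosome, flip_mask):
    """Flip every bit selected by flip_mask (one XOR per bit)."""
    return chromosome ^ flip_mask

if __name__ == "__main__":
    state = 0xACE1  # any non-zero seed works
    mask, state = random_bits(state, 8)
    child = crossover(0b11110000, 0b00001111, mask)
    flips, state = random_bits(state, 8)
    print(format(mutate(child, flips), "08b"))
```

In hardware terms, each chromosome bit would need only a multiplexer and an XOR gate fed from the shift register, which is consistent with the bit-slice view of the datapath.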
Survival-Driven Evolution:
• Parents are not selected by fitness, so selection pressure comes only from survival: the population slowly becomes built only on survivors
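One step of a steady-state, survival-driven scheme might look like the sketch below. The replacement rule used here (the child overwrites the weaker parent only if it is fitter) is an assumption, since the slides state only that parents are not chosen by fitness. `survival_step` and `uniform_child` are hypothetical names.

```python
import random

def uniform_child(a, b, rng):
    """Illustrative breeding step: uniform crossover plus one bit flip."""
    child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
    child[rng.randrange(len(child))] ^= 1
    return child

def survival_step(population, fitness, make_child, rng):
    """Pick two parents at random; the child survives only if it beats the weaker one."""
    i, j = rng.sample(range(len(population)), 2)  # no fitness-based selection
    child = make_child(population[i], population[j], rng)
    weaker = i if fitness(population[i]) <= fitness(population[j]) else j
    if fitness(child) > fitness(population[weaker]):
        population[weaker] = child  # overwrite in place: memory size stays fixed
        return True
    return False

if __name__ == "__main__":
    rng = random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(16)] for _ in range(16)]
    for _ in range(2000):
        survival_step(pop, sum, uniform_child, rng)
    print(max(sum(p) for p in pop))
```

Because a member is only ever overwritten by something fitter, the population can live in a fixed-size memory, which fits the earlier point about avoiding a flexible memory space.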
• In short, this is a very simple genetic algorithm
• Tests were done using the Royal Road benchmark function to verify validity
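For readers unfamiliar with Royal Road, a minimal sketch of the classic R1 variant (Mitchell, Forrest and Holland) follows: the bit string is split into fixed-size blocks and only fully set blocks score. The slides do not say which variant or block size the paper used, so the 8-bit block default is an assumption.

```python
def royal_road_r1(bits, block_size=8):
    """R1-style Royal Road: each block made entirely of 1s contributes block_size."""
    if len(bits) % block_size:
        raise ValueError("chromosome length must be a multiple of block_size")
    return sum(
        block_size
        for start in range(0, len(bits), block_size)
        if all(bits[start:start + block_size])
    )

if __name__ == "__main__":
    partial = [1] * 8 + [0] * 56
    print(royal_road_r1(partial))
```

A block with even one 0 bit scores nothing, so fitness rises in steps; this makes the function a common check that crossover is combining good building blocks.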
Genetic algorithm pipeline
Pipeline Datapath Bit-Slice
• Rough estimate using 4-input LUTs
• Five logic functions
  • 4 multiplexers
  • 1 mutation function
• 7 flip-flops
• Assuming 2 outputs for each LUT then:
  • 8 LUTs for bit-slice
• Total cost in LUTs for pipeline =
Crossover implementation
Mutation implementation
Real time control screen
Protein folding problem: real-time control screen. For a 50,000 crossover run, the screen is updated at a rate of 36.67 runs per second, which allows real-time adjustment of the mutation and crossover parameters.
Example Problem: Protein Folding
• Goal: discover the minimum-energy conformation in a lattice-constrained model
• This problem is NP-hard
• The problem represents:
  • A chain of amino acids (residues)
  • Residues are divided into hydrophobic residues, which are repelled by water, and hydrophilic residues, which can form hydrogen bonds with water molecules
  • Hydrophobic residues become clustered in the center, surrounded by hydrophilic residues
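A fitness function for this kind of problem can be sketched with the standard HP (hydrophobic/polar) lattice model. The sketch assumes a 2D square lattice, a chromosome that encodes the fold as a string of absolute moves, and an energy of -1 per pair of hydrophobic residues that touch on the lattice without being chain neighbours. The paper's actual encoding and lattice are not given in the slides, so `hp_energy` and `MOVES` are hypothetical.

```python
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def hp_energy(sequence, moves):
    """Energy of a fold, or None if the chain overlaps itself.

    sequence: string of 'H' (hydrophobic) and 'P' (hydrophilic) residues.
    moves: one move per bond, so len(moves) == len(sequence) - 1.
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    positions = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    if len(set(positions)) != len(positions):
        return None  # invalid conformation: two residues on one lattice site
    index_at = {pos: i for i, pos in enumerate(positions)}
    contacts = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look right and up only, so each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts

if __name__ == "__main__":
    print(hp_energy("HPPH", "RUL"))
```

Lower energy means more hydrophobic contacts, so a genetic algorithm minimizing this value is pushed toward folds with a hydrophobic core.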
Example Problem: Set Covering
• Optimization problem that models many resource selection problems and is vital for logic circuit minimization
• This problem is NP-hard
• Defined as:
  • Given a collection C of finite sets, each with a non-negative cost
  • Find a minimum-cost sub-collection C’
  • Every element that belongs to a set in C must belong to at least one set in C’
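The definition above maps naturally onto a bit-string chromosome. The sketch below assumes one bit per set in C (bit i set means set i is in C’) and uses a brute-force search on a tiny made-up instance; the encoding and the example data are illustrative assumptions, not taken from the paper.

```python
from itertools import product

def evaluate_cover(selected, sets, costs, universe):
    """Return (is_valid_cover, total_cost) for a selection bit vector."""
    covered = set()
    total = 0
    for bit, members, cost in zip(selected, sets, costs):
        if bit:
            covered |= members
            total += cost
    return universe <= covered, total

def cheapest_cover(sets, costs, universe):
    """Exhaustive search, feasible only for tiny instances (the problem is NP-hard)."""
    best = None
    for selected in product((0, 1), repeat=len(sets)):
        valid, total = evaluate_cover(selected, sets, costs, universe)
        if valid and (best is None or total < best[1]):
            best = (selected, total)
    return best

if __name__ == "__main__":
    sets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]
    costs = [5, 10, 3, 8]
    universe = set().union(*sets)
    print(cheapest_cover(sets, costs, universe))
```

A genetic algorithm would replace the exhaustive loop with evolved bit vectors, using the validity flag and total cost as the fitness signal.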
Example Problem: Set Covering
Set-coverage fitness function circuit.
Performance
Problem | Clock Rate | LUTs | Speedup | Population Size (chromosomes) | Hardware
Set Coverage (94x520) | 1 MHz | -- | 2200x | 256 | Aptix AXB-MP3
Protein Folding (512x82) | 66 MHz | 6000 | 320x / 9600x | 512 | Xilinx XCV300 FPGA / Xilinx XCV3200E FPGA
Critique
The fitness function:
• Needs to be implemented separately for each problem
• Can it fit into one clock cycle?
• If new fitness functions can’t be implemented quickly, is the approach worth it?
• Should development costs be incurred for an ASIC implementation?
• This is addressed to some degree in the paper
• This function would likely require a good deal of design time and effort
Evolutionary stasis not explained:
• The pipeline has no apparent method for detecting stasis
• Should it be a counter? A comparator?
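One possible answer to the counter-or-comparator question is to use both: a comparator tracks the best fitness seen so far, and a counter records how many consecutive offspring failed to beat it. The sketch below is a suggestion by way of illustration only; nothing like it is described in the paper, and `StasisCounter` and its `limit` parameter are hypothetical.

```python
class StasisCounter:
    """Signals stasis after `limit` consecutive offspring bring no improvement."""

    def __init__(self, limit):
        self.limit = limit
        self.best = None
        self.idle = 0

    def update(self, fitness):
        """Feed one offspring fitness; return True once stasis is detected."""
        if self.best is None or fitness > self.best:  # comparator
            self.best = fitness
            self.idle = 0
        else:
            self.idle += 1  # counter of non-improving offspring
        return self.idle >= self.limit

if __name__ == "__main__":
    detector = StasisCounter(limit=3)
    for value in [1, 2, 2, 2, 3, 3, 3, 3]:
        print(value, detector.update(value))
```

In hardware this would cost roughly one register for the best fitness, one comparator, and one counter, so it should not lengthen the pipeline's critical path much.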
Pipeline data hazards:
• Not addressed anywhere
• Quite low risk, but can impact performance and results
Critique
Paper results and evaluation:
• Limited information provided on:
  • Resource usage
  • Performance
• Little comparison between FPGA implementations
• However, good study of genetic algorithm behaviour
Good dataflow description:
• Clear diagrams and description of implementation
Conclusion
• Genetic algorithms can be implemented on FPGA technologies to improve performance through speed-up
• The paper outlined the algorithm and the architecture used to demonstrate its claims
• It demonstrated a significant performance increase over conventional general purpose processing using a couple of different FPGA technologies
• It explains that FPGA technology can be desirable for its speed but has limited flexibility when introducing new fitness functions
• More research needs to be conducted to allow for fast reconfiguration of fitness functions to maximize throughput
References
[1] Shackleford, B., Snider, G., Carter, R. J., Okushi, E., Yasuda, M., Seo, K., & Yasuura, H. (2001). A high-performance, pipelined, FPGA-based genetic algorithm machine. Genetic Programming and Evolvable Machines, 2(1), 33-60.