RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS – UNIVERSITY OF WINDSOR
Design Space Exploration Using Parameterized Cores
Ian D. L. Anderson, M.A.Sc. Candidate
March 31, 2006
Supervisor: Dr. M. Khalid
OUTLINE
• Introduction
• Designing Systems Using IP Cores
• Design Space Exploration (DSE)
• Genetic-based DSE Case Study
• Results
Introduction
Embedded Systems
• An embedded system: a device that utilizes computational hardware and application-specific software to carry out a specific task.
• Often hidden from the user of the device (i.e. “embedded” within a larger system)
Major Components of an Embedded System
• Digital hardware:
  – Microprocessor (µP) or microcontroller (µC)
  – Application-specific hardware, generally used for accelerating time-critical tasks
• Embedded software running on the µP or µC
[Figure: block diagram of an embedded system, made up of an embedded CPU, software running on the CPU, application-specific hardware, and memory & I/O]
The Challenges of Designing Embedded Systems
• Improvements in IC process tech. enable more complex and intricate designs to be realized
• Therefore, designing from scratch is too expensive and time-consuming for many people
• Traditional or “co-design” methodology
[Figure: co-design flow with the stages System Specification, Hardware/Software Partitioning, Hardware Design, Hardware Synthesis, Placement & Routing, Software Development, Compiler, Assembler/Linker, HW/SW Interface Design, Integration & Testing, and Final Embedded System]
Designing Systems Using IP Cores
Core-based Design
• It makes sense for many designers to use and re-use pre-designed and pre-tested hardware and software components
• These are generally known as “Intellectual Property (IP) Cores”
• They reduce design time at the expense of some flexibility and an area/performance penalty
Three Classes of (Hardware) IP Cores
• Soft cores: components described in a hardware description language (HDL)
• Firm cores: gate-level netlist that is ready for technology mapping, placement and routing, etc.
• Hard cores: pre-placed and pre-routed circuits
[Figure: the three classes of core by abstraction level. Soft core: HDL description (RT level). HDL synthesis leads to the firm core: logic primitives such as gates and FFs (logic level). Tech. mapping, placement & routing, etc. lead to the hard core: circuit layout (circuit level). Abstraction and flexibility increase towards the soft core.]
Soft IP Cores
• Hardware components described in a hardware description language (HDL) such as VHDL, Verilog, etc.
• Some advantages of soft cores:
  • Higher level of abstraction – easier to understand
  • More flexible – designers can change the core by editing source code or selecting parameters (more on that later)
  • Platform independent – can be synthesized for any IC technology, incl. FPGAs, ASICs, etc.
  • More immune to obsolescence
Popular Examples of Soft IP Cores
• Altera Corp. – Nios and Nios II processors
  – Customizable embedded RISC microprocessors targeting certain Altera FPGAs
• Xilinx Inc. – MicroBlaze
  – Flexible 32-bit microprocessor for Xilinx FPGA families
• Tensilica Xtensa
• Open-source cores:
  – LEON2 and LEON3 by Gaisler Research
  – www.opencores.org
Parameterized Cores
• In order to increase core flexibility, many IP cores (esp. soft cores) are “parameterized”
• Certain aspects of the hardware’s architecture can be changed so that the core can be tailored to suit a specific application more closely
  • E.g. bit-widths, functional unit implementation, etc.
• “Parameters” are essentially variables with a finite set of possible values
• Assigning values to all parameters of a core produces one “configuration”
Classification of Core Parameters
• Static or dynamic parameters:
  • Static – must be set prior to chip fabrication (e.g. HDL generic statements)
  • Dynamic – can be set after chip fabrication, provided the chip has the proper facilities built in
    • Extreme example: FPGAs
• Two or more parameters can share interdependencies with each other:
  • Hard interdependency: requires simultaneous parameter selection for a valid configuration
  • Soft interdependency: value selection should be done simultaneously in order to create an optimal configuration
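A hard interdependency can be pictured as a validity rule over the parameter values. The sketch below is illustrative only: the three parameters, their values, and the rule that a hardware multiplier needs the 32-bit datapath are hypothetical and are not taken from any real core.

```python
from itertools import product

# Hypothetical parameters, each with a finite set of possible values.
PARAMETERS = {
    "datapath_width": [16, 32],
    "multiplier": ["software", "hardware"],
    "register_file_size": [128, 256, 512],
}

def is_valid(config):
    """Hypothetical hard interdependency: a hardware multiplier needs a 32-bit datapath."""
    return not (config["multiplier"] == "hardware" and config["datapath_width"] == 16)

def enumerate_configurations(parameters):
    """Yield every assignment of one value to each parameter (one configuration each)."""
    names = list(parameters)
    for values in product(*(parameters[name] for name in names)):
        yield dict(zip(names, values))

all_configs = list(enumerate_configurations(PARAMETERS))
valid_configs = [c for c in all_configs if is_valid(c)]
print(len(all_configs), len(valid_configs))
```

A soft interdependency would not remove configurations from this list; it would only mean that the two parameters are best chosen together when searching for good configurations.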
Classification of Core Parameters (Cont’d)
• Classification by function:
  • Parameters affecting:
    1. The bit-width of parts of the core
       • Datapath width, width of address bus, etc.
    2. How many sub-components are instantiated
       • E.g. # of registers in register file
    3. The type or implementation of components being instantiated
       • E.g. multiplier implementation
    4. How components are connected together
    5. Some combination of 1, 2, 3 and 4
Design Space Exploration (DSE)
What is the Design Space?
• “Design space” – the set of all possible HW and SW configurations that will achieve the system’s required functionality
• Configurations are evaluated in terms of how well they meet “objectives”
• Design space often contains a large number of possibilities that are sub-optimal
• Therefore the design space should be “explored” to determine the best configuration for the job
The design space can be pictured as an n-dimensional space, where n is the number of objectives. For example, a 2-objective space:
[Figure: design space plotted against Objective 1 and Objective 2]
DSE and Multi-objective Optimization
• DSE is essentially a multi-objective optimization problem
• The designer must balance a set of competing objectives
  – e.g. minimizing chip area and power while maximizing performance
• Often, there is not one single “optimal” configuration, but rather a set called the “Pareto-optimal” set
[Figure: three-dimensional design space with axes Objective 1, Objective 2 and Objective 3]
Pareto-Optimality
• When optimizing several objectives at once, a configuration is Pareto-optimal if you cannot improve on one objective without sacrificing another
• Example from geometry: maximize the areas of three non-overlapping circles, A, B and C, placed inside a triangle
[Photo: Vilfredo Pareto]
[Figure: three arrangements of circles A, B and C inside the triangle. Two arrangements are Pareto-optimal; the third is NOT Pareto-optimal (the area of C can be increased without reducing A or B).]
Pareto-Optimality (Cont’d)
• If you do not know the relative priority of each objective, then you are left with a set of “non-dominated” solutions
• No one solution is better than another, unless one knows which objectives have priority (e.g. it may be most important that circle A be larger)
The Pareto-optimal set lies on the lower boundary of the design space, known as the “Pareto-optimal front”.
[Figure: design space plotted against Objective 1 and Objective 2, with the Pareto-optimal front marked along its lower boundary]
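The idea of a non-dominated set can be made concrete with a short sketch. It assumes every objective is to be minimized, and it uses the common definition of dominance (no worse in every objective, strictly better in at least one). The (area, delay) pairs are made-up numbers, not results from the study.

```python
def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated objective vectors, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (area, delay) pairs; both objectives are minimized.
points = [(2000, 20.0), (2500, 15.0), (3000, 12.0), (2600, 18.0), (3500, 12.0)]
front = pareto_front(points)
print(front)
```

Here (2600, 18.0) drops out because (2500, 15.0) is better in both objectives, and (3500, 12.0) drops out because (3000, 12.0) has the same delay with less area. The three survivors trade area against delay, so none can be preferred without knowing which objective has priority.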
DSE Using Parameterized Cores
• Many parameterized cores have multiple parameters and each parameter can have numerous possible values
• This can lead to potentially thousands, millions (or more) of different possible configurations
• Each parameter can affect the area, performance and power consumption of the core
  • Many configurations are sub-optimal
• The goal of DSE is to determine the set of combinations of parameter values that constitute the Pareto-optimal set of configurations
• The “best” configuration for an application can be chosen from that set
Automated Approaches to DSE
• Exhaustively searching the design space is tedious and wasteful when the number of parameters is large
• Therefore, a lot of research has focused on automating the process
• One of the most widely known and applied approaches involves using some form of a genetic or evolutionary-based algorithm
Genetic and Evolutionary Algorithms
• A class of optimization algorithms that have been applied to a wide array of problems
• Many variations, but they all have one thing in common: they take their inspiration from the field of biological sciences
• They attempt to emulate the biological process of natural selection
• They have been found to be good at solving multi-objective optimization problems
Genetic-based DSE Case Study
Objectives of Case Study
• Preliminary study with the following objective:
  • To investigate the feasibility of applying a genetic algorithm-based approach to a parameterized soft IP core with a sizeable design space in order to approximate its Pareto-optimal set of configurations
• Altera Nios soft-core processor was chosen as the test case
• Ultimately this technique will be applied to other parameterized components in order to assist designers in deriving application-specific processing cores
  • Nios is just a convenient test case
A Bit About the Altera Nios Processor
• Popular embedded RISC processor targeting Altera FPGAs
• Flexible, with the parameters shown in the table below
• With just the processor, there are a total of 15,696 possible configurations

Parameter | Possible values
Datapath width | 16 or 32 bit
Instruction decoder | LEs or ROM
Register file size | 128, 256 or 512
WVALID register | Read-only or writable
Instruction cache size | Off, 1, 2, 4, 8 or 16 kB
Data cache size | Off, 1, 2, 4, 8 or 16 kB
Integer multiplication | Software, MSTEP, MUL
Pipeline optimization | More stalls, Fewer stalls
Support RLC/RRC | Yes or No
Support interrupts/traps | Yes or No
Support OCI Module | Yes or No
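Multiplying the number of values of each listed parameter gives an upper bound on the number of configurations. The sketch below does this for the parameter table; the dictionary is only a transcription of that table. The product comes to 41,472, which is larger than the 15,696 quoted on the slide. One plausible explanation, offered here as an assumption rather than something the slides state, is that hard interdependencies between parameters rule out some combinations.

```python
from math import prod

# Parameter table transcribed from the slide (cache sizes in kB).
NIOS_PARAMETERS = {
    "Datapath width": ["16 bit", "32 bit"],
    "Instruction decoder": ["LEs", "ROM"],
    "Register file size": [128, 256, 512],
    "WVALID register": ["read-only", "writable"],
    "Instruction cache size": ["off", 1, 2, 4, 8, 16],
    "Data cache size": ["off", 1, 2, 4, 8, 16],
    "Integer multiplication": ["software", "MSTEP", "MUL"],
    "Pipeline optimization": ["more stalls", "fewer stalls"],
    "Support RLC/RRC": [True, False],
    "Support interrupts/traps": [True, False],
    "Support OCI module": [True, False],
}

# Size of the unconstrained cross-product of all parameter values.
unconstrained = prod(len(values) for values in NIOS_PARAMETERS.values())
print(unconstrained)  # 41472
```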
The SEAMO Algorithm
• The Simple Evolutionary Algorithm for Multi-objective Optimization (SEAMO) by C. Valenzuela (2002) was chosen as the exploration algorithm
• It is population-based – it maintains a set or “population” of configurations rather than just a single solution
• As the algorithm progresses, it gradually “evolves” the population until it converges towards the Pareto-optimal set
How it works…
• Parameters of the core are represented as “genes” – discrete variables (p_i) with a finite set of possible values
• Configurations are represented as strings of n genes called “chromosomes”
[Figure: a “chromosome” drawn as a string of “genes” p1, p2, p3, …, pn]
How it works… (Cont’d)
• The “population” is made up of a set of N chromosomes
• Each chromosome has an “objective vector” which stores the values of each objective separately
• There can be any number of objectives
[Figure: the “population”, chromosomes 1 to N (each a string p1, p2, …, pn), shown alongside the “objectives” (o1, o2) of each chromosome]
The Algorithm - Initialization
• Create an initial population of N individuals randomly
• Evaluate the objective vectors for each chromosome
• Record the “best-so-far” values for each objective
[Figure: the initial population of N chromosomes with their objective vectors (o1, o2), plus the recorded “best-so-far” values of o1 and o2]
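The initialization step can be sketched as follows. Chromosomes are stored as tuples of gene values. The gene value sets and the `evaluate` function are hypothetical placeholders, not the Nios parameters or the study's estimation equations, and both objectives are assumed to be minimized.

```python
import random

# Hypothetical genes: each gene (parameter) has a finite set of possible values.
GENE_VALUES = [
    [16, 32],              # p1
    [128, 256, 512],       # p2
    [0, 1, 2, 4, 8, 16],   # p3
]

def evaluate(chromosome):
    """Placeholder objective vector (o1, o2), both minimized; not a real cost model."""
    p1, p2, p3 = chromosome
    o1 = p1 * 10 + p2 + p3 * 50              # stand-in for area
    o2 = 1000.0 / p1 + 100.0 / (1 + p3)      # stand-in for delay
    return (o1, o2)

def initialize(n, rng):
    """Create n random chromosomes, evaluate them, and record best-so-far per objective."""
    population = [tuple(rng.choice(values) for values in GENE_VALUES) for _ in range(n)]
    objectives = [evaluate(c) for c in population]
    best_so_far = [min(o[k] for o in objectives) for k in range(len(objectives[0]))]
    return population, objectives, best_so_far

population, objectives, best_so_far = initialize(10, random.Random(0))
```

Passing in a seeded `random.Random` keeps a run reproducible, which helps when comparing different population sizes.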
The Algorithm – Offspring Creation
• For each chromosome in the population:
  – Pair with another, randomly selected individual
  – Apply the “crossover” operator to produce an “offspring”
  – “Mutate” the offspring
[Figure: crossover combines Parent 1 and Parent 2 at a random cut-point to form the offspring; mutation selects one gene of the offspring at random and changes it to another possible value]
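A minimal sketch of the two operators, assuming chromosomes are tuples of gene values. Which parent supplies the genes before the cut-point is an assumption, and the gene value sets are hypothetical.

```python
import random

def crossover(parent1, parent2, rng):
    """One-point crossover: genes before a random cut-point come from parent1, the rest from parent2."""
    cut = rng.randrange(1, len(parent1))   # cut-point strictly inside the chromosome
    return parent1[:cut] + parent2[cut:]

def mutate(chromosome, gene_values, rng):
    """Change one randomly selected gene to a different value from its allowed set."""
    i = rng.randrange(len(chromosome))
    alternatives = [v for v in gene_values[i] if v != chromosome[i]]
    if not alternatives:                   # a gene with a single possible value cannot change
        return chromosome
    return chromosome[:i] + (rng.choice(alternatives),) + chromosome[i + 1:]

# Hypothetical gene value sets and parents.
GENE_VALUES = [[16, 32], [128, 256, 512], [0, 1, 2, 4, 8, 16]]
rng = random.Random(1)
parent1, parent2 = (16, 128, 0), (32, 512, 16)
child = crossover(parent1, parent2, rng)
mutant = mutate(child, GENE_VALUES, rng)
```

Restricting the mutated gene to its own value set means the offspring always uses legal parameter values, although it does not by itself guard against hard interdependencies between parameters.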
The Algorithm – Replacement Strategy
• Parent chromosomes are replaced by their offspring based on three rules:
  1. Parents are replaced only by their own offspring
  2. Offspring only replace parents if they are superior (“elitist strategy”)
  3. Duplicates in the population are deleted
• The newly formed offspring is evaluated based on its objectives
• One of the two parents is replaced by the offspring if the offspring:
  • Improves on one of the “best-so-far” values
  • Dominates a parent (i.e. is superior in all objectives)
• If the offspring already exists in the population, then it is deleted
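The replacement rules can be sketched as a single function. Several details here are assumptions rather than statements from the slides: objectives are minimized, dominance uses the common "no worse everywhere, better somewhere" form (slightly weaker than "superior in all objectives"), a parent that holds a best-so-far value the offspring lacks is not evicted, and the first eligible parent is the one replaced. The example chromosomes and objective values are made up.

```python
def dominates(a, b):
    """Minimization: a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def try_replace(population, objectives, best_so_far, i, j, child, child_obj):
    """Apply the replacement rules to an offspring of parents i and j.

    Updates the three lists in place; returns True if the offspring was accepted.
    """
    if child in population:                      # duplicate: discard the offspring
        return False
    if any(c < b for c, b in zip(child_obj, best_so_far)):
        # Assumption: keep a parent that holds a best-so-far value the child lacks.
        candidates = [t for t in (i, j)
                      if not any(p == b and c > b for p, c, b in
                                 zip(objectives[t], child_obj, best_so_far))]
    else:
        candidates = [t for t in (i, j) if dominates(child_obj, objectives[t])]
    if not candidates:                           # offspring is not superior: discard it
        return False
    target = candidates[0]                       # assumption: first eligible parent
    population[target] = child
    objectives[target] = child_obj
    for k, c in enumerate(child_obj):            # keep the best-so-far values current
        best_so_far[k] = min(best_so_far[k], c)
    return True

# Hypothetical two-member population with (area, delay) objective vectors.
population = [(16, 128, 0), (32, 512, 16)]
objectives = [(300, 80.0), (900, 30.0)]
best_so_far = [300, 30.0]
accepted = try_replace(population, objectives, best_so_far, 0, 1, (32, 256, 8), (700, 25.0))
repeat = try_replace(population, objectives, best_so_far, 0, 1, (32, 256, 8), (700, 25.0))
```

In this example the offspring sets a new best delay, so it is accepted; it replaces the second parent because the first one still holds the best area. Offering the same offspring again is rejected as a duplicate.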
The Algorithm – Iteration
• After all individuals in the population have had a chance to produce offspring, one “generation” of the algorithm has passed
• The algorithm will pass through several generations before the population converges
• The population size, N, and the number of generations, G, constitute the parameters of the algorithm
• Also the number of genes in the chromosome, and the number of objectives can be changed to fit different problems
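The generational structure can be sketched as a small driver loop. The function name and the two callables it expects are hypothetical, and the demonstration below uses a deliberately trivial single-objective toy problem (integers, smaller is better) only to show the control flow, not the case study itself.

```python
import random

def run_generations(population, breed, try_replace, generations, rng):
    """Skeleton of the generational loop.

    breed(parent_a, parent_b, rng) -> offspring
    try_replace(population, i, j, offspring) -> True if the offspring was accepted
    Returns the number of accepted offspring.
    """
    accepted = 0
    n = len(population)
    for _ in range(generations):                 # G generations
        for i in range(n):                       # every individual gets one turn as a parent
            j = rng.choice([k for k in range(n) if k != i])   # random partner
            offspring = breed(population[i], population[j], rng)
            if try_replace(population, i, j, offspring):
                accepted += 1
    return accepted

# Toy stand-ins, for illustration only.
def toy_breed(a, b, rng):
    return min(a, b) - rng.randint(0, 1)

def toy_replace(pop, i, j, child):
    if child in pop or child >= pop[i]:          # reject duplicates and non-improvements
        return False
    pop[i] = child                               # offspring replaces its own parent
    return True

pop = [9, 7, 5, 3]
accepted = run_generations(pop, toy_breed, toy_replace, generations=5, rng=random.Random(0))
```

The population size N is fixed by the list passed in and G by the `generations` argument, which matches the two algorithm parameters named above.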
Evaluation of Configurations
• Each individual in the population needs to be evaluated in terms of its objectives
• In this case study, objectives are to:
  • Minimize equivalent LE usage on Stratix FPGA
  • Minimize critical path delay
• 47 different Nios configurations were synthesized; area and delay data were collected from Quartus II reports
• Using these data, area and delay estimation equations were established using n-dimensional regression techniques
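The slides do not give the form of the estimation equations. As one possible reading, the sketch below fits an ordinary linear least-squares model, y = b0 + b1·x1 + … + bn·xn, by solving the normal equations in plain Python. The two features and all of the numbers are invented for illustration and are not the study's synthesis data.

```python
def fit_linear(rows, targets):
    """Least-squares fit of y = b0 + b1*x1 + ... + bn*xn via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]    # prepend intercept column
    m = len(X[0])
    # Augmented system (X^T X | X^T y).
    A = [[sum(x[a] * x[b] for x in X) for b in range(m)] +
         [sum(x[a] * y for x, y in zip(X, targets))] for a in range(m)]
    # Gauss-Jordan elimination with partial pivoting (assumes X^T X is non-singular).
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(m):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [u - f * v for u, v in zip(A[r], A[col])]
    return [A[i][m] / A[i][i] for i in range(m)]

def predict(coeffs, row):
    """Evaluate the fitted equation for one feature row."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))

# Hypothetical samples: (is_32_bit, cache_kB) -> equivalent LEs.
rows = [(0, 0), (1, 0), (0, 4), (1, 8), (0, 16), (1, 16)]
areas = [1500, 2300, 1660, 2620, 2140, 2940]
coeffs = fit_linear(rows, areas)
estimate = predict(coeffs, (1, 4))
```

Once fitted, such an equation lets the exploration algorithm score a configuration without running synthesis for it, which is the reason for building estimators in the first place.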
Results (To be presented at CCECE06)
Implementation of the Algorithm
• Testing of objective functions for 20 random test cases:
  • Area estimation: within 7.22% of actual values (on average)
  • Delay estimation: within 7.58% of actual values (on average)
• Estimation equations were integrated into a C++ implementation of the SEAMO algorithm
• The algorithm was run for various population sizes to determine suitable values
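The slides do not say exactly how the average error was computed. One common choice, assumed here, is the mean absolute percentage error between estimated and synthesized values. The numbers below are made up for illustration.

```python
def mean_abs_percent_error(estimated, actual):
    """Average of |estimate - actual| / actual, expressed in percent."""
    errors = [abs(e - a) / a * 100.0 for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)

# Hypothetical equivalent-LE counts: synthesized values and their estimates.
actual = [2000, 2500, 4000]
estimated = [2100, 2400, 4000]
error = mean_abs_percent_error(estimated, actual)
```

For these three made-up cases the individual errors are 5%, 4% and 0%, so the average is about 3%.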
Algorithm Convergence Characteristics
[Figure: Average LE Usage vs. Generation. Average equivalent LEs (axis from 1500 to 4000) plotted over generations 1 to 49, with one curve per population size from 10 to 50 in steps of 5]
[Figure: Average Delay vs. Generation. Average delay in ns (axis from 10 to 22) plotted over generations 1 to 50, with one curve per population size from 10 to 50 in steps of 5]
Results
[Figure: Area Versus Critical Path Delay for Initial and Evolved Population. Area in equivalent LEs (axis from 1000 to 7000) against critical path delay in ns (axis from 10 to 30), comparing the initial population with the population after 20 generations]
Conclusions and Future Work
• The purpose of this study was to investigate the feasibility of using a genetic algorithm to design embedded systems
  • It is still a work in progress…
• Genetic algorithms may be useful in assisting designers to make good decisions when deriving application-specific components from parameterized cores
• Current work involves the development of a tool that will utilize a genetic approach to semi-automatically generate application-specific soft processors from parameterized components
References
• [1] Altera Corporation, “Nios 3.0 CPU datasheet”, October 2004, Version 2.2
• [2] Altera Corporation Website, www.altera.com, February 2006
• [3] Altera Corporation, “Nios embedded processor 16-bit programmer's reference manual”, January 2004, Version 3.1
• [4] Altera Corporation, “Nios embedded processor 32-bit programmer's reference manual”, January 2003, Version 3.1
• [5] Altera Corporation, “Avalon bus specification reference manual”, July 2003, Version 2.3
References (Cont’d)
• [6] C. L. Valenzuela, “A simple evolutionary algorithm for multi-objective optimization (SEAMO)”, Proceedings of the 2002 Congress on Evolutionary Computation (CEC '02), vol. 1, 12-17 May 2002, pp. 717-722
• [7] P. K. Jha and N. D. Dutt, “Rapid estimation for parameterized components in high-level synthesis”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 1, issue 3, Sept. 1993, pp. 296-303
• [8] P. Yiannacouras, “The microarchitecture of FPGA-based soft processors”, Master's Thesis, University of Toronto, 2005, pp. 47-48