hardware-software codesign
Post on 30-Dec-2015
HW/SW Codesign Introduction
Unified design of hardware and software systems
All design is based on a logical model
No HW/SW partition
The logical model is maintained throughout the design process
Concurrent design: HW/SW optimized for peak performance
HW/SW Codesign Origins
Field of embedded systems
Demand for consumer information appliances (cell phone, PDA)
Specialized industrial products
Designers developed new tools and techniques to satisfy demand
These became HW/SW codesign
Traditional Systems Design
Early, key decision: HW/SW partition
The partition must be kept; changes require extensive redesign of both HW and SW
Lacks a well-defined HW/SW interface data flow
Leads to suboptimal designs and longer design-to-market time
HW/SW Codesign - A Solution
HW/SW codesign alleviates traditional design issues
Maps the system specification to a mixed HW/SW implementation
Conventional SW on a RISC processor
ASICs (Application-Specific Integrated Circuits)
Practical implementation of hardware/software co-design
The purpose of hardware/software co-design
Four common approaches to the task of hardware/software co-design:
unbiased
hardware-biased
software-biased
hardware acceleration
Co-design development routine
Objectives of development routine
The first stage is to determine the performance-critical section of a C program using a profiler tool and routine system, as described in Fig. 1.
Next step
The next step in the development routine is to implement the critical section in hardware, as shown in Fig. 2.
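When no profiler tool is at hand, the first stage can be approximated by timing sections by hand. The sketch below is only an illustration under stated assumptions, not the tool flow of Fig. 1: the `Profile` class and its `record`, `measure` and `hottest` members are hypothetical names. It accumulates time per named section and reports the section with the largest total, which would be the candidate for hardware.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical helper (not part of any tool named in the slides):
// accumulates execution time per named section of a program.
class Profile {
public:
    // Add an already measured duration to a section's running total.
    void record(const std::string& section, std::chrono::nanoseconds elapsed) {
        totals_[section] += elapsed;
    }

    // Run `work`, time it with a monotonic clock, and record the result.
    template <typename Work>
    void measure(const std::string& section, Work work) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        record(section,
               std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start));
    }

    // Name of the section with the largest accumulated time.
    std::string hottest() const {
        assert(!totals_.empty());
        auto best = totals_.begin();
        for (auto it = totals_.begin(); it != totals_.end(); ++it) {
            if (it->second > best->second) best = it;
        }
        return best->first;
    }

private:
    std::map<std::string, std::chrono::nanoseconds> totals_;
};
```

A caller would wrap each candidate routine in `measure("name", [&] { ... })` over a representative input and then ask `hottest()` which one dominates.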
Limitations on the type of C code:
All C types must be mapped to 16/32-bit signed integers in HardwareC
Type qualifiers, enumerated types, unions and structures are NOT permitted
Global variables are NOT allowed
Parameters for functions may consist of simple types, pointer types and data arrays only
No support for "gotos"
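As an illustration of code that stays within these restrictions, the function below uses only 16/32-bit signed integers, array and pointer parameters, no globals, no structures, no type qualifiers and no goto. The function name and the computation (a sum of absolute differences) are assumptions chosen for the example; whether a particular HardwareC flow accepts it exactly as written has not been checked.

```cpp
#include <stdint.h>

/* Hypothetical candidate for hardware mapping: sum of absolute
 * differences between two blocks of n samples.
 * - only 16/32-bit signed integer types
 * - parameters are simple types, pointers and data arrays
 * - no globals, structs, unions, enums, type qualifiers or goto */
int32_t sum_abs_diff(int16_t a[], int16_t b[], int32_t n, int32_t* max_diff) {
    int32_t total = 0;
    int32_t largest = 0;
    int32_t i;
    for (i = 0; i < n; i = i + 1) {
        int32_t d = (int32_t)a[i] - (int32_t)b[i];
        if (d < 0) {
            d = -d;
        }
        if (d > largest) {
            largest = d;
        }
        total = total + d;
    }
    *max_diff = largest;  /* second result returned through a pointer */
    return total;
}
```

Note that `const` is deliberately left off the array parameters, since type qualifiers are on the excluded list.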
Test Results
Execution time
Example 1
Software-only 80 ms
Software-hardware 47 ms
Example 2
Software-only 114 ms
Software-hardware 80 ms
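The two measurements can be turned into speedup factors, taken here as software-only time divided by software-hardware time. The helper name `speedup` is an assumption for illustration.

```cpp
#include <cassert>

// Speedup of the mixed implementation over software only:
// ratio of the two execution times, both in the same unit.
double speedup(double software_only_ms, double software_hardware_ms) {
    assert(software_hardware_ms > 0.0);
    return software_only_ms / software_hardware_ms;
}
```

With the figures above, Example 1 gives 80 / 47, about 1.70, and Example 2 gives 114 / 80 = 1.425, so the hardware-assisted versions take roughly 59% and 70% of the software-only time.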
Hardware or software?
Performance
Cost
Form factor
Flexibility
Safety
Architectural cleanness and simplicity
Intro
In multimedia applications, a considerable amount of memory is required.
Goal: to reduce this dominant cost.
Example: a quad-tree based image coding application.
Design Model
If we do not need the flexibility, one or more dedicated hardware processor(s) can be designed to perform the functions which are in the cycle.
When the flexibility is needed, we can use data level parallelism. The advantage of this approach is that it is simple to program but the memory overhead is high.
Design Model
Alternatively, we can use task level parallelism.
The advantage is that the code size per processor is relatively low.
The disadvantage is that the design time will be much higher due to the complex processor partitioning and memory management.
System Level Memory Optimization
All functions are taken together in one big function.
We have an algorithm that operates block by block. All computations are done on the first block before the next block is started.
Buffer memory for only one block will be required between the sub modules.
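The effect on buffer size can be sketched with a two-stage chain. Everything named here is hypothetical (`stageA`, `stageB`, `kBlock`, `runSeparate`, `runMerged`); the point is only that the merged schedule produces the same output while the buffer between the sub-modules shrinks from a whole frame to one block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kBlock = 4;  // samples per block (assumed)

// Two hypothetical sub-modules of the algorithm.
int stageA(int x) { return 2 * x + 1; }
int stageB(int x) { return x - 3; }

// Separate functions: stage A runs over the whole frame first, so the
// buffer between the stages must hold every sample of the frame.
std::vector<int> runSeparate(const std::vector<int>& frame, std::size_t& bufferSamples) {
    std::vector<int> between(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) between[i] = stageA(frame[i]);
    std::vector<int> out(frame.size());
    for (std::size_t i = 0; i < frame.size(); ++i) out[i] = stageB(between[i]);
    bufferSamples = between.size();
    return out;
}

// One big function: both stages finish a block before the next block
// starts, so the buffer between them holds a single block.
std::vector<int> runMerged(const std::vector<int>& frame, std::size_t& bufferSamples) {
    assert(frame.size() % kBlock == 0);
    int between[kBlock];
    std::vector<int> out(frame.size());
    for (std::size_t base = 0; base < frame.size(); base += kBlock) {
        for (std::size_t i = 0; i < kBlock; ++i) between[i] = stageA(frame[base + i]);
        for (std::size_t i = 0; i < kBlock; ++i) out[base + i] = stageB(between[i]);
    }
    bufferSamples = kBlock;
    return out;
}
```

For a 12-sample frame, `runSeparate` reports a 12-sample intermediate buffer and `runMerged` a 4-sample one, with identical results.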
QSDPCM
QSDPCM (Quadtree Structured Difference Pulse Code Modulation) is a compression technique for video.
The algorithm optimizes the displacement vector and the quadtree mean decomposition jointly.
The displacement which requires the minimum number of bits for the quadtree decomposition is selected.
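The selection rule can be sketched as an exhaustive search over candidate displacements, scoring each by the bits a quadtree decomposition of the difference block would need. This is a simplified illustration, not the QSDPCM coder itself: the bit costs (`kFlagBits`, `kMeanBits`), the uniformity test with tolerance `tol`, and all function names are assumptions.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using Image = std::vector<std::vector<int>>;  // indexed [y][x]

// Assumed bit costs for this sketch only (not taken from the slides).
const int kFlagBits = 1;  // marks a node as "split" or "leaf"
const int kMeanBits = 8;  // mean value stored in a leaf

// Bits needed by a simplified quadtree mean decomposition of the
// size x size square of `diff` with top-left corner (x0, y0).
// A square becomes a leaf when every sample is within `tol` of its mean.
// `size` is assumed to be a power of two.
int quadtreeBits(const Image& diff, int x0, int y0, int size, int tol) {
    long sum = 0;
    for (int y = y0; y < y0 + size; ++y)
        for (int x = x0; x < x0 + size; ++x) sum += diff[y][x];
    const long mean = sum / (size * size);
    bool uniform = true;
    for (int y = y0; y < y0 + size && uniform; ++y)
        for (int x = x0; x < x0 + size; ++x)
            if (std::labs(diff[y][x] - mean) > tol) { uniform = false; break; }
    if (uniform || size == 1) return kFlagBits + kMeanBits;
    const int h = size / 2;
    return kFlagBits
         + quadtreeBits(diff, x0, y0, h, tol) + quadtreeBits(diff, x0 + h, y0, h, tol)
         + quadtreeBits(diff, x0, y0 + h, h, tol) + quadtreeBits(diff, x0 + h, y0 + h, h, tol);
}

struct Choice { int dx; int dy; int bits; };

// Try every displacement in [-range, range] x [-range, range] and keep the
// one whose difference block is cheapest to code. The caller must ensure
// that all displaced reads stay inside `prev`.
Choice selectDisplacement(const Image& cur, const Image& prev,
                          int bx, int by, int size, int range, int tol) {
    assert(size > 0 && range >= 0);
    Choice best{0, 0, -1};
    Image diff(size, std::vector<int>(size));
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            for (int y = 0; y < size; ++y)
                for (int x = 0; x < size; ++x)
                    diff[y][x] = cur[by + y][bx + x] - prev[by + y + dy][bx + x + dx];
            const int bits = quadtreeBits(diff, 0, 0, size, tol);
            if (best.bits < 0 || bits < best.bits) best = Choice{dx, dy, bits};
        }
    }
    return best;
}
```

When the current block is an exact shifted copy of the previous frame, the difference block at the true displacement is all zeros and collapses to a single leaf, so that displacement wins the search.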
Summary
If the HW/SW partitioning is performed first, the remaining buffers cannot be optimized away afterwards.
For the QSDPCM application, we can do much better by optimizing before the HW/SW partitioning.
Mixed Hardware/Software Systems
Many digital systems contain both hardware and software
Combining hardware and software design tasks has several advantages.
One is that it may accelerate the design process. Another is that it may enable hardware/software trade-offs to be made dynamically, as the design progresses.
Mixed Hardware/Software Systems
Unless the hardware and software are designed together, we do not think of it as a mixed hardware/software system.
The distinguishing factor is whether the boundary between hardware and software is a logical boundary or a physical boundary.
Simulation of Hardware/Software Systems
Presents the problem of modeling the behavior of a system based on the behavior of the hardware and software components.
Requires a simulation environment that can understand the semantics of both the software and the hardware components
Automated Hardware/Software Co-Synthesis
Allows the designer to explore more of the design space by dynamically reconfiguring the hardware and software.
Another challenge for hardware/software co-synthesis is that hardware and software are often described using different languages and formalisms.
Automated Hardware/Software Co-Synthesis
May include hardware/software partitioning. Some of the considerations are:
Performance requirements
Implementation cost
Modifiability
Nature of computation
Concurrency
Communication
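To make the trade-off between performance and implementation cost concrete, here is a minimal partitioning sketch. The greedy heuristic is an assumption of this example, not a method taken from the slides, and it ignores the other considerations listed (modifiability, concurrency, communication). The `Candidate` fields and the `partition` function are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-function estimates a designer might collect.
struct Candidate {
    std::string name;
    double swTimeMs;  // execution time in software
    double hwTimeMs;  // execution time if moved to hardware
    double hwCost;    // implementation cost in hardware (must be > 0)
};

// Greedy heuristic: move functions to hardware in order of time saved per
// unit of cost until the cost budget is used up. Returns the names moved.
std::vector<std::string> partition(std::vector<Candidate> all, double budget) {
    for (const Candidate& c : all) assert(c.hwCost > 0.0);
    std::sort(all.begin(), all.end(), [](const Candidate& a, const Candidate& b) {
        return (a.swTimeMs - a.hwTimeMs) / a.hwCost >
               (b.swTimeMs - b.hwTimeMs) / b.hwCost;
    });
    std::vector<std::string> hardware;
    for (const Candidate& c : all) {
        const bool faster = c.hwTimeMs < c.swTimeMs;
        if (faster && c.hwCost <= budget) {
            hardware.push_back(c.name);
            budget -= c.hwCost;
        }
    }
    return hardware;  // everything not listed stays in software
}
```

A greedy pass like this is cheap but not optimal in general; it is shown only to make the cost/performance consideration tangible.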
Several Examples of Hardware/Software Co-Design
Embedded microprocessor systems
Heterogeneous multiprocessing systems
Application-specific instruction set processors
Special-purpose functional units
Application-specific co-processor
OOP & HW/SW Codesign
Develop the entire system in an object-oriented programming language
Treat hardware as an object
Allows for a unified design environment
HW functions can be simulated in a SW object and implemented concurrently
Problems with OOP
Synchronizing sequential code
Interleaved SW and HW functions
HW needs to know exactly when a data object is ready to be worked on
Same holds true for SW
C++ Class Library – Cylib
Handles this synchronization problem
Clock function and Done flag:
    objHardware.Modify( objData, blnDone );
    while (!(blnDone)) {
        objHardware.clock();
    }
    SoftwareFunction( objData );
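The clock-and-Done-flag pattern can be tried out with a mock hardware object. The class below is a hypothetical stand-in: the slides do not show the library's actual interface, so the `Hardware` class, its latency parameter, the doubling operation and `runSequence` are all assumptions made for this sketch.

```cpp
#include <cassert>

// Hypothetical stand-in for a hardware object; the real class library's
// interface is not shown in the slides, so everything here is assumed.
class Hardware {
public:
    explicit Hardware(int latencyCycles) : latency_(latencyCycles) {
        assert(latencyCycles >= 1);
    }

    // Start a multi-cycle operation on `data`; `done` is raised by
    // clock() once the operation has finished.
    void Modify(int& data, bool& done) {
        data_ = &data;
        done_ = &done;
        remaining_ = latency_;
        done = false;
    }

    // Advance the simulated hardware by one clock cycle.
    void clock() {
        ++cycles_;
        if (data_ == nullptr) return;  // nothing in flight
        if (--remaining_ > 0) return;  // still busy
        *data_ = *data_ * 2;           // assumed operation: double the value
        *done_ = true;
        data_ = nullptr;
    }

    int cycles() const { return cycles_; }

private:
    int latency_;
    int remaining_ = 0;
    int cycles_ = 0;
    int* data_ = nullptr;
    bool* done_ = nullptr;
};

// Software step that must not run before the hardware is done.
int SoftwareFunction(int data) { return data + 1; }

// The pattern from the slide: start the hardware, clock it until the
// Done flag is set, then hand the data to software.
int runSequence(Hardware& objHardware, int objData) {
    bool blnDone = false;
    objHardware.Modify(objData, blnDone);
    while (!blnDone) {
        objHardware.clock();
    }
    return SoftwareFunction(objData);
}
```

With a three-cycle mock, `runSequence(hw, 5)` clocks the hardware three times, the value becomes 10, and the software step then returns 11. The software never sees the data before the flag is set.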
C++ Class Library – Cylib 2
Approach is similar to interrupts
Complexity is greatly reduced
Interface allows HW/SW objects to work hierarchically and in parallel
Modification of the HW design requires changing only the class library
Another Approach: Compiler Generation
OOP approach won’t work for all cases
Example: MPU architecture changes
Traditional MPU replacement, 2 options:
Backwards-compatible hardware. Simply increase speed of functions, no new functionality.
Rewrite compiler, very costly.
Compiler Generation
Theory: third option, generate the compiler
With radical architecture changes, compilers wouldn’t need time to catch up
Ideal for user-defined processors
Extract HW architecture information, then generate optimized executable code from a high-level language
Compiler Generation
Retargetable compilers exist
Require significant human skill
Are simply a superset of all CPU instructions
A compiler generator would:
Overcome retargetable compiler limitations
Maintain the quality (speed, size, compilation time) of a conventional compiler
How it works 1
Optimize front-end code
Architecture-independent step
Performed by conventional compilers
Passes an optimized grammar tree structure to the next step
How it works 2
Get parameterized architecture info
Number of general registers, memory word size, instruction behavior, etc.
Modify tree branches
Using an existing language (“twig”)
Translate into pattern functions
Allocate registers
Generate executable code
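The back-end steps can be illustrated with a toy code generator that takes an expression tree plus a parameterized architecture description and emits instructions. This is a naive recursive scheme written for illustration; it does not use twig or tree-pattern matching, and the `ArchInfo` field, the `Node` layout, the instruction mnemonics and all function names are hypothetical.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Parameterized architecture description. Only the register count is
// modeled; word size and instruction behavior would be further fields.
struct ArchInfo {
    int generalRegisters;
};

// Expression tree handed over by the front end: a leaf names a
// variable, an inner node applies a binary operator ('+' or '*').
struct Node {
    char op;          // 0 for a leaf
    std::string var;  // used by leaves only
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> leaf(const std::string& name) {
    return std::unique_ptr<Node>(new Node{0, name, nullptr, nullptr});
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, "", std::move(l), std::move(r)});
}

// Emit code that leaves the value of `n` in register r<target>, using
// registers target + 1 and above as scratch (simple stack-like allocation).
void emit(const Node& n, int target, const ArchInfo& arch, std::vector<std::string>& code) {
    if (target >= arch.generalRegisters)
        throw std::runtime_error("out of registers (spilling is not modeled)");
    const std::string rt = "r" + std::to_string(target);
    if (n.op == 0) {
        code.push_back("LOAD " + rt + ", " + n.var);
        return;
    }
    emit(*n.left, target, arch, code);
    emit(*n.right, target + 1, arch, code);
    const std::string mnemonic = (n.op == '+') ? "ADD" : "MUL";
    code.push_back(mnemonic + " " + rt + ", " + rt + ", r" + std::to_string(target + 1));
}

// Generate code for a whole expression on the described architecture.
std::vector<std::string> generate(const Node& root, const ArchInfo& arch) {
    std::vector<std::string> code;
    emit(root, 0, arch, code);
    return code;
}
```

For the tree of (a + b) * c and an architecture with two general registers, this emits five instructions ending in `MUL r0, r0, r1`. The same tree on a one-register description is rejected, which shows how the architecture parameters steer code generation.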