IP-based Design
12 October 2001, Sungjoo Yoo, ISRC, Seoul Nat’l Univ.
Outline
Design productivity gap and design reuse
IP-based design
Interface-based design
Platform-based design
Function-architecture co-design
Practical issues in IP/Platform-based design
Summary
Design Productivity Gap
How to Increase Design Productivity?
(1) Reuse
  What to reuse? How to reuse?
  IPs (Intellectual Properties) -> IP-based design
  Previous system design (or architecture), platform -> platform-based design
How to Increase Design Productivity?
(2) Design at higher levels of abstraction
  E.g. 200 lines/man-day
  code size: C > assembly
  Abstraction levels higher than C code level: SPW, COSSAP, etc.; SDL, CORBA, etc.
(3) Combine reuse and high-level design
  Currently, function-architecture co-design
Design Methodology Evolution
IP-based Design
Basic strategy SoC design by assembling IP cores.
IP-based Design
Virtual Component (VC) = IP
IP-based Design
VC Interface
  VC Interface at RT level (VCI): to reuse RTL IPs
  System-level interface (SLIF): to reuse behavioral IPs
IP-based Design
VC integration with on-chip bus (OCB)
IP-based Design
Bus wrapper
IP-based Design
Example of VCI transactions
IP-based Design
VCI Options
IP-based Design w/ behavioral IP
System Level Interface (SLIF)
IP-based Design
VC example
IP-based Design
VC internal behavior
IP-based Design
Layering of VC refinement
IP-based Design
Integration of VCs with OCB
  IP w/ VCI -> VC w/ VCI + bus wrapper -> OCB
Incremental refinement of VC interface
  Behavioral IP -> SLIF -> OCB protocol, w/ the behavior unchanged.
IP-based design
  Formally, interface-based design
Interface-based Design
Separation between behavior and communication
Interface-based Design
Separation between behavior and communication
  It enables IP reuse.
  Each one can be refined separately: behavior refinement, communication refinement.
Design Automation Conf. ’97 paper by James A. Rowson (Cadence) and Alberto Sangiovanni-Vincentelli (Berkeley).
Communication in Interface-based Design
[Figure: sender and receiver; master and slave; substitution; repartition]
Incremental Communication Refinement
Checkpoint: IP-based Design
Separation between behavior and communication
  It works in incremental communication refinement.
  It includes SW IPs as well as HW ones (to be explained later in function-architecture co-design).
Bottom-up approach
  SoC design by assembling IP cores.
Evolution to Platform-based Design
A problem of bottom-up IP integration
  How can the designer find the optimal system architecture?
Can we reuse our design experience at a higher level than the IP level?
  Reuse of previous designs in a similar application domain.
  -> Platform-based design.
Platform-based Design
Platform
  Common hardware/software denominator that could be shared across multiple applications in a given application domain.
E.g. derivative design of Qualcomm CDMA mobile station modems (MSMs)
  MSM3000 -> MSM3100
  MSM3100 -> MSM5100
Derivative Design of Qualcomm CDMA Chips
MSM3000, 3100, 5100, …
  Added functionality and interfaces
  Base functionality and interfaces
Platform of MSM3000 series
Note: the platform consists of software parts as well as hardware ones.
MSM3000
MSM3100
Derivative Design Example
Derivative Design Example 1 MSM3000 -> 3100
Derivative MSM Design
Functional viewpoint
  MSM3000 -> 3100: + PLL, USB, PM (ADC, Vtg reg.), RF i/f, Vocoder (QCELP EVRC), Codec (chip in)
Derivative Design Example 2 MSM3100 -> 5100
Summary of Case Study: Derivative MSM Design
Functional viewpoint
  MSM3000 -> 3100: + PLL, USB, PM (ADC, Vtg reg.), RF i/f, Vocoder (QCELP EVRC), Codec (chip in)
  MSM3100 -> 5100: + gpsOne processor, Bluetooth baseband processor, MMC, R-UIM controllers, MP3, MIDI Vocoder DSP (QDSP2000)
Platform-based design in functional viewpoint
  Common functionality + added/modified functionality
Architectural viewpoint?
Levels of Platform-based Design
Architectural viewpoint
  Fixed platform at layout or RTL
  Parameterized platform
  A family of parameterized platforms
    Function/architecture codesign
Fixed Platform [p. 113, Surviving the SoC Revolution]
Fixed Platform at Layout
HW Kernel [p. 148, 156 Surviving the SoC …]
Parameterized Platform
UCI, digital camera platform
D$, I$ parameters: size, line, assoc
Bus parameters: width, BI coding
DCT parameters: precisions
Function-Architecture Co-design
SoC platforms
  A family of parameterized platforms
High abstraction level design
  SW-centric SoC design
  Top-down flow in platform-based design
A key design step
  Mapping functions to SoC platform
  W/ different HW/SW, communication mapping
Thus, it is named Function-Architecture Co-design.
Function-Architecture Co-design Flow
[Flow diagram: Function (algorithm, arch.-indep. opt.); Architecture (arch. dev.); Mapping (local optimization, com. network design); Evaluation; HW/SW implementation; Cosimulation/emulation]
Three Commercial Approaches
Cadence VCC
Coware N2C
Synopsys CoCentric
Function-Architecture Co-design: Cadence Approach
Function
Architecture
Function to Architecture Mapping
Mapping to HW and SW
Mapping Function to SW
Mapping Function to HW
Function-Architecture Co-design: Cadence Approach
Architecture Exploration w/ Different Architectures
Function-Architecture Co-design: Cadence Approach
Incremental Refinement of Function/Architecture
Three Ways to Performance Estimation
Limit the estimation to the runtime
  Applicable to other design metrics, e.g. power
Depending on where the function is mapped
  Processor and micro-controller
    Addition of assembly instruction delays
    Usage of virtual instruction set
  DSP
    Estimation of kernel function delays
  ASIC
    With measurement or simulation models
Using Virtual Instruction Set in SW Performance Estimation
Usage of Virtual Instruction Set (VIS)
Constructing the delay table
  From the datasheet
  Run benchmark programs on the actual processor and measure
  Run benchmark programs on cycle-accurate instruction set simulators
  Solve a set of linear equations
Compile the function code (e.g. in C) to the VIS.
Calculate the delay by adding the delays of virtual instructions.
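The two steps above (solving linear equations for the delay table, then summing virtual instruction delays) can be sketched as follows. This is a minimal illustration only: the instruction names, benchmark counts and measured cycle totals are made-up assumptions, not data from any real processor or from VCC.

```python
from fractions import Fraction


def solve_linear(counts, cycles):
    """Solve counts * delays = cycles by Gauss-Jordan elimination (exact fractions)."""
    n = len(cycles)
    m = [[Fraction(x) for x in row] + [Fraction(c)] for row, c in zip(counts, cycles)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]


def estimate_cycles(vis_counts, delay_table):
    """Add up count * delay over the virtual instructions of one function."""
    return sum(count * delay_table[op] for op, count in vis_counts.items())


# Hypothetical virtual instructions and benchmark data (assumptions).
VIS_OPS = ["ADD", "MUL", "LOAD"]
benchmark_counts = [[10, 0, 5], [2, 8, 4], [0, 3, 9]]  # one row per benchmark
measured_cycles = [25, 30, 33]                         # one total per benchmark

delay_table = dict(zip(VIS_OPS, solve_linear(benchmark_counts, measured_cycles)))

# Assumed VIS instruction counts of a function after compiling it to the VIS.
function_counts = {"ADD": 4, "MUL": 2, "LOAD": 3}
total = estimate_cycles(function_counts, delay_table)
```

With as many independent benchmarks as virtual instructions the system has a unique solution; with more benchmarks than instructions a least-squares fit would be the natural replacement for the exact solve.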
White Box Model
Characterization of SW Execution Time for DSPs
Two reasons for not using the VIS
  Many legacy/high-performance codes are assembly code.
  Performance is highly dependent on memory locations of instructions and data. VIS ignores accesses to different memory locations.
DSP SW codes are usually dataflow functions with small amounts of control.
  Dataflow functions dominate the total cycle count.
  Measure or estimate a set of standard DSP kernels or atomic functions on a processor.
  Derive a parameterized delay equation for each kernel function on each processor.
  Model an application as a scheduled sequence of kernel functions.
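A minimal sketch of the kernel-based approach, assuming one processor and three kernels. The delay equations and their coefficients are invented for illustration; real ones would be derived from measurement or estimation on the target DSP.

```python
# Hypothetical parameterized delay equations (in cycles) for one assumed DSP.
KERNEL_DELAY = {
    "fir": lambda taps, samples: 12 + samples * (4 + taps),
    "fft": lambda points: 40 + 5 * points * (points.bit_length() - 1),
    "viterbi": lambda bits, constraint_len: 30 + bits * 2 ** (constraint_len - 1),
}


def application_cycles(schedule):
    """Total cycles of an application modeled as a scheduled kernel sequence."""
    return sum(KERNEL_DELAY[name](**params) for name, params in schedule)


# An application described as an ordered list of (kernel, parameters).
schedule = [
    ("fir", {"taps": 16, "samples": 64}),
    ("fft", {"points": 64}),
    ("viterbi", {"bits": 100, "constraint_len": 3}),
]

total_cycles = application_cycles(schedule)  # 1292 + 1960 + 430 cycles
```

Supporting a second processor would only require another table of delay equations; the schedule itself stays unchanged.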
Abstraction Levels of DSP Kernel Functions
Basic arithmetic
  Integer barrel shift, integer add
  Integer & long multiply
Generic signal processing and complex mathematical functions
  Auto-correlation, FIR filter
  Convolution, discrete time FFT
Application specific, e.g. for CDMA
  Viterbi decoder
  Convolutional encoder
  Block interleaver
  64-ary orthogonal modulator
Non-programmable HW Components and Custom HW
Examples
  MPEG decoder, CDMA modem, IP peripheral IO block, etc.
Modeling method
  Generate delay equations for the VC block
    Output pin delay equations
    Considering the internal resource contention
  To derive the delay equations, measure real HW or use the simulation model
  Then, derive high level delay equations
    In terms of frame, packet, token processing
Function-Architecture Co-design: Cadence Approach
Refine HW/SW Architecture
Function-Architecture Co-design: Cadence Approach
Architectural Service Concept
Each architectural component provides a set of services.
E.g. processor gives
  Instruction fetching, interrupt handling, bus adapters
E.g. OS gives
  Scheduling, standard C library, software timers, etc.
E.g. bus gives
  Arbitration, single read/write, burst read/write.
Communication based on Architectural Services
Post()
RegisterMappedSWSender
Std. C lib. API
Std. C lib. Service
CPU mem. API
CPU Memory Service
Master Adapter API
FCFS Bus Adapter
Value(), Enabled()
RegisterMappedReceiver
ASIC mem. API
Bus Slave Adapter
Bus Slave API
SW
HW
Reuse of Architectural Services
If the designer uses a different bus, the remaining architectural services are reused.
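The reuse idea can be sketched as a layered model in which only the bus adapter is swapped. This is a toy illustration, not VCC's actual API: the class names loosely follow the slide labels, and the bus latencies are assumed values.

```python
class FcfsBusAdapter:
    """Assumed first-come-first-served bus: fixed cost per single write."""

    def write(self, memory, addr, value):
        memory[addr] = value
        return 4  # assumed cycles per write


class BurstBusAdapter:
    """A different (hypothetical) bus with a lower per-write cost."""

    def write(self, memory, addr, value):
        memory[addr] = value
        return 2  # assumed cycles per write


class RegisterMappedSender:
    """SW side: post() relies only on the write() service of whatever bus it gets."""

    def __init__(self, bus, reg_addr):
        self.bus = bus
        self.reg_addr = reg_addr

    def post(self, memory, value):
        return self.bus.write(memory, self.reg_addr, value)


class RegisterMappedReceiver:
    """HW side: enabled() and value() read the memory-mapped register."""

    def __init__(self, reg_addr):
        self.reg_addr = reg_addr

    def enabled(self, memory):
        return self.reg_addr in memory

    def value(self, memory):
        return memory[self.reg_addr]


def run(bus):
    memory = {}
    sender = RegisterMappedSender(bus, reg_addr=0x100)
    receiver = RegisterMappedReceiver(reg_addr=0x100)
    cycles = sender.post(memory, 42)
    return receiver.enabled(memory), receiver.value(memory), cycles


fcfs_result = run(FcfsBusAdapter())
burst_result = run(BurstBusAdapter())
```

The sender and receiver code is identical in both runs; only the adapter object and the resulting cycle count differ.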
Communication Patterns
Refine HW/SW Architecture
Coware N2C
Also supports function-architecture co-design
Gradual refinement of communication/behavior
  Untimed functional -> timed func. -> RTL
  C -> HDL
  RPC -> BCASH -> real i/f
HW/SW Interface Synthesis
  Comparable to communication patterns in VCC
InterState Synthesis
  HW synthesis from C description
  Abstract port -> physical port implementation
    E.g. scheduling port accesses
On-chip bus modeling
Synopsys Approach: CoCentric
Compared to N2C and VCC, CoCentric first focuses on control/dataflow mixed functional specification.
  Control flow: FSM
  Data flow: data flow models used in COSSAP
As a design entry, SystemC can be used.
Control/dataflow spec. is compiled to synthesizable/compilable HW and SW codes.
  Especially, HW synthesis from SystemC by SystemC Compiler
Existing Synopsys tool chains are used.
  COSSAP stream driven simulation (SDS)
CoCentric: Control/Dataflow Specification
CoCentric: Dataflow Model (static I/O pattern)
CoCentric: Dataflow Model (dynamic I/O pattern)
CoCentric: Control Flow Model
CoCentric: Case Study
OR model embedded in a hierarchical DFG
  Bit-serial signals of mixed text and image are multiplexed into a TDMA signal, modulated and transmitted.
Receiver
  Equalization based on LMS and demodulation
CoCentric: Case Study
Training Sequence Model
CoCentric: Case Study
OR model of Mux-TrainingSequence
CoCentric
Code scheduling and generation
Control model
  FSM -> sequential code
  Like Esterel -> C code
Dataflow model
  Use COSSAP
  • Fixed I/O pattern -> sequential code
  • Dynamic I/O pattern -> scheduler
CoCentric
Import of HDL models
  Cosimulation with external HDL simulators
Import of SystemC models
  Cosimulation with SystemC simulator
IP/Platform-based Design
IP/Platform Characterization
IP/Platform Authoring
SoC design validation in IP/Platform-based design
  Testbench reuse
  Mixed-abstraction-level simulation
SoC Testing
  IEEE P1500
IP and Platform Characterization
Problem: exhaustive characterization
  Due to the large number of parameters, full characterization does not seem to be possible.
  E.g. IP or platform w/ 100 binary parameters --> 2^100 configurations
  For each configuration
  • HW area, runtime, power estimation
  • By simulation/estimation
  In an exhaustive way, it is impossible to characterize the IP/platform!
How to do practical characterization?
IP and Platform Characterization
What the designer wants is Pareto-optimal points in design space.
Solution
  Search space pruning by decomposing the parameter space into orthogonal sub-spaces.
[Plot: candidate design points (x) in the execution time vs. power plane]
IP and Platform Characterization
Search space pruning
  Dependency between parameters
    E.g. I$ line size and I$ associativity are dependent on each other.
    E.g. I$ line size and D$ associativity are independent.
Key idea
  If parameter sub-spaces P1 and P2 are independent,
  Pareto-optimal points (P1 x P2) = POP(P1) x POP(P2)
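The key idea can be tried on a toy example. The sketch below assumes two independent sub-spaces whose (execution time, power) contributions simply add up; the numbers are invented. It combines only the per-sub-space Pareto points and filters the result once more, since some of those combinations can still be dominated, then checks the outcome against the exhaustive search.

```python
from itertools import product


def pareto(points):
    """Keep the points not dominated in (time, power); lower is better for both."""
    points = set(points)
    return sorted(
        p for p in points
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)
    )


def combine(a, b):
    """Assumption: metrics of independent sub-spaces are additive."""
    return (a[0] + b[0], a[1] + b[1])


# Hypothetical (time, power) of each configuration in two independent sub-spaces.
p1 = [(10, 5), (8, 7), (12, 6), (9, 9)]
p2 = [(4, 3), (5, 2), (6, 4), (3, 6)]

# Exhaustive: evaluate all |P1| * |P2| combinations.
exhaustive = pareto(combine(a, b) for a, b in product(p1, p2))

# Pruned: combine only the Pareto-optimal points of each sub-space.
pop1, pop2 = pareto(p1), pareto(p2)
pruned = pareto(combine(a, b) for a, b in product(pop1, pop2))

evaluated_exhaustive = len(p1) * len(p2)
evaluated_pruned = len(pop1) * len(pop2)
```

Here the pruned search evaluates 6 combinations instead of 16 and finds the same Pareto front; the saving grows multiplicatively with the number of independent clusters.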
IP and Platform Characterization
UCI, digital camera platform
Parameter Dependency
Parameter Clustering
Exhaustive search in a cluster
Cluster Merging
Design space = DS(AHI) x D(LMPQ)
Pareto-Optimal Points of (0.25, 0.08m)
Average pruning ratio = 99.999997%
IP and Platform Authoring
Problem: unexpected IP/Platform usage
  Negative test is required.
  Usage-scenario-based testing for expected IP usage
Solution
  Precisely define illegal inputs for each IP configuration
  Activate corner cases, e.g. by random test vector generators
  Check illegal inputs by assertion check
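A minimal sketch of negative testing with assertion checks, using a hypothetical FIFO-like IP model. The model, its parameters and the value ranges are assumptions chosen for illustration.

```python
import random


def fifo_push(depth, occupancy, data):
    """Hypothetical IP model: push one 8-bit word into a FIFO of the given depth."""
    # Illegal inputs are defined per configuration (here: per depth).
    assert depth > 0, "illegal configuration: depth must be positive"
    assert 0 <= data <= 0xFF, "illegal input: data must fit in 8 bits"
    assert 0 <= occupancy < depth, "illegal input: push into a full FIFO"
    return occupancy + 1


def random_negative_test(depth, trials, seed=0):
    """Drive random vectors, including corner cases, and count rejected ones."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(trials):
        occupancy = rng.randint(0, depth)  # depth itself is the full-FIFO corner
        data = rng.randint(-1, 0x100)      # -1 and 0x100 are out-of-range corners
        try:
            fifo_push(depth, occupancy, data)
        except AssertionError:
            rejected += 1
    return rejected


rejected = random_negative_test(depth=4, trials=200)
```

Note that Python drops `assert` statements when run with `-O`; a production checker would raise explicit exceptions instead.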
IP/Platform Trade-off
Quality, verification, characterization vs. parameterization
[Plot: quality, verifiability, testability, characterizability vs. no. of parameters, generality; 1 = single instance]
SoC Validation in IP/Platform-based Design
SoC validation takes up to 2/3 of the design cycle. Why?
  E.g. if the integration of 1 core has a 0.1% probability of a bug, then that of a 200-core SoC has about a 20% probability of a bug!
How to reduce the validation efforts?
  Testbench reuse
  Mixed-abstraction-level simulation
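The 20% figure in the 200-core example is the rounded linear estimate 200 x 0.1%. Assuming the per-core integration bugs are independent, the probability of at least one bug can be computed directly:

```python
def integration_bug_probability(per_core_prob, cores):
    """P(at least one integration bug), assuming independent cores."""
    return 1.0 - (1.0 - per_core_prob) ** cores


p = integration_bug_probability(0.001, 200)
print(f"{p:.1%}")  # a little over 18%, close to the rounded 20%
```

The same function shows how quickly the risk grows with the number of cores.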
SoC Validation in IP/Platform-based Design
Testbench reuse
  Most errors come from the integration step.
SoC Validation in IP/Platform-based Design
Mixed-abstraction-level simulation
  Incremental integration of new functionality
[Figure: high-level specification with modules M1, M2, M3, M4; intermediate implementation with M1, M3, M4, OS, HW wr., PIP, AMBA]
SoC Validation in IP/Platform-based Design
Mixed-abstraction-level simulation
  A conventional method: bus functional model (BFM)
    Functional memory access
    • E.g. write to variable x located at 0x100
    Cycle-accurate (C/A) memory access
    • addr_bus = 0x100; nrw = 0;
    • 1 cycle delay;
    • return data_bus;
  SystemC and Coware
    BCASH: bus cycle accurate shell
    • RPC (remote procedure call) access to C/A accesses
  TIMA
    Wrapper concept
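A toy bus functional model can make the two access levels concrete. In this sketch the functional side reads and writes named variables, and the BFM turns each access into signal assignments plus a one-cycle delay. The signal names follow the slide; the nrw polarity (0 = read, 1 = write) and the address map are assumptions.

```python
class CycleAccurateBus:
    """Toy cycle-accurate bus: each access drives the signals and costs one cycle."""

    def __init__(self):
        self.memory = {}
        self.cycle = 0
        self.addr_bus = 0
        self.data_bus = 0
        self.nrw = 0  # assumed polarity: 0 = read, 1 = write

    def read(self, addr):
        self.addr_bus, self.nrw = addr, 0
        self.cycle += 1  # 1 cycle delay
        self.data_bus = self.memory.get(addr, 0)
        return self.data_bus

    def write(self, addr, data):
        self.addr_bus, self.data_bus, self.nrw = addr, data, 1
        self.cycle += 1  # 1 cycle delay
        self.memory[addr] = data


class BusFunctionalModel:
    """Maps functional accesses to variables onto cycle-accurate bus accesses."""

    def __init__(self, bus, address_map):
        self.bus = bus
        self.address_map = address_map

    def write_var(self, name, value):
        self.bus.write(self.address_map[name], value)

    def read_var(self, name):
        return self.bus.read(self.address_map[name])


bus = CycleAccurateBus()
bfm = BusFunctionalModel(bus, {"x": 0x100})  # variable x located at 0x100
bfm.write_var("x", 42)                       # functional access
value = bfm.read_var("x")                    # becomes a C/A read on the bus
```

The functional code never touches addr_bus or nrw directly, which is what lets a high-level module be simulated together with cycle-accurate ones.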
SoC Testing
Why SoC testing?
  To detect manufacturing defects.
  Deep sub-micron technology -> various types of fault
Why core-based testing?
  Reuse of core test
  SoC test is constructed based on core test.
Core-based SoC testing
  Core provider
    DFT (design for testability) hardware and test patterns
  SoC integrator
    SoC-level DFT, e.g. TAM (test access mechanism)
    SoC test patterns using core test patterns
SoC Testing
SoC Test Challenges
  Cores have different test methods.
    E.g. BIST, scan, mixed, memory test, etc.
  Test access to cores
    Cores can be deeply embedded.
    Test access should be routed to the cores.
  Test optimization
    Many cores -> very long test time -> TTM loss
    Area overhead of SoC test
    Power overhead of SoC test
    • E.g. BIST consumes more power than usual operations.
Core Test Requirements
Test access mechanism (TAM) and test wrapper
Core Test Requirements
Wrapper and TAM
Core-based Test Standard
IEEE P1500 SECT (Standard for Embedded Core Test)
  Core Test Language (CTL)
    Core test knowledge transfer
    Test patterns are written to be reused.
  Core Test Wrapper Architecture
    Test access to embedded cores
IEEE P1500
Core test wrapper
Elements
- Wrapper Inst. Reg.
- Wrapper Bypass Reg.
- Wrapper Boundary Reg.
Modes
- Transparent mode
- Serial InTest mode
- Serial ExTest mode
- Parallel InTest mode
- Parallel ExTest mode
Automation of SoC Test Design
  Test wrapper generation and insertion
  Compliancy checking
  Interconnect test generation
  Test access planning and synthesis
    E.g. # of TAMs, mapping cores to TAMs, TAM width
  Test expansion
    Core test patterns -> SoC test patterns
  Test scheduling
    Schedule core tests to minimize test resources (time, area, power, etc.).
    Power
    Test application time and storage capacity
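As one simple illustration of test scheduling, the sketch below maps core tests onto a fixed number of TAMs with a greedy longest-test-first heuristic, aiming to shorten the overall test application time. This is a generic heuristic chosen for illustration, not the method of any particular tool, and the core names and test times are invented.

```python
import heapq


def schedule_core_tests(test_times, num_tams):
    """Greedy heuristic: assign the longest remaining test to the least-loaded TAM."""
    heap = [(0, tam, []) for tam in range(num_tams)]  # (load, TAM id, cores)
    heapq.heapify(heap)
    for core, time in sorted(test_times.items(), key=lambda kv: -kv[1]):
        load, tam, cores = heapq.heappop(heap)
        heapq.heappush(heap, (load + time, tam, cores + [core]))
    return sorted(heap, key=lambda entry: entry[1])


# Hypothetical per-core test times in arbitrary time units.
test_times = {"cpu": 90, "dsp": 70, "mpeg": 60, "usb": 30, "uart": 10}

plan = schedule_core_tests(test_times, num_tams=2)
test_application_time = max(load for load, _, _ in plan)
```

A fuller formulation would also account for TAM width, power limits during test and wrapper area, which the slide lists among the resources to minimize.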
Case Study of Core-based Test Design
Fujitsu/LogicVision, ‘98
Case Study of Core-based Test Design
Core 1 (VD) DfT
Memory BIST insertion
Scan chain insertion
Logic BIST insertion
Case Study of Core-based Test Design
MPEG-2 Chip Core
Case Study of Core-based Test Design: Testability Results
Summary
Design productivity gain
  By reuse
    IP reuse -> IP-based design
    Platform reuse -> platform-based design
    Test bench reuse
    Test reuse
  By high-level SoC design
    Concurrent programming of SoC
  Reuse and high-level SoC design
    Function-architecture co-design
Practical issues