L1: Application Specific Integrated Circuits: Introduction Jun-Dong Cho SungKyunKwan Univ. Dept. of ECE, Vada Lab. http://vada.skku.ac.kr

L1: Application Specific Integrated Circuits:

Introduction

Jun-Dong Cho

SungKyunKwan Univ.

Dept. of ECE, Vada Lab. http://vada.skku.ac.kr

VLSI Algorithmic Design Automation Lab.

Contents

• Why ASIC?
• Introduction to System On Chip Design
• Hardware and Software Co-design
• Low Power ASIC Designs

Why ASIC - Design productivity grows!

• Complexity increases 40% per year
• Design productivity increases 15% per year
• Integration of PCB on a single die
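
The two growth rates on this slide compound, so the distance between what can be built and what can be designed widens every year. A minimal sketch of that arithmetic follows; the function name `design_gap` and the normalization of both quantities to 1.0 in year 0 are assumptions made for illustration, not part of the original slides.

```python
def design_gap(years, complexity_growth=0.40, productivity_growth=0.15):
    """Ratio of chip complexity to design productivity after `years`.

    Both quantities are normalized to 1.0 in year 0 (an assumption);
    the default growth rates are the 40% and 15% figures from the slide.
    """
    complexity = (1.0 + complexity_growth) ** years
    productivity = (1.0 + productivity_growth) ** years
    return complexity / productivity


if __name__ == "__main__":
    for year in (1, 5, 10):
        print(f"year {year:2d}: complexity / productivity = {design_gap(year):.2f}x")
```

Under these assumptions the ratio grows by a factor of 1.40 / 1.15 each year, which is the widening that motivates reuse and higher-level design later in the deck.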

Silicon in 2010

Die Area: 2.5 x 2.5 cm
Voltage: 0.6 V
Technology: 0.07 µm

                 Density       Access Time
                 (Gbits/cm2)   (ns)
DRAM             8.5           10
DRAM (Logic)     2.5           10
SRAM (Cache)     0.3           1.5

                 Density       Max. Ave. Power   Clock Rate
                 (Mgates/cm2)  (W/cm2)           (GHz)
Custom           25            54                3
Std. Cell        10            27                1.5
Gate Array       5             18                1
Single-Mask GA   2.5           12.5              0.7
FPGA             0.4           4.5               0.25
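
To make the per-area figures concrete, the sketch below multiplies them by the 2.5 x 2.5 cm die area from the slide. It assumes, purely for illustration, that the whole die is filled with a single implementation style; the names `LOGIC_STYLES` and `die_totals` are hypothetical.

```python
DIE_AREA_CM2 = 2.5 * 2.5  # die area from the slide, in cm^2

# (density in Mgates/cm^2, max. average power in W/cm^2), copied from the slide
LOGIC_STYLES = {
    "Custom": (25.0, 54.0),
    "Std. Cell": (10.0, 27.0),
    "Gate Array": (5.0, 18.0),
    "Single-Mask GA": (2.5, 12.5),
    "FPGA": (0.4, 4.5),
}


def die_totals(style, area_cm2=DIE_AREA_CM2):
    """Return (total Mgates, total watts) if the whole die used one style."""
    density, power = LOGIC_STYLES[style]
    return density * area_cm2, power * area_cm2


if __name__ == "__main__":
    for name in LOGIC_STYLES:
        mgates, watts = die_totals(name)
        print(f"{name:15s} {mgates:8.2f} Mgates  {watts:7.2f} W")
```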

ASIC Principles

• Value-added ASIC for huge volume opportunities; standard parts for quick time-to-market applications
• Economics of Design
  – Fast Prototyping, Low Volume
  – Custom Design, Labor Intensive, High Volume
• CAD Tools Needed to Achieve the Design Strategies
  – System-level design: Concept to VHDL/C
  – Physical design: VHDL/C to silicon, Timing closure (Monterey, Magma, Synopsys, Cadence, Avant!)
• Design Strategies: Hierarchy; Regularity; Modularity; Locality
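
The economics bullet above contrasts a low-volume prototyping route with a labor-intensive, high-volume custom route. One way to reason about it is a break-even volume between a fixed (non-recurring) cost and a per-unit cost. The sketch below is only an illustration: the function name and every cost figure are hypothetical and do not come from the slides.

```python
def break_even_volume(nre_proto, unit_proto, nre_custom, unit_custom):
    """Volume at which total cost of the custom route equals the prototyping route.

    Total cost is modeled as nre + unit_cost * volume (a simplifying assumption).
    """
    if unit_proto <= unit_custom:
        raise ValueError("custom route must have the lower unit cost to break even")
    return (nre_custom - nre_proto) / (unit_proto - unit_custom)


if __name__ == "__main__":
    # Hypothetical numbers: cheap to start but costly per part, versus the reverse.
    volume = break_even_volume(nre_proto=10_000, unit_proto=50,
                               nre_custom=1_000_000, unit_custom=5)
    print(f"custom design pays off above {volume:.0f} units")
```

Below the break-even volume the fast-prototyping route is cheaper in this model; above it, the custom design amortizes its engineering cost.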

ASIC Design Strategies

• Design is a continuous tradeoff to achieve performance specs with adequate results in all the other parameters.
• Performance Specs - function, timing, speed, power
• Size of Die - manufacturing cost
• Time to Design - engineering cost and schedule
• Ease of Test Generation & Testability - engineering cost, manufacturing cost, schedule

ASIC Flow

Structured ASIC Designs

• Hierarchy: Subdivide the design into many levels of sub-modules
• Regularity: Subdivide into the maximum number of similar sub-modules at each level
• Modularity: Define sub-modules unambiguously, with well-defined interfaces
• Locality: Maximize local connections, keeping critical paths within module boundaries

ASIC Design Options

• Programmable Logic
• Programmable Interconnect
• Reprogrammable Gate Arrays
• Sea of Gates & Gate Array Design
• Standard Cell Design
• Full Custom Mask Design
• Symbolic Layout
• Process Migration - Retargeting Designs

ASIC Design Methodologies

                     Custom      Cell-based  Prediffused  Prewired
Density              Very High   High        High         Medium-Low
Performance          Very High   High        High         Medium-Low
Flexibility          Very High   High        Medium       Low
Design time          Very Long   Short       Short        Very Short
Manufacturing time   Medium      Medium      Short        Very Short
Cost - low volume    Very High   High        High         Low
Cost - high volume   Low         Low         Low          High

Why SOC?

• SOC specs are coming from ICT system engineers rather than from RTL descriptions.

• SOC will bridge the gap between software and its implementation in novel, energy-efficient silicon architectures.

• In SOC design, chips are assembled at the IP block level (design reuse) and IP interface level rather than at the gate level.

Common Fabric for IP Blocks

• Soft IP blocks are portable, but not as predictable as hard IP.
• Hard IP blocks are very predictable since a specific physical implementation can be characterized, but they are hard to port since they are often tied to a specific process.
• A common fabric is required for both portability and predictability.
• Wide availability: Cell Based Array, a metal-programmable architecture that provides the performance of a standard cell and is optimized for synthesis.

Four main applications

Set-top box: Mobile multimedia system, base station for the home local-area network.

Digital PCTV: concurrent use of TV, 3D graphics, and Internet services

Set-top box LAN service: Wireless home-networks, multi-user wireless LAN

Navigation system: steer and control traffic and/or goods-transportation

PC-Multimedia Applications

Types of System-on-a-Chip Designs

Physical gap

Timing closure problem: layout-driven logic and RT-level synthesis

Energy efficiency requires locality of computation and storage: a match for stream-based data processing of speech, images, and multimedia-system packets.

Next-generation SOC designers must bridge the architectural gap between system specification and energy-efficient IP-based architectures, while CAE vendors and IP providers will bridge the physical gap.

Circular Y-Chart

SOC Co-Design Challenges

• Current systems are complex and heterogeneous
  – Contain many different types of components
  – Half of the chip can be filled with 200 low-power, RISC-like processors (ASIP) interconnected by field-programmable buses, embedded in 20 Mbytes of distributed DRAM and flash memory
  – The other half: ASIC
• Computational power will not result from multi-GHz clocking but from parallelism, with clock rates below 200 MHz. This will greatly simplify the design for correct timing, testability, and signal integrity.

Bridging the architectural gap

• One-M gate reconfigurable, one-M gate hardwired logic.
• 50 GIPS for programmable components or 500 GIPS for dedicated hardware
• Product reliability: design at a level far above the RT level, with reuse factors in excess of 100
• Trade-off: 100 MOPS/watt (microprocessor) vs. 100 GOPS/watt (hardwired)
• Reconfigurable computing with a large number of computing nodes and a very restricted instruction set (Pleiades)
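
The trade-off figures on this slide can be turned into energy per operation, since one watt is one joule per second. The following is a small sketch of that conversion; the function and constant names are hypothetical, and only the two efficiency figures come from the slide.

```python
def energy_per_op_joules(ops_per_second_per_watt):
    """Energy spent per operation, in joules, for a given efficiency in ops/s per watt."""
    return 1.0 / ops_per_second_per_watt


MICROPROCESSOR_OPS_PER_WATT = 100e6  # 100 MOPS/watt, from the slide
HARDWIRED_OPS_PER_WATT = 100e9       # 100 GOPS/watt, from the slide

if __name__ == "__main__":
    e_cpu = energy_per_op_joules(MICROPROCESSOR_OPS_PER_WATT)
    e_hw = energy_per_op_joules(HARDWIRED_OPS_PER_WATT)
    print(f"microprocessor: {e_cpu * 1e9:.1f} nJ/op")
    print(f"hardwired:      {e_hw * 1e12:.1f} pJ/op")
    print(f"ratio:          {e_cpu / e_hw:.0f}x")
```

The three-orders-of-magnitude gap between the two is the space that reconfigurable approaches such as Pleiades aim to occupy.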

Why Lower Power

• Portable systems
  – long battery life
  – light weight
  – small form factor
• IC priority list
  – power dissipation
  – cost
  – performance
• Technology direction
  – Reduced voltage/power designs based on mature high performance IC technology, high integration to minimize size, cost, power, and speed

Microprocessor Power Dissipation

(Figure: power in W, 0 to 50, plotted against year, 1980 to 2000. Processors shown: i286, i386 DX 16, i486 DX 25, i486 DX 50, i486 DX2 66, i486 DX4 100, P5 66, P6 166, P II 300, P III 500, P-PC601 50, P-PC604 133, P-PC750 400, Alpha 21064 200, Alpha 21164, Alpha 21264.)

Levels for Low Power Design

System          Hardware-software partitioning, Power down
Algorithm       Complexity, Concurrency, Locality, Regularity, Data representation
Architecture    Parallelism, Pipelining, Signal correlations, Instruction set selection, Data rep.
Circuit/Logic   Sizing, Logic Style, Logic Design
Technology      Threshold Reduction, Scaling, Advanced packaging, SOI

Possible Power Savings at Different Design Levels

Level of Abstraction   Expected Saving
Algorithm              10 - 100 times
Architecture           10 - 90%
Logic Level            20 - 40%
Layout Level           10 - 30%
Device Level           10 - 30%

Power-hungry Applications

Signal Compression: HDTV Standard, ADPCM, Vector Quantization, H.263, 2-D motion estimation, MPEG-2 storage management

Digital Communications: Shaping Filters, Equalizers, Viterbi decoders, Reed-Solomon decoders

New Computing Platforms

• SOC power efficiency more than 10 GOPS/W
• Higher on-chip system integration: COTS: 100 W, SOAC: 10 W (inter-chip capacitive loads, I/O buffers)
• Speed & Performance: shorter interconnection, fewer drivers, faster devices, more efficient processing architectures
• Mixed signal systems
• Reuse of IP blocks
• Multiprocessor, configurable computing
• Domain-specific, combined memory-logic

$P = kCFV^2$

Three Factors affecting Energy

• Reducing waste by Hardware Simplification: redundant h/w extraction, locality of reference, demand-driven / data-driven computation, application-specific processing, preservation of data correlations, distributed processing
• All-in-one approach (SOC): I/O pin and buffer reduction
• Voltage-reducible hardware
  – 2-D pipelining (systolic arrays)
  – SIMD: parallel processing, useful for data with parallel structure
  – VLIW: flexible approach

IBM’s PowerPC Lower Power Architecture

• Optimum supply voltage through hardware parallelism, pipelining, and parallel instruction execution
  – 603e executes five instructions in parallel (IU, FPU, BPU, LSU, SRU)
  – FPU is pipelined so a multiply-add instruction can be issued every clock cycle
  – Low power 3.3-volt design
• Use small complex instruction with smaller instruction length
  – IBM’s PowerPC 603e is RISC
• Superscalar: CPI < 1
  – 603e issues as many as three instructions per cycle
• Low Power Management
  – 603e provides four software controllable power-saving modes.
• Copper Processor with SOI
• IBM’s Blue Logic ASIC: new design reduces power by a factor of 10

Power-Down Techniques

◆ Lowering the voltage along with the clock actually alters the energy-per-operation of the microprocessor, reducing the energy required to perform a fixed amount of work
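
The point above can be checked with a first-order model. The sketch below assumes dynamic power of the form C * V^2 * f and ignores leakage and any limit on how far voltage can drop at a given clock; those simplifications, the function names, and all numeric values are assumptions for illustration, not figures from the slides.

```python
import math


def dynamic_power(c_eff, voltage, freq_hz):
    """First-order dynamic power: effective capacitance * V^2 * f (leakage ignored)."""
    return c_eff * voltage ** 2 * freq_hz


def energy_for_work(cycles, c_eff, voltage, freq_hz):
    """Energy to execute a fixed number of cycles: power * (cycles / f)."""
    run_time_s = cycles / freq_hz
    return dynamic_power(c_eff, voltage, freq_hz) * run_time_s


if __name__ == "__main__":
    cycles, c_eff = 1e6, 1e-9  # hypothetical workload and capacitance
    full = energy_for_work(cycles, c_eff, voltage=3.3, freq_hz=100e6)
    slow_clock = energy_for_work(cycles, c_eff, voltage=3.3, freq_hz=50e6)
    slow_and_low_v = energy_for_work(cycles, c_eff, voltage=1.65, freq_hz=50e6)
    print(math.isclose(full, slow_clock))  # halving the clock alone saves no energy
    print(slow_and_low_v / full)           # halving the voltage as well cuts it to about 1/4
```

In this model, slowing the clock alone only stretches the same energy over a longer time; the energy per operation changes only when the voltage drops with it.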

Implementing Digital Systems

H/W and S/W Co-design

Three Co-Design Approaches

N.S. Voros et al., “Hardware-software co-design of embedded systems using multiple formalisms for application development,” IFIP International Conference FORTE/PSTV’98, Nov. ’98

• ASIP co-design: starts with an application, builds a specific programmable processor, and translates the application into software code. H/w and s/w partitioning includes the instruction set design.
• H/w s/w synchronous system co-design: a s/w processor as the master controller, and a set of h/w accelerators as co-processors. Vulcan, Codes, Tosca, Cosyma
• H/w s/w for distributed systems: mapping of a set of communicating processes onto a set of interconnected processors. Behavioral decomposition, process allocation, and communication transformation. Coware (powerful), Siera (reuse), Ptolemy (DSP)

Mixing H/W and S/W

• Argument: Mixed hardware/software systems represent the best of both worlds: high performance, flexibility, design reuse, etc.
• Counterpoint: From a design standpoint, it is the worst of both worlds
  – Simulation: Problems of verification and test become harder
  – Interface: Too many tools, too many interactions, too much heterogeneity
• Hardware/software partitioning is “AI-complete”!

Low power partitioning approach

• Different HW resources are invoked according to the instruction executed at a specific point in time
• During the execution of the add operation, the ALU and registers are used, but the multiplier is idle.
• Non-active resources still consume energy since their circuits continue to switch
• Calculate the wasted energy
• Add application-specific cores with partial running
• Whenever one core is running, all the other cores are shut down
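
The "calculate the wasted energy" step above can be sketched as a walk over an instruction trace, charging every resource that an instruction does not use. This is only an illustrative model: the resource names, the per-cycle energy values, the one-instruction-per-cycle assumption, and the function `wasted_energy` are all hypothetical.

```python
# Hypothetical per-cycle switching energy of each resource (arbitrary units).
ENERGY_PER_CYCLE = {"alu": 2.0, "multiplier": 5.0, "registers": 1.0}

# Hypothetical map from instruction to the resources it actually uses.
RESOURCES_USED = {
    "add": {"alu", "registers"},
    "mul": {"multiplier", "registers"},
}


def wasted_energy(trace, energy=ENERGY_PER_CYCLE, used=RESOURCES_USED):
    """Sum the energy of resources that keep switching while idle.

    Assumes one instruction per cycle and no power-down of idle resources.
    """
    total = 0.0
    for instruction in trace:
        idle = set(energy) - used[instruction]
        total += sum(energy[name] for name in idle)
    return total


if __name__ == "__main__":
    print(wasted_energy(["add", "add", "mul"]))
```

A partitioning tool could compare this figure across candidate core assignments; shutting down idle cores, as the last bullet describes, drives the wasted term toward zero in this model.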

ASIP Design

• Given a set of applications, determine the microarchitecture of the ASIP (i.e., configuration of functional units in datapaths, instruction set)
• To accurately evaluate the performance of a processor on a given application, one needs to compile the application program onto the processor datapath and simulate the object code.
• The microarchitecture of the processor is a design parameter!

ASIP Design Flow

Cross-Disciplinary nature

• Software for low power: loop transformation leads to much higher temporal and spatial locality of data.
• Code size becomes an important objective
• Software will eventually become a part of the chip
• Behavior-platform-compiler codesign: codesigned with C++ or JAVA, describing their h/w and s/w implementation.
• Multidisciplinary system thinking is required for future designs (e.g., Eindhoven Embedded Systems Institute http://www.eesi.tue.nl/english)
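
Loop fusion is one example of the kind of loop transformation mentioned in the first bullet. The sketch below shows only the structure of the transformation; Python is used for readability, and the actual memory-traffic benefit would appear in compiled code on the target processor, not in this script. The function names and the arithmetic are hypothetical.

```python
def unfused(samples):
    """Two separate loops: the intermediate array is written in full, then read back."""
    scaled = [2 * value for value in samples]    # first pass stores every intermediate
    return [value + 1 for value in scaled]       # second pass reloads them later


def fused(samples):
    """One loop: each intermediate is consumed right after it is produced."""
    return [2 * value + 1 for value in samples]  # no intermediate array is kept


if __name__ == "__main__":
    data = [1, 2, 3]
    print(unfused(data) == fused(data))
```

Both versions compute the same result; the fused form shortens the distance between producing and consuming each value, which is the temporal locality the slide refers to.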

VLSI Signal Processing Design Methodology

• pipelining, parallel processing, retiming, folding, unfolding, look-ahead, relaxed look-ahead, and approximate filtering
• bit-serial, bit-parallel and digit-serial architectures, carry save architecture
• redundant and residue systems
• Viterbi decoder, motion compensation, 2D-filtering, and data transmission systems
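
As one illustration of the look-ahead technique listed above, consider the first-order recursion y[n] = a*y[n-1] + x[n]. Substituting the recursion into itself once gives y[n] = a^2*y[n-2] + a*x[n-1] + x[n], which depends on y[n-2] instead of y[n-1] and so leaves room for an extra pipeline stage in the feedback loop. The sketch below checks that both forms agree; the choice of filter, the zero initial conditions, and the function names are assumptions made for this example.

```python
def direct(x, a):
    """Reference recursion y[n] = a*y[n-1] + x[n], with y[-1] = 0."""
    y, previous = [], 0.0
    for sample in x:
        previous = a * previous + sample
        y.append(previous)
    return y


def look_ahead(x, a):
    """One-step look-ahead form y[n] = a^2*y[n-2] + a*x[n-1] + x[n].

    Samples before n = 0 are taken as zero, matching `direct`.
    """
    y = []
    for n, sample in enumerate(x):
        y_two_back = y[n - 2] if n >= 2 else 0.0
        x_one_back = x[n - 1] if n >= 1 else 0.0
        y.append(a * a * y_two_back + a * x_one_back + sample)
    return y


if __name__ == "__main__":
    signal = [1.0, 0.5, -0.25, 2.0, 0.0, 1.5]
    reference = direct(signal, a=0.9)
    transformed = look_ahead(signal, a=0.9)
    print(all(abs(p - q) < 1e-9 for p, q in zip(reference, transformed)))
```

The transformed form costs one extra multiply-add per sample, which is the usual price of look-ahead: more hardware in exchange for a loop that can be clocked faster or run at a lower voltage.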

Low Power DSP

DO-LOOP Dominant

• VSELP Vocoder: 83.4%
• 2D 8x8 DCT: 98.3%
• LPC computation: 98.0%

DO-LOOP Power Minimization ==> DSP Power Minimization

VSELP: Vector Sum Excited Linear Prediction
LPC: Linear Prediction Coding
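
The implication on this slide can be made quantitative with an Amdahl-style estimate. The sketch below assumes the percentages are the DO-loop share of total power, which the slide does not state explicitly, and the 50% loop-power reduction is a hypothetical input; the function name is also hypothetical.

```python
# DO-loop share per application, from the slide (interpreted here as share of total power).
LOOP_SHARE = {"VSELP Vocoder": 0.834, "2D 8x8 DCT": 0.983, "LPC computation": 0.980}


def overall_saving(loop_share, loop_reduction):
    """Fraction of total power saved when only the loop part is reduced.

    `loop_reduction` is the fraction of loop power removed (0.0 to 1.0).
    """
    return loop_share * loop_reduction


if __name__ == "__main__":
    for name, share in LOOP_SHARE.items():
        saving = overall_saving(share, loop_reduction=0.5)  # hypothetical 50% cut
        print(f"{name:16s} total saving = {saving:.1%}")
```

Because the loop share is so close to 100% for these kernels, almost all of any loop-level saving carries through to the whole program.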

Deep-Submicron Design Flows

• Rapid evaluation of complex designs for area and performance
• Timing convergence via estimated routing parasitics
• In-place timing repair without resynthesis
• Shorter design intervals, minimum iterations
• Block-level design and place and route
• Localized changes without disturbance
• Integration of complex projects and design reuse

SOC CAD Companies

• Avant! www.avanticorp.com
• Cadence www.cadence.com
• Duet Tech www.duettech.com
• Escalade www.escalade.com
• Logic visions www.logicvision.com
• Mentor Graphics www.mentor.com
• Palmchip www.palmchip.com
• Sonic www.sonicsinc.com
• Summit Design www.summit-design.com
• Synopsys www.synopsys.com
• Topdown design solutions www.topdown.com
• Xynetix Design Systems www.xynetix.com
• Zuken-Redac www.redac.co.uk