
Page 1: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Interconnect Planning, Synthesis, and Layout for Performance, Signal Reliability and Cost Optimization

SRC Task ID: 605.001

PI: Prof. Jason Cong (UCLA)

Students: Lei He, David Pan, Xin Yuan

Mentors: Dr. Prakash Arunachalam (Intel)

Dr. Norman Chang (HP)

Dr. Wilm Donath (IBM)

Dr. Stefan Rusu (Intel)

Page 2: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Project Overview

Objective: investigate an interconnect-centric design flow and methodology, consisting of:

Interconnect Planning

Interconnect Synthesis

Interconnect Layout

Page 3: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Key Issues in Interconnect Planning

Three levels of planning:

Interconnect architecture planning (pre-design)

Interconnect planning with RTL-floorplan

Interconnect planning with physical-level floorplan

Enabling tools: interconnect estimation models for interconnect synthesis/layout

Page 4: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Review: Accomplishments in Year 1

Efficient (constant-time) and accurate (90%) interconnect delay estimation models for 2-pin nets under different interconnect optimization algorithms [Cong-Pan, IWLS’98, SRC/TECHCON’98, ASPDAC’99]:

Optimal wire sizing (OWS)

Simultaneous driver and wire sizing (SDWS)

Simultaneous buffer insertion/sizing and wire sizing (BISWS)

Interconnect architecture planning [Cong-Pan, DAC’99]:

Proposed a unified wire-width planning framework

Obtained a surprising result: a pre-determined two-width design achieves close-to-optimal solutions over a large range of wire lengths

Can handle different objective functions

Page 5: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Accomplishments in Year 2

Efficient and accurate interconnect estimation models for multiple-pin nets [Cong-Pan, TAU’99]

Buffer block planning for interconnect-driven floorplanning [Cong-Kong-Pan, ICCAD’99]

Further study on interconnect architecture planning

Page 6: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Interconnect Estimation for Multiple-Pin Nets

Objective: quickly estimate delay/area under different interconnect optimizations (e.g., OWS, BISWS), at a rate of 100K to 1M nets per second

Different targets:

1. Minimize the delay to a single critical sink (SCS)

2. Minimize the maximum delay (defined as the tree delay) for multiple critical sinks (MCS)

[Figure: multiple-pin net. Labels: Input, G0, G (driver side); sinks S1, S2, S3, ..., Sn with loads Cs1, Cs2, Cs3, ..., Csn]

Page 7: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Estimation for Multiple-Pin Nets

Very difficult:

No closed-form wire shaping or buffer insertion

All available optimization algorithms are iterative

Multiple critical sinks may exist at the same time

Our approach:

Reduce the multiple-pin net estimation problem into one or several 2-pin net estimation problems, then use our previous (Year 1) results

Page 8: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Reduction for OWS of SCS

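[Figure: reduction of a multiple-pin net with critical sink Sk to a Single-Line-Multiple-Load (SLML) structure. Before: Input, G0, G driving sinks S1, S2, S3, ..., Sk with loads Cs1, Cs2, ..., Csk. After: a single line from G to Sk carrying loads C1, C2, ..., Csk]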

Page 9: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

OWS for SCS

Transform SLML to SLSL (i.e., 2-pin net)

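[Figure: SLML line. Labels: R_d, segment lengths l1, l2, ..., lk, W, internal loads C1, C2, ..., Ck-1, and load Ck at the critical sink Sk]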

Page 10: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

OWS for SCS

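Transform SLML to SLSL (i.e., 2-pin net)

[Figure: equivalent single-line net. Labels: R_d, l1, l2, ..., lk, W, Sk, C0, C1, C2, ..., Ck-1, CL]

[Equation: expressions for the lumped capacitances, involving C0, CL, the loads Cj (j = 1, ..., k), and the lengths li and l; the exact form did not survive in this transcript]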

Page 11: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Delay/Area Estimation for OWS/SCS

Closed-form delay estimation for the critical sink

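$$T_{ows}(R_d, l, C_L) = R_d C_0 + \left[ \frac{\alpha_1 l}{W^2(\alpha_2 l)} + \frac{2\alpha_1 l}{W(\alpha_2 l)} + R_d c_f + \sqrt{R_d r c_a c_f l} \right] \cdot l$$

where

$$\alpha_1 = \frac{1}{4} r c_a, \qquad \alpha_2 = \frac{1}{2} \sqrt{\frac{r c_a}{R_d C_L}},$$

W(x) is Lambert’s W function, defined as $w e^{w} = x$.

Closed-form area estimation for the critical path

$$A_{ows}(R_d, l, C_L) = \sqrt{\frac{r (c_f l + 2 C_L)}{2 R_d c_a}} \cdot l$$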
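
A minimal Python sketch of how the closed-form OWS estimates could be evaluated in constant time per net. It assumes the delay reads T_ows = R_d C_0 + [α1 l / W²(α2 l) + 2 α1 l / W(α2 l) + R_d c_f + sqrt(R_d r c_a c_f l)] l, with α1 = r c_a / 4 and α2 = (1/2) sqrt(r c_a / (R_d C_L)), and the area reads A_ows = sqrt(r (c_f l + 2 C_L) / (2 R_d c_a)) l. The interpretation of r as sheet resistance, c_a as unit area capacitance, and c_f as unit fringing capacitance is an assumption, as are the function names and the technology values; only R_d = 180 ohm and the 10 fF load echo the slides. Lambert's W is computed with a plain Newton iteration so the block has no external dependencies.

```python
import math


def lambert_w(x, tol=1e-12, max_iter=100):
    """Principal branch of Lambert's W for x >= 0: the w with w * exp(w) = x."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # start at or above the root, so Newton descends onto it
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))  # Newton step on w*e^w - x
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def ows_delay(R_d, l, C_L, r, c_a, c_f, C_0=0.0):
    """Delay estimate (seconds) under optimal wire sizing, as assumed above.

    Units assumed: R_d in ohm, l in um, C_L and C_0 in F,
    r in ohm/square, c_a in F/um^2, c_f in F/um.
    """
    alpha1 = 0.25 * r * c_a
    alpha2 = 0.5 * math.sqrt(r * c_a / (R_d * C_L))
    w = lambert_w(alpha2 * l)
    per_unit_length = (
        alpha1 * l / (w * w)
        + 2.0 * alpha1 * l / w
        + R_d * c_f
        + math.sqrt(R_d * r * c_a * c_f * l)
    )
    return R_d * C_0 + per_unit_length * l


def ows_area(R_d, l, C_L, r, c_a, c_f):
    """Wire area estimate (um^2) under optimal wire sizing, as assumed above."""
    return math.sqrt(r * (c_f * l + 2.0 * C_L) / (2.0 * R_d * c_a)) * l


# Illustrative (hypothetical) technology values, not taken from the slides.
R_SHEET = 0.075        # ohm per square
C_AREA = 0.06e-15      # F/um^2
C_FRINGE = 0.064e-15   # F/um

length_um = 10000.0
delay_s = ows_delay(180.0, length_um, 10e-15, R_SHEET, C_AREA, C_FRINGE)
area_um2 = ows_area(180.0, length_um, 10e-15, R_SHEET, C_AREA, C_FRINGE)
print(f"delay = {delay_s * 1e9:.3f} ns, "
      f"average width = {area_um2 / length_um:.3f} um")
```

Dividing the area by the length gives an average wire width, which is the quantity plotted as W_avg in the comparison with TRIO.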

Page 12: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Summary for Interconnect Estimation

Develop delay and area estimation models for multiple-pin nets with consideration of various interconnect optimizations

Consider different optimization objectives:

Single critical sink (SCS)

Multiple critical sinks (MCS)

Apply various optimization alternatives:

Optimal wire sizing (OWS)

Buffer insertion/sizing and wire sizing (BISWS)

Page 13: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Delay/Area Comparison with TRIO

Rd = 180 ohm, C1 = 100 fF, C2 = 10 fF

One internal load; l1 = 0.1 to 0.9 x l (l = 5, 10, or 20 mm). Maximum allowable wire width is 20x the minimum width; the wire is segmented every 10 um.

[Plot: delay (ns) vs. L_1 (um), model vs. TRIO for l = 5 mm, 10 mm, and 20 mm]

[Plot: average wire width W_avg (um) vs. L_1 (um), model vs. TRIO for l = 5 mm, 10 mm, and 20 mm]

Page 14: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Accomplishments in Year 2

Efficient and accurate interconnect estimation models for multiple-pin nets (Cong-Pan, TAU’99)

Buffer block planning for interconnect-driven floorplanning (Cong-Kong-Pan, ICCAD’99)

Further study on interconnect architecture planning

Page 15: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Motivation for BBP

For high-performance DSM designs, many buffers may be inserted to optimize or meet interconnect delay (e.g., up to 800,000 for 50nm tech. [Cong’97, SRC Work Paper])

Introducing so many buffers will significantly change a floorplan, so the buffers should be planned early to ensure timing/design convergence.

Need proper buffer block planning (BBP) to address:

buffer location constraints (e.g., hard IP blocks)

“dead area” utilization

regularity, for ease of layout and power/ground network sharing

Page 16: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Buffer Block Planning Problem

Given: initial floorplan, buffer capacity for each soft block, and performance constraint for each net;

Output: “optimal” location/dimension of buffer blocks such that the overall chip area and the number of buffer blocks are minimized.

[Figure: floorplan example. Legend: grey (soft) block (limited buffers); black (hard) block (no buffer allowed); buffer blocks; white space]

Page 17: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Feasible Regions for BI

The feasible region (FR) is the maximal region in which a buffer can be placed while still meeting the given delay constraint.

[Figure: a driver-to-load (CL) net with 1 buffer, and a driver-to-load (CL) net with k buffers]

Page 18: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Feasible Regions for BI

We obtain closed-form formulas for FRs

Important observation: even under a tight delay constraint, the FR for BI can still be quite large => the FR provides a lot of flexibility in planning buffer locations

[Plot: FR distance (um, 0 to 10000) vs. delta (0 to 0.6); series: 6000um, 7000um, 8000um, 9000um]

• FR distance under different delay budgets

• Delay budget is (1+delta) Topt (the best delay by optimal buffer insertion)
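
The closed-form FR formula itself is not reproduced in this transcript. As a hedged illustration of why such a formula can exist, the sketch below assumes an Elmore delay model for one buffer on a uniform 2-pin wire: the delay is then a quadratic in the buffer position x, so the set of positions meeting a budget of (1+delta) Topt is an interval obtained from the quadratic's roots. All names and parameter values are hypothetical (r and c are per-unit-length wire resistance and capacitance, R_b, C_b, T_b are buffer output resistance, input capacitance, and intrinsic delay), and Topt here means the best single-buffer delay.

```python
import math


def elmore_coeffs(R_d, R_b, C_b, T_b, C_L, r, c, l):
    """Coefficients (k2, k1, k0) of T(x) = k2*x^2 + k1*x + k0.

    T(x) is the Elmore delay of: driver R_d -> wire of length x -> buffer
    (C_b, T_b, R_b) -> wire of length l - x -> load C_L.
    """
    k2 = r * c
    k1 = (R_d - R_b) * c + r * (C_b - C_L) - r * c * l
    k0 = (R_d * C_b + T_b + R_b * (c * l + C_L)
          + 0.5 * r * c * l * l + r * l * C_L)
    return k2, k1, k0


def feasible_region(delta, R_d, R_b, C_b, T_b, C_L, r, c, l):
    """Return (x_min, x_max, t_opt) for a delay budget of (1 + delta) * t_opt."""
    k2, k1, k0 = elmore_coeffs(R_d, R_b, C_b, T_b, C_L, r, c, l)
    center = -k1 / (2.0 * k2)               # unconstrained delay-minimal position
    x_opt = min(max(center, 0.0), l)        # keep the buffer on the wire
    t_opt = (k2 * x_opt + k1) * x_opt + k0
    budget = (1.0 + delta) * t_opt
    # Solve k2*x^2 + k1*x + (k0 - budget) <= 0; clamp tiny negative round-off.
    disc = max(k1 * k1 - 4.0 * k2 * (k0 - budget), 0.0)
    half_width = math.sqrt(disc) / (2.0 * k2)
    return max(center - half_width, 0.0), min(center + half_width, l), t_opt


# Illustrative (hypothetical) values: ohm, F, s, ohm/um, F/um, um.
params = dict(R_d=180.0, R_b=180.0, C_b=23.4e-15, T_b=36.4e-12,
              C_L=23.4e-15, r=0.075, c=0.118e-15, l=8000.0)

for delta in (0.1, 0.2, 0.4):
    x_min, x_max, t_opt = feasible_region(delta, **params)
    print(f"delta = {delta:.1f}: FR = [{x_min:.0f}, {x_max:.0f}] um, "
          f"width = {x_max - x_min:.0f} um")
```

Because the delay is flat near its minimum, even a small delta opens up an interval that covers a large fraction of the wire, which matches the observation on this slide.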

Page 19: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Feasible Regions for BI

FR extended to two dimensions with obstacles

[Figure: 2-D FR between source and sink; the restricted (RES) line marks the delay-minimal BI positions]

Page 20: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Overview of BBP Algorithm

1. Build polar graphs for the given floorplan;

2. Build the tile data structure;

3. For each tile, compute its area slack;

4. Compute the FR(s) for each net;

5. While (there exists some buffer to be inserted) {

Pick_A_Tile: pick the tile that can accept the most buffers w/o area penalty; if no such tile exists, pick the one with the most BI demand;

Insert_Buffers into the chosen tile: insert all buffers whose FRs intersect the tile to create a BB w/o area penalty, or insert one buffer into the tile to expand its channel;

Update chip dimension, FRs, area slacks, etc.

}
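
The greedy loop above can be sketched in Python as follows. This is a simplified illustration, not the BBP implementation: geometry is abstracted away, each tile's area slack is reduced to a count of buffers it can absorb without area penalty, and each buffer's FR is reduced to the set of tiles it intersects. The function name, the data layout, and the omission of the chip-dimension and FR updates are all assumptions made for the example.

```python
def plan_buffer_blocks(free_slots, feasible_tiles):
    """Greedy tile selection in the spirit of the loop above (simplified).

    free_slots:     {tile_id: number of buffers that fit w/o area penalty}
    feasible_tiles: {buffer_id: set of tile_ids intersecting the buffer's FR}
    Returns (placement, expanded): buffer -> tile, and tile -> number of
    buffers that were inserted with an area penalty (channel expansion).
    """
    free = dict(free_slots)
    pending = {b: set(tiles) for b, tiles in feasible_tiles.items()}
    placement, expanded = {}, {}

    while pending:
        # BI demand: which pending buffers could go into each tile.
        demand = {}
        for buf, tiles in pending.items():
            for tile in tiles:
                demand.setdefault(tile, []).append(buf)
        if not demand:
            raise ValueError("some buffers have no feasible tile")

        def gain(tile):
            # Buffers insertable into this tile without area penalty.
            return min(free.get(tile, 0), len(demand[tile]))

        best = max(sorted(demand), key=gain)
        if gain(best) > 0:
            chosen = sorted(demand[best])[:gain(best)]
            free[best] -= len(chosen)
        else:
            # No penalty-free tile left: take the tile with most demand
            # and insert a single buffer, expanding that tile.
            best = max(sorted(demand), key=lambda t: len(demand[t]))
            chosen = sorted(demand[best])[:1]
            expanded[best] = expanded.get(best, 0) + 1

        for buf in chosen:
            placement[buf] = best
            del pending[buf]
        # A full implementation would update chip dimensions, FRs, and
        # area slacks here; this sketch keeps them fixed.

    return placement, expanded


# Hypothetical example: tile "A" has room for two buffers, tile "B" has none.
slots = {"A": 2, "B": 0}
frs = {"b1": {"A"}, "b2": {"A", "B"}, "b3": {"A", "B"}, "b4": {"B"}}
placement, expanded = plan_buffer_blocks(slots, frs)
print(placement, expanded)
```

Sorting the tile and buffer identifiers before choosing makes tie-breaking deterministic; the original algorithm may break ties differently.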

Page 21: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Experimental Setting

Two scenarios (for buffer insertion flexibility):

RES: restricted buffer insertion position(s), chosen so as to minimize delay

FR: feasible buffer region, chosen so as to meet the delay constraint

Two algorithms (for buffer clustering):

RDM: a buffer is randomly assigned to any feasible location

BBP: buffers are assigned with appropriate clustering

Test cases: 6 MCNC + 5 randomly generated circuits (0.18um tech)

Page 22: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

#nets that meet delay constraints

FR provides a lot more flexibility than RES (e.g., to avoid obstacles) during BI, thus can better meet delay constraints

[Bar chart: number of nets that meet delay constraints (axis 0 to 1200) for RDM/RES, RDM/FR, BBP/RES, and BBP/FR]

Page 23: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Area Increase (%) due to BI

BBP/FR can effectively cluster individual buffers together with marginal area increase (less than 2% in all above test cases), by high utilization of “dead areas”.

[Bar chart: area increase in % (axis 0.00 to 5.00) for RDM/RES, RDM/FR, BBP/RES, and BBP/FR]

Page 24: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Comparison of #BB

BBP reduces #BB relative to RDM by a factor of up to 3x; BBP/FR further reduces #BB relative to BBP/RES by up to 34%

[Bar chart: number of buffer blocks (axis 0 to 600) for RDM/RES, RDM/FR, BBP/RES, and BBP/FR]

Page 25: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Accomplishments in Year 2

Efficient and accurate interconnect estimation models for multiple-pin nets (Cong-Pan, TAU’99)

Buffer block planning for interconnect-driven floorplanning (Cong-Kong-Pan, ICCAD’99)

Further study on interconnect architecture planning

Our two-width planning is still valid for a certain range (2x) of driver size variation

Currently investigating a wider range of variations

Page 26: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Deliverables

Development of efficient and accurate interconnect performance estimation models for interconnect-driven synthesis and planning (Completed - 30-Jun-1999)

Development of interconnect architecture planning framework (Completed - 30-Jun-1999)

Development of efficient algorithms for integrated interconnect planning & floorplanning capabilities at the physical level (Completed - 30-Sep-1999)

Development & validation of accurate noise models to guide the interconnect synthesis algorithm for signal reliability (Planned - 31-Dec-1999)

Development of optimal or near-optimal interconnect synthesis algorithm for multiple spatially or temporally related signal nets for performance & signal reliability optimization (Planned - 31-Dec-1999)

Development of efficient algorithms for integrated interconnect planning & floorplanning capabilities at the RTL-level; Software (Planned - 31-Dec-2000)

Page 27: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Technology Transfer

TRIO (Tree-Repeater-Interconnect-Optimization) package

Integrated into Intel design technology

Available on the web: http://cadlab.cs.ucla.edu/~trio

IDEM (Interconnect Delay Estimation Model) package

Prototype provided to Intel

Package will be available this week to all SRC member companies: http://cadlab.cs.ucla.edu/~trio

BBP (Buffer Block Planning) for physical level floorplanning

Interest from Intel and HP

Page 28: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Summary and Future Work

Efficient and accurate interconnect estimation models

Interconnect architecture planning

Buffer Block Planning

Future Work:

Noise estimation and planning

RTL interconnect planning

Page 29: PI:  Prof. Jason Cong (UCLA) Students: Lei He, David Pan, Xin Yuan

Milestones

Development of a computational model for interconnect architecture planning based on a given design characterization (specified in terms of target clock rate, interconnect distribution, depths of logic network, etc.) (31-Dec-1998)

Development of estimation models for interconnect layout optimizations suitable for pre-layout synthesis and planning (31-Dec-1998)

Development of efficient algorithms for integrated interconnect planning and floorplanning capabilities at the RTL-level (31-Dec-1999)

Completion of the ongoing effort on the development of a multi-layer general-area gridless routing system (31-Dec-1999)

Development of optimal or near-optimal interconnect synthesis algorithm for multiple spatially or temporally related signal nets for performance and signal reliability optimization (31-Dec-1999)

Development and validation of very efficient but accurate noise models to relate the noise with the physical parameters to guide the interconnect synthesis algorithm for signal reliability optimization (31-Dec-1999)

Development of efficient algorithms for integrated interconnect planning and floorplanning capabilities at the physical level (31-Dec-1999)

Development of efficient algorithms for integrated interconnect planning and floorplanning capabilities at the RT-level (31-Dec-2000)