Timing-Driven Variation-Aware Nonuniform Clock Mesh Synthesis

Timing-Driven Variation-Aware Nonuniform Clock Mesh Synthesis. Ameer Abdelhadi, Ran Ginosar, Avinoam Kolodny, and Eby G. Friedman*. Technion – Israel Institute of Technology, Department of Electrical Engineering. Israel, May 4, 2010. This research was supported in part by the Technion ACRC and by the ALPHA research consortium. * Also with Dept. of Electrical and Computer Engineering, University of Rochester.

Page 1: Timing-Driven Variation-Aware Nonuniform Clock Mesh Synthesis

Israel, May 4, 2010

Ameer Abdelhadi, Ran Ginosar, Avinoam Kolodny, and Eby G. Friedman*

Technion – Israel Institute of Technology, Department of Electrical Engineering

Electronics, Computers, Communications

This research was supported in part by the Technion ACRC and by the ALPHA research consortium

* Also with Dept. of Electrical and Computer Engineering, University of Rochester

Page 2: Non-tree Clock Topologies (1)

• Pros:
– Immune to process, voltage, and temperature variations
– Tolerate non-uniform switching
– Tolerate unbalanced distribution load
– Low skew, variations, and jitter
– Overcome late design changes

• Cons:
– Difficult to analyze, optimize, and automate
– Require significant resources
– Dissipate higher power, due to:
  • Large capacitance
  • Short-circuit and current loops
– Clock gating is impractical in most structures

Figures: a tree driving a 10.6 pF non-uniform load at 2 GHz; trees driving a grid with a 68 pF non-uniform load at 2 GHz.

Animations by Phillip J. Restle: P. J. Restle, "Technical visualizations in VLSI design: visualization," Proc. DAC, pp. 494-499, 2001.

Page 3: Non-tree Clock Topologies (2)

Figure panels, with the labels recoverable from each diagram:
• Pentium 4 spine [1]: driving tree, spine, local trees, sinks
• Tree with crosslinks [2]: clock tree, crosslink, sinks
• Leaf-level global mesh: pre-drivers, mesh, sinks
• Global mesh with local trees (MLT) [3]: driving tree, global mesh, local trees, sinks
• Global tree with local meshes (TLM) [3]: global tree, local meshes, sinks

[1] N. A. Kurd et al., "A Multigigahertz Clocking Scheme for the Pentium 4 Microprocessor," JSSC 36(11):1647-1653, 2001.
[2] A. Rajaram, J. Hu, and R. Mahapatra, "Reducing Clock Skew Variability Via Crosslinks," Proc. DAC, pp. 18-23, 2004.
[3] C. Yeh et al., "Clock Distribution Architectures: a Comparative Study," Proc. ISQED, pp. 27-29, 2006.

Page 4: Motivation

(Figure slide; the only recoverable label is "mesh".)

Page 5: Method

Page 6: Previous Work

Clock mesh design automation:
• Segment wire width sizing [1]
• Start from a clock tree and add crosslinks [2],[3]
• Start with a fully uniform mesh, remove redundant segments [4],[5]

Circuit timing is not exploited

[1] M. P. Desai, R. Cvijetic, and J. Jensen, "Sizing of Clock Distribution Networks for High Performance CPU Chips," Proc. DAC, pp. 389-394, 1996.
[2] A. Rajaram, J. Hu, and R. Mahapatra, "Reducing Clock Skew Variability Via Crosslinks," IEEE TCAD, 25(6):1176-1182, 2006.
[3] I. Vaisband, R. Ginosar, A. Kolodny, and E. G. Friedman, "Power Efficient Tree-Based Crosslinks for Skew Reduction," Proc. GLSVLSI, pp. 285-290, 2009.
[4] A. Rajaram et al., "MeshWorks: an Efficient Framework for Planning, Synthesis and Optimization of Clock Mesh Networks," Proc. ASP-DAC, pp. 250-257, 2008.
[5] G. Venkataraman, Z. Feng, J. Hu, and P. Li, "Combinatorial Algorithms for Fast Clock Mesh Optimization," Proc. ICCAD, pp. 563-567, 2006.

Page 7: Timing Constraint Graph

• Represents the circuit's connectivity
• Vertices represent clock sinks:
  $G_C^V = S = \{s_1, s_2, \ldots, s_n\}$
• Edges represent data paths:
  $G_C^E = \{e_{i,j} = v_i \sim v_j \mid P^{\mathrm{delay}}_{i,j} < \infty,\ v_i, v_j \in G_C^V\}$
• Edge weights are the maximum (permissible) skew constraints:
  $(\forall e_{i,j} \in G_C^E)\ w_{e_{i,j}} = \mathrm{skew}_{i,j}$
• Vertices also have attributes:
– Sink capacitance:
  $(\forall v_i \in G_C^V)\ C(v_i) = \mathrm{Capacitance}(s_i)$
– Location:
  $(\forall v_i \in G_C^V)\ \mathrm{bbox}(v_i) = [[x_0, y_0], [x_1, y_1]]$

Figure: floorplan and connectivity of nine sinks (S1 to S9) on a 3×3 grid with axes 0 to 2, and the corresponding graph (V1 to V9) with edge weights of 1, 2, and 3.
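The graph definition above can be captured in a small data structure. The following is a minimal sketch only: the class and method names (`ConstraintGraph`, `add_sink`, `add_path`) are hypothetical, edges are stored as unordered pairs, and keeping the tightest skew when several data paths join the same pair of sinks is an assumption rather than something stated on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ConstraintGraph:
    # name -> (capacitance, ((x0, y0), (x1, y1))); units are left to the caller
    sinks: dict = field(default_factory=dict)
    # frozenset({a, b}) -> maximum permissible skew between sinks a and b
    edges: dict = field(default_factory=dict)

    def add_sink(self, name, cap, bbox):
        self.sinks[name] = (cap, bbox)

    def add_path(self, a, b, skew):
        # Assumption: if several data paths join the same pair of sinks,
        # the tightest (smallest) permissible skew is the one that matters.
        key = frozenset((a, b))
        self.edges[key] = min(skew, self.edges.get(key, float("inf")))


g = ConstraintGraph()
g.add_sink("s1", 1.0, ((0, 0), (1, 1)))
g.add_sink("s2", 2.0, ((1, 0), (2, 1)))
g.add_path("s1", "s2", 3.0)
g.add_path("s2", "s1", 1.0)   # a tighter constraint on the same pair
```

Storing a single weight per sink pair matches the later phases, which work on an undirected version of the graph.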

Page 8: Timing-Driven Variation-Aware Problem Formulation

• Given:
– Circuit connectivity
– Static timing analysis
• Relative tolerance parameter ξ
– User defined
– Upper bound on the ratio of the maximum skew variation to the maximum skew constraint, over all skew constraints:
  $(\forall e_{i,j} \in G_C^E)\ \xi \geq \delta_{i,j}^{\max} / \mathrm{skew}_{i,j}$

Where:
• $\delta_{i,j}^{\max}$ is the maximum skew variation between registers $S_i$ and $S_j$
• $\mathrm{skew}_{i,j}$ is the maximum permissible skew between registers $S_i$ and $S_j$

Construct a clock mesh that satisfies the $\delta_{i,j}^{\max}$ requirement
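To make the tolerance constraint concrete, the sketch below turns ξ and the permissible skews into a per-edge bound on skew variation, δmax ≤ ξ · skew, and checks a set of variations against it. The function names, the edge keys, and the numeric values are hypothetical and chosen only for illustration.

```python
def variation_bounds(skew_constraints, xi):
    """Per-edge upper bound on skew variation: delta_max <= xi * skew."""
    return {edge: xi * skew for edge, skew in skew_constraints.items()}


def satisfies_tolerance(delta_max, skew_constraints, xi):
    """True if every edge meets xi >= delta_max / skew."""
    return all(delta_max[e] <= xi * skew_constraints[e]
               for e in skew_constraints)


# Hypothetical permissible skews (any consistent time unit).
skews = {("s1", "s2"): 40.0, ("s2", "s3"): 10.0}
bounds = variation_bounds(skews, xi=0.25)

ok = satisfies_tolerance({("s1", "s2"): 8.0, ("s2", "s3"): 2.0}, skews, 0.25)
bad = satisfies_tolerance({("s1", "s2"): 8.0, ("s2", "s3"): 3.0}, skews, 0.25)
```

Pairs with a tight permissible skew get a proportionally tight variation bound, which is what motivates a denser mesh in those regions.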

Page 9: Solution Stages

1. Derive the Timing Constraint Graph from Static Timing Analysis (STA)
2. Generate skew map
– Rectangular shapes
– [skew,cap,bbox] triples
3. Remove overlapping
– Polygon shapes
– [skew,cap,polygon] triples
4. Mesh generation
– Match a mesh to each polygon
– Mesh density satisfies skew variations

Figure: stages 1 to 4 illustrated on a floorplan and its timing constraint graph (TCG), with regions labeled Skew1, Skew2, and Skew3.

Page 10: Phase II: Generate Skew Map (1)

• Graph-theoretic algorithm
• Find regions with different skew requirements
• Inputs:
– Gc: Constraint graph
  • Extracted at phase I
  • Contains circuit information
– T: Thresholds vector
  • Contains skew thresholds
  • Determines the granularity of the skew regions
• Output:
– Skew map with rectangular shapes
– [skew,cap,bbox] triples
– Arranged in a stack
– Ascending order by skew

Page 11: Phase II: Generate Skew Map (2)

1. foreach t ∈ T
1.1 GC_undirected = getUndirected(GC)
1.2 foreach e ∈ GC^E
1.2.1 if w_e > t: GC_undirected = GC_undirected / e
1.3 UCC = getConnectedComponents(GC_undirected)
1.4 foreach ucc ∈ UCC
1.4.1 v_merge = mergeVertices(GC, ucc)
1.4.2 skew_ucc = min(w_e | e = v_i~v_j, v_i,v_j ∈ ucc)
1.4.3 bbox_ucc = bbox(v_merge)
1.4.4 cap_ucc = cap(v_merge)
1.4.5 push(skewBbox, [skew_ucc, cap_ucc, bbox_ucc])

Execution example with T=[1,2,3]. Rows are iterations and columns are steps: initial graph Gc; remove edges from GC_undirected (step 1.2); merge connected components into Gc (step 1.4); get [skew,cap,bbox] triples; [skew,cap,bbox] triples pushed into skewBbox.

skewBbox after each iteration (most recently pushed first):
t=1: [1,C1,[[0,0],[1,2]]]
t=2: [2,C2,[[1,1],[2,2]]], [1,C1,[[0,0],[1,2]]]
t=3: [3,C3,[[0,0],[2,2]]], [2,C2,[[1,1],[2,2]]], [1,C1,[[0,0],[1,2]]]
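The pseudocode above can be sketched in runnable form as follows. This is an illustrative variant under stated assumptions, not the authors' implementation: merged vertices are tracked with a union-find structure instead of an explicit `mergeVertices` call, the merged capacitance is taken as the sum of member capacitances, the merged bounding box is the box enclosing all members, and isolated sinks push no triple. The input layout (`sinks`, `edges`) and the four-sink example are hypothetical.

```python
def generate_skew_map(sinks, edges, thresholds):
    """sinks: {name: (cap, (x0, y0, x1, y1))}; edges: {(a, b): permissible skew}."""
    parent = {s: s for s in sinks}            # union-find over merged vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]     # path halving
            x = parent[x]
        return x

    skew_bbox = []                            # stack of [skew, cap, bbox] triples
    for t in sorted(thresholds):
        # Step 1.2: drop edges with w_e > t; edges inside an already merged
        # vertex no longer exist, so they are skipped as well.
        kept = [(a, b, w) for (a, b), w in edges.items()
                if w <= t and find(a) != find(b)]
        # Steps 1.3 and 1.4.1: connected components, merged in place.
        for a, b, _ in kept:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
        # Step 1.4.2: tightest skew among the edges of each new component.
        skew_of = {}
        for a, _, w in kept:
            root = find(a)
            skew_of[root] = min(w, skew_of.get(root, float("inf")))
        # Steps 1.4.3 to 1.4.5: bounding box, capacitance, push.
        for root, skew in sorted(skew_of.items(), key=lambda kv: kv[1]):
            members = [s for s in sinks if find(s) == root]
            cap = sum(sinks[s][0] for s in members)
            boxes = [sinks[s][1] for s in members]
            bbox = (min(b[0] for b in boxes), min(b[1] for b in boxes),
                    max(b[2] for b in boxes), max(b[3] for b in boxes))
            skew_bbox.append([skew, cap, bbox])
    return skew_bbox


# Hypothetical 2x2 floorplan of unit-square sinks with unit capacitance.
sinks = {"a": (1.0, (0, 0, 1, 1)), "b": (1.0, (1, 0, 2, 1)),
         "c": (1.0, (0, 1, 1, 2)), "d": (1.0, (1, 1, 2, 2))}
edges = {("a", "c"): 1, ("b", "d"): 2, ("a", "b"): 3, ("c", "d"): 3}
skew_map = generate_skew_map(sinks, edges, [1, 2, 3])
```

In this example the tightest constraint groups the left column first, the next threshold groups the right column, and the loosest threshold yields a region spanning the whole floorplan, so the triples come out in ascending order of skew.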

Page 12: Phase III: Remove Overlapping (1)

• Geometric algorithm
• Generate polygons from rectangles
• Input:
– [skew,cap,bbox] triples stack
  • Generated at phase II
• Output:
– [skew,cap,polygon] triples stack
  • Non-overlapping polygons

Page 13: Phase III: Remove Overlapping (2)

1. covered = Ø
2. while skewBbox ≠ Ø
2.1 [skew,cap,bbox] = pop(reversed(skewBbox))
2.2 polygon = covered^c ∩ bbox
2.3 covered = covered ∪ bbox
2.4 push(skewPolygon, [skew,cap,polygon])

Execution example, one entry per iteration (the columns of the original figure are listed as fields):

Iteration 1
– reversed(skewBbox): [1,C1,[[0,0],[1,2]]], [2,C2,[[1,1],[2,2]]], [3,C3,[[0,0],[2,2]]]
– skew: 1
– bbox: [[0,0],[1,2]]
– polygon: [[0,0],[0,2],[1,2],[1,0]]
– covered: [[0,0],[0,2],[1,2],[1,0]]
– skewPolygon: [1,C1,[[0,0],[0,2],[1,2],[1,0]]]

Iteration 2
– reversed(skewBbox): [2,C2,[[1,1],[2,2]]], [3,C3,[[0,0],[2,2]]]
– skew: 2
– bbox: [[1,1],[2,2]]
– polygon: [[1,1],[1,2],[2,2],[2,1]]
– covered: [[0,0],[0,2],[2,2],[2,1],[1,1],[1,0]]
– skewPolygon: [2,C2,[[1,1],[1,2],[2,2],[2,1]]], [1,C1,[[0,0],[0,2],[1,2],[1,0]]]

Iteration 3
– reversed(skewBbox): [3,C3,[[0,0],[2,2]]]
– skew: 3
– bbox: [[0,0],[2,2]]
– polygon: [[1,0],[1,1],[2,1],[2,0]]
– covered: [[0,0],[0,2],[2,2],[2,0]]
– skewPolygon: [3,C3,[[1,0],[1,1],[2,1],[2,0]]], [2,C2,[[1,1],[1,2],[2,2],[2,1]]], [1,C1,[[0,0],[0,2],[1,2],[1,0]]]
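The overlap-removal loop can be sketched without a geometry library by assuming integer grid coordinates and representing each region as a set of unit cells rather than a vertex list. That representation, the flat `(x0, y0, x1, y1)` box layout, and the function names are assumptions made for illustration; the input triples are the ones from the slide's execution example.

```python
def cells(bbox):
    """Unit cells covered by an integer bounding box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = bbox
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def remove_overlaps(skew_bbox):
    """Turn overlapping [skew, cap, bbox] triples into disjoint regions."""
    covered = set()                               # step 1
    skew_polygon = []
    # Step 2.1: visit the tightest skew first, so it keeps its full area.
    for skew, cap, bbox in sorted(skew_bbox, key=lambda triple: triple[0]):
        region = cells(bbox) - covered            # step 2.2: bbox minus covered
        covered |= cells(bbox)                    # step 2.3
        skew_polygon.append([skew, cap, region])  # step 2.4
    return skew_polygon


stack = [[3, "C3", (0, 0, 2, 2)], [2, "C2", (1, 1, 2, 2)], [1, "C1", (0, 0, 1, 2)]]
regions = remove_overlaps(stack)
```

Each cell ends up in exactly one region, the one with the tightest skew requirement that covers it. The skew-3 region reduces to the single cell at (1, 0), which corresponds to the polygon [[1,0],[1,1],[2,1],[2,0]] in the example.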

Page 14: Phase IV: Mesh Generation

• A mesh is matched to each polygon
• Mesh density is tuned to satisfy skew variations
• Drivers are optimized by solving a set-cover problem
• Global and local trees are designed by the internal EDA clock router

Figure: example of the proposed mesh, showing the mesh, sinks, and pre-drivers.
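The slide does not say how the set-cover problem is solved. As an illustration only, the sketch below uses the standard greedy approximation, repeatedly picking the candidate that covers the most still-uncovered elements. The modeling is an assumption: each hypothetical driver site is taken to cover a set of mesh nodes, and the names `d1` to `d4` and node numbers are invented for the example.

```python
def greedy_set_cover(universe, candidates):
    """candidates: {driver_site: set of mesh nodes that site can drive}."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the site that covers the most uncovered nodes
        # (ties broken by name, since the keys are visited in sorted order).
        best = max(sorted(candidates),
                   key=lambda site: len(candidates[site] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("some mesh nodes cannot be covered")
        chosen.append(best)
        uncovered -= gain
    return chosen


nodes = {1, 2, 3, 4, 5, 6}
sites = {"d1": {1, 2, 3}, "d2": {3, 4}, "d3": {4, 5, 6}, "d4": {1, 6}}
drivers = greedy_set_cover(nodes, sites)
```

The greedy choice is not guaranteed to be minimal in general, but it is simple and gives a logarithmic approximation bound, which is why it is a common starting point for covering problems of this kind.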

Page 15: Implementation

• Algorithms:
– Graph-theoretic
  • for timing constraints
– Geometric
  • for layout generation
• Quasi-linear, O(n log n), runtime
• Design environment:
– RTL-to-layout design flow
– Standard EDA tools
– Standard 65 nm library
– ISCAS89 benchmarks

Page 16: Results