Israel, May 4, 2010 1
Timing–Driven Variation–Aware Nonuniform Clock Mesh Synthesis
Ameer Abdelhadi, Ran Ginosar, Avinoam Kolodny, and Eby G. Friedman*
ElectronicsElectronicsComputersComputersCommunicatCommunicat
ionsions
This research was supported in part by the Technion ACRC and by the ALPHA research consortium
* Also with Dept. of Electrical and Computer Engineering, University of Rochester
Technion – Israel Institute of TechnologyDepartment of Electrical Engineering
Israel, May 4, 2010
Non-tree Clock Topologies (1)• Pros:
– Immune to process, voltage, and temperature variations.
– Tolerate non-uniform switching– Tolerate unbalanced distribution load– Low skew, variations, and jitter– Overcome late design changes
• Cons:– Difficult to analyze, optimize, and automate– Require significant resources– Dissipate higher power, due to:
• Large capacitance• Short-circuit and current loops
– Clock gating is impractical in most structures
2
Tree driving 10.6pF load Non-Uniform load at 2GHz
Trees driving a Grid with 68pF Non-Uniform load at 2GHz
Animations by Phillip J. Restle: P. J. Restle, "Technical visualizations in VLSI design: visualization," Proc. DAC, pp. 494-499, 2001.
Israel, May 4, 2010
Non-tree Clock Topologies (2)
3
Sinks
Spine
Driving tree
Local T
rees
Sinks
Clock tree
Crosslink
Sinks
Mesh
Pre-drivers
Sinks
Local M
eshesG
lobal tree
Pen
tiu
m 4
sp
ine
[1]
Tre
e w
ith
cro
sslin
ks [
2]
Lea
f le
vel g
lob
al m
esh
Sinks
Local Trees
Global mesh
Driving tree
Glo
bal
mes
h w
ith
lo
cal
tree
s (M
LT
) [3
]
Glo
bal
tre
e w
ith
lo
cal
mes
hes
(T
LM
) [3
][1] N. A. Kurd et al., “A Multigigahertz Clocking Scheme for the Pentium 4 Microprocessor,” JSSC 36(11):1647–1653, 2001.[2] A. Rajaram, J. Hu, and R. Mahapatra, “Reducing Clock Skew Variability Via Crosslinks,” Proc. DAC, pp. 18–23, 2004.[3] C. Yeh et al., “Clock Distribution Architectures: a Comparative Study,” Proc. ISQED, pp. 27-29, 2006.
Israel, May 4, 2010
Motivation
4
mesh
Israel, May 4, 2010
Method
5
Israel, May 4, 2010
Previous Work
Clock mesh design automation:• Segment wire width sizing [1]• Start from a clock tree and add crosslinks
[2],[3]• Start with a fully uniform mesh, remove
redundant segments [4],[5]Circuit timing is not exploited
6
[1] M. P. Desai, R. Cvijetic, and J. Jensen, “Sizing of Clock Distribution Networks for High Performance CPU Chips,” Proc. DAC, pp. 389-394, ‘96.[2] A. Rajaram, J. Hu, and R. Mahapatra, “Reducing Clock Skew Variability Via Crosslinks,” IEEE TCAD, 25(6): 1176-1182, ‘06.[3] I. Vaisband, R. Ginosar, A. Kolodny, and E. G. Friedman, “Power Efficient Tree-Based Crosslinks for Skew Reduction,” Proc. GLSVLSI, pp. 285-290, ‘09.[4] A. Rajaram et al., “MeshWorks: an Efficient Framework for Planning, Synthesis and Optimization of Clock Mesh Networks,” Proc. ASP-DAC, pp. 250-257, ‘08.[5] G. Venkataraman, Z. Feng, J. Hu, and P. Li “Combinational Algorithms for Fast Clock Mesh Optimization,” Proc. ICCAD, pp. 563-567, ‘06.
Israel, May 4, 2010
Timing Constraint Graph• Presents the circuit’s connectivity• Vertices represent clock sinks:
GCV=S={s1, s2, …, sn}
• Edges represent data paths:GC
E={ei,j=vi~vj|Pdelayij<∞,vi,vj∈GC
V}• Edges’ weights are maximum skew
constraint (permissible):(∀e∈GC
E) we= skewi,j
• Vertices also have attributes:– Sink capacitance:
(∀vi∈GCV) C(vi)=Capacitance(Si)
– Location:(∀vi∈GC
V) bbox(vi)=[[x0,y0],[x1,y1]]
7
0 1 2
S12
1
0
S2 S3
S4 S5 S6
S7 S8 S9
V1 V2 V3
V4 V5 V6
V7 V8
3 3
11
3
31
1
2
2
3 3
2
1
00 1 2
V9
Floorplan and connectivity
Corresponding graph
Israel, May 4, 2010
Timing–Driven Variation–Aware Problem Formulation
• Given:– Circuit connectivity– Static timing analysis
• Relative tolerance parameter ξ– User defined– Upper bound of the maximum skew variation ratio overall
maximum skew constraints:
(∀ei,j∈GCE) ξ≥δi,j
max/skewi,j
While:• δi,j
max is the maximum skew variation between register Si and Sj• skewi,j is the maximum permissible skew between register Si and Sj
Construct a clock mesh that satisfies the δi,jmax requirement
8
Israel, May 4, 2010
Solution Stages1. Derive Timing Constraint Graph for
Static Timing Analysis (STA)2. Generate skew map
– Rectangular shapes– [skew,cap,bbox] triples
3. Remove overlapping– Polygon shapes– [skew,cap,bbox] triples
4. Mesh Generation– Mach a mesh to each polygon– Mesh density satisfies skew variations
9
1 2 3 4
6
5
1 7 3
0
5
Skew1
Skew2
Skew3
Skew1
Skew2
Skew3
Skew1
Skew2
Skew3
1
2
3
4 F
l
o
o
r
p
l
a
nT
C G
Israel, May 4, 2010
Phase II: Generate Skew Map (1)• Graph theoretic algorithm• Find regions with deferent skew requirements• Inputs:
– Gc: Constraint Graph• Extracted at phase I• Contains circuit information
– T: Thresholds vector• Contains skew thresholds• Determine the granularity of the skew regions
• Output:– Skew map with rectangular shapes– [skew,cap,bbox] triples– Arranged in stack– Ascending order by skew
10
Israel, May 4, 2010
Phase II: Generate Skew Map (2)1. foreach tT1.1 GC
undirected=getUndirected(GC)
1.2 foreach eGCE
1.2.1 if we>tGC
undirected=GCundirected/e
1.3 UCC=getConnectedComponents(GC
undirected)
1.4 foreach uccUCC1.4.1 vmerge=mergeVertices(GC,ucc)
1.4.2 skewucc=min(we|e=vi~vj, vi,vjucc)
1.4.3 bboxucc=bbox(vmerge)
1.4.4 capucc=cap(vmerge)
1.4.5 push(skewBbox,skewucc,capucc,bboxucc])
11
1 2 3
4 5 6
7 8 9
3 3
11
3
31
1 2 3
4 5 6
7 8 9
1 2 3
4 5 6
7 8 9
11 1
T=[1,2,3] ; t=1
skew
=1
[1,C1,[[0,0],[1,2]]]
3
6
9
t=2
skew=2
[2,C2,[[1,1,[2,2]][1,C1,[[0,0],[1,2]]
3
6
9
t=3
[3,C3,[[0,0],[3,3]][2,C2,[[1,1],[2,2]][1,C1,[[0,0],[1,2]]
12
2
3 3
1
2 3
5 6
9
3 3
3
3
2
2
3 3
1'
2 3
5 6
9
3 3
3
3
2
2
3 3
1'
2 3
5 6
9
2
21'
2
9
3 3
3
33 3
1'
2'
2
9
3 3
3
33 3
1'
2'
2
9
3 3
3
33 3
1'
2'
3'
[ske
w,c
ap,b
box]
tr
iple
s pu
shed
in
to S
kew
BB
ox
Rem
ove
edge
s w
ith w
e<
t fro
m
GC
un
dire
cte
d (s
tep
1.2)
Mer
ge c
onne
cted
co
mpo
nent
into
Gc
(ste
p 1.
4)In
itial
Gra
ph G
c
Get
[s
kew
,cap
,bbo
x]
trip
les
skew
=1
1 2
4 5
7 8
skew
=1
1 2
4 5
7 8
skew=2skew=3
0 1 2
21
0
21
0
21
0
0 1 2 0 1 2
Ex
ec
uti
on
ex
am
ple
. R
ow
s a
re i
tera
tio
ns
an
d c
olu
mn
s a
re s
tep
s
Israel, May 4, 2010
Phase III: Remove Overlapping (1)
• Geometric algorithm
• Generate polygons from rectangles
• Input:– [skew,cap,bbox] triples stack
• Generated at phase II
• Output:– [skew,cap,polygon] triples stack
• Non-overlapping polygons
12
Israel, May 4, 2010
Phase III: Remove Overlapping (2)
1. covered=Ø2. while skewBbox≠Ø2.1 [skew,cap.,bbox]=
pop( reversed(
skewBbox))2.2 polygon=
coveredc∩bbox2.3 covered=
coveredbbox2.4 push(skewPolygon,
[skew,capac,polygon])
13
bbox
0 1 2
2
1
0 1 2
2
1
0 1 2
2
1
polygon
0 1 2
2
1
0 1 2
2
1
0 1 2
2
1
covered
0 1 2
2
1
0 1 2
2
1
0 1 2
2
1
[1,C1,[[0,0],[1,2]]]
[2,C2,[[1,1],[2,2]]]
[3,C3,[[0,0],[2,2]]]
reversed( skewBbox)
[2,C2,[[1,1],[2,2]]]
[3,C3,[[0,0],[2,2]]]
[3,C3,[[0,0],[2,2]]]
skew
[[0,0],[1,2]]
[[1,1],[2,2]]
[[0,0],[2,2]]
[ [0,0],[0,2], [1,2],[1,0] ]
[ [1,1],[1,2], [2,2],[2,1] ]
[ [1,0],[1,1], [2,1],[2,0] ]
[ [0,0],[0,2], [1,2],[1,0] ]
[[0,0],[0,2],[2,2][1,1],[1,0]]
[ [0,0],[0,2], [2,2],[2,0] ]
1
2
3
[skew,cap, polygon]
[3,C3,[[1,0],[1,1],[2,1],[2,0]]]
[2,C2,[[1,1],[1,2],[2,2],[2,1]]]
[1,C1,[[0,0],[0,2],[1,2],[1,0]]]
[2,C2,[[1,1],[1,2],[2,2],[2,1]]]
[1,C1,[[0,0],[0,2],[1,2],[1,0]]]
[1,C1,[[0,0],[0,2],[1,2],[1,0]]]
Execution example. Rows are iterations and columns are steps
Israel, May 4, 2010
Phase IV: Mesh Generation
• Mesh to each polygon
• Mesh density is tuned to satisfy skew variations
• Optimized drivers by solving set-covers problem
• Global and local trees are design by the internal EDA clock router
14
Mes
hS
inks
Pre
-driv
ers
Example of the proposed mesh
Israel, May 4, 2010
Implementation
• Algorithms:– Graph-theoretic
• for timing constraints– Geometric
• for layout generation
• Quasi-linear (nlogn) runtime
• Design Environment:– RTL to layout design flow– Standard EDA tools– Standard 65nm library– ISCAS89 benchmarks
15
Israel, May 4, 2010
Results
16