wire sizing and spacing - purdue universitychengkok/ee695k/lec9a.pdf · 1999. 4. 12. · prepared...
EE695K VLSI Interconnect
Prepared by CK 1
Wire Sizing and Spacing
Wiresizing Algorithms
• Wiresizing based on local refinement
– Optimal wiresizing under a distributed RC model [Cong-Leung-Zhou, DAC’93]
– Optimal wiresizing under the distributed Elmore delay model for critical sinks [Cong-Leung, ICCAD’93]
– Optimal wiresizing for multiple-source interconnects [Cong-He, ICCAD’95]
• Wiresizing using convex programming [Sapatnekar, DAC’94]
• Sensitivity-based [Menezes’95, Xue-Kuh-Yu’96]
• Dynamic programming [Lillis-Cheng-Lin’96]
• Continuous wire shaping [Fishburn-Schevon’96, Chen-Chen-Wong’96, Chen-Wong’97]
Discrete Wiresizing Optimization
• Given: A set of possible wire widths {W1, W2, …, Wr}
• Find: An optimal wire width assignment to
1. Minimize the sum of weighted delays, or
2. Minimize the maximum delay, or maximize the minimum slack at sinks
Elmore Delay Model for Interconnects
• Modeling of interconnect as an RC tree
– Each wire may be segmented into several edges
– Each edge E is modeled as a π-type or L-type circuit
– rE = unit-length res. × length(E) = r0 × length(E) / width(E), where r0 is the sheet resistance
– cE = unit-length cap. × length(E) = c0 × width(E) × length(E), where c0 is the unit-area capacitance
(For simplicity, we do not consider fringing capacitance here, but the extension is straightforward.)
• Use Elmore delay to guide optimization
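The model above can be sketched in a few lines. This is a minimal illustration, not the lecture's code: it assumes a chain of edges (a tree with a single sink), the π-type edge model, wire resistance inversely proportional to width, and no fringing capacitance. The names `Edge`, `elmore_delay_chain`, `r_driver` and `c_sink` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    length: float  # edge length
    width: float   # edge width, in multiples of the unit width


def edge_r(e, r0):
    # Wire resistance grows with length and shrinks with width.
    return r0 * e.length / e.width


def edge_c(e, c0):
    # Area capacitance is proportional to width * length.
    return c0 * e.width * e.length


def elmore_delay_chain(edges, r_driver, c_sink, r0, c0):
    """Elmore delay from the driver to the sink of a chain (source edge first).

    Each edge is a pi-type section: half of its capacitance sits at each end.
    """
    caps = [edge_c(e, c0) for e in edges]
    # The driver resistance charges every capacitance in the net.
    delay = r_driver * (sum(caps) + c_sink)
    for k, e in enumerate(edges):
        # Edge k charges half of its own cap plus everything downstream.
        downstream = caps[k] / 2 + sum(caps[k + 1:]) + c_sink
        delay += edge_r(e, r0) * downstream
    return delay


# Toy values (all parameters set to 1 for easy checking).
one_edge = elmore_delay_chain([Edge(2.0, 1.0)], 1.0, 1.0, 1.0, 1.0)
two_edges = elmore_delay_chain([Edge(1.0, 2.0), Edge(1.0, 1.0)], 1.0, 1.0, 1.0, 1.0)
```

Widening an edge lowers its own resistance term but raises the capacitance seen by the driver and by every upstream edge, which is the trade-off the wiresizing formulations optimize.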
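Wiresizing Formulation for Single Critical Sink
$$
T_i(W) = R_d c_0 \sum_{E \in T} l_E W_E
 + r_0 c_0 \sum_{E, E' \in T,\; E \neq E'} f_i(E, E')\, l_E\, l_{E'}\, \frac{W_{E'}}{W_E}
 + r_0 \sum_{E \in T} g_i(E)\, l_E\, \frac{1}{W_E}
$$
where
$$
f_i(E, E') = \begin{cases} 1 & \text{if } E \in P(N_0, N_i) \text{ and } E' \in Des(E) \\ 0 & \text{otherwise} \end{cases}
\qquad
g_i(E) = sink(T) \cap Des(E)
$$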
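As a sanity check on where the width-dependent terms come from, consider the simplest case: one wire of length $l$ and width $W$ between a driver $R_d$ and a single sink. The sink load $c_s$ is an assumption introduced here for illustration; with the π-type model and no fringing capacitance, the Elmore delay is

```latex
\begin{aligned}
T(W) &= R_d \left( c_0 W l + c_s \right)
      + \frac{r_0 l}{W} \left( \frac{c_0 W l}{2} + c_s \right) \\
     &= \underbrace{R_d c_s + \tfrac{1}{2} r_0 c_0 l^2}_{\text{independent of } W}
      + R_d c_0 l \, W
      + r_0 c_s l \, \frac{1}{W}
\end{aligned}
```

The term proportional to $W$ is driven by $R_d$ and the term proportional to $1/W$ is driven by $r_0$. Cross terms of the form $W_{E'}/W_E$ appear only once there are at least two edges, because they couple the resistance of an upstream edge with the capacitance of a downstream one.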
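Properties of Coefficient Functions
• $g_i(E_1) \ge g_i(E_1')$ if $E_1 \in Ans(E_1')$
• $f_i(E_1, E_2) \ge f_i(E_1', E_2)$ if $E_1 \in Ans(E_1')$
• $f_i(E_1, E_2) \ge f_i(E_1, E_2')$ if $E_2 \in Des(E_2')$
[Figure: an edge E with its ancestors Ans(E) and descendants Des(E)]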
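Wiresizing Formulation for Multiple Critical Sinks
$$
T(W) = \sum_{N_i \in sink(T)} \lambda_i \cdot T_i(W)
 = R_d c_0 \sum_{E \in T} l_E W_E
 + r_0 c_0 \sum_{E, E' \in T,\; E \neq E'} F(E, E')\, l_E\, l_{E'}\, \frac{W_{E'}}{W_E}
 + r_0 \sum_{E \in T} G(E)\, l_E\, \frac{1}{W_E}
$$
where
$$
F(E, E') = \sum_{N_i \in sink(T)} \lambda_i \cdot f_i(E, E'), \qquad
G(E) = \sum_{N_i \in sink(T)} \lambda_i \cdot g_i(E), \qquad
\sum_{N_i \in sink(T)} \lambda_i = 1
$$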
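Impact of Resistance Ratio
• Definition: $R_d / r_0$, i.e. driver resistance versus unit wire resistance
• Determined by the technology: reducing device dimensions ⇒ $R_d \downarrow$ and $r_0 \uparrow$ ⇒ $R_d / r_0 \downarrow$
• Impact on wiresizing optimization: the driver term of the delay shrinks while the two wire-resistance terms grow
$$
\underbrace{R_d c_0 \sum_{E \in T} l_E W_E}_{\downarrow}
 + \underbrace{r_0 c_0 \sum_{E, E' \in T,\; E \neq E'} F(E, E')\, l_E\, l_{E'}\, \frac{W_{E'}}{W_E}}_{\uparrow}
 + \underbrace{r_0 \sum_{E \in T} G(E)\, l_E\, \frac{1}{W_E}}_{\uparrow}
$$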
Properties of Optimal Wiresizing Solutions
• Monotone Property
• Separability
• Dominance Property
Monotone Property
• Wire width from the source to any sink is non-increasing
[Figure: routing tree with the source marked]
Separability
• Given the wire width assignment of Ans(Si) and Si, each subtree rooted at Si can be optimally sized independently
[Figure: segment Si with Ans(Si) toward N0 and Des(Si) made up of the subtrees Tss(Si, 1), Tss(Si, 2), Tss(Si, 3)]
Opt. Wiresizing Algorithm (OWSA)
• Basic approach:
– Dynamic programming
– Recursively applying the algorithm to each subtree independently
• Complexity:
– OWSA: O(|S|^r); brute force: O(r^|S|), where r = # possible widths and S = set of segments
[Figure: segment Si with Ans(Si) toward N0 and Des(Si) made up of the subtrees Tss(Si, 1), Tss(Si, 2), Tss(Si, 3)]
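Local Refinement
• Local refinement of W on segment E
Given: wire width assignment W
Compute: the optimal wire width of E, assuming all other wire widths are fixed as in W
• With the other widths fixed, the delay has the form
$$ T(W) = A + B \cdot W_E + C \cdot \frac{1}{W_E}, \qquad \text{s.t. } W_1 \le W_E \le W_r $$
which is minimized when $B \cdot W_E = C \cdot \frac{1}{W_E}$
• L.R. can be applied repeatedly for every edge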
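A minimal sketch of one local refinement step for a discrete width set, assuming B > 0 and C > 0. Because A + B·w + C/w is convex in w with its continuous minimum at w = sqrt(C/B), only the allowed widths adjacent to that point need to be compared. The function name and argument layout are illustrative, not taken from the cited papers.

```python
import math


def local_refine(b, c, widths):
    """Return the allowed width that minimizes A + b*w + c/w.

    A is omitted because it does not affect the choice. Assumes b > 0,
    c > 0 and `widths` sorted in ascending order.
    """
    w_star = math.sqrt(c / b)  # continuous optimum: b*w == c/w
    below = [w for w in widths if w <= w_star]
    above = [w for w in widths if w >= w_star]
    candidates = []
    if below:
        candidates.append(below[-1])  # largest allowed width <= w_star
    if above:
        candidates.append(above[0])   # smallest allowed width >= w_star
    return min(candidates, key=lambda w: b * w + c / w)


choice = local_refine(1.0, 4.0, [1, 2, 3, 4])  # continuous optimum is exactly 2
```

When sqrt(C/B) falls outside the interval [W1, Wr], the candidate list holds only the nearest bound, which matches the constraint W1 ≤ WE ≤ Wr.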
Dominance Relation
• Dominance relation: for all Ej, W(Ej) ≥ W'(Ej) ⇒ W dominates W'
Dominance Property and Greedy Algorithm
• Theorem: If solution W dominates the optimal solution W*, and W’ = local refinement of W, then W’ dominates W*
• Theorem: If solution W is dominated by the optimal solution W*, and W’ = local refinement of W, then W’ is dominated by W*
• Greedy Wiresizing Algorithm:
– repeated application of local refinement until no further improvement
Wiresizing Based on Dominance Property
• Lower bound computation
– W0 = min-width solution (dominated by the opt. sol.)
– W1 = Local-Refinement(W0), which dominates W0
– W2 = Local-Refinement(W1), which dominates W1
• Wi dominated by opt. sol. ⇒ a tight lower bound
• Upper bound can be computed similarly
• Extension to multi-source nets [Cong-He’95]
• Also speedup by bundled refinement
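The bound computation can be sketched on a small example. The code below is an illustration under stated assumptions, not the published algorithm: it uses a three-edge chain with one sink, the π-type Elmore model without fringing capacitance, and the MCM parameters from the technology table in these slides; the edge lengths, the width set and all function names are made up.

```python
from itertools import product


def chain_delay(widths, lengths, r_d, c_s, r0, c0):
    """Elmore delay of a chain (source edge first) with one sink load c_s."""
    caps = [c0 * w * l for w, l in zip(widths, lengths)]
    delay = r_d * (sum(caps) + c_s)
    for k, (w, l) in enumerate(zip(widths, lengths)):
        delay += (r0 * l / w) * (caps[k] / 2 + sum(caps[k + 1:]) + c_s)
    return delay


def local_refinement(widths, k, choices, cost):
    """Best allowed width for edge k with all other widths held fixed."""
    return min(choices, key=lambda w: cost(widths[:k] + [w] + widths[k + 1:]))


def greedy_wiresizing(start, choices, cost):
    """Apply local refinement edge by edge until no edge improves."""
    widths = list(start)
    improved = True
    while improved:
        improved = False
        for k in range(len(widths)):
            best = local_refinement(widths, k, choices, cost)
            trial = widths[:k] + [best] + widths[k + 1:]
            if cost(trial) < cost(widths):  # accept strict improvements only
                widths = trial
                improved = True
    return widths


lengths = [500.0, 500.0, 500.0]  # microns, hypothetical
choices = [1.0, 2.0, 3.0, 4.0]   # allowed widths, hypothetical


def cost(w):
    return chain_delay(w, lengths, r_d=25.0, c_s=1000.0, r0=0.008, c0=0.060)


lower = greedy_wiresizing([min(choices)] * 3, choices, cost)  # start from min widths
upper = greedy_wiresizing([max(choices)] * 3, choices, cost)  # start from max widths
optimum = min((list(w) for w in product(choices, repeat=3)), key=cost)
```

By the dominance property, `lower` should be dominated by the optimum and `upper` should dominate it. The brute-force `optimum` enumerates all r^|S| assignments and is only practical for tiny instances like this one.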
2-Step Dominance-Property-Based Algorithm
• Step 1: Bound Computation
– TIGHT upper and lower bounds of the optimal solution computed based on the Dominance Property [Cong-Leung, ICCAD’93]
– 100x faster algorithm for non-uniform wiresizing proposed based on the Bundled Refinement Property [Cong-He, ICCAD’95]
• Step 2: Optimal Sizing Within Bounds
– top-down dynamic programming [Cong-Leung, ICCAD’93]
– bottom-up dynamic programming [Lillis-Cheng-Lin, ICCAD’95]
• Maximum delay can be minimized as well
– iterative weighted-delay minimization based on Lagrangian Relaxation [Chen-Chang-Wong, DAC’96]
Experimental Result: Technology Parameters [Cong-Leung’93]

Technology:                   Integrated Circuits (ICs)   Multi-Chip Modules (MCM)
Driver Resistance (Rd):       156 ohm                     25 ohm
Unit Wire Resistance (r0):    0.112 ohm/micron            0.008 ohm/micron
Loading Capacitance (cs):     1 fF                        1000 fF
Unit Wire Capacitance (c0):   0.039 fF/micron             0.060 fF/micron
Total Area:                   5 mm x 5 mm                 100 mm x 100 mm
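The resistance ratio Rd/r0 discussed earlier can be read off these parameters directly. A small sketch; the dictionary keys are illustrative names, and the values are copied from the technology parameters listed here.

```python
# Driver resistance and unit wire resistance for the two technologies.
technologies = {
    "IC": {"r_d_ohm": 156.0, "r0_ohm_per_um": 0.112},
    "MCM": {"r_d_ohm": 25.0, "r0_ohm_per_um": 0.008},
}


def resistance_ratio(tech):
    """Driver resistance divided by unit wire resistance (result in microns)."""
    return tech["r_d_ohm"] / tech["r0_ohm_per_um"]


ratios = {name: resistance_ratio(t) for name, t in technologies.items()}
```

The ratio has units of length: it is the length of unit-width wire whose resistance equals the driver resistance, roughly 1393 microns for the IC parameters and 3125 microns for the MCM parameters.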
Experimental Result: Delay Reduction by Optimal Wiresizing

IC       Delay (ns)                                                          Normalized Wiring Area
#sinks   MIN      MAX                EX-OWSA           FOWSA              EX-OWSA   FOWSA
4        0.238    0.497 (+109.01%)   0.224 (-5.88%)    0.220 (-7.42%)     1.2745    1.2422
8        0.327    0.706 (+116.00%)   0.300 (-8.05%)    0.288 (-12.01%)    1.3599    1.2719

MCM      Delay (ns)                                                          Normalized Wiring Area
#sinks   MIN      MAX                EX-OWSA           FOWSA              EX-OWSA   FOWSA
4        7.906    7.259 (-8.18%)     4.777 (-39.57%)   4.391 (-44.50%)    2.3677    1.8565
8        13.899   11.860 (-14.67%)   7.671 (-44.82%)   6.750 (-51.44%)    2.3762    1.7214

• MIN/MAX: Wire width assignment using the MIN/MAX width for each segment
• EX-OWSA: Optimal wiresizing algorithm using an upper bound delay model
• FOWSA: Fast optimal wiresizing algorithm under the distributed Elmore delay model
Experimental Result: Wiresizing for Multiple Critical Sinks
[Figure not recovered]
Effect of Resistance Ratio on Interconnect Topology and Wiresizing Optimization
[Figure not recovered]
Extensions
• Still optimal when fringing capacitance is considered
• Wiresizing optimization with non-uniform r0 and c0
– restricted monotone property holds
– separability and dominance property still hold ⇒ polynomial-time optimal algorithm
• Combined Area & Delay Minimization: still optimal under the objective function a*area + b*delay
Multi-Source Wire Sizing (MSWS) [Cong-He, ICCAD’95] [Cong-He, TODAES’96]
• Given: A multi-source interconnect tree (MSIT) with multiple sources, each driving the MSIT at a different time.
• Find: A discrete wire width assignment that minimizes a linear combination of the delays between multiple critical source-sink pairs.
[Figure: an MSIT whose terminals are marked as a source, a sink, or both source and sink]
Difficulty of MSWS
• Single-source wire sizing (SSWS) has a fixed signal direction
– [Cong-Leung, ICCAD’93] proved separability, the monotone property, and the dominance property
• Signal direction in an MSIT is not fixed
– Theory for single-source wire sizing no longer holds
[Figure: example annotated with delays 0.345 ns, 0.267 ns, 0.347 ns]
Decomposition of MSIT
• An MSIT is decomposed into
– a source-subtree (SST) of changeable signal direction
– a number of loading-subtrees (LSTs) of fixed signal direction
• An LST is like the SSWS case
– LST separability
– LST monotone property
– SSWS algorithm OWSA [Cong-Leung, ICCAD’93] can be used
[Figure: an MSIT partitioned into its SST and LSTs]
Local Monotone Property for SST
• Theorem: the wire sizing solution is monotone within each segment, and the monotone direction is determined before the sizing procedure
[Figure: segments whose widths decrease rightward, stay uniform, or increase rightward]
Dominance Relation and Local Refinement (previously defined for SSWS in [Cong-Leung, ICCAD’93])
• Dominance relation: for all Ej, w(Ej) ≥ w'(Ej) ⇒ W dominates W'
• Local refinement of E: given a wire width assignment W, we minimize our objective function by changing only the wire width of E, assuming all other wire widths are fixed as in W
[Figure: wire sizing solutions W and W’]
Bundled Refinement
• Given the monotone direction
– both El and Er have MinLength
• Bundled refinement does the following:
– for W ≥ W*, treat the local refinement for El as an upper bound for El, …, Er
– for W ≤ W*, treat the local refinement for Er as a lower bound for El, …, Er
• Theorem (Bundled Refinement Property):
– If assignment W dominates the optimal assignment W* and W’ = bundled refinement of W, then W’ dominates W*
– If W is dominated by W*, then W’ is dominated by W*
[Figure: a segment running from El to Er]
Bundled Wire Sizing Algorithm (BWSA)
1. Uni-segment := segment
2. Iterative bundled refinement for both lower and upper bounds
3. No divergent uni-segments? If yes, output the opt. solution; if no, go to step 4
4. MinLength for all divergent uni-segments? If yes, output TIGHT bounds; if no, apply binary refinement of the divergent uni-segments and return to step 2
Delay Reduction by Optimal Wiresizing [Cong-He, ICCAD’95, TODAES’96]
– Technology: 0.5 um CMOS
– nets: multi-source nets from an Intel microprocessor
– opt: optimal wiresizing solution
– min: minimum wire width solution
– total runtime: 5.3 seconds for wiresizing, 712.4 seconds for SPICE verification
Analysis and Experimental Results
• BWSA achieves bounds as TIGHT as GWSA under the finest segment division
• BWSA has the same worst-case complexity as GWSA, but runs 100x faster in practice
• It takes 5.3 seconds to optimize these nets, but 712.4 seconds to run HSPICE on these nets

Runtime (s)      net1   net2   net3     net4    net5    net6
GWSA-based       0.07   8.18   172.37   15.67   38.10   227.92
BWSA-based       0.07   0.15   0.37     0.37    0.97    3.37
Speedup factor   1      54.5   465.8    42.3    39.3    67.6

GWSA-based = GWSA + bounded enumeration for SST + OWSA for LSTs
BWSA-based = BWSA + bounded enumeration for SST + OWSA for LSTs
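The speedup row and the quoted total runtime follow from the per-net runtimes by simple arithmetic. A short check, using only the numbers listed above (the variable names are illustrative):

```python
nets = ["net1", "net2", "net3", "net4", "net5", "net6"]
gwsa = [0.07, 8.18, 172.37, 15.67, 38.10, 227.92]  # GWSA-based runtime, seconds
bwsa = [0.07, 0.15, 0.37, 0.37, 0.97, 3.37]        # BWSA-based runtime, seconds

# Per-net speedup of the BWSA-based flow over the GWSA-based flow.
speedup = {n: g / b for n, g, b in zip(nets, gwsa, bwsa)}

total_gwsa = sum(gwsa)
total_bwsa = sum(bwsa)  # matches the 5.3 seconds quoted for wiresizing
overall_speedup = total_gwsa / total_bwsa
```

The BWSA-based runtimes sum to 5.30 seconds, which agrees with the total wiresizing runtime quoted on the slide. Summed over the six nets, the ratio of the two totals is a little under 90.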
Wiresizing for Maximum Delay Minimization [Sapatnekar’94]
• A sensitivity-based method
– try a small perturbation on every edge
– pick the one with the largest gain
– repeat this strategy until there is no improvement or the timing specifications are reached
• Example [Figure: driver with R=1; edges e1, e2, e3; nodes 1 and 2; C1=52, C2=38]
– By separate enumeration, the delay at node 1 is minimized when w1=10, w2=1, w3=7; the delay at node 2 is minimized when w1=10, w2=6, w3=1.
– However, the maximum of the two delays is minimized when w1=10, w2=4, w3=5
• Separability does not hold anymore
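The perturb-and-pick loop described above can be sketched generically. This is an illustration under assumptions, not the published method: a perturbation is taken to be widening one edge by one step in a discrete width list, `objective` is any callable (for example the maximum Elmore delay over the sinks), and the toy objective in the usage lines is a made-up separable sum chosen so the result is easy to check by hand.

```python
def sensitivity_sizing(start_idx, choices, objective):
    """Repeatedly apply the single one-step widening with the largest gain.

    start_idx : one index into `choices` per edge
    choices   : allowed widths in ascending order
    objective : maps a list of widths to the value being minimized
    """
    idx = list(start_idx)

    def value(ix):
        return objective([choices[i] for i in ix])

    current = value(idx)
    while True:
        best_gain, best_edge = 0.0, None
        for e in range(len(idx)):
            if idx[e] + 1 >= len(choices):
                continue  # edge e is already at the maximum width
            trial = idx[:e] + [idx[e] + 1] + idx[e + 1:]
            gain = current - value(trial)
            if gain > best_gain:
                best_gain, best_edge = gain, e
        if best_edge is None:
            break  # no single perturbation improves the objective
        idx[best_edge] += 1
        current = value(idx)
    return [choices[i] for i in idx]


# Toy objective: sum over edges of b*w + c/w, with made-up (b, c) pairs.
terms = [(1.0, 4.0), (1.0, 9.0)]


def toy(widths):
    return sum(b * w + c / w for (b, c), w in zip(terms, widths))


sized = sensitivity_sizing([0, 0], [1.0, 2.0, 3.0, 4.0], toy)
```

A timing-specification stop condition could be added by breaking out of the loop once `current` drops below a target value.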
Wiresizing Using Convex Programming [Sapatnekar’94]
• Minimize the total area subject to sink (Elmore) delay constraints and width bound constraints
• Convex programming formulation
• Solved using TILOS-like approach
Sensitivity-Based Wiresizing [Menezes et al’95]
• To achieve target delays and slopes for an RC tree
• Target moments are computed for the target delays and slopes
• The sensitivity of the real moments is computed w.r.t. wire widths
• The Levenberg-Marquardt method is used
– to minimize the mean square error between the values of the target and real moments
Sensitivity-Based Wiresizing [Xue-Kuh-Yu’96]
• Minimize the maximum delay for a transmission line tree
• Compute the delay sensitivity ∂tu/∂wl in two steps:
– Compute the delay sensitivity w.r.t. moments, ∂tu/∂mi, by solving 2q linear equations
– Compute the moment sensitivity w.r.t. wire width, ∂mi/∂wl
$$ \frac{\partial t_u}{\partial w_l} = \sum_{i=0}^{2q-1} \frac{\partial t_u}{\partial m_i} \times \frac{\partial m_i}{\partial w_l} $$
• Compute moments and sensitivities analytically by iterative tree traversal
• Perform wiresizing in an iterative manner:
– Compute the maximum delay tu and the sensitivity vector w.r.t. the existing wiresizing solution
– Choose a sizable wire with maximum sensitivity for sizing
Wiresizing by Dynamic Programming [Lillis-Cheng-Lin’95]
• Similar to buffer insertion by dynamic programming [van Ginneken’90]
– Bottom-up computation of options
– Top-down selection of optimal option
• Consider buffer insertion together with wiresizing to:
– Minimize maximum sink delay, or
– Minimize power (or layout area) while meeting the target delay for each sink
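The bottom-up option computation can be illustrated with a small sketch. It assumes, in the style of van Ginneken's buffer-insertion DP, that an option is a pair (downstream capacitance, required time q) and that a wire is a π-type Elmore section; it covers only wire widths (no buffers) and does not record which width produced each option, which the top-down selection pass would need. All names are hypothetical.

```python
def prune(options):
    """Keep only non-dominated (cap, q) options.

    An option is dominated if another option has no more capacitance and
    no less required time q (a larger q is better).
    """
    kept = []
    best_q = float("-inf")
    # Sort by capacitance; among equal capacitances, best q first.
    for cap, q in sorted(options, key=lambda o: (o[0], -o[1])):
        if q > best_q:
            kept.append((cap, q))
            best_q = q
    return kept


def add_wire(options, length, widths, r0, c0):
    """Bottom-up step: extend every option through a wire of each allowed width."""
    new = []
    for cap, q in options:
        for w in widths:
            wire_c = c0 * w * length
            wire_r = r0 * length / w
            # The wire adds load and consumes wire_r * (wire_c / 2 + cap) of slack.
            new.append((cap + wire_c, q - wire_r * (wire_c / 2 + cap)))
    return prune(new)


pruned = prune([(1, 5), (2, 4), (2, 6), (3, 6), (0.5, 1)])
extended = add_wire([(1.0, 0.0)], length=1.0, widths=[1.0, 2.0], r0=1.0, c0=1.0)
```

Pruning keeps the option list short: after sorting by capacitance, an option survives only if its required time beats every lighter option, so no kept option is both heavier and slower than another.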
Continuous and Non-Uniform Wiresizing Optimization
• Given: a wire segment, driver resistance and loading cap.
Find: f(x), the optimal wire width at position x, to minimize delay
• Without fringing cap. [Fishburn-Schevon, TCAS’95] [Chen-Chen-Wong, DAC’96]
• With fringing cap. [Chen-Wong, DAC’97]
[Equation: closed form of f(x), written in terms of W(·), Cf, C0 and a·e^(−bx); not recoverable from the extracted text]
where
$$ W(x) = \sum_{n=1}^{\infty} \frac{(-n)^{n-1}}{n!} x^n $$
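The power series with coefficients (−n)^(n−1)/n! is the Taylor series of the Lambert W function, which satisfies W(x)·e^W(x) = x. A minimal numerical check of that identity, assuming the series is truncated after a fixed number of terms (the function name and the default of 60 terms are illustrative choices):

```python
import math


def lambert_w_series(x, terms=60):
    """Partial sum of W(x) = sum_{n>=1} (-n)^(n-1) / n! * x^n.

    The series converges only for |x| < 1/e.
    """
    if abs(x) >= 1 / math.e:
        raise ValueError("series converges only for |x| < 1/e")
    total = 0.0
    for n in range(1, terms + 1):
        total += (-n) ** (n - 1) / math.factorial(n) * x ** n
    return total


w = lambert_w_series(0.2)
residual = w * math.exp(w) - 0.2  # should be close to zero
```

Convergence slows as |x| approaches 1/e, so more terms (or a dedicated library routine) would be needed near that limit.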
Global Interconnect Sizing and Spacing (GISS) w/ Coupling Capacitance [Cong-He-Koh-Pan’97]
• Given:
– Initial layout of multiple nets
– Set of critical sinks and their criticalities
– Capacitance models and design rules
• Output:
– Sizing and spacing of each net for performance optimization
Symmetric and Asymmetric Wire Sizing
[Figure: wire segments E1 and E2 with center-lines next to a neighbor, shown under symmetric wire sizing and under asymmetric wire sizing]
2-D Capacitance Model
[Figure: capacitance components for net i, labeled Ca, Cf, Cx and Cef]
• Table-based model [Cong et al., DAC’97]
– Consider: Ca (area), Cf (fringing) and Cx (coupling)
– Pre-compute a set of capacitance values using a 3-D solver
– Use table-look-up method
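Effective-Fringing Property
Larger Cef leads to larger optimal wire-sizing, i.e.,
$$ \vec{C}_{ef} \ge \vec{C}_{ef}' \;\Rightarrow\; \overrightarrow{OWS}(C_a, \vec{C}_{ef}) \ge \overrightarrow{OWS}(C_a, \vec{C}_{ef}') $$
[Figure: a net driven by Rd with loads C1 and C2 and effective fringing capacitance Cef on its segments]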
• Key to GISS: reduce GISS to a sequence of single-net optimal wire-sizing problems
Bound Computation for Optimal GISS Solution
• Start with a min width solution
• Get effective fringing capacitance of each segment
• Compute optimal wiresizing solution for each net
• Re-compute effective fringing capacitance
Lower bound of opt. width of all nets!
(Due to the effective fringing property)
Experimental Results (Multiple Nets)

Center spacing   Average Delays (ns)
                 MIN    OWS           GISS/FAF      GISS/VAF
2 x pitch        1.51   1.26 (-17%)   0.81 (-46%)   0.80 (-47%)
3 x pitch        1.33   0.73 (-45%)   0.57 (-57%)   0.52 (-61%)
4 x pitch        1.28   0.46 (-64%)   0.46 (-64%)   0.42 (-67%)
5 x pitch        1.25   0.38 (-70%)   0.39 (-69%)   0.37 (-71%)
6 x pitch        1.23   0.35 (-71%)   0.36 (-71%)   0.34 (-72%)
• GISS/FAF: GISS with fixed ca and cf
• GISS/VAF: GISS with variable ca and cf
• 16-bit 10 mm bus structure, equally spaced, with 5 different center spacings from 2x to 6x min. pitch
• min. pitch = min. width + min. spacing
• Sizing results (3 x pitch bus) for OWS and GISS
[Figure: sizing results for a 4-bit bus; the horizontal direction is scaled down by 1000x]
• GISS is about 30% better than OWS for delay reduction.