
Clock Routing Based on X-Architecture Pattern Matching

Chia-Chun Tsai, Professor

Dept. of Computer Science and Information Engineering

Nanhua University

Dept. of Computer Science and Engineering

Yuan Ze University

Oct. 03, 2008

Outline

Introduction
Problem Formulation
Proposed Algorithm
Experimental Results
Conclusion

Introduction

Clock routing is an interesting geometric problem.

› How can a particular point (the clock source) be connected to a number of points (the clock sinks) so that all source-to-sink paths are equal in length?

[Figure: a clock source connected to its clock sinks]

MMM Approach [Jackson 90]

The MMM (Method of Means and Medians) algorithm builds the clock tree by recursively partitioning the set of sinks.

[Figure: recursive partitioning of the sinks by Cut 1, Cut 2, and Cut 3]
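The recursive partitioning can be sketched as follows. This is a minimal illustration, not the original MMM implementation: it assumes sinks are (x, y) tuples, that cuts alternate between x- and y-medians, and that each region is represented by its center of mass. The names `mmm_partition`, `center_of_mass`, and `count_leaves` are hypothetical.

```python
def center_of_mass(points):
    """Mean x and mean y of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def mmm_partition(sinks, depth=0):
    """Recursively split the sinks at the median, alternating x and y cuts.

    Returns a nested dict. In this sketch a wire would run from each
    node's center to the centers of its two children (an assumption).
    """
    node = {"center": center_of_mass(sinks), "children": []}
    if len(sinks) > 1:
        axis = depth % 2  # 0: cut on x, 1: cut on y
        ordered = sorted(sinks, key=lambda p: p[axis])
        mid = len(ordered) // 2
        node["children"] = [
            mmm_partition(ordered[:mid], depth + 1),
            mmm_partition(ordered[mid:], depth + 1),
        ]
    return node


def count_leaves(node):
    """Number of single-sink regions at the bottom of the partition tree."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


tree = mmm_partition([(0, 0), (4, 0), (0, 4), (4, 4)])
```

For the four corner sinks above, the root sits at the center of the square and the first cut separates the left pair from the right pair.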

GMA Approach [Kahng 91]

The GMA (Geometric Matching Algorithm) is based on a bottom-up matching approach.

[Figure: matched subtrees with H-flip]

WCA Approach [Bo 91]

The WCA (Weighted Center based Algorithm) searches for the next tapping point using a new weighted center.

DME Approach [BK92, CHH92, Eda91]

The DME (Deferred Merge Embedding) algorithm has two phases: the bottom-up phase constructs a tree of merging segments, and the top-down embedding phase determines the exact locations.

[Figures: the bottom-up phase in DME; the top-down phase in DME]

GDME Approach [Wu 07]

The GDME (Grey relation analysis for DME) approach, illustrated with 29 clock sinks:

› Partition S by alternating x- and y-medians, following the MMM approach, until the number of clock sinks in each partition zone, Z, is less than or equal to four.

GDME Approach (Cont’d)

› Apply Grey relational analysis together with the DME approach. Then recursively split and construct a minimum-cost clock tree.


Clock Routing for 512 Sinks


Clock Tree Construction for Benchmark r5

Clock Network in a Chip

A typical SoC architecture contains a physical clock network.

Two factors characterize a clock network: clock delay and clock skew.
› The maximum clock delay dominates the operating frequency.
› Clock skew (max clock delay – min clock delay) may cause chip functions to fail.
› Wanted: minimize the maximum clock delay and achieve exactly zero skew.

[Figure: clock network]

Wire Delay and Sink Loading

Two typical delay models for a wire. Here r is the sheet resistance, ca is the unit area capacitance, cf is the unit fringing capacitance, and CL is the load capacitance of a clock sink.

› Elmore delay model (Elmore 48)
› The FED (Fitted Elmore Delay) model (Abou-Seido 04)
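The equations for the two models did not survive in this transcript. As a hedged sketch, the functions below assume the usual Elmore delay of a uniform wire of length l and width w driving a load CL, and an FED form that scales the area, fringing, and load terms by fitted coefficients D, E, and F (the coefficient values are the ones listed with the fabrication parameters in the experimental setup). The function names and argument order are illustrative only.

```python
import math

LN2 = math.log(2)


def elmore_wire_delay(r, ca, cf, length, width, load):
    """Elmore delay of a uniform wire: R_wire * (C_wire / 2 + C_load)."""
    wire_res = r * length / width
    wire_cap = (ca * width + cf) * length
    return wire_res * (wire_cap / 2.0 + load)


def fed_wire_delay(r, ca, cf, length, width, load,
                   D=1.12673 * LN2, E=1.10463 * LN2, F=1.04836 * LN2):
    """Fitted Elmore delay (assumed form): each capacitance term is
    scaled by its own fitted coefficient D (area), E (fringe), F (load)."""
    wire_res = r * length / width
    return wire_res * ((D * ca * width + E * cf) * length / 2.0 + F * load)


# Illustrative numbers only; units must be kept consistent by the caller.
example_elmore = elmore_wire_delay(0.623, 0.00598, 0.043, 100.0, 1.0, 10.0)
example_fed = fed_wire_delay(0.623, 0.00598, 0.043, 100.0, 1.0, 10.0)
```

With D = E = F = 1 the assumed FED form reduces to the plain Elmore expression, which is a quick sanity check on both functions.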


Interconnection Delay

Interconnects dominate signal delay

Data from ITRS Roadmap

Clock Tree Topology

Delay = max. delay
Skew = max. delay - min. delay

[Figure: a three-level clock tree (levels 1 to 3) from the clock source through Steiner points to eight sinks. Edge weights, from the source downward: 20, 25; then 7, 8, 7, 5; then 4, 4, 4, 4, 2, 2, 3, 3. Resulting sink delays: 31, 31, 32, 32, 34, 34, 33, 33.]

Delay = 34
Skew = 34 - 31 = 3
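The delay and skew of this example can be reproduced by summing edge weights along each source-to-sink path. The sketch below is only an illustration of the two definitions. The nested-list encoding of the tree and the names `sink_delays` and `leaf` are assumptions, and the edge-to-branch assignment is the one that matches the sink delays shown on the slide.

```python
def sink_delays(children, accumulated=0):
    """Sum edge weights along every root-to-leaf path.

    children is a list of (edge_weight, grandchildren) pairs;
    a leaf (sink) has an empty grandchildren list.
    """
    delays = []
    for weight, sub in children:
        total = accumulated + weight
        if sub:
            delays.extend(sink_delays(sub, total))
        else:
            delays.append(total)
    return delays


def leaf(weight):
    """A sink reached through an edge of the given weight."""
    return (weight, [])


tree = [
    (20, [(7, [leaf(4), leaf(4)]), (8, [leaf(4), leaf(4)])]),
    (25, [(7, [leaf(2), leaf(2)]), (5, [leaf(3), leaf(3)])]),
]

delays = sink_delays(tree)
delay = max(delays)               # clock delay = maximum sink delay
skew = max(delays) - min(delays)  # clock skew = max delay - min delay
```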

Manhattan vs. X-architecture Clock Routings

Manhattan routing (horizontal and vertical)
› Leads to
  - Longer wire length on average
  - Worse performance, dominated by interconnect delay

X-architecture routing
› Reduces wire length
› Proviso: the manufacturing technology must support the diagonal routing direction.
› TSMC and UMC are ready for 65-nm X-Architecture designs (EE Times, May 25, 2006).
  http://www.eetimes.com/news/design/showArticle.jhtml?articleID=188500129

[Figure: partial routing result, Primary 1 @ 0.13 μm]


Layer definition in Manhattan and X Architectures

Comparison of Manhattan and X-Architectures

Manhattan vs. X-architecture
› Same area, higher performance
› Same performance, less area

Manhattan vs. X Architectures

X-architecture (horizontal, vertical, and diagonal)

› Straight-line (arbitrary angle) length between (x1, y1) and (x2, y2):
  \[ L = \left[ (x_1 - x_2)^2 + (y_1 - y_2)^2 \right]^{1/2} \]
› Manhattan wire length, where α is the angle of the connection:
  \[ L_M = L(\sin\alpha + \cos\alpha) \]
› X-architecture wire length:
  \[ L_X = L(0.41\sin\alpha + \cos\alpha) \]

› Benefits [Teig IWSLIP2002]:
  » 20% reduction in wire length
  » 20% saving in power
  » 10% improvement in chip performance
  » 30% reduction in die cost

[Figure: a connection from (x1, y1) to (x2, y2) routed at an arbitrary angle, in the Manhattan architecture (length LM), and in the X-architecture (length LX, with a 45° segment); layer assignment on Metal 1 to Metal 4 between s1 and s2; partial routing result, Primary 1 @ 0.13 μm]
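The three lengths can be compared numerically. The sketch below works from coordinate differences rather than the angle α: the X-architecture length is the longer coordinate difference plus (√2 − 1) ≈ 0.41 times the shorter one, which matches LX = L(0.41 sin α + cos α) when α is at most 45°. The function name `wire_lengths` is hypothetical.

```python
import math


def wire_lengths(p1, p2):
    """Return (L, LM, LX) for a two-pin connection.

    L  : straight-line (arbitrary angle) length
    LM : Manhattan length (horizontal + vertical only)
    LX : X-architecture length (horizontal, vertical, and 45-degree)
    """
    dx = abs(p1[0] - p2[0])
    dy = abs(p1[1] - p2[1])
    euclid = math.hypot(dx, dy)
    manhattan = dx + dy
    x_arch = max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)
    return euclid, manhattan, x_arch


L, LM, LX = wire_lengths((0, 0), (4, 3))
```

For this pair of points the X-architecture route is shorter than the Manhattan route and longer than the straight line, as expected.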

Our Contribution

Construct a ZST (Zero Skew Tree) based on the X-architecture and 16 predefined matching patterns
Simplify the DME merging procedures
X-flip shortens wire length
Wire sizing reduces routing resources

[Figure: routing result, r1 @ 0.13 μm]

Outline

Introduction
Problem Formulation
Proposed Algorithm
Experimental Results
Conclusion

Problem Formulation

A general CRP (clock routing problem):

Given: a set of n clock sinks, S = {s1, s2, …, sn}

Objective: construct a ZST (Zero Skew clock Tree) based on the X-architecture with better performance.

DME-4 [Shen ISCAS06]

Associated with DME (Deferred Merge Embedding) [Chao TCAD92]
Constructs a TOR (Tiled Octangular Region) in the bottom-up phase of DME.
Resolves the exact coordinates in the top-down phase of DME.
Uses balanced bipartition to reduce wire length.
Delay model: FED (Fitted Elmore Delay) [Abou-Seido TVLSI04]

[Figure: the TORs of s1 and s2, with radius1 and radius2, and the resulting merging segment]

The construction procedure should be easier!

NVM [Wang VLSI-DAT07]

Also uses DME to construct a ZST (Zero Skew Tree).
Focuses on NVM (Node Via Minimization): reducing the number of vias is crucial.
Delay model: Elmore model

They use various layer definitions, which is not practical enough.

[Figure: node vias and edge vias across Metal 1 to Metal 4]

Definition of Our Clock Problem

Given: a set of clock sinks, S = {s1, s2, …, sn}, and an X-pattern library.

Objective: construct a ZST based on the X-architecture with better performance.

Preliminary
› Layer definition
› One-bend X-pattern
› 16 X-patterns as a library

[Figure: the two one-bend X-patterns, PTN_1 and PTN_2, connecting s1 and s2]

X-pattern library (rows: zone location of start point; columns: zone location of end point)

      SLT    SRT    SLB    SRB
LT    PTN_R  PTN_1  PTN_2  PTN_R
RT    PTN_1  PTN_R  PTN_R  PTN_2
LB    PTN_2  PTN_R  PTN_R  PTN_1
RB    PTN_R  PTN_2  PTN_1  PTN_R
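The pattern library is a 4 x 4 lookup table, so choosing a pattern amounts to two zone classifications and one lookup. The sketch below makes several assumptions that the slides do not spell out: zones are quadrants of the routing area around its center, sub-zones are quadrants around the start point, the y-axis points up, and PTN_R acts as a wildcard when two lookups are combined (suggested by the deck's example PTN_1 ∩ PTN_R = PTN_1). How a PTN_1/PTN_2 disagreement is resolved is not stated, so `combine` returns None in that case. All function names are hypothetical.

```python
PTN_1, PTN_2, PTN_R = "PTN_1", "PTN_2", "PTN_R"

# Rows: zone of the start point; columns: sub-zone of the end point.
X_PATTERN_TABLE = {
    "LT": {"SLT": PTN_R, "SRT": PTN_1, "SLB": PTN_2, "SRB": PTN_R},
    "RT": {"SLT": PTN_1, "SRT": PTN_R, "SLB": PTN_R, "SRB": PTN_2},
    "LB": {"SLT": PTN_2, "SRT": PTN_R, "SLB": PTN_R, "SRB": PTN_1},
    "RB": {"SLT": PTN_R, "SRT": PTN_2, "SLB": PTN_1, "SRB": PTN_R},
}


def zone(point, origin):
    """Quadrant of point relative to origin: LT, RT, LB, or RB (assumed tiling)."""
    horizontal = "L" if point[0] < origin[0] else "R"
    vertical = "T" if point[1] >= origin[1] else "B"
    return horizontal + vertical


def cpxp(start, end, center):
    """Look up the pattern for routing from start to end."""
    start_zone = zone(start, center)
    end_subzone = "S" + zone(end, start)
    return X_PATTERN_TABLE[start_zone][end_subzone]


def combine(a, b):
    """Combine two lookups, treating PTN_R as 'either pattern' (assumption)."""
    if a == PTN_R:
        return b
    if b == PTN_R or a == b:
        return a
    return None  # conflict: not covered by the slides


center = (5, 5)
x1, x2 = (4, 6), (6, 8)
pattern = combine(cpxp(x1, x2, center), cpxp(x2, x1, center))
```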

X-Pattern

Main idea:
› The clock source is located near the center of the routing area.
› Centralize all the routing wires.

[Figure: complete routing result, r1 @ 0.13 μm]

X-Pattern (cont’d)

Assume that s1 and s2 are paired.
› Step 1. Tile the routing area: s1 is located in LT.
› Step 2. Tile the routing area of s1: s2 is located in SRT.
› Step 3. Define the X-pattern for the 4 sub-zones.

[Figure: sinks s1 to s8 in a routing area tiled into zones LT, RT, LB, and RB; the area of s1 tiled into sub-zones SLT, SRT, SLB, and SRB; PTN_1 and PTN_2 drawn between s1 and s2 for each sub-zone]

X-Pattern (cont’d)

[Figure: the zone and sub-zone tiling for s1 and s2 from the previous slide, shown next to the resulting pattern table]

Rows: zone location of start point; columns: zone location of end point

      SLT    SRT    SLB    SRB
LT    PTN_R  PTN_1  PTN_2  PTN_R
RT    PTN_1  PTN_R  PTN_R  PTN_2
LB    PTN_2  PTN_R  PTN_R  PTN_1
RB    PTN_R  PTN_2  PTN_1  PTN_R

Outline

Introduction
Problem Formulation
Proposed Algorithm
Experimental Results
Conclusion

Proposed Algorithm

PMXF (Pattern-Matching based on X-clock routing with X-Flip) algorithm

Algorithm PMXF
Input: a set S of sinks and a library of 16 X-patterns for a pair of points
Output: a ZST based on the X-architecture with zero skew and minimal delay
begin
  While (|S| > 1)
  {
    (s1, s2) = DPPG(S);                      // Determine a pair of points using GMA
    Pattern = CPXP(s1, s2) ∩ CPXP(s2, s1);   // Choose proper X-pattern
    Pt = DCTP(s1, s2, x);                    // Find tapping point Pt of s1 and s2
    If (x < 0) WireSizing(s1, s2);           // Adjust w2
    If (x > 1) WireSizing(s2, s1);           // Adjust w1
    DME-X(s1, s2, Pt, Pattern);              // Construct the clock tree
    X-Flip(s1, s2);                          // Reduce wire length
    Insert(S, Pt);                           // Insert Pt to S
  }
End

DPPG Procedure

Determine Pair of Points in GMA
› GMA is a bottom-up algorithm [Kahng DAC91]
› Focuses on path-length balancing

[Figure: DPPG is applied repeatedly to pair the points X1 to X8, producing the merged points X9 to X15]

Time complexity: O(log n)
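To make the bottom-up pairing concrete, the sketch below swaps in a much simpler rule than GMA's geometric matching: at each step it greedily picks the closest pair under the X-architecture distance and replaces the pair by its midpoint. The midpoint stands in for the tapping point, which the actual flow computes separately. This is an illustrative substitute, not the DPPG procedure itself, and its brute-force pair search does not reflect the complexity stated on the slide. The names `x_distance`, `pick_pair`, and `pairing_order` are hypothetical.

```python
import math
from itertools import combinations


def x_distance(p, q):
    """Shortest route length using horizontal, vertical, and 45-degree wires."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)


def pick_pair(points):
    """Greedy stand-in for DPPG: the closest pair under x_distance."""
    return min(combinations(points, 2), key=lambda pair: x_distance(*pair))


def pairing_order(sinks):
    """Merge points pairwise until one remains; return the merge list."""
    points = list(sinks)
    merges = []
    while len(points) > 1:
        a, b = pick_pair(points)
        points.remove(a)
        points.remove(b)
        merged = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)  # placeholder tapping point
        merges.append((a, b, merged))
        points.append(merged)
    return merges


merges = pairing_order([(0, 0), (1, 0), (10, 10), (10, 12)])
```

With n sinks the loop performs n − 1 merges, matching the 8 sinks and 7 merged points in the figure.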

CPXP Procedure

Choose Proper X-Pattern
› Ex. CPXP(X1, X2)
› Step 1. Tile the routing area: X1 is located in LT.
› Step 2. Tile the routing area of the start point, X1: X2 is located in SRT.
› Step 3. Look up the given X-pattern table.

CPXP(X1, X2) = PTN_1
CPXP(X2, X1) = PTN_R
CPXP(X1, X2) ∩ CPXP(X2, X1) = PTN_1

[Figure: CPXP applied to each pair among the points X1 to X15, with the routing area tiled into zones LT, RT, LB, and RB and sub-zones SLT, SRT, SLB, and SRB]

Time complexity: O(log n)

DCTP Procedure

Determine Coordinate of Tapping Point
› The tapping point, Pt, is determined so as to achieve zero skew. [Tsay ICCAD91]
› Zero skew condition ratio, x.
› If 0 ≤ x ≤ 1, the tapping point is located on the wire.
› If x < 0 or x > 1, a snaking wire is needed.
› Use binary search to determine the coordinate. [Wu IEICE07]

Time complexity: O(n)
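The zero-skew ratio and the binary search can be illustrated together. The sketch below assumes the plain Elmore model with per-unit-length resistance r and capacitance c (not the fitted model used in the experiments), subtree delays t1 and t2, and subtree loads c1 and c2. `tsay_ratio` is the closed-form balance point for that model, and `bisect_ratio` finds the same point by binary search on the skew, returning None when the balance point falls outside the wire. All names are hypothetical.

```python
def wire_delay(r, c, length, load):
    """Elmore delay of a uniform wire driving a lumped load."""
    return r * length * (c * length / 2.0 + load)


def tsay_ratio(r, c, length, t1, c1, t2, c2):
    """Closed-form zero-skew ratio x under the Elmore model."""
    numerator = (t2 - t1) + r * length * (c2 + c * length / 2.0)
    denominator = r * length * (c * length + c1 + c2)
    return numerator / denominator


def bisect_ratio(r, c, length, t1, c1, t2, c2, tol=1e-12):
    """Binary search for x in [0, 1] where both branch delays are equal."""
    def skew(x):
        left = t1 + wire_delay(r, c, x * length, c1)
        right = t2 + wire_delay(r, c, (1.0 - x) * length, c2)
        return left - right  # increases monotonically with x

    if skew(0.0) > 0 or skew(1.0) < 0:
        return None  # x < 0 or x > 1: snaking or wire sizing is needed
    low, high = 0.0, 1.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if skew(mid) < 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


x_closed = tsay_ratio(0.1, 0.2, 100.0, 0.0, 5.0, 20.0, 5.0)
x_search = bisect_ratio(0.1, 0.2, 100.0, 0.0, 5.0, 20.0, 5.0)
```

A search of this kind also works for delay models without a convenient closed form, as long as the skew is monotonic in x.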

Wire Sizing

Wire snaking is one of the common methods for constructing a ZST.
Benefits of adopting wire sizing [El-Moursy GLSVLSI03]
› Releases routing resources
› But needs extra power due to the wider wires

[Figure: snaking wire vs. sized wire]

Wire Sizing (cont’d)

Consider the zero skew condition, x < 0.

\[ d(s_1, P_t) = \mathrm{FED}(l_1, w_1, C_{L1}) = \frac{r\,l_1}{w_1}\left[\frac{(D\,c_a\,w_1 + E\,c_f)\,l_1}{2} + F\,C_{L1}\right] = d(s_2, P_t) \]

With d(s2, Pt) = FED(l2, w2, CL2), solving this condition for the width w2 gives:

\[ w_2 = \frac{l_2\left(\dfrac{E\,c_f\,l_2}{2} + F\,C_{L2}\right)}{\dfrac{l_1}{w_1}\left(\dfrac{E\,c_f\,l_1}{2} + F\,C_{L1}\right) + \dfrac{D\,c_a\,(l_1^2 - l_2^2)}{2}} \]

Time complexity: O(n)
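A small numerical check of the sizing step, under stated assumptions: the FED form d = (r·l/w)·[(D·ca·w + E·cf)·l/2 + F·CL], subtrees with no internal delay, and a width solved algebraically from d(s1, Pt) = d(s2, Pt). The solution here is derived for this sketch rather than copied from the slide, and the names `fed` and `sized_width` are hypothetical.

```python
import math

LN2 = math.log(2)
D, E, F = 1.12673 * LN2, 1.10463 * LN2, 1.04836 * LN2


def fed(r, ca, cf, length, width, load):
    """Fitted Elmore delay of one wire (assumed form)."""
    return (r * length / width) * ((D * ca * width + E * cf) * length / 2.0 + F * load)


def sized_width(r, ca, cf, l1, w1, cl1, l2, cl2):
    """Width w2 that makes the s2 branch as slow as the s1 branch.

    Returns None when no positive width can balance the two branches.
    """
    target = fed(r, ca, cf, l1, w1, cl1) / r
    denominator = target - D * ca * l2 * l2 / 2.0
    if denominator <= 0:
        return None
    return l2 * (E * cf * l2 / 2.0 + F * cl2) / denominator


r, ca, cf = 0.623, 0.00598, 0.043  # illustrative values, consistent units assumed
w2 = sized_width(r, ca, cf, l1=100.0, w1=1.0, cl1=10.0, l2=60.0, cl2=10.0)
```

Because the shorter branch must be slowed down, the solved width comes out narrower than w1 in this example. Plugging w2 back into `fed` reproduces the delay of the s1 branch.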

DME-X Procedure

Traditional DME based on X-arch.
Bottom-up phase
› Create TOR.
› Merge.

[Figure: bottom-up merging of the points X1 to X8]

DME-X (cont’d)

Traditional DME based on X-arch.
Bottom-up phase
› Create TOR.
› Merge.
Top-down phase
› Determine the points’ locations.
› Connect all the nodes.

[Figure: the points X1 to X8 and the merged points X9 to X15 after both phases]

DME-X (cont’d)

Our DME-X method
› Integrate the bottom-up and top-down phases
› Construct the parallelogram
› DCTP(X4, X6)
› CPXP(X4, X6) ∩ CPXP(X6, X4)
› Tip! Run CPXP first, then DCTP, to save running time.

[Figure: the points X1 to X15, with X9’ obtained after DPPG]

Time complexity: O(n)

X-Flip Procedure

Exchange the X-pattern based on the predefined patterns.

[Figure: s1 and s2 connected by PTN_1 and PTN_2, before and after the exchange]

Complete routing result: 08-5 @ 0.13 μm
Delay = 4454.614 ps
Cost = 38219.374 μm
Power = 0.000531 W

X-Flip (cont’d)

Check the length of the (i-1)th level when constructing the ith level.

Complete routing result: 08-5 @ 0.13 μm with X-Flip
Delay = 4139.209 ps, saving 7%
Cost = 36334.753 μm, saving 4.9%
Power = 0.000515 W, saving 3%

Time complexity: O(n)

Time Complexity Analysis

Algorithm PMXF
Input: a set S of sinks and a library of 16 X-patterns for a pair of points
Output: a ZST based on the X-architecture with zero skew and minimal delay
begin
  While (|S| > 1)
  {
    (s1, s2) = DPPG(S);                      // Determine a pair of points using GMA
    Pattern = CPXP(s1, s2) ∩ CPXP(s2, s1);   // Choose proper X-pattern
    Pt = DCTP(s1, s2, x);                    // Find tapping point Pt of s1 and s2
    If (x < 0) WireSizing(s1, s2);           // Adjust w2
    If (x > 1) WireSizing(s2, s1);           // Adjust w1
    DME-X(s1, s2, Pt, Pattern);              // Construct the clock tree
    X-Flip(s1, s2);                          // Reduce wire length
    Insert(S, Pt);                           // Insert Pt to S
  }
End

Time complexity: O(log n)
Time complexity: O(n log n)
Time complexity: O(n)

Outline

Introduction
Problem Formulation
Proposed Algorithm
Experimental Results
Conclusion

Experimental Results

Platform: WinXP-SP2 on P4-M 1.7G with 1G memory
Compiler: Borland C++ Builder 6.0
IBM benchmarks, r1-r5, are used for testing our algorithm PMXF.
Our PMXF is compared with
› DME-4 [Shen ISCAS06], based on the fitted Elmore delay model
› NVM [Wang VLSI-DAT07], based on the Elmore delay model

0.13 μm fabrication parameters are used.

DME-4 (fitted Elmore)                                     NVM (Elmore)
r   0.623 Ω/m       D  1.12673 ln2    Fclk  100 MHz       r  0.623 Ω/m
ca  0.00598 fF/m    E  1.10463 ln2    Vdd   1.2 V         c  0.118 fF/μm
cf  0.043 fF/m      F  1.04836 ln2


Our GUI


Our GUI (cont’d)

Our Results Based on FED Model

Comparison of our algorithm without X-Flip (PMX) and with X-Flip (PMXF) in terms of delay, wire length, power consumption, total via, and runtime for the FED model. Ratio = PMXF/PMX.

Benchmark  #Sinks  | Delay (s)                 | Wire length (μm)          | Power (W)                 | Total via           | Runtime (s)
                   | PMX       PMXF      Ratio | PMX       PMXF      Ratio | PMX       PMXF      Ratio | PMX    PMXF   Ratio | PMX       PMXF      Ratio
r1         267     | 0.47415   0.310858  0.656 | 1406401   1383347   0.983 | 0.074957  0.076774  1.024 | 1248   1215   0.973 | 6.72      6.859     1.02
r2         598     | 1.130498  0.841717  0.744 | 3000575   2863408   0.954 | 0.194351  0.19197   0.987 | 2744   2816   1.026 | 28.12     31.535    1.121
r3         862     | 1.632144  1.790971  1.097 | 3750372   3651790   0.973 | 0.263825  0.25889   0.981 | 3993   4019   1.006 | 65.163    70.261    1.078
r4         1903    | 4.639215  3.989911  0.860 | 7593864   7221328   0.950 | 0.617959  0.599185  0.969 | 9170   9160   0.998 | 754.024   897.16    1.189
r5         3101    | 8.987384  7.881827  0.877 | 11322668  10855445  0.958 | 1.023591  0.998684  0.975 | 14624  14528  0.993 | 1840.375  2309.672  1.255
Average            | -         -         0.847 | -         -         0.964 | -         -         0.987 | -      -      0.999 | -         -         1.126

› Improves delay by 15.3%
› Improves wire length by 3.6% and power by 1.3%
› Improves total via by 0.1%, but needs 12.6% more runtime

Our Results Based on ED Model

Comparison of our algorithm without X-Flip (PMX) and with X-Flip (PMXF) in terms of delay, wire length, power consumption, total via, and runtime for the ED model. Ratio = PMXF/PMX.

Benchmark  #Sinks  | Delay (s)                 | Wire length (μm)          | Power (W)                 | Total via           | Runtime (s)
                   | PMX       PMXF      Ratio | PMX       PMXF      Ratio | PMX       PMXF      Ratio | PMX    PMXF   Ratio | PMX       PMXF      Ratio
r1         267     | 0.165138  0.137993  0.836 | 1517071   1364700   0.900 | 0.165665  0.163641  0.988 | 1230   1229   0.999 | 6.930     7.491     1.081
r2         598     | 0.455854  0.320785  0.704 | 2917878   2788433   0.956 | 0.377402  0.374030  0.991 | 2846   2867   1.007 | 25.577    30.144    1.179
r3         862     | 0.526273  0.498202  0.947 | 3757000   3696636   0.984 | 0.525380  0.514669  0.980 | 4016   3993   0.994 | 60.213    66.957    1.112
r4         1903    | 1.822653  1.614070  0.886 | 7513128   7363705   0.980 | 1.208581  1.185089  0.981 | 9220   8912   0.967 | 732.046   790.046   1.079
r5         3101    | 2.577663  2.095517  0.813 | 11246479  10854213  0.965 | 1.978120  1.952519  0.987 | 14718  14546  0.988 | 1597.062  1689.281  1.058
Average            | -         -         0.837 | -         -         0.957 | -         -         0.985 | -      -      0.991 | -         -         1.102

› Improves delay by 16.3%
› Improves wire length by 4.3% and power by 1.5%
› Improves total via by 0.9%, but needs 10.2% more runtime


Clock Tree Construction of r5 Based on PMXF

#sinks: 3101

Delay: 7.881827 s

Skew: 0

#vias: 14528

Power: 0.998684 W

Runtime: 2309.672s


Our Results Compared with DME-4

[8] W. Shen, Y. Cai, J. Hu, X. Hong, and B. Lu, “High Performance Clock Routing in X-architecture,” IEEE International Symposium On Circuits and Systems, 2006, pp. 2081-2084.

Compare our PMXF algorithm with DME-4[8] in terms of delay, wire length, and power consumption for FED model.

Our Results Compared with DME-4

The comparison of our algorithm and DME-4[8] in delay

                     Delay (s)
Benchmarks  #sinks   DME-4[8]   PMXF       PMXF/DME-4
r1          267      0.471340   0.310858   0.659
r2          598      1.145970   0.841717   0.734
r3          862      1.664930   1.790971   1.075
r4          1903     4.631840   3.989911   0.861
r5          3101     9.053950   7.881827   0.871
Average              -          -          0.840

Improves delay by 16%
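The summary figure is the mean of the per-benchmark ratios, and the improvement is one minus that mean. The short check below recomputes both from the delay values in the table. The variable names are illustrative, and ratios computed at full precision can differ from the tabulated ones in the last digit.

```python
# (DME-4 delay, PMXF delay) per benchmark, copied from the table above
delays = {
    "r1": (0.471340, 0.310858),
    "r2": (1.145970, 0.841717),
    "r3": (1.664930, 1.790971),
    "r4": (4.631840, 3.989911),
    "r5": (9.053950, 7.881827),
}

# Ratio below 1 means PMXF is faster than DME-4 on that benchmark.
ratios = {name: pmxf / dme4 for name, (dme4, pmxf) in delays.items()}

average_ratio = sum(ratios.values()) / len(ratios)
improvement_pct = (1.0 - average_ratio) * 100.0
```

Note that r3 is the one benchmark where the ratio exceeds 1, so the average improvement hides a per-benchmark regression.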

Our Results Compared with DME-4

The comparison of our algorithm and DME-4[8] in wire length

                     Wire length (μm)
Benchmarks  #sinks   DME-4[8]   PMXF       PMXF/DME-4
r1          267      1414960    1383347    0.977
r2          598      2863420    2863408    0.999
r3          862      3656580    3651790    0.998
r4          1903     7245500    7221328    0.996
r5          3101     10971100   10855445   0.989
Average              -          -          0.992

Improves wire length by 0.8%

Our Results Compared with DME-4

The comparison of our algorithm and DME-4[8] in power

                     Power (W)
Benchmarks  #sinks   DME-4[8]   PMXF       PMXF/DME-4
r1          267      0.074594   0.076785   1.029
r2          598      0.180590   0.174153   0.964
r3          862      0.254845   0.258602   1.015
r4          1903     0.589042   0.583533   0.991
r5          3101     0.981078   0.909697   0.927
Average              -          -          0.985

Improves power by 1.5%


Our Results Compared with NVM

[9] C. H. Wang and W. K. Mak, “λ-Geometry clock tree construction with wire length and via minimization,” IEEE International Symposium on VLSI-DAT, 2007, pp. 124-127.

Compare our PMXF algorithm with NVM[9] in terms of via and wire length for ED model.

Our Results Compared with NVM

The comparison of our algorithm and NVM[9] in node/total via

                     Node via          Total via
Benchmarks  #sinks   NVM[9]   PMXF     NVM[9]   PMXF     PMXF/NVM
r1          267      832      720      1486     1229     0.827
r2          598      1859     1658     3401     2867     0.843
r3          862      2689     2335     4921     3993     0.811
r4          1903     6046     5200     10769    8912     0.828
r5          3101     9818     8504     17681    14546    0.823
Average              -        -        -        -        0.826

Improves total via by 17.4%

Our Results Compared with NVM

The comparison of our algorithm and NVM[9] in wire length

                     Wire length (μm)
Benchmarks  #sinks   NVM[9]     PMXF       PMXF/NVM
r1          267      1200300    1364700    1.137
r2          598      2354000    2788433    1.185
r3          862      3074900    3696636    1.202
r4          1903     6145000    7363705    1.198
r5          3101     9152300    10854213   1.186
Average              -          -          1.181

Worsens wire length by 18.1%

Explanation for Wire Length

Why did we perform worse than NVM in wire length?
› The two methods build different topologies, and the topology determines the wire length.

Merging rules used in NVM
› N = |S|/c, where c is a constant.
› Step 1. Sort the corresponding edges, eim, of each sink si, where si ∈ S and eim ∈ E.
› Step 2. Get the first N edges in E.
› Step 3. Merge the N corresponding elements in S.
› Step 4. Remove the merged elements from S and add the new merged ones.

Conclusion

The X-architecture has been shown to be more effective than the Manhattan architecture in constructing a ZST.
We defined 16 X-routing patterns to simplify the merging procedures in constructing an X-based ZST.
X-flip can shorten wire length and reduce clock delay.
Wire sizing removes snaking wires and saves routing resources.
Our algorithm performs well in clock delay, wire length, power, and via cost.

Future Works

Insert buffers to get higher performance
Consider the inductance effects in the delay model
Consider DFM problems
› Antenna effect
› Optical correction
› Redundant via insertion
› CMP variation
Planar X-routing with fewer metal layers

Thank you for attending!