Impact of Interconnect Architecture on VPSAs (Via-Programmed Structured ASICs)

Post on 30-Dec-2015


Usman Ahmed, Guy Lemieux, Steve Wilton

System-on-Chip Lab, University of British Columbia

What is a Structured ASIC?

• An FPGA without reprogrammable interconnect
– Interconnect is mask-programmed

• Two types
– Metal Programmable (MPSA): FPT 2009
– Via Programmable (VPSA): this talk

[Figure: interconnect layers on top of transistors]

Key Messages

1. Structured ASICs will be the key technology of the future.
Because the key issues that make structured ASICs attractive have not been solved. They are growing more prominent.

2. Interconnect matters.
MPSAs have better performance, VPSAs are cheaper.

Motivation for Structured ASICs

• Enormous NRE + design cost limits access to advanced processes
• Many ICs are still manufactured with old process technologies
– New processes (90nm and below): 49% of TSMC revenue in 2009 Q1
• FPGAs not suitable for low-power, consumer-oriented, hand-held devices
• SAs reduce mask cost + risk
• SAs must reduce design effort
• SAs make advanced processes profitable
• SAs: lower power, lower cost than FPGAs

Talk Outline

• Cost model
• Experimental methodology
– Metrics
– CAD flow
– Architecture modeling
• Area, cost trends
• Conclusions

VPSA Die-Cost

• Cost is more important than die area
• Primary cost components
– Die area
– Number of configurable layers (new for structured ASICs)
• Secondary cost components
– Die yield
– Wafer and processing cost
– Volume requirements

VPSA Cost Model

\[ \mathrm{Cost}_{die} = C_{base} + C_{custom} + C_{proto} \]

• C_base: cost of the masks for the base (common portion) + cost of fabricating the base portion
• C_custom: cost of the remaining masks + cost of fabricating the remaining portion
• C_proto: similar to C_custom, but depends on the number of spins

Primary Cost Variables
• Die area and yield: N_gdpw
• Configurable layers: N_vl

Constants
• Fixed layers: N_fm^l, N_fm^m, N_fm^v, N_fm^u
• Mask/wafer cost: C_sml, C_smu, C_wpm, N_mprl
• Volume requirements: V_tot, V_c

[Equations: C_base, C_custom and C_proto written out in terms of the variables and constants above]

VPSA Cost Model

\[ \mathrm{Cost}_{die} = \frac{K_0}{N_{gdpw}} + \frac{K_1 N_{vl}}{N_{gdpw}} + K_2 N_{vl} + K_3 \]

• K_0 = $4800, K_1 = $440, K_2 = $0.74, K_3 = $1.08
• Configurability (N_vl): 1 … 5 via layers
• Die area (N_gdpw): 4503 (10 mm² die) … 450 (100 mm² die)
• Key assumptions
– 45nm maskset cost: $2.5M
– Total volume: 2M
– Per-customer volume: 100k
– No. of spins: 2
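A minimal sketch of the simplified cost model, assuming it has the four-term form Cost_die = K_0/N_gdpw + K_1·N_vl/N_gdpw + K_2·N_vl + K_3 and that the four dollar figures on the slide map to K_0 through K_3 in that order. The function name and the reading of N_gdpw as good dies per wafer are assumptions for illustration, not part of the authors' framework.

```python
def cost_die(n_gdpw, n_vl, k0=4800.0, k1=440.0, k2=0.74, k3=1.08):
    """Estimated cost per die in dollars, under the assumed four-term form.

    n_gdpw: dies per wafer figure used by the model (assumed: good dies per wafer)
    n_vl:   number of custom via layers
    """
    if n_gdpw <= 0:
        raise ValueError("n_gdpw must be positive")
    if n_vl < 0:
        raise ValueError("n_vl must be non-negative")
    # Terms that are shared by every die on the wafer.
    per_wafer = (k0 + k1 * n_vl) / n_gdpw
    # Terms that do not depend on how many dies fit on a wafer.
    per_die = k2 * n_vl + k3
    return per_wafer + per_die


if __name__ == "__main__":
    # The two N_gdpw end points quoted on the slide (10 mm2 and 100 mm2 dies).
    for n_gdpw in (4503, 450):
        for n_vl in (1, 5):
            print(f"N_gdpw={n_gdpw:5d}  N_vl={n_vl}  cost=${cost_die(n_gdpw, n_vl):.2f}")
```

With these assumed constants, the 1/N_gdpw terms dominate for large dies, which fits the talk's point that die area and the number of configurable layers are the primary cost components.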

VPSA Cost Model

• At constant cost, area can be traded for number of customizable layers
– VPSA: 11 mm²/layer
– MPSA: 30 mm²/layer

[Figure: constant-cost contours ($8, $12, $18, $25, $35) of core area (mm², 40 to 200) versus number of customizable layers; VPSA panel: N_vl = 1 to 5; MPSA panel: N_rl = 2 to 6]
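The trade-off can be sketched numerically. This assumes the four-term cost form Cost_die = K_0/N_gdpw + K_1·N_vl/N_gdpw + K_2·N_vl + K_3 with K values of $4800, $440, $0.74 and $1.08, plus one extra assumption that is not in the talk: N_gdpw scales as 45000 / area, which matches the two quoted points (4503 at 10 mm², 450 at 100 mm²) but ignores how yield changes with die size. All names are illustrative.

```python
# Assumed constants (see lead-in): K values and N_gdpw * area_mm2.
K0, K1, K2, K3 = 4800.0, 440.0, 0.74, 1.08
DIES_TIMES_AREA = 45000.0


def iso_cost_area(target_cost, n_vl):
    """Area (mm^2) at which a VPSA with n_vl via layers costs target_cost per die."""
    per_die = K2 * n_vl + K3
    if target_cost <= per_die:
        raise ValueError("target cost is below the area-independent cost")
    # cost = (K0 + K1*n_vl) * area / DIES_TIMES_AREA + per_die, solved for area.
    return (target_cost - per_die) * DIES_TIMES_AREA / (K0 + K1 * n_vl)


if __name__ == "__main__":
    layers = range(1, 6)
    areas = [iso_cost_area(12.0, n) for n in layers]
    for n, area in zip(layers, areas):
        print(f"N_vl={n}: {area:.1f} mm^2 at $12 per die")
    # Area given up for each additional via layer along the $12 contour.
    steps = [a - b for a, b in zip(areas, areas[1:])]
    print("mm^2 per added layer:", [round(s, 1) for s in steps])
```

Under these assumptions the $12 contour gives up roughly 8 to 13 mm² per added via layer, the same range as the 11 mm²/layer quoted for VPSAs.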


Metrics

• Cost
– Detailed cost model (just presented)
• Area
– Placement grid size after whitespace insertion (determined by CAD flow)
• Delay and Power
– Please see paper


CAD Flow

• Inputs: input circuit, interconnect architecture
• Output: area/delay/power/cost estimate

1. Tech. mapping: T-VPack
2. Placement: std. cell placer (CAPO)
– Simulated annealing too slow
– Uniform whitespace distribution
– Logic cells snapped to grid
3. Global routing: std. cell router (FGR)
– Any congestion? Yes: insert whitespace (increase placement grid size). No: continue
4. Detailed routing: custom router, Pathfinder based
– Routing graph for only single basic tile
– Expand wavefront only along the global route
– Any congestion? Yes: insert whitespace. No: continue
5. Area/delay/power/cost estimate
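The congestion loop in this flow can be sketched as below. The three callables are hypothetical stand-ins for the placer, global router and detailed router (they are not the CAPO or FGR interfaces), each router is assumed to return None when it finds congestion, and the sketch assumes the flow restarts from placement with a placement grid one unit larger per side after each whitespace insertion.

```python
import math


def run_flow(num_blocks, place, global_route, detailed_route, max_iters=100):
    """Grow the placement grid until both routing stages succeed.

    Returns (grid_side, iterations). The callables are hypothetical:
      place(grid_side) -> placement
      global_route(placement) -> route, or None if congested
      detailed_route(placement, route) -> routing, or None if congested
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    # Smallest square grid that holds every logic block (ceil of the square root).
    grid_side = math.isqrt(num_blocks - 1) + 1
    for iteration in range(1, max_iters + 1):
        placement = place(grid_side)
        route = global_route(placement)
        if route is not None and detailed_route(placement, route) is not None:
            return grid_side, iteration
        # Congestion at either stage: insert whitespace by enlarging the grid.
        grid_side += 1
    raise RuntimeError("no congestion-free solution within max_iters")


if __name__ == "__main__":
    # Toy stand-ins: routing succeeds once the grid is large enough.
    result = run_flow(
        100,
        place=lambda side: side,
        global_route=lambda p: p if p >= 12 else None,
        detailed_route=lambda p, r: r if p >= 14 else None,
    )
    print(result)
```

The final grid side is what the area metric would report in this sketch, since area is taken from the placement grid size after whitespace insertion.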


Routing Fabrics

• Crossover Fabric: all wires same length!
• Jumper Fabric: long wires OK!

Routing Fabric Comparison

Crossover Fabric
• Single via to extend
• All wires same: length-1

Jumper Fabric
• Two vias to extend
• Short segments: 1 block
• Long segments: 4 blocks, staggered
• Two variants
– Jumper20: 20% long segments
– Jumper40: 40% long segments

[Figure: both fabrics drawn on Layer 2i+1 and Layer 2i+2 (i = 0, 1, …); annotations: "These points coincide", "Long segment that spans two blocks", "Jumper is not required at this point"]
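A toy count of the vias needed to extend a straight wire across a number of blocks in each fabric, based only on the "single via to extend" and "two vias to extend" rules above. It is a sketch under strong assumptions: the jumper route always finds a long segment when it wants one, staggering and the 20%/40% mix are ignored, and connections to logic-block pins are not counted. The function names are illustrative.

```python
def crossover_vias(distance_blocks: int) -> int:
    """Vias to chain length-1 wires over distance_blocks, one via per extension."""
    if distance_blocks < 1:
        raise ValueError("distance must be at least one block")
    return distance_blocks - 1


def jumper_vias(distance_blocks: int, long_span: int = 4) -> int:
    """Vias to chain jumper-fabric segments, two vias per extension.

    Uses as many long segments (long_span blocks each) as fit, then fills
    the remainder with 1-block short segments.
    """
    if distance_blocks < 1:
        raise ValueError("distance must be at least one block")
    long_segments, short_segments = divmod(distance_blocks, long_span)
    extensions = long_segments + short_segments - 1
    return 2 * extensions


if __name__ == "__main__":
    for distance in (1, 3, 4, 8, 16):
        print(distance, crossover_vias(distance), jumper_vias(distance))
```

In this simplified view the jumper fabric pays more vias on short connections and fewer on long ones, which is one way to see why the share of long segments is treated as an architecture parameter.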

Logic Block Model

• Characteristics of logic block
– Physical dimensions (in wire pitches)
– Pin locations
• Do not need low-level layout details

Parameterize Logic Block

• Cover wide search space for logic blocks

• Vary layout density
– Dense: determined by # pins (small layout area)
– Sparse: determined by standard cell implementation
• Vary logic capacity
– Sweep number of inputs and outputs
  • 2-input, 1-output logic blocks (shown here)
  • 16-input, 8-output logic blocks (also in paper)
– Use logic clustering (T-VPack) as tech-mapper


Area, Cost Trends

• Experimental results

– MCNC benchmarks
  • Geometric mean over 19 large circuits
– Logic block density
  • Dense, medium, and sparse
– Logic block capacity
  • From 2-input, 1-output to 16-input, 8-output
  • Only 2-input, 1-output results shown here
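For reference, a geometric mean over per-circuit results can be computed as in the sketch below. The sample values are made up for illustration and are not taken from the MCNC experiments.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    # Hypothetical per-circuit area ratios relative to some baseline.
    ratios = [1.10, 0.95, 1.30, 1.05]
    print(f"geometric mean ratio: {geometric_mean(ratios):.3f}")
```

The geometric mean is the usual choice for averaging ratios across benchmarks, because one unusually large circuit does not dominate the result the way it would in an arithmetic mean.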

Area and Die-Cost Trends

[Figure: core area (axis 0 to 2) and cost ($, axis 0 to 15) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20, Crossover and MPSA; dense logic block]

• Area
– MPSA < Crossover < Jumper
– N_vl = 1: more area, needs whitespace to route
• Cost
– MPSAs: more layers → higher cost
– VPSAs: more layers → lower cost

Area and Die-Cost Trends

• Sparse layout is better! ???
– Less whitespace needed
• Need to study whitespace allocation

[Figure: core area (axis 0 to 2) and cost ($, axis 0 to 15) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20, Crossover and MPSA; sparse logic block]

Delay and Power Trends

Key results (in paper):

MPSA is significantly better than VPSA


Conclusions

• Trends for VPSAs
– Die-cost more important than die-area
– MPSAs better in area, delay, and power
– VPSAs better in cost
– Interconnect matters
  • Performance varies with different routing fabrics
  • Even significant variation among VPSA structures
• Ongoing research
– Interconnect architectures
– Whitespace insertion algorithm

Limitations

• CAD framework available online
– http://groups.google.com/group/sasic-pr
• This is early work … need improvements!
– Whitespace insertion
– Buffer insertion
– Delay/power of logic blocks
– Power/clock network area overhead
– SRAM-configurable logic blocks

Key Messages

1. Structured ASICs will be the key technology of the future.
Because the key issues that make structured ASICs attractive have not been solved. They are growing more prominent.

2. Interconnect matters.
MPSAs have better performance, VPSAs are cheaper.

CAD Framework Available

Power and Delay Trends

Metrics

• Area
– Determined from placement grid size
• Delay
– Average net delay (Elmore model)
  • Register locations unknown; critical path delay calculation is difficult
  • CAD flow is not timing driven
• Power
– Total metal + via capacitance
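As a reminder of what the Elmore model computes, here is a minimal sketch for a single unbranched path of wire segments and vias. It lumps each element's capacitance at its far end, which is a simplification, and the resistance and capacitance values are placeholders rather than numbers from the paper.

```python
def elmore_delay(elements, load_cap=0.0):
    """Elmore delay of an RC chain from driver to sink.

    elements: list of (resistance, capacitance) pairs in driver-to-sink order,
              one pair per wire segment or via; capacitance is lumped at the
              far end of each element (a simplifying assumption).
    load_cap: capacitance of the sink pin.
    """
    delay = 0.0
    upstream_resistance = 0.0
    for resistance, capacitance in elements:
        upstream_resistance += resistance
        # Each capacitor is charged through all resistance between it and the driver.
        delay += upstream_resistance * capacitance
    return delay + upstream_resistance * load_cap


if __name__ == "__main__":
    # Placeholder values: (ohms, farads) for wire, via, wire.
    path = [(50.0, 2e-15), (10.0, 0.2e-15), (50.0, 2e-15)]
    print(f"delay: {elmore_delay(path, load_cap=1e-15):.3e} s")
```

Because every added via or segment increases the upstream resistance seen by everything after it, a route with more vias and more wirelength accumulates delay quickly under this model.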

Talk Outline

• Cost model
• Experimental methodology
– Metrics
– CAD flow
– Architecture modeling
• Area, delay, power, cost trends
• Cost model sensitivity
• Conclusions

Power Trends

• Significant range for different routing fabrics
• More custom via layers → lower power
– Especially for dense layouts

[Figure: interconnect power (axis 1 to 2) versus N_vl = 1 to 3 for Jumper40, Jumper20 and Crossover; panels: dense logic block, sparse logic block]

Power Trends

• Re-normalized to MPSAs
• VPSAs use more power
– 2x (sparse) to 6x (dense) more than MPSAs

[Figure: ratio of VPSA interconnect power to MPSA interconnect power (axis 1 to 6) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20 and Crossover; panels: dense logic block, sparse logic block]

Delay Trends

• Significant range for different fabrics
– Delay improves with more custom via layers
• Jumper Fabric: long segments improve delay (but higher power)

[Figure: interconnect delay (axis 1 to 2) versus N_vl = 1 to 3 for Crossover, Jumper20 and Jumper40; panels: dense logic block, sparse logic block]

Delay Trends

• Re-normalized to MPSAs
• VPSA delay up to 20x worse
• Why is VPSA delay worse than MPSA delay?
– More vias
– More wirelength

[Figure: ratio of VPSA interconnect delay to MPSA interconnect delay (axis 5 to 20) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20 and Crossover; panels: dense logic block, sparse logic block. Additional dense-logic-block panels: ratio of VPSA to MPSA total number of vias (axis 5 to 35) and ratio of VPSA to MPSA total wirelength (axis 2 to 6)]

Cost Model Sensitivity


• How sensitive is the die-cost to various factors?
• Primary factors
– Die area
– Number of customizable layers
• Secondary factors
– Maskset cost
– Volume requirements
– Number of fixed lower masks

Cost Model Sensitivity

Sensitivity to Maskset Cost

• VPSAs less sensitive to maskset cost

[Figure: cost ($, axis 5 to 25) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20, Crossover and MPSA; panels: maskset cost = $2.5M, maskset cost = $5M]

Cost Model Sensitivity

Sensitivity to Number of Fixed Lower Masks (N_fm^l)

• VPSA cost increases more rapidly than MPSAs
– Large area of VPSAs

[Figure: cost ($, axis 5 to 25) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20, Crossover and MPSA; panels: N_fm^l (fixed lower masks) = 18, N_fm^l (fixed lower masks) = 36]

Cost Model Sensitivity

Sensitivity to Per-Customer Volume (V_c)

• VPSAs less sensitive to customer volume than MPSAs

[Figure: cost ($, axis 5 to 25) versus N_vl = 1 to 3 (MPSAs: N_rl - 1) for Jumper40, Jumper20, Crossover and MPSA; panels: V_c (per-customer volume) = 100k, V_c (per-customer volume) = 50k]