
Electromigration TSV-TSV Coupling

3D IC Architectural-Physical Co-Design

Caleb Serafy, Tiantao Lu and Zhiyuan Yang

Advisor: Ankur Srivastava

3D-IC Trapped Heat Effect Thermo-Mechanical Stress

• As transistors shrink in 2D-ICs:

- Interconnect delay/power becomes the dominating factor

- Cost scaling flattens

• 3D-IC expands planar circuits into 3D space

- Shorter wirelengths are faster and consume less power

- Transistor stacking increases density without scaling

• Increased power density creates thermal challenges

• TSV Challenges: Coupling, Stress, Electromigration

[Figure: Cost per million transistors (cents) vs. technology (nm)]

[Sutherland, The Economist 11/18/13]

[Figure: Normalized power vs. technology (nm); series: gate power, interconnect power]

[Magen, SLIP’04]

[Figure: Delay (ps) vs. technology (nm); series: interconnect (Al + SiO2), interconnect (Cu + low-k), gate delay]

[ITRS 1999]

[Figure: EDP^-1 (us^-1 nJ^-1) vs. frequency (GHz); series: cooled, uncooled]

[Figure: Total power (W) vs. frequency (GHz); series: cooled, uncooled]

[Figure: Peak temperature (deg C) vs. frequency (GHz); series: cooled, uncooled]

[Figure: Normalized results vs. cooling solution (Air, Uniform MF, Optimized MF); series: thermally aware FP, thermally unaware FP; panels: Performance, Efficiency]

Objective

• Show potential for thermo-electrical co-design at physical and architectural level

Contributions

• Unified multi-physics simulation framework with integrated physical optimization routines

Results and Conclusions

• MF cooling unlocks full potential of 3D CPU

• FP and heatsink optimizations are necessary to maximize feasible performance and efficiency in 3D CPU

• Significant interactions between heatsink, architecture and floorplan

• Unified co-design approach is necessary in early design space analysis

Architectural-Physical Co-design of 3D CPUs

Thermo-Electrical Floorplan Optimization

• Rearranging the power density in horizontal and vertical direction can reduce peak temperatures

• Tradeoff between temperature and timing

- Spreading blocks reduces hotspots but increases wirelength

• Changing chip aspect ratio can affect channel length and count
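The temperature/timing tradeoff above can be illustrated with a toy cost function. This is only a sketch: the block names, coordinates, power values, the distance-based hotspot proxy, and the weighting parameter alpha are all assumptions for illustration, and the proxy is not a thermal simulation.

```python
from itertools import combinations

def manhattan(p, q):
    """Manhattan distance between two grid positions."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def wirelength(placement, nets):
    """Total Manhattan wirelength over two-pin nets."""
    return sum(manhattan(placement[a], placement[b]) for a, b in nets)

def hotspot_proxy(placement, power):
    """Crude hotspot proxy (assumption): pairwise power coupling that decays
    with distance. Blocks are assumed to sit at distinct positions."""
    return sum(power[a] * power[b] / manhattan(placement[a], placement[b])
               for a, b in combinations(placement, 2))

def cost(placement, nets, power, alpha):
    """Weighted sum: alpha=1 cares only about hotspots, alpha=0 only about wirelength."""
    return alpha * hotspot_proxy(placement, power) + (1 - alpha) * wirelength(placement, nets)

# Hypothetical blocks: two hot cores that both talk to a cooler cache.
power = {"core0": 10.0, "core1": 10.0, "cache": 2.0}
nets = [("core0", "cache"), ("core1", "cache")]
compact = {"core0": (0, 0), "core1": (1, 0), "cache": (0, 1)}
spread = {"core0": (0, 0), "core1": (4, 0), "cache": (2, 0)}

for name, placement in (("compact", compact), ("spread", spread)):
    print(name, wirelength(placement, nets), hotspot_proxy(placement, power))
```

In this toy setup the spread placement lowers the hotspot proxy but lengthens the wires, so the preferred floorplan depends on how alpha weights the two terms.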

Multi-physics Simulation Framework

• Simulate performance, power and temperature as a function of 3D CPU architecture and software workload

• Built-in floorplan and micro-fluidic heatsink optimization loops

Micro-fluidic Heatsink Optimization

• Micro-fluidic heatsink drastically reduces temperature as compared to traditional air cooling

• There is a tradeoff between number of channels and fluid velocity (assuming a constant pressure drop)

• The number and placement of channels can be optimized to yield further thermal improvements
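The channel-count versus fluid-velocity tradeoff can be sketched with a rough laminar-flow estimate. The assumptions here are not from the poster: a fixed heatsink width split into equal channel and wall widths, water-like viscosity, and the circular-duct Poiseuille relation applied through the hydraulic diameter, which is only approximate for rectangular channels. All parameter values are hypothetical.

```python
def channel_flow(n_channels,
                 heatsink_width_m=10e-3,    # hypothetical total width
                 channel_height_m=200e-6,   # hypothetical channel height
                 channel_length_m=10e-3,    # hypothetical channel length
                 pressure_drop_pa=10e3,     # held constant across designs
                 viscosity_pa_s=1e-3):      # roughly water at room temperature
    """Return (mean velocity in m/s, total volumetric flow in m^3/s).

    Assumes channels and walls have equal width, so adding channels
    makes each channel narrower.
    """
    channel_width_m = heatsink_width_m / (2 * n_channels)
    hydraulic_diameter_m = (2 * channel_width_m * channel_height_m
                            / (channel_width_m + channel_height_m))
    # Poiseuille-style estimate: u = dP * D_h^2 / (32 * mu * L)
    velocity = (pressure_drop_pa * hydraulic_diameter_m ** 2
                / (32 * viscosity_pa_s * channel_length_m))
    total_flow = n_channels * velocity * channel_width_m * channel_height_m
    return velocity, total_flow

for n in (5, 10, 20, 40):
    velocity, flow = channel_flow(n)
    print(f"{n:2d} channels: velocity {velocity:.2f} m/s, total flow {flow * 1e6:.3f} mL/s")
```

Under these assumptions, more channels add wetted surface area but each narrower channel carries slower fluid at the same pressure drop, which is why the channel count is worth optimizing.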

3D CPU Design Space

• Number of CPU cores: stack logic layers

• Number of memory controllers: increase on-chip DRAM bandwidth

• Core Frequency: MF cooling improves feasibility and efficiency of frequency scaling

Optimal Performance/Efficiency

• The optimal performance or efficiency that is thermally and timing feasible across 3D CPU design space

• Both optimizations show significant improvements

[Figure panels: 2D PDN voltage droop; 3D PDN voltage droop]

Co-design Graph Model

• Multiple design variables interact through a cause and effect graph

• Modeling of all interactions (edges) can be overly complex without adding significant accuracy

• Assign a weight to each edge to represent the strength of the interaction

• Trade off model complexity and accuracy by changing parameter W

- Only model edges with weight > W
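The edge-pruning rule above can be sketched as a simple threshold filter. The variable names and weights below are hypothetical and only illustrate how raising W shrinks the modeled graph.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (cause, effect)

def prune_edges(weights: Dict[Edge, float], w_threshold: float) -> Dict[Edge, float]:
    """Keep only the interactions whose weight is strictly greater than W."""
    return {edge: w for edge, w in weights.items() if w > w_threshold}

# Hypothetical interaction strengths between design variables and metrics.
interactions = {
    ("frequency", "power"): 0.9,
    ("power", "temperature"): 0.8,
    ("floorplan", "wirelength"): 0.6,
    ("temperature", "leakage_power"): 0.3,
    ("heatsink", "floorplan"): 0.1,
}

for w in (0.0, 0.25, 0.7):
    kept = prune_edges(interactions, w)
    print(f"W={w}: {len(kept)} of {len(interactions)} edges modeled")
```

A larger W gives a simpler model that ignores weak interactions; a smaller W keeps more edges at the cost of complexity.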

3D CPU Design Space Exploration

• Past work has exhaustively simulated the architectural design space

- This computationally limits the scope of the design space

• A design space exploration (DSE) algorithm is required to find the optimal design without exhaustive simulation

• Requires a model of metrics vs. design

• Increase accuracy of the model as exploration occurs
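One way to picture such a DSE loop is a surrogate model that is refined with each new simulation. The sketch below is an assumption-laden illustration, not the planned algorithm: the simulator is a made-up stand-in, the surrogate is plain inverse-distance weighting, and the loop greedily simulates the design the surrogate currently rates best.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Design = Tuple[int, float]  # (number of cores, core frequency in GHz)

def toy_simulator(design: Design) -> float:
    """Hypothetical stand-in for an expensive simulation; higher is better."""
    cores, freq = design
    performance = cores * freq
    thermal_penalty = 0.02 * (cores * freq ** 2) ** 1.5
    return performance - thermal_penalty

def predict(design: Design, samples: Dict[Design, float]) -> float:
    """Inverse-distance-weighted estimate from the designs simulated so far."""
    num = den = 0.0
    for other, score in samples.items():
        dist2 = (design[0] - other[0]) ** 2 + (design[1] - other[1]) ** 2
        weight = 1.0 / (dist2 + 1e-9)
        num += weight * score
        den += weight
    return num / den

def explore(space: List[Design], simulate: Callable[[Design], float], budget: int):
    """Simulate at most max(budget, 2) designs; return (best design, all samples)."""
    samples: Dict[Design, float] = {}
    for seed in (space[0], space[-1]):  # two seed points to start the model
        samples[seed] = simulate(seed)
    while len(samples) < min(budget, len(space)):
        remaining = [d for d in space if d not in samples]
        candidate = max(remaining, key=lambda d: predict(d, samples))
        samples[candidate] = simulate(candidate)  # the model improves with each sample
    best = max(samples, key=samples.get)
    return best, samples

space = list(product([1, 2, 4, 8], [2.0, 3.0, 4.0, 5.0]))
best, samples = explore(space, toy_simulator, budget=6)
print(f"simulated {len(samples)} of {len(space)} designs, best found: {best}")
```

A purely greedy choice like this can miss the optimum; a practical loop would likely also reward exploring regions where the model is uncertain.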

Future Work

3D IC Power Delivery Network (PDN)

• Past work has shown 3D CPUs with MF cooling can increase performance at the expense of increased transistor power

• This can lead to increased frequency and amplitude of VDD noise and IR drop

• Vertical power routing changes PDN design requirements

• PDN constraints could limit potential of 3D CPUs

• PDN constraints and optimization must be integrated into DSE framework

Background and Motivation

Physical Level Design of 3D-ICs

Designing for Interconnect Performance

• We use a high-order accurate delay model

• Minimize the delay based on dynamic programming

Distributed RC π-model
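The poster does not specify its high-order delay model, so the sketch below swaps in the simpler first-order Elmore delay on a chain of π-sections to illustrate dynamic-programming wire sizing. The RC constants, the width choices, and the function names are hypothetical. The dynamic program works from the sink toward the driver and keeps only non-dominated (downstream capacitance, delay) options at each step.

```python
from itertools import product

R_UNIT = 0.1    # resistance of a unit-width segment (hypothetical units)
C_AREA = 0.2    # capacitance per unit width of a segment (hypothetical)
C_FRINGE = 0.1  # width-independent capacitance of a segment (hypothetical)

def segment_rc(width):
    """Resistance falls with width; capacitance grows with width."""
    return R_UNIT / width, C_AREA * width + C_FRINGE

def elmore_delay(widths, r_driver, c_load):
    """Elmore delay of a pi-section chain; widths are listed driver to sink."""
    downstream, delay = c_load, 0.0
    for w in reversed(widths):
        r, c = segment_rc(w)
        delay += r * (c / 2.0 + downstream)  # half of the segment cap sits past r
        downstream += c
    return delay + r_driver * downstream

def size_wire(n_segments, choices, r_driver, c_load):
    """Return (minimum Elmore delay, widths from driver to sink)."""
    options = [(c_load, 0.0, ())]  # (downstream cap, delay so far, widths)
    for _ in range(n_segments):
        candidates = []
        for cap, delay, widths in options:
            for w in choices:
                r, c = segment_rc(w)
                candidates.append((cap + c, delay + r * (c / 2.0 + cap), (w,) + widths))
        # Prune: drop any option with both more capacitance and no less delay.
        candidates.sort(key=lambda o: (o[0], o[1]))
        options, best_delay = [], float("inf")
        for opt in candidates:
            if opt[1] < best_delay:
                options.append(opt)
                best_delay = opt[1]
    cap, delay, widths = min(options, key=lambda o: o[1] + r_driver * o[0])
    return delay + r_driver * cap, list(widths)

def brute_force(n_segments, choices, r_driver, c_load):
    """Exhaustive check, only practical for tiny instances."""
    return min(elmore_delay(list(ws), r_driver, c_load)
               for ws in product(choices, repeat=n_segments))

best_delay, best_widths = size_wire(4, [1.0, 2.0, 4.0], r_driver=5.0, c_load=1.0)
print(best_delay, best_widths)
```

Pruning is safe here because any later segment adds delay that only grows with downstream capacitance, so a dominated option can never end up better.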

Designing for Reliability

• Atom migration refers to the movement of metal atoms in interconnect

• Results in open or short circuits

• Determined by electrical current, stress, and thermal conditions

• Multi-physics time-dependent simulation

Study 1: Electrical Current

Study 2: Thermal Mechanical Stress

Study 3: Atom Migration

• When cooled, Cu contracts more than Si, which creates residual stress

• Stress leads to mechanical failures such as cracking and delamination

• Finite-element-method (FEM) thermal-mechanical simulation

Adaptive resistive mesh algorithm

DC current modeling

Non-uniform mesh

Electrical-Thermal-Reliability Co-Design

• We propose a unified framework to optimize both the performance and reliability of 3D-ICs

• Novel reliability models (including electromigration and material fracture) to avoid slow FEM simulations

• Results show we can design more reliable circuits without sacrificing performance

Designing for Low-Power

[Figure panels: too many TSVs (illegal design); fewer TSVs (legal design)]

Clock source

Fixed clock sinks

• Clock tree power accounts for up to 70% of the circuit total dynamic power

• Shutdown gates periodically turn off idle sequential logic, which saves power

• These gates need special control logic (wiring overhead)

• Limited wiring (TSV) resources in 3D-IC: different from planar circuit design

[Figure labels: Layer 0, Layer 1; M1, M2, M3, M4; Clock TSV]

3D IC Reliability Study

• Past work has focused on TSV-induced failure in 3D-IC, and has shown the co-design scheme can prolong the interconnect lifetime without performance overhead.

• Gate failure mechanisms (HCI, TDDB, NBTI) are also crucial for the system

• This calls for co-design at different levels of abstraction (device, circuit, architectural level)

We have developed algorithms to:

• Generate the entire 3D clock tree

• Decide the location of the shutdown gates

• Place clock TSVs and control TSVs within given whitespace

Wire sizing problem to minimize interconnect delay

Motivations for Co-Design

• Conflict between TSVs and micro-channel/micropin-fin structure

• Interaction between gates and micro-channels

• Temperature of coolant fluid increases along flow direction: potentially large spatial temperature variation

• Extra power introduced by micro-fluidic cooling
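The rise in coolant temperature along the flow direction follows from a steady-state energy balance: each channel segment heats the fluid by the absorbed power divided by the product of mass flow rate and specific heat. The sketch below assumes a single channel, water as the coolant, and made-up heat and flow values.

```python
from typing import List, Sequence

def coolant_temperatures(inlet_temp_c: float,
                         heat_per_segment_w: Sequence[float],
                         mass_flow_kg_s: float,
                         specific_heat_j_kg_k: float = 4180.0) -> List[float]:
    """Fluid temperature at the inlet and after each segment along the channel.

    Energy balance per segment: delta_T = q / (m_dot * c_p).
    The default c_p is roughly that of liquid water.
    """
    temps = [inlet_temp_c]
    for q in heat_per_segment_w:
        temps.append(temps[-1] + q / (mass_flow_kg_s * specific_heat_j_kg_k))
    return temps

# Hypothetical channel: four segments absorbing 5 W each at 0.1 g/s of water.
profile = coolant_temperatures(25.0, [5.0, 5.0, 5.0, 5.0], mass_flow_kg_s=1e-4)
for i, t in enumerate(profile):
    print(f"after segment {i}: {t:.1f} deg C")
```

Because gates near the outlet see warmer coolant than gates near the inlet, their placement relative to the channels affects the spatial temperature variation.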

Co-Design Framework

• 2-stage algorithm: (1) layer assignment of gates; (2) hierarchical partitioning-based placement

• Objective: provide sufficient cooling for a 3D IC with a more compact layout

Peak Temperature of Different Benchmarks

Temperature Limit

Same Chip Size

Keep 3D IC Thermally Feasible by Enlarging Chip Size

Temperature Profile of 3D IC with Micropin-Fin Cooling

Results and Conclusion

• The air cooling scheme (AC) and the scheme that allocates micro-channels after gate and TSV placement (PMA) fail to sufficiently cool a 3D IC with a compact layout

• Co-design can keep the chip temperature below the limit while maintaining a compact layout

• Enlarging chip size (EAC and EPMA) can provide sufficient cooling, but with larger overhead in chip area and wirelength than co-design

3D FPGA with Micro-Fluidic Cooling

• Past work has shown the peak temperature of 3D FPGA increases with the number of layers

• This necessitates the use of more aggressive cooling such as micro-fluidic cooling

• Heat sink structure impacts the arrangement of 3D switch boxes, thereby influencing the placement and routing performance

• Unlike in ASICs, FPGA routing determines the connection boxes used, which has a great impact on power distribution and thereby affects the temperature

• Micro-fluidic cooling should be integrated in the placement and routing algorithm of 3D FPGA

Co-Design with Micro-Fluidic Cooling

[Jung, et al. DAC’11]

Funded by NSF, ISR and DARPA
