

System and Circuit Level Power Modeling of Energy-Efficient 3D-Stacked Wide I/O DRAMs

Karthik Chandrasekar, TU Delft

Christian Weis$, Benny Akesson*, Norbert Wehn$ & Kees Goossens#

$ * #

Karthik Chandrasekar / TU Delft 2

Overview

• Motivation for 3D-stacking of DRAMs
• Problem Statement - Power Modeling
• Circuit-level DRAM architecture & power model
• System-level DRAM power model (DRAMPower)
• Comparison: Results and Analysis
• Summary

19-Mar-13

Motivation: Why 3D-Stacked DRAMs?

The Performance Vs. Power Factor

[I/O power per bit: 0.7mW in TSV vs 2.3mW in PoP vs 4.6mW in Off-Chip – Samsung]

State of the art: Mobile LPDDR/2/3
Off-Chip Interconnects (on PCB)
PoP (Package-on-Package) (μ bumps)

[Figure: Peak Bandwidth (GBps) and Power (mW/GBps) for SC LPDDR2 x32 (400), DC LPDDR2 x32 (533), DC LPDDR3 x32 (800) and QC Wide IO x128 (200)]

We want 3D-Stacking!

Images & Data Courtesy: HMC, JEDEC 42.6, FineTech, Nvidia, Samsung
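As a rough cross-check of the bandwidth side of this comparison, peak bandwidth can be estimated as channels × bus width × transfers per clock × clock frequency. The sketch below rests on assumptions that are not stated on the slide: the numbers in parentheses are read as clock frequencies in MHz, SC/DC/QC as single, dual and quad channel, the LPDDR2/LPDDR3 interfaces as double data rate, and Wide IO as single data rate. The function name is illustrative only.

```python
def peak_bandwidth_gbps(channels, width_bits, clock_mhz, transfers_per_clock):
    """Peak bandwidth in GB/s (decimal gigabytes)."""
    bits_per_second = channels * width_bits * clock_mhz * 1e6 * transfers_per_clock
    return bits_per_second / 8 / 1e9

# (channels, bus width in bits, clock in MHz, transfers per clock)
# The transfers-per-clock column is an assumption: 2 for LPDDRx, 1 for Wide IO.
configs = {
    "SC LPDDR2 x32 (400)": (1, 32, 400, 2),
    "DC LPDDR2 x32 (533)": (2, 32, 533, 2),
    "DC LPDDR3 x32 (800)": (2, 32, 800, 2),
    "QC Wide IO x128 (200)": (4, 128, 200, 1),
}

results = {name: peak_bandwidth_gbps(*cfg) for name, cfg in configs.items()}

for name, gbps in results.items():
    print(f"{name}: {gbps:.1f} GB/s")
```

Under these assumptions the quad-channel Wide IO configuration matches the dual-channel LPDDR3 peak bandwidth at a quarter of the clock frequency, which is the point the slide makes about wide, slow interfaces.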

3D-Stacked Wide IO

TSV (Through Silicon Via) - Many Dies

Capacitance: ~2pF (-85% Power)

4 Channels (x128) (High Bandwidth)

PoP (Package-on-Package) (μ bumps)

Capacitance: 8 to 20pF (-50% Power)

1 or 2 Channels (x32) (Low Bandwidth)
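The "-85%" and "-50%" power labels appear consistent with the per-bit I/O power figures quoted from Samsung on this slide, if the off-chip interconnect is taken as the baseline. That baseline choice is an assumption; the short sketch below only redoes the arithmetic.

```python
# Per-bit I/O power figures quoted on the slide (mW per bit).
IO_POWER_MW_PER_BIT = {"Off-Chip": 4.6, "PoP": 2.3, "TSV": 0.7}

def reduction_vs_baseline(power_mw, baseline_mw):
    """Percentage reduction of power_mw relative to baseline_mw."""
    return 100.0 * (1.0 - power_mw / baseline_mw)

baseline = IO_POWER_MW_PER_BIT["Off-Chip"]  # assumed reference point
pop_reduction = reduction_vs_baseline(IO_POWER_MW_PER_BIT["PoP"], baseline)
tsv_reduction = reduction_vs_baseline(IO_POWER_MW_PER_BIT["TSV"], baseline)

print(f"PoP: -{pop_reduction:.0f}% I/O power per bit")
print(f"TSV: -{tsv_reduction:.0f}% I/O power per bit")
```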

What’s missing? [Problem Statement]

An accurate 3D-DRAM Power Model to design DRAM-stacked SoCs

Approaches to power modeling

• Circuit-level Power Model
  – Models the DRAM architecture at the circuit level in SPICE
  – Pros: Accurate and detailed
  – Cons: Slow, requires circuit-level understanding of the DRAM architecture, and technology specifications for DRAMs are not publicly available

• System-level Power Model (like Micron’s)
  – Based on vendor-provided datasheet measures and JEDEC specifications
  – Pros: Fast, easy to integrate & employs simple models for memory operations
  – Cons: Accuracy is unclear. Not directly applicable to 3D-DRAMs and not verified against circuit-level models or hardware measurements.

Need: Fast, Simple & Accurate Model

What’s the solution?

Develop a System-Level 3D-DRAM Power Model, i.e. one that is as accurate as a Circuit-Level 3D-DRAM Power Model


Circuit-Level DRAM Modeling

Baseline DRAM Model
• (Weis) DATE ’11 and DAC ’13
• NGSPICE - PTM/BSIM
• 1T1C Cell to Banks

2D to 3D (New)
• Based on DATE ’11 & JEDEC Wide IO – x512
• 4 Banks/Channel
• 4 Channels
• TSV Routing
  – Data, Cmd & Addr
  – Control, Clock & Power
• No ODT (On Die Termination)
  – Low Freq. & IO Capacitance
• No DLL (Delay Locked Loop)
• TSV model from IMEC/GaTech

System-Level Power Model (DRAMPower)

Comparison to Micron model

• Problem with Micron’s model:
  – Not directly applicable to 3D-DRAMs (multiple voltage domains and IO)
  – Accuracy is unclear (state transitions not addressed & approximate workload used)
  – Not verified against circuit-level models or hardware power measurements

• Adapting to 3D-DRAMs:
  – Considers multiple voltage domains: (a) Core (b) Derived (Wordline)
  – Includes IO power consumption (incl. I/O pads, buffers, bumps, drivers & pins)
  – RD operation Energy (Generic equation):

• Modeling for Accuracy:
  – Models memory state transitions – from active to power-down
  – Models self-refresh accurately (functional correctness & timing difference)
  – Most importantly: Is almost as accurate as the circuit-level model
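The generic RD energy equation itself did not survive in this transcript. As a hedged illustration of what a datasheet-based read energy estimate over several voltage domains could look like, the sketch below follows the common datasheet convention of subtracting the active-standby current (IDD3N) from the read burst current (IDD4R) in each domain and adding a separate I/O term. It is not the DRAMPower equation; the class, function and parameter names and all numeric values are hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class VoltageDomain:
    name: str
    vdd: float       # supply voltage in volts
    idd4r_ma: float  # read burst current in mA
    idd3n_ma: float  # active standby (background) current in mA

def read_energy_nj(domains, burst_cycles, clock_mhz, io_energy_nj=0.0):
    """Read energy above background, summed over voltage domains, in nJ."""
    t_burst_ns = burst_cycles * 1e3 / clock_mhz  # one cycle lasts 1000/MHz ns
    # mA * 1e-3 = A; A * V = W; W * ns = nJ
    core_nj = sum(
        (d.idd4r_ma - d.idd3n_ma) * 1e-3 * d.vdd * t_burst_ns for d in domains
    )
    return core_nj + io_energy_nj

# Placeholder values only, not taken from any datasheet or from the paper.
domains = [
    VoltageDomain("core", vdd=1.2, idd4r_ma=100.0, idd3n_ma=20.0),
    VoltageDomain("wordline", vdd=1.8, idd4r_ma=10.0, idd3n_ma=2.0),
]
energy = read_energy_nj(domains, burst_cycles=4, clock_mhz=200, io_energy_nj=0.5)
print(f"Read energy: {energy:.3f} nJ")
```

Keeping the I/O contribution as a separate term mirrors the slide's point that I/O power is accounted for in addition to the per-domain core currents.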

Self-Refresh Operation - Accuracy

Micron:
  Timings          Commands        Active Current   Bckgnd Current
  SREF             SREF, NOP x7    -                IDD6
  XSDLL            SREX, NOP x5    -                IDD2N

Actual:
  Timings          Commands        Active Current   Bckgnd Current
  RFC-RP           SREF, NOP x2    IDD5-IDD3N       IDD3P0
  RP               NOP x2          IDD5-IDD2N       IDD2P0
  SREF             NOP x3          -                IDD6
  XS               SREX, NOP x3    -                IDD2N

Actual:
• Internal Refresh
• No DLL

We furnish new equations in the system-level power model to address such accuracy issues
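To make the difference between the two self-refresh accountings concrete, the sketch below sums energy phase by phase, charging each phase its active plus background current. It is one reading of the timeline on this slide, not the equations from the paper: a single supply voltage is assumed for simplicity, and every IDD value and cycle count is a hypothetical placeholder.

```python
def phase_energy_nj(phases, vdd, clock_mhz):
    """Sum energy over (cycles, active_mA, background_mA) phases, in nJ."""
    cycle_ns = 1e3 / clock_mhz  # one cycle lasts 1000/MHz ns
    # mA * 1e-3 = A; A * V = W; W * ns = nJ
    return sum(
        (active + background) * 1e-3 * vdd * cycles * cycle_ns
        for cycles, active, background in phases
    )

# Placeholder currents in mA (not datasheet values).
IDD = {"2N": 10.0, "2P0": 2.0, "3N": 15.0, "3P0": 4.0, "5": 100.0, "6": 1.0}

# Micron-style accounting: IDD6 for the whole self-refresh, then IDD2N on exit.
micron_phases = [
    (8, 0.0, IDD["6"]),   # SREF
    (6, 0.0, IDD["2N"]),  # XSDLL
]

# Accounting as drawn in the "Actual" timeline: an internal refresh at entry.
actual_phases = [
    (3, IDD["5"] - IDD["3N"], IDD["3P0"]),  # RFC-RP
    (2, IDD["5"] - IDD["2N"], IDD["2P0"]),  # RP
    (3, 0.0, IDD["6"]),                     # SREF
    (4, 0.0, IDD["2N"]),                    # XS (no DLL)
]

micron_nj = phase_energy_nj(micron_phases, vdd=1.2, clock_mhz=200)
actual_nj = phase_energy_nj(actual_phases, vdd=1.2, clock_mhz=200)
print(f"Micron-style: {micron_nj:.3f} nJ, actual timeline: {actual_nj:.3f} nJ")
```

With short self-refresh periods the internal refresh at entry dominates, which is one way a model that charges only IDD6 can deviate noticeably from the actual behavior.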

Comparison: Results & Analysis

• Experiment I:
  – Different Operations
  – Different Granularity

• Results:
  – Less than 2% difference
  – Adapted Micron SR (200): 72% diff.

• Experiment II:
  – H.263 Encoder & EPIC Encoder
  – JPEG Encoder & MPEG2 Decoder
  – Different Loads and Power Modes

• Results:
  – Less than 2% difference
  – Adapted Micron: 12% diff. (SR 500MHz)

• The 2% difference is due to the use of JEDEC-specified averaged IDD currents.

Shows the accuracy of the system-level power model

Summary

Key Highlights:
• Presented an accurate datasheet-based system-level power model for Wide I/O 3D-stacked DRAMs.
• Verified the system-level model for accuracy against a detailed SPICE-based circuit-level 3D-DRAM architecture and power model.
• Observed < 2% difference in power and energy estimates for different memory operations and for any variations in memory load.

Other Important Contributions:
• Provided estimates for IDD current measures for different JEDEC 3D-DRAM configurations, in place of the as yet unavailable datasheets (in the paper).
• The system-level power model (DRAMPower) has been released online as an open-source 3D-DRAM power estimation tool. Download link: www.drampower.info