MemScale: Active Low-Power Modes for Main Memory

Page 1: MemScale: Active Low-Power Modes for Main Memory

MemScale: Active Low-Power Modes for Main Memory

Qingyuan Deng, David Meisner*, Luiz Ramos, Thomas F. Wenisch*, and Ricardo Bianchini

Rutgers University *University of Michigan

Page 2: MemScale: Active Low-Power Modes for Main Memory

Server memory power challenges

[Figure: Power consumption of a Google server [Barroso & Hoelzle’07]. Axes: Compute Load (%) vs. Power (% of peak).]

• DRAM power varies little with load
• Memory power represents 30-40% of total power for typical loads
• Fraction is larger since memory controller power is not included

Page 3: MemScale: Active Low-Power Modes for Main Memory

Improving memory energy efficiency

• Observation: Memory bandwidth is rarely fully utilized [Meisner’11]; we can save energy during periods of light and moderate load

• Previous approaches
  • Leveraging DRAM idle low-power state [Lebeck’00][Delaluz’01][Li’04][Diniz’07]…
  • Rank sub-setting and DRAM reorganization [Ahn’09][Udipi’10][Zheng’10]…
  • Memory controller power is typically not considered

• Need active low-power modes to save energy when memory is underutilized
  • Frequency has greater impact on bandwidth than latency

Page 4: MemScale: Active Low-Power Modes for Main Memory

MemScale: Active low-power modes for memory

• Goal: Dynamically scale memory frequency to conserve energy

• Hardware mechanism:
  • Frequency scaling (DFS) of the channels, DIMMs, DRAM devices
  • Voltage & frequency scaling (DVFS) of the memory controller

• Key challenge:
  • Conserving significant energy while meeting performance constraints

• Approach:
  • Online profiling to estimate performance and bandwidth demand
  • Epoch-based modeling and control to meet performance constraints

• Main result:
  • System energy savings of 18% with average performance loss of 4%

Page 5: MemScale: Active Low-Power Modes for Main Memory

Outline

• Motivation and overview

• Background on memory systems

• MemScale: DVFS for the memory system

• Results

• Conclusions

Page 6: MemScale: Active Low-Power Modes for Main Memory

Impact of frequency scaling on memory latency

[Figure: Timing diagrams of a memory access at 800 MHz and at 400 MHz along a time axis, showing the MC, ACT, CL, Burst, and PRE phases between request (Req) and reply.]

• For DDR3 DRAM, scaling frequency from 800 MHz to 400 MHz: bandwidth down by 50%, latency up by only 10%
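The asymmetry between bandwidth and latency can be illustrated with a back-of-the-envelope sketch. It assumes a 64-bit DDR channel, a burst length of 8, and illustrative timing values (15 ns each for ACT-to-read and CL, plus 20 ns of frequency-independent delay). None of these values comes from the slides, and the sketch ignores the slowdown of the MC itself, so the latency increase it produces only roughly approximates the 10% quoted above.

```python
def peak_bandwidth_gbs(bus_mhz, bus_bytes=8):
    """Peak channel bandwidth in GB/s; DDR transfers data twice per bus clock."""
    return 2 * bus_mhz * 1e6 * bus_bytes / 1e9


def access_latency_ns(bus_mhz, t_rcd_ns=15.0, t_cl_ns=15.0,
                      burst_length=8, other_ns=20.0):
    """Approximate latency of one access (all timing defaults are assumptions).

    Only the data burst is tied to the bus clock here; the DRAM core
    timings (ACT-to-read, CL) are fixed in nanoseconds.
    """
    burst_clocks = burst_length / 2          # two beats per bus clock
    burst_ns = burst_clocks / bus_mhz * 1e3  # clocks / MHz = us, so scale to ns
    return other_ns + t_rcd_ns + t_cl_ns + burst_ns


for mhz in (800, 400):
    print(mhz, "MHz:", peak_bandwidth_gbs(mhz), "GB/s,",
          access_latency_ns(mhz), "ns")
```

Halving the clock halves the peak bandwidth exactly, while only the burst portion of the latency doubles.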

Page 7: MemScale: Active Low-Power Modes for Main Memory

Opportunity for MemScale

[Figure: Normalized power (%) broken down into Background, Dynamic, and MC components for memory-intensive, intermediate, and compute-intensive workloads.]

Background: clock tree, I/O driver, register, PLL, DLL, refresh, others
Dynamic: read, write, termination
MC: memory controller

• Effects of lower frequency on power:
  • Lowers background power linearly (~f)
  • Lowers MC power by cubic factor (~f^3)
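The scaling rules on this slide can be written as a small sketch. The cubic factor for the MC follows from the usual DVFS approximation that power goes as V^2 * f with voltage scaled roughly in proportion to frequency. The function name and the wattages are hypothetical, and holding dynamic power constant is a simplifying assumption that the slide does not state.

```python
def scaled_power(p_background_w, p_dynamic_w, p_mc_w, f_ratio):
    """Estimate component power after scaling frequency by f_ratio.

    f_ratio is new frequency / nominal frequency (e.g., 0.5 for 800 -> 400 MHz).
    """
    return {
        "background": p_background_w * f_ratio,  # linear in f (from the slide)
        "dynamic": p_dynamic_w,                  # assumption: unchanged
        "mc": p_mc_w * f_ratio ** 3,             # cubic in f (from the slide)
    }


# Hypothetical baseline of 10 W background, 5 W dynamic, 8 W MC.
half = scaled_power(10.0, 5.0, 8.0, 0.5)
print(half, "total:", sum(half.values()))
```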

Page 9: MemScale: Active Low-Power Modes for Main Memory

MemScale design

• Goal: Minimize energy under user-specified slowdown bound

• Approach: OS-managed, epoch-based memory frequency tuning

• Each epoch (e.g., an OS quantum):
  1. Profile performance & bandwidth demand
     • New performance counters track mem latency, queue occupancies
  2. Estimate performance & energy at each frequency
     • Models estimate queuing delays & system energy
  3. Re-lock to best frequency; continue tracking performance
     • Slack: delta between estimated & observed performance
  4. Carry slack forward to performance target for next epoch
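Steps 2 and 3 of the epoch loop amount to a constrained selection, which might be sketched as follows. The `Estimate` record, the `pick_frequency` helper, and all numbers are hypothetical; the slides do not specify how the OS policy represents its estimates, so this is only one plausible reading of "minimize energy under a slowdown bound".

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    freq_mhz: int
    slowdown: float  # predicted fractional CPI increase vs. the highest frequency
    energy_j: float  # predicted full-system energy for the next epoch


def pick_frequency(estimates, bound, slack):
    """Pick the lowest-energy frequency whose predicted slowdown fits the budget.

    bound is the user-specified slowdown limit; slack is the carried-over
    surplus (positive) or deficit (negative), both as fractions.
    """
    budget = bound + slack
    feasible = [e for e in estimates if e.slowdown <= budget]
    if not feasible:
        # Nothing meets the target: fall back to the fastest setting.
        return max(estimates, key=lambda e: e.freq_mhz)
    return min(feasible, key=lambda e: e.energy_j)


# Hypothetical model outputs for one epoch.
candidates = [
    Estimate(800, 0.00, 100.0),
    Estimate(600, 0.03, 90.0),
    Estimate(400, 0.08, 84.0),
    Estimate(200, 0.25, 88.0),
]
print(pick_frequency(candidates, bound=0.10, slack=0.0).freq_mhz)
print(pick_frequency(candidates, bound=0.10, slack=-0.05).freq_mhz)
```

In this example the lowest frequency is not the lowest-energy choice, because a long slowdown raises the energy consumed by the rest of the system.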

Page 10: MemScale: Active Low-Power Modes for Main Memory

Frequency and slack management

[Figure: Timeline over Epochs 1-4 showing the frequency (high or low) of the MC, bus + DRAM, and CPU performance (actual vs. target) with positive and negative slack. Annotations: Profiling; Estimate performance/energy via models; Calculate slack vs. target.]
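The slack bookkeeping in the timeline could be sketched as below, assuming that the target for an epoch is the estimated execution time at the highest frequency stretched by the slowdown bound. The function name, the time units, and the epoch durations are illustrative assumptions, not values from the slides.

```python
def update_slack(slack_s, est_max_freq_time_s, actual_time_s, bound):
    """Carry slack forward: positive means ahead of target, negative means behind."""
    target_s = est_max_freq_time_s * (1.0 + bound)
    return slack_s + (target_s - actual_time_s)


# Three hypothetical epochs with a 10% bound and a 1.0 s estimate at full speed.
slack = 0.0
history = []
for actual in (1.05, 1.18, 1.02):
    slack = update_slack(slack, 1.0, actual, bound=0.10)
    history.append(round(slack, 3))
print(history)
```

An epoch that overshoots its target leaves a deficit, which tightens the budget for the following epoch until the deficit is repaid.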

Page 11: MemScale: Active Low-Power Modes for Main Memory

Modeling of performance and energy

• New performance counters enable estimate of
  • Level of contention (bank and bus)
  • Energy consumption

• CPI of each application

• Avg memory latency

• Performance slack

• Estimate full system energy
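As a rough illustration of how average memory latency feeds a per-application CPI estimate, the sketch below uses a simple additive stall model. This is a textbook-style approximation that assumes memory stalls do not overlap with computation; it is not the model from the paper, which the slides do not reproduce, and all parameter names and values are hypothetical.

```python
def predict_cpi(cpi_compute, misses_per_instr, mem_latency_ns, cpu_ghz):
    """CPI = compute CPI + memory stall cycles per instruction."""
    mem_latency_cycles = mem_latency_ns * cpu_ghz  # ns * cycles/ns
    return cpi_compute + misses_per_instr * mem_latency_cycles


# Hypothetical application: 1 LLC miss per 100 instructions on a 2 GHz core.
base = predict_cpi(1.0, 0.01, mem_latency_ns=60.0, cpu_ghz=2.0)
slow = predict_cpi(1.0, 0.01, mem_latency_ns=66.0, cpu_ghz=2.0)
print("predicted CPI increase:", slow / base - 1.0)
```

Under this model, the more memory-bound the application (higher misses per instruction), the larger the CPI increase for the same latency increase.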

Page 12: MemScale: Active Low-Power Modes for Main Memory

MemScale adjusts frequency dynamically

[Figure: Timeline of workload mix MID3]

Page 14: MemScale: Active Low-Power Modes for Main Memory

Methodology

• Detailed simulation
  • 16 cores, 16MB LLC, 4 DDR3 channels, 8 DIMMs
  • Multi-programmed workloads from SPEC suites

• Power modes
  • 10 frequencies between 200 and 800 MHz

• Power consumption
  • Micron’s DRAM power model
  • Memory system power = 40% of total server power

Page 15: MemScale: Active Low-Power Modes for Main Memory

Results – energy savings and performance

[Figure, panel "Average energy savings": Energy savings (%) for ILP, MID, MEM, and AVG; series: Full system energy, Memory system energy.]

[Figure, panel "Performance overhead": CPI increase (%) for ILP, MID, MEM, and AVG; series: Multiprogram average, Worst program in mix; reference line: CPI degradation bound.]

Memory energy savings of 44%

System energy savings of 18%, always within performance bound
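The two headline numbers can be related with a first-order consistency check, using the methodology's assumption that the memory system accounts for 40% of server power. The helper below is hypothetical, and it treats non-memory energy as unchanged by default; the slides do not state how the system-level figure was derived.

```python
def system_savings(mem_fraction, mem_savings, other_change=0.0):
    """Fractional full-system energy savings.

    mem_fraction: share of baseline energy used by the memory system.
    mem_savings:  fractional reduction in memory energy.
    other_change: fractional change in non-memory energy (positive = increase).
    """
    return mem_fraction * mem_savings - (1.0 - mem_fraction) * other_change


# 40% of energy in memory, 44% memory savings: about 0.176, close to 18%.
print(system_savings(0.40, 0.44))
```

Passing a positive `other_change` shows how any extra non-memory energy from a longer runtime would eat into the system-level savings.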

Page 16: MemScale: Active Low-Power Modes for Main Memory

Alternative approaches

• Fast power-down
  • Transition ranks into fast power-down mode when idle

• Decoupled-DIMM [Zheng’09]
  • Low frequency DRAM + high frequency DIMMs & channels

• Static
  • Pre-selected active low-power mode w/o dynamic scaling
  • Unrealistic: needs a priori knowledge of workload behavior

Page 17: MemScale: Active Low-Power Modes for Main Memory

Results – comparison to alternative approaches

[Figure, panel "Full system energy savings (MID)": Energy savings (%) for Fast-PD, Decoupled-DIMM, Static, MemScale, and MemScale+Fast-PD.]

[Figure, panel "Performance overhead (MID)": CPI increase (%) for Fast-PD, Decoupled-DIMM, Static, MemScale, and MemScale+Fast-PD; series: Multiprogram average, Worst program in mix.]

Page 18: MemScale: Active Low-Power Modes for Main Memory

Conclusions

• MemScale contributions:
  • Active low-power modes for the memory subsystem
  • New perf. counters to capture energy and contention
  • OS policy to choose best power mode dynamically

• Avg 18% system energy savings, avg 4% performance loss

• In the paper
  • Performance and energy models
  • Sensitivity analyses (including lower performance bounds)
  • Energy break-down comparison

Page 19: MemScale: Active Low-Power Modes for Main Memory

THANKS!

SPONSORS: