
CoolPression: A Hybrid Significance Compression Technique for Reducing Energy in Caches

Mrinmoy Ghosh, Weidong Shi, Hsien-Hsin (Sean) Lee

School of Electrical and Computer Engineering

Georgia Institute of Technology

September 15, 2004

Hot Caches

[Pie chart: Alpha 21264]

Data Path: 32%
Mem. Controller: 19%
Integer Units: 16%
Data Cache: 14%
Bus Interface Unit: 12%
Instruction Cache: 7%

[Pie chart: ARM 920T]

I Cache: 25%
ARM 9: 25%
D Cache: 19%
BIU: 8%
D MMU: 5%
I MMU: 4%
Clocks: 4%
Other: 4%
SysCtl: 3%
CP15: 2%
PATag RAM: 1%

Motivation

[Figure: Occurrences of Leading Zeroes for SPECint2000. X-axis: # of Leading Zeroes (1 to 64). Y-axis: # of Instances (log scale, 1 to 1,000,000,000). Series: Max, Min, Avg.]

Uniform distribution of occurrences of leading zeroes across the 64 bit space

Salient Features of CoolPression

● Energy savings at bit granularity

● Compresses both leading 1's and leading 0's

● Reuses the most significant byte, minimizing overhead

● CoolPression is a hybrid of two schemes

    Dynamic Zero Compression

    CoolCount Scheme

● Chooses the better scheme dynamically

CoolPression Cache

[Figure: SRAM Cell Array with Sense Amps, 32 bits wide.]

CoolPression Cache: DZC

Dynamic Zero Compression Technique [Villa et al. 2000]

[Figure: SRAM Cell Array with ZIBs and Sense Amps, 36 bits wide. Example word in the array: 11111111 00000000 11111111 00000000]
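As a rough illustration of the ZIBs in this figure, the Python sketch below assumes one ZIB per byte of a 32-bit word (which would account for the 36-bit width on the slide) and that a set ZIB marks an all-zero byte, in line with the read flow later in the deck. The function names `zero_indicator_bits` and `data_bitlines_read` are hypothetical, and the 8-bitlines-per-byte count is only a simple proxy, not the authors' energy model.

```python
def zero_indicator_bits(word, width=32):
    """Return one ZIB per byte, most significant byte first.

    Assumption: a ZIB of 1 marks a byte whose eight bits are all zero.
    """
    nbytes = width // 8
    zibs = []
    for i in range(nbytes):
        byte = (word >> (8 * (nbytes - 1 - i))) & 0xFF
        zibs.append(1 if byte == 0 else 0)
    return zibs


def data_bitlines_read(zibs):
    """Data bitlines that still need to be read: 8 for each non-zero byte."""
    return 8 * zibs.count(0)


word = 0xFF00FF00  # the example bit pattern drawn in the SRAM cell array
zibs = zero_indicator_bits(word)
print(zibs, data_bitlines_read(zibs))  # [0, 1, 0, 1] 16
```

For the example word, two of the four bytes are zero, so under these assumptions only 16 of the 32 data bitlines would need to be read.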

CoolPression Cache

[Figure: CoolCount Technique. SRAM Cell Array with ZIBs and Sense Amps, 36 bits wide. Example word in the array: 11111111 00000000 11111111 00000000]

CoolPression Cache

Step 1: Read in first 7 bits and the ZIBs

[Figure: CoolCount Technique. Labels: SRAM Cell Array (36 bits, ZIBs, CE Bit), Sense Amps, CoolCount Circuit, Data from Cache (37), Data Out (33), Bitline Enable Lines (32), 6 bits, 32 - count. Example word in the array: 11111111 00000000 11111111 00000000]

Step 2a: Read only 32 - count bits and append with leading zeroes or ones
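Step 2a can be pictured as a mask over the 32 bitline enable lines: only the least significant 32 - count lines are switched on. The Python sketch below shows that mask under the assumption that bit i of the mask corresponds to data bitline i; the name `bitline_enables` is hypothetical, and how the count is obtained from the first bits read in Step 1 is not modeled here.

```python
WIDTH = 32  # data bits per word, as drawn on the slide


def bitline_enables(count, width=WIDTH):
    """Mask with a 1 for every data bitline that is enabled for the read.

    Assumption: the `count` most significant bitlines stay off and the
    remaining `width - count` least significant bitlines are enabled.
    """
    if not 0 <= count <= width:
        raise ValueError("count must lie between 0 and width")
    return (1 << (width - count)) - 1


# A word with 20 leading zeroes or ones needs only its low 12 bitlines.
print(format(bitline_enables(20), "032b"))
# 00000000000000000000111111111111
```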

Counting Leading 0's And 1's

[Figure, built up over four slides: an 8-bit input passes through a Priority Encoder to produce the "# of Leading Zeroes or Ones". Bit patterns shown, from input to result:]

0 0 0 0 1 1 0 1

0 0 0 1 0 1 1

0 1 1

1 0 0
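One reading of the bit patterns on these slides is that adjacent input bits are XORed (0 0 0 0 1 1 0 1 gives 0 0 0 1 0 1 1), the priority encoder reports the position of the first 1 (0 1 1, i.e. 3), and adding one yields the run length (1 0 0, i.e. 4). The Python sketch below models that interpretation in software; it is inferred from the figures rather than stated on the slides, and the function name `leading_run_length` is hypothetical.

```python
def leading_run_length(word, width=8):
    """Length of the run of identical bits at the most significant end."""
    # Bits of the word, most significant first.
    bits = [(word >> (width - 1 - i)) & 1 for i in range(width)]
    # A 1 marks every place where two neighbouring bits differ.
    transitions = [bits[i] ^ bits[i + 1] for i in range(width - 1)]
    # Priority encoding: the first transition ends the leading run.
    for position, flag in enumerate(transitions):
        if flag:
            return position + 1
    return width  # all bits equal: the run covers the whole word


print(leading_run_length(0b00001101))  # 4 (leading zeroes)
print(leading_run_length(0b11110010))  # 4 (leading ones)
```

Because the XOR looks only at whether neighbouring bits differ, the same logic handles leading zeroes and leading ones without a separate path for each.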

Bitline Precharge Enabling Circuit

[Figure: circuit schematic. Labels: Coolcount Decoder Circuit (C2, C1, C0; Y7 to Y0), Precharge Enable from Coolcount Decoder Circuit, Precharge Control Transistor, Bitline Precharge, VDD, SRAM Cells with wordlines (wl) and bitlines (b).]

Read Data From CoolPression Cache

[Flowchart]

1. Read in Count Enable (CE) bit and first 6 bits of data

2. CE == 1?

    Yes: Enable least significant 64-count bit lines. Read data from least significant 64-count bit lines and append with count leading zeroes or ones

    No: Read data for bytes where ZIB is not enabled and make the other bytes zero
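The two branches of this read flow can be sketched in Python as follows. The stored entry is modeled as a plain dictionary whose field names (`ce`, `fill`, `count`, `data`, `zibs`, `bytes`) are hypothetical; in particular, the count and the fill value (zero or one) are passed in directly, since the slides do not spell out how they are packed into the first bits that are read.

```python
def read_word(entry, width=64):
    """Rebuild a full word from a stored entry (illustrative model only)."""
    if entry["ce"] == 1:
        # CoolCount path: only the low `width - count` bits were stored.
        count, fill = entry["count"], entry["fill"]
        keep = width - count
        low = entry["data"] & ((1 << keep) - 1)
        # Append `count` leading ones, or leave the leading bits zero.
        high = (((1 << count) - 1) << keep) if fill else 0
        return high | low
    # DZC path: bytes whose ZIB is set are zero, the others were stored.
    stored = iter(entry["bytes"])
    word = 0
    for zib in entry["zibs"]:  # most significant byte first
        word = (word << 8) | (0 if zib else next(stored))
    return word


coolcount_entry = {"ce": 1, "fill": 1, "count": 52, "data": 0x123}
dzc_entry = {"ce": 0, "zibs": [1, 1, 1, 1, 0, 1, 0, 1], "bytes": [0xFF, 0xFF]}
print(hex(read_word(coolcount_entry)))  # 0xfffffffffffff123
print(hex(read_word(dzc_entry)))        # 0xff00ff00
```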

Write Data To CoolPression Cache

[Flowchart]

1. Count number of leading zeroes or ones

2. Check for bytes which are zero

3. Count > Zero Bytes?

    Yes: Set CE bit to one and enable most significant 6 bit lines and least significant 64-count bit lines. Write encoded data to cache

    No: Set CE bit to 0 and write data to cache, setting ZIBs where necessary
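A minimal Python sketch of this write decision follows. The slide only says "Count > Zero Bytes"; the sketch assumes this means comparing the leading-run length in bits against the number of bits covered by all-zero bytes (8 per byte), which is an interpretation rather than something the slide states. The function names and the dictionary fields are hypothetical.

```python
def leading_run(word, width=64):
    """Return (fill, count): the top bit and the length of its leading run."""
    fill = (word >> (width - 1)) & 1
    count = 0
    for i in range(width - 1, -1, -1):
        if ((word >> i) & 1) != fill:
            break
        count += 1
    return fill, count


def write_word(word, width=64):
    """Pick CoolCount (CE = 1) or DZC (CE = 0) for one word."""
    fill, count = leading_run(word, width)
    nbytes = width // 8
    data = [(word >> (8 * (nbytes - 1 - i))) & 0xFF for i in range(nbytes)]
    zero_bytes = data.count(0)
    # Assumption: compare bits saved by each scheme.
    if count > 8 * zero_bytes:
        low = word & ((1 << (width - count)) - 1)
        return {"ce": 1, "fill": fill, "count": count, "data": low}
    zibs = [1 if b == 0 else 0 for b in data]
    return {"ce": 0, "zibs": zibs, "bytes": [b for b in data if b != 0]}


print(write_word(0xFFFFFFFFFFFFF123)["ce"])  # 1: 52 leading ones, no zero bytes
print(write_word(0x00000000FF00FF00)["ce"])  # 0: 32 leading zeroes, 6 zero bytes
```

The second example shows why a hybrid can help: the word has only 32 leading zeroes but six zero bytes scattered through it, so byte-level zero compression covers more bits than the leading run does.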

Simulation Methodology

● Simulator: SimpleScalar with Wattch

● Benchmarks: SPECint2000

● Power Numbers for Cache Structures: CACTI

● Power Numbers for Priority Encoder: J.S. Wang, C.H. Huang. "High Speed and low power CMOS priority encoders". IEEE Journal of Solid-State Circuits, 35(10), 2000

● For a 64 KB cache, the priority encoder consumes around 0.1% of the cache power

Results

[Figure: 16K Data Cache. Y-axis: Norm. Total Power (0 to 1). Benchmarks: Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, Avg. Series: Dcache Base, Dcache CoolCount, Dcache DZC, Dcache CoolPression.]

[Figure: 16K Instruction Cache. Y-axis: Norm. Total Power (0.75 to 1). Benchmarks: Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, Avg. Series: Icache Base, Icache CoolCount, Icache DZC, Icache CoolPression.]

Results

[Figure: 32K Data Cache. Y-axis: Norm. Total Power (0 to 1). Benchmarks: Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, Avg. Series: Dcache Base, Dcache CoolCount, Dcache DZC, Dcache CoolPression.]

[Figure: 32K Instruction Cache. Y-axis: Norm. Total Power (0 to 1). Benchmarks: Bzip2, Crafty, GCC, GZIP, MCF, Parser, Vortex, Vpr, Avg. Series: Icache Base, Icache CoolCount, Icache DZC, Icache CoolPression.]

Potential Performance Impact

[Figure: Y-axis: IPC (0 to 2.5). Benchmarks: Crafty, Gcc, Gzip, Mcf, Parser, Twolf, Vortex, VPR, Avg. Series: Normal Cache, CoolPression Cache.]

Conclusions

● System Transparent Hybrid Zero Compression Scheme

● Bit level and Byte level compressibility used to save power

● Energy Savings of over 35% over baseline cache

● Potential Use at other places where data transfer takes place

Thank You