Smart Cache Cleaning: Energy Efficient Vulnerability Reduction in Embedded Processors

Reiley Jeyapaul and Aviral Shrivastava, Compiler Microarchitecture Lab (CML), Arizona State University, Tempe, Arizona, USA


Post on 25-Feb-2016


Page 1

Smart Cache Cleaning: Energy Efficient Vulnerability Reduction in Embedded Processors

Reiley Jeyapaul and Aviral Shrivastava

Compiler Microarchitecture Lab, Arizona State University, Tempe, Arizona, USA

Page 2

CML Web page: aviral.lab.asu.edu

Scaling Drives Technology Advancement

Smaller device dimensions improve performance and reduce power consumption.

Processor device size shrinks rapidly every generation: 45nm [2008], 30nm [2010], 20nm [2011], 15nm [2013*], 10nm [2015*]

*Expected

Page 3

Reliability a consequence: Transient Faults induce Soft Errors

Electrical disturbances can disrupt the operation, causing Transient Faults.

Page 4

Soft Errors - an Increasing Concern with Technology Scaling

Toyota Prius: SEUs blamed as the probable cause for unintended acceleration.

Performance is useless if not correct!

Charge-carrying particles induce Soft Errors
- Alpha particles
- Neutrons: high energy (100 keV - 1 GeV), low energy (10 meV - 1 eV)

Soft Error Rate
- Is now 1 per year
- Exponentially increases with technology scaling
- Projected: 1 per day in a decade

Page 5

Agenda
- Why cache vulnerability?
- Cache Cleaning to Improve Reliability
- Smart Cache Cleaning Methodology
- Experimental Evaluation and Results

Page 6

Caches are most vulnerable

Caches occupy majority of chip-area
- Much higher % of transistors: more than 80% of the transistors in Itanium 2 are in caches
- Low operating voltages
- Frequent accesses
- Small and tight SRAM cell layout
- Majority contributor to the total soft errors in a system

[Figure: architecture blocks compared: Cache (split I/D) = 32KB, I-TLB = 48 entries, D-TLB = 64 entries, LSQ = 64 entries, Register File = 32 entries]

With cheap error detection, the cache is still the most susceptible architecture block.

Page 7

How to protect L1 Cache?

Features                SECDED [1]                       Parity
Error detection         1 bit and 2 bit                  1 bit
Error correction        1 bit                            No correction
Cache access latency    +95% increase (can be hidden)    No impact
Cache area increase     +22%                             + <1%
Cache power increase    +22%                             + <1%
Enabled processors      SPM of IBM Cell                  ARM, Intel XScale, Intel Atom

SECDED (to detect + correct): consequences render it impractical.

Parity (practical method): needs a supporting method to correct errors.

[1] L. Hung, H. Irie, M. Goshima, and S. Sakai. Utilization of SECDED for soft error and variation-induced defect tolerance in caches. In DATE ’07,
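The parity column of the comparison can be illustrated with a minimal Python sketch. It assumes one even-parity bit per cached word; the function names and the `dirty` / `memory_word` parameters are hypothetical and are not taken from the slides. A single bit flip is detected, and recovery works only when the word is not dirty, because only then does the lower level of memory hold a valid copy.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word has an odd number of 1 bits."""
    return bin(word).count("1") % 2


def read_with_parity(cached_word: int, stored_parity: int,
                     dirty: bool, memory_word: int) -> int:
    """Return the word seen by the processor (illustrative model only)."""
    if parity_bit(cached_word) == stored_parity:
        return cached_word          # no error detected
    if not dirty:
        return memory_word          # clean data: reload from memory
    # Dirty data has no valid copy elsewhere, so parity alone cannot recover it.
    raise RuntimeError("soft error detected in dirty data: not recoverable")


word = 0b10110010
stored = parity_bit(word)
corrupted = word ^ (1 << 3)          # a single bit flip (soft error)

print(read_with_parity(corrupted, stored, dirty=False, memory_word=word) == word)
```

A two-bit flip leaves the parity unchanged and goes undetected, which is the 1-bit detection limit listed for parity.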

Page 8

Cache Vulnerability

Assume: parity-based error detection to detect 1-bit errors.

Non-dirty data is not vulnerable
- Can always re-read non-dirty data from lower level of memory
- Parity-based error detection can correct soft errors on non-dirty data

Dirty data cannot be reloaded (recovered) from errors.

Data in the cache is vulnerable if
- it will be read by the processor, or it will be committed to memory,
- AND it is dirty

[Figure: timeline of read (R), write (W), and CE events on a cache block]

How to protect dirty L1 cache data?
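The definition of vulnerable data above can be turned into a small Python sketch that counts vulnerable cycles for one cache block. The event encoding ('R' read, 'W' write, 'E' eviction with write-back) and the function name are assumptions made for illustration. The sketch also assumes that a write overwrites the whole data item, so an interval that ends in a write is not counted.

```python
def block_vulnerability(events):
    """Count vulnerable cycles of one cache block.

    events: list of (cycle, op) sorted by cycle, with op in
    'R' (read), 'W' (write), 'E' (eviction / write-back to memory).
    An interval between two events is vulnerable if the data is dirty
    and the next event reads it or commits it to memory.
    """
    vulnerable_cycles = 0
    dirty = False
    for (t0, op0), (t1, op1) in zip(events, events[1:]):
        if op0 == "W":
            dirty = True            # block now differs from memory
        elif op0 == "E":
            dirty = False           # write-back leaves a valid copy in memory
        if dirty and op1 in ("R", "E"):
            vulnerable_cycles += t1 - t0
    return vulnerable_cycles


trace = [(0, "R"), (2, "W"), (5, "R"), (9, "R"), (12, "E"), (15, "R")]
print(block_vulnerability(trace))  # 3 + 4 + 3 = 10
```

In this trace the block is clean during cycles 0 to 2 and after the eviction at cycle 12, so only the 10 cycles in between count.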

Page 9

Agenda
- Why cache vulnerability?
- Cache Cleaning to Improve Reliability
  - Write-through cache
  - Early Write-back cache
  - Proposed Smart Cache Cleaning
- Smart Cache Cleaning Methodology
- Experimental Evaluation and Results

Page 10

Possible Solution 1: Write-Through Cache

A copy of cache-data is written into the memory
- NO dirty data in cache
- NO vulnerability
- HIGH L1-M traffic

Error Recovery: if an error is detected on a subsequent access, the data can be reloaded from memory to recover.

Example program: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}

[Figure: program timeline (cycles) of the RW accesses to A[1], A[2], A[3] up to the end of the loop, with a write-back (cache cleaning) to memory after each access]

Vulnerability = 0

# write-backs = 9

Page 11

Possible Solution 2: Early Write-back Cache

Hardware-only cleaning has no knowledge of the program's data access pattern.

Example program: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}

[Figure: program timeline (cycles) of the RW accesses to A[1], A[2], A[3] up to the end of the loop, with periodic write-back (4 cycles) and the resulting vulnerability of A[1], A[2], A[3]]

- Unnecessary cleaning while data is being reused
- Data unused but vulnerable

Figure annotations: Vulnerability = 48, # write-backs = 0; Vulnerability = 13, # write-backs = 8

Vulnerability ≠ 0. What went wrong?

L. Li, V. Degalahal, N. Vijaykrishnan, M. Kandemir, and M. Irwin. Soft error and energy consumption interactions: a data cache perspective. In ISLPED ’04.

Page 12

Proposed Solution: Smart Cache Cleaning

Example program: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}

[Figure: program timeline (cycles) of the RW accesses to A[1], A[2], A[3] up to the end of the loop, with smart cache cleaning and the resulting vulnerability of A[1], A[2], A[3]]

- Vulnerability = 0 for unused data.
- Data is vulnerable while being reused by the program.
- For this program, clean data ONLY when not in use by the program.

Vulnerability = 18

# write-backs = 3

Smart program analysis can help perform Cache Cleaning only when required.
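The write-back counts quoted for the example loop (9 for write-through, 3 for smart cleaning) can be reproduced with a short Python sketch. It models only the stores to A[i] and cleans on the last store to each element; the helper names are hypothetical, and the vulnerability figures are not modelled because they depend on cycle timings that the transcript does not give.

```python
def store_sequence(n_outer=3, n_inner=3):
    """Stores issued by: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}"""
    return [f"A[{i}]" for i in range(1, n_outer + 1) for _ in range(n_inner)]


def write_through_writebacks(stores):
    """A write-through cache copies every store to memory."""
    return len(stores)


def last_use_pattern(stores):
    """1 where a store is the last one to its address (clean here), else 0."""
    return [0 if addr in stores[idx + 1:] else 1
            for idx, addr in enumerate(stores)]


stores = store_sequence()
pattern = last_use_pattern(stores)
print(write_through_writebacks(stores))   # 9
print(pattern, sum(pattern))              # [0, 0, 1, 0, 0, 1, 0, 0, 1] 3
```

Cleaning only on the last use of each element gives the same 0 0 1 0 0 1 0 0 1 sequence that the later slides show as the SCC pattern.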

Page 13

Agenda
- Why cache vulnerability?
- Cache Cleaning to Improve Reliability
- Smart Cache Cleaning Methodology
  - When to clean data?
  - SCC Hardware Architecture
  - How to clean data?
  - Which data to clean?
- Experimental Evaluation and Results

Page 14

How to do Smart Cache Cleaning?

[Figure: targeted cache cleaning architecture. Blocks: instruction pipeline (IF, ID, EX, M, WB), LSQ, L1 Cache (R/W cache accesses), Memory (memory write-backs), SCC Insn Addr, SCC Pattern, Store Insn Addr, and a Controller that issues the clean signal when required. SCC Analysis takes the Program and Memory Profile data. Questions marked on the figure: Which data to clean? (SCC Insn Addr), When to clean? (SCC Pattern), How to clean? (Cache Cleaning)]

Page 15

When to clean data?

Example program: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}

[Figure: program timeline (cycles) of the RW accesses to A[1], A[2], A[3] up to the end of the loop, with the Instantaneous Vulnerability (per access); annotated values for A[1] include 3 and 19]

If Instantaneous Vulnerability of access > SCC_Threshold
    Execute: store + clean
    assign 1 to SCC_Pattern
Else
    Execute: store only
    assign 0 to SCC_Pattern

If the end of loop execution is not the end of the program, then the instantaneous vulnerability of the last access extends till the subsequent cache eviction.

SCC_Pattern: 0 0 1 0 0 1 0 0 1

SCC_Threshold = 4
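The threshold rule above maps directly to a few lines of Python. In this sketch the per-access instantaneous vulnerability values are an assumption: the slide annotates 3 and 19 for A[1], and the same values are reused here for A[2] and A[3] purely for illustration.

```python
def scc_pattern(instantaneous_vulnerability, scc_threshold):
    """1 means 'store + clean', 0 means 'store only'."""
    return [1 if iv > scc_threshold else 0
            for iv in instantaneous_vulnerability]


# Assumed values: two short reuse intervals, then a long idle interval,
# repeated for each of the three array elements.
iv_per_access = [3, 3, 19] * 3

print(scc_pattern(iv_per_access, scc_threshold=4))
# [0, 0, 1, 0, 0, 1, 0, 0, 1]
```

A larger threshold cleans less often: with a threshold of 19 or more, no access in this example would trigger cleaning.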

Page 16

How to do Smart Cache Cleaning

[Figure: targeted cache cleaning architecture, repeated from Page 14]

Page 17

How to clean data?

Example program: for(i:1~3){ for(j:1~3){ A[i]+=B[j] }}

SCC Pattern: 0 0 1 0 0 1 0 0 1

[Figure: program execution on the targeted cache cleaning architecture (Instruction Pipeline, LSQ, L1 Cache, Memory, Controller). The Controller holds the SCC_Pattern and issues the clean signal for Cache Cleaning. Annotations: Cycle count : 369; 0: No Cleaning]
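One way to read the controller in this figure is as a small state machine that steps through the SCC pattern each time the selected store instruction executes. The Python sketch below is an interpretation under stated assumptions, not the authors' hardware design: the class name, the cyclic reuse of the pattern, and the example instruction addresses are all hypothetical.

```python
class SCCController:
    """Toy model: decide whether a store should also clean its cache block."""

    def __init__(self, scc_insn_addr, scc_pattern):
        self.scc_insn_addr = scc_insn_addr   # store instruction to watch
        self.pattern = list(scc_pattern)     # 1 = clean, 0 = no cleaning
        self.pos = 0                         # next pattern bit to consume

    def on_store(self, insn_addr):
        """Return True if a clean signal should be issued with this store."""
        if insn_addr != self.scc_insn_addr or not self.pattern:
            return False                     # other stores never clean
        clean = self.pattern[self.pos] == 1
        # Assumption: the pattern wraps around once all bits are used.
        self.pos = (self.pos + 1) % len(self.pattern)
        return clean


ctrl = SCCController(scc_insn_addr=0x400, scc_pattern=[0, 0, 1])
signals = [ctrl.on_store(0x400) for _ in range(9)]
print([int(s) for s in signals])   # [0, 0, 1, 0, 0, 1, 0, 0, 1]
```

Stores from any other instruction address leave the pattern position unchanged, so only the selected reference is ever cleaned.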

Page 18

SCC Achieves Energy-efficient Vulnerability Reduction

Hardware-only cache cleaning trades off energy for vulnerability.

Smart Cache Cleaning can achieve ≈0 Vulnerability, at ≈0 Energy cost.

Page 19

SCC_Pattern Generation: Weighted k-bit Compression

SCC Cleaning sequence: 1 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 0 0 0 1 1 1

K = 8

SCC Pattern: - - - - - - - -

Sliding window of 8 bits

To determine the matching bit value for position 0:

Bit count in position 0: Num of 1s = 3, Num of 0s = 1

Cost for placing 0 in pos [0] of SCC Pattern (cost of not cleaning when required): cost_of_0 = Num of 1s X 1 = 3 X 1 = 3

Cost for placing 1 in pos [0] of SCC Pattern (cost of cleaning when not required): cost_of_1 = Num of 0s X 2 = 1 X 2 = 2

if ( cost_of_1 ≤ cost_of_0 ) Bit value [0] = 1

Choose bit value = 1, iff # of 1s ≥ 2X # of 0s

SCC Pattern: - - - - - - - 1

Page 20

SCC_Pattern Generation: Weighted k-bit Compression

SCC Cleaning sequence: 1 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 0 0 0 1 1 1

K = 8

Remaining 6 bits are 0-padded: 0 0 0 0 0 0

if ( cost_of_1[i] ≤ cost_of_0[i] ) Bit value [i] = 1
else Bit value [i] = 0

Position [1]: cost_of_1[1] = 2, cost_of_0[1] = 3 (greater # of 1s). SCC Pattern: - - - - - - 1 1

Position [2]: cost_of_1[2] = 2, cost_of_0[2] = 3 (greater # of 1s). SCC Pattern: - - - - - 1 1 1

Position [4]: cost_of_1[4] = 6, cost_of_0[4] = 1 (greater # of 0s)

Position [6]: cost_of_1[6] = 4, cost_of_0[6] = 2 (equal # of 0s and 1s)

All 0s: Bit value = 0

SCC Pattern as the remaining positions are filled: - - - - 0 1 1 1, - - - 0 0 1 1 1, - - 0 0 0 1 1 1, - 0 0 0 0 1 1 1

Final SCC Pattern: 0 0 0 0 0 1 1 1
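The weighted k-bit compression walked through on these two slides can be sketched in Python. The sketch makes the bit ordering explicit as an assumption inferred from the worked numbers: position 0 is the rightmost bit, windows of k bits are taken from the right end of the cleaning sequence, and the last (leftmost) window is padded with 0s. The function and parameter names are hypothetical; the weights 1 and 2 are the ones used on the slides.

```python
def weighted_kbit_pattern(sequence, k=8, w_missed_clean=1, w_extra_clean=2):
    """Compress a cleaning sequence into a k-bit SCC pattern.

    sequence: string of 0/1 characters (spaces are ignored).
    Returns the pattern as a string, position k-1 first, position 0 last.
    """
    bits = [int(c) for c in sequence if c in "01"]
    if not bits:
        return "0" * k                        # nothing to clean
    rev = bits[::-1]                          # index 0 = rightmost bit
    rev += [0] * (-len(rev) % k)              # 0-pad the last window
    windows = [rev[i:i + k] for i in range(0, len(rev), k)]

    pattern = []
    for pos in range(k):
        ones = sum(w[pos] for w in windows)
        zeros = len(windows) - ones
        cost_of_0 = ones * w_missed_clean     # not cleaning when required
        cost_of_1 = zeros * w_extra_clean     # cleaning when not required
        pattern.append(1 if cost_of_1 <= cost_of_0 else 0)
    return "".join(str(b) for b in reversed(pattern))


seq = "1 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 0 1 0 1 0 0 0 1 1 1"
print(weighted_kbit_pattern(seq, k=8))   # 00000111
```

With this ordering the sketch reproduces the per-position counts on the slides (for example 3 ones and 1 zero at position 0, and 2 of each at position 6) and the final pattern 0 0 0 0 0 1 1 1.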

Page 21

Accuracy of the Weighted Pattern-Matching Algorithm

Weights used in the algorithm define the accuracy.

Size of k affects accuracy.

Page 22

How to do Smart Cache Cleaning

[Figure: targeted cache cleaning architecture, repeated from Page 14]

Page 23

Which data to clean?

One SCC InsnAddr Register

Overlapping accesses: choosing B precludes the choice of A.

Instantaneous Vulnerability (IV) by each access: A1 = 10, A2 = 20, B1 = 20

How to choose one over another? Use the average vulnerability per access:

Parameters       Ref A    Ref B
Vulnerability    30       20
Access #         2        1
Profit (V/A)     15       20
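The profit comparison in the table can be written as a short Python sketch that picks the reference with the highest average vulnerability per access when only one SCC InsnAddr register is available. The function name and the dictionary layout are hypothetical; the numbers are the ones from the slide.

```python
def best_reference(iv_per_reference):
    """Return (reference, profit table), where profit = vulnerability / accesses."""
    profit = {ref: sum(ivs) / len(ivs)
              for ref, ivs in iv_per_reference.items() if ivs}
    best = max(profit, key=profit.get)
    return best, profit


ivs = {"A": [10, 20], "B": [20]}
chosen, profit = best_reference(ivs)
print(profit)   # {'A': 15.0, 'B': 20.0}
print(chosen)   # B
```

Reference A has the larger total vulnerability (30 versus 20), but B removes more vulnerability per cleaning operation, so B is selected.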

Page 24

Energy Efficient Vulnerability Reduction with SCC

Page 25

SCC: Better results with more hardware registers

With more SCC registers, vulnerability is reduced further, at the cost of hardware overhead.

Page 26

Summary

We develop a Hybrid Compiler & Micro-architecture technique for Reliability: SCC

Soft Errors are a major concern, and Caches are most vulnerable to transient errors by radiation particles

Cache Cleaning can reduce vulnerability, at the possible cost of power overhead
- ECC gains 0 vulnerability, but 70X power overhead
- EWB gains 47% vulnerability reduction, with 6X power overhead

Our Smart Cache Cleaning technique:
- performs Cleaning on the right cache blocks at the right time
- achieves energy-efficient reliability in embedded systems

Page 27

Future Work

SCC-hardware overhead can be eliminated through compiler-based instrumentation and loop unrolling.

Compile-time SCC analysis and instrumentation can be performed using Cache Vulnerability Equations [LCTES'10].
- Pure software-only SCC solution
- NO hardware overhead

The accuracy of the k-bit pattern matching algorithm can be improved by introducing methods to accurately calibrate the weights used in the algorithm.

Page 28

e-mail : [email protected]

Home Page : www.public.asu.edu/~rjeyapau/

CML Lab : http://aviral.lab.asu.edu