
Effects of wrong path mem. ref. in CC MP Systems

Gökay Burak AKKUŞ

Cmpe511 – Computer Architecture


About the papers

R. Sendag, A. Yilmazer, J.J. Yi, and A.K. Uht. Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems. IPDPS, 2006.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.


What is it all about?

How wrong-path memory accesses affect cache coherence traffic, state transitions, and resource utilization.

The work also proposes a filtering mechanism and a replacement policy to reduce these effects.


Subjects

SMPs: shared-memory multiprocessor systems
Cache coherence
Branch prediction and prefetching
Wrong paths


Cache Coherence Solutions

Snooping solution (snoopy bus):
  Send all requests for data to all processors
  Processors snoop to see if they have a copy and respond accordingly
  Requires broadcast, since caching information is kept at the processors
  Works well with a bus (a natural broadcast medium)
  Dominates for small-scale machines (most of the market)

Directory-based schemes:
  Keep track of what is being shared in one (logically) centralized place
  Distributed memory => distributed directory for scalability (avoids bottlenecks)
  Send point-to-point requests to processors via the network
  Scale better than snooping
  Actually existed before snooping-based schemes
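The scaling difference between the two approaches can be illustrated with a rough count of how many caches are involved in a single request. This is a minimal sketch under simplifying assumptions: the function names and the cost model (one lookup per snooping cache, one directory request plus one message per sharer) are hypothetical and do not come from the papers.

```python
def snoop_lookups(num_processors: int) -> int:
    """Snooping: every other processor checks its cache for the requested block."""
    return num_processors - 1


def directory_messages(sharers: set, requester: int) -> int:
    """Directory: one request to the directory plus one message per other sharer."""
    return 1 + len(sharers - {requester})


print(snoop_lookups(16))               # 16-processor system, one request: 15
print(directory_messages({3, 7}, 0))   # block cached by processors 3 and 7: 3
```

With snooping the cost grows with the number of processors, while with a directory it grows only with the number of caches that actually hold the block.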


Cache Coherence Protocols

MSI (Modified, Shared, Invalid)
MESI (Modified, Exclusive, Shared, Invalid)
MOESI (Modified, Owned, Exclusive, Shared, Invalid)


Wrong-path effects

Replacements
Writebacks
Invalidations
Cache block state transitions
Data/bus traffic and coherence transactions
Power consumption
Resource contention


Replacements

Cause: a speculatively executed load instruction on a mispredicted path
  A cache block is brought into the data cache
  One of the existing cache blocks is replaced by the new one


Writebacks

When a replacement is caused by a wrong-path reference:
  The evicted cache block may be in state M (exclusive, dirty) or O (shared, dirty)
  Before this block is removed from the cache, a writeback occurs

For MSI and MESI:
  If a requested cache block is in state M, it is written back to memory before it is sent to the requestor
  Its state is then set to S in the original owner's cache
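The replacement and writeback effects can be illustrated with a toy cache. This is a minimal sketch, assuming a fully associative LRU cache with one state letter per block; the class name `TinyCache`, its capacity, and the addresses are hypothetical and not taken from the papers.

```python
from collections import OrderedDict


class TinyCache:
    """Fully associative LRU cache; each block carries a coherence state letter."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # address -> state, least recently used first
        self.writebacks = 0

    def access(self, addr, state="E"):
        if addr in self.blocks:
            self.blocks.move_to_end(addr)  # hit: refresh the LRU position
            return
        if len(self.blocks) >= self.capacity:
            victim, victim_state = self.blocks.popitem(last=False)
            if victim_state in ("M", "O"):  # dirty blocks must be written back
                self.writebacks += 1
        self.blocks[addr] = state


cache = TinyCache(capacity=2)
cache.access(0x100, state="M")  # correct-path store leaves a dirty block
cache.access(0x200)             # correct-path load
cache.access(0x300)             # wrong-path load: evicts 0x100, forcing a writeback
print(cache.writebacks)         # 1
```

The wrong-path load contributes nothing useful to the program, yet it costs one replacement and one writeback, and a later correct-path access to 0x100 would miss.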


Invalidations

Assume the MOESI protocol
A wrong-path load instruction accesses a cache block that has been modified by another processor
  The owner sets the block's state to O
  The requestor gets the block in state S
If the owner then needs to write to that block:
  It changes the state from O to M
  It then invalidates all other copies
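The sequence on this slide can be traced with a small state table. This is a minimal sketch that models only the two events described here (a remote read of a modified block, then a write by the owner); the function names and the processor labels P0 and P1 are hypothetical, and the other MOESI transitions are deliberately left out.

```python
def read(caches, reader):
    """A processor reads a block that another cache holds in state M."""
    events = []
    for cpu, state in caches.items():
        if cpu != reader and state == "M":
            caches[cpu] = "O"  # owner keeps the dirty data and now shares it
            events.append(f"{cpu}: M->O")
    if caches[reader] == "I":
        caches[reader] = "S"   # requestor receives a shared copy
        events.append(f"{reader}: I->S")
    return events


def write(caches, writer):
    """A processor writes a block: invalidate other copies, then move to M."""
    events = []
    for cpu, state in caches.items():
        if cpu != writer and state != "I":
            caches[cpu] = "I"
            events.append(f"{cpu}: {state}->I (invalidation)")
    if caches[writer] != "M":
        events.append(f"{writer}: {caches[writer]}->M")
        caches[writer] = "M"
    return events


caches = {"P0": "M", "P1": "I"}      # P0 owns a modified block
after_read = read(caches, "P1")      # wrong-path load issued by P1
after_write = write(caches, "P0")    # P0 later writes the block again
print(after_read)                    # ['P0: M->O', 'P1: I->S']
print(after_write)                   # ['P1: S->I (invalidation)', 'P0: O->M']
```

Without the wrong-path load, P0 would have stayed in M throughout and no invalidation would have been sent.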


Cache Block State Transitions

Two extra state transitions occur in the owner's cache:
  When a modified block is requested, its state changes from M to O
  When the owner modifies that block again, the state returns to M


Data/Bus Traffic and Coherence Transactions

Extra traffic is due to wrong-path L1 and L2 cache accesses
The extra replacements, writebacks, invalidations and state transitions add further traffic
Snoop or directory requests also increase traffic


Power Consumption

Wrong-path references cause:
  Unnecessary snoops
  Traffic overhead
  State transition overhead

As a result, power consumption increases.

Example: filtering unnecessary snoops may reduce L2 cache power by 30% (see Moshovos et al.)


Resource Contention

Wrong-path memory accesses compete with correct-path memory accesses for the multiprocessor's resources

Additional cache coherence transactions may cause the service buffers to fill up more often

Result: an increased chance of deadlock


Simulation

SPLASH-2 benchmark suite
em3d simulation benchmark
MOSI and MOESI protocols used
16-processor SPARC v9


Statement based on experiments

Mispredicted branches are resolved before 94% of wrong-path L2 misses complete.

Therefore, whether an L2 cache miss is speculative is usually known before the block is placed into the L2 cache. [REF2]


Reducing Cache Pollution: Filtering

Filtering is applied to the L2 cache.

Observation:
  If a speculatively fetched cache block is not used while it resides in the L1 cache, it is likely that the block will not be used at all, or will not be used before being evicted from the L2 cache

In this mechanism:
  All memory references made by wrong-path instructions or the prefetcher are fetched only into the first-level cache
  The processor monitors whether they are referenced by non-speculative (correct-path) instructions
  Based on the observation above, the processor may choose not to write an unused block into the L2 cache, or may adopt a policy that gives lower priority to the unused speculatively fetched block
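The filtering idea can be sketched as follows. This is a minimal illustration of the first option on the slide (do not write unused speculative blocks into the L2 cache), assuming unbounded caches and explicit eviction calls; the class `FilteredHierarchy` and its method names are hypothetical and not taken from the papers.

```python
class FilteredHierarchy:
    """Toy two-level hierarchy that keeps unused speculative blocks out of L2."""

    def __init__(self):
        self.l1 = {}     # address -> {"speculative": bool, "used": bool}
        self.l2 = set()  # addresses currently held in L2

    def fill(self, addr, speculative):
        # Speculative fills (wrong-path or prefetch) go into L1 only.
        self.l1[addr] = {"speculative": speculative, "used": not speculative}
        if not speculative:
            self.l2.add(addr)

    def correct_path_access(self, addr):
        # A non-speculative reference proves that the block is useful.
        if addr in self.l1:
            self.l1[addr]["used"] = True

    def evict_from_l1(self, addr):
        entry = self.l1.pop(addr)
        if entry["speculative"] and entry["used"]:
            self.l2.add(addr)  # used while in L1: worth keeping in L2
        # Unused speculative blocks are dropped without touching L2.


h = FilteredHierarchy()
h.fill(0xA, speculative=True)
h.fill(0xB, speculative=True)
h.correct_path_access(0xA)  # only 0xA is referenced by correct-path code
h.evict_from_l1(0xA)
h.evict_from_l1(0xB)
print(sorted(h.l2))         # [10]
```

The alternative mentioned on the slide, inserting the unused block into L2 with lower priority, would replace the silent drop in `evict_from_l1` with a low-priority insertion.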


Wrong Path Aware Replacement Policy

When a block is brought into the cache, it is marked as being either on the correct path or on the wrong path

When a block needs to be evicted, wrong-path blocks are evicted first, on an LRU basis if there are multiple wrong-path blocks
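Victim selection under this policy can be sketched in a few lines. This is a minimal sketch, assuming each cache set is a list ordered from least to most recently used and each block carries a `wrong_path` flag; the function name, the data layout, and the fallback to plain LRU when no wrong-path block is present are assumptions for illustration.

```python
def choose_victim(cache_set):
    """Pick the block to evict from a set ordered least recently used first."""
    for block in cache_set:
        if block["wrong_path"]:
            return block["addr"]   # least recently used wrong-path block
    return cache_set[0]["addr"]    # no wrong-path block: fall back to plain LRU


example_set = [
    {"addr": 1, "wrong_path": False},  # least recently used
    {"addr": 2, "wrong_path": True},
    {"addr": 3, "wrong_path": True},   # most recently used
]
print(choose_victim(example_set))      # 2
```

Plain LRU would evict block 1 here; the wrong-path aware policy keeps it and evicts the oldest wrong-path block instead.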


Performance Evaluation


Conclusions & Critique

IPC (instructions per cycle) can be used as the metric
In some cases wrong-path execution positively affects overall performance: mcf, parser, and perlbmk
In some cases it has a significantly negative effect: vpr and gcc
To model or not to model wrong-path references? The question matters especially for future systems with longer memory interconnect latencies and processors with larger instruction windows
The real effect: cache pollution
In the SMP case especially: for a workload with many cache-to-cache transfers, wrong-path memory references can significantly affect the coherence actions
The proposed solutions have not yet been studied in depth


References

[REF1] R. Sendag, A. Yilmazer, J.J. Yi, and A.K. Uht. Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems. IPDPS, 2006.

[REF2] O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.

[REF3] O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.