Post on 22-Dec-2015

Snoopy Coherence Protocols

Small-scale multiprocessors

Assumptions

• broadcast-style interconnect
    – e.g. shared bus, free-space optical, …
    – allows passive listeners

• assume write-back caches
    – invalidation after a write rather than update

• write-through (update protocol) is also possible

Invalidate vs. update

• Write-invalidate protocol:
    – write to shared data: an invalidate is sent to all caches, which snoop and invalidate their copies
    – read miss: snoop caches to find the most recent copy

• Write-update protocol:
    – write to shared data: broadcast on the bus; processors snoop and update any copies
    – read miss: memory is always up to date
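The trade-off between the two policies can be illustrated by counting bus transactions for a simple access pattern. This is a minimal sketch rather than a model of any real machine: the function name `bus_transactions` and the cost model (one bus transaction per read miss, per invalidation broadcast and per update broadcast) are assumptions made for illustration.

```python
def bus_transactions(trace, policy):
    """Count bus transactions for a trace of (cpu, op) accesses to ONE block.

    policy is "invalidate" or "update". Assumed cost model: every read miss,
    invalidation broadcast and update broadcast costs one bus transaction.
    """
    holders = set()   # CPUs that currently hold a valid copy
    owner = None      # under invalidate: CPU holding the block exclusively
    count = 0
    for cpu, op in trace:
        if op == "r":
            if cpu not in holders:
                count += 1            # read miss goes on the bus
                holders.add(cpu)
                owner = None          # block is shared again
        else:                         # op == "w"
            if policy == "update":
                count += 1            # every write is broadcast
                holders.add(cpu)
            elif owner != cpu:
                count += 1            # invalidate the other copies once
                holders = {cpu}
                owner = cpu
            # else: repeated write by the owner stays inside the cache
    return count


private_writes = [(0, "w")] * 10          # one CPU writes ten times
ping_pong = [(0, "w"), (1, "r")] * 5      # producer/consumer sharing
```

Under these assumptions, `private_writes` costs 1 transaction with invalidation and 10 with update, while `ping_pong` costs 10 with invalidation and 6 with update.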

Three-state MSI protocol

• Each block of memory is in one state:
    – Clean in all caches and up-to-date in memory (shared)
    – Dirty in exactly one cache (modified)
    – Not in any cache

• Each cache line is in one state:
    – Modified: cache has the only copy; it is writable and dirty
    – Shared: line can be read
    – Invalid: line contains no valid data

• Read misses put a request on the bus, which the other caches snoop

• Write to a shared block is treated as a miss: it needs a (snoopy) bus transaction

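[Figure: MSI state-transition diagrams over the states I, S and M]

a) Processor actions: transitions labelled Read (miss), Write (hit), Write (miss), Read (hit), Read or write (hit)

b) Bus snooping: transitions labelled Bus write, Bus read – send data to requestor, Bus write – send data to requestor, Bus read, Bus read or write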
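The three-state protocol can be written down as two lookup tables, one for processor actions and one for bus snooping. The sketch below uses the standard MSI transitions; the names `PROCESSOR`, `SNOOP` and `access` are hypothetical, and a write miss is folded into a single "bus write" for brevity (a simplifying assumption).

```python
# Processor side: (state, processor op) -> (next state, bus action or None)
PROCESSOR = {
    ("I", "read"):  ("S", "bus read"),    # read miss
    ("I", "write"): ("M", "bus write"),   # write miss
    ("S", "read"):  ("S", None),          # read hit
    ("S", "write"): ("M", "bus write"),   # write hit on a shared line: treated as a miss
    ("M", "read"):  ("M", None),          # read hit
    ("M", "write"): ("M", None),          # write hit
}

# Snoop side: (state, observed bus op) -> (next state, supplies data?)
SNOOP = {
    ("I", "bus read"):  ("I", False),
    ("I", "bus write"): ("I", False),
    ("S", "bus read"):  ("S", False),
    ("S", "bus write"): ("I", False),
    ("M", "bus read"):  ("S", True),      # send data to requestor
    ("M", "bus write"): ("I", True),      # send data to requestor
}


def access(states, cpu, op):
    """Apply one processor access to the per-cache states of a single block."""
    states = list(states)
    states[cpu], bus_op = PROCESSOR[(states[cpu], op)]
    if bus_op is not None:
        for other in range(len(states)):
            if other != cpu:
                states[other], _ = SNOOP[(states[other], bus_op)]
    return states
```

For two caches starting in `["I", "I"]`, a write by cache 0 gives `["M", "I"]`, a read by cache 1 then gives `["S", "S"]`, and a write by cache 1 gives `["I", "M"]`.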

Example

• assume cache line is initially invalid

• consider two addresses, A1 and A2

• assume A1 and A2 map to the same cache line, but A1 != A2
    – that is, A1 and A2 refer to completely different places in memory, not adjacent (or nearby) addresses that fit within the same block
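How two unrelated addresses can land on the same cache line is easiest to see in a direct-mapped cache. The sketch below is only an illustration: the block size, line count and the concrete values of `A1` and `A2` are assumptions, not figures from the slides.

```python
BLOCK_SIZE = 32    # bytes per cache block (assumed)
NUM_LINES = 256    # lines in a direct-mapped cache (assumed)


def line_index(addr):
    """Cache line that a byte address maps to."""
    return (addr // BLOCK_SIZE) % NUM_LINES


def tag(addr):
    """Tag that distinguishes blocks sharing the same line."""
    return addr // (BLOCK_SIZE * NUM_LINES)


A1 = 0x1000
A2 = A1 + BLOCK_SIZE * NUM_LINES   # exactly one cache size further on

same_line = line_index(A1) == line_index(A2)   # True: the two addresses conflict
different_block = tag(A1) != tag(A2)           # True: they are distinct memory blocks
```

By contrast, `A1` and `A1 + 4` fall within the same block, which is the case the example explicitly excludes.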

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1

Step 1a: Write miss, invalid line

- is A1 cached anywhere?

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1

Step 1b: No other cache responds

- assert ownership

Wait a minute...

• if we only have one type of read transaction (“Bus read”), how do we tell whether memory or another cache is responding?

• the bus cycle allows for an “intervention”
    – more properly, a cache-to-cache intervention
    – a cache pre-empts the bus and answers instead of memory
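An intervention can be sketched as a bus read in which a cache holding the block in the Modified state answers before memory does. The function name `bus_read` and the dictionary-based data layout are assumptions made for this illustration only.

```python
def bus_read(addr, requester, caches, memory):
    """Return (value, source) for a bus read of addr.

    caches: list of dicts mapping addr -> (state, value), one dict per CPU.
    memory: dict mapping addr -> value.
    """
    for cpu, cache in enumerate(caches):
        if cpu == requester:
            continue
        state, value = cache.get(addr, ("I", None))
        if state == "M":
            # Intervention: the owner pre-empts memory and supplies the data.
            # Its write-back also refreshes memory, and its copy becomes shared.
            cache[addr] = ("S", value)
            memory[addr] = value
            return value, f"cache {cpu}"
    # No cache owns a dirty copy, so memory answers.
    return memory[addr], "memory"
```

With cache 0 holding address `0x10` as `("M", 10)` and stale memory, a read by CPU 1 is answered by cache 0; for an address no cache holds dirty, memory answers.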

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10

Step 2: Read hit

- no bus action needed

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1

Step 3a: Read miss

- does anyone have A1 cached?

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1
                    S     A1   10     S     A1   10     Bus write  P1        A1   10     A1   10

Step 3b: Cached elsewhere

- P1 replies

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1
                    S     A1   10     S     A1   10     Bus write  P1        A1   10     A1   10
P2: write 20 to A1  I                 M     A1   20     Bus write  P2        A1

Step 4: Write hit, shared line

- now P2 owns it

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1
                    S     A1   10     S     A1   10     Bus write  P1        A1   10     A1   10
P2: write 20 to A1  I                 M     A1   20     Bus write  P2        A1
P2: write 40 to A2                                      Bus write  P2        A1   20     A1   20

Step 5a: Write miss, A2 maps to the same line as A1

- first, write back the victim

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1
                    S     A1   10     S     A1   10     Bus write  P1        A1   10     A1   10
P2: write 20 to A1  I                 M     A1   20     Bus write  P2        A1
P2: write 40 to A2                                      Bus write  P2        A1   20     A1   20
                                                        Bus read   P2        A2

Step 5b: Service the miss

- does anyone have A2 cached?

Step                P1                P2                Bus                              Memory
                    State Addr Value  State Addr Value  Action     Processor Addr Value  Addr Value
P1: write 10 to A1  I                                   Bus read   P1        A1
                    M     A1   10                       Bus write  P1        A1
P1: read A1         M     A1   10
P2: read A1                           I                 Bus read   P2        A1
                    S     A1   10     S     A1   10     Bus write  P1        A1   10     A1   10
P2: write 20 to A1  I                 M     A1   20     Bus write  P2        A1
P2: write 40 to A2                                      Bus write  P2        A1   20     A1   20
                                                        Bus read   P2        A2
                                      M     A2   40     Bus write  P2        A2

Step 5c: Not cached elsewhere

- like the second half of step 1
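The whole example can be replayed with a small simulator. This is a sketch under stated assumptions, not the slides' own implementation: each processor has a single cache line (so every address maps to it), the class names `Cache` and `Bus` and the address values are hypothetical, and a write miss issues a bus read followed by a bus write, as in the walkthrough.

```python
class Cache:
    def __init__(self):
        self.state, self.addr, self.value = "I", None, None


class Bus:
    def __init__(self, n):
        self.caches = [Cache() for _ in range(n)]
        self.memory = {}   # initial memory contents are left unspecified
        self.log = []      # (action, cpu, addr) tuples, in bus order

    def _snoop(self, cpu, addr, write):
        """Other caches react to a bus read (write=False) or bus write (write=True)."""
        for i, c in enumerate(self.caches):
            if i == cpu or c.addr != addr or c.state == "I":
                continue
            if c.state == "M":
                # The owner intervenes and flushes its dirty data.
                self.memory[addr] = c.value
                self.log.append(("bus write", i, addr))
            c.state = "I" if write else "S"

    def _evict(self, cpu, addr):
        """Make room for addr, writing back a dirty victim first."""
        c = self.caches[cpu]
        if c.addr != addr:
            if c.state == "M":
                self.memory[c.addr] = c.value
                self.log.append(("bus write", cpu, c.addr))
            c.state = "I"

    def read(self, cpu, addr):
        c = self.caches[cpu]
        if c.state != "I" and c.addr == addr:
            return c.value                         # read hit, no bus action
        self._evict(cpu, addr)
        self.log.append(("bus read", cpu, addr))
        self._snoop(cpu, addr, write=False)
        c.state, c.addr, c.value = "S", addr, self.memory.get(addr)
        return c.value

    def write(self, cpu, addr, value):
        c = self.caches[cpu]
        if not (c.state == "M" and c.addr == addr):
            if c.state == "I" or c.addr != addr:   # write miss
                self._evict(cpu, addr)
                self.log.append(("bus read", cpu, addr))
            self.log.append(("bus write", cpu, addr))   # assert ownership
            self._snoop(cpu, addr, write=True)
        c.state, c.addr, c.value = "M", addr, value


A1, A2 = 0x1000, 0x3000    # arbitrary distinct addresses (assumed)
bus = Bus(2)               # index 0 is P1, index 1 is P2
bus.write(0, A1, 10)       # step 1
bus.read(0, A1)            # step 2
bus.read(1, A1)            # step 3
bus.write(1, A1, 20)       # step 4
bus.write(1, A2, 40)       # step 5
```

After the replay, `bus.log` holds eight bus transactions in the same order as the Bus column of the table, P1's line is Invalid, P2 holds A2 in the Modified state with value 40, and memory holds 20 for A1.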

Four-state MESI protocol

• add “exclusive” state

• indicates this is the only cached copy

• no need to broadcast an invalidation on a write hit to an E line

• goal is to reduce bus traffic

• works well for data touched by only one processor, e.g. local variables

[Figure: four-state state-transition diagrams over the states I, E, S and M]

a) Processor actions: transitions labelled Write (hit), Write (miss), Read (hit), Read or write (hit), Write (hit), Read (miss) 1, Read (miss) 2, Read (hit)
    1: data comes from memory
    2: data comes from another cache

b) Bus snooping: transitions labelled Bus read, Bus write, Bus read or write, Bus read, Bus write, Bus write, Bus read
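The processor side of the four-state protocol can be sketched as a single function. It follows the standard MESI transitions; the name `mesi_processor` and the `shared_elsewhere` flag (whether another cache supplied the data on a read miss) are assumptions introduced for this illustration.

```python
def mesi_processor(state, op, shared_elsewhere=False):
    """Return (next_state, bus_action) for a processor access.

    shared_elsewhere matters only on a read miss: True means another cache
    supplied the data (case 2), False means it came from memory (case 1).
    """
    if op == "read":
        if state == "I":
            return ("S" if shared_elsewhere else "E"), "bus read"
        return state, None            # read hit in S, E or M
    if state in ("E", "M"):
        return "M", None              # write hit: E upgrades to M silently
    return "M", "bus write"           # write miss (I) or write hit on S


# A private variable: read once, then written.
state, first = mesi_processor("I", "read")     # ("E", "bus read")
state, second = mesi_processor(state, "write") # ("M", None)
```

The read-then-write sequence costs one bus transaction here, whereas a three-state protocol would need a second one to upgrade the line from S to M.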

Coherence misses

• a new type of miss has been added

• we still have the usual cold, capacity and conflict misses

• now we also have coherence misses

• these occur when a line this cache once held was invalidated by another processor's write; the resulting read miss is serviced from another cache
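The four kinds of miss can be told apart by remembering why a block left the cache. The sketch below assumes the common definition that a coherence miss is a miss on a block lost to an invalidation; the function name `classify_misses` and the event format are hypothetical, and capacity and conflict misses are not separated.

```python
def classify_misses(events):
    """Label each access seen by ONE cache.

    events: sequence of ("access", block), ("evict", block) or
    ("invalidate", block). Returns one label per access event.
    """
    present = set()       # blocks currently in the cache
    seen = set()          # blocks that have ever been in the cache
    invalidated = set()   # blocks removed by a coherence invalidation
    labels = []
    for kind, block in events:
        if kind == "access":
            if block in present:
                labels.append("hit")
            elif block not in seen:
                labels.append("cold")
            elif block in invalidated:
                labels.append("coherence")
            else:
                labels.append("capacity/conflict")
            present.add(block)
            seen.add(block)
            invalidated.discard(block)
        elif kind == "evict":
            present.discard(block)
        elif block in present:            # kind == "invalidate"
            present.discard(block)
            invalidated.add(block)
    return labels


events = [
    ("access", "A"), ("access", "A"),
    ("invalidate", "A"), ("access", "A"),
    ("evict", "A"), ("access", "A"),
]
labels = classify_misses(events)
```

For this event list the labels are cold, hit, coherence and capacity/conflict, in that order.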