![Page 2: Extension Using NoCsparallel.princeton.edu/openpiton/tutorial_slides/asplos18/openpiton... · Cache Coherence Protocol Directory-based MESI coherence Protocol - Four-hop message communication](https://reader034.vdocuments.site/reader034/viewer/2022042917/5f5adf3cd54bbc484256e92c/html5/thumbnails/2.jpg)
Extension Using NoCs
P-Mesh NoC Connected I/O and Accelerators
[Figure: P-Mesh NoC block diagram. Labels: Tile, DRAM (x2), Serializer, Deserializer, DMA, Buffer, Accel, 10 Gigabit Ethernet.]
P-Mesh NoC: packet format
CHIPID: The highest bits indicate whether the destination is on-chip or off-chip; the remaining bits give the chip ID.
XPOS: The position of the destination tile in the X dimension.
YPOS: The position of the destination tile in the Y dimension.
FBITS: The router output port to the destination.
PAYLOAD LENGTH: The number of payload flits.
RESERVED: Reserved bits used by higher-level protocols.
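A minimal Python sketch of how these fields could share one 64-bit header flit. The field widths below (14-bit CHIPID, 8-bit XPOS and YPOS, 4-bit FBITS, 8-bit PAYLOAD LENGTH, 22 low bits left to higher-level protocols) are assumptions for illustration, and `pack_header` / `unpack_header` are hypothetical helpers; the authoritative definitions are in the OpenPiton include files.

```python
# Assumed layout, most significant field first: (name, width in bits).
# The widths are illustrative assumptions, not taken from the slides.
FIELDS = [
    ("chipid", 14),
    ("xpos", 8),
    ("ypos", 8),
    ("fbits", 4),
    ("payload_length", 8),
    ("reserved", 22),
]
assert sum(width for _, width in FIELDS) == 64


def pack_header(**values):
    """Pack named fields into a 64-bit integer; missing fields default to 0."""
    flit = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        flit = (flit << width) | value
    return flit


def unpack_header(flit):
    """Split a 64-bit header flit back into a dict of named fields."""
    fields = {}
    shift = 64
    for name, width in FIELDS:
        shift -= width
        fields[name] = (flit >> shift) & ((1 << width) - 1)
    return fields


header = pack_header(xpos=1, ypos=2, payload_length=2)
print(hex(header), unpack_header(header))
```

Keeping the layout in a single table means a width change only has to be made in one place, which mirrors how a shared definitions file is used in the RTL.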
P-Mesh NoC: .h files
piton/design/include/network_define.h
- Defines header flit bits 63-22 (all fields except message ID, tag, and options 1)
piton/design/include/define.vh
- Defines the rest
Cache Coherence Protocol
Directory-based MESI coherence Protocol
- Four-hop message communication (no direct communication between private L1.5 caches)
- Uses 3 physical NoCs with point-to-point ordering to avoid deadlock
- The directory and L2 are co-located, but state information is maintained separately
- Silent eviction in E and S states
- No acknowledgement is needed upon write-back of dirty lines from L1.5 to L2, but a writeback guard is needed in some cases.
Memory Hierarchy Datapath
[Figure: memory hierarchy datapath. The private L1.5, the distributed shared L2, and the off-chip chipset are connected by NoC1, NoC2, and NoC3.]
NoC Messages
- NoC1 (L1.5 to L2): Load, Store, Ifill, …
- NoC2 (L2 to L1.5/Memory): Downgrade, Inv, Mem Req, …
- NoC3 (L1.5/Memory to L2): DG ack, Inv ack, Mem Reply, …
- NoC2 (L2 to L1.5): Load Ack, Store Ack, …
To avoid deadlock, NoC3 messages are never blocked.
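A small Python sketch of the message-to-NoC assignment. It reads the diagram as: NoC1 carries requests from the L1.5 to the L2, NoC2 carries messages sent by the L2, and NoC3 carries responses returning to the L2. The dictionary and the `nocs_for` helper are illustrative names, not part of OpenPiton.

```python
# NoC carrying each message class, using the labels from the slide.
NOC_OF_MESSAGE = {
    "Load": 1, "Store": 1, "Ifill": 1,          # L1.5 -> L2 requests
    "Downgrade": 2, "Inv": 2, "Mem Req": 2,     # L2 -> L1.5 or memory
    "Load Ack": 2, "Store Ack": 2,              # L2 -> L1.5 replies
    "DG ack": 3, "Inv ack": 3, "Mem Reply": 3,  # L1.5 or memory -> L2
}


def nocs_for(transaction):
    """Return the NoC used by each hop of a transaction, in order."""
    return [NOC_OF_MESSAGE[message] for message in transaction]


# Two four-hop transactions built from the message names above.
load_miss = ["Load", "Mem Req", "Mem Reply", "Load Ack"]
store_with_downgrade = ["Store", "Downgrade", "DG ack", "Store Ack"]

print(nocs_for(load_miss))             # [1, 2, 3, 2]
print(nocs_for(store_with_downgrade))  # [1, 2, 3, 2]
```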
Backup Slides
Coherence Transaction Example
Core 1 issues a load (Ld).
- Core 1 L1.5: I → E
- Core 2 L1.5: I
- L2: I → E
- Messages: ❶ Load (Core 1 L1.5 to L2), ❷ Mem Req (L2 to Memory), ❸ Mem Reply (Memory to L2), ❹ Data Ack (L2 to Core 1 L1.5)
Coherence Transaction Example (2)
Core 2 issues a store (St) after Core 1's load (Ld).
- Core 1 L1.5: E → I
- Core 2 L1.5: I → M
- L2: E → M
- Messages: ❶ Store (Core 2 L1.5 to L2), ❷ Downgrade (L2 to Core 1 L1.5), ❸ DG Ack (Core 1 L1.5 to L2), ❹ Data Ack (L2 to Core 2 L1.5)
Coherence Transaction Example (3)
Core 2 writes back (Wb) the line it stored to.
- Core 1 L1.5: I
- Core 2 L1.5: M → I
- L2: M → I
- Messages: ❶ WbGuard, ❷ Writeback (both Core 2 L1.5 to L2)
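The three transaction examples can be replayed with a toy Python model of a single cache line. This is only a sketch: `ToyDirectory` and its method names are hypothetical, and it models just the three cases shown (uncached load, store while another core holds the line in E, write-back from M), not the full protocol.

```python
class ToyDirectory:
    """One cache line: per-core L1.5 state plus the directory state at the L2."""

    def __init__(self, cores):
        self.l15 = {core: "I" for core in cores}
        self.directory = "I"
        self.messages = []

    def load(self, core):
        # Example 1: load miss while the line is cached nowhere.
        if self.directory != "I":
            raise NotImplementedError("only the uncached load case is modelled")
        self.messages += ["Load", "Mem Req", "Mem Reply", "Data Ack"]
        self.l15[core] = "E"
        self.directory = "E"

    def store(self, core):
        # Example 2: store while exactly one other core holds the line in E.
        owners = [c for c, s in self.l15.items() if s == "E" and c != core]
        if self.directory != "E" or len(owners) != 1:
            raise NotImplementedError("only the single remote E owner case is modelled")
        self.messages += ["Store", "Downgrade", "DG Ack", "Data Ack"]
        self.l15[owners[0]] = "I"
        self.l15[core] = "M"
        self.directory = "M"

    def write_back(self, core):
        # Example 3: dirty line written back, announced by a writeback guard.
        if self.l15[core] != "M":
            raise NotImplementedError("only write-back from M is modelled")
        self.messages += ["WbGuard", "Writeback"]
        self.l15[core] = "I"
        self.directory = "I"


line = ToyDirectory(["core1", "core2"])
line.load("core1")        # core1: I -> E, directory: I -> E
line.store("core2")       # core1: E -> I, core2: I -> M, directory: E -> M
line.write_back("core2")  # core2: M -> I, directory: M -> I
print(line.l15, line.directory, line.messages)
```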
Adding to OpenPiton
• AXI-Lite
• Wishbone
• Interfacing with the Network on Chip
Hooking up an AXI-Lite device
Interfacing with the Networks-on-Chip
1. Packet format
– Highlighting key packet fields
2. Definition files
– .h files
3. Instantiations in Verilog design
NoC: packet format
64-bit flits
1 packet header (64b) + X packet payload flits (64b * X)
Ex: Cache request from L1.5 to L2
- Header flit + req. address flit + metadata flit
Ex: Cache response from L2 to L1.5
- Header flit + 2x data flits (16B cache line)
Ex: Instruction cache response
- Header flit + 4x data flits (32B cache line)
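The flit counts in these examples follow from the 64-bit flit size. A short Python sketch of the arithmetic, with `packet_flits` as an illustrative helper name:

```python
import math

FLIT_BYTES = 8  # one 64-bit flit


def packet_flits(payload_bytes):
    """Total flits in a packet: one header flit plus the payload flits."""
    return 1 + math.ceil(payload_bytes / FLIT_BYTES)


print(packet_flits(16))  # 16B cache line: header + 2 data flits = 3
print(packet_flits(32))  # 32B cache line: header + 4 data flits = 5
```

A cache request (header, request address flit, metadata flit) is likewise three flits, the same as a packet carrying 16 bytes of payload.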
NoC: instantiations
piton/design/chip/rtl/chip.v.pyv
- Chip-wide connections between tiles
- Auto-generated using PyHP
NoC: instantiations
piton/design/chip/tile/rtl/tile.v.pyv
- Instantiation of NoC1/2/3
- Selectable between router and crossbar design
Cache Coherence Protocol
Directory-based MESI coherence Protocol
- Four-hop message communication (no direct communication between private L1.5 caches)
- Uses 3 physical NoCs with point-to-point ordering to avoid deadlock
[Figure: four-hop read across L1.5, L2, L1.5. Labels: Req I->S, Dir M->S, Owner M->S; messages ReqRd, FwdRd, FwdRdAck, AckDt.]
Cache Coherence Protocol (2)
Directory-based MESI coherence Protocol
- The directory and L2 are co-located, but state information is maintained separately
[Figure: L2 entry layout with fields L2 State, Dir State, Tag, Data, Sharer List.]
Cache Coherence Protocol (3)
Directory-based MESI coherence Protocol
- Silent eviction in E and S states
- No need for acknowledgement upon write-back of dirty lines from L1.5 to L2
[Figure: eviction cases. Labels: Req S->I, Req E->I, Req M->I, Dir M->I / E->I; messages WbGuard, Wb.]
Example: Add an on-chip accelerator
1. Implement the NoC interface for the accelerator
2. Design and implement the control flow for the accelerator
– Use interrupt packets to init and stop the accelerator
– Use special load and stores to config the accelerator
– Follow the coherence protocol if a coherent cache is maintained
3. Connect the accelerator to NoCs and assign it a new tile ID
4. Modify the OS code to init the accelerator if needed
5. Write tests to test the accelerator
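For step 1, one thing an accelerator's NoC interface has to do is split the incoming flit stream into packets, using the payload length carried in each header flit. The Python sketch below models that framing in software. The bit position and width of the payload length field are assumptions, and `split_packets` is a hypothetical helper; a real interface would implement this in RTL against the definitions in the OpenPiton include files.

```python
from itertools import islice

PAYLOAD_LEN_SHIFT = 22   # assumed position of PAYLOAD LENGTH in the header flit
PAYLOAD_LEN_MASK = 0xFF  # assumed 8-bit field


def split_packets(flits):
    """Group a flat sequence of 64-bit flits into (header, payload) packets."""
    packets = []
    stream = iter(flits)
    for header in stream:
        length = (header >> PAYLOAD_LEN_SHIFT) & PAYLOAD_LEN_MASK
        # Consume exactly `length` payload flits following the header.
        payload = list(islice(stream, length))
        if len(payload) != length:
            raise ValueError("flit stream ended in the middle of a packet")
        packets.append((header, payload))
    return packets


two_flit_header = 2 << PAYLOAD_LEN_SHIFT  # header announcing 2 payload flits
no_payload_header = 0                     # header announcing no payload
stream = [two_flit_header, 0xAAAA, 0xBBBB, no_payload_header]
print(split_packets(stream))
```

A test for step 5 could feed such a stream to the accelerator's interface and compare the packets it reports against this reference model.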