![Page 1: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/1.jpg)
Lecture 13: System Interface
Tushar Krishna
Assistant Professor, School of Electrical and Computer Engineering, Georgia Institute of Technology
ECE 8823 A / CS 8803 - ICN: Interconnection Networks, Spring 2017
http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/
![Page 2: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/2.jpg)
Network Architecture
- Topology
  - How to connect the nodes
  - ~Road network
- Routing
  - Which path should a message take
  - ~Series of road segments from source to destination
- Flow Control
  - When does the message have to stop/proceed
  - ~Traffic signals at end of each road segment
- Router Microarchitecture
  - How to build the routers
  - ~Design of traffic intersection (number of lanes, algorithm for turning red/green)

How does the NoC interface with the rest of the system?

February 27, 2017 | ICN | Spring 2017 | L13: System Interface © Tushar Krishna, School of ECE, Georgia Tech
![Page 3: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/3.jpg)
Network Interface
[Figure labels: Core, L1I$, L1D$, L2$, L3$/Directory, Network Interface, Router]
So far we have focused inside the network: routers, their connections, and routing + flow-control protocols for communication between them
Let’s go up the stack
![Page 4: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/4.jpg)
NIC Microarchitecture
[Figure labels: Interface to Cache Controller; Network Interface (Ingress, Egress); Packetizer; De-Packetizer; Routing Unit*; VC Select; Backpressure; Router with ports N, S, E, W, L, crossbar (X), and Switch Allocator]
*Source Routing or Lookahead Routing
![Page 5: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/5.jpg)
Interface to Core/Cache Controllers
- Industry Standard Interfaces (in MPSoCs)
  - AMBA AXI (ARM)
    - AMBA 4 ACE (AXI Coherence Extensions)
    - AMBA 5 CHI (Coherent Hub Interface)
  - OCP (Sonics)
  - STBus (STMicroelectronics)
  - Wishbone (OpenCores)
- Custom Interfaces (in CMPs)
  - Intel
  - AMD
  - IBM
![Page 6: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/6.jpg)
Communication Protocols
- Message Passing
  - Explicit movement of data between nodes and address spaces
  - Programmers manage communication
- Shared Memory
  - Communication occurs implicitly through loads/stores and instruction accesses
  - Cache misses are serviced by the NoC
  - We will focus on NoCs for shared memory systems
![Page 7: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/7.jpg)
Cache Controller → NIC Interface: Miss Status Handling Register (MSHR)
[Figure labels: Core; Cache; Protocol Finite State Machine; MSHRs (Status, Addr, Data); Cache Request (Type, Addr, Data); Reply (Type, Addr, Data); Request (Addr); Message Format and Send, to network: (Dest, RdReq, Addr) and (Dest, Writeback, Addr, Data); Message Receive, from network: (RdReply, Addr, Data); Network Interface]
On a cache miss, allocate an entry in the MSHR and send a request into the NoC.
The response is drained by the MSHR.
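The miss path described above can be sketched in a few lines of Python. This is only an illustration, not the slide's hardware: the class names `MSHREntry` and `MSHRFile`, the fixed entry count, and the choice to merge a second miss to the same address into the outstanding entry are all assumptions. The message tuple mirrors the (Dest, RdReq, Addr) format from the figure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class MSHREntry:
    status: str                    # "pending" until the reply arrives
    addr: int                      # cache-line address of the miss
    data: Optional[bytes] = None   # filled in by the reply


class MSHRFile:
    """Hypothetical MSHR file between a cache controller and the NIC."""

    def __init__(self, num_entries: int) -> None:
        self.num_entries = num_entries
        self.entries: Dict[int, MSHREntry] = {}
        self.to_network: List[Tuple[int, str, int]] = []  # (dest, type, addr)

    def on_miss(self, addr: int, home_node: int) -> bool:
        """Allocate an entry and queue a RdReq; False means the miss must stall."""
        if addr in self.entries:
            return True            # assumed policy: merge with the outstanding miss
        if len(self.entries) >= self.num_entries:
            return False           # all MSHRs busy
        self.entries[addr] = MSHREntry(status="pending", addr=addr)
        self.to_network.append((home_node, "RdReq", addr))
        return True

    def on_reply(self, addr: int, data: bytes) -> MSHREntry:
        """Drain a RdReply from the network and free the entry."""
        entry = self.entries.pop(addr)
        entry.status = "done"
        entry.data = data
        return entry
```

The number of MSHR entries bounds how many requests one cache can have in flight, which in turn bounds its injection into the NoC.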
![Page 8: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/8.jpg)
Shared Memory Systems
[Figure labels: Core; L1 I/D Cache; L2 Cache (Tags, Data, Controller Logic); Router]
Slide Courtesy: N. Jerger, Univ of Toronto
![Page 9: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/9.jpg)
Shared Memory Network for CMPs
- Logically…
  - all processors access same shared memory
- Practically…
  - cache hierarchies reduce access latency to improve performance
- Requires cache coherence protocol
  - to maintain coherent view in presence of multiple shared copies
![Page 10: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/10.jpg)
Hardware Cache Coherence
- Snoopy Protocol
  - Broadcast Rd/Wr request over a shared bus
  - Every cache snoops request
  - If some other cache is writing, invalidate self copy

[Figure: processors P1 to P4, each with a cache ($), plus Mem and a Memory Controller on a shared Bus. Steps: (1) read cache miss, (2) request broadcast, (3) send data.]
![Page 11: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/11.jpg)
Hardware Cache Coherence
- Directory Protocol
  - Send a Rd/Wr request to a directory
  - Directory tracks dirty copy and sharers, and manages data response and invalidates

[Figure: processors P1 to P4, each with a cache ($), connected by an Interconnection Network to Mem and a Directory. Steps: (1) read cache miss, (2) directory receives request, (3) send data.]
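A toy model may help make the directory's bookkeeping concrete. The sketch below is an assumption-laden illustration, not a real protocol: the message names (`Data`, `Inv`, `Fwd_Rd`, `Fwd_Wr`) and the rule that the owner keeps ownership after a forwarded read are invented for this example. It only shows how tracking an owner and a sharer set lets the directory send unicast messages instead of broadcasting.

```python
from typing import Dict, List, Optional, Set, Tuple

Message = Tuple[str, int, int]  # (type, destination node, address)


class Directory:
    """Toy directory: one owner (dirty copy) and a sharer set per cache line."""

    def __init__(self) -> None:
        self.owner: Dict[int, Optional[int]] = {}
        self.sharers: Dict[int, Set[int]] = {}

    def read(self, requestor: int, addr: int) -> List[Message]:
        owner = self.owner.get(addr)
        self.sharers.setdefault(addr, set()).add(requestor)
        if owner is not None and owner != requestor:
            # A dirty copy lives in another cache: forward the request there.
            return [("Fwd_Rd", owner, addr)]
        return [("Data", requestor, addr)]  # no on-chip owner: data comes from memory

    def write(self, requestor: int, addr: int) -> List[Message]:
        owner = self.owner.get(addr)
        others = self.sharers.get(addr, set()) - {requestor, owner}
        msgs: List[Message] = [("Inv", s, addr) for s in sorted(others)]
        if owner is not None and owner != requestor:
            msgs.append(("Fwd_Wr", owner, addr))   # owner supplies data, then invalidates
        else:
            msgs.append(("Data", requestor, addr))
        self.owner[addr] = requestor
        self.sharers[addr] = {requestor}
        return msgs
```

For example, after nodes 1 and 2 read a line, a write from node 3 produces two invalidates and one data message, regardless of how many other nodes exist in the system.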
![Page 12: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/12.jpg)
Cache Organization: Private L2
[Figure labels: P, Private L1, Private L2 slice, Router, Memory Controller, Directory]
![Page 13: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/13.jpg)
Cache Organization: Shared distributed L2
[Figure labels: P, Private L1, Directory slice, Shared L2 slice, Router, Memory Controller]
Non Uniform Cache Access (NUCA)
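To see why access latency is non-uniform with a shared distributed L2, consider a minimal sketch under two assumptions that the slide does not state: consecutive cache lines are interleaved across L2 banks, and tiles are numbered row-major on a mesh with one bank per tile. The function names are hypothetical.

```python
def home_bank(addr: int, num_banks: int, line_bytes: int = 64) -> int:
    """Assumed mapping: interleave consecutive cache lines across L2 banks."""
    return (addr // line_bytes) % num_banks


def mesh_hops(src: int, dst: int, cols: int) -> int:
    """Manhattan distance between two tiles of a row-major mesh."""
    sx, sy = src % cols, src // cols
    dx, dy = dst % cols, dst // cols
    return abs(sx - dx) + abs(sy - dy)


def l2_access_hops(core: int, addr: int, cols: int, rows: int) -> int:
    """Hops from a core's tile to the shared-L2 bank that homes `addr`."""
    return mesh_hops(core, home_bank(addr, cols * rows), cols)
```

On a 4x4 mesh, a core at tile 0 reaches the bank for address 0x3C0 (bank 15) in 6 hops, while a core at tile 5 reaches the bank for address 0x140 (bank 5) in 0 hops: the same L2 has very different latencies depending on the address.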
![Page 14: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/14.jpg)
Cache Organization and Coherence Protocol Impact Network Performance
- Cache organization shapes injection into the network
  - Private L2 caches
    - (+) L1 miss → L2 miss → traffic in the NoC [low miss penalty]
    - (-) Data replication between L2s ⇒ overall lower cache capacity
  - Shared L2 caches
    - (+) Data can only exist in one L2 bank ⇒ higher cache capacity
    - (-) L1 miss → traffic in the NoC to go to L2 bank [increased miss penalty]
- Coherence protocol shapes NoC bandwidth requirement
  - Snoopy protocol → more messages
  - Directory protocol → fewer messages
  - Message types
    - Data requests
    - Data responses
    - Coherence permissions
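A rough counting model illustrates the "more messages" versus "fewer messages" contrast. It is a simplification of my own, not a measurement: it counts one message per destination of a broadcast, ignores unblock messages, and assumes each invalidated sharer returns one ACK.

```python
def snoopy_messages(num_nodes: int) -> int:
    """Request delivered to every other node, plus one data response."""
    return (num_nodes - 1) + 1


def directory_messages(num_sharers: int, is_write: bool, owner_on_chip: bool) -> int:
    """Request to home, optional forward to an on-chip owner, one data response,
    and for writes one invalidate plus one ACK per sharer."""
    msgs = 1                              # requestor -> home node
    msgs += 1 if owner_on_chip else 0     # home node -> owner
    msgs += 1                             # data response to requestor
    if is_write:
        msgs += 2 * num_sharers           # invalidates and their ACKs
    return msgs
```

Under this model a read in a 64-node system costs 64 messages with snooping but 3 with a directory when an on-chip owner supplies the data; the snoopy count grows with system size while the directory count grows only with the number of sharers.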
![Page 15: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/15.jpg)
Cache Coherence (Private L1 + Private/Shared L2)
[Figure: two message-flow diagrams, "On-Chip Hit" and "On-Chip Miss". Nodes: Requestor, Home Node, Sharer, Owner, Memory Controller. Arrows are numbered by message class.]
Annotations: Private Cache (L1 or L2); Directory/Ordering Point (could be the Memory Controller itself); Broadcast (if snoopy protocol) or unicast/multicast (if directory); Invalidate ACK if this was a Write Request, not required if Read Request.
Legend: 1 Req, 2 Fwd, 3 Resp, 4 Unblock. Req/Ctrl (1-flit); Resp/Data (1 or 5-flit).
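The numbered legend (1 Req, 2 Fwd, 3 Resp, 4 Unblock) can be written out as explicit message sequences. The exact arrows of the figure are not recoverable from this transcript, so the endpoints below are assumptions: forwards go from the home node to the listed targets, responses go straight back to the requestor, and the memory controller answers the requestor directly on an on-chip miss.

```python
from typing import List, Tuple

Msg = Tuple[int, str, str, str]  # (step, message class, source, destination)


def on_chip_hit(forward_targets: List[str]) -> List[Msg]:
    """Assumed sequence when other on-chip caches respond."""
    msgs: List[Msg] = [(1, "Req", "Requestor", "HomeNode")]
    msgs += [(2, "Fwd", "HomeNode", t) for t in forward_targets]
    msgs += [(3, "Resp", t, "Requestor") for t in forward_targets]
    msgs.append((4, "Unblock", "Requestor", "HomeNode"))
    return msgs


def on_chip_miss() -> List[Msg]:
    """Assumed sequence when the line must come from memory."""
    return [
        (1, "Req", "Requestor", "HomeNode"),
        (2, "Fwd", "HomeNode", "MemoryController"),
        (3, "Resp", "MemoryController", "Requestor"),
        (4, "Unblock", "Requestor", "HomeNode"),
    ]
```

Calling `on_chip_hit(["Sharer", "Owner"])` yields six messages: one request, two forwards, two responses, and one unblock.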
![Page 16: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/16.jpg)
Implications of Shared Memory Traffic on NoC Design
- Virtual Networks
  - 3-4 protocol message classes
    - request, forward, response, unblock
  - => 3-4 virtual networks in NoC
    - response and unblock guaranteed to drain (can share vnet)
  - There might be additional message classes (=> virtual networks) for “non-cacheable requests”
    - DMA, synchronization, setup, …
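The "3-4 virtual networks" count follows directly from whether response and unblock share a vnet. A minimal sketch, with hypothetical function names and arbitrary vnet ids chosen only for illustration:

```python
from typing import Dict


def assign_vnets(share_response_unblock: bool) -> Dict[str, int]:
    """Map the four protocol message classes to virtual-network ids."""
    vnets = {"request": 0, "forward": 1, "response": 2}
    # Response and unblock are guaranteed to drain, so they may share a vnet.
    vnets["unblock"] = 2 if share_response_unblock else 3
    return vnets


def num_vnets(vnets: Dict[str, int]) -> int:
    """Number of distinct virtual networks the NoC must provide."""
    return len(set(vnets.values()))
```

With sharing enabled the NoC needs 3 virtual networks; without it, 4. Extra classes such as DMA would add further entries to the mapping.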
![Page 17: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/17.jpg)
Implications of Shared Memory Traffic on NoC Design
- Flit Size and VC depth
  - Control Packets: request, forward, response_ACK, and unblock
    - size links such that control packets fit in 1 flit
  - Data Packets: response_DATA
    - Suppose 64B cache line, 16B flits: data packets are 5 flits
      - 1 flit for control information (header etc.)
      - 4 flits for the cache line (64B cache line, 16B flits)
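The 5-flit figure is simple arithmetic: one header flit plus the cache line divided into flit-sized pieces. A small helper (the function name and the single-header-flit default are assumptions matching the slide's example) makes it easy to try other link widths:

```python
def packet_flits(payload_bytes: int, flit_bytes: int, header_flits: int = 1) -> int:
    """Header flits plus the payload rounded up to whole flits."""
    payload_flits = -(-payload_bytes // flit_bytes)  # integer ceiling division
    return header_flits + payload_flits
```

With a 64B line and 16B flits this gives 1 + 4 = 5 flits for a data packet and 1 flit for a control packet with no payload; widening the flits to 32B would shorten data packets to 3 flits.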
![Page 18: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/18.jpg)
Example: MOESI_hammer (AMD Opteron) protocol in gem5
- Message Classes
  - src/mem/protocol/MOESI_hammer-cache.sm
  - src/mem/protocol/MOESI_hammer-dir.sm

// Cache Controller
MessageBuffer * requestFromCache, network="To", virtual_network="2", vnet_type="request";
MessageBuffer * responseFromCache, network="To", virtual_network="4", vnet_type="response";
MessageBuffer * unblockFromCache, network="To", virtual_network="5", vnet_type="unblock";
MessageBuffer * forwardToCache, network="From", virtual_network="3", vnet_type="forward";
MessageBuffer * responseToCache, network="From", virtual_network="4", vnet_type="response";

// Directory Controller
MessageBuffer * forwardFromDir, network="To", virtual_network="3", vnet_type="forward";
MessageBuffer * responseFromDir, network="To", virtual_network="4", vnet_type="response";
MessageBuffer * dmaResponseFromDir, network="To", virtual_network="1", vnet_type="response";
MessageBuffer * unblockToDir, network="From", virtual_network="5", vnet_type="unblock";
MessageBuffer * responseToDir, network="From", virtual_network="4", vnet_type="response";
MessageBuffer * requestToDir, network="From", virtual_network="2", vnet_type="request";
MessageBuffer * dmaRequestToDir, network="From", virtual_network="0", vnet_type="request";
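Grouping the buffer declarations above by virtual network shows how each message class gets its own vnet, with DMA traffic on vnets 0 and 1. The Python below is only a tabulation of the values listed on the slide; it is not gem5 code, and the helper name `buffers_by_vnet` is made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (buffer name, network direction, virtual network, vnet_type), copied from the slide.
BUFFERS: List[Tuple[str, str, int, str]] = [
    ("requestFromCache", "To", 2, "request"),
    ("responseFromCache", "To", 4, "response"),
    ("unblockFromCache", "To", 5, "unblock"),
    ("forwardToCache", "From", 3, "forward"),
    ("responseToCache", "From", 4, "response"),
    ("forwardFromDir", "To", 3, "forward"),
    ("responseFromDir", "To", 4, "response"),
    ("dmaResponseFromDir", "To", 1, "response"),
    ("unblockToDir", "From", 5, "unblock"),
    ("responseToDir", "From", 4, "response"),
    ("requestToDir", "From", 2, "request"),
    ("dmaRequestToDir", "From", 0, "request"),
]


def buffers_by_vnet(buffers: List[Tuple[str, str, int, str]]) -> Dict[int, List[str]]:
    """Group buffer names by the virtual network they are attached to."""
    table: Dict[int, List[str]] = defaultdict(list)
    for name, _direction, vnet, _vnet_type in buffers:
        table[vnet].append(name)
    return dict(table)
```

Each "To" buffer on a controller pairs with a "From" buffer of the same vnet on the other controller, for example `requestFromCache` and `requestToDir` on vnet 2.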
![Page 19: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/19.jpg)
Example: MOESI_hammer (AMD Opteron) protocol in gem5
- Message Types
  - src/mem/protocol/MOESI_hammer-msg.sm

// CoherenceRequestType
enumeration(CoherenceRequestType, desc="...") {
  GETX, desc="Get eXclusive";
  GETS, desc="Get Shared";
  MERGED_GETS, desc="Get Shared";
  PUT, desc="Put Ownership";
  WB_ACK, desc="Writeback ack";
  WB_NACK, desc="Writeback neg. ack";
  PUTF, desc="PUT on a Flush";
  GETF, desc="Issue exclusive for Flushing";
  BLOCK_ACK, desc="Dir Block ack";
  INV, desc="Invalidate";
}

// CoherenceResponseType
enumeration(CoherenceResponseType, desc="...") {
  ACK, desc="ACKnowledgment, responder does not have a copy";
  ACK_SHARED, desc="ACKnowledgment, responder has a shared copy";
  DATA, desc="Data, responder does not have a copy";
  DATA_SHARED, desc="Data, responder has a shared copy";
  DATA_EXCLUSIVE, desc="Data, responder was exclusive, gave us a copy, and they went to invalid";
  WB_CLEAN, desc="Clean writeback";
  WB_DIRTY, desc="Dirty writeback";
  WB_EXCLUSIVE_CLEAN, desc="Clean writeback of exclusive data";
  WB_EXCLUSIVE_DIRTY, desc="Dirty writeback of exclusive data";
  UNBLOCK, desc="Unblock for writeback";
  UNBLOCKS, desc="Unblock now in S";
  UNBLOCKM, desc="Unblock now in M/O/E";
  NULL, desc="Null value";
}
![Page 20: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/20.jpg)
Implications of Traffic on NoC Design
- Design-time
  - Placement of Cores/Caches/Memory Controllers
  - Homogeneous / “General Purpose CMP”
    - Typical assumptions: each tile has 1 (or 2) cores, private L1 data + instruction cache, private/shared L2 slice, directory
    - What about memory controllers?
      - If one or two memory controllers, usually on one end of the chip
      - What if there are more? (next)
  - Heterogeneous / “Application Specific SoC”
    - later in the course
- Runtime (usually done by the OS)
  - Mapping of threads/tasks to cores
  - Mapping of data across caches
![Page 21: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/21.jpg)
Logistics
- Lab 4 [5 points]: Due coming Sunday (March 4)
  - Full-System Simulations for PARSEC benchmarks
    - 2 coherence protocols
    - 2 NoC Configurations: 1-cycle and 5-cycle routers
  - Study Impact of Network Delay on Full-system Runtime
- Proposal Presentation [7 points] (March 15)
  - Milestone I: Motivation Graphs [5 points]
  - Proposed Plan + Timeline [2 points]
![Page 22: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/22.jpg)
Paper Discussion
- “Achieving Predictable Performance through Better Memory Controller Placement in Many-Core CMPs”
  - Dennis Abts, Natalie Enright Jerger, John Kim, Dan Gibson, Mikko Lipasti, ISCA 2009
![Page 23: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/23.jpg)
Discussion Points
- Summary of paper
- Why do you think row0_7 and col0_7 are popular?
  - What’s the problem with this placement?
- Simulation Methodology?
  - Channel Load for Random Traffic
  - Why genetic algorithm?
- Routing Algorithms
  - XY, YX, XY+YX, CDR
  - Deadlock avoidance?
- Challenges with other placements?
- 2 strengths
- 2 weaknesses
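As a reminder for the routing-algorithm discussion point, XY and YX dimension-ordered routing differ only in which dimension is traversed first. The sketch below is a generic illustration on a 2D mesh with (x, y) coordinates; it is not taken from the paper, and the helper names are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in the mesh


def _walk(start: int, end: int) -> List[int]:
    """Coordinates visited when moving from start to end, excluding start."""
    step = 1 if end >= start else -1
    return list(range(start + step, end + step, step))


def xy_route(src: Coord, dst: Coord) -> List[Coord]:
    """Traverse the X dimension fully, then the Y dimension."""
    path = [(x, src[1]) for x in _walk(src[0], dst[0])]
    path += [(dst[0], y) for y in _walk(src[1], dst[1])]
    return path


def yx_route(src: Coord, dst: Coord) -> List[Coord]:
    """Traverse the Y dimension fully, then the X dimension."""
    path = [(src[0], y) for y in _walk(src[1], dst[1])]
    path += [(x, dst[1]) for x in _walk(src[0], dst[0])]
    return path
```

Both routes have the same length but use different links, which is why the choice between them changes the channel load near the memory controllers.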
![Page 24: Lecture 13: System Interface - Georgia Institute of Technologytusharkrishna.ece.gatech.edu/wp-content/uploads/sites/175/2016/10/... · Lecture 13: System Interface Tushar Krishna](https://reader031.vdocuments.site/reader031/viewer/2022022118/5cc661bf88c99384138c7a09/html5/thumbnails/24.jpg)
Results
Traffic Types: Req-only, Rep-only, Req+Rep