interconnection mechanisms
Interconnection Mechanisms
Performance Models
Connecting Processors and Memories
• Shared Buses
• Interconnection Networks
  • Static Networks
  • Dynamic Networks
slide 2
[Figure: arrangements of processors (P) and memories (M) connected through an Interconnection Network and a Global Interconnection Network]
Shared Bus
slide 3
each processor sees this picture: processing alternating with bus access
bus utilization $\rho = \dfrac{\text{bus transaction time}}{\text{processing time} + \text{bus transaction time}}$
prob of a processor using the bus = $\rho$
prob of a processor not using the bus = $1 - \rho$
prob of none of the n processors using the bus = $(1 - \rho)^n$
prob of at least one processor using the bus = $1 - (1 - \rho)^n$
achieved BW on a relative scale = $1 - (1 - \rho)^n$
required BW = $n\rho$, available BW = 1
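A minimal Python sketch of the bandwidth expressions on this slide. The name `rho` for the per-processor probability of using the bus is an assumption (the Greek symbol does not survive in the transcript), and the function names are illustrative only.

```python
def required_bw(rho: float, n: int) -> float:
    """Bandwidth the n processors ask for, on a scale where the bus offers 1."""
    return n * rho


def achieved_bw(rho: float, n: int) -> float:
    """Prob that at least one of n processors uses the bus: 1 - (1 - rho)^n."""
    return 1.0 - (1.0 - rho) ** n


if __name__ == "__main__":
    rho = 0.3  # example value, not taken from the slides
    for n in (2, 3, 4):
        print(n, required_bw(rho, n), achieved_bw(rho, n))
```

The achieved value never exceeds 1 (the available BW), however large the required BW becomes.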
Effect of re-submitted requests
slide 4
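[Figure: Markov graph with two states A and W (W: request rejected and waiting to be re-submitted). Transitions: A to A with prob $1 - \rho + \rho P_A$, A to W with prob $\rho(1 - P_A)$, W to A with prob $P_A$, W to W with prob $1 - P_A$. Here $\rho$ is the prob of a processor using the bus and $P_A$ is the prob of a request being accepted.]
prob of state A = $q_A$, prob of state W = $q_W$
$q_A + q_W = 1$, also $q_W P_A = q_A\,\rho\,(1 - P_A)$
$q_A = \dfrac{P_A}{P_A + \rho(1 - P_A)}$, $q_W = \dfrac{\rho(1 - P_A)}{P_A + \rho(1 - P_A)}$
actual request rate $a = \rho\,q_A + q_W = \dfrac{\rho}{\rho + P_A(1 - \rho)}$
$BW = 1 - (1 - a)^n$, $P_A = \dfrac{BW}{n a} = \dfrac{1 - (1 - a)^n}{n a}$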
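The re-submission model can be solved numerically by iterating between the actual request rate and the acceptance probability. The sketch below assumes one reading of this slide: actual request rate a = rho / (rho + P_A (1 - rho)), BW = 1 - (1 - a)^n and P_A = BW / (n a). The symbol names and the simple fixed-point iteration are assumptions, not something stated in the slides.

```python
def bus_with_resubmission(rho: float, n: int, iters: int = 200):
    """Iterate a -> P_A -> a until the actual request rate settles.

    rho : prob that a processor issues a new bus request (assumed symbol)
    n   : number of processors
    Returns (a, p_a, bw): actual request rate, acceptance prob, achieved BW.
    """
    a = rho  # start from the rate without re-submissions
    for _ in range(iters):
        bw = 1.0 - (1.0 - a) ** n
        p_a = bw / (n * a) if a > 0 else 1.0
        a = rho / (rho + p_a * (1.0 - rho))
    bw = 1.0 - (1.0 - a) ** n
    p_a = bw / (n * a) if a > 0 else 1.0
    return a, p_a, bw


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, bus_with_resubmission(0.3, n))
```

With a single processor there are no conflicts, so the sketch returns a = rho and P_A = 1.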
Shared Bus : BW per proc
[Plot: BW achieved vs. BW required (req probability, 0 to 1); two sets of curves for n = 2, 3, 4]
Shared Bus : utilization
[Plot: utilization vs. req probability (0 to 1); two sets of curves for n = 2, 3, 4]
Waiting time
slide 7
if request is rejected $i$ times and accepted on the $(i+1)$th attempt:
waiting time = $i\,T_{bus}$
probability of this = $(1 - P_A)^i P_A$
Expected value of waiting time $T_w = \sum_{i=0}^{\infty} i\,T_{bus}\,(1 - P_A)^i P_A = \dfrac{1 - P_A}{P_A}\,T_{bus}$
(using $\sum_{i \ge 0} i\,x^i = \dfrac{x}{(1 - x)^2}$ with $x = 1 - P_A$)
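A small numerical check of the expected waiting time, assuming the reading that a request rejected i times waits i bus cycles with probability (1 - P_A)^i P_A. The function names and the truncation of the infinite sum are illustrative choices.

```python
def expected_wait_series(p_a: float, t_bus: float, terms: int = 10_000) -> float:
    """Truncated sum of i * T_bus * (1 - P_A)^i * P_A over i = 0 .. terms-1."""
    return sum(i * t_bus * (1.0 - p_a) ** i * p_a for i in range(terms))


def expected_wait_closed(p_a: float, t_bus: float) -> float:
    """Closed form of the same sum: T_bus * (1 - P_A) / P_A."""
    return t_bus * (1.0 - p_a) / p_a


if __name__ == "__main__":
    for p_a in (0.25, 0.5, 0.9):
        print(p_a, expected_wait_series(p_a, 1.0), expected_wait_closed(p_a, 1.0))
```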
Switched Networks
BUS
• Shared media
• Lower cost
• Lower throughput
• Scalability poor
Switched Network
• Switched paths
• Higher cost
• Higher throughput
• Scalability better
slide 8
Interconnection Networks
• Topology : who is connected to whom
• Direct / Indirect : where is switching done
• Static / Dynamic : when is switching done
• Circuit switching / packet switching : how are connections established
• Store & forward / worm hole routing : how is the path determined
• Centralized / distributed : how is switching controlled
• Synchronous/asynchronous : mode of operation
slide 9
Direct and Indirect Networks
slide 10
[Figure: DIRECT network: each node combines processor (P), memory (M) and switch (S), and nodes are joined by links. INDIRECT network: nodes containing P and M are joined by links to a central SWITCH.]
Static and Dynamic Networks
• Static Networks
  • fixed point to point connections
  • usually direct
  • each node pair may not have a direct connection
  • routing through nodes
• Dynamic Networks
  • connections established as per need
  • usually indirect
  • path can be established between any pair of nodes
  • routing through switches
slide 11
Static Network Topologies
slide 12
Linear
Star
2D-Mesh
Tree
Non-uniform connectivity
Static Network Topologies - contd.
slide 13
Ring
Fully Connected
Torus
Uniform connectivity
Illiac IV Mesh Network
slide 14
[Figure: 3 × 3 mesh with nodes numbered row by row]
0 1 2
3 4 5
6 7 8
[Figure: the same nodes 0 to 8 drawn as a Chordal Ring]
neighbors of node r : $(r \pm 1) \bmod 9$ and $(r \pm 3) \bmod 9$
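The neighbor rule can be written as a short function. This sketch assumes the rule reads (r ± 1) mod 9 and (r ± 3) mod 9 for the 3 × 3 case and generalises it to an n × n array; the function name is illustrative.

```python
def illiac_neighbors(r: int, n: int = 3) -> list[int]:
    """Neighbors of node r in an n x n Illiac-style mesh with n*n nodes."""
    size = n * n
    candidates = {(r + 1) % size, (r - 1) % size, (r + n) % size, (r - n) % size}
    return sorted(candidates)


if __name__ == "__main__":
    for node in range(9):
        print(node, illiac_neighbors(node))
```

Because every node applies the same offsets, the network has uniform connectivity, which is also why it can be redrawn as a chordal ring.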
Fat Tree Network
slide 15
Dynamic Networks
slide 16
k × k cross-bar switch
2 × 2 switch: simplest cross-bar, building block for multi-stage dynamic networks
2 × 2 switch settings: straight, exchange, upper broadcast, lower broadcast
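The four settings of a 2 × 2 switch can be modelled as a mapping from the two inputs to the two outputs. This is only a sketch; the setting strings and the function name are assumptions chosen to match the labels on the slide.

```python
def switch_2x2(setting: str, upper, lower):
    """Return (upper_out, lower_out) of a 2 x 2 switch for a given setting."""
    if setting == "straight":
        return upper, lower   # each input goes to the output on its own side
    if setting == "exchange":
        return lower, upper   # inputs are crossed
    if setting == "upper broadcast":
        return upper, upper   # upper input is copied to both outputs
    if setting == "lower broadcast":
        return lower, lower   # lower input is copied to both outputs
    raise ValueError(f"unknown setting: {setting}")


if __name__ == "__main__":
    for s in ("straight", "exchange", "upper broadcast", "lower broadcast"):
        print(s, switch_2x2(s, "a", "b"))
```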
Baseline Network
slide 17
[Figure: 8 × 8 baseline network with inputs and outputs labelled 000, 001, 010, 011, 100, 101, 110, 111]
blocking can occur
Benes Network
slide 18
non-blocking (rearrangeable)
Switching Mechanism
• Circuit Switching (connection oriented communication)
  • A circuit is established between the source and the destination
• Packet Switching (connectionless communication)
  • Information is divided into packets and each packet is sent independently from node to node
slide 19
Routing in Networks
slide 20
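[Figure: a node with an incoming message and an outgoing message; each message consists of a header followed by payload/data. Time lines compare store & forward routing with worm hole routing.]
store & forward routing: latency $= n\left(\dfrac{H}{BW} + \dfrac{l}{BW}\right)$
worm hole routing: latency $= n\,\dfrac{H}{BW} + \dfrac{l}{BW}$
(n: number of hops, H: header size, l: payload size, BW: link bandwidth)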
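A short comparison of the two latency expressions, assuming they read n (H + l) / BW for store & forward and n H / BW + l / BW for worm hole routing, with n hops, header size H, payload size l and link bandwidth BW. The parameter names and example numbers are illustrative, not from the slides.

```python
def store_and_forward_latency(n_hops: int, header: float, payload: float, bw: float) -> float:
    """Whole message (header + payload) is received at every hop before moving on."""
    return n_hops * (header + payload) / bw


def wormhole_latency(n_hops: int, header: float, payload: float, bw: float) -> float:
    """Only the header pays the per-hop cost; the payload streams behind it."""
    return n_hops * header / bw + payload / bw


if __name__ == "__main__":
    # Example: 4 hops, 8-byte header, 1024-byte payload, 8 bytes per time unit.
    print(store_and_forward_latency(4, 8, 1024, 8))
    print(wormhole_latency(4, 8, 1024, 8))
```

Under these assumptions the gap between the two grows with both the hop count and the payload size.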
Routing in presence of congestion
• Worm hole routing
  • When message header is blocked, many links get blocked with the message
• Solution: cut-through routing
  • When message header is blocked, tail is allowed to move, compressing the message into a single node
slide 21
Routing Options
• Deterministic routing: always same path followed
• Adaptive routing: best path selected to minimize congestion
• Source based routing: message specifies path to destination
• Destination based routing: message specifies only destination address
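As one concrete instance of deterministic, destination based routing, the sketch below uses dimension-order (XY) routing on a 2D mesh. The slides do not name this scheme; it is brought in here purely as an illustration, and the coordinate convention and function name are assumptions.

```python
def xy_route(src: tuple[int, int], dst: tuple[int, int]) -> list[tuple[int, int]]:
    """Path from src to dst on a 2D mesh: correct X first, then Y.

    The message carries only the destination address; every node derives
    the next hop from it, so the same pair always uses the same path.
    """
    x, y = src
    dx, dy = dst
    path = [(x, y)]
    while x != dx:                 # travel along the X dimension first
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:                 # then along the Y dimension
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

An adaptive scheme would instead choose among several candidate next hops based on observed congestion.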
slide 22
Some Performance Parameters
slide 23
[Figure: time lines for sender and receiver showing overhead at each end, Tx time = bytes/BW, time of flight, transport latency and total latency]
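The quantities in this figure can be combined as follows. The decomposition used here (transport latency = time of flight + Tx time, total latency = sender overhead + transport latency + receiver overhead) is the common textbook convention and is an assumption about what the figure shows; the function and parameter names are illustrative.

```python
def tx_time(nbytes: float, bw: float) -> float:
    """Transmission time = bytes / BW."""
    return nbytes / bw


def transport_latency(time_of_flight: float, nbytes: float, bw: float) -> float:
    """Time from first bit leaving the sender to last bit reaching the receiver."""
    return time_of_flight + tx_time(nbytes, bw)


def total_latency(sender_overhead: float, time_of_flight: float,
                  nbytes: float, bw: float, receiver_overhead: float) -> float:
    """Transport latency plus the software/hardware overhead at both ends."""
    return sender_overhead + transport_latency(time_of_flight, nbytes, bw) + receiver_overhead


if __name__ == "__main__":
    # Example numbers in arbitrary time units and bytes per time unit.
    print(total_latency(2.0, 5.0, 100.0, 10.0, 3.0))
```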
Other Parameters
• Throughput Bandwidth (no credit for header)
• Bisection bandwidth = BW across a bisection
• Node degree
• Network Diameter
• Cost
• Fault Tolerance
slide 24
Multidimensional Grid/Mesh
Size $= k \times k \times \dots \times k$ (n times) $= k^n$
Diameter $= (k-1)\,n$ without end around connections
$= k\,n/2$ with end around connections
slide 25
k-ary n-cube
for (Binary) Hypercube : k = 2
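A quick way to tabulate size and diameter for a k-ary n-dimensional grid. The function names are illustrative; for the end-around case the sketch uses n * floor(k / 2), which equals k n / 2 for even k and is an assumption about how odd k should be handled.

```python
def grid_size(k: int, n: int) -> int:
    """Number of nodes: k * k * ... * k (n times)."""
    return k ** n


def diameter_without_wraparound(k: int, n: int) -> int:
    """Worst case is k - 1 hops in each of the n dimensions."""
    return (k - 1) * n


def diameter_with_wraparound(k: int, n: int) -> int:
    """End-around links halve the worst case per dimension."""
    return n * (k // 2)


if __name__ == "__main__":
    for k, n in ((4, 2), (8, 2), (2, 3)):  # (2, 3) is a binary hypercube
        print(k, n, grid_size(k, n),
              diameter_without_wraparound(k, n),
              diameter_with_wraparound(k, n))
```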
Grid/Mesh Performance - 1
slide 26
$r$ is prob of message req in a cycle
$n$ is number of dimensions
$k_d$ is av. no. of hops along one dimension
Message arrival rate: expressed in terms of $r$, $n$ and $k_d$ [expression garbled in source]
Grid/Mesh Performance - 2
Service rate
Server Occupancy
Probability of request along a link
[Expressions for these quantities, in terms of $r$, $n$, $k_d$ and the service time $T_s$, are garbled in source]
slide 27
Grid/Mesh Performance - 3
slide 28
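k-ary n-cube
waiting time at a node: use $M_B/D/1$ open queue model
$T_w = \dfrac{(\rho - p)\,T_s}{2(1 - \rho)}$
($\rho$: server occupancy, $p$: probability of request along a link, $T_s$: service time)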
Switch Performance
slide 29
k × m cross-bar switch
Let $r$ = prob of request at an input port during one service cycle $T$
Here it is assumed that each message (or packet) requires same service time
$q(i)$ = prob of simultaneous requests on $i$ ports $= \binom{k}{i}\, r^i (1 - r)^{k-i}$
$E(i)$ = expected no. of requests accepted out of $i$ requests
= fraction of address patterns including a specific output port × num of output ports
$= \dfrac{m^i - (m-1)^i}{m^i}\; m$
Switch Performance – contd.
slide 30
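Expected BW (on relative scale) $= \sum_{i=0}^{k} E(i)\,q(i)$
$= \sum_{i=0}^{k} m\left[1 - \left(\dfrac{m-1}{m}\right)^i\right] \binom{k}{i}\, r^i (1-r)^{k-i}$
$= m \sum_{i=0}^{k} \binom{k}{i}\, r^i (1-r)^{k-i} \;-\; m \sum_{i=0}^{k} \binom{k}{i} \left(\dfrac{(m-1)\,r}{m}\right)^i (1-r)^{k-i}$
$= m - m\left(1 - r + \dfrac{(m-1)\,r}{m}\right)^k$
$= m - m\left(1 - \dfrac{r}{m}\right)^k$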
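The closed form can be checked against the defining sum. This sketch assumes q(i) is the binomial probability of i simultaneous requests on k inputs and E(i) = m (1 - ((m - 1) / m)^i); the function names are illustrative.

```python
from math import comb


def crossbar_bw_sum(k: int, m: int, r: float) -> float:
    """Expected BW as the sum over i of E(i) * q(i)."""
    total = 0.0
    for i in range(k + 1):
        q_i = comb(k, i) * r ** i * (1.0 - r) ** (k - i)  # i inputs request
        e_i = m * (1.0 - ((m - 1) / m) ** i)              # distinct outputs hit
        total += e_i * q_i
    return total


def crossbar_bw_closed(k: int, m: int, r: float) -> float:
    """Closed form: m - m * (1 - r/m)^k."""
    return m - m * (1.0 - r / m) ** k


if __name__ == "__main__":
    for k, m, r in ((4, 4, 0.5), (8, 4, 0.3), (2, 2, 1.0)):
        print(k, m, r, crossbar_bw_sum(k, m, r), crossbar_bw_closed(k, m, r))
```

The agreement follows from the binomial theorem applied to each of the two sums.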
Switch Performance – contd.
slide 31
Requested bandwidth (Expected BW if there were no output port conflicts) $= kr$
Expected BW (because of output port conflicts) $= m - m\left(1 - \dfrac{r}{m}\right)^k$
this is less than $kr$ as well as $m$ (assuming that … 1)
prob of acceptance of requests $P_A = \dfrac{BW}{kr}$
We now consider effect of request re-submission due to conflicts.
We need to compute revised request rate due to re-submission and also compute delays because of waiting.
Effect of re-submitted requests
slide 32
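actual request rate $r'$ (using Markov graph with states A and W):
$r' = r\,q_A + q_W = \dfrac{r}{r + P_A(1 - r)}$
$BW = m - m\left(1 - \dfrac{r'}{m}\right)^k$
$P_A = \dfrac{BW}{k\,r'}$
waiting time $= \dfrac{1 - P_A}{P_A}\,T$
$T$ = cycle time $= \dfrac{H + l}{\text{BW of link}}$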
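The revised request rate and the acceptance probability depend on each other, so one way to obtain them is a fixed-point iteration. The sketch assumes the relations r' = r / (r + P_A (1 - r)), BW = m - m (1 - r'/m)^k, P_A = BW / (k r') and waiting time = (1 - P_A) / P_A × T; the iteration scheme and all names are assumptions.

```python
def crossbar_with_resubmission(k: int, m: int, r: float, cycle_time: float,
                               iters: int = 200):
    """Return (r_actual, p_a, bw, wait) for a k x m cross-bar with re-submission."""
    r_act = r  # start from the rate without re-submissions
    for _ in range(iters):
        bw = m - m * (1.0 - r_act / m) ** k
        p_a = bw / (k * r_act) if r_act > 0 else 1.0
        r_act = r / (r + p_a * (1.0 - r))
    bw = m - m * (1.0 - r_act / m) ** k
    p_a = bw / (k * r_act) if r_act > 0 else 1.0
    wait = (1.0 - p_a) / p_a * cycle_time
    return r_act, p_a, bw, wait


if __name__ == "__main__":
    print(crossbar_with_resubmission(4, 4, 0.5, 1.0))
```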
Effect of buffering
There are two possibilities
• Buffering before switching (k buffers, one at each input port)
• Buffering after switching (m buffers, one at each output port)
slide 33
Switch with input buffers
Rate of messages at input and output of each queue is same in steady state - r per cycle
Service time includes delays due to conflicts, calculated as earlier: $\dfrac{1 - P_A}{P_A}\,T$. This has an exponential distribution – recall the analysis for a shared bus.
M/M/1 open queue model can be used to calculate queuing delay. Details are omitted.
slide 34
Switch with output buffers
Here we assume that all the messages destined for same output are queued in the same buffer, in some order. That is, there are no rejections and no re-submissions.
For each queue,
Messages arriving per service cycle = $\rho = \dfrac{kr}{m}$
Prob of a request coming from one of the k sources = $p = \dfrac{r}{m}$
Apply $M_B/D/1$ model for finding queuing delay $T_w$
slide 35
$T_w = \dfrac{(\rho - p)\,T}{2(1 - \rho)}$
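A minimal sketch of the output-buffer delay, assuming the final expression reads T_w = (rho - p) T / (2 (1 - rho)) with rho = k r / m and p = r / m. The symbol `rho` and the function name are assumptions, since the Greek letter is lost in the transcript.

```python
def output_buffer_wait(k: int, m: int, r: float, service_time: float) -> float:
    """Queuing delay at one output buffer of a k x m switch (assumed formula)."""
    rho = k * r / m   # messages arriving per service cycle at one output
    p = r / m         # prob that a given input sends to this output
    if rho >= 1.0:
        raise ValueError("queue is unstable: rho must be below 1")
    return (rho - p) * service_time / (2.0 * (1.0 - rho))


if __name__ == "__main__":
    for r in (0.2, 0.5, 0.8):
        print(r, output_buffer_wait(4, 4, r, 1.0))
```

With a single input (k = 1) rho equals p and the delay is zero, which matches the intuition that one source cannot conflict with itself.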