interconnection mechanisms

Interconnection Mechanisms
Performance Models

Connecting Processors and Memories

• Shared Buses
• Interconnection Networks
  • Static Networks
  • Dynamic Networks

slide 2

[Figure: processors (P) and memories (M) attached to interconnection networks, including a global interconnection network with its own memories (M)]

Shared Bus

slide 3

each processor sees this picture: processing → bus access → processing → bus access → ...

bus utilization $\rho = \dfrac{\text{bus transaction time}}{\text{processing time} + \text{bus transaction time}}$

prob of a processor using the bus = $\rho$
prob of a processor not using the bus = $1 - \rho$
prob of none of the n processors using the bus = $(1 - \rho)^n$

prob of at least one processor using the bus = $1 - (1 - \rho)^n$

achieved BW on a relative scale = $1 - (1 - \rho)^n$

required BW = $n\rho$, available BW = 1
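As a quick numerical check of the bus model above, here is a minimal Python sketch that compares achieved bandwidth, 1 − (1 − ρ)^n, with required bandwidth, nρ. The function names and the parameter name `rho` (the probability that one processor is using the bus) are illustrative assumptions, not names from the slides.

```python
def achieved_bus_bw(rho: float, n: int) -> float:
    """Achieved bus bandwidth on a relative scale (available BW = 1).

    rho: probability that one processor is using the bus (assumed name).
    n:   number of processors sharing the bus.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be a probability in [0, 1]")
    # Probability that at least one of the n processors uses the bus.
    return 1.0 - (1.0 - rho) ** n


def required_bus_bw(rho: float, n: int) -> float:
    """Bandwidth the n processors would like to have: n * rho."""
    return n * rho


if __name__ == "__main__":
    rho = 0.3
    for n in (2, 3, 4):
        print(n, required_bus_bw(rho, n), achieved_bus_bw(rho, n))
```

The gap between the two values grows with n, which is the saturation effect the bus plots illustrate.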

Effect of re-submitted requests

slide 4

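[Figure: Markov graph with two states, A and W (W: waiting to re-submit a rejected request)]

Here $\rho$ is the probability that a processor issues a new bus request and $P_A$ is the probability that a request is accepted.

Transition probabilities:
• A → A: $1 - \rho + \rho P_A$
• A → W: $\rho (1 - P_A)$
• W → W: $1 - P_A$
• W → A: $P_A$

Steady state, with prob = $q_A$ of being in A and prob = $q_W$ of being in W:

$q_A + q_W = 1$, also $q_A\,\rho\,(1 - P_A) = q_W\,P_A$

actual request rate $a = \rho\,q_A + q_W = \dfrac{\rho}{\rho + P_A (1 - \rho)} = \dfrac{1}{1 + P_A \left(\dfrac{1}{\rho} - 1\right)}$

$BW = 1 - (1 - a)^n = n\,a\,P_A$, so $P_A = \dfrac{1 - (1 - a)^n}{n\,a}$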
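The actual request rate a and the acceptance probability P_A depend on each other, so they have to be solved together. A minimal sketch, assuming plain fixed-point iteration is adequate (the slides do not state a solution method) and using the relations a = ρ / (ρ + P_A(1 − ρ)) and P_A = (1 − (1 − a)^n) / (n a). The function name, tolerance, and iteration cap are hypothetical.

```python
def solve_bus_resubmission(rho: float, n: int,
                           tol: float = 1e-12, max_iter: int = 10_000):
    """Return (a, p_a, bw) for a shared bus with re-submitted requests.

    rho: request probability without re-submission (assumed name).
    n:   number of processors.
    Uses simple fixed-point iteration; this is an assumed method.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be a probability in [0, 1]")
    if rho == 0.0:
        return 0.0, 1.0, 0.0  # no requests: nothing is ever rejected

    a = rho  # start from the rate without re-submission
    for _ in range(max_iter):
        p_a = (1.0 - (1.0 - a) ** n) / (n * a)    # acceptance probability
        a_next = rho / (rho + p_a * (1.0 - rho))  # revised request rate
        converged = abs(a_next - a) < tol
        a = a_next
        if converged:
            break

    bw = 1.0 - (1.0 - a) ** n  # achieved bandwidth, relative scale
    p_a = bw / (n * a)
    return a, p_a, bw


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, solve_bus_resubmission(0.5, n))
```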

[Plot: Shared Bus : BW per proc. x-axis: BW required (req probability), 0 to 1; y-axis: BW achieved; two sets of curves for n = 2, 3, 4]

[Plot: Shared Bus : utilization. x-axis: req probability, 0 to 1; y-axis: utilization; two sets of curves for n = 2, 3, 4]

Waiting time

slide 7

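If a request is rejected i times and accepted on the (i+1)th attempt, waiting time = $i\,T_{bus}$

Probability of this = $(1 - P_A)^i\,P_A$

Expected value of waiting time:

$$T_w = \sum_{i=0}^{\infty} i\,T_{bus}\,P_A (1 - P_A)^i = T_{bus}\,P_A (1 - P_A) \sum_{i=1}^{\infty} i\,(1 - P_A)^{i-1} = T_{bus}\,P_A (1 - P_A)\,\frac{1}{P_A^2} = T_{bus}\left(\frac{1}{P_A} - 1\right)$$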
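The closed form for the expected waiting time, T_bus (1/P_A − 1), can be checked against a truncated version of the series it comes from. This is a minimal sketch; the function names and the truncation length `terms` are assumptions made for illustration.

```python
def expected_wait(p_a: float, t_bus: float = 1.0) -> float:
    """Closed form: expected waiting time = t_bus * (1/p_a - 1)."""
    if not 0.0 < p_a <= 1.0:
        raise ValueError("p_a must be in (0, 1]")
    return (1.0 / p_a - 1.0) * t_bus


def expected_wait_by_sum(p_a: float, t_bus: float = 1.0,
                         terms: int = 10_000) -> float:
    """Truncated series: sum over i of (i * t_bus) * p_a * (1 - p_a)**i.

    i is the number of rejections before the request is accepted.
    """
    return sum(i * t_bus * p_a * (1.0 - p_a) ** i for i in range(terms))


if __name__ == "__main__":
    for p_a in (0.25, 0.5, 0.9):
        print(p_a, expected_wait(p_a), expected_wait_by_sum(p_a))
```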

Switched Networks

BUS
• Shared media
• Lower cost
• Lower throughput
• Scalability poor

Switched Network
• Switched paths
• Higher cost
• Higher throughput
• Scalability better

slide 8

Interconnection Networks

• Topology : who is connected to whom

• Direct / Indirect : where is switching done

• Static / Dynamic : when is switching done

• Circuit switching / packet switching : how are connections established

• Store & forward / worm hole routing : how is the path determined

• Centralized / distributed : how is switching controlled

• Synchronous/asynchronous : mode of operation

slide 9

Direct and Indirect Networks

slide 10

[Figure: DIRECT network: four nodes, each combining a processor (P), memory (M) and switch (S), joined node to node by links. INDIRECT network: four processor-memory (PM) nodes, each joined by a link to a central SWITCH.]

Static and Dynamic Networks

• Static Networks
  • fixed point-to-point connections
  • usually direct
  • each node pair may not have a direct connection
  • routing through nodes

• Dynamic Networks
  • connections established as per need
  • usually indirect
  • path can be established between any pair of nodes
  • routing through switches

slide 11

Static Network Topologies

slide 12

Non-uniform connectivity:
• Linear
• Star
• 2D-Mesh
• Tree

Static Network Topologies – contd.

slide 13

Uniform connectivity:
• Ring
• Fully Connected
• Torus

Illiac IV Mesh Network

slide 14

[Figure: 3 × 3 Illiac IV mesh with nodes 0 to 8 in rows (0 1 2), (3 4 5), (6 7 8), redrawn as a Chordal Ring on the same nine nodes]

neighbors of node r : (r ± 1) mod 9 and (r ± 3) mod 9
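The neighbor rule, (r ± 1) mod 9 and (r ± 3) mod 9, is easy to tabulate. The sketch below does so for the nine-node case; the function name and the generalizing parameters `size` and `stride` are assumptions added for illustration.

```python
def illiac_neighbors(r: int, size: int = 9, stride: int = 3) -> list[int]:
    """Neighbors of node r in an Illiac-style mesh / chordal ring.

    Defaults give the 9-node case: (r +/- 1) mod 9 and (r +/- 3) mod 9.
    Python's % always returns a non-negative result for a positive modulus,
    so r - 1 and r - stride wrap around correctly.
    """
    return sorted({(r + 1) % size, (r - 1) % size,
                   (r + stride) % size, (r - stride) % size})


if __name__ == "__main__":
    for node in range(9):
        print(node, illiac_neighbors(node))
```

Every node has exactly four neighbors, which is the uniform connectivity that lets the mesh be redrawn as a chordal ring.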

Fat Tree Network

slide 15

Dynamic Networks

slide 16

• k × k cross-bar switch: building block for multi-stage dynamic networks
• 2 × 2 switch: simplest cross-bar
• 2 × 2 switch settings: straight, exchange, upper broadcast, lower broadcast

Baseline Network

slide 17

[Figure: 8 × 8 baseline network; inputs and outputs labelled 000, 001, 010, 011, 100, 101, 110, 111]

blocking can occur

Benes Network

slide 18

non-blocking

Switching Mechanism

• Circuit Switching (connection-oriented communication)
  • A circuit is established between the source and the destination

• Packet Switching (connectionless communication)
  • Information is divided into packets and each packet is sent independently from node to node

slide 19

Routing in Networks

slide 20

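[Figure: a node with an incoming message and an outgoing message, each made of header and payload/data; time diagrams for store & forward routing and wormhole routing]

With H = header length, l = payload length, n = number of links traversed and BW = link bandwidth:

store & forward routing: latency = $n\left(\dfrac{H}{BW} + \dfrac{l}{BW}\right)$

wormhole routing: latency = $n\,\dfrac{H}{BW} + \dfrac{l}{BW}$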
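To make the difference between the two routing schemes concrete, the following minimal sketch evaluates both latency expressions: n(H + l)/BW for store & forward and nH/BW + l/BW for wormhole routing. It assumes n counts the links traversed and that H, l and BW use consistent units; the function and parameter names are illustrative.

```python
def store_and_forward_latency(n_hops: int, header: float,
                              payload: float, bw: float) -> float:
    """Whole message (header + payload) is received at each hop before
    being forwarded, so the full transfer time is paid n_hops times."""
    return n_hops * (header + payload) / bw


def wormhole_latency(n_hops: int, header: float,
                     payload: float, bw: float) -> float:
    """Only the header pays the per-hop cost; the payload follows
    in a pipeline and is paid for once."""
    return n_hops * header / bw + payload / bw


if __name__ == "__main__":
    # Hypothetical numbers: 8-byte header, 120-byte payload, 8 bytes/cycle.
    for hops in (1, 2, 4, 8):
        print(hops,
              store_and_forward_latency(hops, 8, 120, 8),
              wormhole_latency(hops, 8, 120, 8))
```

For a single hop the two expressions coincide; the advantage of wormhole routing grows with the hop count and with the payload-to-header ratio.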

Routing in presence of congestion

• Wormhole routing
  • When message header is blocked, many links get blocked with the message

• Solution: cut-through routing
  • When message header is blocked, tail is allowed to move, compressing the message into a single node

slide 21

Routing Options

• Deterministic routing: always same path followed
• Adaptive routing: best path selected to minimize congestion
• Source based routing: message specifies path to destination
• Destination based routing: message specifies only destination address

slide 22

Some Performance Parameters

slide 23

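[Figure: timelines for sender and receiver. Sender: overhead, then Tx time = bytes/BW. Receiver, starting one time of flight later: Tx time = bytes/BW, then overhead. Transport latency spans time of flight plus Tx time; total latency spans sender overhead, transport latency and receiver overhead.]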
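The latency components named in the timeline can be added up as follows. This minimal sketch assumes the usual reading of such a diagram: transport latency = time of flight + bytes/BW, and total latency = sender overhead + transport latency + receiver overhead. The function and parameter names are hypothetical, and all times must share one unit.

```python
def transport_latency(time_of_flight: float, nbytes: float, bw: float) -> float:
    """Time of flight plus transmission time (bytes / bandwidth)."""
    return time_of_flight + nbytes / bw


def total_latency(sender_overhead: float, time_of_flight: float,
                  nbytes: float, bw: float, receiver_overhead: float) -> float:
    """Sender overhead + transport latency + receiver overhead."""
    return (sender_overhead
            + transport_latency(time_of_flight, nbytes, bw)
            + receiver_overhead)


if __name__ == "__main__":
    # Hypothetical numbers, arbitrary but consistent units.
    print(total_latency(sender_overhead=2, time_of_flight=3,
                        nbytes=1000, bw=100, receiver_overhead=4))
```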

Other Parameters

• Throughput Bandwidth (no credit for header)
• Bisection bandwidth = BW across a bisection
• Node degree
• Network Diameter
• Cost
• Fault Tolerance

slide 24

Multidimensional Grid/Mesh

Size = k × k × … × k (n times) = $k^n$

Diameter = $(k - 1)\,n$ without end-around connections
         = $k\,n / 2$ with end-around connections

slide 25

k-ary n-cube

for (Binary) Hypercube : k = 2
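The size and diameter expressions for a k-ary n-cube can be tabulated directly. In this minimal sketch the wrap-around diameter uses n · floor(k/2), which equals kn/2 for even k; treating odd k this way is an assumption beyond what the slide states. The function names are illustrative.

```python
def cube_size(k: int, n: int) -> int:
    """Number of nodes in an n-dimensional grid with k nodes per dimension."""
    return k ** n


def cube_diameter(k: int, n: int, end_around: bool = False) -> int:
    """Longest shortest path, in hops.

    Without end-around connections: (k - 1) hops per dimension.
    With end-around connections: floor(k / 2) hops per dimension
    (equal to k/2 for even k; the floor for odd k is an assumption).
    """
    per_dimension = k // 2 if end_around else k - 1
    return n * per_dimension


if __name__ == "__main__":
    print(cube_size(4, 3), cube_diameter(4, 3), cube_diameter(4, 3, True))
    # Binary hypercube: k = 2, so diameter = n either way.
    print(cube_size(2, 10), cube_diameter(2, 10), cube_diameter(2, 10, True))
```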

Grid/Mesh Performance - 1

slide 26

Message arrival rate:
• r is prob of message req in a cycle
• n is number of dimensions
• $k_d$ is av. no. of hops along one dimension

Probability of request along a link = $\dfrac{r\,n\,k_d}{2n} = \dfrac{r\,k_d}{2}$

Service rate = $\dfrac{1}{T_s}$

Server Occupancy $p = \dfrac{r\,k_d\,T_s}{2}$

slide 27


slide 28

k-ary n-cube

waiting time at a node: use M/D/1 open queue model

$T_w = \dfrac{p\,T_s}{2(1 - p)}$, where p is the server occupancy and $T_s$ is the service time
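For reference, the standard M/D/1 mean waiting time is T_w = p T_s / (2(1 − p)), where p is the server occupancy and T_s the fixed service time. The sketch below evaluates it; the function name is hypothetical, and the guard against p ≥ 1 reflects that the queue has no steady state at or above full occupancy.

```python
def md1_waiting_time(occupancy: float, service_time: float) -> float:
    """Mean waiting time in queue for an M/D/1 open queue.

    occupancy:    server occupancy p, with 0 <= p < 1.
    service_time: deterministic service time T_s.
    """
    if not 0.0 <= occupancy < 1.0:
        raise ValueError("occupancy must be in [0, 1) for a stable queue")
    return occupancy * service_time / (2.0 * (1.0 - occupancy))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9, 0.99):
        print(p, md1_waiting_time(p, service_time=1.0))
```

The waiting time grows without bound as the occupancy approaches 1, which is why link occupancy limits the usable message rate.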

Switch Performance

slide 29

k × m cross-bar switch

Let r = prob of request at an input port during one service cycle.

Here it is assumed that each message (or packet) requires the same service time T.

$q(i)$ = prob of simultaneous requests on i ports = ${}^{k}C_{i}\; r^i (1 - r)^{k-i}$

$E(i)$ = expected no. of requests accepted out of i requests
       = (fraction of address patterns including a specific output port) × (num of output ports)
       = $\left[1 - \left(\dfrac{m-1}{m}\right)^i\right] m$

Switch Performance – contd.

slide 30

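Expected BW (on relative scale):

$$\begin{aligned}
BW &= \sum_{i=0}^{k} E(i)\,q(i) \\
   &= \sum_{i=0}^{k} m\left[1 - \left(\frac{m-1}{m}\right)^i\right] {}^{k}C_{i}\; r^i (1-r)^{k-i} \\
   &= m \sum_{i=0}^{k} {}^{k}C_{i}\; r^i (1-r)^{k-i} \;-\; m \sum_{i=0}^{k} {}^{k}C_{i} \left(\frac{(m-1)\,r}{m}\right)^i (1-r)^{k-i} \\
   &= m - m\left(1 - r + \frac{(m-1)\,r}{m}\right)^k \\
   &= m - m\left(1 - \frac{r}{m}\right)^k
\end{aligned}$$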
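The closed form m − m(1 − r/m)^k can be checked numerically against the sum over i of E(i) q(i), with q(i) the binomial probability of i simultaneous requests and E(i) = m[1 − ((m − 1)/m)^i]. This is a minimal sketch; the function names are illustrative.

```python
from math import comb


def expected_bw_by_sum(k: int, m: int, r: float) -> float:
    """Sum over i of E(i) * q(i) for a k x m cross-bar.

    q(i) = C(k, i) * r**i * (1 - r)**(k - i)   (i inputs request)
    E(i) = m * (1 - ((m - 1) / m)**i)          (distinct outputs addressed)
    """
    total = 0.0
    for i in range(k + 1):
        q_i = comb(k, i) * r ** i * (1.0 - r) ** (k - i)
        e_i = m * (1.0 - ((m - 1) / m) ** i)
        total += e_i * q_i
    return total


def expected_bw_closed_form(k: int, m: int, r: float) -> float:
    """Closed form: m - m * (1 - r/m)**k."""
    return m - m * (1.0 - r / m) ** k


if __name__ == "__main__":
    for k, m, r in [(2, 2, 1.0), (8, 8, 0.6), (16, 4, 0.3)]:
        print(k, m, r, expected_bw_by_sum(k, m, r),
              expected_bw_closed_form(k, m, r))
```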

Switch Performance – contd.

slide 31

Requested bandwidth = Expected BW (if there were no output port conflicts) = $k\,r$

Expected BW (because of output port conflicts) = $m - m\left(1 - \dfrac{r}{m}\right)^k$

this is less than $k\,r$ as well as $m$ (assuming that $r < 1$)

$P_A$ = prob of acceptance of requests = $\dfrac{BW}{k\,r}$

We now consider the effect of request re-submission due to conflicts.

We need to compute the revised request rate due to re-submission and also compute delays because of waiting.

Effect of re-submitted requests

slide 32

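actual request rate $r'$ (using Markov graph with states A and W):

$r' = r\,q_A + q_W = \dfrac{r}{r + P_A (1 - r)}$

$BW = m - m\left(1 - \dfrac{r'}{m}\right)^k$

$P_A = \dfrac{BW}{k\,r'}$

waiting time $T_w = \left(\dfrac{1}{P_A} - 1\right) T$

$T$ = cycle time = $\dfrac{H + l}{\text{BW of link}}$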
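The revised request rate r', the bandwidth and the acceptance probability P_A are mutually dependent. A minimal sketch, assuming simple fixed-point iteration (the slides do not name a solution method) over r' = r / (r + P_A(1 − r)), BW = m − m(1 − r'/m)^k and P_A = BW / (k r'), followed by the waiting time (1/P_A − 1) T. The names, the tolerance, and the default cycle time of 1 are hypothetical.

```python
def solve_crossbar_resubmission(r: float, k: int, m: int,
                                cycle_time: float = 1.0,
                                tol: float = 1e-12, max_iter: int = 10_000):
    """Return (r_eff, bw, p_a, t_w) for a k x m cross-bar with re-submission.

    r:          request probability per input port without re-submission.
    cycle_time: service cycle T, e.g. (H + l) / link bandwidth.
    """
    if not 0.0 < r <= 1.0:
        raise ValueError("r must be in (0, 1]")

    r_eff = r  # start from the rate without re-submission
    for _ in range(max_iter):
        bw = m - m * (1.0 - r_eff / m) ** k
        p_a = bw / (k * r_eff)
        r_next = r / (r + p_a * (1.0 - r))
        converged = abs(r_next - r_eff) < tol
        r_eff = r_next
        if converged:
            break

    bw = m - m * (1.0 - r_eff / m) ** k     # achieved bandwidth
    p_a = bw / (k * r_eff)                  # acceptance probability
    t_w = (1.0 / p_a - 1.0) * cycle_time    # expected waiting time
    return r_eff, bw, p_a, t_w


if __name__ == "__main__":
    print(solve_crossbar_resubmission(0.5, k=8, m=8))
```

With m = 1 the bandwidth expression reduces to 1 − (1 − r')^k, the shared-bus case.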

Effect of buffering

There are two possibilities
• Buffering before switching (k buffers, one at each input port)
• Buffering after switching (m buffers, one at each output port)

slide 33

Switch with input buffers

Rate of messages at input and output of each queue is the same in steady state: r per cycle.

Service time includes delays due to conflicts, calculated as earlier: $\dfrac{1 - P_A}{P_A}\,T$. This has an exponential distribution (recall the analysis for a shared bus).

M/M/1 open queue model can be used to calculate queuing delay. Details are omitted.

slide 34

Switch with output buffers

Here we assume that all the messages destined for the same output are queued in the same buffer, in some order. That is, there are no rejections and no re-submissions.

For each queue,

Messages arriving per service cycle = $\rho = \dfrac{k\,r}{m}$

Prob of a request coming from one of the k sources = $p = \dfrac{r}{m}$

Apply $M_B$/D/1 model for finding queuing delay $T_w$:

$T_w = \dfrac{\rho - p}{2(1 - \rho)}\,T$

slide 35
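The output-buffer delay can be evaluated with a few lines of code. This minimal sketch assumes the queue load is ρ = kr/m, the per-source request probability is p = r/m, and the mean wait for binomial arrivals with deterministic service is T_w = (ρ − p) T / (2(1 − ρ)); that closed form is a reading of the slide's formula and should be checked against the original. The function and parameter names are hypothetical.

```python
def output_buffer_wait(r: float, k: int, m: int,
                       service_time: float = 1.0) -> float:
    """Mean queuing delay at one output buffer of a k x m switch.

    r: request probability per input port per service cycle.
    Assumes binomial arrivals from k sources and fixed service time.
    """
    rho = k * r / m   # messages arriving per service cycle at this queue
    p = r / m         # prob that a given source sends to this queue
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable unless k * r / m < 1")
    return (rho - p) / (2.0 * (1.0 - rho)) * service_time


if __name__ == "__main__":
    for r in (0.1, 0.5, 0.9):
        print(r, output_buffer_wait(r, k=8, m=8))
```

With a single source (k = 1) the delay is zero, since one input can never deliver more than one message per cycle to the queue.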
