Nick McKeown
Spring 2012
Lecture 4
Parallelizing an OQ Switch
EE384x Packet Switch Architectures
Scaling an OQ Switch
[Figure: N inputs writing into k parallel memories, in two panels labeled "one output" and "many outputs"; annotations: "Work conserving if memory b/w >= R(N+1)" and "Not so clear."]
At most two memory operations per time slot: 1 write and 1 read
Parallel OQ Switch
May not be work-conserving
[Figure: parallel OQ switch with N=3 and k=3, shown at time slots 1, 2 and 3, with labels A, B, C and packets A5 to A8, B5, B6, C5 and C6.]
Constant size packets
Problem
How can we design a work-conserving parallel OQ switch from slower parallel memories?
Work Conserving
Theorem (sufficiency)
A parallel output-queued switch is work-conserving with 3N - 1 memories, each able to perform at most one memory operation per time slot.
Re-stating the Problem
1. There are K cages, each of which can contain an infinite number of pigeons.
2. Assume that time is slotted, and in any one time slot:
   a. At most N pigeons can arrive and at most N can depart.
   b. At most 1 pigeon can enter or leave a cage via a pigeon hole.
   c. The time slot at which arriving pigeons will depart is known.
3. For any switch: what is the minimum K such that all N pigeons can be immediately placed in a cage when they arrive, and can depart at the right time?
Intuition for Theorem
Only one packet can enter a memory at time t.
Only one packet can enter or leave a memory at time t.
Only one packet can enter or leave a memory at any time.
[Figure: a memory at Time = t, with packets labeled by departure time DT = t and DT = t + X.]
Proof of Theorem
When a packet arrives in a time slot, it must choose a memory not chosen by:
1. The N - 1 other packets that arrive in that time slot.
2. The N other packets that depart in that time slot.
3. The N - 1 other packets that can depart in the same (future) time slot as this packet.
Proof
These constraints rule out at most (N - 1) + N + (N - 1) = 3N - 2 memories. By the pigeon-hole principle, the switch can be work-conserving if there are 3N - 1 memories, each able to perform at most one memory operation per time slot.
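The counting argument can be checked with a small simulation. The following is a minimal sketch, not taken from the lecture: the function name `simulate`, the 0.9 arrival probability, the uniform choice of output, and the rule that a packet departs no earlier than the slot after it arrives are all assumptions made for illustration. Each arriving packet is placed greedily in the first memory that is idle both in the current slot and in the packet's departure slot.

```python
import random


def simulate(n_ports, n_memories, n_slots, seed=0):
    """Return True if every arriving packet found a conflict-free memory.

    Hypothetical model: each memory does at most one operation (a write
    or a read) per time slot, and each packet's departure slot is known
    on arrival, as in an output-queued switch with FIFO outputs.
    """
    rng = random.Random(seed)
    # busy[m] holds the time slots in which memory m is already committed
    # to a write (the arrival slot) or a read (the departure slot).
    busy = [set() for _ in range(n_memories)]
    # next_free[j] is the earliest departure slot still unused at output j.
    next_free = [0] * n_ports

    for t in range(n_slots):
        for _ in range(n_ports):            # at most one arrival per input
            if rng.random() >= 0.9:         # assumed load, not from the slides
                continue
            out = rng.randrange(n_ports)
            depart = max(t + 1, next_free[out])
            next_free[out] = depart + 1
            # Pick a memory idle now (for the write) and at the departure
            # slot (for the read).
            choice = next(
                (m for m in range(n_memories)
                 if t not in busy[m] and depart not in busy[m]),
                None,
            )
            if choice is None:
                return False
            busy[choice].update((t, depart))
    return True
```

Under these assumptions, `simulate(4, 3 * 4 - 1, 500)` never runs out of memories, because at most 3N - 2 memories can be blocked for any one packet. With far fewer memories the greedy placement fails quickly.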
A Parallel Shared Memory Switch
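[Figure: arriving packets enter at rate R, are written into memories 1 to K=8, and leave as departing packets at rate R on outputs A, B and C; packets shown include A1, A3, A4, A5, B1, B3, C1 and C3.]
At most one operation, a write or a read, per time slot.
From Theorem 1, k = 7 memories don't suffice, but 8 memories do (3N - 1 = 8 for N = 3).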
Distributed Shared Memory Switch
The central memories are distributed to the line cards and shared.
Memory and line cards can be added incrementally.
From Theorem 1, the switch is work-conserving if we have a total of 3N - 1 memories, each able to perform one operation per time slot, i.e. a total memory bandwidth of about 3NR.
[Figure: line cards 1 to N, each with an external line of rate R and its own memories, connected through a switch fabric.]
Switch bandwidth
What switch bandwidth does the DSM switch need in order to be work-conserving?
Theorem (sufficiency)
A switch bandwidth of 4NR is sufficient for a distributed shared memory switch to be work-conserving.
Proof
There are a maximum of 3 memory accesses and 1 external line access per time slot.
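To make the two sufficiency results concrete, here is a small arithmetic sketch. The helper name `dsm_bandwidths` and the example values of N and R are hypothetical. It reads the theorems as 3N - 1 memories, each running at the line rate R, plus a fabric carrying 3 memory accesses and 1 external line access per port.

```python
def dsm_bandwidths(n_ports, line_rate):
    """Sufficient resources for a work-conserving DSM switch.

    n_ports is N and line_rate is R (any consistent unit, e.g. Gb/s).
    """
    memories = 3 * n_ports - 1               # Theorem 1: 3N - 1 memories
    return {
        "memories": memories,
        # One operation per time slot per memory, i.e. about 3NR in total.
        "total_memory_bw": memories * line_rate,
        # 3 memory accesses + 1 external line access per port: 4NR.
        "switch_bw": 4 * n_ports * line_rate,
    }
```

For example, `dsm_bandwidths(16, 10)` gives 47 memories, a total memory bandwidth of 470 and a switch bandwidth of 640, in the same unit as R.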
Switch Algorithm
What switching algorithm allows the DSM switch to be work-conserving?
1. Shared bus: No algorithm needed.
2. Crossbar switch: Algorithm needed because only permutations are allowed.
Theorem
An edge coloring algorithm can switch packets for a work-conserving distributed shared memory switch.
Proof
König’s theorem: Any bipartite graph with maximum degree Δ has an edge coloring with Δ colors.
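König's theorem is constructive. The sketch below uses the textbook alternating-path recoloring to build such a coloring. The lecture does not say which edge coloring algorithm it has in mind, so this is only one possible choice, and the function name `edge_color_bipartite` is hypothetical. Left nodes can be read as the sending side of the crossbar and right nodes as the receiving side; each color class is then a matching, i.e. one crossbar configuration.

```python
def edge_color_bipartite(edges, n_left, n_right):
    """Colour a bipartite multigraph with as many colours as its max degree.

    edges: list of (u, v) pairs, u a left node and v a right node.
    Returns one matching (a list of (u, v) pairs) per colour.
    """
    deg_left = [0] * n_left
    deg_right = [0] * n_right
    for u, v in edges:
        deg_left[u] += 1
        deg_right[v] += 1
    delta = max(deg_left + deg_right, default=0)

    # left[u][c] is the right node joined to u by colour c (None if unused);
    # right[v][c] is the mirror image.
    left = [[None] * delta for _ in range(n_left)]
    right = [[None] * delta for _ in range(n_right)]

    for u, v in edges:
        a = left[u].index(None)     # a colour not yet used at u
        b = right[v].index(None)    # a colour not yet used at v
        if right[v][a] is not None:
            # Colour a is taken at v: walk the a/b alternating path from v.
            # It cannot reach u, because left nodes are entered by a-edges
            # and u has none.
            path = []
            r = v
            while True:
                l = right[r][a]
                if l is None:
                    break
                path.append((l, r, a))
                r = left[l][b]
                if r is None:
                    break
                path.append((l, r, b))
            # Swap colours a and b along the path, which frees a at v.
            for l, r, c in path:
                left[l][c] = None
                right[r][c] = None
            for l, r, c in path:
                swapped = b if c == a else a
                left[l][swapped] = r
                right[r][swapped] = l
        left[u][a] = v
        right[v][a] = u

    return [
        [(u, left[u][c]) for u in range(n_left) if left[u][c] is not None]
        for c in range(delta)
    ]
```

If every node needs at most Δ transfers in a time slot, the Δ matchings returned can be applied as Δ successive crossbar configurations.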
Summary - Switches with 100% throughput
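Name                   | Fabric   | # Mem. | Mem. BW   | Total Mem. BW | Switch BW | Switch Algorithm
OQ                     | Bus      | N      | (N+1)R    | N(N+1)R       | NR        | None
Shared Mem.            | Bus      | 1      | 2NR       | 2NR           | 2NR       | None
IQ                     | Crossbar | N      | 2R        | 2NR           | NR        | MWM
CIOQ (Cisco GSR)       | Crossbar | 2N     | 3R        | 6NR           | 3NR       | Time Reserve*
                       |          | 2N     | 3R        | 6NR           | 2NR       | Maximal
PSM                    | Bus      | k      | 3NR/k     | 3NR           | 3NR       | C. Sets
DSM (Juniper M-series) | Xbar     | N      | 3R        | 3NR           | 4NR       | Edge Color
                       |          | N      | 3R        | 3NR           | 6NR       | C. Sets
                       |          | N      | 4R        | 4NR           | 4NR       | C. Sets
PPS - OQ               | Clos     | Nk     | 2R(N+1)/k | 2N(N+1)R      | 4NR       | C. Sets
PPS                    | Clos     | Nk     | 4NR/k     | 4NR           | 4NR       | C. Sets
                       |          | Nk     | 2NR/k     | 2NR           | 2NR       | None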