forwarding decisions - univ-pau.frcpham.perso.univ-pau.fr/enseignement/iup/routerforwarding.pdf ·...



Page 1: Forwarding Decisions - univ-pau.frcpham.perso.univ-pau.fr/ENSEIGNEMENT/IUP/RouterForwarding.pdf · From a Nick McKeown's tutorial, 1999 and slides from Kalyanaraman (with figure from

From a Nick McKeown's tutorial, 1999 and slides from Kalyanaraman (with figure from Keshav). Some slides modified by C. Pham.

Forwarding Decisions

• ATM and MPLS switches
  – Direct Lookup
• Bridges and Ethernet switches
  – Associative Lookup
  – Hashing
  – Trees and tries
• IP Routers
  – CIDR
  – Patricia trees/tries
  – Other methods
  – Caching
• Packet Classification

ATM and MPLS Switches: Direct Lookup

[Figure: the VCI is used as the address into a memory; the data read out is the (Port, VCI) pair.]
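A minimal sketch of direct lookup, assuming a 16-bit VCI space: the incoming VCI is used as the index into a table whose entries hold the (port, VCI) pair. The names `make_table` and `direct_lookup` and the example entry are illustrative, not from the slides.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (output port, outgoing VCI)


def make_table(size: int) -> List[Optional[Entry]]:
    """Create an empty table with one slot per possible VCI."""
    return [None] * size


def direct_lookup(table: List[Optional[Entry]], vci: int) -> Optional[Entry]:
    """Use the VCI itself as the memory address: one read, no search."""
    if 0 <= vci < len(table):
        return table[vci]
    return None


# Hypothetical entry: cells arriving with VCI 42 leave on port 3 with VCI 107.
table = make_table(1 << 16)
table[42] = (3, 107)
```

The lookup cost is one memory read regardless of how many connections are installed, which is why the table must be as large as the label space.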


Bridges and Ethernet Switches: Associative Lookups

[Figure: a 48-bit network address is presented as search data to an associative memory (CAM); the outputs are the associated data, a Hit? signal, and the log2 N-bit address of the matching entry.]

Advantages:
• Simple

Disadvantages:
• Slow
• High Power
• Small
• Expensive

Bridges and Ethernet Switches: Hashing

[Figure: the 48-bit search data passes through a hashing function that produces a 16-bit memory address; the memory returns the associated data, a Hit? signal, and the log2 N-bit address.]

Lookups Using Hashing: An example

[Figure: the 48-bit search data is hashed with CRC-16 to a 16-bit value that selects one of N linked lists in memory (lists of entries #1 #2 #3 #4, #1 #2, #1 #2 #3), holding M entries in total; the outputs are the associated data, a Hit? signal, and the log2 N-bit address.]
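The hashed lookup with linked lists can be sketched as follows. This is an illustration under assumptions: `binascii.crc_hqx` (CRC-CCITT) stands in for the slide's CRC-16, Python lists stand in for the linked lists, and the class name `HashedMacTable` is hypothetical.

```python
import binascii
from typing import Dict, List, Optional, Tuple


class HashedMacTable:
    """48-bit addresses hashed to 16 bits, collisions kept in per-bucket chains."""

    def __init__(self, num_lists: int = 1 << 16) -> None:
        self.num_lists = num_lists
        self.lists: Dict[int, List[Tuple[bytes, int]]] = {}

    def _bucket(self, mac: bytes) -> int:
        # 16-bit CRC of the address, reduced to the number of lists.
        return binascii.crc_hqx(mac, 0) % self.num_lists

    def insert(self, mac: bytes, port: int) -> None:
        chain = self.lists.setdefault(self._bucket(mac), [])
        for i, (key, _) in enumerate(chain):
            if key == mac:
                chain[i] = (mac, port)  # update an existing entry
                return
        chain.append((mac, port))

    def lookup(self, mac: bytes) -> Tuple[Optional[int], int]:
        """Return (port or None, number of chain entries examined)."""
        probes = 0
        for key, port in self.lists.get(self._bucket(mac), []):
            probes += 1
            if key == mac:
                return port, probes
        return None, probes
```

The second return value makes the non-deterministic lookup time visible: with one address per list it is 1, and with all addresses in one list it grows with the number of entries.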

Lookups Using Hashing: Performance of simple example

[Figure: plot with two annotated regimes, "Most addresses in their own list" and "Most addresses in one list".]

Lookups Using Hashing

Advantages:
• Simple
• Expected lookup time can be small

Disadvantages:
• Non-deterministic lookup time
• Inefficient use of memory

Trees and Tries

[Figure: a Binary Search Tree with N entries, branching on < and > at each node, of depth log2 N; and a Binary Search Trie, branching on 0 and 1 at each level, shown with the example key 111010.]

Tries

• An entry is:
  – a pointer to another array,
  – a special symbol indicating no better match,
  – a null pointer indicating that the longest match is the parent node
• Two ways to improve performance
  – cache recently used addresses in a CAM
  – move common entries up to a higher level (match longer strings)

[Figure: example lookup of 128.32.1.2.]

Tries: Multiway tries

[Figure: a 16-ary Search Trie. Each node has entries indexed 0000 to 1111; an entry holds either a pointer ("0000, ptr", "1111, ptr") or a terminal value ("0000, 0"). Example keys: 000011110000 and 111111111111.]

Trees and Tries: Multiway tries

Table produced from 2^15 randomly generated 48-bit addresses


IP Routers: Class-based addresses

[Figure: the IP address space divided into Class A, Class B, Class C and D. The routing table is split by class (Class A, Class B, Class C); the address 212.17.9.4 is resolved by an exact match on the entry 212.17.9.0, Port 4.]

IP Routers: CIDR

[Figure: two views of the address space from 0 to 2^32 - 1. Class-based: the space is divided into A, B, C, D. Classless: the prefixes 65/24, 128.9/16 and 142.12/19 are marked; 128.9/16 starts at 128.9.0.0 and covers 2^16 addresses; the address 128.9.16.14 is marked.]

IP Routers: CIDR

[Figure: the address space from 0 to 2^32 - 1. The prefix 128.9/16 contains 128.9.16/20 and 128.9.176/20; 128.9.16/20 in turn contains 128.9.19/24 and 128.9.25/24. The address 128.9.16.14 is marked.]

Most specific route = “longest matching prefix”
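Longest-prefix matching can be illustrated with a binary trie, a sketch rather than the structure any particular router uses. The prefixes are the ones on the slide; the port numbers 1 to 5 are made up for the example, and `PrefixTrie` is a hypothetical name.

```python
import ipaddress
from typing import List, Optional


class TrieNode:
    def __init__(self) -> None:
        self.children: List[Optional["TrieNode"]] = [None, None]
        self.port: Optional[int] = None  # set only where a prefix ends


class PrefixTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, prefix: str, port: int) -> None:
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1  # walk from the most significant bit
            if node.children[bit] is None:
                node.children[bit] = TrieNode()
            node = node.children[bit]
        node.port = port

    def lookup(self, address: str) -> Optional[int]:
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.port
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.port is not None:
                best = node.port  # remember the most specific match so far
        return best


trie = PrefixTrie()
trie.insert("128.9.0.0/16", 1)    # port numbers are illustrative
trie.insert("128.9.16.0/20", 2)
trie.insert("128.9.176.0/20", 3)
trie.insert("128.9.19.0/24", 4)
trie.insert("128.9.25.0/24", 5)
```

For 128.9.16.14 both 128.9/16 and 128.9.16/20 match, and the lookup returns the port of the longer one.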

IP Routers: Metrics for Lookups

[Table: Prefix and Port columns. Prefixes: 128.9/16, 128.9.16/20, 128.9.176/20, 128.9.19/24, 128.9.25/24, 142.12/19, 65/24. The Port column survives only as the fused digits 35271013. Example lookup address: 128.9.16.14.]

• Lookup time
• Storage space
• Update time
• Preprocessing time

IP Router: Lookup

IPv4 unicast destination address based lookup

[Figure: the HEADER of the incoming packet enters the Forwarding Engine, where the Next Hop Computation consults the Forwarding Table (columns Destination and Next Hop) and outputs the destination address with its next hop.]


Lookup Performance Required

Gigabit Ethernet (84B packets): 1.49 Mpps
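The 1.49 Mpps figure follows from dividing the line rate by the bits occupied per packet. A short check, assuming that 84 bytes is the minimum 64-byte frame plus 8 bytes of preamble and a 12-byte inter-frame gap (the slide only gives the 84B total):

```python
def packets_per_second(line_rate_bps: float, bytes_per_packet: int) -> float:
    """Packets per second when every packet occupies bytes_per_packet on the wire."""
    return line_rate_bps / (bytes_per_packet * 8)


rate = packets_per_second(1e9, 84)  # roughly 1.49 million packets per second
budget_ns = 1e9 / rate              # time available per lookup, in nanoseconds
```

At this rate a lookup has to complete in about 672 ns to keep up with back-to-back minimum-size packets.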

Size of the Routing Table

Source: http://www.telstra.net/ops/bgptable.html

[Figure annotations: "Exponential growth before CIDR"; "About 10k new prefixes per year".]

Size of the Forwarding Table

Source: http://www.telstra.net/ops/bgptable.html

[Figure: Number of Prefixes versus Year (95 to 00), annotated "10,000/year" and "Renewed Exponential Growth".]

Renewed growth due to multi-homing of enterprise networks!

Routing Lookups in Hardware

[Figure: Number of prefixes versus Prefix length.]

Most prefixes are 24 bits or shorter

Routing Lookups in Hardware

Prefixes up to 24 bits

[Figure: for the address 142.19.6.14, the top 24 bits (142.19.6) index a table of 2^24 = 16M entries; the selected entry holds a flag bit set to 1 and the Next Hop.]

Routing Lookups in Hardware

[Figure: for the address 128.3.72.44, the top 24 bits (128.3.72) index the first table (prefixes up to 24 bits). An entry with flag 1 holds the Next Hop; an entry with flag 0 holds a Pointer. The pointer gives a base into a second table (prefixes above 24 bits) of Next Hop entries, and the last 8 bits of the address (44) give the offset from that base.]
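The two-table scheme in the figure can be sketched as below. This is a software illustration only: a dictionary stands in for the full 2^24-entry memory, and the names `TwoLevelTable`, `top24`, `add_short` and `add_long` as well as the next-hop values are hypothetical.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple


def top24(address: str) -> int:
    """Top 24 bits of an IPv4 address, used as the first-table index."""
    return int(ipaddress.ip_address(address)) >> 8


class TwoLevelTable:
    def __init__(self) -> None:
        # first table: index -> (flag, value); flag 1 = next hop, flag 0 = pointer
        self.first: Dict[int, Tuple[int, int]] = {}
        # second table: blocks of 256 next hops, addressed by base + offset
        self.second: List[int] = []

    def add_short(self, index: int, next_hop: int) -> None:
        """Entry covered by a prefix of at most 24 bits."""
        self.first[index] = (1, next_hop)

    def add_long(self, index: int, default_hop: int, longer: Dict[int, int]) -> None:
        """Entry that has prefixes longer than 24 bits below it."""
        base = len(self.second)
        block = [default_hop] * 256
        for offset, hop in longer.items():
            block[offset] = hop
        self.second.extend(block)
        self.first[index] = (0, base)

    def lookup(self, address: str) -> Optional[int]:
        addr = int(ipaddress.ip_address(address))
        entry = self.first.get(addr >> 8)
        if entry is None:
            return None
        flag, value = entry
        if flag == 1:
            return value                           # one memory access
        return self.second[value + (addr & 0xFF)]  # base + offset, second access


table = TwoLevelTable()
table.add_short(top24("142.19.6.0"), 5)
table.add_long(top24("128.3.72.0"), 2, {44: 9})
```

Every lookup costs at most two memory reads, at the price of a large, mostly redundant first table.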

Caching Addresses

[Figure: a router with a CPU and Buffer Memory (Slow Path) and several Line Cards, each with DMA, MAC and Local Buffer Memory (Fast Path).]

Advantages:
• Increased average lookup performance

Disadvantages:
• Decreased locality in backbone traffic
• Cache size
• Cache management overhead
• Hardware implementation difficult

Caching Addresses

LAN: Average flow < 40 packets

WAN: Huge number of flows

[Figure: Cache Hit Rate, annotated "Cache = 10% of Full Table".]


Course Outline

• Packet Lookup and Classification: Where does a packet go next?
• Switching Fabrics: How does the packet get there?

Switching Fabrics

• Overview
• Output and Input Queueing
• Output Queueing
• Input Queueing
  – Scheduling algorithms
  – Combining input and output queues
  – Multicast traffic
  – Other non-blocking fabrics
• Multistage Switches

Basic Architectural Components: Datapath: per-packet processing

[Figure: datapath with three numbered parts: 1. Forwarding Decision (three instances, each with its own Forwarding Table), 2. Interconnect, 3. Output Scheduling. Annotations: "Transfers data from an input to an output"; "many ports (density), high speeds".]

Background: Circuit switch

• A switch that can handle N calls has N logical inputs and N logical outputs
  – N up to 200,000
• Moves 8-bit samples from an input to an output port
  – Recall that samples have no headers
  – Destination of sample depends on time at which it arrives at the switch
• In practice, input trunks are multiplexed
  – Multiplexed trunks carry frames = set of samples
• Goal: extract samples from frame, and depending on position in frame, switch to output
  – each incoming sample has to get to the right output line and the right slot in the output frame

Call blocking

• Can’t find a path from input to output
• Internal blocking
  – slot in output frame exists, but no path
• Output blocking
  – no slot in output frame is available
• Output blocking is reduced in transit switches
  – need to put a sample in one of several slots going to the desired next hop

Multiplexors and demultiplexors

• Most trunks time division multiplex voice samples
• At a central office, trunk is demultiplexed and distributed to active circuits
• Synchronous multiplexor
  – N input lines
  – Output runs N times as fast as input

[Figure: a MUX combines input lines 1, 2, 3, …, N into one frame of slots 1 2 3 … N; a De-MUX splits the frame back onto lines 1, 2, 3, …, N.]

Time division switching

• Key idea: when de-multiplexing, position in frame determines output trunk
• Time division switching interchanges sample position within a frame: time slot interchange (TSI)
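A time slot interchange can be pictured as writing a frame into a buffer and reading it out in a different order. The sketch below assumes a frame is a list of samples and that `slot_map` (a hypothetical name) gives the output slot for each input slot.

```python
from typing import List, Sequence


def time_slot_interchange(frame: Sequence[int], slot_map: Sequence[int]) -> List[int]:
    """Reorder one frame: the sample in input slot i moves to slot slot_map[i]."""
    if sorted(slot_map) != list(range(len(frame))):
        raise ValueError("slot_map must be a permutation of the slot indices")
    out = [0] * len(frame)
    for in_slot, sample in enumerate(frame):
        out[slot_map[in_slot]] = sample  # one write per sample, one read on output
    return out
```

For example, `time_slot_interchange([10, 20, 30, 40], [2, 0, 3, 1])` returns `[20, 40, 10, 30]`; after demultiplexing, each sample lands on the trunk that corresponds to its new position.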

Time Division Switching: Limitations

• To build a 120,000 circuit switch
  – read and write 120,000 samples every 125 µs: one read or write operation every 0.5 ns!
  – Today DRAM has access times from 80 down to 40 ns
  – If we use 40 ns DRAM, it is 80 times slower than what we need
  – Maximum number of circuits = 120,000/80 = 1500!
  – Too small!!
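The arithmetic behind these numbers, as a small check. It assumes each sample needs one write and one read per 125 µs frame; the function names are illustrative.

```python
FRAME_TIME_NS = 125_000  # one frame every 125 microseconds


def required_access_time_ns(circuits: int) -> float:
    """Memory access time needed if every sample is written once and read once."""
    return FRAME_TIME_NS / (2 * circuits)


def max_circuits(access_time_ns: float) -> int:
    """Largest number of circuits a memory of the given access time can serve."""
    return int(FRAME_TIME_NS // (2 * access_time_ns))


needed = required_access_time_ns(120_000)  # about 0.52 ns
limit = max_circuits(40)                   # 1562 with exact arithmetic
```

The exact values are about 0.52 ns and 1562 circuits; the slide rounds these to 0.5 ns and 1500.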

Space division switching

• Each sample takes a different path through the switch, depending on its destination

Crossbar

• Simplest possible space-division switch
• Crosspoints can be turned on or off, long enough to transfer a packet from an input to an output
• Expensive
• Internally nonblocking
  – but need N^2 crosspoints
  – time to set each crosspoint grows quadratically

[Figure: a crossbar with Data In on the rows, Data Out on the columns, and a configuration input.]

Multistage crossbar (1)

• In a crossbar during each switching time only one cross-point per row or column is active
• Can save crosspoints if a cross-point can attach to more than one input line
• This is done in a multistage crossbar

[Figure: three stages: N/n arrays of size n x k, then k arrays of size N/n x N/n, then N/n arrays of size k x n.]

Multistage crossbar (2)

• Can suffer internal blocking
  – unless sufficient number of second-level stages, k ≥ n
• Number of crosspoints < N^2
• Finding a path from input to output requires a depth-first-search
• Scales better than crossbar, but still not too well
  – 120,000 call switch needs ~250 million crosspoints
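The crosspoint count can be checked from the three-stage layout (N/n arrays of n x k, k arrays of N/n x N/n, N/n arrays of k x n). The sketch below is an illustration under assumptions: the values n = 240 and k = 2n - 1 = 479 are chosen here for the example (k = 2n - 1 is the classical Clos condition for strict non-blocking, which the slides do not state), and the function names are hypothetical.

```python
def crossbar_crosspoints(N: int) -> int:
    """A single N x N crossbar needs one crosspoint per input/output pair."""
    return N * N


def three_stage_crosspoints(N: int, n: int, k: int) -> int:
    """Crosspoints in a three-stage switch with N/n first- and third-stage arrays."""
    if N % n:
        raise ValueError("n must divide N")
    groups = N // n
    first = groups * n * k        # N/n arrays of size n x k
    middle = k * groups * groups  # k arrays of size N/n x N/n
    last = groups * k * n         # N/n arrays of size k x n
    return first + middle + last


single = crossbar_crosspoints(120_000)                # 14,400,000,000
multi = three_stage_crosspoints(120_000, 240, 479)    # 234,710,000
```

With these assumed parameters the three-stage switch needs about 235 million crosspoints, the same order as the ~250 million quoted, against 14.4 billion for a single crossbar.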

The true cost of telephone switching

• In a central switching system, the high cost is the line card.
• Now the true cost is the copper wire to the customer premises!!
• In long-distance, the high cost is in laying lines, acquiring rights of way and switch-control software!
• So, saving a few thousand crosspoints is not going to make phone calls cheaper!

Packet switches

• In a circuit switch, path of a sample is determined at time of connection establishment
• No need for a sample header: position in frame used
• In a packet switch, packets carry a destination field or label
  – Need to look up destination port on-the-fly
• Datagram switches
  – lookup based on entire destination address (longest-prefix match)
• Cell or Label-switches
  – lookup based on VCI or Labels

Blocking in packet switches

• Can have both internal and output blocking
• Internal
  – no path to output
• Output
  – trunk unavailable
• Unlike a circuit switch, cannot predict if packets will block (why?)
• If packet is blocked, must either buffer or drop

Dealing with blocking in packet switches

• Over-provisioning
  – internal links much faster than inputs
• Buffers
  – at input or output
• Backpressure
  – if switch fabric doesn’t have buffers, prevent packet from entering until path is available
• Parallel switch fabrics
  – increases effective switching capacity

Switch Fabrics: Buffered crossbar

• What happens if packets at two inputs both want to go to same output?
• Can defer one at an input buffer
• Or, buffer cross-points: complex arbiter

Switch fabric element

• Goal: towards building “self-routing” fabrics
• Can build complicated fabrics from a simple element
• Routing rule: if 0, send packet to upper output, else to lower output
  – If both packets to same output, buffer or drop

[Figure: a two-input element with outputs labelled 0 (upper) and 1 (lower); the example inputs are labelled "data 10" and "data 00".]

Banyan

• Simplest self-routing recursive fabric: 2^n outputs need n stages with 2^(n-1) components in each stage
• What if two packets both want to go to the same output → output blocking

[Figure: an 8 x 8 banyan with inputs and outputs labelled 000 to 111.]

Blocking in Banyan Switches: Sorting

• Can avoid blocking by choosing order in which packets appear at input ports
• If we can
  – present packets at inputs sorted by output
  – remove duplicates
  – remove gaps
  – precede banyan with a perfect shuffle stage
  – then no internal blocking
• For example: [X, 011, 010, X, 011, X, X, X]:
  – Sort => [010, 011, 011, X, X, X, X, X]
  – Remove dups => [010, 011, X, X, X, X, X, X]
  – Shuffle => [010, X, 011, X, X, X, X, X]
• Need sort, trap and shuffle networks.
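Self-routing and internal blocking can be illustrated with a small simulation. This is a sketch under an assumption: the fabric is modelled as an omega (shuffle-exchange) network, which belongs to the same family as the banyan but whose wiring may differ from the slide's figure. In this model the shuffle is built into every stage, so the sorted, gap-free list is fed in directly. The names `route` and `has_conflict` are illustrative.

```python
from typing import Dict, List


def route(src: int, dst: int, n: int) -> List[int]:
    """Link occupied after each of the n stages by a packet going from src to dst.

    Each stage shuffles the position (shift left, drop the top bit) and then the
    element sets the low bit from the next destination bit: 0 = upper, 1 = lower.
    """
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        bit = (dst >> (n - 1 - stage)) & 1  # destination bits, most significant first
        pos = ((pos << 1) & mask) | bit
        path.append(pos)
    return path


def has_conflict(packets: Dict[int, int], n: int) -> bool:
    """True if two packets (input -> destination) need the same link at any stage."""
    paths = [route(src, dst, n) for src, dst in packets.items()]
    for stage in range(n):
        links = [path[stage] for path in paths]
        if len(set(links)) != len(links):
            return True
    return False
```

In this model the sorted, duplicate-free, gap-free input `{0: 0b010, 1: 0b011}` passes without conflict, while `{0: 0b000, 4: 0b001}` collides at the first stage even though the two destinations differ.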

Sorting using Merging

• Build sorters from merge networks
• Assume we can merge two sorted lists
• Sort pairwise, merge, recurse

Sort {5,7,2,3,6,2,4,5}

1/ sort 2 by 2
2/ merge adjacent lists to get two 4-element lists
3/ merge the two lists with a merge network

[Figure: a merge network combining the sorted lists 2 3 5 7 and 2 4 5 6.]
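The three steps can be traced in software with a bottom-up merge sort. This is only an illustration of the sort-pairwise, merge, recurse idea: it uses an ordinary sequential merge, not the comparator network a Batcher sorter is built from, and the function names are hypothetical.

```python
from typing import List


def merge(a: List[int], b: List[int]) -> List[int]:
    """Merge two sorted lists into one sorted list."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def sort_by_merging(values: List[int]) -> List[List[List[int]]]:
    """Return the sorted runs after each round of merging adjacent runs."""
    runs = [[v] for v in values]
    levels = []
    while len(runs) > 1:
        runs = [
            merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
            for i in range(0, len(runs), 2)
        ]
        levels.append(runs)
    return levels


levels = sort_by_merging([5, 7, 2, 3, 6, 2, 4, 5])
```

For the slide's input the rounds are `[5, 7] [2, 3] [2, 6] [4, 5]`, then `[2, 3, 5, 7] [2, 4, 5, 6]`, then the fully sorted list.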

Putting it together: Batcher Banyan

• What about trapped duplicates?
  – recirculate to beginning
  – or run output of trap to multiple banyans (dilation)

Non-Blocking Batcher-Banyan

[Figure: a Batcher Sorter followed by a Self-Routing Network with outputs labelled 000 to 111. Successive columns of the sorter, top to bottom: 3 7 5 2 6 0 1 4; 7 2 3 5 6 1 0 4; 7 5 2 3 1 0 6 4; 7 0 5 1 3 4 2 6; 7 4 5 6 0 3 1 2; 7 6 4 5 3 2 0 2; 7 6 5 4 3 2 1 0.]

• Fabric can be used as scheduler.
• Batcher-Banyan network is blocking for multicast.

Comparator rule: a goes in the direction of the arrow if a > b; a goes in the opposite direction if a is alone.

Interconnects: Two basic queueing techniques

• Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
• Output Queueing: usually a fast bus

Interconnects: Output Queueing

• Individual Output Queues: Memory b/w = (N+1).R
• Centralized Shared Memory: Memory b/w = 2N.R

[Figure: both arrangements shown with ports 1, 2, …, N.]

Output Queueing: The “ideal”

[Figure: packets labelled 1 and 2 passing through an output-queued switch.]

Output Queueing: How fast can we make centralized shared memory?

[Figure: ports 1, 2, …, N connected to a Shared Memory of 5ns SRAM over a 200 byte bus.]

• 5ns per memory operation
• Two memory operations per packet
• Therefore, up to 160Gb/s
• In practice, closer to 80Gb/s
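The 160Gb/s bound follows from the bus width and the access time. A quick check, assuming each packet is written once and read once over the full 200-byte bus (the function name is illustrative):

```python
def shared_memory_throughput_gbps(bus_bytes: int, access_ns: float,
                                  ops_per_packet: int = 2) -> float:
    """Peak throughput in Gb/s: bits moved per operation over time per packet.

    Bits per nanosecond is numerically the same as gigabits per second.
    """
    return (bus_bytes * 8) / (ops_per_packet * access_ns)


peak = shared_memory_throughput_gbps(200, 5)  # 1600 bits every 10 ns
```

This gives 160 Gb/s; the bound is only reached if every memory operation fills the whole bus width, which is one reason the practical figure is lower.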

Switching Fabrics

• Output and Input Queueing
• Output Queueing
• Input Queueing
  – Scheduling algorithms
  – Combining input and output queues
  – Multicast traffic
  – Other non-blocking fabrics
• Multistage Switches

Interconnects: Input Queueing with Crossbar

[Figure: a crossbar with Data In, Data Out, and a configuration input driven by a Scheduler.]

Memory b/w = 2R

Input Queueing: Head of Line Blocking

[Figure: Delay versus Load; the load axis is marked at 58.6% and 100%.]
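The 58.6% mark matches the classical saturation throughput of FIFO input queueing under uniform random traffic as the number of ports grows large. This is a standard result quoted here for context; it is not derived on the slide:

```latex
\[
  \rho_{\max} = 2 - \sqrt{2} \approx 0.586
\]
```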


Head of Line Blocking

Input Queueing: Virtual output queues

Input Queues: Virtual Output Queues

[Figure: Delay versus Load; the load axis is marked at 100%.]

Input Queueing: Virtual Output Queues

[Figure: an input-queued switch with virtual output queues and a Scheduler, annotated "Can be quite complex!".]

Memory b/w = 2R

Input Queueing: Why is serving long/old queues better than serving maximum number of queues?

• When traffic is uniformly distributed, servicing the maximum number of queues leads to 100% throughput.
• When traffic is non-uniform, some queues become longer than others.
• A good algorithm keeps the queue lengths matched, and services a large number of queues.

[Figure: two plots of Avg Occupancy versus VOQ #, one for Uniform traffic and one for Non-uniform traffic.]

Summary: progression

• Shared Memory
• Input Queued
• Combined Input and Output Queued
• Parallel Packet Switches
• Multistage

[Figure: a Batcher Sorter followed by a Self-Routing Network with outputs labelled 000 to 111. Sorter columns, top to bottom: 3 7 5 2 6 0 1 4; 7 2 3 5 6 1 0 4; 7 5 2 3 1 0 6 4; 7 0 5 1 3 4 2 6; 7 4 5 6 0 3 1 2; 7 6 4 5 3 2 0 2; 7 6 5 4 3 2 1 0.]