Adonet Spring School
Scheduling algorithms for input-queued IP routers
Emilio Leonardi, in collaboration with: P. Giaccone, M. Ajmone Marsan, A. Bianco, M. Mellia, F. Neri
Dipartimento di Elettronica, Telecommunication Network Group, Politecnico di Torino (Italy)
http://www.tlc-networks.polito.it
Budapest, March 2006

Outline
IP routers
OQ routers
IQ routers
  Scheduling
  Optimal algorithms
  Heuristic algorithms
  Packet-mode algorithms
  Networks of routers
CIOQ routers
Multicast traffic
Conclusions

Note
The slides marked RWP are reproduced with permission of Prof. Nick McKeown from the Electrical Engineering and Computer Science Dept. of Stanford University (CA, USA)

“The Internet is a mesh of routers”
[Figure: mesh of routers, with labels: core router, access router, enterprise router]

“The Internet is a mesh of routers”
Access router:
  high number of ports at low speed (kbps/Mbps)
  several access protocols (modem, ADSL, cable)
Enterprise router:
  medium number of ports at high speed (Mbps)
  several services (IP classification, filtering)
Core router:
  moderate number of ports at very high speed (Mbps/Gbps)
  very high throughput

Basic functions
Routing
  computation of the output port of an incoming packet
  uses the routing tables computed by the routing protocols
  can be a complex procedure:
    • very large routing tables
    • dynamic variation of routes in the Internet

Basic functions
Switching
  transfer of packets from input ports to output ports
  solution of the contentions for output ports
    • queueing – where to store
    • scheduling – what to transfer

Faster and faster
Need for high performance routers
  to accommodate the bandwidth demands for new users and new services
  to support QoS
  to reduce costs

Packet processing and link speed
RWP
[Figure: two log-scale plots, 1985 to 2000. "Packet processing power": SPEC95Int CPU results. "Link speed": fiber capacity (Gbit/s), TDM and DWDM. Growth rates marked on the plots: Moore’s law, 2x / 18 months; 2x / 7 months. Source: SPEC95Int & David Miller, Stanford.]
Increase of electronic packet processing power cannot accommodate the increase in link speed

Memory access time
RWP
[Figure: log-scale plot of memory access time (ns), 1980 to 2001. Rates marked on the plot: Moore’s Law, 2x / 18 months; 1.1x / 18 months.]

Moore’s law
RWP
It’s hard to keep up with Moore’s law:
  the bottleneck is memory speed
Moore’s law is too slow:
  routers need to improve faster than Moore’s law

Router capacity exceeds Moore’s law
RWP
Growth in capacity of commercial routers:
  1992 ~ 2 Gb/s
  1995 ~ 10 Gb/s
  1998 ~ 40 Gb/s
  2001 ~ 160 Gb/s
  2003 ~ 640 Gb/s
Average growth rate: 2.2x / 18 months
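The average growth rate can be checked from the first and last data points of the list. The sketch below assumes the rate is a geometric mean per 18-month period between 1992 and 2003; the function and variable names are illustrative only.

```python
# Capacity figures copied from the slide (year -> Gb/s).
capacity_gbps = {1992: 2, 1995: 10, 1998: 40, 2001: 160, 2003: 640}

def growth_per_18_months(first_year, last_year, data):
    """Geometric-mean growth factor per 18-month period between two years."""
    months = (last_year - first_year) * 12
    periods = months / 18
    return (data[last_year] / data[first_year]) ** (1 / periods)

rate = growth_per_18_months(1992, 2003, capacity_gbps)
print(f"{rate:.2f}x / 18 months")
```

Under these assumptions the result is close to the 2.2x figure quoted on the slide.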

Single packet processing
The time to process one packet is becoming shorter and shorter
  worst case: 40-Byte packets (ACKs) travelling over the Internet
    • 3.2 µs at 100 Mbps
    • 320 ns at 1 Gbps
    • 32 ns at 10 Gbps
    • 3.2 ns at 100 Gbps
    • 320 ps at 1 Tbps
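The per-packet times follow from dividing the packet size in bits by the line rate. A minimal sketch, assuming a 40-byte packet with no framing overhead (the names are illustrative):

```python
PACKET_BITS = 40 * 8  # 40-byte packet, e.g. a TCP ACK

def packet_time_seconds(rate_bps):
    """Transmission time of one 40-byte packet at the given line rate."""
    return PACKET_BITS / rate_bps

line_rates_bps = {
    "100 Mbps": 100e6,
    "1 Gbps": 1e9,
    "10 Gbps": 10e9,
    "100 Gbps": 100e9,
    "1 Tbps": 1e12,
}

for label, rate in line_rates_bps.items():
    print(f"{label}: {packet_time_seconds(rate) * 1e9:g} ns")
```

At 100 Mbps this gives 3200 ns, that is 3.2 microseconds, and each tenfold increase in rate divides the time by ten.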

Hardware architecture
[Figure: physical structure (blocks labeled LC, CP, SF) and logical structure (blocks labeled IP, OP, CP, SF)]

Hardware architecture
Main elements
line cards
  support input/output transmissions
  store packets
  adapt packets to the internal format of the switching fabric
  support data link protocols
  classify packets
  schedule packets
  support security
switching fabric
  transfers packets from input ports to output ports

Hardware architecture
Main elements
control processor/network processor
  runs routing protocols
  computes routing tables
  manages the overall system
forwarding engines
  compute the packet destination (lookup)
  inspect packet headers
  rewrite packet headers

Interconnections among main elements - I
[Figure: block diagram with line cards 1 to N, forwarding engines, control processor, and switching fabric]

Interconnections among main elements - II
[Figure: block diagram with N combined "line card & forwarding engine" modules, control processor, and switching fabric]

Cell-based routers
[Figure: packets enter ISMs 1 to N, cross the cell switch (fabric) as cells, and leave ORMs 1 to N as packets]
ISM: Input-Segmentation Module
ORM: Output-Reassembly Module
packet: variable-size data unit
cell: fixed-size data unit

Switching fabric
Our assumptions:
  bufferless
    • to reduce internal hardware complexity
  non-blocking
    • it is always possible to transfer in parallel from input to output ports any non-conflicting set of cells

Switching fabric
Examples:
  crossbar
  rearrangeable Clos network
  Benes network
  Batcher-Banyan network (self-routing)
Switching constraints
  at most one cell for each input and for each output can be transferred
[Figure: 4x4 switch, inputs 1 to 4 by outputs 1 to 4]
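The switching constraint can be expressed as a simple check on the set of cells selected for one time slot. This is a minimal sketch with 0-based port indices; the function name and the representation of a cell as an (input, output) pair are assumptions made for illustration.

```python
def is_valid_transfer(cells, n):
    """Check the switching constraints for one time slot of an n x n fabric.

    cells is a list of (input, output) pairs; the set is non-conflicting
    when no input and no output appears more than once.
    """
    inputs = [i for i, _ in cells]
    outputs = [j for _, j in cells]
    in_range = all(0 <= i < n and 0 <= j < n for i, j in cells)
    return (in_range
            and len(set(inputs)) == len(inputs)
            and len(set(outputs)) == len(outputs))

print(is_valid_transfer([(0, 1), (1, 3), (2, 0), (3, 2)], 4))  # full permutation
print(is_valid_transfer([(0, 1), (2, 1)], 4))                  # output 1 used twice
```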

Switching fabric
We do not discuss switching fabrics with internal buffers
  e.g.: crossbars with buffer at each crosspoint

Generic switching architecture
[Figure: inputs 1 to N with input queues (Sin), switching fabric, outputs 1 to N with output queues (Sout)]

Speedup
The speedup determines the switch performance:
  Sin = reading speed from input queues
  Sout = writing speed to output queues
maximum speedup factor:
  S = max(Sin, Sout)

Performance comparison
The performance of different switching systems can be studied
  with analytical models
    • introducing simplifying assumptions, but obtaining general results
  with simulation models
    • obtaining more detailed results

Traffic description
$A_{ij}(n) = 1$ if a packet arrives at time n at input i, with destination reachable through output j
$\lambda_{ij} = E[A_{ij}(n)]$
An arrival process is admissible if:
  $\sum_i \lambda_{ij} < 1$
  $\sum_j \lambda_{ij} < 1$
    • that is, no input and no output are overloaded on average
    • note that OQ switches exhibit finite delays only for admissible traffic
traffic matrix: $\Lambda = [\lambda_{ij}]$
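The admissibility condition amounts to checking every row sum and every column sum of the traffic matrix. The sketch below assumes a strict inequality (each sum below 1), which matches the remark about finite delays; the function name is illustrative.

```python
def is_admissible(rates):
    """Return True if no input (row) and no output (column) is overloaded.

    rates[i][j] is the average arrival rate from input i to output j.
    Assumption: strict inequality, i.e. every row and column sum is < 1.
    """
    n = len(rates)
    rows_ok = all(sum(rates[i][j] for j in range(n)) < 1 for i in range(n))
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok

print(is_admissible([[0.5, 0.4], [0.4, 0.5]]))  # all sums are 0.9
print(is_admissible([[0.6, 0.5], [0.1, 0.2]]))  # input 0 is overloaded
```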

Traffic scenarios
Uniform traffic
  Bernoulli i.i.d. arrivals
  usual testbed in the literature
    • “easy to schedule”
  $\Lambda = \frac{\rho}{N} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$
Diagonal traffic
  Bernoulli i.i.d. arrivals
  critical to schedule, since only two matchings are good
  $\Lambda = \frac{\rho}{3} \begin{bmatrix} 2 & 0 & 0 & 1 \\ 1 & 2 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 1 & 2 \end{bmatrix}$
(the leading factor $\rho$ denotes the normalized load)

Traffic scenarios
LogDiagonal traffic
  Bernoulli i.i.d. arrivals
  more critical than uniform, less than diagonal traffic
  $\Lambda = \frac{\rho}{2^N - 1} \begin{bmatrix} 8 & 1 & 2 & 4 \\ 4 & 8 & 1 & 2 \\ 2 & 4 & 8 & 1 \\ 1 & 2 & 4 & 8 \end{bmatrix}$
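The three traffic scenarios can be generated programmatically for any switch size. This is a sketch under stated assumptions: the matrices are normalized so that every row and column sums to the load, the parameter `load` stands for the normalized load, and the cyclic placement of the entries follows the 4x4 examples on the slides. All function names are illustrative.

```python
from fractions import Fraction

def uniform(n, load):
    """Every input/output pair carries load / n."""
    return [[Fraction(load) / n for _ in range(n)] for _ in range(n)]

def diagonal(n, load):
    """2/3 of the load on the main diagonal, 1/3 on one cyclic neighbour."""
    load = Fraction(load)

    def entry(i, j):
        if j == i:
            return load * Fraction(2, 3)
        if j == (i - 1) % n:
            return load * Fraction(1, 3)
        return Fraction(0)

    return [[entry(i, j) for j in range(n)] for i in range(n)]

def log_diagonal(n, load):
    """Cyclically shifted powers of two, normalized by 2^n - 1."""
    load = Fraction(load)
    total = 2 ** n - 1
    return [[load * Fraction(2 ** ((j - i - 1) % n), total) for j in range(n)]
            for i in range(n)]

for row in log_diagonal(4, 1):
    print([str(x) for x in row])
```

Exact fractions are used so that row and column sums can be compared with the load without rounding error.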

Output Queued (OQ) switches
Sin = 1, Sout = N
used for low bandwidth routers
no coordination among ports
work-conserving
  • best average delays
complete control of delays
  • support of QoS scheduling

Output Queued (OQ) switch
[Figure: OQ switch with inputs 1 to N, switching fabric with speedup N, and outputs 1 to N]

OQ performance
[Figure: performance of OQ under uniform traffic]
Note: OQ is optimal from the point of view of average delay and throughput

Simple Input Queued (IQ) switches
Sin = 1, Sout = 1
1 FIFO queue for each input port
throughput limitations
  due to head of the line (HOL) blocking
scheduling
  to solve contentions for the same output
[Figure: simple IQ switch with inputs, switching fabric, and outputs 1 to N]

Head of the Line (HOL) Blocking

RWP

Simple IQ switch performance
[Figure: OQ vs. simple IQ under uniform traffic; the simple IQ curve is annotated with $2 - \sqrt{2} \approx 58\%$]

Improving simple IQ switches
Window/bypass schedulers
  the first w cells of each queue contend for outputs
  HOL blocking is reduced, not eliminated
  w = 1 means FIFO at each input
  higher complexity
    • the scheduler deals with wN cells
    • non-FIFO queues

Improving IQ switches
Virtual output queueing (VOQ)
  one queue for each input/output pair
    • N queues at each input
    • $N^2$ queues in the whole switch
  eliminates HOL blocking
  used in high-bandwidth routers
    • scheduling implemented in hardware at very high speed
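A VOQ input port can be pictured as an array of N independent FIFO queues, one per output. The sketch below is a software illustration only (real routers implement this in hardware); the class `VOQInput` and the helper `request_matrix` are hypothetical names.

```python
from collections import deque

class VOQInput:
    """One input port holding a separate FIFO queue for each output."""

    def __init__(self, n_outputs):
        self.queues = [deque() for _ in range(n_outputs)]

    def enqueue(self, cell, output):
        self.queues[output].append(cell)

    def dequeue(self, output):
        # A cell for a busy output never blocks cells for other outputs,
        # because each output has its own queue.
        return self.queues[output].popleft()

    def occupancy(self):
        return [len(q) for q in self.queues]

def request_matrix(ports):
    """Queue lengths X[i][j] seen by the scheduler."""
    return [p.occupancy() for p in ports]

ports = [VOQInput(3) for _ in range(3)]
ports[0].enqueue("a", 2)
ports[0].enqueue("b", 2)
ports[1].enqueue("c", 0)
print(request_matrix(ports))
```

The matrix returned by `request_matrix` is the kind of input a matching scheduler works on.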

IQ switches with VOQ
[Figure: IQ switch with VOQ: inputs 1 to N, each with queues 1 to N, a scheduler subject to input constraints and output constraints, the switching fabric, and outputs 1 to N]
Note: from now on, we always assume VOQ at the switch inputs

Scheduling in IQ switches
Scheduling can be modeled as a matching problem in a bipartite graph
  the edge from node i to node j refers to packets at input i and directed to output j
  the weight of the edge can be
    • binary (not empty/empty queue)
    • queue length
    • HOL cell waiting time, or cell age
    • some other metric indicating the priority of the HOL cell to be served

Scheduling in IQ switches
[Figure: the scheduler takes the request graph between inputs and outputs and produces a matching (or permutation)]

Scheduling in IQ switches
Request Matrix
  3 5 0 0
  2 0 0 4
  4 5 0 0
  0 0 8 2
scheduler
Permutation
  0 1 0 0
  0 0 0 1
  1 0 0 0
  0 0 1 0

Implementing schedulers
Scheduling is a complex task
  a scheduling algorithm can be implemented in hardware if:
    • it shows good performance for a wide range of traffic patterns
    • it can be efficiently parallelized
    • it can be efficiently pipelined
    • it requires few iterations (or clock cycles)
    • it requires limited control information

Scheduling uniform traffic
RWP
A number of algorithms give 100% throughput when traffic is uniform
  For example:
    • TDM and a few variants
    • iSLIP (see later)
[Figure: example of TDM for a 4x4 switch]

Birkhoff - von Neumann theorem
Any doubly stochastic matrix $\Lambda$ can be expressed as convex combination of permutation matrices $\pi_n$:
  $\Lambda = \sum_n a_n \pi_n$
with
  $a_n \ge 0$
  $\sum_n a_n = 1$
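A decomposition of this kind can be computed greedily: find a permutation that lies entirely on positive entries, subtract as much of it as possible, and repeat. The sketch below uses a brute-force search over all permutations, so it is only practical for very small matrices; it is an illustration of the theorem, not a scheduler design. The function name is hypothetical, and exact fractions are used to avoid rounding problems.

```python
from fractions import Fraction
from itertools import permutations

def birkhoff_von_neumann(matrix):
    """Decompose a doubly stochastic matrix into weighted permutations.

    Returns a list of (a_n, perm_n) pairs, where perm_n[i] is the output
    matched to input i. Brute force: only suitable for small N.
    """
    n = len(matrix)
    residual = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in residual for x in row):
        # A permutation on strictly positive entries exists as long as the
        # residual is a positive multiple of a doubly stochastic matrix.
        perm = next(p for p in permutations(range(n))
                    if all(residual[i][p[i]] > 0 for i in range(n)))
        coeff = min(residual[i][perm[i]] for i in range(n))
        for i in range(n):
            residual[i][perm[i]] -= coeff
        terms.append((coeff, perm))
    return terms

third = Fraction(1, 3)
diagonal_matrix = [
    [2 * third, 0, 0, third],
    [third, 2 * third, 0, 0],
    [0, third, 2 * third, 0],
    [0, 0, third, 2 * third],
]
for coeff, perm in birkhoff_von_neumann(diagonal_matrix):
    print(coeff, perm)
```

For the 4x4 diagonal traffic matrix at full load, the decomposition contains exactly two permutations, with weights 2/3 and 1/3.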

Scheduling non-uniform traffic
thanks to the Birkhoff - von Neumann theorem
If the traffic is known and admissible, 100% throughput can be achieved by a TDM using:
  for a fraction of time a1 matching M1
  for a fraction of time a2 matching M2
  ...
  for a fraction of time ak matching Mk

Maximum Size Matching
Maximum Size Matching (MSM)
  among all the possible matchings, selects the one with the highest number of edges
    • MSM is generally not unique
  the best MSM algorithm requires $O(N^{2.5})$ iterations, and cannot be implemented efficiently, since it is based on a flow augmentation path algorithm

Instability of MSM
Assume:
  P(arrival at Q12) = $\lambda$
  P(arrival at Q11) = P(arrival at Q22) = $1 - \lambda - \delta$
  Q12 = B » 0, Q11 = Q22 = 0
  in case of parity serve Q11 and/or Q22 instead of Q12
Observe: Q12 is served only when A11 = 0 and A22 = 0, i.e. with probability:
  P(serve Q12) = P(no arrivals at both Q11 and Q22) = $[1 - (1 - \lambda - \delta)]^2 = (\lambda + \delta)^2$
  P(serve Q12) < P(arrival at Q12) if $\delta$ is small enough
Example: $\lambda = 0.5$; $\delta = 0.1$; P(serve Q12) = 0.36
[Figure: 2x2 switch with inputs In1, In2 and outputs Out1, Out2; two flows are labeled $1 - \lambda - \delta$]
Note: this proof is due to I. Keslassy, Stanford Univ.
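The arithmetic of the example can be reproduced in a few lines. In this sketch `lam` stands for P(arrival at Q12) and `delta` for the slack that makes the arrival rate at Q11 and Q22 equal to 1 - lam - delta; both names are chosen here for illustration.

```python
def msm_service_probability(lam, delta):
    """P(serve Q12) when ties are broken in favour of Q11 and Q22.

    Q12 is served only if neither Q11 nor Q22 receives a cell, and each
    of them receives a cell with probability 1 - lam - delta.
    """
    p_no_arrival = 1 - (1 - lam - delta)
    return p_no_arrival ** 2

lam, delta = 0.5, 0.1
p_serve = msm_service_probability(lam, delta)
print(f"P(serve Q12) = {p_serve:.2f}, arrival rate = {lam}")
print("Q12 grows without bound" if p_serve < lam else "Q12 is stable")
```

With these values the service probability (0.36) is below the arrival rate (0.5), so Q12 keeps growing even though the traffic is admissible.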

Maximum Size Matching
MSM maximizes the instantaneous throughput
MSM may not yield 100% throughput
  short term decisions can be inefficient in the long term
  non-binary edge weights allow MWM to maximize the long-term throughput

Maximum Weight Matching
Maximum Weight Matching (MWM)
  among all the possible N! matchings, selects the one with the highest weight (sum of the edge metrics)
    • MWM is generally not unique
  MWM is too complex to be implemented in hardware at high speed
    • the best MWM algorithm requires $O(N^3)$ iterations, and cannot be implemented efficiently, since it is based on a flow augmentation path algorithm
    • cannot be parallelized and pipelined efficiently
  MWM has never been implemented in a commercial chipset
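For a very small switch, MWM can be illustrated by enumerating all N! matchings directly. The sketch below does exactly that; it is a brute-force illustration of the definition, not the flow-augmentation algorithm mentioned above, and the function name is hypothetical. The weights are the queue lengths of the 4x4 request matrix shown earlier.

```python
from itertools import permutations

def maximum_weight_matching(weights):
    """Exhaustive MWM: try all N! permutations and keep the heaviest one.

    Returns (perm, weight), where perm[i] is the output matched to input i.
    """
    n = len(weights)

    def total(perm):
        return sum(weights[i][perm[i]] for i in range(n))

    best = max(permutations(range(n)), key=total)
    return best, total(best)

request = [
    [3, 5, 0, 0],
    [2, 0, 0, 4],
    [4, 5, 0, 0],
    [0, 0, 8, 2],
]
perm, weight = maximum_weight_matching(request)
print(perm, weight)
```

For this matrix the heaviest matching pairs inputs 1, 2, 3, 4 with outputs 2, 4, 1, 3 (0-based: 1, 3, 0, 2), which is the permutation shown on the request-matrix slide.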

Maximum Weight Matching
In case of unknown traffic, MWM is the optimal solution of the scheduling problem when the weight is either the queue length or the cell age
  achieves 100% throughput under any admissible traffic
    • also under non-Bernoulli arrival processes, satisfying the law of large numbers
  achieves low average delays, very close to those of OQ switches
  possible starvation for lightly loaded packet flows

MWM with pipeline and latency
Let T and P be fixed
Dt denotes the matching used at time t
The following variations of MWM also achieve 100% throughput:
  Dt = MWM(t-P): MWM with pipeline degree P
  Dt = MWM(ceil(t/T)·T): MWM with latency T
  combinations of both
thus, it seems easy to achieve 100% throughput!

MWM with pipeline and latency
But:
  What about throughput?
    • 100% throughput
      – but needs the computation of a MWM …
  What about delays?
    • delays can be really bad!

General consideration
When scheduling in IQ switches, it is very difficult to achieve simultaneously
  high throughput
  low delay
  limited implementation complexity

Uniform traffic
MWM and MSM behave almost identically
[Figure: mean delay (log scale) vs. normalized load under uniform traffic, curves for MWM and MSM]

LogDiagonal traffic
MSM is somewhat inferior to MWM
[Figure: mean delay (log scale) vs. normalized load under LogDiagonal traffic, curves for MWM and MSM]

Diagonal traffic
MSM yields much longer delays than MWM at medium/high loads
[Figure: mean delay (log scale) vs. normalized load under diagonal traffic, curves for MWM and MSM]

Approximations of MSM and MWM
Motivation
  strong interest in scheduling algorithms with
    • very low complexity
    • high performance
Usually
  implementable schedulers (low complexity): low throughput, long delays
  theoretical schedulers (high complexity): high throughput, short delays

Some implementable algorithms
Approximate MSM
  WFA, iSLIP, 2DRR, RC, FIRM and many others
Approximate MWM with $w_{ij} = X_{ij}$ (queue length)
  iLQF, RPA, learning algorithms
Approximate MWM with $w_{ij}$ = cell age
  iOCF
Approximate MWM with $w_{ij} = \sum_i X_{ij} + \sum_j X_{ij}$
  iLPF, MUCS

APPROXIMATIONS OF MAXIMUM SIZE MATCHING

Wave Front Arbiter
RWP
[Figure: 4x4 example showing the requests and the resulting match]

Wave Front Arbiter
RWP
[Figure: requests and resulting match]
2N-1 steps
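The 2N-1 steps correspond to the anti-diagonals of the request matrix. The sketch below follows the usual description of a wave front arbiter, which the slides only show as a figure: the wave sweeps the crosspoints diagonal by diagonal, and a requesting crosspoint is granted if its row and its column are both still free. Crosspoints on the same anti-diagonal never share a row or a column, so they can be decided in parallel. The function name and the fixed starting corner are assumptions.

```python
def wave_front_arbiter(requests):
    """Sweep the N x N request matrix in 2N-1 anti-diagonal waves.

    requests[i][j] is truthy if input i has a cell for output j.
    Returns a dict mapping each matched input to its output.
    """
    n = len(requests)
    row_free = [True] * n
    col_free = [True] * n
    match = {}
    for wave in range(2 * n - 1):          # wave k holds cells with i + j == k
        for i in range(n):
            j = wave - i
            if 0 <= j < n and requests[i][j] and row_free[i] and col_free[j]:
                match[i] = j
                row_free[i] = False
                col_free[j] = False
    return match

requests = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
print(wave_front_arbiter(requests))
```

The result is a maximal matching: no further input/output pair with a pending request can be added.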

Wrapped Wave Front Arbiter
RWP
[Figure: requests and resulting match]
N steps instead of 2N-1

iSLIP
iSLIP means “iterative SLIP”
iterates among the following 3 phases
  Request
  Grant
  Accept

iSLIP
3 phases:
  Request (from inputs to outputs)
    • each unmatched input sends a request to every output for which it has a cell
  Grant (from outputs to inputs)
    • if an unmatched output receives requests, it sends a grant to one of the inputs
      – contentions solved by a round-robin mechanism
  Accept (from inputs to outputs)
    • if an unmatched input receives grants, it selects a single output and it becomes matched to it
      – contentions solved by a round-robin mechanism
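The three phases can be sketched in software as follows. The slide only states that contentions are solved by round robin; the pointer update rule used here (pointers move one position past the chosen port, only when the grant is accepted and only in the first iteration) is taken from the commonly published description of iSLIP and should be read as an assumption. All names are illustrative, and ports are 0-based.

```python
def islip(requests, grant_ptr, accept_ptr, iterations):
    """Run iSLIP for one time slot; returns in_match (input -> output or None).

    grant_ptr[j] and accept_ptr[i] are the round-robin pointers of output j
    and input i; they are updated in place.
    """
    n = len(requests)
    in_match = [None] * n
    out_match = [None] * n
    for it in range(iterations):
        # Request + Grant: each unmatched output grants the first requesting
        # unmatched input at or after its pointer.
        grants = {}
        for j in range(n):
            if out_match[j] is not None:
                continue
            for k in range(n):
                i = (grant_ptr[j] + k) % n
                if in_match[i] is None and requests[i][j]:
                    grants.setdefault(i, []).append(j)
                    break
        # Accept: each input accepts the first granting output at or after
        # its pointer.
        for i, outs in grants.items():
            j = min(outs, key=lambda o: (o - accept_ptr[i]) % n)
            in_match[i] = j
            out_match[j] = i
            if it == 0:  # assumed rule: pointers move only in iteration 1
                grant_ptr[j] = (i + 1) % n
                accept_ptr[i] = (j + 1) % n
    return in_match

requests = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
print(islip(requests, [0] * 4, [0] * 4, iterations=1))
print(islip(requests, [0] * 4, [0] * 4, iterations=2))
```

With a single iteration one input stays unmatched in this example; the second iteration completes the matching.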

iSLIP

The round robin mechanism in iSLIP is designed so that, under uniform traffic, iSLIP emulates a dynamic TDM scheduler synchronized on the arrival pattern

iSLIP
iSLIP is maximal
  • often, with log N iterations
  • always, with N iterations
iSLIP was implemented on one chip in the Cisco 12000 router
  http://www.cisco.com/warp/public/cc/pd/rt/12000/tech/fasts_wp.pdf

iSLIP

iSLIP demo

from: http://tiny-tera.stanford.edu/tiny-tera/demos/index.html

APPROXIMATIONS OF MAXIMUM WEIGHT MATCHING

iLQF
iLQF means “iterative Longest Queue First”
iterates among the following 3 phases
  Request
  Grant
  Accept

iLQF
3 phases:
  Request (from inputs to outputs)
    • each unmatched input sends all its queue lengths as requests to corresponding outputs
  Grant (from outputs to inputs)
    • if an unmatched output receives requests, it sends a grant to the input corresponding to the longest queue
      – contentions solved by random choice
  Accept (from inputs to outputs)
    • if an unmatched input receives grants, it selects the output with the longest queue
      – contentions solved by random choice
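The same request/grant/accept loop can be sketched for iLQF, with queue lengths replacing round-robin pointers. This is an illustrative software model with 0-based ports; the function name is hypothetical, and a seeded random generator is used for tie-breaking only so that runs are repeatable.

```python
import random

def ilqf(queue_len, iterations, rng=None):
    """Run iLQF for one time slot; returns in_match (input -> output or None)."""
    rng = rng or random.Random(0)   # fixed seed: an assumption for repeatability
    n = len(queue_len)
    in_match = [None] * n
    out_match = [None] * n

    def longest(candidates, length):
        """Pick the candidate with the longest queue, ties broken at random."""
        top = max(length(c) for c in candidates)
        return rng.choice([c for c in candidates if length(c) == top])

    for _ in range(iterations):
        # Request + Grant: each unmatched output grants its longest requester.
        grants = {}
        for j in range(n):
            if out_match[j] is not None:
                continue
            reqs = [i for i in range(n)
                    if in_match[i] is None and queue_len[i][j] > 0]
            if reqs:
                winner = longest(reqs, lambda i: queue_len[i][j])
                grants.setdefault(winner, []).append(j)
        # Accept: each input accepts the granting output with its longest queue.
        for i, outs in grants.items():
            j = longest(outs, lambda o: queue_len[i][o])
            in_match[i] = j
            out_match[j] = i
    return in_match

queue_len = [
    [3, 5, 0, 0],
    [2, 0, 0, 4],
    [4, 6, 0, 0],
    [0, 0, 8, 2],
]
print(ilqf(queue_len, iterations=2))
```

The example matrix is chosen without ties, so the outcome does not depend on the random choices.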

iLQF
iLQF is maximal
  • often, with log N iterations
  • always, with N iterations
iLQF is robust to non-uniform traffic

iLQF

iLQF demo

from: http://tiny-tera.stanford.edu/tiny-tera/demos/index.html

RPA
RPA means “Reservation with Preemption and Acknowledgment”
Two phases
  Reservation (possibly preemptive)
  Acknowledgement
Sequential accesses to a reservation vector
  Urgj (if set) is the urgency of the transfer from input Inj to output j
Vector Res:
  Out 1: (Urg1, In1)   Out 2: (Urg2, In2)   Out 3: (Urg3, In3)   ...   Out N: (UrgN, InN)

Page 80: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

RPA

Vector Res is sequentially accessed by all inputs

[Figure: Input 1, Input 2, Input 3 and Input 4 access Res one after the other]

Page 81: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

RPA

Initially, at each round: Urgj = 0 for all j

Reservation phase: when input i accesses Res, it
• computes Wj = Xij – Urgj for all j
• finds j* such that Wj* = max{ Wj }
• if Wj* > 0, reserves output j* and sets Urgj* = Xij*, possibly overwriting the previous reservation
• otherwise, leaves the current reservation

Page 82: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

RPA

Acknowledgement phase
• if input i still finds its reservation at output j, it books output j
• otherwise, it chooses an unreserved output j and books output j
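The two RPA phases can be sketched as follows. This is a minimal sketch under stated assumptions, not the published algorithm in full: the urgency is taken to be the queue length Xij, inputs access Res in a fixed order, and a preempted input picks its longest non-empty VOQ among the unreserved outputs (the slides do not say how that output is chosen). The name `rpa` is hypothetical.

```python
def rpa(X, order=None):
    """Sketch of RPA: returns a matching {input: output} for VOQ lengths X."""
    n = len(X)
    order = list(range(n)) if order is None else list(order)
    urg = [0] * n        # Urg_j, reset at each round
    owner = [None] * n   # In_j, the input currently holding output j

    # Reservation phase: inputs access the vector Res one after the other.
    for i in order:
        w = [X[i][j] - urg[j] for j in range(n)]
        j_star = max(range(n), key=lambda j: w[j])
        if w[j_star] > 0:
            # Reserve j*, possibly preempting the previous reservation.
            urg[j_star] = X[i][j_star]
            owner[j_star] = i

    # Acknowledgement phase: confirm surviving reservations, and let
    # preempted inputs fall back to an output nobody has reserved.
    match = {}
    for i in order:
        kept = [j for j in range(n) if owner[j] == i]
        if kept:
            match[i] = kept[0]
            continue
        free = [j for j in range(n) if owner[j] is None and X[i][j] > 0]
        if free:
            j = max(free, key=lambda j: X[i][j])  # assumption: longest VOQ
            owner[j] = i
            match[i] = j
    return match
```

With `X = [[3, 1, 0], [5, 0, 0], [0, 0, 1]]`, input 0 first reserves output 0, input 1 preempts it (urgency 5 against 3), and in the acknowledgement phase input 0 falls back to the unreserved output 1.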

Page 83: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Uniform traffic

comparison between MWM, iSLIP, iLQF, and RPA

[Figure: mean delay (log scale, 1 to 1000) versus normalized load (0.1 to 1.0) under uniform traffic; curves for MWM, iSLIP, iLQF, RPA]

Page 84: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

LogDiagonal traffic

iSLIP saturates close to 84% throughput

[Figure: mean delay (log scale, 1 to 100000) versus normalized load (0.1 to 1.0) under LogDiagonal traffic; curves for MWM, iSLIP, iLQF, RPA]

Page 85: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Diagonal traffic

RPA achieves 98% throughput, iLQF 87%, iSLIP 83%

[Figure: mean delay (log scale, 1 to 100000) versus normalized load (0.1 to 1.0) under diagonal traffic; curves for MWM, iSLIP, iLQF, RPA]

Page 86: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

LEARNING ALGORITHMS

Page 87: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Learning algorithms

Goal: find a good compromise among throughput, delay and complexity

Page 88: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Learning algorithms

Key observation

the matchings generated by MWM show limited changes from one time slot to the next
• remembering the matching from the past simplifies the computation of the new matching

the search implemented by MWM can be enhanced
• with a randomized approach
• by observing arrivals
• by searching in parallel

based on an extension of randomized scheduling algorithms

Page 89: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Simple Randomized Schemes

Choose a matching at random and use it as the schedule
• doesn’t yield 100% throughput

Choose 2 matchings at random and use the heavier one as the schedule

…

Choose N matchings at random and use the heaviest one as the schedule

None of these can give 100% throughput!
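As a rough illustration of these memoryless schemes, the sketch below draws d matchings uniformly at random and keeps the heaviest. The function name `heaviest_of_random`, the weight matrix `X` (queue lengths) and the representation of a matching as a permutation list are assumptions made for the example.

```python
import random

def heaviest_of_random(X, d, rng=random):
    """Draw d uniform random matchings and return the heaviest one.

    A matching is a list p where input i is connected to output p[i].
    """
    n = len(X)
    best, best_weight = None, -1
    for _ in range(d):
        p = list(range(n))
        rng.shuffle(p)                      # uniform over the N! matchings
        weight = sum(X[i][p[i]] for i in range(n))
        if weight > best_weight:
            best, best_weight = p, weight
    return best
```

Nothing is carried over from one time slot to the next here, which is the ingredient the schemes on the following slides add.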

Page 90: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Simple randomized algorithms, 32x32

[Figure: mean IQ length (log scale, 0.001 to 10000) versus normalized load (0.0 to 1.0) under diagonal traffic; curves for MWM, R32, R1]

Page 91: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Bounds on Maximum Throughput

Page 92: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Tassiulas’ scheme

Consider the following policy
• Rt = matching picked at random (uniformly) among all the possible N! matchings
• Dt = the heavier of Dt-1 and Rt, i.e. Dt = arg max { W(D) : D ∈ {Dt-1, Rt} }

Complexity is very low
• O(1) iterations
• easy to pipeline

Yields 100% throughput!
• note the boost in throughput is due to memory of the past matching Dt-1

However, delays are very large
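One time slot of this policy can be sketched as below. The names `weight` and `tassiulas_step`, and the use of queue lengths `X` as edge weights, are assumptions for illustration; a matching is represented as a permutation list.

```python
import random

def weight(X, p):
    """W(p): total weight of matching p, where input i uses output p[i]."""
    return sum(X[i][p[i]] for i in range(len(p)))

def tassiulas_step(X, prev, rng=random):
    """Return Dt given Dt-1 = prev and the current queue lengths X."""
    r = list(range(len(X)))
    rng.shuffle(r)                  # Rt: uniform over the N! matchings
    # Keep the previous matching unless the random one is strictly heavier.
    return r if weight(X, r) > weight(X, prev) else prev
```

The step never lowers the weight relative to Dt-1 measured on the current queues, which is the memory effect the slide refers to.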

Page 93: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Tassiulas' scheme, 32x32

[Figure: mean IQ length (log scale, 0.01 to 10000) versus normalized load (0.1 to 1.0) under diagonal traffic; curves for MWM, Tassiulas]

Page 94: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Learning approach

[Diagram: COMP1 takes Dt-1 and Mt as inputs and produces Dt]

Properties of COMP1
• W(Dt) ≥ W(Dt-1)
• W(Dt) ≥ W(Mt)

Examples:
• COMP1 is the MAX among Dt-1 and Mt
• COMP1 is the MERGE among Dt-1 and Mt

Page 95: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Merging

MERGE procedure

[Figure: a matching X with W(X)=12 and a matching R with W(R)=10 are merged into a matching M with W(M)=13. The figure annotates two groups of edges with the weight differences 3-1+2-2=2 and 2-1+2-4=-1.]

Emulating MWM is O(N)
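A possible sketch of a MERGE step is given below. It assumes both matchings are full permutations and relies on the standard observation that the union of two perfect matchings splits into disjoint cycles, so that the heavier matching can be chosen cycle by cycle. The name `merge` and the weight matrix `W` are assumptions for the example.

```python
def merge(W, p, q):
    """Merge matchings p and q (permutation lists) under edge weights W.

    Inside each cycle of the union of p and q, keep the edges of whichever
    matching is heavier on that cycle.
    """
    n = len(p)
    q_inv = [0] * n
    for i, j in enumerate(q):
        q_inv[j] = i                # q_inv[j]: input that q connects to output j
    merged = [None] * n
    seen = [False] * n
    for start in range(n):
        if seen[start]:
            continue
        # Walk the cycle: input i -> output p[i] -> input matched to it by q.
        cycle = []
        i = start
        while not seen[i]:
            seen[i] = True
            cycle.append(i)
            i = q_inv[p[i]]
        wp = sum(W[i][p[i]] for i in cycle)
        wq = sum(W[i][q[i]] for i in cycle)
        chosen = p if wp >= wq else q
        for i in cycle:
            merged[i] = chosen[i]
    return merged
```

Each input is visited once, so the work is linear in N, and the result is at least as heavy as both p and q.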

Page 96: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

The learning approach

[Diagram: COMP1 takes Dt-1 and Mt as inputs and produces Dt]

Properties of Mt
• informally, Mt should be a “good” sample in the space of all possible matchings

Examples:
• Mt is a matching picked uniformly at random
• Mt is a matching picked non-uniformly at random, with a high probability of being heavy
• Mt is derived from the arrival vector At
• Mt is a good “neighbor” of Dt-1

Page 97: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Theoretical properties

[Diagram: COMP1 takes Dt-1 and Mt as inputs and produces Dt]

Stability
• 100% throughput under any admissible Bernoulli traffic pattern

Delay
• the heavier the matching Mt, the smaller the queue lengths, and hence the smaller the delays

Page 98: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Example of practical implementation

Exploiting parallel search:

[Diagram with the elements Dt-1, its neighbors N1 … NK (NK is the K-th neighbor of Dt-1), the arrivals At, Mt, two MAX blocks, and the output Dt]

This scheme is called APSARA

Page 99: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

What is a “neighbor” of a matching?

• Each neighbor
– differs from Dt-1 in ONLY TWO edges
– can be generated very easily in hardware

• Example: 3 x 3 switch

[Figure: Dt-1 and its 3 neighbors N1, N2, N3]
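The neighbor relation can be sketched by swapping the outputs of two inputs, which changes exactly two edges of the matching. The snippet below also shows a simplified selection step that keeps the heaviest among Dt-1 and its neighbors; the optional `extra` argument stands in for any other candidate (for example a matching derived from arrivals). The names `neighbors` and `apsara_step` are assumptions, and a hardware scheduler would evaluate the candidates in parallel rather than in a loop.

```python
from itertools import combinations

def neighbors(p):
    """All matchings that differ from permutation p in exactly two edges."""
    result = []
    for a, b in combinations(range(len(p)), 2):
        nb = list(p)
        nb[a], nb[b] = nb[b], nb[a]   # inputs a and b exchange their outputs
        result.append(nb)
    return result

def apsara_step(X, prev, extra=()):
    """Pick the heaviest matching among prev, its neighbors and extra ones."""
    def w(p):
        return sum(X[i][p[i]] for i in range(len(p)))
    candidates = [list(prev)] + neighbors(prev) + [list(e) for e in extra]
    return max(candidates, key=w)
```

For a 3 x 3 switch `neighbors([0, 1, 2])` returns 3 matchings, in line with the example on the slide.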

Page 100: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Max-APSARA

APSARA, as described before, is not maximal

Max-APSARA is a modified version of APSARA where a maximal size matching algorithm runs on the remaining unmatched inputs/outputs

e.g., if k inputs/outputs are unmatched,
• run iSLIP with k iterations
• select k random edges among the unmatched inputs/outputs

Page 101: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

APSARA performance

[Figure: mean IQ length (log scale, 0.01 to 10000) versus normalized load (0.1 to 1.0) under diagonal traffic; curves for MWM, MaxAPSARA, APSARA, iSLIP, iLQF]

Page 102: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Outline

IP routers OQ routers IQ routers

Scheduling Optimal algorithms Heuristic algorithms Packet-mode algorithms Networks of routers

CIOQ routers Multicast traffic Conclusions

Page 103: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Routers and switches

IP routers deal with variable-size packets

Hardware switching fabrics often deal with fixed-size cells

Question: how to integrate a hardware switching fabric within an IP router?

Page 104: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

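Router based on an IQ cell switch: cell-mode

[Diagram: inputs 1 … N each pass through an ISM (input segmentation module) into the IQ cell switch, which is the switching fabric; outputs 1 … N each pass through an ORM (output reassembly module)]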

Page 105: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Cell-mode scheduling

Scheduling algorithms work at cell level

pros:
• 100% throughput achievable

cons:
• interleaving of packets at the outputs of the switching fabric

Page 106: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Router based on an IQ cell switch: packet-mode

[Diagram: ISMs at inputs 1 … N, the IQ cell switch (switching fabric), ORMs at outputs 1 … N]

NO packet interleaving if packet-mode

Page 107: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Router based on an IQ cell switch: packet-mode

[Diagram: ISMs at inputs 1 … N, the IQ cell switch (switching fabric), ORMs at outputs 1 … N]

NO packet interleaving if packet-mode

ORMs can be removed

Page 108: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Packet-mode scheduling

Rule: packets transferred as trains of cells
• when an input starts transferring the first cell of a packet comprising k cells, it continues to transfer in the following k-1 time slots

Pros:
• no interleaving of packets at the outputs
• easy extension of traditional schedulers

Cons:
• starvation due to long packets
– inherent in packet systems without preemption
– negligible for high speed rates
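The rule can be sketched as a thin wrapper around any cell scheduler: ports that are in the middle of a packet stay connected, and the cell scheduler only decides among the free ports. Everything named here (`packet_mode_step`, the `in_progress` dictionary, the `cell_scheduler` callback and its return format) is a hypothetical interface chosen for the example.

```python
def packet_mode_step(in_progress, cell_scheduler, n):
    """One time slot of packet-mode scheduling on an n x n switch.

    in_progress:    {input: (output, cells_left)} for packets being sent,
                    where cells_left counts the cell sent in this slot.
    cell_scheduler: callable(free_inputs, free_outputs) returning
                    {input: (output, packet_length_in_cells)} for packets
                    whose first cell is transferred in this slot.
    Returns (matching for this slot, in_progress for the next slot).
    """
    busy_out = {out for out, _ in in_progress.values()}
    free_in = [i for i in range(n) if i not in in_progress]
    free_out = [j for j in range(n) if j not in busy_out]

    # Connections held by ongoing packets are kept as they are.
    matching = {i: out for i, (out, _) in in_progress.items()}
    nxt = {i: (out, left - 1) for i, (out, left) in in_progress.items() if left > 1}

    # New packets may start only on ports that are free in this slot.
    for i, (out, length) in cell_scheduler(free_in, free_out).items():
        matching[i] = out
        if length > 1:
            nxt[i] = (out, length - 1)
    return matching, nxt
```

A packet of k cells that starts in this slot keeps its input and output for the following k-1 slots, so cells of different packets never interleave at an output.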

Page 109: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Packet-mode scheduling

Questions
• can packet mode provide high throughput? YES!
• what about delays? It depends…

Page 110: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Packet-mode properties

Main theoretical results
• MWM in packet-mode yields 100% throughput
• Packet mode can provide shorter delays than cell mode, depending on the packet length distribution

Page 111: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Simulation scenario

Router with ISMs and ORMs

Uniform packet traffic
• uniform packet load
• uniform (1,192) packet size distribution

Spotted packet traffic
• non uniform packet load
• bimodal (3,100) packet size distribution

P =

    1 1 1 0 1 0 1 0
    0 1 0 1 1 1 0 1
    1 0 1 0 1 1 1 0
    1 1 0 1 0 1 0 1
    1 0 1 0 1 0 1 1
    0 1 0 1 0 1 1 1
    1 0 1 1 1 0 1 0
    0 1 1 1 0 1 0 1

Page 112: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Uniform packet traffic

Packet mode and cell mode reach the same throughput

[Figure, two panels (cell-mode and packet-mode): mean packet delay (log scale, 100 to 100000) versus normalized load (0.1 to 1.0) under uniform packet traffic; curves for MWM, MSM, iSLIP, iLQF]

Page 113: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Spotted packet traffic

Packet mode reaches higher throughput than cell mode

[Figure, two panels (cell-mode and packet-mode): mean packet delay (log scale, 100 to 100000) versus normalized load (0.5 to 1.0) under spotted packet traffic; curves for MWM, MSM, iSLIP, iLQF]

Page 114: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Effect of packet size distribution

iSLIP delayCM/delayPM for different packet size distributions

At high load PM becomes better

[Figure: packet mode gain for iSLIP (ratio from 0 to 2; above 1 packet mode is better, below 1 cell mode is better) versus normalized load (0.1 to 1.0); curves for Uniform, Exponential, Trimodal, Bimodal packet size distributions]

Page 115: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Packet mode features

Packet mode scheduling

is a feasible modification of schedulers

improves throughput
• but it can generate some unfairness between long and short packets
– inherent to all variable-packet networks without preemption

may give better packet delays than cell mode
• depends on the packet size distribution

Page 116: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Outline

IP routers OQ routers IQ routers

Scheduling Optimal algorithms Heuristic algorithms Packet-mode algorithms Networks of routers

CIOQ routers Multicast traffic Conclusions

Page 117: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Network of IQ routers

Question: given a network of IQ switches and an admissible input traffic, is the network always stable?

NO!

this is quite counterintuitive… but true

Page 118: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Networks of IQ routers

Consider the acyclic network of IQ routers in the following slide

derived from well established results from adversarial queueing theory

a very specific scenario, but comprises only few switches…
• this situation may not be common, but cannot be excluded in real networks

Page 119: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Pathological network of IQ switches

Network with 8 switches and 4 flows

Page 120: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Instability of MWM

If MWM is adopted at each IQ router, and the traffic is admissible, the system can be unstable under Bernoulli i.i.d. arrivals

Page 121: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Instability of MWM

MWM is too greedy, in the sense that it can create traffic bursts that are amplified by each scheduler

A server can be idling when large bursts (directed to it) are blocked because of the contentions upstream

the problem arises when a packet flow is subject to priority changes along its path through the network
• it is “dangerous” to increase priority along the path

Page 122: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Stability in networks of routers

Global policies

“Oldest in the network” and many others
• problem: requires global information about the network, and perfectly synchronized clocks at the ingress of the network

Local policies

until now, nothing really satisfying known … (work in progress)

Page 123: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Stability in networks of routers

Semi-local policies

MWM with local information about the router neighbors can achieve 100% throughput under i.i.d. Bernoulli arrivals

Virtual network queue
• the weights used by MWM are:
– wij = max{0, Xij - H(Xij)}

where H(Xij) is the size of the upstream queue which is sending packets to Xij
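The weight rule is simple enough to write out directly. In the sketch below, `X[i][j]` is the local queue length and `H[i][j]` the size of the corresponding upstream queue, both given as N x N lists; the brute-force `mwm` helper is only there to make the example self-contained and is practical for very small N only. All names are assumptions.

```python
from itertools import permutations

def virtual_queue_weights(X, H):
    """w_ij = max(0, X_ij - H_ij), computed element by element."""
    return [[max(0, x - h) for x, h in zip(row_x, row_h)]
            for row_x, row_h in zip(X, H)]

def mwm(W):
    """Brute-force maximum weight matching; p[i] is the output of input i."""
    n = len(W)
    return max(permutations(range(n)),
               key=lambda p: sum(W[i][p[i]] for i in range(n)))
```

For instance, with `X = [[5, 2], [0, 7]]` and `H = [[3, 4], [1, 7]]` the weights are `[[2, 0], [0, 0]]`, so only the queue that is longer than its upstream counterpart contributes to the matching weight.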

Page 124: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Outline

IP routers OQ routers IQ routers

Scheduling Optimal algorithms Heuristic algorithms Packet-mode algorithms Networks of routers

CIOQ routers Multicast traffic Conclusions

Page 125: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

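CIOQ routers

[Diagram of a CIOQ router with the elements: Input 1 … Input N with VOQs, the switching fabric, output queues o1 … oN at Output 1 … Output N, and the speedup S marked on the links into and out of the fabric]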

Page 126: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

CIOQ routers

Question: if a low speedup S is allowed (and queues are available at both inputs and outputs), is it possible to design simple scheduling algorithms, capable of achieving high throughput and low delay?

YES!

Page 127: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

CIOQ routers with S=2

If S = 2

it is easy to obtain 100% throughput
• all maximal matchings work
– based on stable marriage algorithms

it is less easy to obtain work conservation
• output never idling whenever a packet is present destined to it
• same average delays as OQ
• very good delay performance
• e.g.: LOOFA

it is difficult to perfectly emulate OQ…

Page 128: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

LOOFA

The occupancy Cj
• is the number of cells currently residing at the j-th output queue
• at each time slot, it is decremented by one because of departures

Basic idea of LOOFA
• give priority to output channels with low occupancy, thereby attempting to maintain work-conservation for all outputs

Page 129: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

LOOFA

If S = 2, during each of the two phases
• each unmatched input selects a non-empty VOQ directed to the unmatched output with the lowest occupancy, and sends a request to that output
• each unmatched output selects one of the requests, and sends a grant to that input
• repeat until the matching is maximal

the selection at the outputs can be round robin, random, ...
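One LOOFA phase can be sketched as the loop below. The function name `loofa_phase`, the VOQ matrix `voq` and the occupancy list are assumptions, and the output-side selection simply takes the lowest-numbered requesting input, standing in for the round-robin or random choice mentioned above.

```python
def loofa_phase(voq, occupancy):
    """One matching phase of LOOFA; returns {input: output}.

    voq[i][j]:    cells queued at input i for output j
    occupancy[j]: cells currently held in the queue of output j
    """
    n = len(voq)
    match = {}
    matched_out = set()
    while True:
        # Each unmatched input requests the unmatched output with the
        # lowest occupancy among those it has cells for.
        requests = {}
        for i in range(n):
            if i in match:
                continue
            candidates = [j for j in range(n)
                          if j not in matched_out and voq[i][j] > 0]
            if candidates:
                j = min(candidates, key=lambda j: (occupancy[j], j))
                requests.setdefault(j, []).append(i)
        if not requests:
            return match            # no request left: matching is maximal
        # Each requested output grants one input (assumption: lowest index).
        for j, reqs in requests.items():
            match[min(reqs)] = j
            matched_out.add(j)
```

With a speedup of 2 this phase would run twice per time slot, with the occupancies updated in between.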

Page 130: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

CIOQ routers with S=2

If S = 2

it is difficult (but possible) to perfectly emulate an OQ router in terms of packet departures
• it is impossible to distinguish, by observing arrivals and departures, if the switching architecture is CIOQ or OQ
• delays are perfectly controlled
– easy to implement scheduling algorithms born for OQ (e.g., WFQ)

Page 131: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

CIOQ routers

CIOQ are very promising architectures

many degrees of freedom in design
• how to balance input/output buffers
• how the buffers interact
– e.g., by backpressure mechanisms

Several currently designed architectures are supposed to be CIOQ

The speedup S is becoming closer and closer to 1 in practical implementations of new switching architectures (CIOQ → IQ)

Page 132: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Outline

IP routers OQ routers IQ routers

Scheduling Optimal algorithms Heuristic algorithms Packet-mode algorithms Networks of routers

CIOQ routers Multicast traffic Conclusions

Page 133: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Multicast traffic

Misleading (but common) idea: observe

1. OQ can achieve 100% throughput under any admissible unicast and multicast traffic

2. OQ can be perfectly emulated by CIOQ with S = 2

then, with S = 2 it is possible to achieve 100% throughput for multicast traffic

WRONG! because observation 2 holds only for unicast traffic

Page 134: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Multicast traffic

Question: what is the minimum speedup required to achieve 100% throughput?

unknown!

Page 135: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Multicast traffic

Possible implementations

copy network before the switching fabric
• a multicast cell with f destinations is treated as f cells
• possible bandwidth inefficiency

dedicated queue
• multicast packets are treated in some specific way

[Diagrams of the queueing options for an N x N switch, labelled UC, MC and UC+MC]

Page 136: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Multicast traffic: optimal queueing

MC-VOQ queueing

best throughput performance
• avoids HOL blocking

2^N - 1 queues for each input, one for each fanout set
• re-enqueuing process → out-of-sequence problem
• no re-enqueuing → some throughput degradation

[Diagram: each input of the N x N switch holds queues 1 … 2^N - 1 for MC+UC traffic]
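The queue count follows from enumerating the possible fanout sets: every non-empty subset of the N outputs needs its own queue. A small sketch (the function name `fanout_sets` is an assumption):

```python
from itertools import combinations

def fanout_sets(n):
    """All non-empty subsets of the outputs 0 .. n-1, one per MC-VOQ queue."""
    return [frozenset(c)
            for size in range(1, n + 1)
            for c in combinations(range(n), size)]

# For a 3 x 3 switch each input needs 2**3 - 1 = 7 queues; the N singleton
# sets among them correspond to ordinary unicast VOQs.
assert len(fanout_sets(3)) == 2**3 - 1
```

The exponential growth of this count with N is what motivates cheaper queueing structures in practice.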

Page 137: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Multicast traffic: optimal scheduling

The optimal scheduling for multicast traffic can be defined similarly to unicast traffic
• it is a sort of max flow algorithm on all N(2^N - 1) queues

Many heuristics can be envisaged to approximate it

Page 138: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Summary

3 main ingredients for IQ scheduling algorithms:
• Weight computation
• Matching computation
• Contention resolution

Page 139: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Summary

Weight computation
• obtains the priority of each input queue
• the metric can be related to queue length, waiting time of the cell at the HOL, …

Contention resolution
• whenever the selection is among situations with equal weights
• can be round robin, or random

Page 140: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Summary

Matching computation

computes the matching, trying to maximize its total weight

can be based on
• an iterative search, like in iSLIP, iOCF, iLQF
• a matrix greedy approach, like in MUCS, WFA
• a reservation vector, like in RPA
• a learning approach, like in APSARA

Page 141: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Summary

Good IQ scheduling algorithms exist:
• 100% throughput
• short delay
• limited complexity

Performance differences are significant only close to saturation

Page 142: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Summary

Open questions concerning IQ schedulers:
• QoS guarantees
• stability of networks of switches
• multicast traffic

Page 143: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Adonet Spring School

143

ReferencesRouter functions and architectures Keshav S., Sharma R., ``Issues and trends in router design'', IEEE Communications Magazine, vol.36, n.5, May 1998,

p.144-151 Bux W., Denzel W.E., Engbersen T., Herkersdorf A., Luijten R.P.,``Technologies and building blocks for fast packet

forwarding'', IEEE Communications Magazine, Jan.2001, pp.70-77 Newman P., Minshall G., Lyon T., Huston L.,``IP switching and gigabit routers'', IEEE Communications Magazine,

Jan.1997, pp.64-69 Wolf T., Turner J.S., ``Design issues for high-performance active routers'', IEEE Journal on Selected Areas in

Communications, vol.19, n.3, Mar.2001, pp.404-409

Scheduling in IQ switches Karol M., Hluchyj M., Morgan S., ``Input versus output queueing on a space division switch'', IEEE Transactions on

Communications, vol.35, n.12, Dec.1987 McKeown N., Anantharam V., Walrand J.,``Achieving 100\% throughput in an input-queued switch'',IEEE

INFOCOM'96, vol.1, San Francisco, CA, Mar.1996, pp.296-302 McKeown N.,``iSLIP: a scheduling algorithm for input-queued switches'', IEEE Transactions on Networking, vol.7, n.2,

Apr.1999, pp.188-201 McKeown N., Mekkittikul A.,``A practical scheduling algorithm to achieve 100\% throughput in input-queued switches'',

IEEE INFOCOM'98, vol.2, 1998, pp.792-9, New York, NY Tamir Y., Chi H.-C., ``Symmetric crossbar arbiters for VLSI communication switches'', IEEE Transaction on Parallel

and Distributed Systems, vol.4, no.1, Jan.1993, pp.13 –27 Chen H., Lambert J., Pitsilledes A.,``RC-BB switch. A high performance switching network for B-ISDN'',

GLOBECOM 95

Page 144: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Adonet Spring School

144

ReferencesScheduling in IQ switches Anderson T., Owicki S., Saxe J., Thacker C.,``High speed switch scheduling for local area networks'', ACM Transactions

on Computer Systems, vol.11, n.4, Nov.1993 LaMaire R.O., Serpanos D.N., ``Two dimensional round-robin schedulers for packet switches with multiple input queues'',

IEEE/ACM Transaction on Networking, vol.2, n.5, Oct.1994, p.471-482 Chen H., Lambert J., Pitsilledes A., ``RC-BB switch. A high performance switching network for B-ISDN'', IEEE

GLOBECOM 95, 1995 Duan H., Lockwood J.W., Kang S.M., Will J.D., ``A high performance OC12/OC48 queue design prototype for input

buffered ATM switches'', IEEE INFOCOM'97, vol.1, 1997, pp.20-8, Los Alamitos, CA Partridge C., et al., ``A 50-Gb/s IP router'', IEEE Transactions on Networking, vol.6, n.3, June 1998, pp.237-248 Ajmone Marsan M., Bianco A., Leonardi E., Milia L., ``RPA: a flexible scheduling algorithm for input buffered switches'',

IEEE Transactions on Communications, vol.47, n.12, Dec.1999, pp.1921-1933 Ajmone Marsan M., Bianco A., Filippi E., Giaccone P.,Leonardi E., Neri F.,``On the behavior of input queueing switch

architectures'', European Transactions on Telecommunications, vol.10, n.2, Mar.1999, pp.111-124 Christensen K.J.,``Design and evaluation of a parallel-polled virtual output queued switch'', IEEE ICC 2001, vol.1, pp.112-

116, 2001 Serpanos D.N., Antoniadis P.I., ``FIRM: a class of distributed scheduling algorithms for high-speed ATM switches with

multiple input queues'', IEEE INFOCOM 2000, vol.2, pp.548-555, 2000 Ying Jiang, Hamdi, M., “A 2-stage matching scheduler for a VOQ packet switch architecture”, IEEE ICC 2002, vol.4,

pp.2105-2110, 2002 Tassiulas L., ``Linear complexity algorithms for maximum throughput in radio networks and input queued switches'', IEEE

INFOCOM'98, vol.2, New York, NY, 1998, pp.533-539 Giaccone P., Prabhakar B., Shah D., ``Towards simple, high-performance schedulers for high-aggregate bandwidth

switches '', IEEE INFOCOM'02, New York, Jun.2002

Page 145: Adonet Spring School1 Scheduling algorithms for input-queued IP routers Emilio Leonardi in collaboration with: P. Giaccone, M. Ajmone Marsan, A Bianco,

Adonet Spring School

145

References

Packet scheduling in IQ switches

Ajmone Marsan M., Bianco A., Giaccone P., Leonardi E., Neri F., ``Packet scheduling in input-queued cell-based switches'', IEEE INFOCOM'01, Anchorage, Alaska, Apr.2001 (extended version to appear in IEEE Trans. on Networking, about Oct.2002)

Moon S.H., Sung D.K., ``High-performance variable-length packet scheduling algorithm for IP traffic'', IEEE GLOBECOM'01, Dec.2001

Scheduling multicast traffic in IQ switches

Hayes J.F., Breault R., Mehmet-Ali M.K., ``Performance analysis of a multicast switch'', IEEE Transactions on Communications, vol.39, n.4, Apr.1991, pp.581-587

Kim C.K., Lee T.T., ``Call scheduling algorithm in multicast switching systems'', IEEE Transactions on Communications, vol.40, n.3, Mar.1992, pp.625-635

McKeown N., Prabhakar B., ``Scheduling multicast cells in an input-queued switch'', INFOCOM'96, vol.1, San Francisco, CA, Mar.1996, pp.261-278

Prabhakar B., McKeown N., Ahuja R., ``Multicast scheduling for input-queued switches'', IEEE Journal on Selected Areas in Communications, vol.15, n.5, Jun.1997, pp.855-866

Chen W., Chang Y., Hwang W., ``A high performance cell scheduling algorithm in broadband multicast switching systems'', IEEE GLOBECOM'97, vol.1, New York, NY, 1997, pp.170-174

Guo M., Chang R., ``Multicast ATM switches: survey and performance evaluation'', Computer Communication Review, vol.28, n.2, Apr.1998, pp.98-131

Andrews M., Khanna S., Kumaran K., ``Integrated scheduling of unicast and multicast traffic in an input-queued switch'', IEEE INFOCOM'99, vol.3, New York, NY, 1999, pp.1144-1151

Liu Z., Righter R., ``Scheduling multicast input-queued switches'', Journal of Scheduling, John Wiley & Sons, May 1999

References

Scheduling multicast traffic in IQ switches

Nong G., Hamdi M., ``On the provision of integrated QoS guarantees of unicast and multicast traffic in input-queued switches'', IEEE GLOBECOM'99, vol.3, 1999

Ajmone Marsan M., Bianco A., Giaccone P., Leonardi E., Neri F., ``On the throughput of input-queued cell-based switches with multicast traffic'', IEEE INFOCOM'01, Anchorage, Alaska, Apr.2001

Nong G., Hamdi M., ``Providing QoS guarantees for unicast/multicast traffic with fixed/variable-length packets in multiple input-queued switches'', IEEE Symposium on Computers and Communications, 2001, pp.166-171

Smiljanic A., ``Flexible bandwidth allocation in high-capacity packet switches'', IEEE/ACM Transactions on Networking, vol.10, n.2, Apr.2002, pp.287-293

QoS support in IQ switches

Tabatabaee V., Georgiadis L., Tassiulas L., ``QoS provisioning and tracking fluid policies in input queueing switches'', IEEE INFOCOM'00, New York, Mar.2000

Chang C.S., Lee D.S., Jou Y.S., ``Load balanced Birkhoff-von Neumann switches'', 2001 IEEE Workshop on High Performance Switching and Routing, 2001, pp.276-280

Hung A., Kesidis G., McKeown N., ``ATM input-buffered switches with guaranteed-rate property'', IEEE ISCC'98, Athens, Greece, July 1998, pp.331-335

Advanced architectures derived from pure IQ

Iyer S., McKeown N., ``Making parallel packet switches practical'', IEEE INFOCOM'01, Alaska, Mar.2001

Chang C.S., Lee D.S., Jou Y.S., ``Load balanced Birkhoff-von Neumann switches'', 2001 IEEE Workshop on High Performance Switching and Routing, 2001, pp.276-280

Sivaram R., Stunkel C.B., Panda D.K., ``HIPIQS: a high-performance switch architecture using input queuing'', IEEE Transactions on Parallel and Distributed Systems, vol.13, n.3, Mar.2002, pp.275-289