
A Load-Balanced Switch with an Arbitrary Number of Linecards

Isaac Keslassy, Shang-Tse (Da) Chuang, Nick McKeown

Stanford University

Stanford 100Tb/s Router

“Optics in Routers” project http://yuba.stanford.edu/or/

Some challenging numbers: 100 Tb/s aggregate capacity, R = 160 Gb/s linecard rate, N = 640 linecards (640 × 160 Gb/s ≈ 100 Tb/s)

Performance guarantees

Router Wish List

Scale to High Linecard Speeds
- No Centralized Scheduler
- Optical Switch Fabric
- Low Packet-Processing Complexity

Scale to a High Number of Linecards
- High Number of Linecards
- Arbitrary Arrangement of Linecards

Provide Performance Guarantees
- 100% Throughput Guarantee
- Delay Guarantee
- No Packet Reordering

Load-Balanced Switch

[Figure: two stages. A load-balancing mesh spreads the packets arriving at each input (rate R) uniformly over all N intermediate linecards, at rate R/N per link; a forwarding mesh then carries packets from the intermediate linecards to their destined outputs, again at R/N per link.]
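To make the two-stage operation concrete, here is a minimal discrete-time sketch (my illustration, not the authors' implementation; all names are mine). Each input spreads its packets round-robin over the intermediate linecards; each intermediate queues packets per output, and the forwarding stage serves those queues with a fixed round-robin schedule.

```python
# Toy model of a load-balanced switch (illustrative sketch only).
# Stage 1 spreads each input's packets round-robin over intermediates;
# stage 2 serves per-output queues with a fixed round-robin schedule.
from collections import deque

N = 4  # linecards

# mid_q[m][o]: packets at intermediate linecard m destined to output o
mid_q = [[deque() for _ in range(N)] for _ in range(N)]
spread = [0] * N  # per-input round-robin pointer (load-balancing mesh)

def arrive(inp, out, pkt):
    """Load-balancing mesh: input inp hands pkt to the next intermediate."""
    m = spread[inp]
    spread[inp] = (m + 1) % N
    mid_q[m][out].append(pkt)

def forward(t):
    """Forwarding mesh: in slot t, intermediate m serves output (m + t) % N."""
    delivered = []
    for m in range(N):
        o = (m + t) % N
        if mid_q[m][o]:
            delivered.append((o, mid_q[m][o].popleft()))
    return delivered

for i in range(4):                  # input 0 sends 4 packets to output 2
    arrive(0, 2, f"pkt{i}")
for t in range(4):
    for o, p in forward(t):
        print(f"slot {t}: {p} -> output {o}")
# Note: this bare-bones model can deliver packets out of order; the full
# design adds mechanisms so that, as in the wish list, no reordering occurs.
```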



Combining the Two Meshes

[Figure: each input and its corresponding output sit on one linecard, so both meshes connect the same set of N linecards.]

A Single Combined Mesh

[Figure: superimposing the load-balancing mesh and the forwarding mesh yields a single mesh in which every pair of linecards is joined by a channel of rate 2R/N.]
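As a quick check of the per-pair channel rate, using the deck's Stanford numbers (R = 160 Gb/s, N = 640):

```python
# Per-pair channel rate in the single combined mesh: 2R/N.
R, N = 160, 640           # Gb/s linecard rate and linecard count (from the deck)
print(2 * R / N, "Gb/s")  # -> 0.5 Gb/s between every pair of linecards
```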

References on Early Work

Initial Work: C.-S. Chang, D.-S. Lee and Y.-S. Jou, "Load-Balanced Birkhoff-von Neumann Switches, Part I: One-Stage Buffering," Computer Communications, Vol. 25, pp. 611-622, 2002.

Sigcomm'03: I. Keslassy, S.-T. Chuang, K. Yu, D. Miller, M. Horowitz, O. Solgaard and N. McKeown, "Scaling Internet Routers Using Optics," ACM SIGCOMM '03, Karlsruhe, Germany, August 2003.

Summary of Early Work

Initial Work (C.-S. Chang et al.):
- Scheduler: no centralized scheduler
- Architecture: crossbar-based
- Performance guarantees: 100% throughput for weakly-mixing traffic

Sigcomm'03:
- Scheduler: no centralized scheduler
- Architecture: mesh-based (no fabric reconfiguration), single combined mesh
- Performance guarantees: 100% throughput for any adversarial traffic; average delay within a constant of an output-queued router; no packet reordering

Router Wish List (recap)

Example: N = 8

[Figure: a single mesh over 8 linecards; every pair of linecards is connected at rate 2R/8.]

When N is Too Large: Decompose into Groups (or Racks)

[Figure: the 8 linecards are rearranged into 2 groups of 4; the 2R/8 channels between individual linecards aggregate into 4R channels between groups (2RL/G with L = 4, G = 2).]

When N is Too Large: Decompose into Groups (or Racks) (cont.)

[Figure: in general, G groups (racks) of L linecards each. Every linecard connects to its group at 2R, each group sources and sinks 2RL in total, and each pair of groups is joined by an aggregated channel of 2RL/G.]
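The link rates follow directly from the aggregation; the short sketch below computes them for a given configuration (the function and variable names are mine, the formulas are from the slides).

```python
# Link capacities in the grouped load-balanced switch (illustrative sketch).

def capacities(R, L, G):
    """R: linecard rate (Gb/s), L: linecards per group, G: number of groups."""
    N = L * G
    print(f"N = L*G = {N} linecards")
    print(f"channel between two linecards (2R/N): {2*R/N} Gb/s")
    print(f"linecard into its group (2R):         {2*R} Gb/s")
    print(f"total group egress (2RL):             {2*R*L} Gb/s")
    print(f"channel between two groups (2RL/G):   {2*R*L/G} Gb/s")

# Stanford 100 Tb/s numbers from the deck: R = 160 Gb/s, G = 40, L = 16.
capacities(R=160, L=16, G=40)
```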

Router Wish List (recap)

When Linecards are Missing: Failures, Incremental Additions, and Removals…

[Figure: the same grouped architecture with some linecards absent. The fixed 2RL/G inter-group channels no longer match the traffic: a partially filled group sources and sinks less than 2RL, so the static uniform mesh wastes capacity and breaks the balance.]

Solution: replace mesh with sum of permutations

[Figure: the uniform inter-group mesh, with 2RL/G between every pair of groups, is rewritten as a sum of G permutations: in each permutation every group connects to exactly one group at rate 2RL/G, and the G permutations together reproduce the full 2RL mesh.]
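For the uniform (all-linecards-present) case the decomposition can be written with cyclic shifts; the sketch below is my illustration, not the paper's general algorithm, and splits a G×G uniform traffic matrix into G permutation matrices.

```python
# Decompose a uniform G x G inter-group mesh into G permutations
# (illustrative sketch for the uniform case; the paper also handles
# non-uniform arrangements with missing linecards).

def mesh_as_permutations(G):
    """Return G permutation matrices whose sum is the all-ones matrix."""
    perms = []
    for k in range(G):
        # k-th cyclic shift: group i connects to group (i + k) % G
        P = [[1 if j == (i + k) % G else 0 for j in range(G)] for i in range(G)]
        perms.append(P)
    return perms

G = 4
perms = mesh_as_permutations(G)
total = [[sum(P[i][j] for P in perms) for j in range(G)] for i in range(G)]
assert all(total[i][j] == 1 for i in range(G) for j in range(G))
print(f"{G} permutations, each of rate 2RL/G, reproduce the full mesh")
```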

Hybrid Electro-Optical Architecture Using MEMS Switches

[Figure: each permutation in the decomposition is implemented by a MEMS switch; the linecards and group aggregation are electronic, and the MEMS switches between the groups form the optical fabric.]

When Linecards are Missing

[Figure: when linecards are missing, the MEMS switches are set to new permutations that match the actual linecard arrangement; they reconfigure only when linecards are added or removed, not on a per-packet basis.]

Router Wish List (recap)

Questions

Number of MEMS Switches?

TDM Schedule?

All Link Capacities Are Equal

[Figure: the grouped architecture interconnected through several MEMS switches; every fiber between a group and a MEMS switch carries the same capacity, at most 2R.]

Link capacity ≈ 64 λ's × 5 Gb/s per λ = 320 Gb/s = 2R

[Figure: each group multiplexes 64 laser/modulator wavelengths (MUX) onto a single fiber, so one fiber carries an entire ≤ 2R channel.]

Example: 2 Groups of 2 Linecards

[Figure: two groups (racks) of two linecards each (G = 2, L = 2). Each linecard connects to its group at 2R; each group sources 2RL = 4R, carried as a 2R channel to each of the two groups (2RL/G = 2R).]

Intuition on Worst-Case

[Figure: worst case for the number of MEMS switches. One full group of L linecards must terminate 2RL, which needs L fibers of 2R each, i.e., up to L MEMS switches; in addition, up to G−1 other groups may hold a single linecard each, and reaching each of them can require one more switch, giving L + G − 1 in total.]

Theorem: M ≤ L+G-1

Number of MEMS Switches

Example: N = 640 linecards arranged as G = 40 groups of L = 16 gives M ≤ L + G − 1 = 55 MEMS switches.
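As a sanity check on the theorem, a tiny sketch (the function name is mine):

```python
# Theorem from the deck: M <= L + G - 1 MEMS switches suffice.
def max_mems(L, G):
    return L + G - 1

# Deck example: N = 640 linecards, G = 40 groups of L = 16 each.
print(max_mems(16, 40))  # -> 55, far fewer than one switch per linecard
```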

Questions

Number of MEMS Switches?

TDM Schedule?

TDM Schedule

[Figure: two groups A and B of two linecards each (A1, A2, B1, B2); each linecard connects at 2R to its group, each group sources 4R in total, and each pair of groups shares a 2R channel.]

          T+1  T+2  T+3  T+4
Tx LC A1   ?    ?    ?    ?
Tx LC A2   ?    ?    ?    ?
Tx LC B1   ?    ?    ?    ?
Tx LC B2   ?    ?    ?    ?

(Each entry is the destination linecard. In every slot each linecard sends one packet, and each 2R group-to-group channel can carry only one packet per slot.)

TDM Schedule

          T+1  T+2  T+3  T+4
Tx LC A1   A1   A2   B1   B2
Tx LC A2   B1   B2   A1   A2
Tx LC B1   B2   B1   A2   A1
Tx LC B2   A2   A1   B2   B1

(A correct schedule: in every slot each group sends exactly one packet to each group, matching the 2R channels, and over the four slots every linecard sends to every linecard, itself included, exactly once.)

Bad TDM Schedule

          T+1  T+2  T+3  T+4
Tx LC A1   A1   A2   B1   B2
Tx LC A2   B2   A1   A2   B1
Tx LC B1   B1   B2   A1   A2
Tx LC B2   A2   B1   B2   A1

(Bad: in slot T+2 both A1 and A2 send within group A, and in slot T+4 both send to group B; two packets contend for a single 2R group-to-group channel.)
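A small checker makes the violation explicit. This is my illustrative sketch (the function names and schedule representation are assumptions): it verifies that every slot is a permutation and that no group pair channel carries more than one packet per slot.

```python
# Check a TDM schedule for 2 groups of 2 linecards (illustrative sketch).
# schedule[src] = list of destination linecards, one per slot.
from collections import Counter

GROUP = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

def check(schedule):
    srcs = list(schedule)
    slots = len(next(iter(schedule.values())))
    for t in range(slots):
        col = [schedule[s][t] for s in srcs]
        if sorted(col) != sorted(srcs):
            print(f"slot T+{t+1}: not a permutation")
        pairs = Counter((GROUP[s], GROUP[schedule[s][t]]) for s in srcs)
        for (g, h), n in pairs.items():
            if n > 1:  # one 2R channel per group pair => one packet per slot
                print(f"slot T+{t+1}: {n} packets on channel {g}->{h}")

bad = {
    "A1": ["A1", "A2", "B1", "B2"],
    "A2": ["B2", "A1", "A2", "B1"],
    "B1": ["B1", "B2", "A1", "A2"],
    "B2": ["A2", "B1", "B2", "A1"],
}
check(bad)  # flags the overloaded channels in slots T+2 and T+4
```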

TDM Schedule Algorithm

Intuition:
1. Create a TDM schedule between groups: "group A sends to group B."
2. Assign each group-to-group connection to specific linecards: "linecard A1 sends to linecard B3."

Theorem: there exists a polynomial-time algorithm that finds a correct TDM schedule.
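The sketch below illustrates the two-step intuition for the fully populated case, under the simplifying assumption that G divides L; the paper's polynomial-time algorithm handles arbitrary arrangements, including missing linecards, so this construction is only my illustration. Step 1 fixes, per slot, how many packets each group sends to each group; step 2 rotates within groups to pick concrete linecards.

```python
# Illustrative TDM schedule for G groups of L linecards, assuming G
# divides L and all linecards are present (a special case only).
# Linecard x = (g, l): group g in [0, G), index l in [0, L). Slot
# t = a*L + b. Step 1 (group level): the packet from index l of group g
# goes to group (g + l + b) % G, so each group pair carries exactly
# L/G packets per slot. Step 2 (linecard level): a per-slot rotation c
# picks the destination index inside the target group.

def tdm_schedule(G, L):
    assert L % G == 0, "sketch assumes G divides L"
    N = G * L
    sched = {}  # (g, l) -> list of N destinations (h, m)
    for g in range(G):
        for l in range(L):
            row = []
            for t in range(N):
                a, b = divmod(t, L)
                h = (g + l + b) % G
                c = (a + (b // G) * G) % L
                row.append((h, (l + c) % L))
            sched[(g, l)] = row
    return sched

def verify(sched, G, L):
    cards = list(sched)
    for t in range(G * L):
        col = [sched[x][t] for x in cards]
        assert sorted(col) == sorted(cards)           # permutation per slot
        for g in range(G):
            for h in range(G):
                sent = sum(1 for (sg, _), (dh, _) in zip(cards, col)
                           if sg == g and dh == h)
                assert sent == L // G                 # group channels respected
    for x in cards:
        assert sorted(sched[x]) == sorted(cards)      # every pair served once

G, L = 2, 2
verify(tdm_schedule(G, L), G, L)
print("valid schedule for", G, "groups of", L, "linecards")
```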

Algorithm Running Time

[Plot: running time of the TDM schedule algorithm in milliseconds (y axis, 0-40) versus number of linecards (x axis, binned from 0-49 up to 600-639), with worst-case, average-case, and best-case curves. Verilog simulation, linecard placement generated uniformly at random among 40 groups, 4 ns clock cycle, 1000 runs per case. Source: Srikanth Arekapudi.]

Open Questions

Greedy TDM algorithm with more capacity?

A better switch fabric architecture?

Thank you.
