Proteus: A Topology Malleable Data Center Network
Ankit Singla (University of Illinois Urbana-Champaign)
Atul Singh, Kishore Ramachandran, Lei Xu, Yueping Zhang (NEC Labs, Princeton)
Data Centers
Data centers: Foundation of Internet services, enterprise operation
– Need good bandwidth connectivity between servers
“Good” Bandwidth Connectivity
Connect all servers at full bandwidth? Fat-trees [SIGCOMM 2008], VL2 [SIGCOMM 2009]
CABLING COMPLEXITY
UPGRADE TO 40/100-GIGE?
POWER CONSUMPTION?
Oversubscribed Networks
Is all-to-all full bandwidth connectivity always necessary?
– Small number of ‘hot’ ToR-ToR connections
  • Flyways [HotNets 2009]
– >90% of bytes flow in ‘elephant flows’
  • VL2 [SIGCOMM 2009]
– ~60% of ToRs see <20% change in traffic for between 1.6 and 2.2 sec
  • The Case for Fine-grained TE in Data Centers [WREN 2010]
Flyways [HotNets 2009], c-Through and Helios [SIGCOMM 2010]
Supplement electrical network with wireless/optics
– Wireless/Optical connections are set up between hot ToRs
– Some flexibility to adjust to changes in traffic matrix
Proteus
Proteus is a novel interconnect above the ToR layer
– Topology adjusts to traffic demands
– Low cabling complexity
– Easier migration to 40/100-GigE
– Low power consumption
A NEW DESIGN POINT: ALL-OPTICS
[Figure: servers connect to ToRs, and the ToRs connect to the optical interconnect.]
Proteus is an oversubscribed network with topology malleability
Malleability
[Figure: a graph of eight ToRs (A to H) shown in several configurations. Labels: TRAFFIC CHANGE, CHANGE TOPOLOGY, CHANGE CAPACITY, PICK ROUTES.]
Traffic table 1: A-G 10, B-H 10, C-E 10, D-F 10, B-D 10
Traffic table 2: A-G 10, B-H 10, C-E 10, G-F 20, B-D 10
Optics: Perfect Fit
1 Gigabit X 64,000
64 Terabits* X 1
* Achieved by NEC Labs and AT&T
Low complexity, reconfigurability, low power consumption
[Figure: a MEMS and a WSS, each shown connecting ports A, B, C, and D.]
CIRCUIT SETUP TIME
LIMITED WAVELENGTHS
TOPOLOGY MANAGEMENT
MEMS = Micro-Electro Mechanical Switch
WSS = Wavelength Selective Switch
Problem Setting: Container-sized DCN
Proteus-2560: Connect 80 ToRs, each with 32 servers
Typical container-size in containerized data center architectures
Image adapted from: www.sun.com/blackbox
ToR Perspective
[Figure: a non-blocking ToR sitting between the servers and the optical interconnect.]
32 PORTS TOWARDS INTERCONNECT
32 PORTS FOR SERVERS
ToR Perspective
[Figure: a non-blocking ToR with O-E-O conversion on its interconnect-facing ports.]
INTRA-RACK TRAFFIC
TRANSIT TRAFFIC (HOP-BY-HOP)
CROSS-RACK TRAFFIC
TRANSCEIVERS WITH UNIQUE WAVELENGTHS
(O-E-O conversions add sub-nanosecond latency at each hop)
LIMITED BY TOR PORT CAPACITY
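[Figure: ToR1 attached through the optical components to four neighbor ToRs, with INCOMING and OUTGOING links marked as HIGH CAPACITY LINK or LOW CAPACITY LINK. Two neighbor sets are shown: ToR13, ToR21, ToR45, ToR73 and ToR67, ToR11, ToR29, ToR55. Labels: CHANGE TOPOLOGY, CHANGE CAPACITY.]
OPTICAL COMPONENTS
– TOPOLOGY (MEMS)
– CAPACITY (WSS)
– BI-DIRECTIONALITY (CIRCULATORS)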
[Figure: optical wiring between ToRs and the MEMS. Labels shown: MEMS (320 ports), circulators (C), WSS, MUX, COUPLER, DEMUX, ToR26, ToR59.]
Proteus-2560 Properties
Build any 4-regular ToR topology
Each link’s capacity varies in each direction
– Capacity ∈ {10, 20, 30, …, 320} Gbps
– Provided sum of capacities of 4 links <= 320 Gbps
– (Also avoid wavelength contention)
Use hop-by-hop connections to other ToRs
– Transit traffic doesn’t interfere with intra-ToR traffic
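The sizing and capacity rules on this slide can be checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes the 10 Gbps capacity step corresponds to one wavelength. It checks one direction of one ToR's four links and does not model wavelength contention.

```python
from typing import Sequence

NUM_TORS = 80            # from the slides
SERVERS_PER_TOR = 32     # from the slides
TOR_DEGREE = 4           # 4-regular ToR topology
STEP_GBPS = 10           # assumption: one capacity step equals one wavelength
MAX_TOR_GBPS = 320       # limit on the sum of a ToR's four link capacities

total_servers = NUM_TORS * SERVERS_PER_TOR   # 2560, hence "Proteus-2560"
mems_ports = NUM_TORS * TOR_DEGREE           # 320, matching "MEMS (320 ports)"


def valid_capacities(link_gbps: Sequence[int]) -> bool:
    """Check one direction of a ToR's four link capacities against the slide's rules."""
    if len(link_gbps) != TOR_DEGREE:
        return False
    for capacity in link_gbps:
        # Each link carries a positive multiple of the 10 Gbps step.
        if capacity < STEP_GBPS or capacity % STEP_GBPS != 0:
            return False
    # The four links share the ToR's 320 Gbps of interconnect-facing capacity.
    return sum(link_gbps) <= MAX_TOR_GBPS
```

For example, `valid_capacities([10, 10, 10, 290])` passes because the capacities sum to exactly 320 Gbps, while `valid_capacities([320, 10, 10, 10])` fails because the sum is 350 Gbps.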
Topology Management
We formulate the problem as a mixed-integer linear program
Describe a heuristic approach backed by graph-theoretic insights
– Likely to take under a couple of hundred milliseconds
COMPLEX PROBLEM: ALL CONFIGURATIONS ARE INTERDEPENDENT
[Figure: the MEMS configuration, the WSS configuration, and hop-by-hop routing, each marked with a question mark.]
Heuristic Approach – Key Ideas
Topology: Weighted 4-matching over hot ToR-ToR connections
– Check and correct for connectivity
Routing: Can use shortest paths
– Ideally, need low-congestion routing schemes
Capacities: Graph edge-coloring over wavelengths
– Ensure each link carries at least one wavelength
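The topology step can be illustrated with a simple greedy stand-in. This is a minimal sketch, not the authors' algorithm: it approximates a weighted 4-matching by taking ToR pairs in decreasing order of demand, and it only checks connectivity rather than correcting it. The names `greedy_b_matching` and `is_connected`, and the shape of the `demand` input, are assumptions made for illustration.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]


def greedy_b_matching(demand: Dict[Pair, float], degree: int = 4) -> Set[Pair]:
    """Pick ToR pairs in decreasing demand order, allowing at most `degree` links per ToR."""
    used: Dict[int, int] = defaultdict(int)
    links: Set[Pair] = set()
    for (a, b), _ in sorted(demand.items(), key=lambda kv: -kv[1]):
        pair = (min(a, b), max(a, b))
        if a == b or pair in links:
            continue
        if used[a] < degree and used[b] < degree:
            links.add(pair)
            used[a] += 1
            used[b] += 1
    return links


def is_connected(nodes: List[int], links: Set[Pair]) -> bool:
    """Breadth-first search from the first node; True if every node is reached."""
    if not nodes:
        return True
    adj: Dict[int, Set[int]] = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)
```

A greedy pass is not guaranteed to find a maximum-weight matching; an exact matching algorithm could replace it without changing the interface. If `is_connected` returns False, the slide's "correct for connectivity" step would need to swap some links, which this sketch leaves out.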
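The capacity step (edge-coloring over wavelengths) can be sketched in the same spirit. The code below is an assumption-laden illustration, not the method from the talk: it treats links as undirected, assumes 32 wavelengths (one per interconnect-facing ToR port), and models wavelength contention as "a wavelength may be used on at most one link at each ToR". The function name `assign_wavelengths` and the `wanted` input are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[int, int]
NUM_WAVELENGTHS = 32  # assumption: one wavelength per interconnect-facing ToR port


def assign_wavelengths(wanted: Dict[Link, int]) -> Dict[Link, List[int]]:
    """Greedy edge-coloring: each wavelength is used on at most one link per ToR."""
    busy: Dict[int, Set[int]] = {}
    result: Dict[Link, List[int]] = {link: [] for link in wanted}

    def grant(link: Link) -> bool:
        a, b = link
        taken = busy.setdefault(a, set()) | busy.setdefault(b, set())
        for w in range(NUM_WAVELENGTHS):
            if w not in taken:
                busy[a].add(w)
                busy[b].add(w)
                result[link].append(w)
                return True
        return False

    # Round-robin: the first round gives every link one wavelength, so each
    # link carries at least one; later rounds add more until each link has
    # what it asked for or no free wavelength remains at its endpoints.
    progress = True
    while progress:
        progress = False
        for link, count in wanted.items():
            if len(result[link]) < max(count, 1) and grant(link):
                progress = True
    return result
```

In a 4-regular topology the first round always succeeds under these assumptions, because the other links at the two endpoints can block at most six of the 32 wavelengths.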
Preliminary Analysis
Cabling: #Fibers ≈ 1/5th #cables in a fat-tree
Ease of upgrade: When ToRs move to 40/100-GigE, nothing else changes!
Cost: similar to a fat-tree
– Optics is yet to benefit from commoditization
– To some extent, dispels the “optics is expensive” myth
Power: 50% of fat-tree power consumption
Fat-tree is also fault tolerant though
Conclusion, Ongoing Work
A novel data center architecture
– Unprecedented topology flexibility
– Reduced cabling complexity
– Easier migration to 40/100-GigE
– Reduced power consumption
– Explores a new design point – all-optics
Experimental evaluation
Incremental update heuristics
Mega-data-center scale
Fault tolerance
TRANSIENT BEHAVIOR?
ROUTING?
SYNCHRONIZATION?
Thank You!
Questions?
Extras / Backup
Hop-by-hop Through ToRs
MEMS – limited end-to-end circuits
Need hop-by-hop routes over these circuits
Feasibility assessment: works fine!
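Hop-by-hop routing over the MEMS circuits can be illustrated with a plain breadth-first search, which yields a minimum-hop path through intermediate ToRs. This is a minimal sketch: the function name `shortest_path` and the adjacency-map input are assumptions, and the example ToR numbers are arbitrary. It does not account for congestion.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def shortest_path(adj: Dict[int, Set[int]], src: int, dst: int) -> Optional[List[int]]:
    """Return a minimum-hop ToR path from src to dst, or None if dst is unreachable."""
    parent: Dict[int, Optional[int]] = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            # Walk the parent pointers back to src, then reverse.
            path: List[int] = []
            node: Optional[int] = u
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for v in sorted(adj.get(u, ())):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None


# Example: ToR 1 has circuits to ToRs 13 and 21; ToR 13 also reaches ToR 45.
circuits = {1: {13, 21}, 13: {1, 45}, 21: {1}, 45: {13}}
route = shortest_path(circuits, 1, 45)  # traffic transits ToR 13
```

Each intermediate ToR on the returned path forwards the traffic electrically, which is where the O-E-O conversion occurs.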
Helios [SIGCOMM ’10]
Pods are still fat-trees
Requires design-time decision on stable vs. unstable traffic
Does not exploit multi-hop optical routes
Does not leverage WSS technology for variable capacity
Image from “Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers” – Farrington et al