Proteus: A Topology Malleable Data Center Network Ankit Singla (University of Illinois Urbana-Champaign) Atul Singh, Kishore Ramachandran, Lei Xu, Yueping Zhang (NEC Labs, Princeton)

Post on 20-Jan-2016


Page 1:

Proteus: A Topology Malleable Data Center Network

Ankit Singla (University of Illinois Urbana-Champaign)

Atul Singh, Kishore Ramachandran, Lei Xu, Yueping Zhang (NEC Labs, Princeton)

Page 2:

Data Centers

Data centers: Foundation of Internet services, enterprise operation

– Need good bandwidth connectivity between servers

Page 3:

“Good” Bandwidth Connectivity

Connect all servers at full bandwidth?

– Fat-trees [SIGCOMM 2008], VL2 [SIGCOMM 2009]

CABLING COMPLEXITY

UPGRADE TO 40/100-GIGE?

POWER CONSUMPTION?

Page 4:

Oversubscribed Networks

Is all-to-all full bandwidth connectivity always necessary?

– Small number of ‘hot’ ToR-ToR connections
  • Flyways [HotNets 2009]
– >90% of bytes flow in ‘elephant flows’
  • VL2 [SIGCOMM 2009]
– ~60% of ToRs see <20% change in traffic over periods of 1.6-2.2 sec
  • The Case for Fine-grained TE in Data Centers [WREN 2010]

Flyways [HotNets 2009], c-Through and Helios [SIGCOMM 2010]: Supplement electrical network with wireless/optics

– Wireless/Optical connections are set up between hot ToRs
– Some flexibility to adjust to changes in traffic matrix

Page 5:

Proteus

A NEW DESIGN POINT: ALL-OPTICS

Proteus is a novel interconnect above the ToR layer

– Topology adjusts to traffic demands
– Low cabling complexity
– Easier migration to 40/100-GigE
– Low power consumption

[Figure: Servers attach to ToRs, and the ToRs attach to an Optical Interconnect]

Proteus is an oversubscribed network with topology malleability

Page 6:

Malleability

[Figure: ToRs A through H drawn in three arrangements, with the steps TRAFFIC CHANGE, CHANGE TOPOLOGY, CHANGE CAPACITY, and PICK ROUTES]

Traffic before TRAFFIC CHANGE:

A G 10
B H 10
C E 10
D F 10
B D 10

Traffic after TRAFFIC CHANGE:

A G 10
B H 10
C E 10
G F 20
B D 10

Page 7:

Optics: Perfect Fit

1 Gigabit × 64,000

64 Terabits* × 1

* Achieved by NEC Labs and AT&T

Low complexity, reconfigurability, low power consumption

[Figure: a MEMS and a WSS, each shown with ports A, B, C, D in different arrangements]

CIRCUIT SETUP TIME

LIMITED WAVELENGTHS

TOPOLOGY MANAGEMENT

MEMS = Micro-Electro Mechanical Switch

WSS = Wavelength Selective Switch

Page 8:

Problem Setting: Container-sized DCN

Proteus-2560: Connect 80 ToRs, each with 32 servers

– Typical container-size in containerized data center architectures

Image adapted from: www.sun.com/blackbox

Page 9:

ToR Perspective

[Figure: a NON-BLOCKING TOR between the SERVERS and the OPTICAL INTERCONNECT]

32 PORTS TOWARDS INTERCONNECT

32 PORTS FOR SERVERS
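As a quick check of the sizing on the slides above, the arithmetic can be sketched in Python. The ToR count, servers per ToR, and port counts come from the slides. The 10 Gbps port speed is an assumption, chosen because it matches the 10 Gbps capacity steps and the 320 Gbps per-ToR total listed under Proteus-2560 Properties. The function names are illustrative only.

```python
# Figures taken from the slides.
NUM_TORS = 80
SERVERS_PER_TOR = 32
INTERCONNECT_PORTS_PER_TOR = 32

# Assumption (not stated on these slides): each ToR port runs at 10 Gbps.
PORT_GBPS = 10


def container_servers(num_tors=NUM_TORS, servers_per_tor=SERVERS_PER_TOR):
    """Total servers in one container-sized network."""
    return num_tors * servers_per_tor


def tor_uplink_gbps(ports=INTERCONNECT_PORTS_PER_TOR, port_gbps=PORT_GBPS):
    """Aggregate capacity a ToR can send toward the optical interconnect."""
    return ports * port_gbps


if __name__ == "__main__":
    print(container_servers())  # 2560, hence the name Proteus-2560
    print(tor_uplink_gbps())    # 320 (Gbps), under the 10 Gbps assumption
```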

Page 10:

ToR Perspective

[Figure: a NON-BLOCKING TOR with O-E-O conversion, showing INTRA-RACK TRAFFIC, CROSS-RACK TRAFFIC, and TRANSIT TRAFFIC (HOP-BY-HOP)]

TRANSCEIVERS WITH UNIQUE WAVELENGTHS

(O-E-O conversions add sub-nanosecond latency at each hop)

LIMITED BY TOR PORT CAPACITY

Page 11:

[Figure: TOR1 and its OPTICAL COMPONENTS. Labels: INCOMING, OUTGOING, HIGH CAPACITY LINK, LOW CAPACITY LINK, CHANGE TOPOLOGY, CHANGE CAPACITY. Two neighbor sets are shown: ToR13, ToR21, ToR45, ToR73 and ToR67, ToR11, ToR29, ToR55]

Page 12:

TOPOLOGY (MEMS)

BI-DIRECTIONALITY (CIRCULATORS)

CAPACITY (WSS)

[Figure: per-ToR optical components (ToR26 and ToR59 shown): MUX, WSS, four blocks labeled C, COUPLER, and DEMUX, all attached to a MEMS (320 ports)]

Page 13:

Proteus-2560 Properties

Build any 4-regular ToR topology

Each link’s capacity varies in each direction

– Capacity ∈ {10, 20, 30, …, 320} Gbps
– Provided sum of capacities of 4 links ≤ 320 Gbps
– (Also avoid wavelength contention)

Use hop-by-hop connections to other ToRs

– Transit traffic doesn’t interfere with intra-ToR traffic
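The capacity rules above can be expressed as a small validity check. This is a minimal sketch, not the authors' code: `check_tor_links` and its constants are hypothetical names, it looks at one ToR and one direction only, and it does not model wavelength contention.

```python
LINK_DEGREE = 4         # 4-regular ToR topology (from the slide)
CAPACITY_STEP_GBPS = 10  # capacities come in 10 Gbps steps (from the slide)
TOR_BUDGET_GBPS = 320    # cap on the sum over a ToR's 4 links (from the slide)


def check_tor_links(capacities_gbps):
    """Return the rules violated by one ToR's link capacities (empty list if valid).

    `capacities_gbps` is a list of integer capacities, one per link,
    for a single direction. Wavelength contention is NOT checked here.
    """
    problems = []
    if len(capacities_gbps) != LINK_DEGREE:
        problems.append(f"expected {LINK_DEGREE} links, got {len(capacities_gbps)}")
    for cap in capacities_gbps:
        in_range = CAPACITY_STEP_GBPS <= cap <= TOR_BUDGET_GBPS
        if not in_range or cap % CAPACITY_STEP_GBPS != 0:
            problems.append(f"capacity {cap} is not in {{10, 20, ..., 320}}")
    if sum(capacities_gbps) > TOR_BUDGET_GBPS:
        problems.append(f"total {sum(capacities_gbps)} exceeds {TOR_BUDGET_GBPS}")
    return problems


if __name__ == "__main__":
    print(check_tor_links([80, 80, 80, 80]))   # evenly split budget
    print(check_tor_links([290, 10, 10, 10]))  # one hot link, three minimal links
    print(check_tor_links([300, 10, 10, 10]))  # over the 320 Gbps budget
```

The second call illustrates the point of variable capacity: a single hot link can take most of the ToR's budget while the other three links keep the minimum of 10 Gbps.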

Page 14:

Topology Management

COMPLEX PROBLEM: ALL CONFIGURATIONS ARE INTERDEPENDENT

[Figure: the MEMS configuration, the WSS configuration, and Hop-by-hop routing, each marked with a question mark]

We formulate the problem as a mixed-integer linear program

Describe a heuristic approach backed by graph-theoretic insights

– Likely to take under a couple of hundred milliseconds

Page 15:

Heuristic Approach – Key Ideas

Topology: Weighted 4-matching over hot ToR-ToR connections

– Check and correct for connectivity

Routing: Can use shortest paths

– Ideally, need low-congestion routing schemes

Capacities: Graph edge-coloring over wavelengths

– Ensure each link carries at least one wavelength
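The topology and capacity ideas above can be illustrated with a rough Python sketch. It is not the authors' heuristic: a greedy pass stands in for a proper weighted 4-matching, connectivity is only checked (not corrected), and the wavelength step assumes that a wavelength may be used at most once per ToR, which is one reading of "wavelength contention". All function names are hypothetical.

```python
from collections import defaultdict


def greedy_b_matching(demands, degree=4):
    """Pick ToR-ToR links, heaviest demand first, with at most `degree` links per ToR.

    `demands` maps an unordered ToR pair (a, b) to a traffic weight.
    Greedy selection is a stand-in and is not guaranteed to be optimal.
    """
    used = defaultdict(int)
    links = []
    for (a, b), _weight in sorted(demands.items(), key=lambda kv: (-kv[1], kv[0])):
        if a != b and used[a] < degree and used[b] < degree:
            links.append((a, b))
            used[a] += 1
            used[b] += 1
    return links


def is_connected(nodes, links):
    """Check that every ToR in `nodes` is reachable over `links`."""
    nodes = list(nodes)
    if not nodes:
        return True
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return all(n in seen for n in nodes)


def assign_wavelengths(wavelengths_per_link, num_wavelengths=32):
    """Greedy edge coloring: give each link its wavelengths, none reused at a ToR.

    `wavelengths_per_link` maps (a, b) to how many wavelengths the link needs
    (at least 1). Returns None when the greedy pass runs out of wavelengths.
    """
    in_use = defaultdict(set)
    assignment = {}
    for (a, b), count in sorted(wavelengths_per_link.items()):
        free = [w for w in range(num_wavelengths)
                if w not in in_use[a] and w not in in_use[b]]
        if len(free) < count:
            return None
        assignment[(a, b)] = free[:count]
        in_use[a].update(free[:count])
        in_use[b].update(free[:count])
    return assignment


if __name__ == "__main__":
    demands = {("A", "B"): 5, ("A", "C"): 4, ("B", "C"): 3, ("C", "D"): 1}
    links = greedy_b_matching(demands, degree=2)
    print(links, is_connected("ABCD", links))
    print(assign_wavelengths({("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 1}, 4))
```

In the small example, the degree limit leaves ToR D without a link, so the connectivity check fails. That is the situation the "check and correct" step on the slide is meant to handle.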

Page 16:

Preliminary Analysis

Cabling: #Fibers ≈ 1/5th #cables in a fat-tree

Ease of upgrade: When ToRs move to 40/100-GigE, nothing else changes!

Cost: similar to a fat-tree

– Optics is yet to benefit from commoditization
– To some extent, dispels the “optics is expensive” myth

Power: 50% of fat-tree power consumption

Fat-tree is also fault tolerant though

Page 17:

Conclusion, Ongoing Work

A novel data center architecture

– Unprecedented topology flexibility
– Reduced cabling complexity
– Easier migration to 40/100-GigE
– Reduced power consumption
– Explores a new design point: all-optics

Experimental evaluation

Incremental update heuristics

Mega-data-center scale

Fault tolerance

TRANSIENT BEHAVIOR?

ROUTING?

SYNCHRONIZATION?

Page 18:

Thank You!

Questions?

Page 19:

Extras / Backup

Page 20:

Hop-by-hop Through ToRs

MEMS: limited end-to-end circuits

Need hop-by-hop routes over these circuits

Feasibility assessment: works fine!
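Hop-by-hop routing over a limited set of circuits can be pictured as a path search on the ToR graph. The sketch below uses a plain breadth-first search to find a fewest-hop route. It is only an illustration under that assumption: the slides mention shortest paths but do not specify the routing scheme, and `hop_by_hop_route` is a hypothetical name.

```python
from collections import deque


def hop_by_hop_route(circuits, src, dst):
    """Return a fewest-hop ToR path from src to dst over the circuits, or None.

    `circuits` is a list of (tor_a, tor_b) pairs, treated as bidirectional.
    """
    adj = {}
    for a, b in circuits:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)

    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    circuits = [("A", "G"), ("G", "F"), ("F", "D"), ("B", "D")]
    print(hop_by_hop_route(circuits, "A", "D"))  # ['A', 'G', 'F', 'D']
```

Every intermediate ToR on the returned path (G and F here) forwards the traffic as transit traffic, which is why the number of hops matters.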

Page 21:

Helios [SIGCOMM ’10]

Pods are still fat-trees

Requires design-time decision on stable vs. unstable traffic

Does not exploit multi-hop optical routes

Does not leverage WSS technology for variable capacity

Image from “Helios: A Hybrid Electrical/Optical Switch Architecture for Modular Data Centers”, Farrington et al.