
Extending SDN to the Data Plane

Anirudh Sivaraman, Keith Winstein, Suvinay Subramanian, Hari Balakrishnan

M.I.T.

http://web.mit.edu/anirudh/www/sdn-data-plane.html

Switch Data Planes today

Two key decisions on a per-packet basis:

- Scheduling: which packet to transmit next?

- Queue management: how long can queues grow? Which packet to drop?
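These two per-packet decisions can be made concrete with a toy model of a switch output port. This is a minimal sketch, not the talk's design; the class and method names (`Port`, `admit`, `schedule`) are illustrative, with drop-tail queue management and FCFS scheduling standing in for the simplest possible policies.

```python
from collections import deque

class Port:
    """Toy switch output port making the two per-packet decisions
    from the slide: admit (queue management) and schedule (scheduling)."""
    def __init__(self, limit=10):
        self.queue = deque()
        self.limit = limit          # queue management: cap on queue growth

    def admit(self, pkt):
        # Drop-tail queue management: drop the arriving packet when full.
        if len(self.queue) >= self.limit:
            return False            # packet dropped
        self.queue.append(pkt)
        return True

    def schedule(self):
        # FCFS scheduling: transmit the oldest queued packet first.
        return self.queue.popleft() if self.queue else None

port = Port(limit=2)
print([port.admit(p) for p in ("a", "b", "c")])  # third packet is dropped
print(port.schedule())                           # "a" leaves first under FCFS
```

Every scheme in the lineage below is, in essence, a different choice of these two hooks.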

The long lineage of in-network algorithms

- 1980s: GPS, WFQ, VirtualClock
- 1990s: RED, BLUE, DRR, IntServ, CSFQ, DiffServ
- 2000s: ECN, CHOKe, AVQ, XCP, VCP, RCP, SRR, RIO, PI
- 2010s: DCTCP, CoDel, PDQ, pFabric, D²TCP, DeTail, FCP, PIE, RC3, MCP

The Data Plane is continuously evolving

- Each scheme wins in its own evaluation.

- Quest for a “silver bullet” in-network method.

We disagree: There is no silver bullet!

- Different applications care about different objectives.

- Applications use different transport protocols.

- Networks are heterogeneous.

Our work:

- Quantify the non-universality of in-network methods.

- Extend SDN to the data plane to handle in-network diversity.

Quantifying “No Silver Bullet”: Network Configurations

Configuration    Description
CoDel+FCFS       One shared FCFS queue with CoDel
CoDel+FQ         Per-flow fair queueing with CoDel on each queue (Nichols 2013)
Bufferbloat+FQ   Per-flow fair queueing with deep buffers on each queue
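The per-flow fair queueing used in the last two configurations can be sketched with Deficit Round Robin, one standard fair-queueing discipline (the talk does not specify which FQ variant; DRR here is illustrative, and the `drr` helper and its quantum are assumptions).

```python
from collections import deque

def drr(flows, quantum=500):
    """Deficit Round Robin sketch: flows is a dict mapping flow id to a
    list of packet sizes in bytes; returns the (flow, size) send order."""
    queues = {f: deque(pkts) for f, pkts in flows.items()}
    deficit = {f: 0 for f in flows}
    order = []
    while any(queues.values()):
        for f, q in queues.items():
            if not q:
                continue
            deficit[f] += quantum          # each round adds one quantum
            while q and q[0] <= deficit[f]:
                size = q.popleft()
                deficit[f] -= size         # spend deficit on sent bytes
                order.append((f, size))
    return order

# A 1500-byte bulk flow and a 100-byte web flow share the link:
print(drr({"bulk": [1500, 1500], "web": [100, 100]}, quantum=500))
```

The short web packets drain first instead of waiting behind the bulk flow's large packets, which is exactly the isolation property the FQ configurations rely on.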

Quantifying “No Silver Bullet”: Workloads and Objectives

Workload      Description                        Objective
Bulk          Long-running bulk transfer flow    Max. throughput
Web           Switched flow with ON/OFF periods  Min. 99.9 %ile flow completion time
Interactive   Long-running interactive flow      Max. throughput/delay
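The interactive objective is a ratio of throughput to delay (the "power" metric); a minimal sketch, assuming this simple ratio form:

```python
def power(throughput_mbps, delay_ms):
    """Throughput/delay objective for interactive flows (assumed form:
    higher throughput is better, lower delay is better)."""
    return throughput_mbps / delay_ms

# A slightly slower but much lower-delay configuration wins this objective:
print(power(10.0, 50.0) > power(12.0, 500.0))  # True
```

This is why the later results can report "700x more tpt/delay": a modest throughput loss is dwarfed by a large delay reduction.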

Quantifying “No Silver Bullet”

Experiment configuration:
Workload: 1 Bulk flow + 1 Web flow
Network: LTE link with 150 ms min. RTT

CoDel+FCFS and CoDel+FQ:  Web tail FCT 43 s,  Bulk tpt 3.9 Mbps
Bufferbloat+FQ:           Web tail FCT 21 s,  Bulk tpt 11.2 Mbps
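The relative improvements quoted on the next slide follow from these numbers. The two hypothetical helpers below reproduce the arithmetic; with the rounded values shown (43 s → 21 s, 3.9 → 11.2 Mbps) they give roughly 51% and 187%, close to the slide's 52% and 186%, which were presumably computed from unrounded measurements.

```python
def pct_faster(before, after):
    """Percent reduction, e.g. in tail flow completion time."""
    return round(100 * (before - after) / before)

def pct_more(before, after):
    """Percent increase, e.g. in throughput."""
    return round(100 * (after - before) / before)

print(pct_faster(43, 21))    # tail FCT: 43 s -> 21 s
print(pct_more(3.9, 11.2))   # throughput: 3.9 -> 11.2 Mbps
```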

Quantifying “No Silver Bullet”

- Bulk + Web on LTE: Bufferbloat+FQ gives the Web flow 52% faster tail flow completion and the Bulk flow 186% more throughput.

- Two Interactive flows on a 15 Mbps link: CoDel+FQ gives 700x more tpt/delay.

- Bulk + Web on a 15 Mbps link: CoDel+FQ gives the Web flow 16% faster tail flow completion with the same Bulk throughput.

- Two Bulk flows on LTE: CoDel+FCFS gives 5% more throughput.

- One Interactive flow on LTE: CoDel+FCFS gives 200x more tpt/delay.

- One Bulk flow on LTE: Bufferbloat+FQ gives 174% more throughput.

Why is no single data plane configuration the best?

- Bufferbloat gives the best throughput on variable-rate links.

- FCFS is preferable to fair queueing when objectives are homogeneous.

- Fair queueing is preferable when objectives are heterogeneous.

So what should the network designer do?

- Don’t strive for the best in-network behaviour.

- Instead, architect for evolvability.

- Conceptually, extend SDN to include the data plane as well.

Flexibility without sacrificing performance

- Provide interfaces only to the head and tail of queues.

- Operators specify only queue-management/scheduling logic.

- No access to packet payloads.
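The restricted interface above can be sketched as a queue that exposes only head/tail operations and treats packets as opaque handles. This is an illustrative model, not the proposed hardware interface; the class and method names are assumptions.

```python
from collections import deque

class RestrictedQueue:
    """Queue exposing only head/tail operations: operator logic can
    enqueue, dequeue, and drop at either end, but packets are opaque
    handles, so payloads are never inspected."""
    def __init__(self):
        self._q = deque()

    def enqueue(self, pkt):      # insert at tail
        self._q.append(pkt)

    def dequeue(self):           # remove from head
        return self._q.popleft() if self._q else None

    def drop_head(self):         # e.g. CoDel-style drop-from-front
        if self._q:
            self._q.popleft()

    def drop_tail(self):         # e.g. drop-tail on overflow
        if self._q:
            self._q.pop()

    def __len__(self):
        return len(self._q)

q = RestrictedQueue()
for p in range(4):
    q.enqueue(p)
q.drop_head()               # oldest packet dropped
print(q.dequeue(), len(q))  # next-oldest is now at the head
```

Keeping the interface this narrow is what lets the hardware stay fast: the scheduler never touches packet bodies, only queue heads and tails.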

Building such a data plane in four parts

- Hardware gadgets
  - Random number generators (RED, BLUE)
  - Binary tree of comparators (pFabric, SRPT)
  - Look-up tables for function approximation (CoDel, RED)

- I/O interfaces
  - Drop/mark head/tail of queue
  - Interrupts for enqueue/dequeue
  - Rewrite packet fields

- State maintenance
  - Per-flow (WFQ, DRR)
  - Per-dst address (PF)

- A domain-specific instruction set
  - Expresses control flow
  - Implements new functions unavailable in hardware
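The "binary tree of comparators" gadget can be sketched in software as a tournament reduction: each level compares pairs in parallel, so log2(n) comparator levels pick the winner, e.g. the packet whose flow has the least data remaining, as in pFabric/SRPT. The function below is an illustrative model of that structure, not hardware.

```python
def min_index_tree(vals):
    """Tournament (comparator-tree) reduction: returns the index of the
    minimum value. Each while-iteration models one comparator level."""
    idx = list(range(len(vals)))
    while len(idx) > 1:
        nxt = []
        for i in range(0, len(idx) - 1, 2):
            a, b = idx[i], idx[i + 1]
            nxt.append(a if vals[a] <= vals[b] else b)  # one comparator
        if len(idx) % 2:            # odd element gets a bye to the next level
            nxt.append(idx[-1])
        idx = nxt
    return idx[0]

remaining = [9000, 1200, 400, 7000]   # bytes left per queued packet's flow
print(min_index_tree(remaining))      # index of the packet to send next
```

In hardware all comparisons at a level run in parallel, so the selection completes in log-depth time regardless of queue length.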

Feasibility study: CoDel

[Block diagram: the router stamps all incoming packets with a timestamp. A packet queue exposes its size, head timestamp, a queue-empty signal, and a drop-from-front interface. State variables (first-above-time, count, dropping, drop_next) and I/O interfaces (dequeue, link-ready) feed the control flow codel_q_t::dequeue → codel_q_t::dodequeue → codel_q_t::control_law; the square root in the control law is approximated with a look-up table (LUT).]
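The diagram's control law with its LUT-based square root can be sketched as below. This follows the published CoDel control law (next drop at now + interval/sqrt(count)); the Python names, the 100 ms interval, and the 64-entry table size are illustrative assumptions, not the actual codel_q_t implementation.

```python
# CoDel control-law sketch with a look-up table standing in for the
# hardware SQRT gadget: 1/sqrt(count) is precomputed, as on the slide.
INTERVAL_MS = 100.0
LUT = {n: 1.0 / n ** 0.5 for n in range(1, 65)}   # precomputed 1/sqrt(n)

def control_law(now_ms, count):
    """Next drop time: drops accelerate as sqrt(count) while the queue's
    sojourn time stays above target."""
    return now_ms + INTERVAL_MS * LUT[min(count, 64)]

print(control_law(0.0, 1))   # first drop interval: 100.0 ms
print(control_law(0.0, 4))   # halved by the 4th consecutive drop: 50.0 ms
```

A small table suffices because 1/sqrt flattens quickly, which is what makes the gadget cheap in hardware.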

Synthesis numbers on the Xilinx Kintex-7

Resource            Usage                Fraction
Slice logic         1,256                1%
Slice logic dist.   1,975                2%
IO/GTX ports        27                   2%
DSP slices          0                    0%
Maximum speed       12.9 million pkts/s  ~10 Gbps

- Small fraction of the FPGA’s resources.

- Can be improved by pipelining or parallelizing.
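A back-of-envelope check relates the two rate figures: 12.9 Mpps corresponds to ~10 Gbps only at a particular packet size, which the slide does not state, so the size below is inferred, not given.

```python
# Implied packet size for 12.9 Mpps to saturate a 10 Gbps line rate.
pkts_per_s = 12.9e6
line_rate_bps = 10e9
bits_per_pkt = line_rate_bps / pkts_per_s
print(round(bits_per_pkt / 8))   # ~97-byte packets
```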

Conclusion

- No silver bullet for in-network resource allocation.

- Algorithms will evolve: the data plane should help.

- Reproduce our results: http://web.mit.edu/anirudh/www/sdn-data-plane.html