towards a redundancy -aware network stack for data centersmusa/uploads/hotnets_2016_talk.pdf ·...

60
Towards a Redundancy-Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts) Ihsan Ayyub Qazi (LUMS) Fahad R. Dogar (Tufts)

Upload: others

Post on 21-May-2020

8 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

TowardsaRedundancy-AwareNetworkStackforDataCenters

AliMusaIftikhar(Tufts)

Ihsan AyyubQazi(LUMS)

FahadR.Dogar(Tufts)

Page 2: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

TheProblemofTailLatency inDataCenters!

2

Page 3: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

TheProblemofTailLatency inDataCenters!

Highfan-out

3

Page 4: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

TheProblemofTailLatency inDataCenters!

• Loadimbalance• Backgroundtasks• Failures,etc.

Highfan-out

Straggler

4

Page 5: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

TheProblemofTailLatency inDataCenters!

• Loadimbalance• Backgroundtasks• Failures,etc.

LongtaillatencyHighfan-out Stragglers+

Highfan-out

Straggler

5

Page 6: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Howtoavoidstragglers?

Reactively Proactively

6

Page 7: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Howtoavoidstragglers?

Reactively Proactively

PRO:lowoverhead

Hopper(SIGCOMM’15)C3(NSDI’15)

Sinbad(SIGCOMM’13)

CON:requiresstragglerdetection(slowandinaccurate)

7

Page 8: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Howtoavoidstragglers?

Reactively Proactively

PRO:lowoverhead

Dolly(NSDI’13)Lowlatencyvia

redundancy(CoNext’13)

Hopper(SIGCOMM’15)C3(NSDI’15)

Sinbad(SIGCOMM’13)

PRO:fastandaccurate

CON:requiresdeterminingthresholdload(non-trivial)

CON:requiresstragglerdetection(slowandinaccurate)

8

Page 9: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Howtoavoidstragglers?

Reactively Proactively

PRO:lowoverhead

Can we achieve the benefits of both without their limitations?

Dolly(NSDI’13)Lowlatencyvia

redundancy(CoNext’13)

Hopper(SIGCOMM’15)C3(NSDI’15)

Sinbad(SIGCOMM’13)

PRO:fastandaccurate

CON:requiresdeterminingthresholdload(non-trivial)

CON:requiresstragglerdetection(slowandinaccurate)

9

Page 10: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Overview• Duplicate-Aware Scheduling Framework

• Redundancy-Aware Network Stack

• Preliminary Results

Genericframework

NewnetworkstackforDC

10

Page 11: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2 11

Page 12: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

low

1. PriorityQueues

12

Page 13: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

low

request

1. PriorityQueues

13

Page 14: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

low

request

1. PriorityQueues

14

Page 15: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

low

P

request

B

1. PriorityQueues

15

Page 16: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

P

high

lowB

request

1. PriorityQueues

16

Page 17: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

lowB

request

1. PriorityQueues

17

Page 18: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Duplicate-awarescheduling

Replica1

Client

Replica2

high

low

high

lowB

request

purge

1. PriorityQueues2. Purging

18

Page 19: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

NeedforPriorityQueuing

high

lowbackup

primary

19

ØDuplicationhasanoverhead!

L

Page 20: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

NeedforPriorityQueuing

high

lowbackup

primary

üStrictprioritiesüWorkconservationüPreemption

20

ØDuplicationhasanoverhead!

LPropertiesrequired:

Page 21: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

NeedforPriorityQueuing

high

lowbackup

primary

üStrictprioritiesüWorkconservationüPreemption

21

ØDuplicationhasanoverhead!

LPropertiesrequired:

PQ makes the overhead of duplication low. sJ

Page 22: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

NeedforPriorityQueuing

high

lowbackup

primary

üStrictprioritiesüWorkconservationüPreemption

22

ØDuplicationhasanoverhead!

LPropertiesrequired:

PQ makes the overhead of duplication low. sJessential

Page 23: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

ImportanceofPurging

ØStalerequestsblocknewrequests.

Lhigh

lowreq1req2

stale

23

Page 24: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

ImportanceofPurging

high

lowreq1req2

stale

ØStalerequestsblocknewrequests.

L

Purging makes the system more efficient! A J24

Page 25: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

ImportanceofPurging

high

lowreq1req2

stale

ØStalerequestsblocknewrequests.

L

Purging makes the system more efficient! A Joptimization 25

Page 26: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

26

Page 27: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

Network

27

Page 28: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

Compute

Network

28

Page 29: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

Memory

Compute

Network

29

Page 30: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

GFSHDFS BigTable

Memory

Compute

Filesystem/Database

Network

30

Page 31: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RealizingDuplicate-AwareSchedulingateverypotentialbottleneck resourceinaDC

GFSHDFS BigTable

Storage

Memory

Compute

Filesystem/Database

Network

31

Page 32: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

ateverypotentialbottleneck resourceinaDC

GFSHDFS BigTable

Memory

Compute

Storage

Network

Filesystem/Database

In-networkpurging

Prioritization

Purging+preemption

challenges

32

RealizingDuplicate-AwareScheduling

Page 33: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

+purging

33

Page 34: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

Applicationsneedtobemodified.

challenge

+purging

34

Page 35: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

Applicationsneedtobemodified.

Expressiveinterfaceallowsrichcommunicationb/wAppand

Transport.E.g.DAG

challenge

opportunity

+purging

35

Page 36: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

Applicationsneedtobemodified.

Expressiveinterfaceallowsrichcommunicationb/wAppand

Transport.E.g.DAG

challenge

opportunity

Hardtoimplementperpacketpurging.

challenge+purging

36

Page 37: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

Applicationsneedtobemodified.

Expressiveinterfaceallowsrichcommunicationb/wAppand

Transport.E.g.DAG

challenge

opportunity

Hardtoimplementperpacketpurging.

challenge

AddssupportforexistingPQsinDCswitches.

opportunity

+purging

37

Page 38: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RedundancyAwareNetworkStack(RANS)

Application

Transport

Link

Network

Duplicate-Awareness

Pointtomultipoint

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Layer NewRole

+purging

Applicationsneedtobemodified.

Expressiveinterfaceallowsrichcommunicationb/wAppand

Transport.E.g.DAG

challenge

opportunity

Hardtoimplementperpacketpurging.

challenge

AddssupportforexistingPQsinDCswitches.

opportunity

+purging

38

Page 39: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

e.g.Improvedfaulttolerance

üMultipath

üMulti-destination

RANSTransport:PointtoMulti-point

ØEnables:Richtransport

Sender1(replica1)

Receiver(client)

Sender2(replica2)

39

Page 40: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RANSTransport:ByteAggregation

Sender1(replica1)

Receiver(client)

Sender2(replica2)

ØOpportunity:Receiverdriventransport

Response

e.g.Moreefficientcongestioncontrol(2xormore)

üTwoormoreresponsestreams

üAggregatebytesatreceiverside

40

Page 41: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RANSTransport:PriorityAssignment

Sender1(replica1)

Receiver(client)

Sender2(replica2)

ØDynamicreplicaassignment

Response+Feedback

e.g.Improvedreplicaassignment

üFinegrainedmonitoringofcongestionwindow

üDynamicallyreprioritizeflows

üFeedbacktoApplication

41

Page 42: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Overview• Duplicate-Aware Scheduling Framework

• Redundancy-Aware Network Stack

• Preliminary Results

42

Page 43: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

PreliminaryEvaluation:ns-2setupdetailsØ Storagescenario

Client

10servers

Replica1

Replica2

43

Page 44: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

PreliminaryEvaluation:ns-2setupdetailsØ Storagescenario

Client

10servers

Replica1

Replica2

bottlenecks

44

Page 45: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

PreliminaryEvaluation:ns-2setupdetails

TrafficDetails

Totalrequests 20K

Arrivalprocess Poisson

Server&replicaselection Uniformlyrandom

Ø Storagescenario

Client

10servers

Replica1

Replica2

bottlenecks

45

Page 46: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

PreliminaryEvaluation:ns-2setupdetails

TrafficDetails

Totalrequests 20K

Arrivalprocess Poisson

Server&replicaselection Uniformlyrandom

Ø Storagescenario

Client

10servers

Replica1

Replica2

bottlenecks

The only source of stragglers is load imbalance.

46

Page 47: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

47

Page 48: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

48

Page 49: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

49

Page 50: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

50

Page 51: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

51

Page 52: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

~2X

52

Page 53: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

~2X

Expecting more gains even at lower loads with additional straggler sources.

53

Page 54: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Noduplicates(baseline)

2-copies(proactivew/oPQ)

+PQs

+Purging

+ByteAggregation(RANS)

Averagerequestcompletiontimeof:

Load(%)

Requ

estcom

pletiontim

e(s)

50-80% improvement over the baseline across all loads.

Expecting more gains even at lower loads with additional straggler sources.

~2X

54

Page 55: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Summary&Futurework

• TheIssueofStragglers

• Duplicate-AwareSchedulingFramework

• RANS

• ImplementinginHDFSandCassandra

Simpleyetchallengingsolution

Afirststeptowardsaduplicate-awarenetwork

55

Page 56: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

RANS:FeedbackandDiscussion

• AliMusaIftikhar([email protected])

• FahadR.Dogar ([email protected])

• Ihsan A.Qazi ([email protected])Transport

Link

Network

PointtomultipointByteaggregationPriorityassignment

PriorityQueues

Physical Sameasbefore

Sameasbefore

ExpressiveInterface

Application Duplicate-Awareness

Layer NewRole

+purging

+purging

56

Page 57: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Possiblequestions– backupslide• Preemptionoverhead

• Notreallyanissueinthenetworkbecausepacketsaresmall.

• Packetpurging• PFC(backpressure,buildqueuesattheendhostsandpurgethem)

• Droptheentireduplicatequeue(easierthanper-packetdrops)

• Recenttrendtowardsprogrammableswitches

• GainswithPQ• Moregainswithfailuresasstragglers(primaryundergoesafailure)

• Alsomorebenefitswithdifferentresources

• Duplicationoverheadatclient• Clientisusuallynotthebottleneck

• Non-Idempotentrequests• Wearetargetingtheclassofappswhichhaveflexibleendpointsandrequireatleastoncesemantics

• Replicatingonlysmallpacketsandprioritizingthem• Onlybeneficialwithbursty smallflows• HDFShaveatypicalchunksizeb/w64MB-128MB

• Quorumsystems• RANScomplementssuchsystems,theycanusethistechniqueandsendKoutofNrequestsathighprio whileN-Kasbackups

• Can’tjustimplementattheappandgetthesamebenefits?• Networkcouldbeabottleneck• Finegrainedcontrol,muchmorecontrol

• Rootcausesofperformanceimprovement• PQavoidsoverheads• Nowwecaneasilygetthebenefitsofduplicationslikeaggregationetc.

• Purgingwillalsoattimespurgeprimarymakingthesystemmoreefficient.

57

Page 58: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Foodforthought

DCPrimary DCFailover

InterDCDuplicate-AwareScheduling

e.g.Google’sGeo-DistributedDatabase“Spanner”(OSDI’12)

58

Page 59: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

Foodforthought

DCPrimary DCFailover

InterDCDuplicate-AwareScheduling

e.g.Google’sGeo-DistributedDatabase“Spanner”(OSDI’12)

Prefetch

SearchsuggestionsSpellcheck

• Searchenginesdropspellcheck,suggestions,etc.athighloads.

• Canbenefitfromduplicate-awarescheduling.

59

Page 60: Towards a Redundancy -Aware Network Stack for Data Centersmusa/uploads/hotnets_2016_talk.pdf · Towards a Redundancy -Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts)

WhenRANSworksbest?

• Applicationfanout ishighandstragglersarefrequent.• End-pointsareflexibleand“atleastonce”semanticsaresufficient.• Clientisnotthebottleneck.• Requestsizesaresmall(orpreemptionoverheadisminimal).

60