Fast and Efficient Flow and Loss Measurement for Data Center Networks
Yuliang Li, Rui Miao, Changhoon Kim, Minlan Yu

Post on 19-Jul-2020

TRANSCRIPT

Page 1: Fast and Efficient Flow and Loss Measurement for Data ...netseminar.stanford.edu/seminars/05_23_16.pdf · – Small memory:2.36MB for 100K flows • Switch control plane – Collects

Fast and Efficient Flow and Loss Measurement for Data Center Networks

Yuliang Li, Rui Miao, Changhoon Kim, Minlan Yu

Page 2

FlowRadar: Capture all flows on a fine time scale

Page 3

Flow coverage

• We need to inspect each individual flow
– Defined by the 5-tuple: source and destination IPs, ports, protocol

Use cases: transient loops/blackholes; fine-grained traffic analysis

[Figure: distribution (0% to 60%) versus size in bytes (1 to 10000, log scale)]

Page 4

Temporal coverage

• We need flow information on a fine time scale

Use cases: short-time-scale loss rates; timely attack detection (DoS)

[Figure: #Loss (0 to 6) versus time (0 to 100 ms)]

Page 5

Key insight: division of labor

• Goal: Monitor all the flows on a fine time scale

[Figure: NetFlow and Mirroring positioned by overhead at the collector versus overhead at the switches]

• Switches: limited per-packet processing time, limited memory (10s of MB)
• Collector: limited bandwidth and storage
• Consequence: needs sampling

Page 6

Key insight: division of labor

• Goal: Monitor all the flows on a fine time scale

[Figure: FlowRadar, NetFlow, and Mirroring positioned by overhead at the collector versus overhead at the switches]

• FlowRadar: use fixed operations per packet at switches
• Small data structures

Page 7

FlowRadar architecture

[Figure: Switches keep compact flow counters with fixed per-packet operations and send a periodic report to the collector. The collector extracts per-flow counters across switches and passes flows & counters to the analysis applications.]

Page 8

Challenge: How to handle collisions?

• Handling hash collisions is hard
– Large hash tables → high memory usage
– Linked list/Cuckoo hashing → multiple, non-constant number of memory accesses

[Figure: a table with one row per flow (a, b, c) and its PacketCount; a packet of a new flow d hashes to an occupied entry, causing a collision]

Page 9

Solution: Embrace collisions

• Handling hash collisions is hard
– Large hash tables → high memory usage
– Linked list/Cuckoo hashing → multiple, non-constant number of memory accesses
• Embracing the collisions
– Less memory and a constant number of memory accesses

Page 10

Solution: Embrace collisions

• Embracing the collisions: XOR up all the flows
– Less memory and a constant number of memory accesses

Counting table [Invertible Bloom Lookup Table (arXiv 2011)]
Each cell holds three fields: FlowXor, FlowCount, PacketCount
S(x): number of packets in flow x

[Figure: flows a, b, c with packet counts S(a), S(b), S(c) are each hashed by H1, H2, H3 into three cells of the counting table. A cell holding only a stores FlowXor = a, FlowCount = 1, PacketCount = S(a); a cell shared by b and c stores FlowXor = b⊕c, FlowCount = 2, PacketCount = S(b)+S(c).]

Page 11

Flow filter to identify new flows

• 1. Check and update the flow filter
• 2. Update the counting table
– The first packet from a new flow updates all fields
– Subsequent packets update only PacketCount

Encoded flowset = flow filter (Bloom filter: identifies new flows) + counting table [Invertible Bloom Lookup Table (arXiv 2011)]

Counting table:
FlowXor      a     a⊕b        b⊕c        b⊕c        a     c
FlowCount    1     2          2          2          1     1
PacketCount  S(a)  S(a)+S(b)  S(b)+S(c)  S(b)+S(c)  S(a)  S(c)

[Figure: the first packet of a new flow d is hashed by H1, H2, H3; in each of the three cells FlowXor is XORed with d (⊕d), FlowCount gets +1, and PacketCount gets +1. A later packet of d only adds +1 to PacketCount in those cells.]
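The two per-packet steps above can be sketched in Python as follows. This is only an illustration, not the P4 implementation: the table sizes, the SHA-256-based hash functions, and the names `EncodedFlowset` and `on_packet` are assumptions, and a flow ID is represented as a plain integer standing in for the 5-tuple.

```python
import hashlib

K = 3               # hash functions per structure (H1, H2, H3 on the slide)
FILTER_BITS = 1024  # flow filter size (illustrative assumption)
CELLS = 16          # counting table size (illustrative assumption)


def _hashes(flow_id, size, tag):
    """Return K positions in [0, size); SHA-256 is a stand-in hash."""
    positions = []
    for i in range(K):
        digest = hashlib.sha256(f"{tag}:{i}:{flow_id}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % size)
    return positions


class EncodedFlowset:
    """Flow filter (Bloom filter) plus counting table (hypothetical name)."""

    def __init__(self):
        self.flow_filter = [False] * FILTER_BITS
        self.flow_xor = [0] * CELLS
        self.flow_count = [0] * CELLS
        self.packet_count = [0] * CELLS

    def on_packet(self, flow_id):
        # Step 1: check and update the flow filter.
        bits = _hashes(flow_id, FILTER_BITS, "filter")
        is_new = not all(self.flow_filter[b] for b in bits)
        for b in bits:
            self.flow_filter[b] = True
        # Step 2: update the counting table. Only the first packet of a
        # flow touches FlowXor and FlowCount; every packet adds to
        # PacketCount.
        for c in _hashes(flow_id, CELLS, "table"):
            if is_new:
                self.flow_xor[c] ^= flow_id
                self.flow_count[c] += 1
            self.packet_count[c] += 1


flowset = EncodedFlowset()
for _ in range(3):
    flowset.on_packet(0xABCDEF)  # three packets of one hypothetical flow
```

Each packet costs a fixed number of hash computations and memory updates regardless of how many flows are active, which is the property the slide emphasizes.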

Page 12

Easy to implement in merchant silicon

• Switch data plane
– Fixed operations per packet
– Small memory: 2.36 MB for 100K flows
• Switch control plane
– Collects the small encoded flowset every 10 ms
• We implemented it using the P4 language
– Portable to many P4-compatible switch chips

Page 13

FlowRadar architecture

[Figure: Switches keep compact flow counters with fixed per-packet operations and send a periodic report (one encoded flowset per switch) to the collector. The collector extracts per-flow counters across switches in two stages, Stage 1 SingleDecode and Stage 2 Network-wide Decode, and passes flows & counters to the analysis applications.]

Page 14

Stage 1. SingleDecode

Input: an encoded flowset from a single switch
– Flow filter (Bloom filter)
– Counting table (FlowXor, FlowCount, PacketCount)

Output: per-flow counters (Flow, #packets), e.g. a, S(a)

Page 15

Stage 1. SingleDecode

• Find a pure cell: a cell with one flow
• Remove the flow from all cells

Counting table (cells indexed by H1, H2, H3):
FlowXor      a  a⊕b  b⊕c⊕d  b⊕c⊕d  a  c⊕d
FlowCount    1  2    3      3      1  2
PacketCount  5  12   13     13     5  6

The first cell is pure (FlowCount = 1), so flow a is decoded with 5 packets. Removing a subtracts 1 from FlowCount and 5 from PacketCount in each of its three cells.

Decoded:
Flow  #packets
a     5

Page 16

Stage 1. SingleDecode

• Find a pure cell: a cell with one flow
• Remove the flow from all cells
– Creates more pure cells
• Iterate until no pure cells remain

Counting table after removing a:
FlowXor      0  b  b⊕c⊕d  b⊕c⊕d  0  c⊕d
FlowCount    0  1  3      3      0  2
PacketCount  0  7  13     13     0  6

Decoded:
Flow  #packets
a     5

Page 17

Stage 1. SingleDecode

Counting table after removing a and b:
FlowXor      0  0  c⊕d  c⊕d  0  c⊕d
FlowCount    0  0  2    2    0  2
PacketCount  0  0  6    6    0  6

Decoded:
Flow  #packets
a     5
b     7

No pure cell remains, so SingleDecode stops here with c and d still undecoded.
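The peeling loop walked through on pages 15 to 17 can be sketched as below. The sketch replays the slide's table; the integer flow IDs and the fixed `CELLS_OF` lookup (which stands in for recomputing H1, H2, H3) are assumptions chosen to match the cell layout shown on the slides.

```python
def single_decode(flow_xor, flow_count, packet_count, cells_of):
    """Peel pure cells (FlowCount == 1) until none remain.

    cells_of(flow) returns the cells a flow hashes to; a real decoder
    would recompute the switch's hash functions here.
    """
    flow_xor = list(flow_xor)
    flow_count = list(flow_count)
    packet_count = list(packet_count)
    decoded = {}
    progress = True
    while progress:
        progress = False
        for i in range(len(flow_count)):
            if flow_count[i] != 1:
                continue
            # A pure cell holds exactly one flow and its packet count.
            flow, pkts = flow_xor[i], packet_count[i]
            decoded[flow] = pkts
            # Remove the flow from every cell it was hashed into.
            for c in cells_of(flow):
                flow_xor[c] ^= flow
                flow_count[c] -= 1
                packet_count[c] -= pkts
            progress = True
    return decoded, flow_count


# Hypothetical integer IDs for flows a, b, c, d.
A, B, C, D = 1, 2, 4, 8
# Cell positions read off the slide's table (0-indexed).
CELLS_OF = {A: (0, 1, 4), B: (1, 2, 3), C: (2, 3, 5), D: (2, 3, 5)}

decoded, remaining = single_decode(
    flow_xor=[A, A ^ B, B ^ C ^ D, B ^ C ^ D, A, C ^ D],
    flow_count=[1, 2, 3, 3, 1, 2],
    packet_count=[5, 12, 13, 13, 5, 6],
    cells_of=CELLS_OF.__getitem__,
)
print(decoded)    # {1: 5, 2: 7}, i.e. a has 5 packets and b has 7
print(remaining)  # [0, 0, 2, 2, 0, 2], c and d stay entangled
```

Flows c and d share all three cells in this example, so no amount of peeling on this switch alone separates them.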

Page 18

FlowRadar architecture

[Figure: Switches send periodic reports (encoded flowsets) to the collector. The collector runs Stage 1 SingleDecode and Stage 2 Network-wide Decode, then passes flows & counters to the analysis applications.]

Page 19

Key insight: overlapping sets of flows

• The sets of flows overlap across hops
– We can use the redundancy to decode more flows
• Use different hash functions across hops

[Figure: flows a, b, c, d traverse two hops; each hop has its own counting table (PktCount values not shown)]

First hop:
FlowXor    a  a⊕c⊕d  b⊕c⊕d  a⊕b⊕c  b⊕d
FlowCount  1  3      3      3      2

Second hop:
FlowXor    a⊕d  a⊕c  b⊕c⊕d  a⊕b⊕c  b⊕d
FlowCount  2    2    3      3      2

Use 5 cells to decode 4 flows (10 cells across the two hops)

The collector can leverage flowsets from all switches to decode more flows

Page 20

Key insight: overlapping sets of flows

• The sets of flows overlap across hops
– We can use the redundancy to decode more flows
• Use different hash functions across hops
• Provision switch memory based on avg(#flows), not max(#flows)
– SingleDecode for normal cases
– Network-wide decode for bursts of flows

Page 21

Challenge 1: sets of flows not fully overlapped

• Flows from one switch may go to different next hops
• One switch receives flows from multiple hops

[Figure: one switch carries flows a, b, c, d; a neighboring switch carries flows a, c, d, e]

Page 22

Solution: Zigzag decode with flow filters

• Generalize to the entire network
– No need for routing information
– Supports incremental deployment

[Figure: FlowDecode between two switches, one carrying flows a, b, c, d and the other a, c, d, e. SingleDecode runs on each switch; a flow decoded at one switch (for example a) is checked against the other switch's flow filter and removed from the other switch's counting table, then SingleDecode resumes.]

Page 23

Challenge 2: different counter values

• The counters of the same flow may differ across hops
– Some packets may get lost
– Some packets may be in flight

[Figure: a packet is dropped between two hops]

Page 24

Solution: calculate counters for each switch

• Solve a linear equation system for each switch
– The full set of flows at each switch is already known
– Solvable with n flows and >= n combinations of counters

[Figure: a switch carrying flows a, b, c, d. One counting-table cell has FlowXor = a⊕b⊕d, FlowCount = 3, PktCount = 14, which gives the equation S(a) + S(b) + S(d) = 14; the other cells give further equations.]

Solution:
Flow  #pkt
a     5
b     7
c     4
d     2

Page 25

Solving the linear equation system

• Challenge: solving the linear equation system is not fast
• The matrix is very sparse: each column has k "1"s
• We use iterative solvers
• Speed up the iteration:
– Start the iteration from close approximations of the counters, which are obtained from FlowDecode
– Stop the iteration once each result stays within ±0.5 of an integer
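A small sketch of CounterDecode as an iterative solve is shown below. The slides do not say which iterative solver is used, so the Kaczmarz row-projection method here is a swapped-in assumption, as are the function name `counter_decode`, the six-cell layout, and the warm-start values. The stopping rule is also simplified: the loop stops when updates become tiny and then rounds to the nearest integer.

```python
def counter_decode(cell_flows, cell_packets, warm_start,
                   max_sweeps=1000, tol=1e-6):
    """Solve sum(S(f) for f in cell) == PacketCount for every cell.

    cell_flows[i] lists the flows known (from FlowDecode) to be in cell i.
    warm_start gives an initial guess per flow, e.g. the counters decoded
    for the same flows at a neighboring switch.
    """
    x = {flow: float(v) for flow, v in warm_start.items()}
    for _ in range(max_sweeps):
        biggest_step = 0.0
        for flows, total in zip(cell_flows, cell_packets):
            if not flows:
                continue
            # Kaczmarz step for a 0/1 row: spread the residual evenly
            # over the flows that appear in this cell.
            residual = total - sum(x[f] for f in flows)
            step = residual / len(flows)
            for f in flows:
                x[f] += step
            biggest_step = max(biggest_step, abs(step))
        if biggest_step < tol:
            break
    # Counters are integers, so round once the values have settled.
    return {flow: round(v) for flow, v in x.items()}


# Hypothetical switch: each flow sits in k = 3 of the 6 cells.
cell_flows = [("a", "d"), ("a", "b"), ("b", "c"),
              ("b", "c", "d"), ("a",), ("c", "d")]
cell_packets = [7, 12, 11, 13, 5, 6]
# Warm start from a neighbor where c and d each had one more packet.
warm_start = {"a": 5, "b": 7, "c": 5, "d": 3}

counters = counter_decode(cell_flows, cell_packets, warm_start)
print(counters)  # {'a': 5, 'b': 7, 'c': 4, 'd': 2}
```

Because counters differ only slightly between neighboring hops, the warm start is already near the solution and few sweeps are needed.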

Page 26

FlowRadar architecture

[Figure: Switches send periodic reports (encoded flowsets) to the collector. The collector runs Stage 1 SingleDecode, then Stage 2 Network-wide Decode, which consists of Stage 2.1 FlowDecode and Stage 2.2 CounterDecode. FlowDecode takes each switch's flow filter and counting table (FlowXor, FlowCount, PacketCount) and produces the list of flows per switch (for example a, b, c at one switch and x, y, z at another). CounterDecode then fills in the packet counts (a 5, b 7, c 4; x 3, y 5, z 6). The resulting flows & counters feed the analysis applications.]

Page 27

Evaluations

[Figure: FlowRadar architecture with Stage 1 SingleDecode, Stage 2.1 FlowDecode, and Stage 2.2 CounterDecode at the collector]

• Memory usage is efficient
• Bandwidth usage is low
• NetDecode vs. SingleDecode

Page 28

Evaluation

• Simulation of a k=8 FatTree (80 switches, 128 hosts) in ns-3
• Configure the memory based on avg(#flow)
– When a burst of flows happens, use network-wide decode
• The worst case is when all switches are pushed to max(#flow)
– Traffic: each switch has the same number of flows, and thus the same memory
• Each switch reports its flowset every 10 ms

Page 29

Memory efficiency

[Figure: plot whose axis labels and tick values were lost in extraction. Surviving annotations: "#cell = #flow (impractical)", "FlowRadar: 2.36 MB", "FlowRadar: 24.8 MB".]

Page 30

Bandwidth usage

• Evaluated based on Facebook data centers (SIGCOMM '15)
– Each ToR has fewer than 100K flows per 10 ms → less than 2.3 Gbps
– Each ToR talks to 44 × 10G hosts
– FlowRadar consumes 0.52% of total ToR bandwidth
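The 0.52% figure follows from the numbers on this slide, as the short calculation below shows. The second calculation is an assumption-laden cross-check: it takes the 2.36 MB flowset size and 10 ms report interval quoted earlier in the talk and assumes 1 MB = 10^6 bytes.

```python
# Figures quoted on the slide.
report_gbps = 2.3            # upper bound on report traffic per ToR
hosts, host_gbps = 44, 10    # 44 hosts at 10 Gbps each

tor_capacity_gbps = hosts * host_gbps            # 440 Gbps
overhead_percent = 100 * report_gbps / tor_capacity_gbps
print(f"{overhead_percent:.2f}%")                # 0.52%

# Cross-check (assumes 1 MB = 10**6 bytes): one 2.36 MB flowset
# every 10 ms stays under the 2.3 Gbps bound.
flowset_gbps = 2.36e6 * 8 / 0.010 / 1e9
print(f"{flowset_gbps:.3f} Gbps")                # 1.888 Gbps
```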

Page 31

NetDecode vs. SingleDecode

[Figure: plot comparing SingleDecode and NetDecode; axis labels and tick values were lost in extraction. Surviving annotations: "SingleDecode: 100K flows, 10 ms", "NetDecode: 126.8K flows, 3 sec", "NetDecode needs more time".]

Page 32

NetDecode

• CounterDecode takes the majority of the decoding time
• CounterDecode limits the number of counters that can be decoded
– If #flows > 126.8K, we can still decode all flows (up to 152K) without counters using FlowDecode

Page 33

FlowRadar analyzer

[Figure: FlowRadar architecture. Switches send periodic reports (encoded flowsets) to the collector, which runs Stage 1 SingleDecode and Stage 2 Network-wide Decode; the resulting flows & counters feed the analysis applications.]

Page 34

Analysis applications

• Flow coverage
– Transient loops/blackholes
– Errors in match-action tables
– Fine-grained traffic analysis
• Temporal coverage
– Short-time-scale per-flow loss rate
– ECMP load imbalance
– Timely attack detection

Page 35

Per-flow loss map: better temporal coverage

• Detect losses after every flowlet

[Figure: packets of one flow over time slots 1 to 11 at Switch 1 and Switch 2 (0 to 5 packets per slot). Annotated counter pairs: 15 packets versus 14 packets, and 35 packets versus 34 packets. Markers show the point at which FlowRadar detects the loss and the point at which NetFlow detects it.]
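Once per-flow counters are decoded for two consecutive switches in the same reporting interval, a loss map is a per-flow subtraction. The sketch below assumes that reading of the figure; the function name `loss_map` and the flow names are hypothetical, and a real analyzer would also have to account for packets still in flight between the two switches.

```python
def loss_map(upstream_counts, downstream_counts):
    """Per-flow packet loss between two consecutive hops for one interval."""
    losses = {}
    for flow, sent in upstream_counts.items():
        received = downstream_counts.get(flow, 0)
        if sent > received:
            losses[flow] = sent - received
    return losses


# Counters in the style of the figure: 15 packets at Switch 1,
# 14 packets at Switch 2 (flow_y is an extra, loss-free flow).
switch1 = {"flow_x": 15, "flow_y": 20}
switch2 = {"flow_x": 14, "flow_y": 20}
print(loss_map(switch1, switch2))  # {'flow_x': 1}
```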

Page 36

FlowRadar conclusion

• Report all per-flow counters on a fine time scale
• Fully leverage both switches and the collector
– Switches: encoded flowsets with fixed per-packet processing time and low memory usage
– Collector: network-wide decoding

[Figure: FlowRadar, NetFlow, and Mirroring positioned by overhead at the collector versus overhead at switches]

Page 37

LossRadar: Detect individual losses on a fine time scale

Page 38

Losses have significant impact

• Significant latency increase and throughput drop
– Violating SLAs and reducing revenue
• It takes operators up to tens of hours to find and fix the problem

Takeaway: We want to know about losses ASAP

Page 39

Challenge: many types of losses

[Figure: switch pipeline. Input ports 1 to n → input buffer → parser → ingress match-actions (L2 → L3 → ACL) → shared buffer → egress match-actions → output ports 1 to n, with the switch CPU configuring the pipeline. Loss causes annotated along the pipeline: corruptions, misconfigurations, updates, resource shortage, grey failure.]

Takeaway: We need a generic tool

Page 40

Challenge: losses could happen everywhere

[Figure: TCP at the end host: "I have losses, but I don't know where". ECMP: multiple possible paths.]

Takeaway: We need the locations of the losses

Page 41

Challenge: lack of details of losses

• Difficult to infer the root causes

[Figure: Flow 1 and Flow 2 with pass/drop outcomes: different flows experience different loss patterns]

[Figure: #Loss (0 to 6) versus time (0 to 100 ms): loss patterns change over time]

Takeaway: We need the information of individual losses

Page 42

LossRadar overview

• Fast detection → Periodically send traffic digests to the collector
• Knowing location → Install meters to monitor traffic
• Need info of individual losses → Include details of each loss
• Being generic → Compare the sets of packets

[Figure: switches send traffic digests to the digest collector]

Page 43

How to cover the whole network

• Need to cover all pipelines

[Figure: switches S1, S2, S3, each with an ingress pipeline, a shared buffer, and an egress pipeline. Upstream and downstream meters are paired so that one pair covers the ingress pipeline and another covers the egress pipeline.]

UM: Upstream meter
DM: Downstream meter

Page 44

LossRadar overview

• Fast detection → Periodically send traffic digests to the collector
• Knowing location → Install meters to monitor traffic
• Need info of individual losses → Include details of each loss
• Being generic → Compare the sets of packets

[Figure: switches send traffic digests to the digest collector]

Page 45

How to compare the sets of packets

• Using an Invertible Bloom Filter (IBF) [SIGCOMM '11]
• Only O(#loss) memory
– Each end keeps a small piece of memory
– Same packets at both ends will cancel out
– Only the differences remain

[Figure: the upstream and downstream meters both report to the collector]

Page 46

How to achieve low memory usage

Initial state: the upstream and downstream meters each have five cells, all zero; both report to the collector.

Upstream meter:
xorSum  0  0  0  0  0
count   0  0  0  0  0

Downstream meter:
xorSum  0  0  0  0  0
count   0  0  0  0  0

Page 47

How to achieve low memory usage

Packet a (a = 5-tuple + IPID) is recorded in three cells of the upstream meter and is then dropped before reaching the downstream meter.

Upstream meter:
xorSum  a  0  a  a  0
count   1  0  1  1  0

Downstream meter:
xorSum  0  0  0  0  0
count   0  0  0  0  0

Page 48

How to achieve low memory usage

Packet b is recorded at the upstream meter, arrives, and is recorded at the downstream meter.

Upstream meter:
xorSum  a  b  a⊕b  a  b
count   1  1  2    1  1

Downstream meter:
xorSum  0  b  b  0  b
count   0  1  1  0  1

Page 49

How to achieve low memory usage

Packet c is recorded at the upstream meter and is then dropped.

Upstream meter:
xorSum  a⊕c  b⊕c  a⊕b  a⊕c  b
count   2    2    2    2    1

Downstream meter:
xorSum  0  b  b  0  b
count   0  1  1  0  1

Page 50

How to achieve low memory usage

Packet d is recorded at both meters. The collector then subtracts the downstream meter from the upstream meter.

Upstream meter:
xorSum  a⊕c⊕d  b⊕c  a⊕b⊕d  a⊕c  b⊕d
count   3      2    3      2    2

Downstream meter:
xorSum  d  b  b⊕d  0  b⊕d
count   1  1  2    0  2

After subtraction at the collector (only the lost packets a and c remain):
xorSum  a⊕c  c  a  a⊕c  0
count   2    1  1  2    0
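The walkthrough on pages 46 to 50 can be replayed with the sketch below. It is a minimal illustration rather than the switch implementation: the class name `Meter`, the integer packet IDs, and the fixed `CELLS_OF` lookup (standing in for the hash functions, with positions read off the slides) are assumptions.

```python
class Meter:
    """One IBF-style meter: an xorSum and a count per cell."""

    def __init__(self, n_cells):
        self.xor_sum = [0] * n_cells
        self.count = [0] * n_cells

    def add(self, packet_id, cells):
        for c in cells:
            self.xor_sum[c] ^= packet_id
            self.count[c] += 1


def decode_losses(upstream, downstream, cells_of):
    """Subtract the two meters, then peel cells whose count is 1."""
    xor_sum = [u ^ d for u, d in zip(upstream.xor_sum, downstream.xor_sum)]
    count = [u - d for u, d in zip(upstream.count, downstream.count)]
    lost = set()
    progress = True
    while progress:
        progress = False
        for i in range(len(count)):
            if count[i] != 1:
                continue
            packet = xor_sum[i]  # a count-1 cell holds exactly one lost packet
            lost.add(packet)
            for c in cells_of(packet):
                xor_sum[c] ^= packet
                count[c] -= 1
            progress = True
    return lost, count


# Hypothetical packet IDs (on the slides an ID is 5-tuple + IPID).
A, B, C, D = 0xA, 0xB, 0xC, 0xD
# Cell positions read off the slides (0-indexed).
CELLS_OF = {A: (0, 2, 3), B: (1, 2, 4), C: (0, 1, 3), D: (0, 2, 4)}

upstream, downstream = Meter(5), Meter(5)
for packet in (A, B, C, D):
    upstream.add(packet, CELLS_OF[packet])
for packet in (B, D):  # a and c were dropped on the way
    downstream.add(packet, CELLS_OF[packet])

lost, leftover = decode_losses(upstream, downstream, CELLS_OF.__getitem__)
print(sorted(hex(p) for p in lost))  # ['0xa', '0xc']
print(leftover)                      # [0, 0, 0, 0, 0]
```

Packets seen at both meters cancel in the subtraction, so the number of cells needed depends on the number of losses rather than on the traffic volume.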

Page 51

Benefits

• Memory-efficient
– Proportional to #losses
– 10 KB per meter (traffic pattern reported in DCTCP)
• Extend to collect more packet information
– TTL: helps identify loops
– Timestamp: tagged in the packet headers at the upstream
– Any other fields that programmable switches can configure

Page 52

Challenges

• How to ensure UM and DM capture the same set of packets
– Use the packet header to carry a batch ID
• Incremental deployment
– Put UM and DM around the black box
– Compare sum(UMi) and sum(DMj)

Page 53

LossRadar conclusion

• Design a fast and efficient loss detection system
– Provides locations and details of individual losses
– Generic to any type of loss
– Memory only proportional to #losses
• Future work: design an analyzer to diagnose problems
– Temporal analysis, correlation across flows and switches
– Combine with info provided by hosts

Page 54

Conclusion

• Full visibility of data centers
– Report all the flows and all the lost packets in a few milliseconds across all the switches
• Efficient data structures on programmable switches
– Both FlowRadar and LossRadar are implemented in P4
– Technology transfer to Barefoot switches (see my demo tomorrow)