flow control an engineering approach to computer networking

59
Flow Control An Engineering Approach to Computer An Engineering Approach to Computer Networking Networking

Upload: grady-stukes

Post on 28-Mar-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Flow Control An Engineering Approach to Computer Networking

Flow Control

An Engineering Approach to Computer NetworkingAn Engineering Approach to Computer Networking

Page 2: Flow Control An Engineering Approach to Computer Networking

Flow control problem

Consider file transferConsider file transfer Sender sends a stream of packets representing fragments of a Sender sends a stream of packets representing fragments of a

filefile Sender should try to match rate at which receiver and network Sender should try to match rate at which receiver and network

can process datacan process data Can’t send too slow or too fastCan’t send too slow or too fast Too slowToo slow

wastes time wastes time Too fastToo fast

can lead to buffer overflowcan lead to buffer overflow How to find the correct rate?How to find the correct rate?

Page 3: Flow Control An Engineering Approach to Computer Networking

Other considerations

SimplicitySimplicity OverheadOverhead ScalingScaling FairnessFairness StabilityStability

Many interesting tradeoffsMany interesting tradeoffs overhead for stabilityoverhead for stability simplicity for unfairnesssimplicity for unfairness

Page 4: Flow Control An Engineering Approach to Computer Networking

Where?

Usually at transport layerUsually at transport layer Also, in some cases, in datalink layerAlso, in some cases, in datalink layer

Page 5: Flow Control An Engineering Approach to Computer Networking

Model

Source, sink, server, service rate, bottleneck, round trip time Source, sink, server, service rate, bottleneck, round trip time

Page 6: Flow Control An Engineering Approach to Computer Networking

Classification

Open loopOpen loop Source describes its desired flow rateSource describes its desired flow rate Network Network admits admits callcall Source sends at this rateSource sends at this rate

Closed loopClosed loop Source monitors available service rateSource monitors available service rate

Explicit or implicitExplicit or implicit Sends at this rateSends at this rate Due to speed of light delay, errors are bound to occurDue to speed of light delay, errors are bound to occur

HybridHybrid Source asks for some minimum rateSource asks for some minimum rate But can send more, if availableBut can send more, if available

Page 7: Flow Control An Engineering Approach to Computer Networking

Open loop flow control

Two phases to flowTwo phases to flow Call setupCall setup Data transmissionData transmission

Call setupCall setup Network prescribes parametersNetwork prescribes parameters User chooses parameter valuesUser chooses parameter values Network admits or denies callNetwork admits or denies call

Data transmissionData transmission User sends within parameter rangeUser sends within parameter range Network Network policespolices users users Scheduling policies give user QoSScheduling policies give user QoS

Page 8: Flow Control An Engineering Approach to Computer Networking

Hard problems

Choosing a descriptor at a source Choosing a scheduling discipline at intermediate network

elements Admitting calls so that their performance objectives are met (call

admission control).

Page 9: Flow Control An Engineering Approach to Computer Networking

Traffic descriptors

Usually an Usually an envelopeenvelope Constrains worst case behaviorConstrains worst case behavior

Three usesThree uses Basis for traffic contractBasis for traffic contract Input to Input to regulatorregulator Input to Input to policerpolicer

Page 10: Flow Control An Engineering Approach to Computer Networking

Descriptor requirements

RepresentativityRepresentativity adequately describes flow, so that network does not reserve adequately describes flow, so that network does not reserve

too little or too much resourcetoo little or too much resource VerifiabilityVerifiability

verify that descriptor holdsverify that descriptor holds PreservabilityPreservability

Doesn’t change inside the networkDoesn’t change inside the network UsabilityUsability

Easy to describe and use for admission controlEasy to describe and use for admission control

Page 11: Flow Control An Engineering Approach to Computer Networking

Examples

Representative, verifiable, but not useble Time series of interarrival times Time series of interarrival times

Verifiable, preservable, and useable, but not representativeVerifiable, preservable, and useable, but not representative peak ratepeak rate

Page 12: Flow Control An Engineering Approach to Computer Networking

Some common descriptors

Peak ratePeak rate Average rateAverage rate Linear bounded arrival processLinear bounded arrival process

Page 13: Flow Control An Engineering Approach to Computer Networking

Peak rate

Highest ‘rate’ at which a source can send dataHighest ‘rate’ at which a source can send data Two ways to compute itTwo ways to compute it For networks with fixed-size packetsFor networks with fixed-size packets

min inter-packet spacingmin inter-packet spacing For networks with variable-size packetsFor networks with variable-size packets

highest rate over highest rate over allall intervals of a particular duration intervals of a particular duration Regulator for fixed-size packetsRegulator for fixed-size packets

timer set on packet transmissiontimer set on packet transmission if timer expires, send packet, if anyif timer expires, send packet, if any

ProblemProblem sensitive to extremessensitive to extremes

Page 14: Flow Control An Engineering Approach to Computer Networking

Average rate

Rate over some time period (Rate over some time period (windowwindow)) Less susceptible to outliersLess susceptible to outliers Parameters: Parameters: tt and and aa Two types: jumping window and moving windowTwo types: jumping window and moving window Jumping windowJumping window

over consecutive intervals of length over consecutive intervals of length tt, only , only a a bits sentbits sent regulator reinitializes every intervalregulator reinitializes every interval

Moving windowMoving window over all intervals of length over all intervals of length t, t, only only aa bits sent bits sent regulator forgets packet sent more thanregulator forgets packet sent more than t t seconds ago seconds ago

Page 15: Flow Control An Engineering Approach to Computer Networking

Linear Bounded Arrival Process

Source bounds # bits sent in any time interval by a linear Source bounds # bits sent in any time interval by a linear function of timefunction of time

the number of bits transmitted in any active interval of length t is less than rt + s

r is the long term rate s is the burst limit insensitive to outliers Can be viewed as a “leaky bucket”

Page 16: Flow Control An Engineering Approach to Computer Networking

The Leaky Bucket Algorithm

(a)(a) A leaky bucket with water. A leaky bucket with water. (b)(b) a leaky bucket with packets. a leaky bucket with packets.

Page 17: Flow Control An Engineering Approach to Computer Networking

Leaky bucket

Token bucket fills up at rate Token bucket fills up at rate rr Largest # tokens < Largest # tokens < ss

Page 18: Flow Control An Engineering Approach to Computer Networking

More Leaky Bucket

Token and data bucketsToken and data buckets Sum is what matters [Berger and Whitt 92] Sum is what matters [Berger and Whitt 92]

Peak rate regulator and moving window average rate regulator Peak rate regulator and moving window average rate regulator can be constructed by token capacity to 1 and make 1 token can be constructed by token capacity to 1 and make 1 token arrive every 1/rate seconds.arrive every 1/rate seconds.

Burst length S, bucket capacity C, token arrival rate r, maximum Burst length S, bucket capacity C, token arrival rate r, maximum burst rate Mburst rate M MS = C+rSMS = C+rS

Combining leaky bucket with token bucket to limit burst sizeCombining leaky bucket with token bucket to limit burst size

Page 19: Flow Control An Engineering Approach to Computer Networking

The Leaky Bucket Algorithm

(a) Input to a leaky bucket. (b) Output from a leaky bucket. Output from a token bucket with capacities of (c) 250 KB, (d) 500 KB, (e) 750 KB, (f) Output from a 500KB token bucket feeding a 10-MB/sec leaky bucket.

Page 20: Flow Control An Engineering Approach to Computer Networking

Choosing LBAP parameters

Tradeoff between Tradeoff between r r and and ss Minimal descriptorMinimal descriptor

doesn’t simultaneously have smaller doesn’t simultaneously have smaller rr and and ss presumably costs lesspresumably costs less

How to choose minimal descriptor?How to choose minimal descriptor? Three way tradeoffThree way tradeoff

choice of choice of s s (data bucket size)(data bucket size) loss rateloss rate choice of choice of rr

Page 21: Flow Control An Engineering Approach to Computer Networking

Choosing minimal parameters

Keeping loss rate the sameKeeping loss rate the same if if s s is more, is more, r r is less (smoothing) is less (smoothing) for each for each rr we have least we have least ss

Choose knee of curve (P = peak rate, A = Average rate)Choose knee of curve (P = peak rate, A = Average rate)

Page 22: Flow Control An Engineering Approach to Computer Networking

LBAP

Popular in practice and in academiaPopular in practice and in academia sort of representativesort of representative verifiableverifiable sort of preservablesort of preservable sort of usablesort of usable

Problems with multiple time scale trafficProblems with multiple time scale traffic large burst messes up thingslarge burst messes up things

Page 23: Flow Control An Engineering Approach to Computer Networking

Open loop vs. closed loop

Open loopOpen loop describe trafficdescribe traffic network admits/reserves resourcesnetwork admits/reserves resources regulation/policingregulation/policing

Closed loopClosed loop can’t describe traffic or network doesn’t support reservationcan’t describe traffic or network doesn’t support reservation monitor available bandwidthmonitor available bandwidth

perhaps allocated using GPS-emulationperhaps allocated using GPS-emulation adapt to itadapt to it if not done properly eitherif not done properly either

too much losstoo much loss unnecessary delayunnecessary delay

Page 24: Flow Control An Engineering Approach to Computer Networking

Taxonomy

First generationFirst generation ignores network stateignores network state only match receiveronly match receiver

Second generationSecond generation responsive to stateresponsive to state three choicesthree choices

State measurementState measurement

• explicit or implicitexplicit or implicit ControlControl

• flow control window size or rateflow control window size or rate Point of controlPoint of control

• endpoint or within networkendpoint or within network

Page 25: Flow Control An Engineering Approach to Computer Networking

Explicit vs. Implicit

ExplicitExplicit Network tells source its current rateNetwork tells source its current rate Better controlBetter control More overheadMore overhead

ImplicitImplicit Endpoint figures out rate by looking at networkEndpoint figures out rate by looking at network Less overheadLess overhead

Ideally, want overhead of implicit with effectiveness of explicitIdeally, want overhead of implicit with effectiveness of explicit

Page 26: Flow Control An Engineering Approach to Computer Networking

Flow control window

Recall error control windowRecall error control window Largest number of packet outstanding (sent but not acked)Largest number of packet outstanding (sent but not acked) If endpoint has sent all packets in window, it must wait => slows If endpoint has sent all packets in window, it must wait => slows

down its ratedown its rate Thus, window provides Thus, window provides bothboth error control and flow control error control and flow control This is called This is called transmission transmission windowwindow Coupling can be a problemCoupling can be a problem

Few buffers are receiver => slow rate!Few buffers are receiver => slow rate!

Page 27: Flow Control An Engineering Approach to Computer Networking

Window vs. rate

In adaptive rate, we directly control rateIn adaptive rate, we directly control rate Needs a timer per connectionNeeds a timer per connection Plusses for windowPlusses for window

no need for fine-grained timerno need for fine-grained timer self-limitingself-limiting

Plusses for ratePlusses for rate better control (finer grain)better control (finer grain) no coupling of flow control and error controlno coupling of flow control and error control

Rate control must be careful to avoid overhead and sending too Rate control must be careful to avoid overhead and sending too muchmuch

Page 28: Flow Control An Engineering Approach to Computer Networking

Hop-by-hop vs. end-to-end

Hop-by-hopHop-by-hop first generation flow control at each linkfirst generation flow control at each link

next server = sinknext server = sink easy to implementeasy to implement

End-to-endEnd-to-end sender matches all the servers on its pathsender matches all the servers on its path

Plusses for hop-by-hop Plusses for hop-by-hop simplersimpler distributes overflowdistributes overflow better controlbetter control

Plusses for end-to-endPlusses for end-to-end cheapercheaper

Page 29: Flow Control An Engineering Approach to Computer Networking

On-off (481 stuff)

Receiver gives ON and OFF signals If ON, send at full speed If OFF, stop OK when RTT is small What if OFF is lost? Bursty Used in serial lines or LANs

Page 30: Flow Control An Engineering Approach to Computer Networking

Stop and Wait (481 stuff)

Send a packetSend a packet Wait for ack before sending next packetWait for ack before sending next packet

Page 31: Flow Control An Engineering Approach to Computer Networking

Static window (481 stuff)

Stop and wait can send at most one pkt per RTTStop and wait can send at most one pkt per RTT Here, we allow multiple packets per RTT (= transmission Here, we allow multiple packets per RTT (= transmission

window)window)

Page 32: Flow Control An Engineering Approach to Computer Networking

What should window size be? (481 stuff)

Let bottleneck service rate along path = b pkts/sec Let round trip time = R sec Let flow control window = w packet Sending rate is w packets in R seconds = w/R To use bottleneck w/R >= b w >= bR This is the bandwidth delay product or optimal window size

Page 33: Flow Control An Engineering Approach to Computer Networking

Static window

Works well if b and R are fixedWorks well if b and R are fixed But, bottleneck rate changes with time!But, bottleneck rate changes with time! Static choice of w can lead to problemsStatic choice of w can lead to problems

too smalltoo small too largetoo large

So, need to adapt windowSo, need to adapt window Always try to get to the Always try to get to the current current optimal valueoptimal value

Page 34: Flow Control An Engineering Approach to Computer Networking

DECbit flow control

IntuitionIntuition every packet has a bit in headerevery packet has a bit in header intermediate routers set bit if queue has built up => source intermediate routers set bit if queue has built up => source

window is too largewindow is too large sink copies bit to acksink copies bit to ack if bits set, source reduces window sizeif bits set, source reduces window size in steady state, oscillate around optimal sizein steady state, oscillate around optimal size

Page 35: Flow Control An Engineering Approach to Computer Networking

DECbit

When do bits get set?When do bits get set? How does a source interpret them?How does a source interpret them?

Page 36: Flow Control An Engineering Approach to Computer Networking

DECbit details: router actions

Measure Measure demanddemand and mean queue length of each source Computed over queue regeneration cycles Balance between sensitivity and stability

Page 37: Flow Control An Engineering Approach to Computer Networking

Router actions

If mean queue length > 1.0If mean queue length > 1.0 set bits on sources whose demand exceeds fair shareset bits on sources whose demand exceeds fair share

If it exceeds 2.0If it exceeds 2.0 set bits on everyoneset bits on everyone panic!panic!

Page 38: Flow Control An Engineering Approach to Computer Networking

Source actions

Keep track of bitsKeep track of bits Can’t take control actions too fast!Can’t take control actions too fast! Wait for past change to take effectWait for past change to take effect Measure bits over past + present window size (2RTT)Measure bits over past + present window size (2RTT) If more than 50% set, then decrease window, else increaseIf more than 50% set, then decrease window, else increase Additive increase, multiplicative decreaseAdditive increase, multiplicative decrease

Page 39: Flow Control An Engineering Approach to Computer Networking

Evaluation

Works with FIFOWorks with FIFO but requires per-connection state (demand)but requires per-connection state (demand)

SoftwareSoftware ButBut

assumes cooperation!assumes cooperation! conservative window increase policyconservative window increase policy

Page 40: Flow Control An Engineering Approach to Computer Networking

Sample trace

Page 41: Flow Control An Engineering Approach to Computer Networking

TCP Flow Control

ImplicitImplicit Dynamic windowDynamic window End-to-endEnd-to-end

Very similar to DECbit, butVery similar to DECbit, but no support from routersno support from routers increase if no loss (usually detected using timeout)increase if no loss (usually detected using timeout) window decrease on a timeoutwindow decrease on a timeout additive increase multiplicative decreaseadditive increase multiplicative decrease

Page 42: Flow Control An Engineering Approach to Computer Networking

TCP details

Window starts at 1Window starts at 1 Increases exponentially for a while, then linearlyIncreases exponentially for a while, then linearly Exponentially => doubles every RTTExponentially => doubles every RTT Linearly => increases by 1 every RTTLinearly => increases by 1 every RTT During exponential phase, every ack results in window increase During exponential phase, every ack results in window increase

by 1by 1 During linear phase, window increases by 1 when # acks = During linear phase, window increases by 1 when # acks =

window sizewindow size Exponential phase is calledExponential phase is called slow start slow start Linear phase is calledLinear phase is called congestion avoidance congestion avoidance

Page 43: Flow Control An Engineering Approach to Computer Networking

More TCP details

On a loss, current window size is stored in a variable called On a loss, current window size is stored in a variable called slow start thresholdslow start threshold or or ssthreshssthresh

Switch from exponential to linear (slow start to congestion Switch from exponential to linear (slow start to congestion avoidance) when window size reaches thresholdavoidance) when window size reaches threshold

Loss detected either with timeout or Loss detected either with timeout or fast retransmitfast retransmit (duplicate (duplicate cumulative acks)cumulative acks)

Two versions of TCPTwo versions of TCP Tahoe: in both cases, drop window to 1Tahoe: in both cases, drop window to 1 Reno: on timeout, drop window to 1, and on fast retransmit Reno: on timeout, drop window to 1, and on fast retransmit

drop window to half previous size (also, increase window on drop window to half previous size (also, increase window on subsequent acks)subsequent acks)

Page 44: Flow Control An Engineering Approach to Computer Networking

TCP vs. DECbit

Both use dynamic window flow control and additive-increase Both use dynamic window flow control and additive-increase multiplicative decreasemultiplicative decrease

TCP uses implicit measurement of congestionTCP uses implicit measurement of congestion probe a black boxprobe a black box

Operates at the Operates at the cliffcliff Source does not filter informationSource does not filter information

Page 45: Flow Control An Engineering Approach to Computer Networking

Evaluation

Effective over a wide range of bandwidthsEffective over a wide range of bandwidths A lot of operational experienceA lot of operational experience WeaknessesWeaknesses

loss => overload? (wireless)loss => overload? (wireless) overload => self-blame, problem with FCFSoverload => self-blame, problem with FCFS ovelroad detected only on a lossovelroad detected only on a loss

in steady state, source in steady state, source inducesinduces loss loss needs at least bR/3 buffers per connectionneeds at least bR/3 buffers per connection

Page 46: Flow Control An Engineering Approach to Computer Networking

Sample trace

Page 47: Flow Control An Engineering Approach to Computer Networking

TCP Vegas

Expected throughput = Expected throughput = transmission_window_size/propagation_delaytransmission_window_size/propagation_delay

Numerator: knownNumerator: known Denominator: measure Denominator: measure smallestsmallest RTT Also know actual throughput Difference = how much to reduce/increase rate Algorithm

send a special packetsend a special packet on ack, compute expected and actual throughputon ack, compute expected and actual throughput (expected - actual)* RTT packets in bottleneck buffer(expected - actual)* RTT packets in bottleneck buffer adjust sending rate if this is too largeadjust sending rate if this is too large

Works better than TCP RenoWorks better than TCP Reno

Page 48: Flow Control An Engineering Approach to Computer Networking

NETBLT

First rate-based flow control scheme Separates error control (window) and flow control (no coupling) So, losses and retransmissions do not affect the flow rate Application data sent as a series of buffers, each at a particular

rate Rate = (burst size + burst rate) so granularity of control = burst Initially, no adjustment of rates Later, if received rate < sending rate, multiplicatively decrease

rate Change rate only once per buffer => slow

Page 49: Flow Control An Engineering Approach to Computer Networking

Packet pair

Improves basic ideas in NETBLTImproves basic ideas in NETBLT better measurement of bottleneckbetter measurement of bottleneck control based on predictioncontrol based on prediction finer granularityfiner granularity

Assume all bottlenecks serve packets in round robin orderAssume all bottlenecks serve packets in round robin order Then, spacing between packets at receiver (= ack spacing) = Then, spacing between packets at receiver (= ack spacing) =

1/(rate of slowest server)1/(rate of slowest server) If If allall data sent as paired packets, no distinction between data data sent as paired packets, no distinction between data

and probesand probes Implicitly determine service rates if servers are round-robin-likeImplicitly determine service rates if servers are round-robin-like

Page 50: Flow Control An Engineering Approach to Computer Networking

Packet pair

Page 51: Flow Control An Engineering Approach to Computer Networking

Packet-pair details

Acks give time series of service rates in the pastAcks give time series of service rates in the past We can use this to predict the next rateWe can use this to predict the next rate Exponential averager, with fuzzy rules to change the averaging Exponential averager, with fuzzy rules to change the averaging

factorfactor Predicted rate feeds into flow control equationPredicted rate feeds into flow control equation

Page 52: Flow Control An Engineering Approach to Computer Networking

Packet-pair flow control

Let X = # packets in bottleneck buffer S = # outstanding packets R = RTT b = bottleneck rate Then, X = S - Rb (assuming no losses) Let l = source rate l(k+1) = b(k+1) + (setpoint -X)/R

Page 53: Flow Control An Engineering Approach to Computer Networking

Sample trace

Page 54: Flow Control An Engineering Approach to Computer Networking

ATM Forum EERC

Similar to DECbit, but send a whole cell’s worth of info instead Similar to DECbit, but send a whole cell’s worth of info instead of one bitof one bit

Sources periodically send a Resource Management (RM) cell Sources periodically send a Resource Management (RM) cell with a with a rate requestrate request typically once every 32 cellstypically once every 32 cells

Each server fills in RM cell with current share, if lessEach server fills in RM cell with current share, if less Source sends at this rateSource sends at this rate

Page 55: Flow Control An Engineering Approach to Computer Networking

ATM Forum EERC details

Source sends Explicit Rate (ER) in RM cellSource sends Explicit Rate (ER) in RM cell Switches compute source share in an unspecified manner Switches compute source share in an unspecified manner

(allows competition)(allows competition) Current rate = allowed cell rate = ACRCurrent rate = allowed cell rate = ACR If ER > ACR then ACR = ACR + RIF * PCR else ACR = ERIf ER > ACR then ACR = ACR + RIF * PCR else ACR = ER If switch does not change ER, then use DECbit ideaIf switch does not change ER, then use DECbit idea

If CI bit set, ACR = ACR (1 - RDF)If CI bit set, ACR = ACR (1 - RDF) If ER < AR, AR = ERIf ER < AR, AR = ER Allows interoperability of a sortAllows interoperability of a sort If idle 500 ms, reset rate to Initial cell rateIf idle 500 ms, reset rate to Initial cell rate If no RM cells return for a while, ACR *= (1-RDF)If no RM cells return for a while, ACR *= (1-RDF)

Page 56: Flow Control An Engineering Approach to Computer Networking

Comparison with DECbit

Sources know exact rateSources know exact rate Non-zero Initial cell-rate => conservative increase can be Non-zero Initial cell-rate => conservative increase can be

avoidedavoided Interoperation between ER/CI switchesInteroperation between ER/CI switches

Page 57: Flow Control An Engineering Approach to Computer Networking

Problems

RM cells in data path a mess Updating sending rate based on RM cell can be hard Interoperability comes at the cost of reduced efficiency (as bad

as DECbit) Computing ER is hard

Page 58: Flow Control An Engineering Approach to Computer Networking

Comparison among closed-loop schemes

On-off, stop-and-wait, static window, DECbit, TCP, NETBLT, On-off, stop-and-wait, static window, DECbit, TCP, NETBLT, Packet-pair, ATM Forum EERCPacket-pair, ATM Forum EERC

Which is best? No simple answerWhich is best? No simple answer Some rules of thumbSome rules of thumb

flow control easier with RR schedulingflow control easier with RR scheduling otherwise, assume cooperation, or police ratesotherwise, assume cooperation, or police rates

explicit schemes are more robustexplicit schemes are more robust hop-by-hop schemes are more resposive, but more compleshop-by-hop schemes are more resposive, but more comples try to separate error control and flow controltry to separate error control and flow control rate based schemes are inherently unstable unless well-rate based schemes are inherently unstable unless well-

engineeredengineered

Page 59: Flow Control An Engineering Approach to Computer Networking

Hybrid flow control

Source gets a minimum rate, but can use moreSource gets a minimum rate, but can use more All problems of both open loop and closed loop flow controlAll problems of both open loop and closed loop flow control Resource partitioning problemResource partitioning problem

what fraction can be reserved?what fraction can be reserved? how?how?