The Impact of False Sharing on Shared Congestion Management
Srinivasa Aditya Akella
Joint work with Srini Seshan and Hari Balakrishnan
28 Feb, 2001
Introduction
Predominant model for congestion control
Slow-start AIMD
Not always optimal
Multiple concurrent flows from Src to Dest may share a bottleneck
Compete for resources rather than co-operate
Especially visible in the context of Web transfers
Sharing Congestion Information...
Solution - share congestion information
Granularity of sharing
Common destination host (network interface)
All destination hosts on the same IP subnet
Set of flows sharing congestion info - macroflow
What are the drawbacks of sharing at a granularity larger than a single flow?
False Sharing
Flows sharing congestion state might not share the same bottleneck
Sender has no knowledge
False sharing in the Internet
Flows are treated differently - Service Differentiation
Flows take different paths - Path Diversity
False Sharing
Service Differentiation
Integrated Services
Differentiated Services (DiffServ)
Path Diversity
Network Load Balancers
Network Address Translators (NATs)
The sender observes different bottleneck bandwidths, RTTs and loss rates for flows sharing congestion info
Questions...
Impact on performance and correctness
Compromise to end-to-end congestion control?
Degradation in performance of individual flows?
Detection
Under what conditions can false-sharing be detected?
Response
How should congestion sharing systems be modified?
What effect do these modifications have?
What should be the default behavior?
Quantifying the Penalty XXX needs to be fixed
Analysis
False sharing reduces observed flow throughput
$\lambda_{share} = \lambda_1 \lambda_2 / (\lambda_1 + \lambda_2)$
False sharing increases observed flow loss rate
$\rho_{noshare} = \sqrt{\rho_1 \rho_2}$
$\rho_{share} = (\rho_1 + \rho_2) / 2$
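The two formulas on this slide can be evaluated directly. The sketch below assumes that the subscripted quantities are the throughputs and loss rates the two flows would see on their own; the slide does not spell this out. The function names and the sample numbers are hypothetical and serve only to show the direction of the effect.

```python
import math

def shared_throughput(lam1, lam2):
    """Throughput under sharing, as written on the slide: lam1*lam2 / (lam1+lam2)."""
    return lam1 * lam2 / (lam1 + lam2)

def loss_rates(rho1, rho2):
    """Return (rho_noshare, rho_share) as written on the slide."""
    rho_noshare = math.sqrt(rho1 * rho2)   # geometric mean
    rho_share = (rho1 + rho2) / 2          # arithmetic mean
    return rho_noshare, rho_share

# Hypothetical per-flow values, chosen only for illustration.
lam_share = shared_throughput(10.0, 2.0)
rho_noshare, rho_share = loss_rates(0.01, 0.04)

print(lam_share)               # smaller than either individual throughput
print(rho_noshare, rho_share)  # arithmetic mean is never below the geometric mean
```

For any two unequal positive loss rates the arithmetic mean exceeds the geometric mean, which matches the slide's claim that sharing increases the observed loss rate.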
Service Differentiation
Network treats different flows differently
Bandwidth allocation and buffer resources
IETF DiffServ architecture
Three PHBs: Assured Forwarding, Expedited Forwarding, Best Effort
Nortel's implementation of DiffServ
Experiments with two traffic classes: AF and BE
WRR for bandwidth sharing
RIO (for AF) and RED (for BE) for buffer management
Styles of buffer management
Shared and unshared
Topology for DiffServ
Results...
Predicted throughput = XXX need to fill
The faster connection is slowed down by the slower one
Slower connection is never persistently overloaded
Loss rate for the slower connection does not increase appreciably with sharing
Path Diversity
Two flows taking different routes may not share a bottleneck
Two scenarios where path diversity leads to false sharing
Dispersity Routing
NATs
Three distinct categories
Unshared bottleneck: no shared bottleneck link
Semi-shared bottleneck: one of the unshared paths has a bottleneck
Fully shared bottleneck: no bottlenecks in the unshared portions; RTTs would be different
Topology for Unshared Bottleneck
Results for Unshared-Bottleneck
Bandwidth is close to the prediction
Loss rates followed a pattern similar to the DiffServ case
Delays and Losses...
Delays vary independently of each other
Losses are uncorrelated
Variations in delays and losses within one flow are more correlated than those across flows
Path Diversity, Other Cases
Fully Shared Bottleneck - How is it Different?
Variations in delay seem correlated
The two flows share a common point of congestion
The flows should not share congestion information
Detection
Test description
Rubenstein's Delay and Loss Correlation tests
Need modifications to be a part of the architecture
Flows might undergo false-sharing if even one of their bottlenecks is unshared
Two differentially served flows might observe statistically dependent delays
Scheduler at the sender might apportion bandwidths non-uniformly
Congestion control schemes depend on RTTs
Aggregating flows with different RTTs would lead to false sharing
Loss-correlation Test
Idea -- Losses are likely to come in bursts
This should hold across flows from the same source when a bottleneck is shared
Rubenstein's tests compare the auto and cross correlation metrics for pairs of flows
Do not detect unshared bottlenecks
Need a test to detect whether all bottlenecks are shared
New test - Symmetric Loss Correlation
Defining the loss and cross correlation metrics in a manner independent of the flows solves the problem
However, packets across flows are assumed to be spaced closer than those within a flow -- Not always true
A fix -- Schedule transmissions appropriately
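The comparison of auto and cross correlation for losses can be illustrated with a rough sketch. This is not Rubenstein's exact metric nor the symmetric variant from the talk; it is a simplified reading that assumes a merged trace of (flow id, lost) records in send order, and all names are hypothetical. It estimates how often a lost packet is followed by another loss in the next packet of the other flow (cross) and in the next packet of the same flow (auto).

```python
def conditional_loss_rates(trace):
    """trace: list of (flow_id, lost) tuples in send order, flows interleaved.

    Returns (cross, auto):
      cross = P(packet lost | previous packet in the trace was lost
                              and belonged to the other flow)
      auto  = P(packet lost | previous packet of the same flow was lost)
    """
    last_same = {}      # flow_id -> lost flag of that flow's previous packet
    prev = None         # previous packet in the merged trace
    cross_hits = cross_n = auto_hits = auto_n = 0
    for flow, lost in trace:
        if prev is not None and prev[0] != flow and prev[1]:
            cross_n += 1
            cross_hits += int(lost)
        if last_same.get(flow, False):
            auto_n += 1
            auto_hits += int(lost)
        last_same[flow] = lost
        prev = (flow, lost)
    cross = cross_hits / cross_n if cross_n else 0.0
    auto = auto_hits / auto_n if auto_n else 0.0
    return cross, auto

# Hypothetical trace: flows A and B strictly alternate, losses come in short bursts.
pattern = [False, False, True, True, False, False,
           False, False, True, True, False, False]
trace = [("A" if i % 2 == 0 else "B", lost) for i, lost in enumerate(pattern)]
cross, auto = conditional_loss_rates(trace)
print(cross, auto)
```

In this simplified reading, a cross value above the auto value points towards a shared bottleneck. The example also shows why the spacing assumption matters: the burst is caught across flows only because packets of A and B are sent back to back.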
Delay-correlation Test
Delay = f(propagation time, queueing delay)
Queueing delay (Q) can vary significantly with time
Current Q is strongly related to recent values
Challenges with measuring delay
Clocks cannot be easily synchronized
Use change in delay or the relative delay
Methodology of the tests
Use timestamps to compute delays
Compute correlations
Correlation is independent of constant differences
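The last point, that correlation is unaffected by a constant clock offset, can be checked with a small sketch. It assumes per-packet send and receive timestamps are available and uses the plain Pearson correlation coefficient, which may differ from the exact statistic used in the tests. The timestamps and the 1000 ms offset are made up.

```python
import math

def one_way_delays(send_ts, recv_ts):
    """Delay samples from timestamps; clocks need not be synchronized."""
    return [r - s for s, r in zip(send_ts, recv_ts)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sample lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Hypothetical timestamps in milliseconds for two flows.
send = [0, 100, 200, 300, 400]
recv_a = [10, 112, 215, 311, 409]
recv_b = [20, 123, 227, 322, 418]
OFFSET = 1000  # pretend the receiver clock for flow B is 1000 ms ahead

delays_a = one_way_delays(send, recv_a)
delays_b = one_way_delays(send, recv_b)
delays_b_skewed = one_way_delays(send, [t + OFFSET for t in recv_b])

r_true = pearson(delays_a, delays_b)
r_skewed = pearson(delays_a, delays_b_skewed)
print(r_true, r_skewed)  # the two values agree
```

Because the mean is subtracted before the products are formed, any constant added to every sample of one flow cancels out, so relative delays are enough.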
Out-of-Order Test
Flows might have fundamentally different delays
DelayCorr does not identify this
Loss and Delay tests might help detect false-sharing
MultiPath Routing where bottleneck is shared
Out-of-Order test handles this well
Look at packet reordering from a source
Reordering by more than 3 packets => No sharing
Limitation: Packets must be delivered to the same physical destination
Cannot be applied to situations like NAT
Rely on RTTs in such situations
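A minimal sketch of the out-of-order check follows. The slide gives only the threshold of 3 packets; how reordering is measured here (the number of later-sent packets that arrive before a given packet) is an assumption, and the function names are hypothetical.

```python
REORDER_THRESHOLD = 3  # threshold taken from the slide

def max_reordering(arrival_order):
    """arrival_order: send sequence numbers listed in the order received.

    For each packet, count how many packets sent after it arrived before it,
    and return the largest such count.
    """
    worst = 0
    for i, seq in enumerate(arrival_order):
        overtaken_by = sum(1 for earlier in arrival_order[:i] if earlier > seq)
        worst = max(worst, overtaken_by)
    return worst

def no_sharing(arrival_order):
    """True when reordering exceeds the threshold, i.e. stop sharing."""
    return max_reordering(arrival_order) > REORDER_THRESHOLD

mild = [0, 1, 3, 2, 4, 5]        # one adjacent swap
severe = [1, 2, 3, 4, 5, 0, 6]   # packet 0 overtaken by five later packets

print(max_reordering(mild), no_sharing(mild))
print(max_reordering(severe), no_sharing(severe))
```

The quadratic scan keeps the sketch short; a real implementation would bound the window of packets it inspects.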
Genuine Sharing is Harder to Detect
Evaluation of the Tests
Two metrics for each test
Detection time
Probability of correct decision
Which test is the best?
Out-of-order tests are mostly accurate
Loss tests are neither timely nor accurate
Delay tests are timely but not as accurate
Symmetric Loss test outputs the correct result much more often than the asymmetric test
Response to False Sharing
Design Issues
Default behavior: share information and detect false-sharing
Scheduling
False sharing detected more easily than genuine sharing
Default of no-sharing makes no sense with out-of-order tests
Upon detection, stop sharing
In CM, associate the different flows with different macroflows
Relatively small confidence intervals can be used
No significant penalty due to an incorrect decision
Performance
How good can restoration possibly be?
False sharing may penalize flows significantly
It might take time to restore performance
However, the greater the penalty, the easier it is to detect
Approach to performance evaluation -- multiple, de-randomized, offline runs
Performance restored in less than a factor of 3 of the time taken to detect