![Page 1: for the Datacenter TIMELY: RTT-based Congestion Controlnetseminar.stanford.edu/seminars/12_03_15.pdf · Overview RTT Measurement Engine Timestamps RTT Rate Computation Engine Pacing](https://reader030.vdocuments.site/reader030/viewer/2022040110/5f286667fc30695d4768772f/html5/thumbnails/1.jpg)
TIMELY: RTT-based Congestion Control for the Datacenter
Radhika Mittal*(UC Berkeley), Vinh The Lam, Nandita Dukkipati, Emily Blem, Hassan Wassel, Monia Ghobadi*(Microsoft),
Amin Vahdat, Yaogong Wang, David Wetherall, David Zats
* Work done while at Google
The Story of RTT
Once upon a time, there was a congestion signal called RTT.
It had all the qualities needed to rule congestion control in datacenters.
Qualities of RTT
• Fine-grained and informative
• Quick response time
• No switch support needed
• End-to-end metric
Applicability in Datacenters
● RTT-based schemes were discarded for WANs
  ○ They compete poorly with loss-based schemes
● This is not a concern for datacenters.
Stringent Performance Requirements
• Tightly coupled computing tasks
• Require both high throughput and low latencies
• Packet losses are too costly
However, it was too hard to measure the RTTs accurately.
While RTT was banished from WANs,
it was never even considered for Datacenters!!
And ECN emerged victorious instead.
ECN
● DCTCP (2010)
● D2TCP (2012)
● HULL (2012)
● TCP-Bolt (2014)
● DCQCN (2015)
This is the tale of how we helped RTT become a powerful congestion signal in modern datacenters.
Contributions
1. Show that accurate RTT measurements are possible.
2. Demonstrate the effectiveness of RTT as a congestion signal.
3. Develop TIMELY, an RTT-based congestion control scheme for datacenters.
Accurate RTT Measurement
Hardware Assisted RTT Measurement
● HW timestamps
  ○ mitigate noise in measurement
● HW acknowledgements
  ○ low processing overhead
Hardware vs Software Timestamps
Kernel Timestamps introduce significant noise in RTT measurements compared to HW Timestamps.
RTT as a congestion signal
RTT is a multi-bit signal
[Figure: the same sequence of congestion levels as seen by each signal]
RTT: 0 1 2 3 4 5
ECN: 0 0 0 1 1 1
RTT correlates with Queuing Delay
Limitations of RTT
RTT Limitations
● RTT is a lumped signal
  ○ Confuses reverse and forward path congestion
● Changing network paths can cause disparate delays
ACK Prioritization
TIMELY Framework
Overview
[Figure: TIMELY block diagram]
Timestamps → RTT Measurement Engine → RTT → Rate Computation Engine → Rate → Pacing Engine
Data → Pacing Engine → Paced Data
RTT Measurement Engine
[Figure: timeline between SENDER and RECEIVER from tsend to tcompletion, marking Serialization Delay, Propagation & Queuing Delay, and ACK Turnaround Time; the final builds replace the ACK turnaround with a HW ACK]
RTT = Propagation + Queuing Delay
RTT = tcompletion – tsend – Serialization Delay
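The RTT formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the microsecond units, and the way serialization delay is derived from segment size and line rate are assumptions, not details taken from the slides.

```python
def rtt_sample_us(t_send_us, t_completion_us, segment_bytes, line_rate_gbps):
    """RTT = t_completion - t_send - serialization delay (microseconds).

    Hypothetical helper: the slides give the formula, not this signature.
    """
    # Serialization delay: time to put the whole segment on the wire.
    # bits / (Gbit/s * 1000) yields microseconds.
    serialization_us = segment_bytes * 8 / (line_rate_gbps * 1000.0)
    return t_completion_us - t_send_us - serialization_us


# A 64,000-byte segment on a 10 Gbps link takes 51.2 us to serialize,
# so a 100 us send-to-completion interval leaves about 48.8 us of
# propagation and queuing delay.
sample = rtt_sample_us(0.0, 100.0, 64_000, 10.0)
```

Subtracting the serialization delay matters because large segments would otherwise inflate the measured RTT even on an idle path.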
Rate Computation Engine
● On each segment completion event
  ○ Input: RTT sample
  ○ Runs the rate update algorithm
  ○ Output: Updated rate
● Why do we compute the rate as opposed to a window?
  ○ Segment sizes as high as 64KB
  ○ (32us RTT x 10Gbps) = 40KB window size
  ○ 40KB < 64KB: a window makes no sense
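The window arithmetic above is a bandwidth-delay product. A minimal sketch, assuming decimal units (1 KB = 1,000 bytes, as the slide's numbers imply) and a hypothetical helper name:

```python
def bdp_bytes(rtt_us, line_rate_gbps):
    """Bandwidth-delay product in bytes (hypothetical helper)."""
    # Gbit/s * us = 1e9 * 1e-6 = 1000 bits; divide by 8 for bytes.
    return rtt_us * line_rate_gbps * 1000 / 8


SEGMENT_BYTES = 64_000           # largest segment size quoted on the slide
window = bdp_bytes(32, 10)       # 32 us RTT at 10 Gbps -> 40,000 bytes

# The whole window is smaller than a single segment, so a window-based
# controller could not express anything finer than "one segment or none".
window_fits_a_segment = window >= SEGMENT_BYTES
```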
Pacing Engine
● Computes the send time of a segment using
  ○ segment size
  ○ computed flow rate
  ○ time of last transmission
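One plausible way to combine these three inputs is to space segments by the time the previous segment occupies at the target rate. The slide lists only the inputs, so the formula and names below are assumptions made for illustration.

```python
def next_send_time_us(last_tx_us, segment_bytes, rate_gbps):
    """Earliest send time for the next segment (hypothetical sketch).

    Assumes the pacing gap is segment size divided by the flow rate.
    """
    # bits / (Gbit/s * 1000) yields microseconds.
    pacing_delay_us = segment_bytes * 8 / (rate_gbps * 1000.0)
    return last_tx_us + pacing_delay_us


# A 64,000-byte segment paced at 5 Gbps needs a 102.4 us gap.
t_next = next_send_time_us(10.0, 64_000, 5.0)
```

Halving the rate doubles the gap, which is how a rate computed from RTT samples turns into spacing on the wire.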
TIMELY Algorithm
Goals
● Flow Completion Time
  ○ Large Flows: High Throughput
  ○ Short Flows: Low tail latencies
● Ride the throughput-latency curve
  ○ until tail latencies become unacceptable
  ○ low latency prioritized over throughput
● Fairness and Stability
Challenges
● Bursty traffic
● Coarse-grained feedback
● Existing delay-based schemes cannot be used
Algorithm Overview
[Figure: RTT axis divided into three regions by the thresholds Tlow and Thigh]
● RTT < Tlow: Additive Increase
  ○ To avoid over-reaction to transient spikes
● Tlow ≤ RTT ≤ Thigh: Gradient-based Increase / Decrease
  ○ To navigate the throughput-latency tradeoff and ensure stability
● RTT > Thigh: Multiplicative Decrease
  ○ To keep tail latency within acceptable limits
Evaluation
Implementation Set-up
● TIMELY is implemented in the context of RDMA.
  ○ RDMA write and read primitives are used to invoke NIC services.
● RDMA transport in the NIC is sensitive to packet drops.
  ○ Priority Flow Control is enabled in the network fabric.
  ○ PFC sends out pause frames to ensure a lossless network.
Experiments Set-up
● Small-scale experiments:
  ○ Incast traffic pattern with 10 clients and a server sharing the same rack.
● Large-scale experiments:
  ○ A few hundred machines in a classic Clos network
● Impact of RTT noise
● Comparison with PFC
● Comparison with DCTCP
● More results in the paper...
Impact of RTT Noise
Throughput degrades with increasing noise in RTT. Precise RTT measurement is crucial.
Comparison with PFC - Small Scale

| | TIMELY | PFC |
|---|---|---|
| Throughput (Gbps) | 19.4 | 19.5 |
| Avg RTT (us) | 61 | 658 |
| 99%ile RTT (us) | 116 | 1036 |
Comparison with DCTCP

| | TIMELY | DCTCP |
|---|---|---|
| Throughput (Gbps) | 19.4 | 19.5 |
| Avg RTT (us) | 61 | 598 |
| 99%ile RTT (us) | 116 | 1490 |
Conclusion
● Conventional wisdom considers delay to be a noisy signal for DC
  ○ Experience with TIMELY shows the opposite.
● TIMELY detects and reacts to 10s of us of delay.
● Open Question: Effectiveness of RTT as DC speeds scale up by an order of magnitude and buffer sizes shrink.
Back-up
Hurdles for RTT
● Coexistence is a concern for WANs
  ○ RTT-based schemes compete poorly with loss-based schemes
● Coexistence is not a concern for datacenters

| | Wide Area Networks | Datacenters |
|---|---|---|
| Coexistence | Competes poorly with loss-based schemes | N/A |
| Measurement | Inaccuracies due to varying paths | Too hard to measure at microsecond granularity |
Excluding Reverse Path Congestion
RTT is a holistic signal

| RTT | ECN |
|---|---|
| Total queuing across multiple hops | Queuing at a single hop |
| Includes queuing at the endhost | Excludes queuing at the endhost |
| Total delay seen by low priority packets | Only the occupancy of low priority queue |
Pacing Engine
● Sends segments to the NIC for transmission
● Uses a single scheduler
● Segments with elapsed send times are serviced using round robin
● Segments with future send times are queued up
Rate-Based Protocol
● Why not set a window of outstanding bytes?
  ○ Segment sizes as high as 64KB
  ○ (16us RTT x 10Gbps) = 20KB window size
  ○ 20KB < 64KB: a window makes no sense
Reaction to Delay Gradient
● Why not absolute delay?
  ○ Bursty traffic and small propagation delays
  ○ Absolute delay is not a stable signal
● Why gradient?
  ○ allows prompt detection of rising and falling queue
  ○ is a direct proxy for rate-mismatch
TIMELY Algorithm: Basic
Step 1: Compute smoothed delay gradient
  Difference between previous and new RTT, passed through an EWMA filter
  Result divided by a fixed minRTT value
[Figure: actual RTTs vs. RTTs after filtering]
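Step 1 can be sketched as a small stateful estimator. The class name, the EWMA weight `alpha`, and the choice to return a zero gradient for the first sample are assumptions made for illustration; the slide specifies only the difference, the EWMA filter, and the division by a fixed minRTT.

```python
class GradientEstimator:
    """Smoothed, normalized delay gradient (hypothetical sketch)."""

    def __init__(self, min_rtt_us, alpha=0.5):
        self.min_rtt_us = min_rtt_us  # fixed minRTT used for normalization
        self.alpha = alpha            # EWMA weight (assumed value)
        self.prev_rtt_us = None
        self.rtt_diff_us = 0.0        # filtered RTT difference

    def update(self, new_rtt_us):
        # First sample: no previous RTT, so report a flat gradient.
        if self.prev_rtt_us is None:
            self.prev_rtt_us = new_rtt_us
            return 0.0
        new_diff = new_rtt_us - self.prev_rtt_us
        self.prev_rtt_us = new_rtt_us
        # EWMA filter over successive RTT differences.
        self.rtt_diff_us = ((1 - self.alpha) * self.rtt_diff_us
                            + self.alpha * new_diff)
        # Normalize by minRTT to get a dimensionless gradient.
        return self.rtt_diff_us / self.min_rtt_us


est = GradientEstimator(min_rtt_us=20.0, alpha=0.5)
gradients = [est.update(rtt) for rtt in (50.0, 60.0, 60.0)]
```

A positive value indicates a building queue and a negative value a draining one; the filter keeps a single noisy sample from flipping the sign.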
TIMELY Algorithm: Basic
Step 1: Compute smoothed delay gradient
  Difference between previous and new RTT, passed through an EWMA filter
  Result divided by a fixed minRTT value
Step 2: Compute new rate
  If (gradient ≤ 0): rate = rate + δ
  Else: rate = rate · (1 – β · gradient)
AIMD ensures fairness across flows.
TIMELY Algorithm: For Transient Spikes
Step 1: Compute smoothed delay gradient
  Difference between previous and new RTT, passed through an EWMA filter
  Result divided by a fixed minRTT value
Step 2: Compute new rate
  If (rtt < Tlow): rate = rate + δ; return
  If (gradient ≤ 0): rate = rate + δ
  Else: rate = rate · (1 – β · gradient)
The Tlow check avoids reaction to transient spikes.
TIMELY Algorithm: Adding a Safety Net
Step 1: Compute smoothed delay gradient
  Difference between previous and new RTT, passed through an EWMA filter
  Result divided by a fixed minRTT value
Step 2: Compute new rate
  If (rtt < Tlow): rate = rate + δ; return
  If (rtt > Thigh): rate = rate · (1 – β · (1 – Thigh / rtt)); return
  If (gradient ≤ 0): rate = rate + δ
  Else: rate = rate · (1 – β · gradient)
The Thigh check is the safety net.
TIMELY Algorithm: Hyperactive Increment
Step 1: Compute smoothed delay gradient
  Difference between previous and new RTT, passed through an EWMA filter
  Result divided by a fixed minRTT value
Step 2: Compute new rate
  If (rtt < Tlow): rate = rate + δ; return
  If (rtt > Thigh): rate = rate · (1 – β · (1 – Thigh / rtt)); return
  If (gradient ≤ 0): rate = rate + N · δ
  Else: rate = rate · (1 – β · gradient)
N = 5 if gradient < 0 for 5 consecutive events; else N = 1
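The complete Step 2 rule can be sketched as one function. The function name, the parameter values, and the handling of the consecutive-negative counter (reset whenever a threshold branch fires or the gradient is not negative) are assumptions; the slides do not say how the counter is maintained.

```python
def timely_update(rate, rtt, gradient, neg_streak, *,
                  t_low, t_high, delta, beta):
    """One rate update; returns (new_rate, new_neg_streak).

    Hypothetical sketch of Step 2, with hyperactive increment (N = 5).
    """
    if rtt < t_low:
        # Below Tlow: additive increase, ignoring the gradient.
        return rate + delta, 0
    if rtt > t_high:
        # Above Thigh: safety-net multiplicative decrease.
        return rate * (1 - beta * (1 - t_high / rtt)), 0
    if gradient <= 0:
        # Count consecutive strictly negative gradients (assumed bookkeeping).
        neg_streak = neg_streak + 1 if gradient < 0 else 0
        n = 5 if neg_streak >= 5 else 1
        return rate + n * delta, neg_streak
    # Rising delay: decrease in proportion to the gradient.
    return rate * (1 - beta * gradient), 0


# Illustrative parameters only (rate in Mbps, times in us); not from the slides.
PARAMS = dict(t_low=50.0, t_high=500.0, delta=10.0, beta=0.8)

low_rtt = timely_update(1000.0, 30.0, 0.5, 0, **PARAMS)     # additive increase
fifth_neg = timely_update(1000.0, 100.0, -0.1, 4, **PARAMS)  # N = 5 kicks in
```

Paired with a gradient estimate from Step 1, this function would be called once per segment completion event, carrying `neg_streak` from one call to the next.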
Gradient vs Absolute Delay
High target RTT achieves high throughput but at the cost of high latencies
Low target RTT achieves low latencies but at the cost of low throughput
Delay Gradient achieves high throughput with low latencies
Comparison with PFC - Large Scale
Reduction in median RPC latencies increases with increasing load.
Reduction in tail RPC latencies decreases with increasing load.