performance & troubleshooting @ esnet
Post on 11-Nov-2021
Performance & Troubleshooting @ ESnet
Mary Hester
ESnet Science Engagement
Lawrence Berkeley National Laboratory
Main Points
• Troubleshooting tools
• Troubleshooting methodology
• Case studies
5/16/16 2
Public perfSONAR Servers (May 2016)
• ESnet: 50
  – Mostly 10G, includes a 40G host in Boston
  – About 50% are now a ‘combined’ throughput/latency host
• GEANT: 22
• Internet2: 3
• Around 1600 publicly registered servers
May 16, 2016 © 2016, http://www.perfsonar.net 4
Default perfSONAR Throughput Tool: iperf3
• iperf3 – New implementation of iperf from scratch
• More at: http://fasterdata.es.net/performance-testing/network-troubleshooting-tools/iperf-and-iperf3/
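A minimal sketch of driving iperf3 from a script, assuming iperf3 is installed and that its JSON report (the -J flag) carries end-of-test totals under end.sum_received and end.sum_sent, as it does for TCP tests. The helper names and the server name in the usage note are hypothetical, not part of the original slides.

```python
import json
import subprocess


def iperf3_command(server, seconds=20):
    """Build an iperf3 client command line.

    -c: run as a client against `server`
    -t: test duration in seconds
    -J: emit the report as JSON
    """
    return ["iperf3", "-c", server, "-t", str(seconds), "-J"]


def summarize(report):
    """Pull the headline numbers out of a parsed iperf3 JSON report."""
    end = report["end"]
    return {
        # Receiver-side rate is what actually arrived at the far end.
        "gbps": end["sum_received"]["bits_per_second"] / 1e9,
        # Retransmits are reported on the sender side (TCP only).
        "retransmits": end["sum_sent"].get("retransmits"),
    }


def run_test(server, seconds=20):
    """Run one test and return its summary (needs iperf3 on PATH)."""
    result = subprocess.run(
        iperf3_command(server, seconds),
        capture_output=True,
        text=True,
        check=True,
    )
    return summarize(json.loads(result.stdout))
```

Calling run_test("ps-host.example.org") against a reachable iperf3 server would return a small dictionary with the received rate in Gbit/s and the sender's retransmit count; a nonzero retransmit count on a long path is worth a closer look.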
6/2/15 5
A small amount of packet loss makes a huge difference in TCP performance
[Figure: TCP performance across path lengths: Local (LAN), Metro Area, Regional, Continental, International. Series: Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), Measured (no loss)]
With loss, high performance beyond metro distances is essentially impossible
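The shape of a theoretical TCP Reno curve can be sketched with the well-known Mathis et al. approximation, rate ≈ (MSS / RTT) × (C / √p). This is an assumption about the kind of model behind the "Theoretical (TCP Reno)" series, not something the slide states, and the MSS, loss rate, and RTT values below are hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3 / 2)):
    """Approximate upper bound on TCP Reno throughput in bits per second.

    mss_bytes: maximum segment size in bytes
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability (0 < loss_rate <= 1)
    c:         model constant, about 1.22 for periodic loss
    """
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Hypothetical inputs: 1460-byte MSS, 0.01% loss, increasing RTT.
    for label, rtt_ms in [("LAN", 1), ("Metro", 5), ("Regional", 10),
                          ("Continental", 50), ("International", 100)]:
        bps = mathis_throughput_bps(1460, rtt_ms / 1000, 1e-4)
        print(f"{label:14s} {rtt_ms:4d} ms  {bps / 1e6:10.1f} Mbit/s")
```

With the loss rate held fixed, the bound falls in proportion to 1/RTT, which is why the same small loss rate that goes unnoticed on a LAN becomes crippling on continental and international paths.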
Eli’s Testing Methodology
1. Segment-to-segment testing is unlikely to be helpful
2. Run long-distance tests
3. Testers need to be already deployed when you start troubleshooting
Wide Area Testing – Problem Statement
[Diagram: a University Campus and a National Laboratory connected across the WAN over 10GE and Nx10GE links, each with a Border perfSONAR host and a Science DMZ perfSONAR host; the end-to-end path is marked Poor Performance]
Eli’s Methodology – WAN Troubleshooting
[Diagram: the campus-to-lab path shown in detail, with perfSONAR hosts along 10GE, Nx10GE, and 100GE links. Path segments: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec); the end-to-end path is marked Poor Performance]
Wide Area Testing – Long Clean Test
[Diagram: the same topology, with a 48 msec long-distance test between perfSONAR hosts marked Clean, Fast. Path segments: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec)]
Poorly Performing Tests Illustrate Likely Problem Areas
[Diagram: the same topology, with several overlapping long-distance tests (48 msec and 49 msec) between perfSONAR hosts; some are marked Clean, Fast and others Dirty, Slow. Path segments: Campus (~1 msec), Regional Path (~2 msec), Internet2 path (~15 msec), ESnet path (~30 msec), Lab (~1 msec)]
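The reasoning behind overlapping long tests can be sketched as set arithmetic: any segment covered by a clean test is cleared, and what remains in the dirty tests is the likely problem area. This is an illustrative sketch only; the segment names and the test results below are hypothetical and do not come from the slides.

```python
def suspect_segments(tests):
    """Return segments seen in at least one dirty test and in no clean test.

    tests: iterable of (segments, is_clean) pairs, where `segments` is the
           set of path segments a long-distance test crossed.
    """
    clean, dirty = set(), set()
    for segments, is_clean in tests:
        (clean if is_clean else dirty).update(segments)
    return sorted(dirty - clean)


if __name__ == "__main__":
    # Hypothetical results for a lab <-> campus path.
    results = [
        ({"lab", "esnet", "internet2"}, True),                         # clean, fast
        ({"internet2", "regional"}, False),                            # dirty, slow
        ({"lab", "esnet", "internet2", "regional", "campus"}, False),  # dirty, slow
    ]
    print(suspect_segments(results))
```

In this hypothetical run the clean test clears the lab, ESnet, and Internet2 segments, leaving the regional and campus segments as suspects. Narrowing further needs another long test that crosses only one of them, which is one reason testers have to be deployed before troubleshooting starts.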
Troubleshooting Case studies
4/22/16 12
Troubleshooting – Host Tuning
• Long path (~70 ms), single stream TCP, 10G cards, tuned hosts
• Why the nearly 2x uptick? Adjusted net.ipv4.tcp_rmem/wmem maximums (used in auto tuning) to 64M instead of 16M.
• As the path length/throughput expectation increases, this is a good idea. There are limits (e.g. beware of buffer bloat on short RTTs)
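The arithmetic behind the buffer change can be sketched with the bandwidth-delay product. This is a rough illustration that treats the buffer maximum as the usable TCP window, which is an assumption: the kernel reserves part of the socket buffer for overhead, so real windows are smaller, and the figures here are not measurements from the slides.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bytes that must be in flight to keep a path of this rate and RTT full."""
    return rate_bps * rtt_s / 8


def window_limited_bps(window_bytes, rtt_s):
    """Best-case single-stream rate when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    rtt = 0.070    # ~70 ms path, as on the slide
    rate = 10e9    # 10G cards
    print(f"BDP: {bdp_bytes(rate, rtt) / 1e6:.1f} MB")
    for mib in (16, 64):
        ceiling = window_limited_bps(mib * 2**20, rtt)
        print(f"{mib} MiB window ceiling: {ceiling / 1e9:.2f} Gbit/s")
```

On a 70 ms path, filling 10 Gbit/s needs about 87.5 MB in flight. A 16 MiB window caps a single stream at roughly 1.9 Gbit/s, while 64 MiB raises that ceiling to roughly 7.7 Gbit/s, so a larger maximum gives autotuning room to grow on long paths.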
Troubleshooting – Host Tuning
• A more complete view – showing the role of MTUs and host tuning (i.e. ‘it’s all related’):
Troubleshooting – Host Tuning
HIDDEN SLIDE
• OWAMP shows packet loss increasing as utilization increases
Monitoring Transatlantic Links
HIDDEN SLIDE
• perfSONAR testing can be made more precise – this is what happens when you use larger buffers and the omit flag
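Why omitting the start of a test changes the reported number can be sketched with per-second samples: the first seconds of a TCP test are spent ramping up, and averaging them in drags the result below the steady-state rate. The sample values below are hypothetical, and the helper only illustrates the effect; in iperf3 the corresponding options are -O (omit the first N seconds) and -w (socket buffer size).

```python
def mean_gbps(samples, omit=0):
    """Average per-second throughput samples, skipping the first `omit`."""
    kept = samples[omit:]
    if not kept:
        raise ValueError("omit removes every sample")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Hypothetical 10-second test: three ramp-up seconds, then steady state.
    samples = [1.0, 4.0, 8.0, 9.4, 9.4, 9.4, 9.4, 9.4, 9.4, 9.4]
    print(f"all samples:  {mean_gbps(samples):.2f} Gbit/s")
    print(f"omit first 3: {mean_gbps(samples, omit=3):.2f} Gbit/s")
```

With these made-up samples the full average is 7.88 Gbit/s, while the average after omitting the ramp-up is 9.40 Gbit/s, which is closer to what the path sustains.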
ESnet Science Engagement Lawrence Berkeley National Laboratory
Thank you!