how is the internet performing?
How is the Internet Performing? Les Cottrell – SLAC. Lecture #2 presented at the 26th International Nathiagali Summer College on Physics and Contemporary Needs, 25th June – 14th July, Nathiagali, Pakistan.
1
How is the Internet Performing?
Les Cottrell – SLAC
Lecture #2 presented at the 26th International Nathiagali Summer College on Physics and Contemporary Needs, 25th June – 14th July, Nathiagali, Pakistan
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM), also supported by IUPAP
2
Overview
• Internet characteristics
– packet sizes, protocols, hops, hosts …
– complexity, flows, applications
• Application requirements
• How the Internet worldwide is performing as seen by various measurements and metrics
• How well are requirements met?
• Many sources of measurements
CAIDA/Skitter
PingER/IEPM
Matrix
Surveyor
3
Packet size
• Primarily 3 sizes:
– close to minimum (telnet and ACKs)
– 1500 bytes (max Ethernet payload, e.g. FTP, HTTP)
– ~560 bytes for TCP implementations not using max transmission unit discovery
[Figure: cumulative probability (%) vs. packet size (bytes), with curves for packets and bytes]
Mean ~ 420Bytes, median ~ 80Bytes
Measured Feb 2000 at Ames Internet eXchange
~ 84M packets, < 0.05% fragmented
4
Internet protocol use
• There are 3 main protocols in use on the Internet:
– UDP (connectionless datagrams, best effort delivery)
– TCP (connection oriented, “guaranteed” delivery)
– ICMP (Internet Control Message Protocol)
TCP dominates today
[Figure: SLAC protocol flows (TCP, UDP, ICMP), flows per 10 min in and out, Feb-May 2001]
5
Web use characteristics
• Size of web objects varies from site to site, server to server and by time of day
– Typical medians vary from 1500 to 4000 bytes
• Also varies by object type, e.g. medians for
– movies: a few 100 KB to MBs; postscript & audio: a few 100 KB
– text, html, applets and images: a few thousand bytes
[Figure: distribution of web object sizes (bytes)]
Big peaks for error messages
6
Hops
• Hop counts seen from 4 Skitter sites (Japan, S. Cal, N. Cal, E. Canada), i.e. 10-15 hops on average
[Figure: hop count distributions]
Weak RTT dependence on hop count
[Figure: RTT vs. hops, showing the 5%, 50% and 95% percentiles]
7
Autonomous Systems (AS) Dispersion
• Color indicates the AS responsible for the router at the hop; height is the number of probes for that route
• Seen by Skitter at Palo Alto, US (F root name server)
[Figure: probes per AS vs. hop number]
8
Country dispersion
• Seen from Japan
• After 3 to 4 hops most goes to US
– In some cases goes to US & back to jp
– Some goes to UK & on to other European countries
[Figure: probes vs. hops, by country]
9
Route maps
• Simple routes from TRIUMF, Canada to several sites already get quite complex
[Figure: route map from TRIUMF to SLAC, KEK, UW, FNAL, DESY and CERN]
10
Getting more complex
• PingER Beacon sites in US seen from TRIUMF, Vancouver (from Andrew Daviel, TRIUMF)
11
Connections by country
[Figure: connections by country; labels: Unknown, US, UK, NL, DE, IT, JP, RU]
12
Richness of connectivity
• Angle = longitude of AS HQ in whois records
• Radius = 1-log(outdegree(AS)+1)/(maxoutdegree + 1)
– Outdegree = number of next-hop ASes accepting traffic
• Deeper blue & red = more connections
• All except 1 of the top 15 ASes are in the US; the exception is in Canada
• Few links between ISPs in Europe and Asia
13
Hosts by regions
• Jan 2001, 109 Million hosts
– Source: Internet Software Consortium (www.isc.org)
• see web site also for hosts/population
Notes:
• Many .com are in N. America
• S. Asia = in (36K), pk (6K), lk, bd
• E. Asia = jp, cn, my, sg, tw, hk, th, id, bn, mm
• Mid East = il, kw, lb, ae, tr, sa
• TLDs with hosts ~238
• Total TLDs ~258
14
Backbone utilization
Shows utilization of I2/Abilene backbone links
NB: backbone < 30% loaded
Most losses at exchange points & edges
15
Flow sizes
Heavy tailed, in ~ out, UDP flows shorter than TCP, packet ~ bytes
75% TCP-in < 5 kBytes, 75% TCP-out < 1.5 kBytes (< 10 pkts)
UDP 80% < 600 Bytes (75% < 3 pkts), ~10 times more TCP than UDP
Top UDP = AFS (>55%), Real (~25%), SNMP (~1.4%)
[Figure: flow size distributions; labelled UDP peaks: SNMP, RealA/V, AFS fileserver]
16
Flow lengths
• 60% of TCP flows less than 1 second
• Would expect TCP streams longer lived
– But 60% of UDP flows over 10 seconds, maybe due to heavy use of AFS at SLAC
– Another (CAIDA) study indicates UDP flows are shorter than TCP flows
[Figure: TCP outbound flows, active time in secs]
Measured by Netflow; flows tied off at 30 mins
17
Typical Internet traffic by Application
• CERFnet link
• Dominated by WWW (http)
[Figure: traffic by application; labels: WWW, FTP, RealAudio]
18
SLAC Traffic profile
SLAC offsite links: OC3 to ESnet, 1 Gbps to Stanford U & thence OC12 to I2, OC48 to NTON
Profile: bulk-data xfer dominates
[Figure: Mbps in and Mbps out by application (SSH, FTP, HTTP, bbftp, iperf), over the last 6 months and over 2 days]
19
SLAC Internet Application usage
Ames IXP: approximately 60-65% was HTTP, about 13% was NNTP
Uwisc: 34% HTTP, 24% FTP, 13% Napster
20
What does performance depend on?
• End-to-end Internet performance seen by applications depends on:
– round trip times
– packet loss
– jitter
– reachability
– bottleneck bandwidth
– implementation/configurations
– application requirements
• Data transmitted in packets
21
Application requirements
• Based on ITU Y.1541
• The VoIP loss of 10^-3 used to be 0.25 but that assumed random flat loss
– actual loss is often bursty
• Tail drop in routers
• Sync loss in circuits, bridge spanning tree reconfiguration, route changes
22
RTT from ESnet to Groups of Sites
ITU G.114 300 ms RTT limit for voice
20%/year
RTT ~ distance/(0.6*c) + hops * router delay
Router delay = queuing + clocking in & out + processing
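The RTT model above can be turned into a quick estimate. This is a minimal sketch, assuming that "distance" means the full round-trip path length, that "hops" counts routers over the whole round trip, and that a single average per-router delay is good enough; the function name and the sample numbers are hypothetical, not taken from the measurements.

```python
C_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond


def estimate_rtt_ms(round_trip_km: float, hops: int, router_delay_ms: float) -> float:
    """RTT ~ distance / (0.6 * c) + hops * router delay.

    ASSUMPTIONS: round_trip_km is the out-and-back path length, and
    router_delay_ms lumps together queuing, clocking in & out and
    processing for one router.
    """
    propagation_ms = round_trip_km / (0.6 * C_KM_PER_MS)
    return propagation_ms + hops * router_delay_ms


# Hypothetical example: a 9000 km path each way, 30 router
# traversals in total, 0.5 ms per router.
print(round(estimate_rtt_ms(2 * 9000, 30, 0.5), 1), "ms")
```

For long paths the propagation term dominates, which is consistent with the weak dependence of RTT on hop count noted earlier in the lecture.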
23
RTT Region to Region
OK: White 0-64 ms, Green 64-128 ms, Yellow 128-256 ms
NOT OK: Pink 256-512 ms, Red > 512 ms
OK within regions, N. America OK with Europe, Japan
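The colour bands used in the region-to-region table can be expressed as a small lookup. This is a sketch only: the slide does not say which band a boundary value such as exactly 64 ms belongs to, so the code assumes it falls in the lower band, and the function name `rtt_band` is hypothetical.

```python
from bisect import bisect_left

# Upper edges of the first four bands, in ms; anything above 512 ms is red.
_UPPER_EDGES_MS = [64, 128, 256, 512]
_COLOURS = ["white", "green", "yellow", "pink", "red"]
_OK_COLOURS = {"white", "green", "yellow"}


def rtt_band(rtt_ms: float) -> tuple[str, bool]:
    """Return (colour, is_ok) for an RTT in milliseconds.

    ASSUMPTION: a value exactly on a band edge belongs to the lower band.
    """
    if rtt_ms < 0:
        raise ValueError("RTT cannot be negative")
    colour = _COLOURS[bisect_left(_UPPER_EDGES_MS, rtt_ms)]
    return colour, colour in _OK_COLOURS


for sample_ms in (40, 150, 300, 700):  # illustrative values only
    print(sample_ms, rtt_band(sample_ms))
```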
24
RTT from California to world
[Figure: RTT (ms) vs. longitude (degrees), with RTT frequency distribution; source = Palo Alto CA; clusters labelled W. Coast US, E. Coast US, Europe & S. America, Brazil; 300 ms reference lines; slope annotation 0.3*0.6c]
Data from CAIDA Skitter project
25
RTT from Japan to world
[Figure: RTT (ms) vs. longitude, seen from Japan]
26
Cumulative RTT distributions
• Gives quality measure
• Seen from San Diego, US Skitter
• Steeper = less jitter, i.e. better
• Small values better
[Figure: cumulative % vs. RTT (ms)]
27
Routes are not symmetric
• Min, 50% & 90% RTT measured by Surveyor
• Notice big differences in RTTs
• May be due to different paths in the 2 directions or to different loading
[Figure: RTT (ms), Advanced to U. Chicago and U. Chicago to Advanced]
28
Loss seen from US to groups of Sites
ETSI DTR/TIPHON-05001 V1.2.5 threshold for good speech
50% improvement / year
29
Detailed example of improvements
Increase of bandwidth by factor of 460 in 6 years, more than kept pace - factor of 50 times improvement in loss
Note valleys when students on vacation
30
Loss to world from US
Using year 2000, fraction of world’s population/country from www.nua.ie/surveys/how_many_online/
31
How are the U.S. Nets doing?
In general performance is good (i.e. loss <= 1%)
ESnet holding steady, still better than others
Edu (vBNS/Abilene) & .com improving
32
Losses for 28 days in May 2001
• Measured by MIDS to 583 DNS services, 383 Web services, 1367 Internet (ping) hosts, & 1225 ISPs (routers)
[Figure: % loss for DNS, WWW, Internet and ISP]
33
Losses between Regions
34
Bulk throughput
• Important for long TCP flows where we want to copy large amounts of data from one site to another in a relatively short time, e.g. file transfer
• Depends on RTT, loss, timeouts, window sizes
35
Throughput quality
TCPBW < 1/(RTT*sqrt(loss))
Note E. Europe catching up
Macroscopic Behavior of the TCP Congestion Avoidance Algorithm, Mathis, Semke, Mahdavi, Ott, Computer Communication Review 27(3), July 1997
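The throughput bound above is easy to evaluate numerically. The slide shows only the scaling with RTT and loss; the cited paper writes the bound as MSS * C / (RTT * sqrt(loss)), so this sketch adds the maximum segment size and a constant C of order 1 as explicit parameters. The default C = 1.0 and all sample values are assumptions chosen for illustration.

```python
import math


def tcp_throughput_bound_bps(mss_bytes: int, rtt_s: float, loss: float,
                             c: float = 1.0) -> float:
    """Upper bound on steady-state TCP throughput, in bits per second.

    BW < c * MSS / (RTT * sqrt(loss)).
    ASSUMPTION: c = 1.0 by default; the exact constant depends on the
    loss model and ACK strategy.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss <= 1:
        raise ValueError("loss must be a fraction in (0, 1]")
    return c * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


# Hypothetical paths: same 1460-byte MSS, different RTT and loss.
for rtt_s, loss in [(0.05, 0.001), (0.1, 0.01), (0.3, 0.05)]:
    mbps = tcp_throughput_bound_bps(1460, rtt_s, loss) / 1e6
    print(f"RTT {rtt_s * 1000:.0f} ms, loss {loss:.1%}: < {mbps:.2f} Mbps")
```

Because loss enters as a square root, halving the RTT helps more than halving the loss rate, which is one way to read the regional differences in throughput quality.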
36
Throughput also depends on window
• Optimal window size depends on:
– Bandwidth end to end, i.e. min(BWlinks), AKA bottleneck bandwidth
– Round Trip Time (RTT)
– For TCP keep pipe full
• Window (sometimes called pipe) ~ RTT*BW
– Can increase bandwidth by orders of magnitude
• If no loss, Throughput ~ Window/RTT
[Figure: source (Src) to receiver (Rcv) timing diagram showing packets, ACKs and RTT; t = bits in packet / link speed]
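The window relations on this slide can be checked with a short calculation. This is a minimal sketch, assuming a loss-free path; the function names and the sample link (100 Mbps bottleneck, 150 ms RTT, 64 KB default window) are hypothetical values chosen for illustration.

```python
def optimal_window_bytes(bottleneck_bps: float, rtt_s: float) -> float:
    """Window ~ RTT * BW: the bytes needed in flight to keep the pipe full."""
    return bottleneck_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes: float, rtt_s: float,
                                  bottleneck_bps: float) -> float:
    """Throughput ~ Window / RTT with no loss, capped at the bottleneck."""
    return min(window_bytes * 8 / rtt_s, bottleneck_bps)


bottleneck_bps = 100e6  # assumed bottleneck bandwidth
rtt_s = 0.150           # assumed round trip time

print("window to fill pipe:", optimal_window_bytes(bottleneck_bps, rtt_s), "bytes")
print("with a 64 KB window:",
      window_limited_throughput_bps(65535, rtt_s, bottleneck_bps) / 1e6, "Mbps")
```

With these assumed numbers a 64 KB window reaches only a few Mbps on a 100 Mbps path, which illustrates how a larger window can raise throughput by an order of magnitude or more.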
37
“Jitter” from N. America to W. Europe
“Jitter” = IQR(ipdv), where ipdv(i) = RTT(i) – RTT(i-1)
214 pairs
ETSI: DTR/TIPHON-05001 V1.2.5 (1998-09) good speech < 75ms jitter
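The jitter definition used here, the inter-quartile range of successive RTT differences, can be computed directly from a series of ping results. This sketch uses the standard library's default quartile method, which is an assumption (the slides do not say how quartiles were estimated); the RTT samples are made up for illustration.

```python
from statistics import quantiles


def ipdv(rtts_ms: list[float]) -> list[float]:
    """ipdv(i) = RTT(i) - RTT(i-1) for consecutive measurements."""
    return [later - earlier for earlier, later in zip(rtts_ms, rtts_ms[1:])]


def jitter_iqr_ms(rtts_ms: list[float]) -> float:
    """Jitter = IQR(ipdv): 75th minus 25th percentile of the differences.

    ASSUMPTION: quartiles use statistics.quantiles' default method.
    """
    deltas = ipdv(rtts_ms)
    if len(deltas) < 2:
        raise ValueError("need at least 3 RTT samples")
    q1, _median, q3 = quantiles(deltas, n=4)
    return q3 - q1


sample_rtts_ms = [100, 102, 101, 105, 103, 110, 104, 104]  # made-up samples
jitter = jitter_iqr_ms(sample_rtts_ms)
print("ipdv:", ipdv(sample_rtts_ms))
print("jitter:", jitter, "ms; below 75 ms threshold:", jitter < 75)
```

Using the IQR rather than the standard deviation keeps a few extreme RTT spikes from dominating the jitter figure.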
38
“Jitter” between regions
75ms=Good 125ms=Med 225ms=Poor
ETSI: DTR/TIPHON-05001 V1.2.5 (1998-09)
Jitter varies with loading
39
SLAC-CERN Jitter
IQR(ipdv) between CERN & SLAC from Surveyor measurements (12/15/98 & medians for Dec-98)
[Figure: IQR(ipdv) in msec (log scale, 0.1 to 100) vs. time since midnight (GMT); series: IQR(ipdv) CERN>SLAC, IQR(ipdv) SLAC>CERN, monthly IQR(ipdv) CERN>SLAC, monthly IQR(ipdv) SLAC>CERN; ETSI/TIPHON delay jitter threshold (75 ms) marked]
40
Reachability
Within N. America & W. Europe, loss, RTT and jitter are acceptable for VoIP
But what about reachability?
41
Reachability – Outage Probability
Surveyor probes randomly 2/second
Measure time (outage length) that consecutive probes don’t get through
Heavy tailed outage lengths (packet loss not Poisson)
http://www-iepm.slac.stanford.edu/monitoring/surveyor/outage.html
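The outage-length measurement described above amounts to finding runs of consecutive lost probes. This is a simplified sketch: it assumes probes are evenly spaced at the average rate of 2 per second, whereas the probes are actually sent at random intervals, so the lengths are approximate. The function name and the sample sequence are hypothetical.

```python
from itertools import groupby


def outage_lengths_s(received: list[bool], probes_per_second: float = 2.0) -> list[float]:
    """Length in seconds of each run of consecutive lost probes.

    ASSUMPTION: probes are evenly spaced at probes_per_second, so a run
    of n lost probes is counted as an outage of n / probes_per_second.
    """
    return [
        sum(1 for _ in run) / probes_per_second
        for got_through, run in groupby(received)
        if not got_through
    ]


# Made-up probe results: True = probe got through, False = lost.
probes = [True, False, False, True, False, True, True, False, False, False]
print(outage_lengths_s(probes))
```

Collecting these lengths over many days and plotting their distribution is what reveals the heavy tail: long outages occur far more often than independent random loss would predict.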
42
Europe seen from U.S.
650ms
200 ms
7% loss
10% loss
1% loss
Legend: Monitor site; Beacon site (~10% sites); HENP country; Not HENP; Not HENP & not monitored
43
Asia seen from U.S.
3.6% loss
10% loss
0.1% loss
640 ms
450 ms
250ms
44
Latin America, Africa & Australasia
4% Loss
2% Loss
350 ms
700ms
170 ms
220 ms
45
Animated monthly 2000
20% loss
200ms RTT
20% unreachable
Big is Bad
46
RTT worldwide from the Matrix
47
More Information
• IEEE Communications, May 2000, Vol 38, No 5, pp 120-159
• IEPM/PingER home site
– www-iepm.slac.stanford.edu/
• CAIDA/Skitter home site
– www.caida.org/home/
• Matrix Net home site
– www.matrix.net/index.html
• Surveyor home site
– www.advanced.org/csg-ippm/