Internet Measurement Masterclass 2006
10:00 Session 1: Kick off, problem space, thinking ahead, you and the law
Andrew Moore - Queen Mary, University of London
11:00 Morning tea
11:15 Session 2: Monitoring with Windows and how not to be deluged with data
Dinan Gunawardena - Microsoft Research Cambridge
12:15 Hardware selection for monitoring
Fabian Schneider - TU Berlin
12:45 Lunch, concurrent with Endace hardware demonstration
13:45 Session 3: Netflow, and routing data as a source of measurement
Steve Uhlig - Delft University of Technology
14:45 Afternoon tea
15:00 Session 4: Statistics for the measurement community
Steven Gilmour - Queen Mary, University of London
15:45 Wrap-up
16:00 beer / NGN ProgNet06 workshop starts
Kick-off
Andrew Moore
Queen Mary, University of London
www.dcs.qmul.ac.uk/~awm
What we won’t cover
• Active measurement (AMP, ping, traceroute, rrt, planetlab)
• Exhaustive survey of current measurement research
• I’m happy to provide opinion on these things in a break, but I am not an active-measurement expert; I don’t even play one on television.
WHY Measure?
• Measuring something helps you understand it
Few would argue that the Internet is not important enough to understand
- “Good data outlives bad theory” - Jeff Dozier
- “Measure what is measurable, make measurable what is not.” - after Galileo
Why? A non-exhaustive list
• Measurements are inputs to
  – validate a model
  – drive a simulation
  – test a new approach
• Measurements help understanding (fault-finding)
• Measurements are often part of the accounting process
Why so hard?
Pick your (Endace¹) Dag board, plug it in and go. Right?
Wrong.
• Law
• Level 2 is not always
  – accessible
  – monitor-able
• Operations staff hate you
• Data on the wire is not the only first-class measurement object
• Hardware doesn’t work
• Wrong Measurements
• Wrong Interpretation
• Wrong Problem
¹ Other monitoring boards are available
Where should I start?
• Ask WHY are you measuring?
“Measure twice & cut once”
great for carpenters but
“Think (at least) twice and measure once”
is better for us.
Pick the right tool for the right job
• Measurement of packets on a wire in your lab
  – Great for observing one specific use of one set of applications in one place in the Internet
  – Terrible for telling you how many mobile devices are used for IPTV in China, or the connectivity among world ISPs, or ….
Uh-Oh
• Who are you going to measure? 1 user? 1000 users?
• When? (what time of the day?)
• Where? (your personal machine, a campus? a country?)
• How?
  – How long? a day? week? month?
  – What method are you going to use?
Law (I am Not a Lawyer and this is UK Law)
• If in doubt, seek out advice
• Everything is illegal
• Don’t ask a question you don’t want to know the answer to.
• We care about
  – RIPA (Interception)
  – DPA (personal-data storage)
Many Thanks to Richard Clayton and Andrew Cormack
Data Protection Act 1998
• Overriding aim is to protect the interests of (and avoid risks to) the Data Subject
• Data processing must comply with the eight principles (as interpreted by the regulator)
• All data controllers must “notify” (£35) the Information Commissioner (unless exempt)
  – Exceptions for “private use”, “basic business purpose”: see the website
Data Protection Act (1998)
• Principle 7 is especially relevant
  – Appropriate technical and organization measures shall be taken against unauthorized or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data
• The Information Commissioner advises that a risk-based approach should be taken in determining what measures are appropriate
  – Management and organizational measures are as important as technical ones
  – Pay attention to data over its entire lifetime
RIP Act 2000
• Part I, Chapter I interception
• Part I, Chapter II communications data
• Part II surveillance & informers
• Part III encryption
  – not as relevant for this
• Part IV oversight
  – sets up tribunal and interception commissioner
RIP Act 2000 - Interception
• Tapping a telephone (or copying an email) is “interception”. It must be authorized by a warrant signed by the Secretary of State.
  – SoS means the Home Secretary (or similar). Power delegation is temporary. Product is not admissible in court
• Some sensible exceptions exist
  – Delivered data
  – Stored data that can be accessed by the production of an order
  – Techies running a network
  – “Lawful business practice”
Lawful Business Practice
• Regulations prescribe how not to commit an offence under the RIP Act. They do not specify how to avoid problems with the DPA (or other legislation)
• You must make all reasonable efforts to tell all users of the system that interception may occur
Law One-slider
• If in doubt - ask someone!
• Why do you want to do this?
  – bare minimum, no “data for data’s sake”
  – the onus is on you at all times to justify what you are doing
• Unless you want the job of keeping the DPA happy, don’t keep any personal identifiers
• Use your University ethics committee
I am NOT a Lawyer!
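One way to act on the "don’t keep any personal identifiers" advice is to coarsen addresses before anything reaches disk. The sketch below zeroes the host bits of an IP address using Python's standard `ipaddress` module. The function name `truncate_address` and the prefix lengths (/24 for IPv4, /48 for IPv6) are illustrative assumptions, not values from the talk. Whether truncation is enough for your data is a question for your ethics committee or a lawyer.

```python
import ipaddress

def truncate_address(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the host bits of an address so individual hosts are not stored.

    The default prefix lengths are assumptions chosen for illustration.
    """
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)
```

Applied at capture time, this means the full address never has to be stored, which fits the "bare minimum" rule above.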
(Good) Measurement Principles
• Check your methodology
• Keep all Meta-data
• Calibrate your experiments
• Automate all processing
  – it’s a documentation trail
  – cache those intermediate results; they tell you where you went wrong
• Visualize your data at every stage
  – this helps ensure you didn’t goof
Check your Methodology
• Talk to people around you, find a mentor and even an antagonist
• Better they find something wrong than the external examiner or the reviewers of the paper
• Consider the scope of a reasonable measurement and the claims you can make
Meta-Data
• the filter you used on tcpdump is meta-data
• your methodology is meta-data
• the day/time of the week is meta-data
• the hardware you used is meta-data
• (possibly) how much alcohol in your blood-stream is meta-data
Keep it all
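A low-effort way to "keep it all" is to write a small sidecar file next to every trace. The following is a minimal sketch, assuming a JSON sidecar. The function names and field names are hypothetical, and the fields shown are only a starting point (the capture tool version, tap type and clock source would be natural additions).

```python
import json
import platform
from datetime import datetime, timezone

def build_capture_metadata(capture_file, bpf_filter, interface, notes=""):
    """Collect the context a trace needs to stay interpretable later.

    All field names here are illustrative assumptions.
    """
    return {
        "capture_file": capture_file,
        "filter": bpf_filter,            # e.g. the expression given to tcpdump
        "interface": interface,
        "started_utc": datetime.now(timezone.utc).isoformat(),
        "hostname": platform.node(),
        "os": platform.platform(),
        "notes": notes,                  # methodology, known problems, ...
    }

def write_sidecar(metadata, path):
    """Write the metadata next to the trace as human-readable JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metadata, fh, indent=2, sort_keys=True)
```

For example, `write_sidecar(build_capture_metadata("trace.pcap", "tcp port 80", "eth0"), "trace.pcap.meta.json")` leaves a record that travels with the trace.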
Calibrate your experiments
• Test your assumptions
  – (been assuming the network is busiest at midday? Okay, this is the moment you find that 3:30 is the busy time)
• “bench-test” your setup; this is just good science
  – test your processing scripts many (many) times
• Most departments do not have good test equipment; this is no excuse
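Testing the "busiest at midday" assumption takes only a few lines once you have packet timestamps. This sketch assumes timestamps are epoch seconds and bins them by hour of day in UTC. The function names are hypothetical, and you would convert to local time if "midday" means local midday.

```python
from collections import Counter
from datetime import datetime, timezone

def packets_per_hour(timestamps):
    """Count packets in each hour of the day (0-23, UTC) from epoch seconds."""
    counts = Counter()
    for ts in timestamps:
        counts[datetime.fromtimestamp(ts, tz=timezone.utc).hour] += 1
    return counts

def busiest_hour(timestamps):
    """Return (hour, packet_count) for the hour of day with the most packets."""
    counts = packets_per_hour(timestamps)
    if not counts:
        raise ValueError("no timestamps supplied")
    return counts.most_common(1)[0]
```

Run it over a pilot capture before committing to a measurement window. A single day proves little, so compare several days before trusting the answer.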
Automate your processing
• Make is your friend
• intermediate processing (and the scripts/code that did it) are more meta-data
• critical when you want to reproduce your results (and have others reproduce your results)
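Make itself is the tool recommended above. The sketch below only illustrates, in Python, the rule that make applies: rebuild an intermediate result only when it is missing or older than its inputs. The names `is_stale` and `run_step` are hypothetical helpers, not part of any library.

```python
import os

def is_stale(target, sources):
    """True if target is missing or older than any of its sources (make's rule)."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)

def run_step(target, sources, action):
    """Run action(sources, target) only when target is out of date.

    Returns True if the step ran, False if the cached result was reused.
    """
    if is_stale(target, sources):
        action(sources, target)
        return True
    return False
```

Chaining such steps keeps every intermediate file on disk, which gives you both the cache and the documentation trail.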
Visualize your data
• visualize your data early and often
• scatter plots are always useful
• identify/understand those outliers now
  – problem? or expected result?
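A plot is the first line of defence, but a rough numeric screen can point you at which values to look at. The sketch below flags candidates using the median absolute deviation (MAD). This is a technique chosen here for illustration, not one named in the talk, and the threshold of 3.5 is a common rule of thumb rather than a tuned value. Network data is often highly variable, so treat flagged values as things to understand, not things to delete.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    Uses the median absolute deviation (MAD), which is less swayed by the
    outliers themselves than a mean/standard-deviation rule would be.
    """
    values = list(values)
    if not values:
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical; flag anything that differs.
        return [v for v in values if v != med]
    # 0.6745 scales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]
```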
My first network monitor
• configurations
  – monitor and method
• gotcha
• backhaul network
• storage, archive, index
Configuration
• Hardware selection
  – How are you going to remote-admin this machine?
• OS / Software selection
  – Much work in the unix domain; that doesn’t make it good work (see Dinan’s session)
  – tcpdump/pcap is standard and has lots of tools
    • Not fast, loss/error prone, timestamps are junk
  – divorce the data representation from the method
    • tcpdump is a useful offline tool but dagtools, CoMo and others (nprobe, etc.) are simply better online
  – consider the right tool for the task
Hardware (getting the traffic)
• Passive taps
  – invasive installation
  – no impact in operation
  – “stealing photons”
• Port Mirrors (e.g. Cisco SPAN)
  – be vewy vewy careful.
    • jitter, loss, reordering
  – fantastic for multiple/redundant links
    • multiple copies of packets
Hardware 2
• Remember about physical layers?
• Observing traffic at end systems is pretty easy (but imposes an overhead)
• intermediate networks may not be trivial to monitor:
  – Packet over Ethernet, Packet over SONET are not the only possibilities
• Aside from weird layer-2s, maybe encrypted,
Getting the data to somewhere useful
• Out of Band backhaul
  – Co-schedule Measurements
  – FedEx the disks (realistically - postgrad-u-haul)
  – Co-locate storage/processing
    • storage & processing = heat/power
  – Dedicated backhaul, e.g. using (a piece of) the dedicated research net
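A little arithmetic shows why shipping disks is a serious option. The sketch below estimates how much data full-packet capture produces and how long it takes to move over a backhaul link. Both function names are hypothetical, and the calculation ignores per-packet capture-file overhead and any compression.

```python
def capture_volume_bytes(link_bps: float, utilisation: float, seconds: float) -> float:
    """Bytes produced by full-packet capture of a link at a given mean utilisation."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    return link_bps / 8 * utilisation * seconds

def transfer_seconds(volume_bytes: float, backhaul_bps: float) -> float:
    """Time to move that volume over a backhaul link of the given rate."""
    return volume_bytes * 8 / backhaul_bps
```

As an illustration with made-up numbers: a 1 Gbit/s link at 30% mean utilisation yields about 3.24 TB per day. Moving that over a dedicated 100 Mbit/s backhaul takes about three days, so the backlog grows faster than it drains.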
Tools
• tcpdump (libpcap) - but know the limitations
  a) no records of loss
  b) microsecond accuracy only - and RARELY that
  c) simultaneous arrival times are possible
  d) no record of precision or accuracy or filter or conditions or monitor-circumstance or equipment failure or …
• gnuplot (or any plotting package)
  – scatter plots are always useful (combined with eye-squared)
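Limitations (a) to (c) are visible in the classic libpcap file layout itself. Each packet record carries only a seconds field, a microseconds field, the captured length and the original length, so there is nowhere to record loss. The sketch below reads those record headers with the standard `struct` module and counts packets that share a timestamp. It handles only the classic microsecond format (magic number 0xa1b2c3d4); the function names are hypothetical.

```python
import struct
from collections import Counter

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic libpcap file, microsecond timestamps

def read_pcap_timestamps(data: bytes):
    """Return a list of (ts_sec, ts_usec) from a classic pcap byte string."""
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    # The magic number tells us the byte order the file was written in.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    offset = 24  # skip the 24-byte global header
    stamps = []
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        stamps.append((ts_sec, ts_usec))
        offset += 16 + incl_len  # record header plus captured bytes
    return stamps

def count_tied_timestamps(stamps):
    """Number of packets that share a timestamp with at least one other packet."""
    counts = Counter(stamps)
    return sum(n for n in counts.values() if n > 1)
```

A high count of tied timestamps on a fast link is a hint that the clock resolution, not the traffic, is shaping your inter-arrival times.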
Sharing: Providing Access to the data
• Law may prevent access
• Either need to control who gets data
OR
• Ship code to monitor (Mogul et al, MineNet 2005/6)
• One Platform: CoMo http://como.sourceforge.net
These guys do run the Internet
(or why I should be nice to my ops guys)
• Looking for a real problem?
• Wondering about actual impact?
• Talk to your front line
• Sysadmins and Operators are front-line
• They are rarely stupid
• Don’t have the time to “think outside the box”
• they will be honest with you (brutally honest in most cases)
• www.nanog.org
• www.ripe.org
Next….
• Let’s examine hardware and Operating Systems issues, specifically:
  – Windows: the other operating-system
  – Data-management: how to prevent success-disaster
  – So you want to monitor 10Gbps?
Suppliers
• NetOptics - fibre splitters
• Endace - capture hardware
UK specific resources
• Janet’s NDA and AUP: http://www.ja.net/development/traffic-data/
• Data Protection Act: http://www.hmso.gov.uk/acts/acts1998/19980029.htm
• RIPA: http://www.legislation.hmso.gov.uk/acts/acts2000/20000023.htm
Specific references
• Mark Crovella & Bala Krishnamurthy, Internet Measurement, Wiley 2006
• Walter Willinger, Pragmatic Approach to Dealing with High Variability, IMC 2004
• Vern Paxson, Sound Internet Measurement, IMC 2004
Very early “what I did with my measurements” papers; these are the grandparents of much Internet measurement work
• kc claffy et al., A parameterizable methodology for Internet traffic flow profiling, IEEE JSAC, 1995
• V. Paxson, End-to-End Routing Behavior in the Internet. IEEE/ACM Transactions on Networking, Vol.5, No.5, pp. 601-615, October 1997