GridNM Network Monitoring Architecture (and a bit about my phd), Yee-Ting Li, 1st Year Report @ UCL, 17th June 2002

Page 1: GridNM Network Monitoring Architecture (and a bit about my phd)

Yee-Ting Li

1st Year Report @ UCL, 17th June 2002

Page 2: What the GRID is

• Distributed system
• Interconnected with networks
• Balancing processors, storage and network utilisation
• Like the SETI project on steroids
• Networking is important to make the GRID work

Page 3: Networking Important!

• Only way two grid nodes can communicate with each other
• Need ways of determining how ‘efficiently’ they talk
• Focus on:
  • Characterising how they talk
  • The language they use to talk

Page 4: Part 1

• Network Metrics and Measurement
• GridNM
• Case studies

Page 5: Network Metrics / Characteristics

• Metric: ‘several quantities related to the performance and reliability of the Internet that we'd like to know the value of. When such a quantity is carefully specified, we term the quantity a metric.’
• Can be empirical or derived
• Singleton, Sample and Statistical Metrics

Note (Yee-Ting Li): Change to IETF and GGF definitions

Page 6: Example Metrics

• Connectivity
• One-way delay
• Two-way delay
• Throughput / goodput
• Network path
• Loss
• Jitter
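Two of the metrics listed above, loss and jitter, can be derived from a single run of probes. The following is a minimal sketch under stated assumptions: lost probes are recorded as `None`, and jitter is taken to be the mean absolute difference between consecutive received delays, which is only one of several definitions in use. The function names and probe values are hypothetical and are not taken from GridNM.

```python
from statistics import mean
from typing import Optional, Sequence


def loss_fraction(delays_ms: Sequence[Optional[float]]) -> float:
    """Fraction of probes that never arrived (recorded as None)."""
    if not delays_ms:
        raise ValueError("no probes were sent")
    lost = sum(1 for d in delays_ms if d is None)
    return lost / len(delays_ms)


def jitter_ms(delays_ms: Sequence[Optional[float]]) -> float:
    """Mean absolute difference between consecutive received delays.

    Assumption: this is one common working definition of jitter,
    chosen here for illustration only.
    """
    received = [d for d in delays_ms if d is not None]
    if len(received) < 2:
        raise ValueError("need at least two received probes")
    return mean(abs(b - a) for a, b in zip(received, received[1:]))


# Hypothetical probe results in milliseconds; the third probe was lost.
probes = [20.0, 22.0, None, 21.0, 25.0]
print(f"loss:   {loss_fraction(probes):.0%}")
print(f"jitter: {jitter_ms(probes):.2f} ms")
```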

Page 7: Metrics Example

Video Conferencing
• Needs predictable bit rate
• Doesn’t usually matter if bit rate changes too much
• Needs constant jitter
• Low one-way delay preferable

FTP
• Needs reliable transport
• Throughput depends on urgency of data
• Jitter and delay don’t matter

Page 8: Measurement Methodology

• How to get the metrics
• Must be repeatable – need to define methodology carefully
• Direct measurement of a performance metric using injected test traffic.
• Projection of a metric from lower-level measurements.
• Estimation of a constituent metric from a set of aggregated measurements.
• Estimation of a given metric at one time from a set of related metrics at other times.

Note (Yee-Ting Li): Change to IETF and GGF definitions

Page 9: Measurement Example

• ‘ping’ measures rtt – a direct measurement
• Sending a single ‘ping’ would give a singleton – empirical
• Sending 10 pings (a sample) out and getting the average is a statistical metric – derived
• Using a set of measurements over time, we can derive an estimate of the rtt
• Projection would be if we had the owd from each router to the next – add them all up to get the path owd.

Note (Yee-Ting Li): Ping example with primary and derived metrics - raw metric - rtt - derived - jitter...
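The distinction between singleton, sample, statistical metric and projection can be made concrete with a few lines of arithmetic. This is a sketch only: the RTT and per-hop delay values are invented for illustration, and `project_path_owd` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical sample: ten pings to one host, RTT in milliseconds.
rtt_sample_ms = [20.1, 19.8, 20.4, 21.0, 19.9, 20.2, 20.0, 20.6, 19.7, 20.3]

# Singleton: one empirical measurement.
singleton_ms = rtt_sample_ms[0]

# Statistical metric: derived from the whole sample.
mean_rtt_ms = mean(rtt_sample_ms)


def project_path_owd(hop_owd_ms):
    """Projection: path one-way delay as the sum of per-hop one-way delays."""
    return sum(hop_owd_ms)


# Hypothetical one-way delays from each router to the next, in milliseconds.
hop_owd_ms = [0.5, 2.0, 7.5, 1.0]

print(f"singleton rtt: {singleton_ms} ms")
print(f"mean rtt:      {mean_rtt_ms:.1f} ms")
print(f"projected owd: {project_path_owd(hop_owd_ms)} ms")
```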

Page 10: Network Monitoring Uses

• Monitoring is measuring over long periods of time
• Gives an indication of network performance over time – a baseline
• Allows comparison of different tools for analysis
• Allows analysis of how different protocols behave in different conditions – in real life
• Allows ‘tuning’ of existing protocols to make the most out of the network

Page 11: GridNM

• Architecture for monitoring the network
• Backend – collects data for presentation
• Logs metrics in ASCII log files on a single host
• Allows mesh measurements – all nodes perform measurements to all other nodes
• Uses standard UNIX infrastructure – ssh
• Should be easily adaptable to using Globus certificates once interactive processing is introduced in EDG.
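A mesh measurement means every node tests to every other node, so n hosts produce n × (n − 1) directed tests. The sketch below only enumerates those pairs; the host names are placeholders, and how GridNM actually schedules its tests is not described in these slides.

```python
from itertools import permutations


def mesh_pairs(hosts):
    """Every ordered (source, destination) pair with source != destination."""
    return list(permutations(hosts, 2))


# Hypothetical host names, for illustration only.
hosts = ["site-a", "site-b", "site-c"]

for src, dst in mesh_pairs(hosts):
    print(f"{src} -> {dst}")
```

With six sites this gives 30 directed pairs per tool, which is one reason tests need to be coordinated so they do not disturb one another.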

Page 12: GridNM (cont…)

• Uses existing (and future) tools to collect metrics
• Modular – uses XML to describe available resources
  • Hosts
  • Tools
• Locks hosts if under measurement – prevents other tests affecting metrics
• Currently monitoring 6 sites around Europe using 5 tools

Note (Yee-Ting Li): check number of tools and hosts
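To illustrate the idea of describing hosts and tools in XML and locking hosts during a test, here is a minimal sketch. The XML schema, the attribute names and the `HostLocks` class are all assumptions made up for this example; the real GridNM resource format and locking mechanism are not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource description; not the actual GridNM schema.
RESOURCES_XML = """
<resources>
  <host name="site-a" address="a.example.org"/>
  <host name="site-b" address="b.example.org"/>
  <tool name="ping" metric="rtt"/>
  <tool name="iperf" metric="goodput"/>
</resources>
"""


def load_resources(xml_text):
    """Return (host names, {tool name: metric}) from the XML description."""
    root = ET.fromstring(xml_text.strip())
    hosts = [h.attrib["name"] for h in root.findall("host")]
    tools = {t.attrib["name"]: t.attrib["metric"] for t in root.findall("tool")}
    return hosts, tools


class HostLocks:
    """In-memory lock table: a host under measurement cannot be reused."""

    def __init__(self):
        self._busy = set()

    def acquire(self, *hosts):
        """Lock all hosts, or none of them if any is already busy."""
        if any(h in self._busy for h in hosts):
            return False
        self._busy.update(hosts)
        return True

    def release(self, *hosts):
        self._busy.difference_update(hosts)


hosts, tools = load_resources(RESOURCES_XML)
locks = HostLocks()
print(hosts, tools)
print(locks.acquire("site-a", "site-b"))  # first test gets both hosts
print(locks.acquire("site-b"))            # refused while site-b is busy
```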

Page 13: GridNM ‘plot’

Page 14: Security

• As secure as SSH
  • But requires automatic logon
• Denial of Service attacks
• Certain tools (e.g. iperf) require servers to be run.
  • GridNM runs the server on the remote host before each test (unless told not to)

Page 15: Tool Examples

Name        Protocol  Metrics             Notes
Iperf       TCP/UDP   Goodput             Idea application level performance
UDPMon      UDP       Loss, goodput       Indication of network performance
Ping        ICMP/IP   RTT, Jitter         Response of network
Traceroute  ICMP/IP   Path, RTT
Pipechar    UDP?      Router utilisation  Approximate
BBCP        TCP       Goodput             SCP Copy
GridFTP     TCP…      Goodput             Application

Page 16: UDP versus TCP

Note (Yee-Ting Li): A plot of UDP against TCP over time for various buffer sizes

Page 17: Rtt – good network

Note (Yee-Ting Li): Add graphs of pings to various sites around the UK and to Netherlands/CERN

Page 18: Rtt – periodicity

Page 19: Rtt – bad network

Page 20: Rtt – bad network, loss

Page 21: TCP / Iperf Throughput

Note (Yee-Ting Li): Different window sizes?

Page 22: TCP Performance

Page 23: TCP Performance

Page 24: What does TCP do?

• Tap is independent of tank size
• Tank filled by application
• Valve opening (data rate) determined by feedback from network
• Small tanks mean small data rate
• Large tanks mean larger data rate
• Even larger tanks mean smaller data rate?!?!

[Diagram labels: Socket buffer size; TCP Protocol; Network]
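The first two observations (small tanks give a small data rate, larger tanks a larger one) match the usual rule of thumb that a TCP sender can have at most one window of data in flight per round trip, so the rate is bounded by buffer size / RTT until the link itself becomes the limit. The sketch below works that bound through with invented numbers; the function names, link speed and RTT are assumptions, and the model does not attempt to explain the puzzling drop seen with even larger buffers.

```python
def window_limited_rate_mbps(buffer_bytes, rtt_s, link_mbps):
    """Upper bound on TCP rate: one buffer of data per RTT, capped by the link."""
    window_rate_mbps = buffer_bytes * 8 / rtt_s / 1e6
    return min(window_rate_mbps, link_mbps)


def bandwidth_delay_product_bytes(link_mbps, rtt_s):
    """Buffer size needed to keep the link full for one round trip."""
    return link_mbps * 1e6 * rtt_s / 8


# Hypothetical path: 100 Mbit/s link with a 20 ms round-trip time.
LINK_MBPS = 100
RTT_S = 0.020

for buffer_kib in (16, 64, 256, 1024):
    rate = window_limited_rate_mbps(buffer_kib * 1024, RTT_S, LINK_MBPS)
    print(f"{buffer_kib:5d} KiB buffer -> at most {rate:6.1f} Mbit/s")

print(f"BDP: {bandwidth_delay_product_bytes(LINK_MBPS, RTT_S):.0f} bytes")
```

In this simple model the rate never falls as the buffer grows, which is why the measured drop calls for a separate explanation.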

Page 25: Investigation

• Possible explanation:
  • Rate of tank filling < rate of water flow out
  • i.e. application not fast enough to fill socket buffer past threshold
• BUT – needs further investigation
  • Back-to-back lab tests with PCs and routers
  • Comparison to other TCP-based tools

Page 26: Part 2

• Network Communication Languages
• Known as transport protocols – determine how applications put traffic into the network
• Sit on top of IP – the common language of the internet

Page 27: Transport Level Protocols

TCP (HTTP, FTP, GridFTP)
• Used for file transfer
• Gives guarantee on delivery
• All data is copied precisely
• Performance can be poor
• Respects other internet users

UDP (Real, H323)
• Used for video conferencing
• Gives no guarantees on delivery
• Data may be incomplete
• Performance good
• Doesn’t respect other internet users
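The difference in delivery guarantees shows up directly in the socket API. The sketch below sends one small message over the loopback interface with each protocol: the TCP side gets a connected byte stream, while the UDP side sends a single datagram and has to allow for it never arriving. The function names are hypothetical, and the single-threaded TCP exchange is an assumption that only holds for payloads small enough to fit in the socket buffers.

```python
import socket


def send_over_tcp(payload: bytes) -> bytes:
    """Send a small payload over a loopback TCP connection and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(server.getsockname())
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal end of stream
                chunks = []
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    chunks.append(chunk)
                return b"".join(chunks)


def send_over_udp(payload: bytes, timeout_s: float = 1.0):
    """Send one loopback UDP datagram; return it, or None if it never arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(timeout_s)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            sender.sendto(payload, receiver.getsockname())
        try:
            data, _ = receiver.recvfrom(65535)
            return data
        except socket.timeout:
            return None  # UDP gives no delivery guarantee


print(send_over_tcp(b"hello grid"))
print(send_over_udp(b"hello grid"))
```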

Page 28: UDP versus TCP performance at high speeds

Page 29: Measuring Performance of Transport Level Protocols

• Need to identify what we want to measure – the metrics.
• Dependent on the use of the transport protocol.
• Need to analyse application level usage
• For Grid:
  • Movement of ‘transient’ data
    • File transfer and replication
    • Process jobs or ‘sandboxes’
  • Movement of real-time data
    • Video conferencing – Access Grid
    • Real-time applications

Page 30: Transport Protocols ‘NG’

Name            Transport     Notes
UDP Blast       UDP
Tsunami         UDP/TCP       Uses TCP as ‘control’ channel
High Speed TCP  TCP           For 10Gb/sec links
PGM / CC        Modified UDP  Multicast UDP – new transport protocol
IBP             Application   ‘logistical networking’

Page 31: Tools to Measure Grid Traffic

• E.g. TCP
  • Can use web100 – allows analysis of TCP traffic via fundamental variables important to TCP/IP
  • GridFTP allows logging of transfer information
• UDP (UDP Blast, Tsunami)
  • Need either transport level recording (like web100) or application monitoring
• PGM / CC
  • Need application to be built to use transport protocol
• General solution
  • Gather SNMP data from nodes along network.

Page 32: Future Directions (the phd bit)

• Provisional title in the field of: Providing Advanced Transport Protocols for Grid Applications
• Aim: Use GridNM infrastructure to analyse performance of different transport protocols
• Implement findings into Grid infrastructure, e.g. GridFTP, to improve grid processes (processing jobs, file transfer, file replication, Access Grid…)

Page 33: Conclusion

• Created a flexible infrastructure to monitor and analyse internet traffic
• Shown metrics for different scenarios
• Given performance overview of current transport protocols
• Identified future areas of research into transport protocols for the grid.