postech 1/46 csed702y: software defined networking james won-ki hong department of computer science...
TRANSCRIPT
POSTECH 1/46CSED702Y: Software Defined Networking
Software Defined Networking:Traffic Monitoring and Analysis
James Won-Ki Hong
Department of Computer Science and Engineering
POSTECH, Korea
POSTECH 2/47CSED702Y: Software Defined Networking
Outline
Introduction Motivation Research Issues and Goals Active Monitoring Techniques Passive Monitoring Techniques
POSTECH 3/47CSED702Y: Software Defined Networking
Introduction
POSTECH 4/47CSED702Y: Software Defined Networking
Introduction (1/9)
Growth of Internet Users The Number of Internet users is growing
Source : www.internetworldstats.com
POSTECH 5/47CSED702Y: Software Defined Networking
Introduction (2/9)
Growth of Internet Users Internet traffic has increased dramatically
Source: Cisco
(Exabyte = 1 million terabytes = 260 bytes)
POSTECH 6/47CSED702Y: Software Defined Networking
Introduction (3/9)
Stand-alone applications can now utilize networking Cooperative editing: Abiword, ACE, MS SharePoint Workspace Browser-based software: Google Docs, Google Wave Game console: Microsoft XBOX, Sony Playstation, Nintendo Wii
Network applications Online games, shopping, banking, stock trading, network storage,
P2P applications VOD, EOD (Education on Demand), VOIP, IPTV
Online game VoIP VOD
POSTECH 7/47CSED702Y: Software Defined Networking
Introduction (4/9)
Client-Server Traditional structure
Peer-to-Peer (P2P) New concept between file sharing and transferring Generates high volume of traffic
clientserver
peerdiscovery, content, transfer query
peer
peer
Structures of applications are changing!
POSTECH 8/47CSED702Y: Software Defined Networking
Introduction (5/9)
Types of Traffic Static sessions vs. Dynamic sessions
Bursty data transfer vs. Streaming data transfer
Types of traffic are various and increasing!
Negotiate &
allocate
connect
disconnect
use dynamic protocol, port
data
connect
disconnect
control
use static protocol,
port
packet network packet network
POSTECH 9/47CSED702Y: Software Defined Networking
Introduction (6/9)
Internet Protocol Distribution
Transport Protocol Distribution The amount of UDP flows is increasing by P2P applications The amount of ICMP flows is increasing by Internet worms
protocol Flows Packets Bytes
TCP 32,515 14.4% 1,797,176 86.3% 1,339,396,630 96.8%
UDP 54,561 24.2% 141,769 6.8% 27,812,586 2.0%
ICMP 138,253 61.3% 141,247 6.7% 15,720,410 1.1%
Others 125 0.0% 474 0.0% 32,160 0.0%
2003.09.16 – 19:36
POSTECH Internet Junction Traffic
2003.09.16 – 19:36
POSTECH Internet Junction Traffic
POSTECH 10/47CSED702Y: Software Defined Networking
Introduction (7/9)
Internet Protocol Distribution
Transport Protocol Distribution The amount of UDP flows is increasing by P2P, gaming & multime-
dia streaming applications
protocol Flows Packets Bytes
TCP 42,533 5.8% 1,677,721 38.7% 1,288,490,188 39.9%
UDP 678,800 93.4% 2,621,440 60.5% 1,932,735,283 59.9%
ICMP 4,452 0.6% 31,256 0.7% 2,516,582 0.1%
Others 445 0.0% 3,099 0.0% 570,726 0.0%
2011.03.28 – 18:15
POSTECH Internet Junction Traffic
2011.03.28 – 18:15
POSTECH Internet Junction Traffic
POSTECH 11/47CSED702Y: Software Defined Networking
Introduction (8/9)
Port Number Usage in TCP/UDP Port number distribution in bytes
Proportion of Internet applications
TCP Server Listening Port Number Distribution
2%
98% <1024>=1024
41%
59%
<1024>=1024
UDP Port Number Distribution
? ?
54%21%
20%
0%
5%
HTTPFTPTELNETSMTPOthers
2003.09.16 – 19:36
POSTECH Internet Junction Traffic
2003.09.16 – 19:36
POSTECH Internet Junction Traffic
?
POSTECH 12/47CSED702Y: Software Defined Networking
Introduction (9/9)
Port Number Usage in TCP/UDP Port number distribution in bytes
Proportion of Internet applications
0.18%
99.82%
< 1024Others
0.75%
99.25%
< 1024Others
TCP Server Listening Port Number Distribution UDP Port Number Distribution
? ?
11.403%2.484%
84.986%
httpssltcp encap.smtppoprtspsshOthers
?
2011.03.28 – 18:15
POSTECH Internet Junction Traffic
2011.03.28 – 18:15
POSTECH Internet Junction Traffic
POSTECH 13/47CSED702Y: Software Defined Networking
Motivation (1/2)
Needs of Service Providers Understand the behavior of their networks Provide fast, high-quality, reliable service to satisfy customers and
thus reduce churn rate Plan for network deployment and expansion SLA monitoring, Network security Increase Revenue!
• Usage-based billing for network users (like telephone calls)• Marketing using CRM data
Needs of Customers Want to get their money’s worth Fast, reliable, high-quality, secure, virus-free Internet access
To Satisfy Service Providers’ Needs to Satisfy Their Customers!
POSTECH 14/47CSED702Y: Software Defined Networking
Motivation (2/2)
Application Areas Network Problem Determination and Analysis Traffic Report Generation Intrusion & Hacking Attack (e.g., DoS, DDoS) Detection Service Level Monitoring (SLM) Network Planning Usage-based Billing Customer Relationship Management (CRM) Marketing
POSTECH 15/47CSED702Y: Software Defined Networking
Issues in Traffic Monitoring
Choices Single-point vs. Multi-point monitoring
• Number of probing or test packet generation point In-service vs. Out-of-service monitoring
• Whether monitoring should be executed during service or not Continuous vs. On-demand monitoring
• Monitoring executes continuously or by on-demand. Packet vs. Flow-based monitoring
• Collect packets or flows from network devices. One-way vs. Bi-directional monitoring
• Monitor forward path only / forward and return path
Trade-offs Network bandwidth Processing overhead Accuracy Cost
POSTECH 16/47CSED702Y: Software Defined Networking
Problems
Capturing Packets High-speed networks (Mbps Gbps Tbps) High-volume traffic Streaming media (Windows Media, Real Media, Quicktime) P2P traffic Network Security Attacks
Flow Generation & Storage What packet information to save to perform various analysis? How to minimize storage requirements?
Analysis How to analyze and generate data needed quickly? What kinds of info needs to be generated? Depends on applica-
tions
POSTECH 17/47CSED702Y: Software Defined Networking
Research & Development Goals
Develop Methods to Capture all packets Generate flows Store flows efficiently Analyze data efficiently Generate various reports or information that are suitable for various
application areas
Develop a Flexible, Scalable Traffic Monitoring and Analysis System for High-speed High-volume Rich media IP networks
POSTECH 18/47CSED702Y: Software Defined Networking
Traffic Monitoring
POSTECH 19/47CSED702Y: Software Defined Networking
Network Monitoring Metrics (1/5)
One way loss
RT loss
One way delay
RT delay
Capacity
Bandwidth
Throughput
Delay variance
Network MonitoringMetrics
AvailabilityConnectivity
Functionality
Loss
Delay
Utilization
POSTECH 20/47CSED702Y: Software Defined Networking
Network Monitoring Metrics (2/5)
Availability The percentage of a specified time interval during which the system
was available for normal use
What is supposed to be available?• Service, Host, Network
Availabilities are usually reported as a single monthly figure• 99.99% availability means that the service is unavailable for 4 minutes during a
month
One can test availability by sending suitable packets and observing the answering packets (latency, packet loss)
Metrics• Connectivity: the physical connectivity of network elements• Functionality: whether the associated system works well or not
POSTECH 21/47CSED702Y: Software Defined Networking
Network Monitoring Metrics (3/5)
Packet Loss The fraction of packets lost in transit from a host to another during
a specified time interval
Internet packet transport works on a best-effort basis, i.e., a router may drop them depending on its current conditions
A moderate level of packet loss is not in itself tolerable• Some real-time services, e.g., VoIP, can tolerate some packet losses• TCP resends lost packets at a slower rate
Metrics• One way loss• Round Trip (RT) loss
POSTECH 22/47CSED702Y: Software Defined Networking
Network Monitoring Metrics (4/5)
Delay (Latency) The time taken for a packet to travel from a host to another
Round Trip Time (RTT)• Forward transport delay + server delay + backward transport delay
Forward transport delay is often not the same as backward trans-port delay (may use different paths)
For streaming applications, high delay or delay variation (jitter) can cause degradation on user-perceived QoS
Metrics• One way delay• Round Trip Time (delay)• Delay variance (jitter)
POSTECH 23/47CSED702Y: Software Defined Networking
Network Monitoring Metrics (5/5)
Throughput The rate at which data is sent through the network, usually ex-
pressed in bytes/sec, packets/sec, or flows/sec
Be careful in choosing the interval; a long interval will average out short-term bursts in the data rate• A good compromise is to use one- to five-minute intervals, and to produce daily,
weekly, monthly, and yearly plots
Link Utilization over a specified interval is simply the throughput for the link expressed as a percentage of the access rate
Metrics• Link Capacity (Mbps, Gbps)• Throughput (bytes/sec, packets/sec, flows/sec)• Utilization (%)
POSTECH 24/47CSED702Y: Software Defined Networking
Traffic Monitoring Approaches (1/4)
Active Monitoring
Passive Monitoring
POSTECH 25/47CSED702Y: Software Defined Networking
Traffic Monitoring Approaches (2/4)
Active Monitoring Performed by sending test (probe) traffic into network
• Generate test packets periodically or on-demand• Measure performance of test packets or responses • Take the statistics
Impose extra traffic on network and distort its behavior in the process Test packet can be blocked by firewall or processed at low priority by
routers Mainly used to monitor network performance
Test packetgenerator
Test packetprobe
ResponseProbe Target
host
POSTECH 26/47CSED702Y: Software Defined Networking
Traffic Monitoring Approaches (3/4)
Passive Monitoring Carried out by observing network traffic
• Collect packets from a link or network flow from a router• Perform analysis on captured packets for various purposes
Network device performance degrades by mirroring or flow export Used to perform various traffic usage/characterization analysis or
intrusion detection
FlowData
TrafficInformation
PacketCapture
TrafficAnalysis
FlowGeneration
Ne
two
rk link
Ro
ute
r
POSTECH 27/47CSED702Y: Software Defined Networking
Traffic Monitoring Approaches (3/4)
Comparison of Two Monitoring Approaches
Active Monitoring Passive MonitoringConfiguration Multi-point Single or multi-point Data size Small Large
Network overhead Additional traffic Device overhead No overhead if splitter is used
Purpose Delay, packet loss, availability Throughput, traffic pattern, trend, & detection
CPU Requirement Low to Moderate High
Advantages
Gain some benefits at the initial stage of network construction, because not much data gained from passive one
Measured result may show the real network characteris-tics
Does not need to generate additional probe messages
Disadvantages
Cannot reflect network charac-teristics
Need to generate the probe messages which may cause extra overhead to network
Captured data has massive volume size
Should have additional facility to capture the mirrored packet from network
POSTECH 28/47CSED702Y: Software Defined Networking
Active Monitoring Tech-niques
POSTECH 29/47CSED702Y: Software Defined Networking
Active Monitoring Techniques
ICMP-based Method Diagnose network problems Availability / Round-trip delay / Round-trip packet loss
TCP-based Method One-way bandwidth / Round trip bandwidth Bulk transfer rate
UDP-based Method One-way packet loss / Round trip bandwidth
POSTECH 30/47CSED702Y: Software Defined Networking
ICMP-based Method (1/5)
Active Monitoring – ICMP Internet Control Message Protocol (ICMP), RFC 792 The purpose of ICMP messages is to provide feedback about prob-
lems in the IP network environment Delivered in IP packets
ICMP message format• 4 byte of ICMP header and optional message
POSTECH 31/47CSED702Y: Software Defined Networking
ICMP-based Method (2/5)
ICMP Functions To announce network errors
• If a network, host, port is unreachable, ICMP Destination Unreachable Message is sent to the source host
To announce network congestion• When a router runs out of buffer queue space, ICMP Source Quench Message is
sent to the source host
To assist troubleshooting • ICMP Echo Message is sent to a host to test if it is alive - used by ping
To announce timeouts• If a packet’s TTL field drops to zero, ICMP Time Exceeded Message is sent to the
source host - used by traceroute
POSTECH 32/47CSED702Y: Software Defined Networking
ICMP-based Method (3/5)
ICMP Drawbacks ICMP messages may be blocked (i.e., dropped) by firewall and
processed at low priority by router ICMP has also received bad press by being used in many denial of
service (DoS) attacks and because of the number of sites generat-ing monitoring traffic
As a consequence some ISPs disable ICMP even though this po-tentially causes poor performance and does not comply with RFC1009 (Internet Gateway Requirements)
In spite of these limitations, ICMP is still most widely used in active network measurements
POSTECH 33/47CSED702Y: Software Defined Networking
ICMP-based Method (4/5)
Ping A simple application that runs on a host, typically supplied as part
of the host's operating system Uses ICMP ECHO_REQUEST and ECHO_RESPONSE packets Provides round-trip time and packet loss For average measurement, run ping at regular intervals so as to
measure the site's latency and packet loss
POSTECH 34/47CSED702Y: Software Defined Networking
ICMP-based Method (5/5)
Traceroute Produces a hop-by-hop listing for each router along the path to the
target host For each hop, it prints the round-trip time for the router Algorithm: uses ICMP and TTL field in the IP header
• Send an ICMP packet with TTL=1• First router sends back ICMP TIME_EXCEEDED• Then send ICMP packet with TTL=2 and hear back from the second
router• Continue till the destination is reached or TTL expires (default max
TTL=30) It shows you only the forward path
• The reverse path is seldom the same• To trace the reverse path one must run traceroute on the remote host
(reverse traceroute server, Looking Glass Server)
POSTECH 35/47CSED702Y: Software Defined Networking
TCP-based Method
TCP – Throughput
MeasurementSource Machine Measurement
Source Machine Measurement
Destination Machine Measurement
Destination Machine
NTP Synchronized hosts
TCP
local time : t1
local time : t2
t1
t2
Throughput (Mbps) = t2(㎲ ) – t1(㎲ )
105 x 8
100 KB
POSTECH 36/47CSED702Y: Software Defined Networking
UDP-based Method
UDP – One Way Loss
MeasurementSource MachineMeasurement
Source Machine MeasurementDestination Machine
MeasurementDestination Machine
NTP Synchronized hosts
UDP
100 KB
100 KB
One way Loss = 100 - x 100 (%) Sent Packet Counts
Received Packet Counts
1 Packet (1000 Byte)
POSTECH 37/47CSED702Y: Software Defined Networking
Passive Monitoring and Analysis Techniques
POSTECH 38/47CSED702Y: Software Defined Networking
Packet Capturing (1/2)
Packet Capturing Packets can be captured using Port Mirroring or Network Splitter (Tap)
Mirroring
Probe system
Splitting
Probe system
Port Mirroring Network Splitter
How it works- Copies all packets pass-ing on a port to another port
- Splits the signal and sends a signal to original path and an-other to probe
Advantage- No extra hardware re-quired
- No processing overhead on router/switch
Disadvantage- Processing overhead on router/switch
- Splitter hardware required
POSTECH 39/47CSED702Y: Software Defined Networking
Packet Capturing (2/2)
Difficulties in packet capturing Massive amount of data
• How much packet data is generated from 100 Mbps network in an hour? Port speed In&Out Link Utilization sec/hour = throughput ⅹ ⅹ ⅹ 100 Mbps 2 0.5 3600 = 360 Gbpsⅹ ⅹ ⅹ Throughput / avg. packet length bytes of packet data = data sizeⅹ 360 Gbps / (1500 8) 30 = 1 Gbyteⅹ ⅹ
Processing of high-speed packets• Processing time for 100 Mbps networkPort speed In&Out Link Utilization / average packet length ⅹ ⅹ = 8333 packets/sec => 0.12 msec/packet
100 Mbps 1 Gbps 1 Tbps
Data size per hour (assume 0.5 link util)
1 Gbyte 10 Gbyte 10 Tbyte
Processing Time per packet 0.12 msec 0.012 msec 0.012 μsec
POSTECH 40/47CSED702Y: Software Defined Networking
Sampling
Why We Need Sampling? If the rate is too high to capture all packets reliably, there is no al-
ternative but to sample the packets Sampling algorithms: every Nth packet or fixed time interval
1 2 3 4 5 6 7 8 9 10 11
(a) 2:1 sampling
(b) 1 msec sampling
0 msec 1 msec 2 msec 3 msec 4 msec
POSTECH 41/47CSED702Y: Software Defined Networking
Flow Generation
Flow Flow is a collection of packets with the same {SRC and DST IP
address, SRC and DST port number, protocol number} Flow data can be collected from routers directly, or standalone flow
generator having packet capturing capability Popular flow formats
• NetFlow (Cisco), sFlow (sFlow.org), IPFIX (IETF) Issues in flow generation
• What information should be included in a flow data?• How to generate flow data from raw packet information efficiently?• How to save bulk flow data into DB or binary file in a collector?• How long should the data be preserved?
flow 4flow 1 flow 2 flow 3
POSTECH 42/47CSED702Y: Software Defined Networking
Flow: NetFlow
Cisco NetFlow An option configurable in Cisco routers that exports data on each
IP flow passed through an interface
NetFlow Export Datagram
Flow format of Version5
Header· Sequence number· Record count· Version number
FlowRecord
FlowRecord
FlowRecord
FlowRecord
FlowRecord
• Source IP Address• Destination IP Address
• Next Hop Address• Source AS Number• Dest. AS Number• Source Prefix Mask• Dest. Prefix Mask
• Input Interface Port• Output Interface Port
• Type of Service• TCP Flags• Protocol
• Packet Count• Byte Count
• Start Timestamp• End Timestamp
• Source TCP/UDP Port• Destination TCP/UDP Port
Usage
QoS
Timeof Day
Application
RoutingandPeering
PortUtilization
From/To
POSTECH 43/47CSED702Y: Software Defined Networking
Flow: sFlow
sFlow Described in RFC 3176: “InMon Corporation's sFlow: A Method for
Monitoring Traffic in Switched and Routed Networks” sFlow is a monitoring technology that gives visibility into the use of
networks, enabling performance optimization, accounting/billing for usage, and defense against security threats
sFlow samples packets using statistical sampling theory Format of Version 4
• Packet Header Data• Header Protocol (Format of sampled header)
• Frame_length
• Header bytes
• Packet IP v4 Data• Length
• Protocol (IP Protocol Type)
• src_ip / dst_ip
• src_port / dst_port
• TCP flags
• tos
POSTECH 44/47CSED702Y: Software Defined Networking
Traffic Analysis Aspects
Spatial Aspect The patterns of traffic flow relative to the network topology Important for proper network design and planning Identification of bottleneck & avoidance of congestion Example: Flow aggregation by src, dst IP address or AS number
Temporal Aspect The stochastic behavior of a traffic flow, described in statistical terms Important for resource management and traffic control Important for traffic shaping and caching policies Example: Packet or byte per hour, day, week, month
Composition of Traffic A breakdown of traffic according to the contents, application, packet
length, flow duration Helps to explain its temporal and spatial characteristics Example: game, streaming media traffic for a week from peer ISP
POSTECH 45/47CSED702Y: Software Defined Networking
Traffic Classification/Identification (1/2)
Traffic Classification Classifying traffic based on features passively observed in the traffic,
and according to specific classification goals
Types of Traffic Classification Port-based approaches
• E.g., TCP port 20 and 21 FTP, TCP port 80 HTTP
Payload-based approaches• E.g., “0x12BitTorrent protocol” BitTorrent
Machine Learning (ML)-based approaches• Connection-related statistical information-including connection duration, inter-
packet arrival time, and packet
Accuracy Strength Weakness
Port-based Low Low computational cost Low accuracy
Payload-based High Most accurate methodHigh computational costExhaustive signature generation
ML-based High Can handle encrypted traffic High computational cost
POSTECH 46/47CSED702Y: Software Defined Networking
Traffic Classification/Identification (2/2)
In the Perspective of Network Layers
Classification Level in Practice (Classification Output)
• IP, ARP, RARP, etc.Network Layer
• TCP, UDP, ICMP, etc.Transport Layer
• HTTP, HTTPS, SMTP, FTP, TELNET, SSH, POP, etc.Application Layer
• Bulk transfer, small transaction, etc.Traffic clustering
• Web, game, P2P, messenger, streaming, mail, etc.Application-type breakdown
• HTTP, HTTPS, SMTP, FTP, TELNET, SSH, POP, etc.Application proto-col breakdown
• BitTorrent, MSN, NateOn, Filezilla FTP, etc. Application Break-down
POSTECH 47/47CSED702Y: Software Defined Networking
Q&A