TDC561 Network Programming
Camelia Zlatea, PhD
Email: [email protected]
Week 8:
Multicasting; Socket Options;
Page 2Network Programming (TDC561) Winter 2003
References
W. Richard Stevens, UNIX Network Programming, Volume 1: Networking APIs: Sockets and XTI, 2nd edition, 1998 (ISBN 0-13-490012-X) – Chap. 7, 11, 19, 21, 22
Addressing in the Internet
Addressing tied to reachability
– Every host interface has its own IP address
– Router interfaces usually have their own IP addresses
IP is version 4 (IPv4 addresses)
– 4 bytes long
– two part hierarchy
» network number and host number
– different types of boundary indicator
» class, subnet mask, prefix
– Goal of boundaries is address aggregation
Address classes
Historical first choice
– fixed network-host partition, with 8 bits of network number
Generalization
– Class A addresses have 8 bits of network number
– Class B addresses have 16 bits of network number
– Class C addresses have 24 bits of network number
Distinguished by leading bits of address
– leading 0 => class A (first byte < 128)
– leading 10 => class B (first byte in the range 128-191)
– leading 110 => class C (first byte in the range 192-223)
– leading 1110 => class D (multicast)
– leading 1111 => class E (reserved)
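The leading-bit rule can be sketched in C as follows. This is a minimal illustration, not code from the course; the function name ipv4_class is hypothetical, and the address is assumed to be passed in host byte order.

```c
#include <stdint.h>

/* Returns 'A'..'E' for the classful category of an IPv4 address
   given in host byte order (hypothetical helper for illustration). */
char ipv4_class(uint32_t addr_host_order)
{
    uint8_t first = (uint8_t)(addr_host_order >> 24);  /* first byte */

    if ((first & 0x80) == 0x00) return 'A';  /* leading 0    */
    if ((first & 0xC0) == 0x80) return 'B';  /* leading 10   */
    if ((first & 0xE0) == 0xC0) return 'C';  /* leading 110  */
    if ((first & 0xF0) == 0xE0) return 'D';  /* leading 1110 */
    return 'E';                              /* leading 1111 */
}
```

For example, 140.192.1.8 (0x8CC00108) falls in class B and 225.1.2.3 (0xE1010203) in class D.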
Address evolution
Class based scheme was too inflexible
Two problems
– Too many routes
– Too few addresses
Four extensions
– Subnetting (flexible boundaries within network)
– CIDR (Classless Inter-Domain Routing: flexible grouping of networks)
– Dynamic host configuration (reuse of addresses)
– A bigger address (IPv6)
One issue
– Network address translation
What is Multicast?
Multicast is a communication paradigm
– 1 source, multiple destinations
Applications:
– bulk-data distribution to subscribers
» (e.g., newspaper, software, and video tape distribution),
– connection-time-based charging data distribution
» (e.g., financial data, stock market information, and news ticker broadcasting),
– streaming (e.g., video/audio real-time distribution),
– push applications, web-casting,
– distance learning, conferencing, collaborative work, distributed simulation, and interactive games.
The Internet group model
– multicast/group communication means...
» 1 → n as well as n → m
– a group is identified by a class D IP address (224.0.0.0 to 239.255.255.255)
» an abstract notion that does not identify any host
[Figure: from logical view to physical view. Logical view: source host_1 (140.192.1.8) sends to multicast group 225.1.2.3, whose receivers are host_2 (140.192.1.6) and host_3 (216.47.143.60). Physical view: a multicast distribution tree connects the Ethernet segments of site 1 and site 2 through multicast routers and the Internet.]
IP Multicast: Basic Idea
Multicast groups: abstract “rendez-vous” points.
Set up optimal spanning tree spanning participants for each group.
Make it cheap by not providing strong guarantees: send out packets and hope for the best.
The Internet group model (cont’)
the group model is an open model
– anybody can belong to a multicast group
» no authorization is required
– a host can belong to many different groups
» no restriction
– a source can send to a group, no matter whether it belongs to the group or not
» membership not required
– the group is dynamic, a host can subscribe to or leave at any time
– a host (source/receiver) does not know the number/identity of members of the group
Mapping IP Multicast onto Ethernet Multicast
IP Multicast (class D IP address):
– Class D: 224.x.x.x-239.x.x.x (in hex: Ex.xx.xx.xx): 28 bits identify the group
– No further structure (like Class A, B, or C)
– Not addresses but identifiers of groups
– Some of them are assigned by the IANA to permanent host groups
Mapping a class D IP address into an Ethernet multicast address
– The low-order 23 bits of the Class D address are inserted into the low-order 23 bits of the Ethernet multicast address
– Many-to-one mapping: 5 of the 28 group bits are not used, so 32 IP groups share each Ethernet address
– More filtering has to be done at IP level
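The mapping can be sketched as a small C function. This is an illustrative sketch only; the name ip_mcast_to_ether is hypothetical, the group address is assumed to be in host byte order, and the 01:00:5e prefix is the one used for the IP multicast range on Ethernet.

```c
#include <stdint.h>

/* Fills mac[6] with the Ethernet multicast address for an IPv4 group
   given in host byte order: prefix 01:00:5e, then the low 23 bits
   of the group address (hypothetical helper for illustration). */
void ip_mcast_to_ether(uint32_t group_host_order, uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (uint8_t)((group_host_order >> 16) & 0x7f); /* top bit dropped */
    mac[4] = (uint8_t)((group_host_order >> 8) & 0xff);
    mac[5] = (uint8_t)(group_host_order & 0xff);
}
```

Because the upper 5 group bits are discarded, 225.1.2.3 and 224.129.2.3 both map to 01:00:5e:01:02:03, which is why the IP layer still has to filter.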
Ethernet Multicast
Ethernet is a broadcast medium
– Every frame can potentially be seen by every host
Ethernet cards have a unique Ethernet address
Broadcast address:
– ff:ff:ff:ff:ff:ff
Ethernet Multicast address range for IP:
– 01:00:5e:00:00:00 to 01:00:5e:7f:ff:ff
[Figure: Mapping IP Multicast onto Ethernet Multicast]
The Internet group model (cont’)
local-area multicast
» use the potential diffusion capabilities of the physical layer (e.g. Ethernet)
» efficient and straightforward
wide-area multicast
» requires going through multicast routers, using IGMP/multicast routing/...
» routing in the same administrative domain is simple and efficient
» inter-domain routing is complex, not fully operational
Multicast and the TCP/IP layered model
[Figure: protocol stack. User space: Application, higher-level services, and building blocks (congestion control, reliability mgmt, multicast routing, other building blocks). Socket layer. Kernel space: TCP, UDP, ICMP, IGMP, IP / IP multicast, device drivers.]
What is Multicast?
Several applications need efficient means to transmit data to multiple destinations with:
– less bandwidth
– higher throughput
– lower delay
– higher reliability
Classification
– Data dissemination
– Transactions
– Large Scale Virtual Environments
Build on top of the existing Internet and take into account group communication constraints
– Manage groups
– Create and maintain multicast routes
– Efficient end-to-end delay (reliability, flow control, time constraints)
Ideal Multicast
Senders (S) and Receivers (R) not aware of each other’s position in the network.
Scalable.
Low latency (join, data propagation).
Low bandwidth and processing overhead.
“Reliable”, if this is cheap (“end-to-end”?)
Easy to join/leave.
Why IP multicast?
scalability...
– scales to an unlimited number of users
reduced costs...
– cheaper equipment and access line
increased speed...
– increases the delivery speed
[Figure: use unicast? ...or multicast? Each panel shows a content server reaching two clients through the ISP and Internet and an access line.]
Multicast Features: Multicast Scope Control
Who gets which packets?
– Send everything to everybody ..
TTL scope
– To keep multicast traffic within an administrative domain by setting TTL thresholds on interfaces on the border router
Administratively scoped addresses
– A multicast boundary can be set up on the borders for addresses in the range 239.0.0.0–239.255.255.255
– Better than TTL scope
Multicasting: Receiving multicast message
For a process to receive multicast messages it needs to perform the following steps:
1. Create a UDP socket msd
    msd = socket(AF_INET, SOCK_DGRAM, 0);
2. Bind it to a UDP port, e.g., 1234. All processes must bind to the same port in order to receive the multicast messages.
    struct sockaddr_in groupHost;
    memset(&groupHost, 0, sizeof(groupHost));
    groupHost.sin_family = AF_INET;
    groupHost.sin_port = htons(UDPport);
    groupHost.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(msd, (struct sockaddr *) &groupHost, sizeof(groupHost));
Multicasting: Receiving multicast message (cont’)
3. Join a multicast group address GroupIPaddress, e.g., 224.111.112.113
    joinGroup(msd, GroupIPaddress);
4. Use recv or recvfrom to read the messages, e.g.,
    nbytes = recv(msd, recvBuf, BufLen, 0);
Multicast Groups and Addresses
Every IP multicast group has a group address.
IP multicast provides only open groups
– it is not necessary to be a member of a group in order to send datagrams to the group.
Multicast addresses are like the IP addresses used for single hosts, and are written in the same way: A.B.C.D.
– Multicast addresses will never clash with host addresses because a portion of the IP address space is specifically reserved for multicast: 224.0.0.0 to 239.255.255.255.
– Multicast addresses from 224.0.0.0 to 224.0.0.255 are reserved for multicast routing information;
– Application programs should use multicast addresses outside this range.
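The address rules above can be checked in code before a program tries to join a group. The following is a minimal sketch; the function name is_app_multicast is hypothetical and only the standard inet_pton and ntohl calls are used.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Returns 1 if the dotted-quad string is a multicast (class D) address
   outside the reserved block 224.0.0.0 to 224.0.0.255, 0 otherwise
   (hypothetical helper for illustration). */
int is_app_multicast(const char *dotted)
{
    struct in_addr a;
    uint32_t h;

    if (inet_pton(AF_INET, dotted, &a) != 1)
        return 0;                         /* not a valid IPv4 address */
    h = ntohl(a.s_addr);                  /* host byte order for range tests */
    if ((h & 0xF0000000u) != 0xE0000000u)
        return 0;                         /* leading bits are not 1110 */
    if ((h & 0xFFFFFF00u) == 0xE0000000u)
        return 0;                         /* 224.0.0.x is reserved */
    return 1;
}
```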
Multicasting: Receiving multicast message
/* This function sets the socket option to make the local host join the multicast group */
void joinGroup(int s, char *group)
{
    struct ip_mreq mreq; /* multicast group info structure */

    mreq.imr_multiaddr.s_addr = inet_addr(group);
    /* check that the group address is indeed a Class D address */
    if (mreq.imr_multiaddr.s_addr == INADDR_NONE ||
        !IN_MULTICAST(ntohl(mreq.imr_multiaddr.s_addr))) {
        printf("error: %s is not a multicast address\n", group);
        exit(-1);
    }
    mreq.imr_interface.s_addr = htonl(INADDR_ANY); /* let the kernel choose the interface */
    if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, (char *) &mreq, sizeof(mreq)) == -1) {
        printf("error in joining group \n");
        exit(-1);
    }
}
Receiving Multicast Datagrams
Join a particular multicast group. This is done using another call to setsockopt:
    struct ip_mreq mreq;
    setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
The definition of struct ip_mreq is as follows:
    struct ip_mreq {
        struct in_addr imr_multiaddr; /* multicast group to join */
        struct in_addr imr_interface; /* interface to join on */
    };
Multicasting: Receiving multicast message
/* This function removes the process from the group */
void leaveGroup(int recvSock, char *group)
{
    struct ip_mreq dreq; /* multicast group info structure */

    dreq.imr_multiaddr.s_addr = inet_addr(group);
    if (dreq.imr_multiaddr.s_addr == INADDR_NONE) {
        printf("error in inet_addr\n");
        exit(-1);
    }
    dreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if (setsockopt(recvSock, IPPROTO_IP, IP_DROP_MEMBERSHIP, (char *) &dreq, sizeof(dreq)) == -1) {
        printf("error in leaving group \n");
        exit(-1);
    }
    printf("process quitting multicast group %s \n", group);
}
Multicasting: Sending multicast message
For a process to send multicast messages it needs to perform the following:
1. Use the UDP socket msd for sending multicast messages
    struct sockaddr_in dest;
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(UDPport); /* port must be in network byte order */
    dest.sin_addr.s_addr = inet_addr(GroupIPaddress);
    sendto(msd, sendBuf, BufLen, 0, (struct sockaddr *) &dest, sizeof(dest));
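Filling in the destination address is easy to get wrong, in particular the byte order of the port. A small helper along these lines keeps it in one place; make_group_dest is a hypothetical name, and inet_pton is used here instead of inet_addr so that an invalid address can be reported cleanly.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Fills *dest for the given group and port (port in host byte order).
   Returns 0 on success, -1 if group is not a valid IPv4 address
   (hypothetical helper for illustration). */
int make_group_dest(const char *group, unsigned short port,
                    struct sockaddr_in *dest)
{
    memset(dest, 0, sizeof(*dest));
    dest->sin_family = AF_INET;
    dest->sin_port = htons(port);             /* network byte order */
    if (inet_pton(AF_INET, group, &dest->sin_addr) != 1)
        return -1;
    return 0;
}
```

The resulting structure can be passed to sendto exactly as dest is in the step above.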
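Multicasting: Sending multicast message (cont’)
2. Joining the group is not required for sending. A process that also wants to receive the group's traffic (as a chat program does) joins the multicast group address GroupIPaddress, e.g., 224.111.112.113
    joinGroup(msd, GroupIPaddress);
3. It then uses recv or recvfrom to read the messages, e.g.,
    nbytes = recv(msd, recvBuf, BufLen, 0);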
Multicasting
Time-to-live
– control how far the messages can go: each router decrements the TTL, so a value of 2 lets a datagram cross one router. (The default is 1, which results in multicast packets going only to other hosts on the local network.)
    u_char TimeToLive;
    TimeToLive = 2;
    setTTLvalue(s, &TimeToLive);

/* This function sets the Time-To-Live value */
void setTTLvalue(int s, u_char *ttl_value)
{
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, (char *) ttl_value, sizeof(u_char)) == -1) {
        printf("error in setting TTL value\n");
    }
}
Multicasting
Time-to-live
– To provide meaningful scope control, multicast routers enforce the following "thresholds" on forwarding based on the TTL field:
0 restricted to the same host
1 restricted to the same subnet
32 restricted to the same site
64 restricted to the same region
128 restricted to the same continent
255 unrestricted
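The threshold table can be turned into a lookup, for instance to log how far a sender intends its traffic to travel. This sketch assumes each listed value is the upper bound of its scope (so 2 through 32 count as "site"); the function name ttl_scope and the returned strings are hypothetical.

```c
/* Maps an initial TTL to the widest scope it can reach, treating each
   threshold in the table as the upper bound of its scope
   (an assumption made for this illustration). */
const char *ttl_scope(unsigned char ttl)
{
    if (ttl == 0)   return "host";
    if (ttl == 1)   return "subnet";
    if (ttl <= 32)  return "site";
    if (ttl <= 64)  return "region";
    if (ttl <= 128) return "continent";
    return "unrestricted";
}
```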
Multicasting
Loop-back
– to control whether the process gets a copy of its own transmissions we use:
    u_char loop;
    loop = 1;
    setLoopback(s, loop);

/* This function enables or disables loopback of outgoing multicast datagrams */
void setLoopback(int s, u_char loop)
{
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, (char *) &loop, sizeof(u_char)) == -1)
    {
        printf("error in setting loopback value\n");
    }
}
By default, messages sent to the multicast group are looped back to the local host. Calling this function with loop = 0 disables that.
loop = 1 /* means enable loopback (default) */
loop = 0 /* means disable loopback */
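One way to confirm that the option took effect is to read it back with getsockopt. The sketch below does this on a throwaway UDP socket; loopback_roundtrip is a hypothetical name, and the one-byte option value follows the u_char convention used in these notes.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sets IP_MULTICAST_LOOP on a fresh UDP socket and reads it back.
   Returns the value reported by the kernel, or -1 on any error
   (hypothetical helper for illustration). */
int loopback_roundtrip(unsigned char want)
{
    unsigned char got = 0;
    socklen_t len = sizeof(got);
    int rc;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s == -1)
        return -1;
    rc = setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &want, sizeof(want));
    if (rc == 0)
        rc = getsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP, &got, &len);
    close(s);
    return rc == 0 ? (int)got : -1;
}
```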
Multicasting
Reuse-port
– allow multiple multicast processes to run on the same host:
    reusePort(s);

/* This function sets a socket option that allows multiple processes to bind to the same port. It must be called before bind(). */
void reusePort(int s)
{
    int one = 1;
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *) &one, sizeof(one)) == -1) {
        printf("error in setsockopt, SO_REUSEADDR \n");
        exit(-1);
    }
}
Multicasting - Example
http://condor.depaul.edu/~czlatea/TDC561/LectureNotes/TDC561_week8/
multicast.h multicastUtilities.c multicastChat.c
Reliable One-One Communication
Use reliable transport protocols (TCP) or handle at the application layer
Client/Server semantics in the presence of failures
Possibilities
– Client unable to locate server
– Lost request messages
– Server crashes after receiving request
– Lost reply messages
– Client crashes after sending request
Reliable One-Many Communication
Reliable multicast
– Lost messages => need to retransmit
Possibilities
– ACK-based schemes
» Sender can become bottleneck
– NACK-based schemes
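In a NACK-based scheme the receiver stays silent while packets arrive in order and only reports gaps. A minimal sketch of the receiver-side bookkeeping follows; the struct and function names are hypothetical, sequence numbers are assumed to start at 0, and wraparound is ignored for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Receiver-side state: the next sequence number expected in order. */
struct nack_state {
    uint32_t next_expected;
};

/* Processes an arriving sequence number. Writes the sequence numbers
   that should be NACKed into missing[] (at most max entries) and
   returns how many were written. Duplicates and retransmissions of
   older packets produce no NACKs. */
size_t on_packet(struct nack_state *st, uint32_t seq,
                 uint32_t *missing, size_t max)
{
    size_t n = 0;
    uint32_t s;

    if (seq < st->next_expected)
        return 0;                        /* duplicate or retransmission */
    for (s = st->next_expected; s < seq && n < max; s++)
        missing[n++] = s;                /* gap: ask the sender for these */
    st->next_expected = seq + 1;
    return n;
}
```

The sender hears nothing in the loss-free case, which avoids the bottleneck that per-packet ACKs from every receiver would create.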
Atomic Multicast
Reliable Group Communication
– Processes can fail
– Atomicity of Multicast is required
» Atomicity?
Group Membership
– Multicast and a corresponding group of recipients
– Failures of processes can be viewed as changes to group membership.
System Model
– Separating receiving a message and delivering it to an application
– Group View: a list of processes associated with a message
View Change
– A special multicast message
– Race between a message m and a view change vc
Condition
– Either m is delivered to all processes before any process is delivered the new vc
– Or, m is not delivered at all.
Atomic Multicast
Atomic multicast: a guarantee that all processes receive the message or none at all
– Replicated database example
Problem: how to handle process crashes?
Solution: group view
– Each message is uniquely associated with a group of processes
» View of the process group when message was sent
» All processes in the group should have the same view (and agree on it)
Virtually Synchronous Multicast
Reliable Mcast Transport Protocol
• S, R use windows
• Designated Receivers eliminate ACK implosion
• ACK’s sent to DR’s
• DR’s and S cache data and retransmit it when needed.
Smart “session manager” elects DR’s and sets parameters. How? Just like that...
RMTP(2)
After set up S starts sending data.
Receivers send periodic ACK’s after first packet received.
If no ACK’s for a long time, connection terminates.
DR’s or S retransmit info using unicast or multicast, depending on number of errors.
Immediate TX request sent to DR’s, for receivers that join the session.
Sender window advance determined by slowest receiver.
ACK’s must not be repeated too often. Measure RTT to AP.
S adjusts (decreases) send window to 1 if many errors; then increases linearly.
DR’s are fixed, but each R chooses its DR. (DR sends SND_ACK_TOME with TTL fixed to a known value).
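Two of the rules above, "window advance determined by slowest receiver" and "decrease to 1 on many errors, then increase linearly", can be sketched as follows. This is only an illustration of the idea, not the RMTP specification; the function names, the error threshold, and the window cap are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* The window can only advance to the lowest cumulative ACK seen,
   i.e. the slowest receiver sets the pace. n must be at least 1. */
uint32_t window_base(const uint32_t *acked, size_t n)
{
    uint32_t lowest = acked[0];
    size_t i;

    for (i = 1; i < n; i++)
        if (acked[i] < lowest)
            lowest = acked[i];
    return lowest;
}

/* Collapse the window to 1 when errors exceed a threshold, otherwise
   grow it by one packet per round up to max_window
   (threshold and cap are hypothetical parameters). */
uint32_t next_window(uint32_t window, uint32_t errors,
                     uint32_t error_threshold, uint32_t max_window)
{
    if (errors > error_threshold)
        return 1;
    return window < max_window ? window + 1 : max_window;
}
```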
Socket Options
Various attributes that are used to determine the behavior of sockets.
Setting options tells the OS/Protocol Stack the behavior we want.
Support for generic options (apply to all sockets) and protocol specific options.
Option types
Many socket options are Boolean flags indicating whether some feature is enabled (1) or disabled (0).
Other options are associated with more complex types including int, timeval, in_addr, sockaddr, etc.
Read-Only Socket Options
– Some options are readable only (we can’t set the value).
Setting and Getting option values
getsockopt() gets the current value of a socket option.
setsockopt() is used to set the value of a socket option.
#include <sys/socket.h>
getsockopt()
int getsockopt( int sockfd,
                int level,
                int optname,
                void *optval,
                socklen_t *optlen);
level specifies whether the option is a general option or a protocol specific option (what level of code should interpret the option).
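As a small illustration of the calling convention, the sketch below reads the read-only SO_TYPE option from a newly created socket. The function name query_socket_type is hypothetical; note that optlen is a value-result argument, set to the buffer size on input and to the number of bytes written on return.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns the SO_TYPE of a newly created AF_INET socket of the
   requested type, or -1 on error (hypothetical helper). */
int query_socket_type(int type)
{
    int value = 0;
    socklen_t len = sizeof(value);   /* in: buffer size, out: bytes written */
    int rc;
    int s = socket(AF_INET, type, 0);

    if (s == -1)
        return -1;
    rc = getsockopt(s, SOL_SOCKET, SO_TYPE, &value, &len);
    close(s);
    return rc == 0 ? value : -1;
}
```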
setsockopt()
int setsockopt( int sockfd,
                int level,
                int optname,
                const void *optval,
                socklen_t optlen);
General Options
Protocol independent options.
Handled by the generic socket system code.
Some general options are supported only by specific types of sockets (SOCK_DGRAM, SOCK_STREAM).
Some Generic Options
SO_BROADCAST
SO_DONTROUTE
SO_ERROR
SO_KEEPALIVE
SO_LINGER
SO_RCVBUF,SO_SNDBUF
SO_REUSEADDR
SO_BROADCAST
Boolean option: enables/disables sending of broadcast messages.
Underlying data-link layer must support broadcasting!
Applies only to SOCK_DGRAM sockets.
Prevents applications from inadvertently sending broadcasts (OS looks for this flag when broadcast address is specified).
SO_DONTROUTE
Boolean option: enables bypassing of normal routing.
Used by routing daemons.
SO_ERROR
Integer value option.
The value is an error indicator value (similar to errno).
Readable only
Reading (by calling getsockopt()) clears any pending error.
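A typical use is checking the outcome of an asynchronous operation such as a non-blocking connect. The sketch below shows only the option access itself; pending_socket_error is a hypothetical name.

```c
#include <sys/socket.h>

/* Fetches and clears the pending error on socket s.
   Returns the error code (0 if none), or -1 if getsockopt itself
   fails (hypothetical helper for illustration). */
int pending_socket_error(int s)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;
}
```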
SO_KEEPALIVE
Boolean option: enabled means that STREAM sockets should send a probe to peer if no data flow for a “long time”.
Used by TCP - allows a process to determine whether peer process/host has crashed.
Consider what would happen to an open telnet connection without keepalive.
SO_LINGER
Value is of type:
struct linger {
    int l_onoff; /* 0 = off */
    int l_linger; /* time in seconds */
};
Used to control whether and how long a call to close will wait for pending ACKs.
Connection-oriented sockets only.
SO_LINGER usage
By default, calling close() on a TCP socket will return immediately.
The closing process has no way of knowing whether or not the peer received all data.
Setting SO_LINGER means the closing process can determine that the peer machine has received the data (but not that the data has been read() !).
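Turning the option on amounts to filling in struct linger and passing it to setsockopt. A minimal sketch, with set_linger as a hypothetical helper name:

```c
#include <sys/socket.h>

/* Turns lingering on for socket s with the given timeout in seconds,
   so that close() waits up to that long for queued data to be
   acknowledged. Returns 0 on success, -1 on error. */
int set_linger(int s, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;          /* nonzero = linger on close */
    lg.l_linger = seconds;   /* how long close() may block */
    return setsockopt(s, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```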
shutdown() vs SO_LINGER
How you can use shutdown() to find out when the peer process has read all the sent data [R.Stevens, 7.5]
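The technique is to half-close the connection and then wait for the peer's own end-of-file. The following is a minimal sketch of that idea under the assumption that the peer closes its side once it has read everything; finish_and_wait is a hypothetical name.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sends our FIN, then reads until the peer's FIN arrives.
   Returns 0 once read() reports end-of-file, or -1 on error
   (hypothetical helper for illustration). */
int finish_and_wait(int s)
{
    char buf[256];
    ssize_t n;

    if (shutdown(s, SHUT_WR) == -1)   /* no more data from us */
        return -1;
    while ((n = read(s, buf, sizeof(buf))) > 0)
        ;                             /* discard anything still arriving */
    return n == 0 ? 0 : -1;           /* 0 bytes means the peer closed */
}
```

Unlike SO_LINGER, a return of 0 here tells the caller that the peer application, not just the peer TCP, has seen the end of the data.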
[Figure: TCP Connection Termination. Client: write, close, close returns. Segments exchanged between client and server in steps 1 to 4: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y+1. Server: data queued by TCP, application reads queued data and FIN, close.]
[Figure: TCP Connection Termination, close w/ SO_LINGER. Client: write, close, close returns. Segments exchanged between client and server in steps 1 to 4: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y+1. Server: data queued by TCP, application reads queued data and FIN, close.]
[Figure: TCP Connection Termination w/ shutdown. Client: write, shutdown WR, read blocks, read returns 0. Segments exchanged between client and server in steps 1 to 4: FIN SN=X, ACK=X+1, FIN SN=Y, ACK=Y+1. Server: data queued by TCP, application reads queued data and FIN, close.]
SO_RCVBUF and SO_SNDBUF
Integer value options - change the receive and send buffer sizes.
Can be used with STREAM and DGRAM sockets.
With TCP, this option affects the window size used for flow control - must be established before connection is made.
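Because the kernel may round or cap the requested size, it is worth reading the value back after setting it. A minimal sketch, with resize_rcvbuf as a hypothetical helper name:

```c
#include <sys/socket.h>

/* Requests a receive buffer of `bytes` on socket s and returns the size
   the kernel reports afterwards, or -1 on error. The reported size may
   differ from the request because kernels round or cap it. */
int resize_rcvbuf(int s, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
        return -1;
    if (getsockopt(s, SOL_SOCKET, SO_RCVBUF, &actual, &len) == -1)
        return -1;
    return actual;
}
```

For a TCP client this would be called between socket() and connect(); for a server, on the listening socket before listen().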
SO_REUSEADDR
Boolean option: enables binding to an address (port) that is already in use.
Used by servers that are transient - allows binding a passive socket to a port currently in use (with active sockets) by other processes.
Can be used to establish separate servers for the same service on different interfaces (or different IP addresses on the same interface).
Virtual Web Servers can work this way.
IP Options (IPv4)
IP_HDRINCL: used on raw IP sockets when we want to build the IP header ourselves.
IP_TOS: allows us to set the “Type-of-service” field in an IP header.
IP_TTL: allows us to set the “Time-to-live” field in an IP header.
TCP socket options
TCP_KEEPALIVE: set the idle time used when SO_KEEPALIVE is enabled.
TCP_MAXSEG: set the maximum segment size sent by a TCP socket.
TCP_NODELAY: can disable TCP’s Nagle algorithm that delays sending small packets if there is unACK’d data pending.
TCP_NODELAY does not disable delayed ACKs: delayed ACK is a separate TCP algorithm (TCP ACKs are cumulative) that often interacts with the Nagle algorithm.
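Protocol-specific options use the protocol's own level rather than SOL_SOCKET. As an illustration, the sketch below sets TCP_NODELAY at level IPPROTO_TCP; disable_nagle is a hypothetical helper name.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disables the Nagle algorithm on TCP socket s so that small writes
   are sent without waiting for outstanding data to be acknowledged.
   Returns 0 on success, -1 on error. */
int disable_nagle(int s)
{
    int on = 1;

    return setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```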