Don't Underestimate the Performance of TCP
TCP versus UDP
TCP
- connection-oriented, reliable byte stream
- Applications (typically concurrent servers): SMTP (Simple Mail Transfer Protocol), Telnet, FTP, HTTP, NNTP (Network News Transfer Protocol)
UDP
- connectionless, unreliable datagram
- Applications (typically iterative servers): SNMP (Simple Network Management Protocol), TFTP (Trivial FTP), BOOTP (Bootstrap Protocol), DHCP (Dynamic Host Configuration Protocol)
TCP Is Sufficiently Optimized
- Typical TCP segment receive routine: 30 instructions (excluding checksum)
- ACKs are piggybacked as well
- Do not reach for UDP rashly.
- But for a transaction that ends after a single request-response, TCP pays for
  - connection set-up: one RTT
  - connection release: at least one RTT
A UDP Source and Sink
(listings: udpsource.c, udpsink.c)
- A record of length 0 is the end-of-record mark
- Set the UDP socket receive buffer size
- The socket cannot actually buffer 5,000 datagrams, because the maximum socket receive buffer size the system allows is much smaller than that (tens of KB).
UDP datagrams can be lost!
1. Network congestion (no congestion control)
2. Receive buffer overflow (no flow control)
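
The receive-buffer limit above can be checked by reading the option back after setting it. The following is a minimal sketch, not the udpsink.c listing; the helper name `set_udp_rcvbuf` is an assumption made for illustration. Depending on the system, an oversized request is either clamped or rejected.

```c
#include <sys/socket.h>

/* Hypothetical helper: ask the kernel for a receive buffer of `want` bytes
 * and return the size it actually granted, or -1 on error. */
int set_udp_rcvbuf(int sock, int want)
{
    int got = 0;
    socklen_t len = sizeof(got);

    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
        return -1;              /* some systems reject oversized requests */
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
        return -1;
    return got;                 /* may differ from `want` */
}
```

Comparing the returned value with the requested one shows how many datagrams the socket can really hold before overflow starts dropping them.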
A TCP Source and Sink
(listings: tcpsource.c, tcpsink.c)
- Options: -s sndsz -b sndbufsz -c blks
- Set TCP send buffer size
- Set TCP receive buffer size
Comparison of TCP and UDP Performance
- On a LAN, UDP outperformed TCP by about 20%.
  - However, UDP lost many datagrams (no flow control).
- On the loopback interface (within the same host), TCP far outperformed UDP.
  - MTU of the local host: 16,384 B (BSD)
  - MTU of Ethernet: 1,500 B
Avoid Reinventing TCP
Any reasonably robust UDP application must provide
- Error recovery: retransmit a request if no response is received within the RTO
- Sequencing: ensure that replies are matched correctly to requests
- Flow control: if the server's reply can consist of multiple datagrams, prevent overflow of the client's receive buffer
Providing all of this amounts to rewriting TCP.
Because TCP is processed in the kernel, it is actually faster than a reliable protocol implemented in the application.
When to Use UDP instead of TCP
Advantages of UDP
- supports broadcasting and multicasting
- no overhead for connection setup or teardown
  - UDP requires 2 packets to exchange a request and a reply
  - TCP requires about 10 packets, assuming a new TCP connection is established for each request-reply exchange
Features of TCP that are not provided by UDP
- positive ACKs, retransmission of lost packets, duplicate packet detection, sequencing of packets
- windowed flow control
- slow start and congestion avoidance
Recommendations for UDP usage
- must be used for broadcast or multicast applications
  - the desired level of error control must be added
- can be used for simple request-reply applications
  - error detection must be added
- should not be used for bulk data transfer
Connected UDP Socket
Call connect() only to communicate with exactly one peer
- The kernel just records the IP address and port # of the peer
With a connected UDP socket
- No need to specify the destination IP address and port # for output operations
  - write, send instead of sendto
- No need to verify the sender of a received response
  - read, recv instead of recvfrom
- Asynchronous errors are returned
- A connected UDP socket provides better performance
  - An unconnected UDP socket makes a temporary connection for each output (1/3 overhead)
- connect() may be called multiple times on a UDP socket, specifying a new IP address and port #
When communicating over UDP with a single designated peer, it is better to connect() as with TCP and then use send() and recv().
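
As an illustration of the connected UDP socket described above, here is a minimal sketch. The helper name `udp_connect` is an assumption, not something from the slides. After it returns, plain send() and recv() work on the socket.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create a UDP socket connected to ip:port so that
 * send()/recv() can replace sendto()/recvfrom(). Returns the socket or -1. */
int udp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in peer;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1 ||
        connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(s);               /* no packets are exchanged by connect() */
        return -1;
    }
    return s;                   /* kernel has recorded the peer address */
}
```

A caller would then write `send(s, buf, len, 0)` with no destination argument; datagrams arriving from any other address are not delivered to this socket.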
I/O Multiplexing
Filling the Pipe: Echo C/S
- In stop-and-wait mode
  - response time = RTT (round-trip time) + server's processing time (= 0)
  - sending 1000 lines in batch mode (with a file redirected to stdin) takes 1000 x response time
- Continuous transmission
  - fill the pipe
[Figure: TCP client and TCP server; the client reads stdin with fgets, sends with writen, receives with readline, and writes stdout with fputs]
Blocking I/O Model
I/O Multiplexing Model
Blocking I/O versus I/O Multiplexing
[Figure: Blocking I/O (read() blocks until the descriptor is ready) versus I/O Multiplexing (select() reports readiness, then read() is called)]
Usage of I/O Multiplexing
Client
- handles an interactive input and a socket
- handles multiple sockets at the same time
Server
- handles both a listening socket and its connected sockets
- handles both TCP and UDP
- handles multiple services and perhaps multiple protocols (e.g., inetd daemon)
Select Function
- Wait for any one of multiple events to occur, and wake up the process only when one or more of these events occurs or a specified amount of time has passed
  - wait forever: timeout = NULL
  - wait up to a fixed amount of time
  - polling (do not wait at all): timer value = 0
- readset, writeset, and exceptset may be changed when select returns
  - they must be set again before the next call that tests file descriptors for readiness
[Figure: select tests the readset, writeset, and exceptset for readiness to read, write, and exception handling]
#include <sys/time.h>    /* UNIX */
#include <sys/select.h>  /* UNIX */
#include <unistd.h>      /* UNIX */
#include <winsock2.h>    /* Windows */
int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout);
Returns: count of ready descriptors if positive, 0 on timeout, -1 on error
Note: select is not part of the socket API.
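
The three timeout modes listed above can be sketched with one small wrapper. This is an illustrative helper (the name `wait_readable` and the millisecond convention are assumptions), not code from the slides.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable.
 *   timeout_ms <  0 : wait forever (timeout == NULL)
 *   timeout_ms == 0 : poll, do not wait at all
 *   timeout_ms >  0 : wait at most that many milliseconds
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rset;
    struct timeval tv, *tvp = NULL;

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        tvp = &tv;
    }
    /* only one descriptor is watched, so the ready count is 0 or 1 */
    return select(fd + 1, &rset, NULL, NULL, tvp);
}
```

Because select is a general descriptor call, the same helper works on pipes and terminals as well as sockets.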
How to Manipulate Descriptor Sets
- Descriptor sets (fd_set): array of integers (up to FD_SETSIZE descriptors)
  - if fdset == NULL, there is no interest in that condition
  - caution: value-result arguments
- Macros: FD_ZERO, FD_SET, FD_CLR, FD_ISSET
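
The value-result behaviour of the descriptor sets can be seen in a short sketch. The function name `poll_two` and its bit-mask return value are assumptions made for this example. FD_ZERO and FD_SET build the set before the call, and FD_ISSET tests what select left in it; FD_CLR, not used here, removes a single descriptor.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: poll two descriptors for readability without
 * blocking. Returns a bit mask (bit 0: fd_a readable, bit 1: fd_b
 * readable), or -1 on error. */
int poll_two(int fd_a, int fd_b)
{
    fd_set rset;
    struct timeval zero = {0, 0};
    int maxfdp1 = (fd_a > fd_b ? fd_a : fd_b) + 1;
    int mask = 0;

    FD_ZERO(&rset);             /* the set is rebuilt before every call */
    FD_SET(fd_a, &rset);
    FD_SET(fd_b, &rset);
    if (select(maxfdp1, &rset, NULL, NULL, &zero) < 0)
        return -1;
    /* select() overwrote rset: only ready descriptors are still set */
    if (FD_ISSET(fd_a, &rset))
        mask |= 1;
    if (FD_ISSET(fd_b, &rset))
        mask |= 2;
    return mask;
}
```

Reusing `rset` for a second select call without the FD_ZERO/FD_SET steps would silently stop watching any descriptor that was not ready the first time.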
Conditions for Descriptor Ready
A socket is ready for reading (readable) if
- data in the socket receive buffer >= low-water mark SO_RCVLOWAT (default 1)
- the read half of the connection is closed (TCP has received a FIN)
- it is a listening socket and the # of completed connections > 0
- a socket error is pending
A socket is ready for writing (writable) if
- available space in the socket send buffer >= low-water mark SO_SNDLOWAT (default 2048 for TCP and UDP)
- the write half of the connection is closed: write() will generate SIGPIPE and return the error EPIPE
- a socket using nonblocking connect has completed the connection, or the connect has failed
- a socket error is pending
A socket has an exception condition pending if
- there is out-of-band data, or the socket is still at the out-of-band mark
Echo client - using I/O Multiplexing (UNP)
- On EOF on input the client closes the socket, so the remaining replies can no longer be read from the socket
Echo client (using shutdown) (UNP)
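
The difference shutdown makes can be sketched as follows. This is a simplified illustration rather than the UNP listing, and the helper name `send_then_drain` is an assumption. It sends the last request, closes only the write half, and keeps reading replies until the peer's EOF.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send the final request, half-close, then drain
 * the replies. Returns the number of reply bytes read before the peer's
 * EOF (or before the buffer filled up), or -1 on error. */
ssize_t send_then_drain(int sock, const char *req, size_t len,
                        char *reply, size_t cap)
{
    size_t total = 0;
    ssize_t n = 0;

    if (send(sock, req, len, 0) != (ssize_t)len)
        return -1;
    if (shutdown(sock, SHUT_WR) < 0)    /* FIN is sent, read half stays open */
        return -1;
    while (total < cap &&
           (n = recv(sock, reply + total, cap - total, 0)) > 0)
        total += (size_t)n;
    return (n < 0) ? -1 : (ssize_t)total;
}
```

With close() in place of shutdown(), the recv loop would not be possible at all, which is the problem the first echo client has.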
TCP Echo Server - I/O Multiplexing (UNP)
Server Programming using Concurrent Processes
A Server Must Support Multiple Clients at the Same Time!
1. The server must accept connection requests (SYN) from multiple clients at the same time.
   - A listening socket has a connection completion queue
   - listen(listenfd, …): declares the queue size and turns the socket into a listening socket
   - connfd = accept(listenfd, …): takes from the queue a connected socket whose connection has completed
2. The server must process the data (requests) sent by multiple connected clients and respond without delay.
   - Iterative server
     - after connecting, handles the client's requests one after another (in a loop)
     - hard to support clients that stay connected for a long time
   - Concurrent server
     - using concurrent processes
     - using multiple threads
     - other methods: I/O multiplexing, non-blocking I/O
UNIX Process Creation
- fork: create a new process
- exec: replace the current process image with a new executable file
Typical Concurrent Servers
[Figure: client and server sockets (listenfd, connfd) in four stages: before accept, after return from accept, after fork returns, after the sockets are closed]
TCP Echo Server (UNP)
tcpcliserv/tcpserv01.c, lib/str_echo.c
[Figure: the parent TCP server accepts each client's connect and forks a child; each TCP client and its child server exchange lines with writen and readline]
Realize that TCP is a Reliable Protocol, Not an Infallible Protocol
- As long as the connection between the two peers is maintained, TCP guarantees ordered and uncorrupted delivery.
- An application may send data without knowing that communication is impossible, so data can fail to be delivered to the destination.
  - TCP can confirm that the peer TCP is reachable only by sending data (and receiving the ACK)
  - Otherwise it can detect the loss of communication only after two hours or more without any data exchange
  - So something must be exchanged periodically even when there is no data to send: a heartbeat mechanism is needed
- An application learns that communication is impossible only when send()/recv() returns an error, and must then handle the failure.
- Cases where communication is impossible (the connection is not actually maintained)
  - Network outage (due to router or link failure)
  - Peer app crashes
  - Peer host crashes
Network Outage
Inside TCP
- If no ACK arrives after a segment is sent, TCP retransmits it 12 times (taking about 9 minutes)
- If an ACK is still not received, it sets the socket pending error (ETIMEDOUT)
Inside IP/ICMP
- If an IP datagram cannot be forwarded (because of a router or link failure), an ICMP host unreachable/network unreachable message is sent to the source
- When the source IP receives this message, it sets the socket pending error (ENETUNREACH/EHOSTUNREACH)
Socket Pending Error
- A failure return from send() means only that writing to the send buffer failed
  - that the data actually sent cannot be delivered to the peer becomes known only much later
- When such an error occurs, the kernel leaves it pending on the corresponding socket
- When a socket API call is then made, the call returns an error with errno set to the pending error
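
A pending error can also be fetched explicitly instead of waiting for the next send() or recv() to fail. The sketch below uses the standard SO_ERROR socket option; the helper name `pending_error` is an assumption made for illustration.

```c
#include <sys/socket.h>

/* Hypothetical helper: fetch (and thereby clear) the pending error on a
 * socket. Returns 0 if no error is pending, the errno value such as
 * ETIMEDOUT or EHOSTUNREACH if one is, or -1 if the query itself fails. */
int pending_error(int sock)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(sock, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

This is typically called after select reports the socket readable or writable, since a pending error is one of the conditions that makes a socket ready.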
Peer App Crashes
When the peer app crashes (is killed)
1. exit() is called implicitly
2. close() is called in exit()
3. The peer TCP sends a FIN
When the peer app crashes, the local app is
- in recv(): returns 0
  - terminate through the normal procedure
- in send(): normal return
  - but the sent data is lost: the peer TCP answers with a RESET (no connection!) and the data is not delivered to the peer app
  - the local connection is forcibly closed and an error is pending
  - the next send()/recv() returns an error (ECONNRESET)
  - a later send() returns an error (EPIPE)
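
The EOF and EPIPE cases above can be observed locally with a small sketch. The helper names are assumptions. It is meant to be tried on a UNIX-domain socketpair, which stands in for TCP here; on a real TCP connection the first send() after the peer's FIN returns normally and the error shows up only on a later call, as the slide describes.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if recv() reports EOF (the peer closed),
 * 0 otherwise. Consumes one byte if data is available. */
int peer_sent_eof(int sock)
{
    char c;
    return recv(sock, &c, 1, 0) == 0;
}

/* Hypothetical helper: try to send one byte and report the outcome.
 * Returns 0 on success or the errno value (for example EPIPE) on failure. */
int try_send(int sock)
{
    signal(SIGPIPE, SIG_IGN);   /* get an EPIPE error instead of being killed */
    if (send(sock, "x", 1, 0) < 0)
        return errno;
    return 0;
}
```

Ignoring SIGPIPE matters here: with the default disposition, the failing send() would terminate the process before it could inspect errno.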
Ways to Detect Various TCP Conditions
Peer app crashes – an example
(listings: tcprw.c, count.c; the peer process is killed during the example)
Peer Host Crashes
Local app
- Connection set-up
- send(): normal return
  - but the sent data is lost: the peer host has crashed, so there is no TCP/IP protocol there and no TCP response
  - an error is pending (ETIMEDOUT) after retransmitting 12 times (9 min)
  - or an error is pending (EHOSTUNREACH or ENETUNREACH) by ICMP
Remember that TCP/IP is Not Polled
- No notification when connectivity is lost
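Heartbeats
- Without sending data, there is no way to know whether the peer is reachable
  - data sent while communication is impossible is lost (what if money was transferred but never received?)
  - the peer must be checked periodically, separately from the data exchange, to see whether it is alive: a heartbeat is needed
- When the client and server exchange several message types
  - assign a new type to the heartbeat message
- When the client and server exchange a byte stream
  - heartbeats cannot be distinguished from data
  - set up a separate TCP connection for heartbeats to tell them apart
  - or exchange heartbeats as OOB messages (set the OOB flag in send()/recv())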
Heartbeat Client – msg type
heartbeat.h:
hb_client.c:
Heartbeat Server – msg type
hb_server.c:
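
The message-type approach can be sketched with a fixed header that carries a type field. The layout, the type codes, and the function names below are assumptions made for illustration; they are not the contents of heartbeat.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

enum { MSG_DATA = 1, MSG_HEARTBEAT = 2 };   /* hypothetical type codes */

/* Hypothetical fixed header sent in front of every message. */
struct msg_hdr {
    uint32_t type;      /* network byte order on the wire */
    uint32_t len;       /* payload length, network byte order */
};

/* Write a heartbeat (header only, no payload) into buf.
 * Returns the number of bytes written. */
size_t make_heartbeat(unsigned char *buf)
{
    struct msg_hdr h;

    h.type = htonl(MSG_HEARTBEAT);
    h.len = htonl(0);
    memcpy(buf, &h, sizeof(h));
    return sizeof(h);
}

/* Return 1 if the header at buf announces a heartbeat, 0 otherwise. */
int is_heartbeat(const unsigned char *buf)
{
    struct msg_hdr h;

    memcpy(&h, buf, sizeof(h));
    return ntohl(h.type) == MSG_HEARTBEAT;
}
```

A receiver would read one header, check `is_heartbeat`, and either reset its liveness timer or go on to read `len` bytes of payload. Since TCP is a byte stream, the header has to be read in full before it is interpreted.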
Heartbeat Client – separate connection
hb_client2.c:
Heartbeat Server – separate connection
hb_server2.c: