Title: TCP Input

Page 1: TCP Input

TCP/IP Illustrated, Vol. 2

TCP Input

June 13, 2005 (Mon.), 한민규

[email protected]

Page 2: TCP Input

Content

• Introduction
• Preliminary Processing
• Header Prediction
• ACK Processing

Page 3: TCP Input

Introduction

The tcp_input function is called by ipintr when a datagram is received with a protocol field of TCP.

Page 4: TCP Input

Introduction (Cont'd)

1. Validate the input segment and locate the PCB for this connection.

2. The term "drop" means to drop the segment being processed, not the connection. When an RST is sent by dropwithreset, however, it normally causes the connection to be dropped.

[Figure: paths through tcp_input, labeled "fast path", "slow path", and "normal flow"]

Page 5: TCP Input

Preliminary Processing

Declarations and preliminary processing

If the first mbuf in the chain holds fewer bytes than the combined IP/TCP header (40 bytes), m_pullup moves the first 40 bytes into the first mbuf.

Code annotations: update the TCP statistics; convert the mbuf data pointer to a tcpiphdr *.

Page 6: TCP Input

Preliminary Processing

tlen is the TCP length: the number of bytes following the IP header.

Data offset: this 4-bit field counts 32-bit words, so each unit stands for 4 bytes; it gives the total length of the TCP header.
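The data offset arithmetic can be sketched in C as follows. This is a minimal illustration, not the kernel code: the helper names tcp_header_bytes and tcp_offset_ok are hypothetical, and the bounds check (at least the 20-byte fixed header, at most tlen) is an assumption about what a sanity check would look like.

```c
#include <stdint.h>

/* Hypothetical helper: convert the 4-bit data offset field, which counts
 * 32-bit words, into a TCP header length in bytes. */
static unsigned tcp_header_bytes(uint8_t data_offset)
{
    return (unsigned)(data_offset & 0x0f) << 2;   /* words * 4 */
}

/* Hypothetical check: the header must cover at least the 20-byte fixed
 * part and cannot be longer than tlen, the bytes following the IP header. */
static int tcp_offset_ok(uint8_t data_offset, unsigned tlen)
{
    unsigned off = tcp_header_bytes(data_offset);
    return off >= 20 && off <= tlen;
}
```

A data offset of 5 gives the minimum 20-byte header; the maximum value 15 gives 60 bytes, leaving room for 40 bytes of options.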

Page 7: TCP Input

Preliminary Processing

If all three conditions are true, ts_present is set to 1.

The benefit of recognizing the timestamp option this way is to avoid calling the general option processing function, tcp_dooptions, later in the code.
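The slide does not spell out the three conditions. The sketch below assumes they are the ones commonly described for this fast test: the options are exactly 12 bytes (or longer with an end-of-list byte at offset 12), the first four option bytes are NOP, NOP, kind 8, length 10, and the SYN flag is off. The function name fast_timestamp and the helper get_be32 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TH_SYN              0x02
#define TCPOPT_EOL          0
#define TCPOLEN_TSTAMP_APPA 12            /* NOP, NOP, kind, len, two 4-byte values */
#define TCPOPT_TSTAMP_HDR   0x0101080aUL  /* NOP NOP TIMESTAMP(8) LEN(10) */

/* Read a 32-bit big-endian (network byte order) value. */
static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 and fills ts_val/ts_ecr when the fast test matches; returns 0
 * when the options must go through general option processing instead. */
static int fast_timestamp(const uint8_t *optp, size_t optlen, uint8_t flags,
                          uint32_t *ts_val, uint32_t *ts_ecr)
{
    int len_ok = optlen == TCPOLEN_TSTAMP_APPA ||
                 (optlen > TCPOLEN_TSTAMP_APPA &&
                  optp[TCPOLEN_TSTAMP_APPA] == TCPOPT_EOL);

    if (len_ok && get_be32(optp) == TCPOPT_TSTAMP_HDR &&
        (flags & TH_SYN) == 0) {
        *ts_val = get_be32(optp + 4);
        *ts_ecr = get_be32(optp + 8);
        return 1;                         /* ts_present = 1 */
    }
    return 0;
}
```

The point of the design is that one length check and one 32-bit compare cover the layout that most timestamped segments use, so the option-parsing loop is skipped on the common path.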

Page 8: TCP Input

Preliminary Processing

The two 16-bit port numbers are left in network byte order.

TCP maintains a one-behind cache (tcp_last_inpcb) containing the address of the PCB for the last received TCP segment.

If the PCB was not found, the input segment is dropped and an RST is sent as a reply.

If the PCB exists but a corresponding TCP control block does not, the socket is probably being closed, so the input segment is dropped and an RST is sent as a reply.

Page 9: TCP Input

Preliminary Processing

The socket has had listen() called on it

tcp_saveti: the saved headers become arguments to tcp_trace when it is called at the end of the function.

When a segment arrives for a listening socket, a new socket is created by sonewconn.

Compute the window scale factor: the RFC 793 window field is 16 bits (64 KB at most); the window scale option, carried with the SYN, extends it.
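One way to compute the scale factor to request is sketched below: pick the smallest shift that lets a 16-bit window field cover the receive buffer. The function name request_scale and the parameter rcv_hiwat (the receive buffer's high-water mark) are assumptions for illustration; the limit of 14 is the maximum shift the window scale option allows.

```c
#include <stdint.h>

#define TCP_MAXWIN       65535u   /* largest value of the 16-bit window field */
#define TCP_MAX_WINSHIFT 14       /* largest shift the window scale option permits */

/* Hypothetical helper: smallest shift such that (TCP_MAXWIN << shift)
 * is at least as large as the receive buffer size. */
static unsigned request_scale(uint32_t rcv_hiwat)
{
    unsigned shift = 0;

    while (shift < TCP_MAX_WINSHIFT &&
           ((uint32_t)TCP_MAXWIN << shift) < rcv_hiwat)
        shift++;
    return shift;
}
```

A buffer that already fits in 16 bits needs no scaling, so the result is 0; a 256 KB buffer yields a shift of 3.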

Page 10: TCP Input

Preliminary Processing

t_idle is set to 0, since a segment has been received on the connection. The keepalive timer is also reset to 2 hours.

If options are present in the TCP header, and the connection state is not LISTEN, tcp_dooptions processes the options.

Page 11: TCP Input

tcp_dooptions Function

MSS option: if the length is not 4 (TCPOLEN_MAXSEG), or the segment does not have the SYN flag set, the option is ignored.

Window scale option

Page 12: TCP Input

Header Prediction

Header prediction helps unidirectional data transfer by handling the two common cases.

1. If TCP is sending data, the next expected segment for this connection is an ACK for outstanding data.

2. If TCP is receiving data, the next expected segment for this connection is the next in-sequence data segment.

Both cases are faster than the general processing.

1. Check whether the segment is the next one expected:
• The connection state must be ESTABLISHED
• The SYN, FIN, RST, and URG control flags must not be on
• If a timestamp option is present, ts_val >= ts_recent
• ti_seq == rcv_nxt
• tiwin (the advertised window) must be nonzero
• snd_nxt must equal the highest sequence number sent (snd_max)

2. If a timestamp option is present, update ts_recent from the received timestamp.
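The checks listed on this slide can be collected into a single predicate. The sketch below is a simplified model: struct conn and struct seg are hypothetical stand-ins for the control block and the received header, and two of the tests (the ACK flag must be on, and tiwin must equal the current send window) are assumptions that go beyond the list above.

```c
#include <stdint.h>

#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10
#define TH_URG 0x20
#define TCPS_ESTABLISHED 4

/* Timestamp comparison that tolerates 32-bit wraparound. */
#define TSTMP_GEQ(a, b) ((int32_t)((a) - (b)) >= 0)

struct conn {                 /* hypothetical subset of the control block */
    int      state;
    uint32_t rcv_nxt, snd_nxt, snd_max, snd_wnd, ts_recent;
};

struct seg {                  /* hypothetical view of the received segment */
    uint8_t  flags;
    uint32_t seq, win;
    int      ts_present;
    uint32_t ts_val;
};

/* Returns 1 when the segment qualifies for the header prediction path. */
static int header_predicted(const struct conn *tp, const struct seg *ti)
{
    return tp->state == TCPS_ESTABLISHED &&
           (ti->flags & (TH_SYN | TH_FIN | TH_RST | TH_URG | TH_ACK)) == TH_ACK &&
           (!ti->ts_present || TSTMP_GEQ(ti->ts_val, tp->ts_recent)) &&
           ti->seq == tp->rcv_nxt &&
           ti->win != 0 && ti->win == tp->snd_wnd &&   /* second test is an assumption */
           tp->snd_nxt == tp->snd_max;
}
```

Any failed test sends the segment down the slow path, so the predicate only has to be cheap, not exhaustive.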

Page 13: TCP Input

Header Prediction

Update RTT estimators

Delete acknowledged bytes from the send buffer

Stop the retransmit timer (when all outstanding data has been acknowledged)

Awaken waiting process: done if a process must be awakened when the send buffer is modified

Page 14: TCP Input

Header Prediction

Page 15: TCP Input

TCP Input: Slow Path Processing

We continue with the code that is executed if header prediction fails: the slow path through tcp_input.

win is set to the number of bytes available in the socket's receive buffer

Receive window setting
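The receive window setting can be sketched as below. It assumes the window is the larger of the free buffer space and the amount already advertised but not yet received (rcv_adv - rcv_nxt), so that a window once advertised is never taken back. The function name receive_window and the parameter sb_space are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: sb_space is the free space in the socket receive
 * buffer (it can be negative); rcv_adv is the highest sequence number
 * advertised so far; rcv_nxt is the next sequence number expected. */
static uint32_t receive_window(int32_t sb_space, uint32_t rcv_adv,
                               uint32_t rcv_nxt)
{
    int32_t win = sb_space < 0 ? 0 : sb_space;
    int32_t advertised = (int32_t)(rcv_adv - rcv_nxt);

    return (uint32_t)(win > advertised ? win : advertised);
}
```

With 500 bytes free but 1000 bytes still outstanding from an earlier advertisement, the window stays at 1000.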

Page 16: TCP Input

Initiation of Passive Open, Completion of Active Open

If the state is LISTEN or SYN_SENT, the expected segment is a SYN; any other received segment is dropped.

Drop if RST, ACK, or no SYN
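For the LISTEN state, the flag checks can be sketched as a small classifier. The enum and function names are hypothetical, and the detail that a segment carrying an ACK is answered with an RST (rather than dropped silently) is an assumption not stated on the slide.

```c
#include <stdint.h>

#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10

enum listen_action { DROP, DROP_WITH_RESET, PROCESS_SYN };

/* Hypothetical helper: decide what to do with a segment that arrives
 * for a socket in the LISTEN state, checking the flags in order. */
static enum listen_action listen_check(uint8_t flags)
{
    if (flags & TH_RST)
        return DROP;              /* ignore a reset */
    if (flags & TH_ACK)
        return DROP_WITH_RESET;   /* assumption: an ACK draws an RST */
    if ((flags & TH_SYN) == 0)
        return DROP;              /* anything without a SYN is dropped */
    return PROCESS_SYN;
}
```

Only a bare SYN falls through to connection setup.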

Page 17: TCP Input

Initiation of Passive Open, Completion of Active Open

TCP is defined only for unicast applications. Recall that the M_BCAST and M_MCAST flags were set by ether_input, based on the destination hardware address of the frame.

• Get an mbuf for the client's IP address and port
• Set the local address in the PCB
• Connect the PCB to the peer

Page 18: TCP Input

Complete processing of SYN received in LISTEN state

Initialize sequence number variables in the control block

• The TF_ACKNOW flag is set, since the ACK of a SYN is not delayed
• The connection state becomes SYN_RCVD
• The connection-establishment timer is set to 75 seconds (TCPTV_KEEP_INIT)
• tcp_output will be called

Page 19: TCP Input

Completion of Active Open

TCP is expecting to receive a SYN

• tcp_sendseqinit sets all four of these variables to 365

Acceptable ACK and RST

Page 20: TCP Input

Process received SYN in response to an active open

• Since data can arrive for a connection before the connection is established, any such data is now placed in the receive buffer by calling tcp_reass

• If the SYN that is ACKed was being timed, tcp_xmit_timer initializes the RTT estimators based on the measured RTT for the SYN

Active open complete

• Simultaneous open

Page 21: TCP Input

Simultaneous open

• If the amount of data in the segment is greater than the receive window, the excess data is dropped by m_adj()

Page 22: TCP Input

PAWS: Protection Against Wrapped Sequence Numbers

Page 23: TCP Input

Trim Segment so Data is Within Window

• Duplicate data at the beginning of the received segment is discarded
• Data that is beyond the end of the window is discarded from the end of the segment

• The duplicate bytes have already been acknowledged and passed to the application
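Trimming duplicate data from the front of a segment can be sketched as follows. This is a simplified model that ignores the SYN, FIN, and URG adjustments a real implementation has to make; struct trimmed and trim_front are hypothetical names.

```c
#include <stdint.h>

struct trimmed {          /* hypothetical result: what is left of the segment */
    uint32_t seq;         /* sequence number of the first remaining byte */
    uint32_t len;         /* number of remaining data bytes */
};

/* Discard bytes at the front of the segment that lie before rcv_nxt.
 * The subtraction is done modulo 2^32 so sequence wraparound is handled. */
static struct trimmed trim_front(uint32_t seq, uint32_t len, uint32_t rcv_nxt)
{
    struct trimmed t = { seq, len };
    int32_t todrop = (int32_t)(rcv_nxt - seq);

    if (todrop > 0) {
        if ((uint32_t)todrop > len)
            todrop = (int32_t)len;        /* segment is entirely duplicate */
        t.seq += (uint32_t)todrop;
        t.len -= (uint32_t)todrop;
    }
    return t;
}
```

A segment covering bytes 4 through 13 that arrives when rcv_nxt is 7 keeps bytes 7 through 13; a segment that is entirely old ends up with a length of 0.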

Page 24: TCP Input

Handle completely duplicate segment

[Figure: relative positions of rcv_nxt and ti_seq]

This normally occurs when the other end did not receive our ACK, causing it to retransmit the segment.

Page 25: TCP Input

Handle data that arrives after the process terminates

• If the socket has no descriptor referencing it, the process has closed the connection
• The segment is then dropped and an RST is output

Page 26: TCP Input

Calculate number of bytes beyond right edge of window

• todrop would be (6+5) - (4+6) = 1
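The arithmetic on this slide can be reproduced with a short helper. Reading the example as ti_seq = 6, ti_len = 5, rcv_nxt = 4, and rcv_wnd = 6 is an assumption based on how the terms are grouped; the function name bytes_beyond_window is hypothetical.

```c
#include <stdint.h>

/* Number of bytes of the segment that fall past the right edge of the
 * receive window. A result of zero or less means the segment ends at or
 * inside the window. Unsigned arithmetic handles sequence wraparound. */
static int32_t bytes_beyond_window(uint32_t seq, uint32_t len,
                                   uint32_t rcv_nxt, uint32_t rcv_wnd)
{
    return (int32_t)((seq + len) - (rcv_nxt + rcv_wnd));
}
```

With the values above, the segment ends at sequence 11 while the window ends at 10, so one byte is beyond the right edge.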

Page 27: TCP Input

Remove data beyond right edge of window

• Check for a new incarnation of a connection in the TIME_WAIT state
 - the SYN flag is set
 - the connection is in the TIME_WAIT state
 - the new starting sequence number is greater than the final sequence number for the connection

• This is allowed by RFC 1122
 - the ISS for the new connection must be greater than the last sequence number used (rcv_nxt). TCP adds 128,000 (TCP_ISSINCR), which becomes the ISS

• Check for a probe of a closed window

• Drop other segments that are completely outside the window
• The data to the right of the window is discarded from the mbuf chain by m_adj, and ti_len is updated

Page 28: TCP Input

When to Drop an ACK

• In an actual scenario, when both ends of a connection had a hole in the data on the reassembly queue and both ends entered the persist state, the connection became deadlocked as both ends threw away perfectly good ACKs

Page 29: TCP Input

Self-Connects and Simultaneous Opens

Page 30: TCP Input

Record Timestamp

Page 31: TCP Input

Process RST flag

• SYN_RCVD state
 - Normally it is entered from the LISTEN state
 - The other end sent its SYN and then terminated before the reply arrived, causing it to send an RST
 - This state can also be entered by a simultaneous open, after a process has called connect

• Other states
 - The receipt of an RST in the ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, or CLOSE_WAIT states returns the error ECONNRESET

Page 32: TCP Input

ACK Processing

Page 33: TCP Input

Received ACK in SYN_RCVD state

[Figure: ti_ack relative to snd_una and snd_max, with the labels "available window size" and "accept, select"]

Page 34: TCP Input

Fast Retransmit and Fast Recovery

The fast retransmit algorithm occurs when TCP deduces from a small number (normally 3) of consecutive duplicate ACKs that a segment has been lost, and deduces the starting sequence number of the missing segment.

The fast recovery algorithm says that after the fast retransmit algorithm (that is, after the missing segment has been retransmitted), congestion avoidance, but not slow start, is performed.

Page 35: TCP Input

Check for completely duplicate ACK

• snd_una < acknowledgment field <= snd_max
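The inequality above has to be evaluated with modular comparisons, because sequence numbers wrap at 2^32. A minimal sketch, assuming the usual signed-difference comparison macros; the enum and the function classify_ack are hypothetical names.

```c
#include <stdint.h>

/* Sequence number comparisons that tolerate 32-bit wraparound. */
#define SEQ_LEQ(a, b) ((int32_t)((a) - (b)) <= 0)
#define SEQ_GT(a, b)  ((int32_t)((a) - (b)) > 0)

enum ack_class { ACK_DUPLICATE, ACK_ACCEPTABLE, ACK_TOO_HIGH };

/* Classify the acknowledgment field against snd_una and snd_max. */
static enum ack_class classify_ack(uint32_t ack, uint32_t snd_una,
                                   uint32_t snd_max)
{
    if (SEQ_LEQ(ack, snd_una))
        return ACK_DUPLICATE;     /* acknowledges nothing new */
    if (SEQ_GT(ack, snd_max))
        return ACK_TOO_HIGH;      /* acknowledges data never sent */
    return ACK_ACCEPTABLE;        /* snd_una < ack <= snd_max */
}
```

The last test case below exercises wraparound: with snd_una just below 2^32 and snd_max just above 0, an ACK of 5 is still acceptable.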

Page 36: TCP Input

Duplicate ACK processing

• t_dupacks equals 3 (tcprexmtthresh): congestion avoidance is performed and the missing segment is retransmitted
• t_dupacks exceeds 3: increase the congestion window and perform normal TCP output
• t_dupacks is less than 3: do nothing

Code annotations: set snd_nxt; set the congestion window; the number of consecutive duplicate ACKs exceeds the threshold of 3.
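The three cases can be sketched as a small state update. The formulas used here are assumptions drawn from the classic BSD-style scheme (ssthresh becomes half of the smaller of the send and congestion windows, rounded to whole segments with a floor of two; the congestion window is then inflated by one segment per duplicate ACK). struct cc and on_duplicate_ack are hypothetical names, and the retransmission itself is only signaled through the return value.

```c
#include <stdint.h>

#define TCPREXMTTHRESH 3

struct cc {                       /* hypothetical subset of the control block */
    uint32_t snd_wnd, snd_cwnd, snd_ssthresh, maxseg;
    int      dupacks;
};

/* Process one duplicate ACK. Returns 1 when the missing segment should
 * be retransmitted now (the third duplicate), otherwise 0. */
static int on_duplicate_ack(struct cc *tp)
{
    tp->dupacks++;

    if (tp->dupacks == TCPREXMTTHRESH) {
        uint32_t smaller = tp->snd_wnd < tp->snd_cwnd ? tp->snd_wnd
                                                      : tp->snd_cwnd;
        uint32_t win = smaller / 2 / tp->maxseg;   /* in whole segments */

        if (win < 2)
            win = 2;
        tp->snd_ssthresh = win * tp->maxseg;
        /* After the retransmission, cwnd is ssthresh plus one segment
         * for each duplicate ACK received. */
        tp->snd_cwnd = tp->snd_ssthresh + tp->maxseg * (uint32_t)tp->dupacks;
        return 1;
    }
    if (tp->dupacks > TCPREXMTTHRESH)
        tp->snd_cwnd += tp->maxseg;   /* another segment has left the network */
    return 0;                         /* fewer than 3: do nothing */
}
```

Starting from a 4096-byte congestion window with 512-byte segments, the third duplicate ACK sets ssthresh to 2048 and the congestion window to 3584; each later duplicate adds 512.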

Page 37: TCP Input

Value of cwnd and send sequence while data is being transmitted

Page 38: TCP Input

ACK Processing

Congestion window reset

Check for out-of-range ACK (acceptable ACK)

Page 39: TCP Input

RTT measurements and retransmission timer

Based on delayed ACK

ts_ecr: timestamp echo reply

needoutput = 1: this flag forces a call to tcp_output at the end of the function

Page 40: TCP Input

Open congestion window in response to ACKs
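The slide gives only the heading, so the following is a sketch under stated assumptions: below ssthresh the window grows by one segment per ACK (slow start), above it by maxseg*maxseg/cwnd per ACK (congestion avoidance), and the result is capped at the largest window the scale factor can express. The function name open_cwnd and its parameters are hypothetical.

```c
#include <stdint.h>

#define TCP_MAXWIN 65535u   /* largest value of the 16-bit window field */

/* Return the new congestion window after one ACK of new data. */
static uint32_t open_cwnd(uint32_t cwnd, uint32_t ssthresh,
                          uint32_t maxseg, unsigned snd_scale)
{
    uint32_t incr = maxseg;                         /* slow start */
    uint32_t limit = (uint32_t)TCP_MAXWIN << snd_scale;

    if (cwnd > ssthresh)
        incr = incr * incr / cwnd;                  /* congestion avoidance */
    return cwnd + incr < limit ? cwnd + incr : limit;
}
```

With 512-byte segments, a 1024-byte window below ssthresh grows to 1536, while an 8192-byte window above ssthresh grows by only 32 bytes.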

Page 41: TCP Input

Remove acknowledged data from send buffer

Remove from the send buffer

/* actual chars in buffer */

Page 42: TCP Input

Receipt of ACK in FIN_WAIT_1 state

The state in which one host has sent the initial FIN-ACK segment of the connection termination sequence

Page 43: TCP Input

Receipt of ACK in CLOSING state

Both ends have sent a FIN (simultaneous close): the peer's FIN has been received, but our own FIN has not yet been acknowledged

Page 44: TCP Input

Receipt of ACK in LAST_ACK state

The ACK for the received FIN-ACK has been sent; the connection is waiting for the final ACK

Page 45: TCP Input

Receipt of ACK in TIME_WAIT state

The TCPs on both hosts have exchanged FIN-ACKs and the replies to them, completing the connection termination sequence. In this state TCP waits for twice the maximum segment lifetime (default 120 seconds) before the connection's port number can be reused.

Page 46: TCP Input

Update window information

• snd_wl1 records the sequence number of the last segment used to update the send window
• snd_wl2 records the acknowledgment number of the last segment used to update the send window

[Figure: two panels showing snd_una, snd_nxt, snd_wnd, ti_seq, snd_wl1, snd_wl2, and ti_ack]

needoutput is set to 1, since the new value of snd_wnd might enable a segment to be sent
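The role of snd_wl1 and snd_wl2 can be sketched as the test that decides whether a segment's advertised window replaces snd_wnd. The exact ordering rule below (newer sequence number, or same sequence number with a newer ACK, or same both with a larger window) is an assumption based on the conventional form of this check; struct sndwin and update_send_window are hypothetical names.

```c
#include <stdint.h>

/* Sequence number comparison that tolerates 32-bit wraparound. */
#define SEQ_LT(a, b) ((int32_t)((a) - (b)) < 0)

struct sndwin {                   /* hypothetical subset of the control block */
    uint32_t snd_wnd;             /* current send window */
    uint32_t snd_wl1;             /* sequence number of last window update */
    uint32_t snd_wl2;             /* ACK number of last window update */
};

/* Returns 1 (needoutput) when the send window was updated, else 0. */
static int update_send_window(struct sndwin *tp, uint32_t seq,
                              uint32_t ack, uint32_t win)
{
    if (SEQ_LT(tp->snd_wl1, seq) ||
        (tp->snd_wl1 == seq &&
         (SEQ_LT(tp->snd_wl2, ack) ||
          (tp->snd_wl2 == ack && win > tp->snd_wnd)))) {
        tp->snd_wnd = win;
        tp->snd_wl1 = seq;
        tp->snd_wl2 = ack;
        return 1;
    }
    return 0;                     /* old segment: keep the current window */
}
```

Recording both numbers keeps a delayed, reordered segment from overwriting a newer window advertisement.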

Page 47: TCP Input

Urgent mode processing

• The URG flag is ignored in the CLOSE_WAIT, CLOSING, LAST_ACK, and TIME_WAIT states
• If the urgent offset plus the number of bytes already in the receive buffer exceeds the maximum size of a socket buffer, the urgent notification is ignored

Page 48: TCP Input

Processing of received urgent pointer

A new urgent pointer has been received

Page 49: TCP Input

Place out-of-band byte into t_iobc

Page 50: TCP Input

Merge received data into sequencing queue for socket

Page 51: TCP Input

FIN Processing (first half)

/* can't receive more data from peer */

Page 52: TCP Input

FIN Processing (second half)

• TIME_WAIT state
 - If a FIN arrives in the TIME_WAIT state, it is a duplicate; the TIME_WAIT timer is restarted with a value of twice the MSL

Page 53: TCP Input

Final Processing

Page 54: TCP Input