ch. 7 : internet transport protocols
Ch. 7 : Internet Transport Protocols
Transport Layer
Our goals: understand the principles behind transport-layer services:
• multiplexing / demultiplexing the data streams of several applications
• reliable data transfer
• flow control
• congestion control
Chapter 6: rdt principles
Chapter 7:
• multiplexing / demultiplexing
• Internet transport-layer protocols:
  • UDP: connectionless transport
  • TCP: connection-oriented transport
    • connection setup
    • data transfer
    • flow control
    • congestion control
Transport vs. network layer
The transport layer uses network-layer services and adds more value to these services.
Transport layer | Network layer
logical communication between processes | logical communication between hosts
exists only in hosts | exists in hosts and in routers
ignores the network | routes data through the network
port #s used for routing in the destination computer | IP addresses used for routing in the network
Multiplexing & Demultiplexing
Multiplexing/demultiplexing
[Figure: host 1, host 2 and host 3, each with an application / transport / network / link / physical stack; processes P1 to P4 attach to the transport layer through sockets.]
Multiplexing at send host: gather data from multiple sockets, envelop the data with headers (later used for demultiplexing), pass to L3.
Demultiplexing at rcv host: receive segment from L3, deliver each received segment to the correct socket.
How demultiplexing works
• host receives IP datagrams
  • each datagram has a source IP address and a destination IP address in its header
  • each datagram carries one transport-layer segment
  • each segment has a source and a destination port number in its header
• host uses port numbers, and sometimes also IP addresses, to direct the segment to the correct socket
• from the socket, the data gets to the relevant application process
[Figure: TCP/UDP segment format, 32 bits wide: source port #, dest port #, other header fields, application data (message). The L4 header and the application message travel inside an IP datagram whose L3 header carries the source IP addr, the dest IP addr and other IP header fields.]
Connectionless demultiplexing (UDP)
• Processes create sockets with port numbers
• A UDP socket is identified by a pair of numbers: (my IP address, my port number)
• The client decides to contact a server (peer IP address) and an application (peer port #)
• The client puts those into the UDP packet it sends; they are written as:
  • dest IP address: in the IP header of the packet
  • dest port number: in its UDP header
• When the server receives a UDP segment, it:
  • checks the destination port number in the segment
  • directs the UDP segment to the socket with that port number (packets from different remote sockets are directed to the same socket)
• The UDP message waits in the socket queue and is processed in its turn
• The answer message is sent to the client UDP socket (listed in the Source fields of the query packet)
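The port-based lookup described for UDP can be sketched in a few lines of Python. This is a toy model, not a real network stack: the `UdpHost` class, its `bind` and `deliver` methods and the payload strings are hypothetical names introduced only for illustration; the addresses and ports are the ones used in the deck's example (clients A and B, server C on port 53).

```python
from collections import deque


class UdpHost:
    """Toy model of UDP demultiplexing at one host (illustrative only)."""

    def __init__(self, ip):
        self.ip = ip
        # One queue per bound local port: the socket is found by port alone.
        self.sockets = {}

    def bind(self, port):
        """Create a socket (modelled as a queue) on the given local port."""
        self.sockets[port] = deque()

    def deliver(self, src_ip, src_port, dst_port, payload):
        """Direct a segment to the socket bound to dst_port.

        Returns False (segment dropped) if no socket is bound to that port.
        """
        queue = self.sockets.get(dst_port)
        if queue is None:
            return False
        # The source fields are kept: they are the "return address".
        queue.append((src_ip, src_port, payload))
        return True


server = UdpHost("C")
server.bind(53)
server.deliver("A", 9157, 53, b"query from A")
server.deliver("B", 5775, 53, b"query from B")

# Both clients end up in the same socket queue, processed in arrival order.
return_addresses = [(ip, port) for ip, port, _ in server.sockets[53]]
```

Both queries land in the single socket bound to port 53, and the stored source fields tell the server where each reply should go.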
Connectionless demux (cont)
[Figure: two clients query one server over UDP, then wait for the application and get the reply.]
• client socket at A: port=9157, IP=A
• client socket at B: port=5775, IP=B
• server socket at C: port=53, IP=C
Packets (UDP header and IP header fields):
• A to C: SP: 9157, DP: 53, S-IP: A, D-IP: C
• C to A: SP: 53, DP: 9157, S-IP: C, D-IP: A
• B to C: SP: 5775, DP: 53, S-IP: B, D-IP: C
• C to B: SP: 53, DP: 5775, S-IP: C, D-IP: B
SP = source port number, DP = destination port number, S-IP = source IP address, D-IP = destination IP address
SP and S-IP provide the “return address”
8
Connection-oriented demux (TCP)
• A TCP socket is identified by a 4-tuple:
  • local (my) IP address
  • local (my) port number
  • remote IP address
  • remote port number
• The receiving host uses all four values to direct the segment to the appropriate socket
• A server host may support many simultaneous TCP sockets:
  • each socket is identified by its own 4-tuple
• Web servers have a different socket for each connecting client
  • if you open two browser windows, you generate 2 sockets at each end
  • non-persistent HTTP will open a different socket for each request
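The 4-tuple lookup can be sketched the same way. The following Python model is an assumption-laden simplification: `TcpHost`, `listen` and `demux` are hypothetical names, and the three-way handshake is collapsed into "first segment from a new peer creates the working socket". It only shows that segments with the same destination port are kept apart by the remote IP address and remote port.

```python
class TcpHost:
    """Toy model of TCP demultiplexing by 4-tuple (illustrative only)."""

    def __init__(self, ip):
        self.ip = ip
        self.listening = set()   # ports that have a waiting socket
        # (local IP, local port, remote IP, remote port) -> received payloads
        self.connections = {}

    def listen(self, port):
        self.listening.add(port)

    def demux(self, src_ip, src_port, dst_port, payload):
        """Return the 4-tuple of the socket that receives this segment."""
        key = (self.ip, dst_port, src_ip, src_port)
        if key not in self.connections:
            if dst_port not in self.listening:
                return None      # nobody is waiting on this port
            # Simplification: stands in for accepting a new connection.
            self.connections[key] = []
        self.connections[key].append(payload)
        return key


server = TcpHost("C")
server.listen(80)
k1 = server.demux("A", 9157, 80, b"GET /a")
k2 = server.demux("B", 5775, 80, b"GET /b")
k3 = server.demux("B", 9157, 80, b"GET /c")   # same port as A, different IP
k4 = server.demux("A", 9157, 80, b"more")     # same connection as k1
```

All three clients talk to port 80 on C, yet the server ends up with three distinct sockets, one per 4-tuple.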
Connection-oriented demux (cont)
[Figure: client A, client B and server C; processes P1 to P6 each own one TCP socket.]
Socket pairs (client socket / matching server socket):
• client: LP=9157, L-IP=A, RP=80, R-IP=C / server: LP=80, L-IP=C, RP=9157, R-IP=A
• client: LP=5775, L-IP=B, RP=80, R-IP=C / server: LP=80, L-IP=C, RP=5775, R-IP=B
• client: LP=9157, L-IP=B, RP=80, R-IP=C / server: LP=80, L-IP=C, RP=9157, R-IP=B
Packets carrying a message to the server:
• SP: 9157, DP: 80, S-IP: A, D-IP: C
• SP: 5775, DP: 80, S-IP: B, D-IP: C
• SP: 9157, DP: 80, S-IP: B, D-IP: C
LP = local port, RP = remote port, L-IP = local IP, R-IP = remote IP
“L” = Local = My, “R” = Remote = Peer
UDP Protocol
UDP: User Datagram Protocol [RFC 768]
• simple transport protocol
• “best effort” service; UDP segments may be:
  • lost
  • delivered out of order to the application
  with no correction by UDP
• UDP will discard segments with a bad checksum, if so configured by the application
• connectionless:
  • no handshaking between UDP sender and receiver
  • each UDP segment is handled independently of the others
Why is there a UDP?
• no connection establishment: saves delay
• no congestion control: better delay & BW
• simple: small segment header
• typical usage: real-time applications
  • loss tolerant
  • rate sensitive
• other uses (why?):
  • DNS
  • SNMP
UDP segment structure
[Figure: UDP segment, 32 bits wide: source port #, dest port #; length, checksum; application data (variable length).]
• length: total length of the segment (bytes)
• checksum computed over:
  • the whole segment
  • part of the IP header:
    • both IP addresses
    • protocol field
    • total IP packet length
• checksum usage:
  • computed at the destination to detect errors
  • in case of error, UDP will discard the segment, or
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in the transmitted segment
Sender:
• treat the segment contents as a sequence of 16-bit integers
• checksum: 1’s complement of the 1’s complement sum of the segment contents
• sender puts the checksum value into the UDP checksum field
Receiver:
• compute the checksum of the received segment
• check if the computed checksum equals the checksum field value:
  • NO: error detected
  • YES: no error detected
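As a rough illustration of the 16-bit one's-complement arithmetic, here is a minimal Python sketch. It is deliberately incomplete as a UDP implementation: it checksums an arbitrary byte string and leaves out the IP pseudo-header fields; the function names are hypothetical. The sample words are the well-known worked example from RFC 1071.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total


def checksum16(data: bytes) -> int:
    """Checksum field value: the one's complement of the one's-complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def verify(data: bytes, checksum_field: int) -> bool:
    """Receiver side: recompute the checksum and compare with the field."""
    return checksum16(data) == checksum_field


segment = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
field = checksum16(segment)                        # 0x220D for these words
corrupted = bytes([segment[0] ^ 0x01]) + segment[1:]   # flip one bit
```

A single flipped bit changes the recomputed value, so `verify(corrupted, field)` fails while `verify(segment, field)` succeeds.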
TCP Protocol
TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581
• point-to-point: one sender, one receiver, between sockets
• reliable, in-order byte stream: no “message boundaries”
• pipelined: TCP congestion and flow control set the window size
• send & receive buffers
• full duplex data:
  • bi-directional data flow in the same connection
  • MSS: maximum segment size
• connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
• flow controlled: sender will not overwhelm receiver
[Figure: the application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door.]
TCP segment structure
[Figure: TCP segment, 32 bits wide: source port #, dest port #; sequence number; acknowledgement number; head len, not used, flags U A P R S F, rcvr window size; checksum, ptr urgent data; options (variable length); application data (variable length).]
• sequence number, acknowledgement number: counting by bytes of data (not segments!)
• head len: hdr length in 32-bit words
• URG: urgent data (generally not used)
• ACK: ACK # valid
• PSH: push data now (generally not used)
• RST, SYN, FIN: connection estab (setup, teardown commands)
• rcvr window size: # bytes rcvr is willing to accept
• checksum: Internet checksum (as in UDP)
TCP sequence # (SN) and ACK (AN)
SN:
• byte-stream “number” of the first byte in the segment’s data
AN:
• SN of the next byte expected from the other side
• cumulative ACK
Qn: how does the receiver handle out-of-order segments?
• puts them in the receive buffer but does not acknowledge them
Simple data transfer scenario (some time after conn. setup):
• Host A to Host B: SN=42, AN=79, 100 data bytes (host A sends 100 data bytes)
• Host B to Host A: SN=79, AN=142, 50 data bytes (host B ACKs 100 bytes and sends 50 data bytes)
• Host A to Host B: SN=142, AN=129, no data (host A ACKs receipt of the data, sends no data. WHY?)
18
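The numbers in the data transfer scenario follow from one rule: the AN a host sends is the SN of the segment it received plus the number of data bytes in it. A small Python sketch of that arithmetic, with hypothetical dictionary keys and helper name, reproduces the three segments:

```python
def next_expected(seq_num: int, data_len: int) -> int:
    """AN to send back: SN of the received segment plus its data length."""
    return seq_num + data_len


# Segment 1, A to B: values taken from the scenario.
a1 = {"sn": 42, "an": 79, "len": 100}

# Segment 2, B to A: B's SN is what A said it expects; B ACKs A's 100 bytes.
b1 = {"sn": a1["an"], "an": next_expected(a1["sn"], a1["len"]), "len": 50}

# Segment 3, A to B: A continues its own byte stream and ACKs B's 50 bytes.
a2 = {"sn": b1["an"], "an": next_expected(b1["sn"], b1["len"]), "len": 0}
```

The third segment carries no data, so A's next segment would reuse SN=142.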
Connection Setup: Objective
• Agree on initial sequence numbers
  • a sender should not reuse a seq # before it is sure that all packets with that seq # are purged from the network
    • the network guarantees that a packet that is too old will be purged from the network: the network bounds the lifetime of each packet
  • to avoid waiting for seq #s to disappear, start a new session with a seq # far away from the previous one
    • needs connection setup, so that the sender tells the receiver the initial seq #
• Agree on other initial parameters
  • e.g. Maximum Segment Size
TCP Connection Management
Setup: establish a connection between the hosts before exchanging data segments
• called: 3-way handshake
• initialize TCP variables:
  • seq. #s
  • buffers, flow control info (e.g. RcvWindow)
• client: connection initiator
  • opens a socket and commands the OS to connect it to the server
• server: contacted by client
  • has a waiting socket
  • accepts the connection
  • generates a working socket
Teardown: end of connection (we skip the details)
Three-way handshake:
Step 1: client host sends TCP SYN segment to server
  • specifies initial seq #
  • no data
Step 2: server host receives SYN, replies with SYNACK segment (also no data)
  • allocates buffers
  • specifies server initial SN & window size
Step 3: client receives SYNACK, replies with ACK segment, which may contain data
TCP Three-Way Handshake (TWH)
[Figure: hosts A and B, each with a send buffer and a receive buffer.]
• A to B: SYN, SN = X
• B to A: SYNACK, SN = Y, AN = X+1
• A to B: ACK, SN = X+1, AN = Y+1
After the handshake, A's send buffer and B's receive buffer start at X+1; B's send buffer and A's receive buffer start at Y+1.
Connection Close
Objective of closure handshake: each side can release resources and remove state about the connection
• close the socket
[Figure: the client makes the initial close (“I am done. Are you done too?”); the server closes in turn (“I am done too. Goodbye!”); each side then releases its resources.]
Ch. 7 : Internet Transport Protocols
Part B
TCP reliable data transfer
• TCP creates a reliable service on top of IP’s unreliable service
• pipelined segments
• cumulative ACKs
• single retransmission timer
• receiver accepts out-of-order segments but does not acknowledge them
• retransmissions are triggered by timeout events
• initially consider a simplified TCP sender: ignore flow control, congestion control
TCP sender events:
data rcvd from app:
• create a segment with seq #
• seq # is the byte-stream number of the first data byte in the segment
• start the timer if not already running (think of the timer as being for the oldest unACKed segment)
• expiration interval: TimeOutInterval
timeout:
• retransmit the segment that caused the timeout
• restart the timer
ACK rcvd:
• if it acknowledges previously unACKed segments:
  • update what is known to be ACKed
  • start the timer if there are outstanding segments
TCP sender (simplified)
    NextSeqNum = InitialSeqNum
    SendBase = InitialSeqNum

    loop (forever) {
      switch(event)

      event: data received from application above
        create TCP segment with sequence number NextSeqNum
        if (timer currently not running)
          start timer
        pass segment to IP
        NextSeqNum = NextSeqNum + length(data)

      event: timer timeout
        retransmit not-yet-acknowledged segment with smallest sequence number
        start timer

      event: ACK received, with ACK field value of y
        if (y > SendBase) {
          SendBase = y
          if (there are currently not-yet-acknowledged segments)
            start timer
        }
    } /* end of loop forever */
Comment:
• SendBase-1: last cumulatively ACKed byte
Example:
• SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is ACKed
TCP actions on receiver events:
application takes data:
• free the room in the buffer
• give the freed cells new numbers (circular numbering)
• WIN increases by the number of bytes taken
data rcvd from IP:
• if the checksum fails, ignore the segment
• if the checksum is OK, then:
  • if the data came in order:
    • update AN + WIN
    • AN grows by the number of new in-order bytes
    • WIN decreases by the same #
  • if the data is out of order:
    • put it in the buffer, but don’t count it for AN/WIN
TCP: retransmission scenarios
[Figure, two time diagrams between Host A and Host B. Legend: timer setting; actual timer run.]
A. normal scenario
• segments from A: SN=92, 8 bytes data; SN=100, 20 bytes data
• ACKs from B: AN=100; AN=120
• timer events at A: start timer for SN 92; stop timer; start timer for SN 100; stop timer; NO timer
B. lost ACK + retransmission
• A sends SN=92, 8 bytes data; start timer for SN 92
• B's ACK (AN=100) is lost (X)
• TIMEOUT at A: A resends SN=92, 8 bytes data; start timer for new SN 92
• B sends AN=100 again; stop timer; NO timer
Afeka, 5771, semester B
TCP retransmission scenarios (more)
[Figure, two time diagrams between Host A and Host B.]
C. lost ACK, NO retransmission
• A sends SN=92, 8 bytes data, and SN=100, 20 bytes data; start timer for SN 92
• B's first ACK (AN=100) is lost (X); AN=120 arrives before the timeout
• stop timer; NO timer
D. premature timeout
• A sends SN=92, 8 bytes data, and SN=100, 20 bytes data; start timer for SN 92
• TIMEOUT at A before AN=100 arrives: A resends SN=92, 8 bytes data
• AN=100 and AN=120 arrive; B answers the resent segment with a redundant ACK (AN=120)
• timer events at A: start for 92; stop; start for 100; stop; NO timer
TCP ACK generation [RFC 1122, RFC 2581]
Event at receiver: arrival of in-order segment with expected seq #. All data up to expected seq # already ACKed.
TCP receiver action: delayed ACK. Wait up to 500 ms for the next segment. If no data segment to send, then send ACK.

Event at receiver: arrival of in-order segment with expected seq #. One other segment has ACK pending.
TCP receiver action: immediately send a single cumulative ACK, ACKing both in-order segments.

Event at receiver: arrival of out-of-order segment with higher-than-expected seq. #. Gap detected.
TCP receiver action: immediately send a duplicate ACK, indicating the seq. # of the next expected byte. This ACK carries no data & no new WIN.

Event at receiver: arrival of a segment that partially or completely fills the gap.
TCP receiver action: immediately send ACK, provided that the segment starts at the lower end of the gap.
Fast Retransmit
• the time-out period is often relatively long:
  • causes a long delay before resending a lost packet
• detect lost segments via duplicate ACKs
  • the sender often sends many segments back-to-back
  • if a segment is lost, there will likely be many duplicate ACKs for that segment
• if the sender receives 3 duplicate ACKs for the same data, it assumes that the segment after the ACKed data was lost:
  • fast retransmit: resend the segment before the timer expires
[Figure: fast retransmit. Host A sends segments seq # x1 to seq # x5 back-to-back; the segment with seq # x2 is lost (X). Host B answers with ACK # x2 and then three more ACK # x2 (triple duplicate ACKs); Host A resends seq # x2 before its timeout expires.]
Fast retransmit algorithm:
    event: ACK received, with ACK field value of y
      if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
          start timer
      }
      else {   /* a duplicate ACK for already ACKed segment */
        if (segment carries no data & doesn’t change WIN)
          increment count of dup ACKs received for y
        if (count of dup ACKs received for y = 3) {
          resend segment with sequence number y   /* fast retransmit */
        }
      }
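The simplified sender and the duplicate-ACK rule can be combined into one small Python model. This is a sketch under stated assumptions, not TCP itself: `SimplifiedSender` and its attributes are hypothetical names, the timer is reduced to a boolean flag, nothing is actually transmitted, and the "carries no data and doesn't change WIN" check is skipped because the model has no receive window.

```python
class SimplifiedSender:
    """Toy TCP sender: cumulative ACKs, one timer flag, fast retransmit."""

    def __init__(self, initial_seq):
        self.send_base = initial_seq    # oldest unACKed byte
        self.next_seq = initial_seq     # SN of the next new segment
        self.unacked = {}               # SN -> data of segments in flight
        self.timer_running = False
        self.dup_acks = 0
        self.retransmitted = []         # log of resent sequence numbers

    def send(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        self.timer_running = True       # started only if not already running
        self.next_seq += len(data)
        return seq

    def on_timeout(self):
        if self.unacked:
            self.retransmitted.append(min(self.unacked))  # oldest segment
            self.timer_running = True                     # restart

    def on_ack(self, y):
        if y > self.send_base:          # new data is ACKed
            self.send_base = y
            self.dup_acks = 0
            done = [s for s, d in self.unacked.items() if s + len(d) <= y]
            for s in done:
                del self.unacked[s]
            self.timer_running = bool(self.unacked)
        elif y == self.send_base and self.unacked:   # duplicate ACK
            self.dup_acks += 1
            if self.dup_acks == 3:
                self.retransmitted.append(y)         # fast retransmit


sender = SimplifiedSender(92)
for size in (8, 20, 10, 10, 10):        # SNs 92, 100, 120, 130, 140
    sender.send(b"x" * size)
sender.on_ack(100)                      # segment 92 arrived
for _ in range(3):                      # segment 100 lost: three dup ACKs
    sender.on_ack(100)
sender.on_ack(150)                      # resent segment fills the gap
```

After the third duplicate ACK the sender resends SN 100 without waiting for the timer; the final cumulative ACK then clears everything in flight.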
TCP: Flow Control
TCP Flow Control for A’s data
flow control: the sender won’t overflow the receiver’s buffer by transmitting too much, too fast
• the receive side of the TCP connection at B has a receive buffer
• the application process at B may be slow at reading from the buffer
• flow control matches the send rate of A to the receiving application’s drain rate at B
• the receive buffer size is set by the OS at connection init
• WIN = window size = number of bytes A may send starting at AN
[Figure: node B, receive process. Data from IP (sent by TCP at A) enters the receive buffer; the buffer holds TCP data followed by spare room of size WIN starting at AN; data is taken out by the application.]
TCP Flow control: how it works
Formulas:
• AN = first byte not received yet
  • sent to A in the TCP header
• AckedRange = AN - FirstByteNotReadByAppl = # bytes rcvd in sequence & not taken
• WIN = RcvBuffer - AckedRange = SpareRoom
• AN and WIN are sent to A in the TCP header
• data rcvd out of sequence is considered part of the ‘spare room’ range
Procedure:
• the rcvr advertises “spare room” by including the value of WIN in its segments
• sender A is allowed to send at most WIN bytes in the range starting with AN
  • guarantees that the receive buffer doesn’t overflow
[Figure: node B, receive process. The Rcv Buffer holds ACKed data in buffer, followed by spare room of size WIN starting at AN; non-ACKed data in buffer (arrived out of order) is ignored and counted as spare room. Data from IP (sent by TCP at A) enters; data is taken out by the application.]
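The two formulas can be checked with a few lines of Python. The function and parameter names below are hypothetical, and the buffer size and byte numbers are made-up illustration values, not figures from the slides.

```python
def advertised_window(rcv_buffer: int, an: int, first_byte_not_read: int) -> int:
    """WIN = RcvBuffer - AckedRange, with AckedRange = AN - FirstByteNotReadByAppl."""
    acked_range = an - first_byte_not_read   # bytes received in sequence, not yet taken
    return rcv_buffer - acked_range


def bytes_sender_may_send(an: int, win: int, next_seq: int) -> int:
    """Sender may use the range [AN, AN + WIN); part of it may already be in flight."""
    return max(0, an + win - next_seq)


# Illustrative numbers: 4096-byte buffer, application has read up to byte 1000,
# receiver has in-order data up to byte 1499 (so AN = 1500).
win = advertised_window(rcv_buffer=4096, an=1500, first_byte_not_read=1000)
allowed = bytes_sender_may_send(an=1500, win=win, next_seq=2000)
```

When the application reads more data, `first_byte_not_read` moves up and WIN grows by the same number of bytes, which matches the receiver-event rules earlier in the deck.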
TCP flow control: Example 1
TCP flow control: Example 2
TCP: setting timeouts
TCP Round Trip Time and Timeout
Q: how to set the TCP timeout value?
• longer than RTT
  • note: RTT will vary
• too short: premature timeout
  • unnecessary retransmissions
• too long: slow reaction to segment loss
Q: how to estimate RTT?
• SampleRTT: measured time from segment transmission until ACK receipt
  • ignore retransmissions, cumulatively ACKed segments
• SampleRTT will vary; want the estimated RTT “smoother”
  • use several recent measurements, not just the current SampleRTT
High-level Idea
Set timeout = average + safe margin
Estimating Round Trip Time
EstimatedRTT = (1 - α)*EstimatedRTT + α*SampleRTT
• exponential weighted moving average
• influence of a past sample decreases exponentially fast
• typical value: α = 0.125
• SampleRTT: measured time from segment transmission until ACK receipt
• SampleRTT will vary; want a “smoother” estimated RTT
  • use several recent measurements, not just the current SampleRTT
[Figure: RTT: gaia.cs.umass.edu to fantasia.eurecom.fr. x-axis: time (seconds), 1 to 106; y-axis: RTT (milliseconds), 100 to 350; series: SampleRTT and EstimatedRTT.]
Setting Timeout
Problem: using the average of SampleRTT will generate many timeouts due to network variations
Solution: EstimatedRTT plus a “safety margin”
• large variation in EstimatedRTT -> larger safety margin
DevRTT = (1 - β)*DevRTT + β*|SampleRTT - EstimatedRTT|
(typically, β = 0.25)
Then set the timeout interval:
TimeoutInterval = EstimatedRTT + 4*DevRTT
[Figure: frequency distribution of RTT; axes: RTT, freq.]
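The two moving averages and the timeout formula fit in a short Python class. This is a minimal sketch: the class name, the millisecond sample values and the initialization (first sample for EstimatedRTT, half of it for DevRTT, as RFC 6298 does) are assumptions, since the slides do not say how the estimator starts. DevRTT is updated with the previous EstimatedRTT before EstimatedRTT itself is updated, which is also an ordering choice the slides leave open.

```python
class RttEstimator:
    """EWMA round-trip-time estimator with a 4*DevRTT safety margin."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2      # assumed initial value

    def update(self, sample_rtt: float) -> float:
        """Fold one SampleRTT into both averages; return the new timeout."""
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)
        return self.timeout_interval()

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt


est = RttEstimator(first_sample=100.0)       # milliseconds, illustrative
timeout_after_one = est.update(120.0)
```

With a first sample of 100 ms and a second of 120 ms, EstimatedRTT moves only to 102.5 ms, which shows how strongly the average smooths a single outlier.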
An Example TCP Session
TCP: Congestion Control
TCP Congestion Control
• Closed-loop, end-to-end, window-based congestion control
• Designed by Van Jacobson in the late 1980s, based on the AIMD algorithm of Dah-Ming Chiu and Raj Jain
• Works well so far: the bandwidth of the Internet has increased by more than 200,000 times
• Many versions:
  • TCP/Tahoe: a less optimized version
  • TCP/Reno: many OSs today implement Reno-type congestion control
  • TCP/Vegas: not currently used
For more details: see TCP/IP Illustrated; or read http://lxr.linux.no/source/net/ipv4/tcp_input.c for the Linux implementation
TCP & AIMD: congestion
Dynamic window size [Van Jacobson]
• Initialization: MI
  • Slow start
• Steady state: AIMD
  • Congestion Avoidance
What signals congestion:
• Congestion = timeout: TCP Tahoe
• Congestion = timeout || 3 duplicate ACKs: TCP Reno & TCP New Reno
• Congestion = higher latency: TCP Vegas
Visualization of the Two Phases
[Figure: Congwin (in MSS) over time: slow start up to the threshold, then congestion avoidance.]
TCP Slowstart: MI
Slowstart algorithm:
    initialize: Congwin = 1 MSS
    for (each segment ACKed)
      Congwin += MSS
    until (congestion event OR CongWin > threshold)
• exponential increase (per RTT) in window size (not so slow!)
• in case of timeout: Threshold = CongWin/2
[Figure: Host A sends one segment, then two segments, then four segments to Host B, one burst per RTT.]
TCP Tahoe Congestion Avoidance
Congestion avoidance:
    /* slowstart is over */
    /* Congwin > threshold */
    Until (timeout) {   /* loss event */
      on every ACK:
        CWin/MSS += 1/(Cwin/MSS)
    }
    threshold = Congwin/2
    Congwin = 1 MSS
    perform slowstart
TCP Reno
• Fast retransmit:
  • try to avoid waiting for the timeout
• Fast recovery:
  • try to avoid slowstart
  • used only on a triple duplicate ACK event
  • single packet drop: not too bad
TCP Reno cwnd Trace
[Figure: congestion window (0 to 70) versus time (0 to 60), showing the congestion window, the threshold, timeouts and fast retransmissions (triple duplicate ACK). Slow start periods alternate with additive-increase congestion avoidance (CA) periods.]
TCP congestion control: bandwidth probing
• “probing for bandwidth”: increase the transmission rate on receipt of ACKs until eventually loss occurs, then decrease the transmission rate
  • continue to increase on ACK, decrease on loss (since the available bandwidth is changing, depending on other connections in the network)
• Q: how fast to increase/decrease?
  • details to follow
[Figure: TCP’s “sawtooth” behavior, sending rate versus time. ACKs being received, so increase rate; X = loss, so decrease rate.]
TCP Congestion Control: details
• sender limits its rate by limiting the number of unACKed bytes “in pipeline”:
    LastByteSent - LastByteAcked ≤ cwnd
  • cwnd: differs from rwnd (how, why?)
  • sender limited by min(cwnd, rwnd)
• roughly,
    rate = cwnd / RTT bytes/sec
• cwnd is dynamic, a function of perceived network congestion
[Figure: the sender transmits cwnd bytes, then waits one RTT for the ACK(s).]
TCP Congestion Control: more details
segment loss event: reducing cwnd
• timeout: no response from receiver
  • cut cwnd to 1
• 3 duplicate ACKs: at least some segments getting through (recall fast retransmit)
  • cut cwnd in half, less aggressively than on timeout
ACK received: increase cwnd
• slowstart phase:
  • start low (cwnd = MSS)
  • increase cwnd exponentially fast (despite name)
  • used: at connection start, or following timeout
• congestion avoidance:
  • increase cwnd linearly
TCP Slow Start
• when the connection begins, cwnd = 1 MSS
  • example: MSS = 500 bytes & RTT = 200 msec
  • initial rate = 20 kbps
• available bandwidth may be >> MSS/RTT
  • desirable to quickly ramp up to a respectable rate
• increase the rate exponentially until the first loss event or when the threshold is reached
  • double cwnd every RTT
  • done by incrementing cwnd by 1 MSS for every ACK received
[Figure: Host A sends one segment, then two segments, then four segments to Host B, one burst per RTT.]
TCP: congestion avoidance
• when cwnd > ssthresh, grow cwnd linearly
  • increase cwnd by 1 MSS per RTT
  • approach possible congestion slower than in slowstart
  • implementation: cwnd = cwnd + MSS^2/cwnd for each ACK received
AIMD: Additive Increase, Multiplicative Decrease
• ACKs: increase cwnd by 1 MSS per RTT: additive increase
• loss: cut cwnd in half (non-timeout-detected loss): multiplicative decrease
  • true in the macro picture
  • may require Slow Start first to grow up to this
TCP congestion control FSM: overview
[Figure: state machine with three states: slow start, congestion avoidance, fast recovery.]
• slow start to congestion avoidance: cwnd > ssthresh
• slow start to fast recovery: loss: 3dupACK
• congestion avoidance to fast recovery: loss: 3dupACK
• fast recovery to congestion avoidance: new ACK
• any state to slow start: loss: timeout
TCP congestion control FSM: details
Initially: cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; start in slow start.
slow start:
• new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment
• cwnd > ssthresh: go to congestion avoidance
congestion avoidance:
• new ACK: cwnd = cwnd + MSS*(MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
• duplicate ACK: dupACKcount++
• dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; go to fast recovery
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
fast recovery:
• duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
• new ACK: cwnd = ssthresh; dupACKcount = 0; go to congestion avoidance
• timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; go to slow start
Popular “flavors” of TCP
[Figure: cwnd window size (in segments) versus transmission round for TCP Tahoe and TCP Reno, with the ssthresh levels marked.]
Summary: TCP Congestion Control
• when cwnd < ssthresh, the sender is in the slow-start phase and the window grows exponentially
• when cwnd >= ssthresh, the sender is in the congestion-avoidance phase and the window grows linearly
• when a triple duplicate ACK occurs, ssthresh is set to cwnd/2 and cwnd is set to ~ssthresh
• when a timeout occurs, ssthresh is set to cwnd/2 and cwnd is set to 1 MSS
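The four summary rules can be turned into a round-by-round trace of cwnd. The Python sketch below is a coarse model with several assumptions: one step is one transmission round (one RTT), cwnd is counted in MSS units, slow start is capped at ssthresh when doubling would overshoot it, fast recovery is collapsed into "cwnd = ssthresh", and Tahoe is modelled as reacting to a triple duplicate ACK the same way as to a timeout. The function name and the loss schedule are hypothetical.

```python
def simulate_cwnd(rounds, losses, flavor="reno", ssthresh=8.0):
    """Return cwnd (in MSS) at the start of each transmission round.

    losses maps a round number to "timeout" or "3dup".
    """
    cwnd = 1.0
    trace = []
    for rnd in range(rounds):
        trace.append(cwnd)
        loss = losses.get(rnd)
        if loss == "timeout" or (loss == "3dup" and flavor == "tahoe"):
            ssthresh = max(cwnd / 2, 1.0)
            cwnd = 1.0                        # restart from slow start
        elif loss == "3dup":
            ssthresh = max(cwnd / 2, 1.0)
            cwnd = ssthresh                   # Reno: skip slow start
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)    # slow start: double per RTT
        else:
            cwnd += 1.0                       # congestion avoidance: +1 MSS per RTT
    return trace


reno = simulate_cwnd(12, {8: "3dup"}, flavor="reno")
tahoe = simulate_cwnd(12, {8: "3dup"}, flavor="tahoe")
```

Both flavors climb 1, 2, 4, 8 and then grow by one MSS per round; after the loss in round 8, Reno continues from half the window while Tahoe falls back to 1 MSS and slow-starts again.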
TCP throughput
Q: what’s the average throughput of TCP as a function of window size and RTT?
• ignoring slow start
• let W be the window size when loss occurs
  • when the window is W, throughput is W/RTT
  • just after loss, the window drops to W/2, throughput to W/(2 RTT)
  • average throughput: 0.75 W/RTT
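The 0.75 factor follows if one assumes, as the sawtooth picture suggests, that the window grows linearly from W/2 back to W between two losses. Under that assumption the average window is the midpoint of the two extremes:

```latex
\[
\text{average throughput}
  \;\approx\; \frac{1}{RTT}\cdot\frac{\tfrac{W}{2} + W}{2}
  \;=\; \frac{3}{4}\,\frac{W}{RTT}
  \;=\; 0.75\,\frac{W}{RTT}
\]
```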
TCP Fairness
fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R.]
Why is TCP fair?
Two competing sessions: (Tahoe, Slow Start ignored)
• additive increase gives a slope of 1, as throughput increases
• multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput versus Connection 1 throughput, both axes from 0 to R, with the equal bandwidth share line. The trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2).]
• start at (a,b)
• after additive increase: (a+t, b+t) => y = x + (b-a)
• after one loss and the next increase: (a/2 + t1/2 + t, b/2 + t1/2 + t) => y = x + (b-a)/2
• after the following loss: y = x + (b-a)/4
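The halving of the gap (b-a) can be watched in a short simulation. The Python sketch below assumes a deliberately idealized model: both flows add the same amount each step, both detect loss in the same step (whenever their combined rate exceeds the capacity), and both halve together. The function name, starting rates and capacity are illustrative values, not taken from the slides.

```python
def aimd_two_flows(x1, x2, capacity, steps, increase=1.0):
    """Run synchronized AIMD for two flows; return their final rates."""
    for _ in range(steps):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2               # multiplicative decrease
        else:
            x1, x2 = x1 + increase, x2 + increase  # additive increase
    return x1, x2


# Very unequal start: flow 1 at 10, flow 2 at 70, bottleneck capacity 100.
final1, final2 = aimd_two_flows(10.0, 70.0, capacity=100.0, steps=500)
gap = abs(final1 - final2)
```

Additive increase leaves the difference between the two rates unchanged, and every loss halves it, so the rates drift toward the equal-share line.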
Fairness (more)
Fairness and UDP
• multimedia apps often do not use TCP
  • do not want their rate throttled by congestion control
• instead use UDP:
  • pump audio/video at a constant rate, tolerate packet loss
Fairness and parallel TCP connections
• nothing prevents an app from opening parallel connections between 2 hosts
• web browsers do this
• example: link of rate R already supporting 9 connections;
  • new app asks for 1 TCP, gets rate R/10
  • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections) !!
Exercise
• MSS = 1000
• Only one event per row