transport layer3-1 what’s the internet: “nuts and bolts” view r millions of connected...
Post on 20-Dec-2015
219 Views
Preview:
TRANSCRIPT
Transport Layer 3-1
Whatrsquos the Internet ldquonuts and boltsrdquo view
millions of connected computing devices hosts = end systems examples of hosts
running network apps examples of
applications
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-2
Whatrsquos the Internet ldquonuts and boltsrdquo view
communication links fiber copper radio
satellite transmission rate = bandwidth
bull typical bandwidth for modem wireless
routers forward packets (chunks of data) whatrsquos in a packet
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-3
Whatrsquos the Internet ldquonuts and boltsrdquo view protocols control sending
receiving of msgs eg TCP IP HTTP FTP PPP
Internet ldquonetwork of networksrdquo loosely hierarchical public Internet versus private
intranet Internet standards
RFC Request for comments IETF Internet Engineering
Task Force
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-4
Whatrsquos the Internet a service view communication
infrastructure enables distributed applications Web email other
examples communication services
provided to apps connection-oriented reliable
bull example apps Connectionless unreliable
bull example apps
Transport Layer 3-5
Whatrsquos a protocolhuman protocols ldquowhatrsquos the timerdquo ldquoI have a questionrdquo introductions
network protocols machines rather than
humans all communication
activity in Internet governed by protocols
protocols define format order of msgs sent and received among network entities and actions taken on msg
transmission receipt
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-2
Whatrsquos the Internet ldquonuts and boltsrdquo view
communication links fiber copper radio
satellite transmission rate = bandwidth
bull typical bandwidth for modem wireless
routers forward packets (chunks of data) whatrsquos in a packet
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-3
Whatrsquos the Internet ldquonuts and boltsrdquo view protocols control sending
receiving of msgs eg TCP IP HTTP FTP PPP
Internet ldquonetwork of networksrdquo loosely hierarchical public Internet versus private
intranet Internet standards
RFC Request for comments IETF Internet Engineering
Task Force
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-4
Whatrsquos the Internet a service view communication
infrastructure enables distributed applications Web email other
examples communication services
provided to apps connection-oriented reliable
bull example apps Connectionless unreliable
bull example apps
Transport Layer 3-5
Whatrsquos a protocolhuman protocols ldquowhatrsquos the timerdquo ldquoI have a questionrdquo introductions
network protocols machines rather than
humans all communication
activity in Internet governed by protocols
protocols define format order of msgs sent and received among network entities and actions taken on msg
transmission receipt
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-3
Whatrsquos the Internet ldquonuts and boltsrdquo view protocols control sending
receiving of msgs eg TCP IP HTTP FTP PPP
Internet ldquonetwork of networksrdquo loosely hierarchical public Internet versus private
intranet Internet standards
RFC Request for comments IETF Internet Engineering
Task Force
local ISP
companynetwork
regional ISP
router workstation
servermobile
Transport Layer 3-4
Whatrsquos the Internet a service view communication
infrastructure enables distributed applications Web email other
examples communication services
provided to apps connection-oriented reliable
bull example apps Connectionless unreliable
bull example apps
Transport Layer 3-5
Whatrsquos a protocolhuman protocols ldquowhatrsquos the timerdquo ldquoI have a questionrdquo introductions
network protocols machines rather than
humans all communication
activity in Internet governed by protocols
protocols define format order of msgs sent and received among network entities and actions taken on msg
transmission receipt
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-4
Whatrsquos the Internet a service view communication
infrastructure enables distributed applications Web email other
examples communication services
provided to apps connection-oriented reliable
bull example apps Connectionless unreliable
bull example apps
Transport Layer 3-5
Whatrsquos a protocolhuman protocols ldquowhatrsquos the timerdquo ldquoI have a questionrdquo introductions
network protocols machines rather than
humans all communication
activity in Internet governed by protocols
protocols define format order of msgs sent and received among network entities and actions taken on msg
transmission receipt
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-5
Whatrsquos a protocolhuman protocols ldquowhatrsquos the timerdquo ldquoI have a questionrdquo introductions
network protocols machines rather than
humans all communication
activity in Internet governed by protocols
protocols define format order of msgs sent and received among network entities and actions taken on msg
transmission receipt
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-6
Whatrsquos a protocola human protocol and a computer network protocol
Q Why are protocols so important
Hi
Hi
Got thetime
200
TCP connection req
TCP connectionresponseGet httpwwwawlcomkurose-ross
ltfilegttime
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-7
A closer look at network structure network edge
applications and hosts network core
routers network of networks
access networks physical media communication links
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-8
Protocol ldquoLayersrdquoNetworks are complex many ldquopiecesrdquo
hosts routers links of various
media applications protocols hardware software
Question Is there any hope of organizing structure of
network
Or at least our discussion of networks
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-9
Organization of air travel
a series of steps
ticket (purchase)
baggage (check)
gates (load)
runway takeoff
airplane routing
ticket (complain)
baggage (claim)
gates (unload)
runway landing
airplane routing
airplane routing
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-10
ticket (purchase)
baggage (check)
gates (load)
runway (takeoff)
airplane routing
departureairport
arrivalairport
intermediate air-trafficcontrol centers
airplane routing airplane routing
ticket (complain)
baggage (claim
gates (unload)
runway (land)
airplane routing
ticket
baggage
gate
takeofflanding
airplane routing
Layering of airline functionality
Layers each layer implements a service via its own internal-layer actions relying on services provided by layer below
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-11
Why layeringDealing with complex systems explicit structure allows identification relationship of
complex systemrsquos pieces layered reference model for discussion
modularization eases maintenance updating of system change of implementation of layerrsquos service
transparent to rest of system eg change in gate procedure doesnrsquot affect rest of
system layering considered harmful
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-12
Internet protocol stack application supporting network applications
FTP SMTP STTP transport host-host data transfer
TCP UDP network routing of datagrams from source
to destination IP routing protocols
link data transfer between neighboring network elements
PPP Ethernet physical bits ldquoon the wirerdquo
application
transport
network
link
physical
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-13
messagesegment
datagram
frame
sourceapplicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
destination
applicationtransportnetwork
linkphysical
HtHnHl M
HtHn M
Ht M
M
networklink
physical
linkphysical
HtHnHl M
HtHn M
HtHnHl M
HtHn M
HtHnHl M HtHnHl M
router
switch
Encapsulation
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-14
Internet transport protocols services
TCP service connection-oriented setup
required between client and server processes
reliable transport between sending and receiving process
flow control sender wonrsquot overwhelm receiver
congestion control throttle sender when network overloaded
does not provide timing minimum bandwidth guarantees
UDP service unreliable data transfer
between sending and receiving process
does not provide connection setup reliability flow control congestion control timing or bandwidth guarantee
Q why bother Why is there a UDP
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-15
Transport vs network layer
network layer logical communication between hosts
transport layer logical communication between processes relies on enhances
network layer services
Household analogy12 kids sending letters to 12
kids processes = kids app messages = letters in
envelopes hosts = houses transport protocol = Ann
and Bill network-layer protocol =
postal service
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-16
Reliable data transfer getting startedWersquoll incrementally develop sender receiver
sides of reliable data transfer protocol (rdt) consider only unidirectional data transfer
but control info will flow on both directions
use finite state machines (FSM) to specify sender receiver
state1
state2
event causing state transitionactions taken on state transition
state when in this ldquostaterdquo next state
uniquely determined by next event
eventactions
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-17
Rdt10 reliable transfer over a reliable channel
underlying channel perfectly reliable no bit errors no loss of packets
separate FSMs for sender receiver sender sends data into underlying channel receiver read data from underlying channel
Wait for call from above packet = make_pkt(data)
udt_send(packet)
rdt_send(data)
extract (packetdata)deliver_data(data)
Wait for call from
below
rdt_rcv(packet)
sender receiver
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-18
Rdt20 channel with bit errors
underlying channel may flip bits in packet checksum to detect bit errors
the question how to recover from errors acknowledgements (ACKs) receiver explicitly tells
sender that pkt received OK negative acknowledgements (NAKs) receiver
explicitly tells sender that pkt had errors sender retransmits pkt on receipt of NAK
new mechanisms in rdt20 (beyond rdt10) error detection receiver feedback control msgs (ACKNAK) rcvr-
gtsender
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-19
rdt20 FSM specification
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
belowsender
receiverrdt_send(data)
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-20
rdt20 operation with no errors
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-21
rdt20 error scenario
Wait for call from above
snkpkt = make_pkt(data checksum)udt_send(sndpkt)
extract(rcvpktdata)deliver_data(data)udt_send(ACK)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt)
rdt_rcv(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp isNAK(rcvpkt)
udt_send(NAK)
rdt_rcv(rcvpkt) ampamp corrupt(rcvpkt)
Wait for ACK or
NAK
Wait for call from
below
rdt_send(data)
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-22
rdt20 has a fatal flaw
Stop and Wait Sender sends one packet then waits for receiver response
What happens if ACKNAK corrupted sender doesnrsquot know what happened at receiver canrsquot just retransmit possible duplicate
Handling duplicates sender adds sequence number to each pkt sender retransmits current pkt if ACKNAK garbled receiver discards (doesnrsquot deliver up) duplicate pkt
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-23
rdt21 sender handles garbled ACKNAKs
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
Wait for ACK or NAK 0 udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isNAK(rcvpkt) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt)
Wait for call 1 from
above
Wait for ACK or NAK 1
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-24
rdt21 receiver handles garbled ACKNAKs
Wait for 0 from below
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq0(rcvpkt)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
Wait for 1 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq0(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp not corrupt(rcvpkt) ampamp has_seq1(rcvpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt)
sndpkt = make_pkt(ACK chksum)udt_send(sndpkt)
sndpkt = make_pkt(NAK chksum)udt_send(sndpkt)
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-25
rdt21 discussion
Sender seq added to pkt two seq rsquos (01)
will suffice Why must check if
received ACKNAK corrupted
twice as many states state must
ldquorememberrdquo whether ldquocurrentrdquo pkt has 0 or 1 seq
Receiver must check if
received packet is duplicate state indicates
whether 0 or 1 is expected pkt seq
note receiver can not know if its last ACKNAK received OK at sender
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-26
rdt22 a NAK-free protocol
same functionality as rdt21 using ACKs only instead of NAK receiver sends ACK for last pkt
received OK receiver must explicitly include seq of pkt being
ACKed
duplicate ACK at sender results in same action as NAK retransmit current pkt
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-27
rdt22 sender receiver fragments
Wait for call 0 from
above
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)
rdt_send(data)
udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) || isACK(rcvpkt1) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
Wait for ACK
0
sender FSMfragment
Wait for 0 from below
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp has_seq1(rcvpkt)
extract(rcvpktdata)deliver_data(data)sndpkt = make_pkt(ACK1 chksum)udt_send(sndpkt)
rdt_rcv(rcvpkt) ampamp (corrupt(rcvpkt) || has_seq1(rcvpkt))
udt_send(sndpkt)
receiver FSMfragment
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-28
rdt30 channels with errors and loss
New assumption underlying channel can also lose packets (data or ACKs) checksum seq
ACKs retransmissions will be of help but not enough
Approach sender waits ldquoreasonablerdquo amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost) retransmission will be
duplicate but use of seq rsquos already handles this
receiver must specify seq of pkt being ACKed
requires countdown timer
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-29
rdt30 sender
sndpkt = make_pkt(0 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
Wait for
ACK0
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt1) )
Wait for call 1 from
above
sndpkt = make_pkt(1 data checksum)udt_send(sndpkt)start_timer
rdt_send(data)
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt0)
rdt_rcv(rcvpkt) ampamp ( corrupt(rcvpkt) ||isACK(rcvpkt0) )
rdt_rcv(rcvpkt) ampamp notcorrupt(rcvpkt) ampamp isACK(rcvpkt1)
stop_timerstop_timer
udt_send(sndpkt)start_timer
timeout
udt_send(sndpkt)start_timer
timeout
rdt_rcv(rcvpkt)
Wait for call 0from
above
Wait for
ACK1
rdt_rcv(rcvpkt)
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-30
rdt30 in action
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-31
rdt30 in action
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-32
Performance of rdt30
rdt30 works but performance stinks Why
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-33
rdt30 stop-and-wait operation
first packet bit transmitted t = 0
sender receiver
RTT
last packet bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-34
Pipelined protocolsPipelining sender allows multiple ldquoin-flightrdquo yet-to-be-
acknowledged pkts range of sequence numbers must be increased buffering at sender andor receiver
Two generic forms of pipelined protocols go-Back-N selective repeat
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-35
Pipelining increased utilization
first packet bit transmitted t = 0
sender receiver
RTT
last bit transmitted t = L R
first packet bit arriveslast packet bit arrives send ACK
ACK arrives send next packet t = RTT + L R
last bit of 2nd packet arrives send ACKlast bit of 3rd packet arrives send ACK
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-36
Go-Back-NSender k-bit seq in pkt header ldquowindowrdquo of up to N consecutive unackrsquoed pkts allowed
ACK(n) ACKs all pkts up to including seq n - ldquocumulative ACKrdquo may deceive duplicate ACKs (see receiver)
timer for each in-flight pkt timeout(n) retransmit pkt n and all higher seq pkts in window
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-37
GBN receiver
ACK-only always send ACK for correctly-received pkt with highest in-order seq may generate duplicate ACKs need only remember expectedseqnum
out-of-order pkt discard (donrsquot buffer) -gt isnt this bad Re-ACK pkt with highest in-order seq
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-38
GBN inaction
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-39
Selective Repeat
receiver individually acknowledges all correctly received pkts buffers pkts as needed for eventual in-order
delivery to upper layer
sender only resends pkts for which ACK not received sender timer for each unACKed pkt
sender window N consecutive seq rsquos again limits seq s of sent unACKed pkts
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-40
Selective repeat sender receiver windows
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-41
Selective repeat
data from above if next available seq in
window send pkt
timeout(n) resend pkt n restart
timerACK(n) in
[sendbasesendbase+N] mark pkt n as received if n smallest unACKed
pkt advance window base to next unACKed seq
senderpkt n in [rcvbase rcvbase+N-1]
send ACK(n) out-of-order buffer in-order deliver (also
deliver buffered in-order pkts) advance window to next not-yet-received pkt
pkt n in [rcvbase-Nrcvbase-1]
ACK(n)
otherwise ignore
receiver
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-42
Selective repeat in action
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-43
Selective repeat dilemmaExample seq rsquos 0 1 2 3 window size=3
receiver sees no difference in two scenarios
incorrectly passes duplicate data as new in (a)
Q what relationship between seq size and window size
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-44
TCP Overview RFCs 793 1122 1323 2018 2581
full duplex data bi-directional data flow in
same connection MSS maximum segment
size connection-oriented
handshaking (exchange of control msgs) initrsquos sender receiver state before data exchange
flow controlled sender will not
overwhelm receiver
point-to-point one sender one receiver
reliable in-order byte steam no ldquomessage boundariesrdquo
pipelined TCP congestion and flow
control set window size send amp receive buffers
socketdoor
TCPsend buffer
TCPreceive buffer
socketdoor
segment
applicationwrites data
applicationreads data
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-45
TCP segment structure
source port dest port
32 bits
applicationdata
(variable length)
sequence number
acknowledgement numberReceive window
Urg data pnterchecksum
FSRPAUheadlen
notused
Options (variable length)
URG urgent data (generally not used)
ACK ACK valid
PSH push data now(generally not used)
RST SYN FINconnection estab(setup teardown
commands)
bytes rcvr willingto accept
countingby bytes of data(not segments)
Internetchecksum
(as in UDP)
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-46
TCP seq rsquos and ACKsSeq rsquos
byte stream ldquonumberrdquo of first byte in segmentrsquos data
ACKs seq of next byte
expected from other side
cumulative ACK piggybacking
Q how receiver handles out-of-order segments
A TCP spec doesnrsquot say - up to implementor
Host A Host B
Seq=42 ACK=79 data = lsquoCrsquo
Seq=79 ACK=43 data = lsquoCrsquo
Seq=43 ACK=80
Usertypes
lsquoCrsquo
host ACKsreceipt
of echoedlsquoCrsquo
host ACKsreceipt of
lsquoCrsquo echoesback lsquoCrsquo
timesimple telnet scenario
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-47
TCP Round Trip Time and TimeoutQ how to set TCP
timeout value longer than RTT
but RTT varies too short premature
timeout unnecessary
retransmissions too long slow
reaction to segment loss
Q how to estimate RTT SampleRTT measured time
from segment transmission until ACK receipt ignore retransmissions
SampleRTT will vary want estimated RTT ldquosmootherrdquo average several recent
measurements not just current SampleRTT
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-48
Example RTT estimationRTT gaiacsumassedu to fantasiaeurecomfr
100
150
200
250
300
350
1 8 15 22 29 36 43 50 57 64 71 78 85 92 99 106
time (seconnds)
RTT (milliseconds)
SampleRTT Estimated RTT
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-49
TCP reliable data transfer
TCP creates rdt service on top of IPrsquos unreliable service
Pipelined segments Cumulative acks TCP uses single
retransmission timer
Retransmissions are triggered by timeout events duplicate acks
Initially consider simplified TCP sender ignore duplicate acks ignore flow control
congestion control
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-50
TCP sender eventsdata rcvd from app Create segment with
seq seq is byte-stream
number of first data byte in segment
start timer if not already running (think of timer as for oldest unacked segment)
expiration interval TimeOutInterval
timeout retransmit segment
that caused timeout restart timer Ack rcvd If acknowledges
previously unacked segments update what is known
to be acked start timer if there are
outstanding segments
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-51
TCP sender(simplified)
NextSeqNum = InitialSeqNum SendBase = InitialSeqNum
loop (forever) switch(event)
event data received from application above create TCP segment with sequence number NextSeqNum if (timer currently not running) start timer pass segment to IP NextSeqNum = NextSeqNum + length(data)
event timer timeout retransmit not-yet-acknowledged segment with smallest sequence number start timer
event ACK received with ACK field value of y if (y gt SendBase) SendBase = y if (there are currently not-yet-acknowledged segments) start timer
end of loop forever
Commentbull SendBase-1 last cumulatively ackrsquoed byteExamplebull SendBase-1 = 71y= 73 so the rcvrwants 73+ y gt SendBase sothat new data is acked
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-52
TCP retransmission scenarios
Host A
Seq=100 20 bytes data
ACK=100
timepremature timeout
Host B
Seq=92 8 bytes data
ACK=120
Seq=92 8 bytes data
Seq=
92
tim
eout
ACK=120
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
lost ACK scenario
Host B
X
Seq=92 8 bytes data
ACK=100
time
Seq=
92
tim
eout
SendBase= 100
SendBase= 120
SendBase= 120
Sendbase= 100
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-53
TCP retransmission scenarios (more)
Host A
Seq=92 8 bytes data
ACK=100
loss
tim
eout
Cumulative ACK scenario
Host B
X
Seq=100 20 bytes data
ACK=120
time
SendBase= 120
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-54
TCP ACK generation [RFC 1122 RFC 2581]
Event at Receiver
Arrival of in-order segment withexpected seq All data up toexpected seq already ACKed
Arrival of in-order segment withexpected seq One other segment has ACK pending
Arrival of out-of-order segmenthigher-than-expect seq Gap detected
Arrival of segment that partially or completely fills gap
TCP Receiver action
Delayed ACK Wait up to 500msfor next segment If no next segmentsend ACK
Immediately send single cumulative ACK ACKing both in-order segments
Immediately send duplicate ACK indicating seq of next expected byte
Immediate send ACK provided thatsegment startsat lower end of gap
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-55
Fast Retransmit
Time-out period often relatively long long delay before
resending lost packet Detect lost segments
via duplicate ACKs Sender often sends
many segments back-to-back
If segment is lost there will likely be many duplicate ACKs
If sender receives 3 ACKs for the same data it supposes that segment after ACKed data was lost fast retransmit resend
segment before timer expires
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-56
TCP Flow Control
receive side of TCP connection has a receive buffer
speed-matching service matching the send rate to the receiving apprsquos drain rate app process may be
slow at reading from buffer
sender wonrsquot overflowreceiverrsquos buffer by
transmitting too much too fast
flow control
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-57
TCP Flow control how it works
(Suppose TCP receiver discards out-of-order segments)
spare room in buffer= RcvWindow
= RcvBuffer-[LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow guarantees receive
buffer doesnrsquot overflow
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-58
TCP Connection Management
Recall TCP sender receiver establish ldquoconnectionrdquo before exchanging data segments
initialize TCP variables seq s buffers flow control info (eg RcvWindow)
client connection initiator Socket clientSocket = new Socket(hostnameport
number) server contacted by client Socket connectionSocket = welcomeSocketaccept()
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-59
TCP Connection Management
Three way handshake
Step 1 client host sends TCP SYN segment to server specifies initial seq no data
Step 2 server host receives SYN replies with SYNACK segment server allocates buffers specifies server initial seq
Step 3 client receives SYNACK replies with ACK segment which may contain data
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-60
TCP Connection Management (cont)
Closing a connection
client closes socket clientSocketclose()
Step 1 client end system sends TCP FIN control segment to server
Step 2 server receives FIN replies with ACK Closes connection sends FIN
client
FIN
server
ACK
ACK
FIN
close
close
closed
tim
ed w
ait
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-61
TCP Connection Management (cont)
Step 3 client receives FIN replies with ACK
Enters ldquotimed waitrdquo - will respond with ACK to received FINs
Step 4 server receives ACK Connection closed
Note with small modification can handle simultaneous FINs
client
FIN
server
ACK
ACK
FIN
closing
closing
closed
tim
ed w
ait
closed
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-62
TCP Congestion Control
end-end control (no network assistance)
sender limits transmission LastByteSent-LastByteAcked CongWin Roughly
CongWin is dynamic function of perceived network congestion
How does sender perceive congestion
loss event = timeout or 3 duplicate acks
TCP sender reduces rate (CongWin) after loss event
three mechanisms AIMD slow start conservative after
timeout events
rate = CongWin
RTT Bytessec
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-63
TCP AIMD
8 Kbytes
16 Kbytes
24 Kbytes
time
congestionwindow
multiplicative decrease cut CongWin in half after loss event
additive increase increase CongWin by 1 MSS every RTT in the absence of loss events probing
Long-lived TCP connection
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-64
TCP Slow Start
When connection begins CongWin = 1 MSS Example MSS = 500
bytes amp RTT = 200 msec
initial rate = 20 kbps
available bandwidth may be gtgt MSSRTT desirable to quickly
ramp up to respectable rate
When connection begins increase rate exponentially fast until first loss event
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-65
TCP Slow Start (more)
When connection begins increase rate exponentially until first loss event double CongWin every
RTT done by incrementing CongWin for every ACK received
Summary initial rate is slow but ramps up exponentially fast
Host A
one segment
RTT
Host B
time
two segments
four segments
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-66
Refinement After 3 dup ACKs
CongWin is cut in half window then grows linearly
But after timeout event CongWin instead set to 1 MSS window then grows
exponentially to a threshold then grows
linearly
bull 3 dup ACKs indicates network capable of delivering some segmentsbull timeout before 3 dup ACKs is ldquomore alarmingrdquo
Philosophy
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-67
Refinement (more)Q When should the
exponential increase switch to linear
A When CongWin gets to 12 of its value before timeout
Implementation Variable Threshold At loss event Threshold is
set to 12 of CongWin just before loss event
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-68
Summary TCP Congestion Control
When CongWin is below Threshold sender in slow-start phase window grows exponentially
When CongWin is above Threshold sender is in congestion-avoidance phase window grows linearly
When a triple duplicate ACK occurs Threshold set to CongWin2 and CongWin set to Threshold
When timeout occurs Threshold set to CongWin2 and CongWin is set to 1 MSS
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-69
TCP sender congestion control
Event State TCP Sender Action Commentary
ACK receipt for previously unacked data
Slow Start (SS)
CongWin = CongWin + MSS If (CongWin gt Threshold) set state to ldquoCongestion Avoidancerdquo
Resulting in a doubling of CongWin every RTT
ACK receipt for previously unacked data
CongestionAvoidance (CA)
CongWin = CongWin+MSS (MSSCongWin)
Additive increase resulting in increase of CongWin by 1 MSS every RTT
Loss event detected by triple duplicate ACK
SS or CA Threshold = CongWin2 CongWin = ThresholdSet state to ldquoCongestion Avoidancerdquo
Fast recovery implementing multiplicative decrease CongWin will not drop below 1 MSS
Timeout SS or CA Threshold = CongWin2 CongWin = 1 MSSSet state to ldquoSlow Startrdquo
Enter slow start
Duplicate ACK
SS or CA Increment duplicate ACK count for segment being acked
CongWin and Threshold not changed
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-70
1
23
0111
value in arrivingpacketrsquos header
routing algorithm
local forwarding tableheader value output link
0100010101111001
3221
Interplay between routing and forwarding
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-71
u
yx
wv
z2
2
13
1
1
2
53
5
Graph G = (NE)
N = set of routers = u v w x y z
E = set of links = (uv) (ux) (vx) (vw) (xw) (xy) (wy) (wz) (yz)
Graph abstraction
Remark Graph abstraction is useful in other network contexts
Example P2P where N is set of peers and E is set of TCP connections
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-72
Graph abstraction costs
u
yx
wv
z2
2
13
1
1
2
53
5 bull c(xxrsquo) = cost of link (xxrsquo)
- eg c(wz) = 5
bull cost could always be 1 or inversely related to bandwidthor inversely related to congestion
Cost of path (x1 x2 x3hellip xp) = c(x1x2) + c(x2x3) + hellip + c(xp-1xp)
Question Whatrsquos the least-cost path between u and z
Routing algorithm algorithm that finds least-cost path
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-73
Routing Algorithm classificationGlobal or decentralized
informationGlobal all routers have complete
topology link cost info ldquolink staterdquo algorithmsDecentralized router knows physically-
connected neighbors link costs to neighbors
iterative process of computation exchange of info with neighbors
ldquodistance vectorrdquo algorithms
Static or dynamicStatic routes change slowly
over timeDynamic routes change more
quickly periodic update in response to link
cost changes
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-74
A Link-State Routing Algorithm
Dijkstrarsquos algorithm net topology link costs
known to all nodes accomplished via ldquolink
state broadcastrdquo all nodes have same info
computes least cost paths from one node (lsquosourcerdquo) to all other nodes gives forwarding table
for that node iterative after k iterations
know least cost path to k destrsquos
Notation c(xy) link cost from node x
to y = infin if not direct neighbors
D(v) current value of cost of path from source to dest v
p(v) predecessor node along path from source to v
N set of nodes whose least cost path definitively known
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-75
Dijsktrarsquos Algorithm
1 Initialization 2 N = u 3 for all nodes v 4 if v adjacent to u 5 then D(v) = c(uv) 6 else D(v) = infin 7 8 Loop 9 find w not in N such that D(w) is a minimum 10 add w to N 11 update D(v) for all v adjacent to w and not in N 12 D(v) = min( D(v) D(w) + c(wv) ) 13 new cost to v is either old cost to v or known 14 shortest path cost to w plus cost from w to v 15 until all nodes in N
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-76
Dijkstrarsquos algorithm example
Step012345
Nu
uxuxy
uxyvuxyvw
uxyvwz
D(v)p(v)2u2u2u
D(w)p(w)5u4x3y3y
D(x)p(x)1u
D(y)p(y)infin
2x
D(z)p(z)infin infin
4y4y4y
u
yx
wv
z2
2
13
1
1
2
53
5
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-77
Dijkstrarsquos algorithm discussion
Algorithm complexity n nodes each iteration need to check all nodes w not in N n(n+1)2 comparisons O(n2) more efficient implementations possible O(nlogn)
Oscillations possible eg link cost = amount of carried traffic
A
D
C
B1 1+e
e0
e
1 1
0 0
A
D
C
B2+e 0
001+e1
A
D
C
B0 2+e
1+e10 0
A
D
C
B2+e 0
e01+e1
initiallyhellip recompute
routinghellip recompute hellip recompute
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-78
Distance Vector Algorithm (1)
Bellman-Ford Equation (dynamic programming)
Definedx(y) = cost of least-cost path from x to y
Thendx(y) = min c(xv) + dv(y)
where min is taken over all neighbors of x
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-79
Bellman-Ford example (2)
u
yx
wv
z2
2
13
1
1
2
53
5Clearly dv(z) = 5 dx(z) = 3 dw(z) = 3
du(z) = min c(uv) + dv(z) c(ux) + dx(z) c(uw) + dw(z) = min 2 + 5 1 + 3 5 + 3 = 4
Node that achieves minimum is nexthop in shortest path forwarding table
B-F equation says
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-80
Distance Vector Algorithm (3)
Dx(y) = estimate of least cost from x to y
Distance vector Dx = [Dx(y) y є N ] Node x knows cost to each neighbor v c(xv) Node x maintains Dx = [Dx(y) y є N ] Node x also maintains its neighborsrsquo distance
vectors For each neighbor v x maintains
Dv = [Dv(y) y є N ]
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-81
Distance vector algorithm (4)
Basic idea Each node periodically sends its own distance
vector estimate to neighbors When node a node x receives new DV estimate
from neighbor it updates its own DV using B-F equation
Dx(y) larr minvc(xv) + Dv(y) for each node y ∊ N
Under minor natural conditions the estimate Dx(y) converge the actual least cost dx(y)
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-82
Distance Vector Algorithm (5)
Iterative asynchronous each local iteration caused by
local link cost change DV update message from
neighbor
Distributed each node notifies
neighbors only when its DV changes
neighbors then notify their neighbors if necessary
wait for (change in local link cost of msg from neighbor)
recompute estimates
if DV to any dest has
changed notify neighbors
Each node
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-83
x y z
xyz
0 2 7
infin infin infininfin infin infin
from
cost to
from
from
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 3
from
cost to
x y z
xyz
infin infin
infin infin infin
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
0 2 3
from
cost to
x y z
xyz
0 2 3
from
cost tox y z
xyz
0 2 7
from
cost to
x y z
xyz
infininfin infin7 1 0
cost to
infin2 0 1
infin infin infin
2 0 17 1 0
2 0 17 1 0
2 0 13 1 0
2 0 13 1 0
2 0 1
3 1 0
2 0 1
3 1 0
time
x z12
7
y
node x table
node y table
node z table
Dx(y) = minc(xy) + Dy(y) c(xz) + Dz(y) = min2+0 7+1 = 2
Dx(z) = minc(xy) + Dy(z) c(xz) + Dz(z) = min2+1 7+0 = 3
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-84
Distance Vector link cost changes
Link cost changes node detects local link cost
change updates routing info recalculates
distance vector if DV changes notify neighbors
ldquogoodnews travelsfastrdquo
x z14
50
y1
At time t0 y detects the link-cost change updates its DV and informs its neighbors
At time t1 z receives the update from y and updates its table It computes a new least cost to x and sends its neighbors its DV
At time t2 y receives zrsquos update and updates its distance table yrsquos least costs do not change and hence y does not send any message to z
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-85
Distance Vector link cost changesLink cost changes good news travels fast bad news travels slow - ldquocount to infinityrdquo problem 44 iterations before algorithm stabilizes see text
Poissoned reverse If Z routes through Y to get to X
Z tells Y its (Zrsquos) distance to X is infinite (so Y wonrsquot route to X via Z)
will this completely solve count to infinity problem
x z14
50
y60
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-86
Comparison of LS and DV algorithms
Message complexity LS with n nodes E links
O(nE) msgs sent DV exchange between
neighbors only convergence time varies
Speed of Convergence LS O(n2) algorithm requires
O(nE) msgs may have oscillations
DV convergence time varies may be routing loops count-to-infinity problem
Robustness what happens if router malfunctions
LS node can advertise incorrect
link cost each node computes only its
own table
DV DV node can advertise
incorrect path cost each nodersquos table used by
others bull error propagate thru network
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-87
Multiple Access Links and Protocols
Two types of ldquolinksrdquo point-to-point
PPP for dial-up access point-to-point link between Ethernet switch and host
broadcast (shared wire or medium) traditional Ethernet upstream HFC 80211 wireless LAN
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-88
Multiple Access protocols single shared broadcast channel two or more simultaneous transmissions by nodes
interference collision if node receives two or more signals at the same time
multiple access protocol distributed algorithm that determines how nodes share
channel ie determine when node can transmit communication about channel sharing must use channel
itself no out-of-band channel for coordination
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-89
Ideal Mulitple Access Protocol
Broadcast channel of rate R bps1 When one node wants to transmit it can send
at rate R2 When M nodes want to transmit each can
send at average rate RM3 Fully decentralized
no special node to coordinate transmissions no synchronization of clocks slots
4 Simple
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-90
MAC Protocols a taxonomy
Three broad classes Channel Partitioning
divide channel into smaller ldquopiecesrdquo (time slots frequency code)
allocate piece to node for exclusive use
Random Access channel not divided allow collisions ldquorecoverrdquo from collisions
ldquoTaking turnsrdquo Nodes take turns but nodes with more to send can
take longer turns
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-91
Channel Partitioning MAC protocols TDMA
TDMA time division multiple access access to channel in rounds each station gets fixed length slot (length = pkt trans time) in each round unused slots go idle example 6-station LAN 134 have pkt slots 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-92
Channel Partitioning MAC protocols FDMA
FDMA frequency division multiple access channel spectrum divided into frequency bands each station assigned fixed frequency band unused transmission time in frequency bands go idle example 6-station LAN 134 have pkt frequency bands 256 idle
TDM (Time Division Multiplexing) channel divided into N time slots one per user inefficient with low duty cycle users and at light load
FDM (Frequency Division Multiplexing) frequency subdivided
frequ
ency
bands time
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-93
Random Access Protocols
When node has packet to send transmit at full channel data rate R no a priori coordination among nodes
two or more transmitting nodes ldquocollisionrdquo random access MAC protocol specifies
how to detect collisions how to recover from collisions (eg via delayed
retransmissions) Examples of random access MAC protocols
slotted ALOHA ALOHA CSMA CSMACD CSMACA
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-94
Slotted ALOHA
Assumptions all frames same size time is divided into equal
size slots time to transmit 1 frame
nodes start to transmit frames only at beginning of slots
nodes are synchronized if 2 or more nodes
transmit in slot all nodes detect collision
Operation when node obtains fresh
frame it transmits in next slot
no collision node can send new frame in next slot
if collision node retransmits frame in each subsequent slot with prob p until success
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-95
Slotted ALOHA
Pros single active node can
continuously transmit at full rate of channel
highly decentralized only slots in nodes need to be in sync
simple
Cons collisions wasting slots idle slots nodes may be able to
detect collision in less than time to transmit packet
clock synchronization
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-96
Slotted Aloha efficiency
Suppose N nodes with many frames to send each transmits in slot with probability p
prob that node 1 has success in a slot = p(1-p)N-1
prob that any node has a success = Np(1-p)N-1
For max efficiency with N nodes find p that maximizes Np(1-p)N-1
For many nodes take limit of Np(1-p)N-1 as N goes to infinity gives 1e = 37
Efficiency is the long-run fraction of successful slots when there are many nodes each with many frames to send
At best channelused for useful transmissions 37of time
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-97
Pure (unslotted) ALOHA unslotted Aloha simpler no synchronization when frame first arrives
transmit immediately
collision probability increases frame sent at t0 collides with other frames sent in [t0-
1t0+1]
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-98
CSMA (Carrier Sense Multiple Access)
CSMA listen before transmitIf channel sensed idle transmit entire frame If channel sensed busy defer transmission
Human analogy donrsquot interrupt others
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-99
CSMA collisions
collisions can still occurpropagation delay means two nodes may not heareach otherrsquos transmissioncollisionentire packet transmission time wasted
spatial layout of nodes
noterole of distance amp propagation delay in determining collision probability
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-100
CSMACD (Collision Detection)CSMACD carrier sensing deferral as in CSMA
collisions detected within short time colliding transmissions aborted reducing channel
wastage collision detection
easy in wired LANs measure signal strengths compare transmitted received signals
difficult in wireless LANs receiver shut off while transmitting
human analogy the polite conversationalist
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-101
CSMACD collision detection
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-102
ldquoTaking Turnsrdquo MAC protocolschannel partitioning MAC protocols
share channel efficiently and fairly at high load inefficient at low load delay in channel access 1N
bandwidth allocated even if only 1 active node Random access MAC protocols
efficient at low load single node can fully utilize channel
high load collision overheadldquotaking turnsrdquo protocols
look for best of both worlds
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-103
ldquoTaking Turnsrdquo MAC protocolsPolling master node
ldquoinvitesrdquo slave nodes to transmit in turn
concerns polling overhead latency single point of
failure (master)
Token passing control token passed from one
node to next sequentially token message concerns
token overhead latency single point of failure (token)
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-104
Ethernet uses CSMACD
No slots adapter doesnrsquot
transmit if it senses that some other adapter is transmitting that is carrier sense
transmitting adapter aborts when it senses that another adapter is transmitting that is collision detection
Before attempting a retransmission adapter waits a random time that is random access
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-105
Ethernet CSMACD algorithm
1 Adaptor receives datagram from net layer amp creates frame
2 If adapter senses channel idle it starts to transmit frame If it senses channel busy waits until channel idle and then transmits
3 If adapter transmits entire frame without detecting another transmission the adapter is done with frame
4 If adapter detects another transmission while transmitting aborts and sends jam signal
5 After aborting adapter enters exponential backoff after the mth collision adapter chooses a K at random from 012hellip2m-1 Adapter waits K512 bit times and returns to Step 2
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
Transport Layer 3-106
Ethernetrsquos CSMACD (more)Jam Signal make sure all other
transmitters are aware of collision 48 bits
Bit time 1 microsec for 10 Mbps Ethernet for K=1023 wait time is about 50 msec
Exponential Backoff Goal adapt retransmission
attempts to estimated current load
heavy load random wait will be longer
first collision choose K from 01 delay is K 512 bit transmission times
after second collision choose K from 0123hellip
after ten collisions choose K from 01234hellip1023
Seeinteract with Javaapplet on AWL Web sitehighly recommended
- Whatrsquos the Internet ldquonuts and boltsrdquo view
- Slide 2
- Slide 3
- Whatrsquos the Internet a service view
- Whatrsquos a protocol
- Slide 6
- A closer look at network structure
- Protocol ldquoLayersrdquo
- Organization of air travel
- Layering of airline functionality
- Why layering
- Internet protocol stack
- Encapsulation
- Internet transport protocols services
- Transport vs network layer
- Reliable data transfer getting started
- Rdt10 reliable transfer over a reliable channel
- Rdt20 channel with bit errors
- rdt20 FSM specification
- rdt20 operation with no errors
- rdt20 error scenario
- rdt20 has a fatal flaw
- rdt21 sender handles garbled ACKNAKs
- rdt21 receiver handles garbled ACKNAKs
- rdt21 discussion
- rdt22 a NAK-free protocol
- rdt22 sender receiver fragments
- rdt30 channels with errors and loss
- rdt30 sender
- rdt30 in action
- Slide 31
- Performance of rdt30
- rdt30 stop-and-wait operation
- Pipelined protocols
- Pipelining increased utilization
- Go-Back-N
- GBN receiver
- GBN in action
- Selective Repeat
- Selective repeat sender receiver windows
- Selective repeat
- Selective repeat in action
- Selective repeat dilemma
- TCP Overview RFCs 793 1122 1323 2018 2581
- TCP segment structure
- TCP seq rsquos and ACKs
- TCP Round Trip Time and Timeout
- Example RTT estimation
- TCP reliable data transfer
- TCP sender events
- TCP sender (simplified)
- TCP retransmission scenarios
- TCP retransmission scenarios (more)
- TCP ACK generation [RFC 1122 RFC 2581]
- Fast Retransmit
- TCP Flow Control
- TCP Flow control how it works
- TCP Connection Management
- Slide 59
- TCP Connection Management (cont)
- Slide 61
- TCP Congestion Control
- TCP AIMD
- TCP Slow Start
- TCP Slow Start (more)
- Refinement
- Refinement (more)
- Summary TCP Congestion Control
- TCP sender congestion control
- Interplay between routing and forwarding
- Graph abstraction
- Graph abstraction costs
- Routing Algorithm classification
- A Link-State Routing Algorithm
- Dijsktrarsquos Algorithm
- Dijkstrarsquos algorithm example
- Dijkstrarsquos algorithm discussion
- Distance Vector Algorithm (1)
- Bellman-Ford example (2)
- Distance Vector Algorithm (3)
- Distance vector algorithm (4)
- Distance Vector Algorithm (5)
- PowerPoint Presentation
- Distance Vector link cost changes
- Slide 85
- Comparison of LS and DV algorithms
- Multiple Access Links and Protocols
- Multiple Access protocols
- Ideal Mulitple Access Protocol
- MAC Protocols a taxonomy
- Channel Partitioning MAC protocols TDMA
- Channel Partitioning MAC protocols FDMA
- Random Access Protocols
- Slotted ALOHA
- Slide 95
- Slotted Aloha efficiency
- Pure (unslotted) ALOHA
- CSMA (Carrier Sense Multiple Access)
- CSMA collisions
- CSMACD (Collision Detection)
- CSMACD collision detection
- ldquoTaking Turnsrdquo MAC protocols
- Slide 103
- Ethernet uses CSMACD
- Ethernet CSMACD algorithm
- Ethernetrsquos CSMACD (more)
-
top related