Operating Systems and Networks Network Lecture 8: Transport Layer Adrian Perrig Network Security Group ETH Zürich



• I was going to tell you a joke about UDP, but I wasn’t sure if you were going to get it…

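Where we are in the Course
• Starting the Transport Layer!
  – Builds on the network layer to deliver data across networks for applications with the desired reliability or quality

[Figure: protocol stack, bottom to top: Physical, Link, Network, Transport, Application]

Recall
• Transport layer provides end-to-end connectivity across the network

[Figure: two hosts connected through a router. Each host runs app over TCP over IP; one host attaches via 802.11 and the other via Ethernet. The router runs only IP, over 802.11 on one side and Ethernet on the other.]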

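Recall (2)
• Segments carry application data across the network
• Segments are carried within packets within frames

[Figure: nested headers 802.11 | IP | TCP | App, e.g., HTTP. The TCP header and application data form the segment; adding the IP header gives the packet; adding the 802.11 header gives the frame.]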

Transport Layer Services
• Provide different kinds of data delivery across the network to applications

              Unreliable         Reliable
Messages      Datagrams (UDP)
Bytestream                       Streams (TCP)

Comparison of Internet Transports
• TCP is full-featured, UDP is a glorified packet

TCP (Streams)                                       UDP (Datagrams)
Connections                                         Datagrams
Bytes are delivered once, reliably, and in order    Messages may be lost, reordered, duplicated
Arbitrary length content                            Limited message size
Flow control matches sender to receiver             Can send regardless of receiver state
Congestion control matches sender to network        Can send regardless of network state

Socket API
• Simple abstraction to use the network
  – The “network” API (really Transport service) used to write all Internet apps
  – Part of all major OSes and languages; originally Berkeley (Unix) ~1983
• Supports both Internet transport services (Streams and Datagrams)

Socket API (2)
• Sockets let apps attach to the local network at different ports

[Figure: two applications on one host, attached via Socket, Port #1 and Socket, Port #2]

Socket API (3)
• Same API used for Streams and Datagrams

Primitive        Meaning
SOCKET           Create a new communication endpoint
BIND             Associate a local address (port) with a socket
LISTEN           Announce willingness to accept connections
ACCEPT           Passively establish an incoming connection
CONNECT          Actively attempt to establish a connection
SEND(TO)         Send some data over the socket
RECEIVE(FROM)    Receive some data over the socket
CLOSE            Release the socket

• LISTEN, ACCEPT, and CONNECT are only needed for Streams
• The To/From forms of SEND and RECEIVE are for Datagrams
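The stream primitives can be tried out with Python's standard `socket` module. The following is a minimal sketch, not a production server: the helper names `echo_once` and `demo_stream_echo` are made up for illustration, the server handles exactly one connection on the loopback address, and binding to port 0 is used so the OS picks a free port.

```python
import socket
import threading

def echo_once(server_sock):
    """Server side: ACCEPT one connection, RECEIVE data, SEND it back."""
    conn, _peer = server_sock.accept()          # ACCEPT (blocks)
    with conn:                                  # CLOSE on exit
        data = conn.recv(1024)                  # RECEIVE
        conn.sendall(data)                      # SEND

def demo_stream_echo(message=b"hello"):
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCKET
    server.bind(("127.0.0.1", 0))               # BIND (port 0: OS chooses)
    server.listen(1)                            # LISTEN
    port = server.getsockname()[1]
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCKET
    client.connect(("127.0.0.1", port))         # CONNECT
    client.sendall(message)                     # SEND
    reply = client.recv(1024)                   # RECEIVE
    client.close()                              # CLOSE

    worker.join()
    server.close()
    return reply

if __name__ == "__main__":
    print(demo_stream_echo(b"ping"))
```

Note that the client never calls BIND: the OS assigns it a local port when it connects.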

Ports
• Application process is identified by the tuple IP address, protocol, and port
  – Ports are 16-bit integers representing local “mailboxes” that a process leases
• Servers often bind to “well-known ports”
  – <1024, require administrative privileges
• Clients often assigned “ephemeral” ports
  – Chosen by OS, used temporarily

Some Well-Known Ports

Port      Protocol   Use
20, 21    FTP        File transfer
22        SSH        Remote login, replacement for Telnet
25        SMTP       Email
80        HTTP       World Wide Web
110       POP-3      Remote email access
143       IMAP       Remote email access
443       HTTPS      Secure Web (HTTP over SSL/TLS)
554       RTSP       Media player control
631       IPP        Printer sharing

Topics
• Service models
  – Socket API and ports
  – Datagrams, Streams
• User Datagram Protocol (UDP)
• Connections (TCP)
• Sliding Window (TCP)
• Flow control (TCP)
• Retransmission timers (TCP)
• Congestion control (TCP) [later; the topics above are covered this time]

User Datagram Protocol (UDP) (§6.4)
• Sending messages with UDP
  – A shim layer on packets

[Figure: a host telling the network “I just want to send a packet!”]

User Datagram Protocol (UDP)
• Used by apps that don’t want reliability or bytestreams
  – Voice-over-IP (unreliable)
  – DNS, RPC (message-oriented)
  – DHCP (bootstrapping)

(If an application wants reliability and messages then it has work to do!)

Datagram Sockets
[Figure: timeline between Client (host 1) and Server (host 2): the client sends a request and the server returns a reply]

Datagram Sockets (2)

Step   Client (host 1)     Server (host 2)
1      socket              socket
2                          bind
3                          recvfrom*
4      sendto (request)
5      recvfrom*
6                          sendto (reply)
7      close               close

* = call blocks
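The numbered datagram calls map directly onto Python's `socket` module. This is a minimal single-process sketch on the loopback address; the function name `demo_datagram_exchange`, the reply format, and the 2-second timeouts are illustrative assumptions. Because both endpoints live in one thread, the client's `sendto` runs before the server's `recvfrom`, relying on the OS to queue the datagram.

```python
import socket

def demo_datagram_exchange(request=b"request"):
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # 1: socket
    server.bind(("127.0.0.1", 0))                               # 2: bind (OS picks port)
    server.settimeout(2.0)            # avoid blocking forever if a datagram is lost
    server_addr = server.getsockname()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # 1: socket
    client.settimeout(2.0)
    client.sendto(request, server_addr)                         # 4: sendto

    data, client_addr = server.recvfrom(2048)                   # 3: recvfrom
    server.sendto(b"reply to " + data, client_addr)             # 6: sendto

    reply, _ = client.recvfrom(2048)                            # 5: recvfrom
    client.close()                                              # 7: close
    server.close()                                              # 7: close
    return reply

if __name__ == "__main__":
    print(demo_datagram_exchange(b"hi"))
```

There is no LISTEN, ACCEPT, or CONNECT: each datagram names its destination explicitly, and the server learns where to reply from the address returned by `recvfrom`.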

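UDP Buffering
[Figure: applications in the application layer attach to ports. In the transport layer, a port mux/demux places each packet arriving from the network (IP) layer on the message queue of its destination port.]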

UDP Header
• Uses ports to identify sending and receiving application processes
• Datagram length up to 64K
• Checksum (16 bits) to detect corrupted datagrams

UDP Pseudoheader
• Optional checksum covers UDP segment and IP pseudoheader
  – Checks key IP fields (addresses)
  – Value of zero means “no checksum”
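To make the pseudoheader idea concrete, here is a sketch of the checksum computation for UDP over IPv4. It assumes the standard 16-bit ones'-complement Internet checksum and the IPv4 pseudoheader layout (source address, destination address, a zero byte, protocol number 17, UDP length); the function names are illustrative and the addresses and ports in the usage lines are made up.

```python
import socket
import struct

def ones_complement_sum16(data: bytes) -> int:
    """Sum 16-bit big-endian words with end-around carry."""
    if len(data) % 2:
        data += b"\x00"                            # pad to a whole number of words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total

def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)                      # UDP header is 8 bytes
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # zero, protocol 17 (UDP), length
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum field = 0
    checksum = ~ones_complement_sum16(pseudo + header + payload) & 0xFFFF
    return checksum or 0xFFFF                      # 0 is reserved for "no checksum"

if __name__ == "__main__":
    print(hex(udp_checksum("10.0.0.1", "10.0.0.2", 5000, 53, b"hello")))
```

A receiver repeats the same sum with the transmitted checksum in place; an undamaged datagram sums to 0xFFFF. Because the addresses are part of the sum, a datagram delivered to the wrong host also fails the check.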

Connection Establishment (6.5.5, 6.5.7, 6.2.2)
• How to set up connections
  – We’ll see how TCP does it

[Figure: two hosts exchanging “SYN!”, “SYN ACK!”, and “ACK!” across the network]

Connection Establishment
• Both sender and receiver must be ready before we start the transfer of data
  – Need to agree on a set of parameters
  – e.g., the Maximum Segment Size (MSS)
• This is signaling
  – It sets up state at the endpoints
  – Like “dialing” for a telephone call

Three-Way Handshake
• Used in TCP; opens connection for data in both directions
• Each side probes the other with a fresh Initial Sequence Number (ISN)
  – Sends on a SYNchronize segment
  – Echo on an ACKnowledge segment
• Chosen to be robust even against delayed duplicates

[Figure: timeline between active party (client) and passive party (server)]

Three-Way Handshake (2)
• Three steps:
  – Client sends SYN(x)
  – Server replies with SYN(y)ACK(x+1)
  – Client replies with ACK(y+1)
  – SYNs are retransmitted if lost
• Sequence and ack numbers carried on further segments

[Figure: timeline of steps 1 to 3 between active party (client) and passive party (server)]
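The three steps can be traced with a few lines of Python. This is only a sketch of the numbering, not of real TCP segments: the dictionary fields and the function name `three_way_handshake` are illustrative, and the arithmetic assumes TCP's 32-bit sequence numbers, which wrap modulo 2^32.

```python
import random

SEQ_MOD = 2 ** 32   # TCP sequence numbers are 32 bits and wrap around

def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as dictionaries."""
    x = random.getrandbits(32) if client_isn is None else client_isn
    y = random.getrandbits(32) if server_isn is None else server_isn

    syn = {"flags": "SYN", "seq": x}                               # step 1: SYN(x)
    syn_ack = {"flags": "SYN+ACK", "seq": y,
               "ack": (syn["seq"] + 1) % SEQ_MOD}                  # step 2: SYN(y)ACK(x+1)
    ack = {"flags": "ACK", "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % SEQ_MOD}                  # step 3: ACK(y+1)
    return [syn, syn_ack, ack]

if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=100, server_isn=300):
        print(segment)
```

Each side only accepts an ACK that echoes the fresh ISN it just chose, which is what lets it tell a current handshake from a delayed duplicate of an old one.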

Three-Way Handshake (3)
• Suppose delayed, duplicate copies of the SYN and ACK arrive at the server!
  – Improbable, but anyhow …

[Figure: timeline between active party (client) and passive party (server) with the delayed duplicates arriving at the server]

Three-Way Handshake (4)
• Connection will be cleanly rejected on both sides :-)

[Figure: same timeline; both the duplicate SYN and the duplicate ACK are answered with REJECT]

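TCP Connection State Machine
• Captures the states (rectangles) and transitions (arrows)
  – A/B means event A triggers the transition, with action B
• Both parties run instances of this state machine

[Figure: TCP connection state machine]

TCP Connections (2)
• Follow the path of the client:

[Figure: client path through the state machine]

TCP Connections (3)
• And the path of the server:

[Figure: server path through the state machine]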

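TCP Connections (4)
• Again, with states …

[Figure: handshake timeline with states. Active party (client): CLOSED, then SYN_SENT after sending the SYN (1), then ESTABLISHED after receiving the SYN+ACK (2) and sending the ACK (3). Passive party (server): CLOSED, then LISTEN, then SYN_RCVD after receiving the SYN, then ESTABLISHED after receiving the ACK.]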
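The connection-setup part of the state machine is small enough to write as a lookup table. The sketch below covers only the transitions named in the handshake timeline; the event strings and the `run` helper are illustrative choices, and the real TCP state machine has more states and transitions (simultaneous open, resets, timeouts, release).

```python
# (state, event) -> (next state, action), in the slides' "A/B" spirit:
# event A triggers the transition, with action B.
TRANSITIONS = {
    ("CLOSED", "active open"): ("SYN_SENT", "send SYN"),
    ("CLOSED", "passive open"): ("LISTEN", None),
    ("LISTEN", "recv SYN"): ("SYN_RCVD", "send SYN+ACK"),
    ("SYN_SENT", "recv SYN+ACK"): ("ESTABLISHED", "send ACK"),
    ("SYN_RCVD", "recv ACK"): ("ESTABLISHED", None),
}

def run(events, state="CLOSED"):
    """Feed events to the machine and return the list of states visited."""
    trace = [state]
    for event in events:
        state, _action = TRANSITIONS[(state, event)]  # KeyError means unhandled case
        trace.append(state)
    return trace

if __name__ == "__main__":
    print("client:", run(["active open", "recv SYN+ACK"]))
    print("server:", run(["passive open", "recv SYN", "recv ACK"]))
```

An event that is missing from the table raises `KeyError`, which is one simple way to notice a case the specification has not handled.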

TCP Connections (5)
• Finite state machines are a useful tool to specify and check the handling of all cases that may occur
• TCP allows for simultaneous open
  – i.e., both sides open at once instead of the client-server pattern
  – Try at home to confirm it works :-)

Connection Release (6.5.6-6.5.7, 6.2.3)
• How to release connections
  – We’ll see how TCP does it

[Figure: two hosts exchanging “FIN!” across the network]

Connection Release
• Orderly release by both parties when done
  – Delivers all pending data and “hangs up”
  – Cleans up state in sender and receiver
• Key problem is to provide reliability while releasing
  – TCP uses a “symmetric” close in which both sides shut down independently

TCP Connection Release
• Two steps:
  – Active party sends FIN(x), passive party sends ACK
  – Passive party sends FIN(y), active party sends ACK
  – FINs are retransmitted if lost
• Each FIN/ACK closes one direction of data transfer

[Figure: timeline of steps 1 and 2 between active party and passive party]

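TCP Connection State Machine
• Both parties run instances of this state machine

[Figure: TCP connection state machine]

TCP Release
• Follow the active party

[Figure: active party's path through the state machine]

TCP Release (2)
• Follow the passive party

[Figure: passive party's path through the state machine]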

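TCP Release (3)
• Again, with states …

[Figure: release timeline with states. Active party: ESTABLISHED, then FIN_WAIT_1 after sending its FIN (1), then FIN_WAIT_2 after receiving the ACK, then TIME_WAIT after receiving the passive party's FIN (2) and sending the ACK, then CLOSED after a timeout. Passive party: ESTABLISHED, then CLOSE_WAIT after receiving the FIN and sending the ACK, then LAST_ACK after sending its own FIN, then CLOSED after receiving the ACK.]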

TIME_WAIT State
• We wait a long time (two times the maximum segment lifetime of 60 seconds) after sending all segments and before completing the close
• Why?
  – ACK might have been lost, in which case FIN will be resent for an orderly close
  – Could otherwise interfere with a subsequent connection

Sliding Windows (§3.4, §6.5.8)
• The sliding window algorithm
  – Pipelining and reliability
  – Building on Stop-and-Wait

[Figure: a host saying “Yeah!” to the network]

Recall
• ARQ with one message at a time is Stop-and-Wait (normal case below)

[Figure: timeline between sender and receiver: Frame 0 is answered by ACK 0 before the timeout expires, then Frame 1 is answered by ACK 1]

Limitation of Stop-and-Wait
• It allows only a single message to be outstanding from the sender:
  – Fine for LAN (only one frame fits)
  – Not efficient for network paths with BD >> 1 packet (BD = bandwidth-delay product)

Limitation of Stop-and-Wait (2)
• Example: R = 1 Mbps, D = 50 ms
  – RTT (Round Trip Time) = 2D = 100 ms
  – How many packets/sec?
  – What if R = 10 Mbps?
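The two questions can be worked through numerically. The sketch below assumes that Stop-and-Wait completes one packet per RTT and ignores the time to put the packet on the wire; the 1200-byte packet size is an assumption made here to turn packets per second into link utilization, and the function name is illustrative.

```python
def stop_and_wait(rate_bps, one_way_delay_s, packet_bytes=1200):
    """Return (packets per second, fraction of the link rate actually used)."""
    rtt_s = 2 * one_way_delay_s              # RTT = 2D
    packets_per_sec = 1 / rtt_s              # one packet outstanding per RTT
    goodput_bps = packets_per_sec * packet_bytes * 8
    return packets_per_sec, goodput_bps / rate_bps

if __name__ == "__main__":
    for rate in (1e6, 10e6):                 # R = 1 Mbps, then R = 10 Mbps
        pps, utilization = stop_and_wait(rate, 0.050)
        print(f"R = {rate / 1e6:g} Mbps: {pps:g} packets/sec, "
              f"{utilization:.2%} of the link used")
```

Under these assumptions the sender manages about 10 packets per second at either rate: a faster link does not help, it only lowers the fraction of capacity that is used.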

Sliding Window
• Generalization of stop-and-wait
  – Allows W packets to be outstanding
  – Can send W packets per RTT (= 2D)
  – Pipelining improves performance
  – Need W = 2BD to fill network path

Sliding Window (2)
• What W will use the network capacity?
• Ex: R = 1 Mbps, D = 50 ms
• Ex: What if R = 10 Mbps?

Sliding Window (3)
• Ex: R = 1 Mbps, D = 50 ms
  – $2BD = 10^{6}\ \text{b/sec} \times 100 \times 10^{-3}\ \text{sec} = 100\ \text{kbit}$
  – W = 2BD ≈ 10 packets of 1200 bytes
• Ex: What if R = 10 Mbps?
  – 2BD = 1000 kbit
  – W = 2BD ≈ 100 packets of 1200 bytes
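The same arithmetic in code form, as a quick check of the two examples. The function name `window_for_path` is illustrative; the 1200-byte packet size comes from the examples above.

```python
def window_for_path(rate_bps, one_way_delay_s, packet_bytes=1200):
    """Return (2BD in bits, window W in packets) needed to fill the path."""
    two_bd_bits = rate_bps * 2 * one_way_delay_s      # bits in flight over one RTT
    return two_bd_bits, two_bd_bits / (packet_bytes * 8)

if __name__ == "__main__":
    for rate in (1e6, 10e6):
        bits, packets = window_for_path(rate, 0.050)
        print(f"R = {rate / 1e6:g} Mbps: 2BD = {bits / 1e3:g} kbit, W = {packets:.1f} packets")
```

The exact quotients are a little over 10 and a little over 100 packets, since a 1200-byte packet is 9600 bits rather than 10,000; the round figures are the usual approximation.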

Sliding Window Protocol
• Many variations, depending on how buffers, acknowledgements, and retransmissions are handled
• Go-Back-N
  – Simplest version, can be inefficient
• Selective Repeat
  – More complex, better performance

Sliding Window – Sender
• Sender buffers up to W segments until they are acknowledged
  – LFS = LAST FRAME SENT, LAR = LAST ACK REC’D
  – Sends while LFS – LAR ≤ W

[Figure: sender sequence number line with W = 5. Segments up to LAR are acked, segments from LAR+1 to LFS are unacked, the rest of the window is available, and sequence numbers beyond the window are unavailable.]

Sliding Window – Sender (2)
• Transport accepts another segment of data from the Application ...
  – Transport sends it (as LFS – LAR → 5)

[Figure: same number line with W = 5; the window is now full of unacked segments and nothing is available.]

Sliding Window – Sender (3)
• Next higher ACK arrives from peer …
  – Window advances, buffer is freed
  – LFS – LAR → 4 (can send one more)

[Figure: same number line; LAR has advanced by one, leaving four unacked segments and one available slot.]
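The sender's bookkeeping can be sketched with two counters and a buffer. In this illustrative `WindowSender` class (not taken from the slides), sequence numbers count whole segments starting at 1, ACKs are treated as cumulative, and timers and retransmission are left out. A new segment is sent only while LFS – LAR < W, which keeps the invariant LFS – LAR ≤ W.

```python
class WindowSender:
    """Sender side of a sliding window: tracks LAR, LFS, and unacked data."""

    def __init__(self, window):
        self.window = window      # W
        self.lar = 0              # LAST ACK REC'D
        self.lfs = 0              # LAST FRAME SENT
        self.unacked = {}         # seq -> data, kept for possible retransmission

    def can_send(self):
        return self.lfs - self.lar < self.window

    def send(self, data):
        """Buffer and 'send' one segment; return its seq, or None if the window is full."""
        if not self.can_send():
            return None
        self.lfs += 1
        self.unacked[self.lfs] = data
        return self.lfs

    def on_ack(self, ack):
        """Cumulative ACK: everything up to and including `ack` is acknowledged."""
        if self.lar < ack <= self.lfs:
            for seq in range(self.lar + 1, ack + 1):
                self.unacked.pop(seq, None)   # buffer is freed
            self.lar = ack                    # window advances

if __name__ == "__main__":
    sender = WindowSender(window=5)
    print([sender.send(b"x") for _ in range(6)])   # the sixth send is refused
    sender.on_ack(1)
    print(sender.send(b"y"), sender.lfs - sender.lar)
```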

Sliding Window – Go-Back-N
• Receiver keeps only a single packet buffer for the next segment
  – State variable, LAS = LAST ACK SENT
• On receive:
  – If seq. number is LAS+1, accept and pass it to app, update LAS, send ACK
  – Otherwise discard (as out of order)
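The receive rule translates almost line for line into code. In this sketch the class name `GoBackNReceiver` is illustrative, sequence numbers count segments from 1, and a Python list named `delivered` stands in for passing data to the app.

```python
class GoBackNReceiver:
    """Go-Back-N receiver: accepts only the next in-order segment."""

    def __init__(self):
        self.las = 0           # LAST ACK SENT
        self.delivered = []    # stands in for data passed up to the app

    def on_segment(self, seq, data):
        """Return the ACK number to send, or None if the segment is discarded."""
        if seq == self.las + 1:
            self.delivered.append(data)   # accept and pass to app
            self.las = seq                # update LAS
            return self.las               # send ACK
        return None                       # out of order: discard

if __name__ == "__main__":
    receiver = GoBackNReceiver()
    for seq, data in [(1, "a"), (3, "c"), (2, "b"), (3, "c")]:
        print(seq, "->", receiver.on_segment(seq, data))
    print(receiver.delivered)
```

Segment 3 is thrown away the first time because 2 has not arrived yet, so the sender has to send it again; that is the inefficiency Go-Back-N accepts in exchange for a one-segment receive buffer.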

Sliding Window – Selective Repeat
• Receiver passes data to app in order, and buffers out-of-order segments to reduce retransmissions
• ACK conveys highest in-order segment, plus hints about out-of-order segments
• TCP uses a selective repeat design; we’ll see the details later

Sliding Window – Selective Repeat (2)
• Buffers W segments, keeps state variable, LAS = LAST ACK SENT
• On receive:
  – Buffer segments [LAS+1, LAS+W]
  – Pass up to app in-order segments from LAS+1, and update LAS
  – Send ACK for LAS regardless
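A sketch of the receive rule follows. The class name `SelectiveRepeatReceiver` is illustrative, sequence numbers count segments from 1, the app is assumed to consume in-order data immediately (modeled by the `delivered` list), and the out-of-order hints that a real ACK might carry are omitted.

```python
class SelectiveRepeatReceiver:
    """Selective Repeat receiver: buffers out-of-order segments within the window."""

    def __init__(self, window):
        self.window = window   # W
        self.las = 0           # LAST ACK SENT
        self.buffer = {}       # seq -> data for segments waiting on a gap
        self.delivered = []    # stands in for data passed up to the app

    def on_segment(self, seq, data):
        """Process one segment and return the ACK number (LAS) to send."""
        if self.las + 1 <= seq <= self.las + self.window:
            self.buffer[seq] = data                  # buffer [LAS+1, LAS+W]
        while self.las + 1 in self.buffer:           # pass up in-order segments
            self.las += 1
            self.delivered.append(self.buffer.pop(self.las))
        return self.las                              # ACK for LAS regardless

if __name__ == "__main__":
    receiver = SelectiveRepeatReceiver(window=4)
    for seq, data in [(2, "b"), (3, "c"), (1, "a")]:
        print(seq, "->", receiver.on_segment(seq, data))
    print(receiver.delivered)
```

Segments 2 and 3 wait in the buffer until 1 arrives, at which point all three are delivered and the ACK jumps straight to 3.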

Sliding Window – Retransmissions
• Go-Back-N sender uses a single timer to detect losses
  – On timeout, resends buffered packets starting at LAR+1
• Selective Repeat sender uses a timer per unacked segment to detect losses
  – On timeout for segment, resend it
  – Hope to resend fewer segments

Sequence Numbers
• Need more than 0/1 for Stop-and-Wait …
  – But how many?
• For Selective Repeat, need W numbers for packets, plus W for acks of earlier packets
  – 2W seq. numbers
  – Fewer for Go-Back-N (W+1)
• Typically implement seq. number with an N-bit counter that wraps around at $2^N - 1$
  – E.g., N = 8: …, 253, 254, 255, 0, 1, 2, 3, …

Sequence Time Plot
[Figure: sequence number versus time, with labels: Transmissions (at Sender), Acks (at Receiver), Delay (= RTT/2), Window size]

Sequence Time Plot (2)
[Figure: sequence number versus time for a Go-Back-N scenario]

Sequence Time Plot (3)
[Figure: sequence number versus time, with labels: Loss, Timeout, Retransmissions]

Flow Control (§6.5.8)
• Adding flow control to the sliding window algorithm
  – To slow the over-enthusiastic sender

[Figure: a receiver telling the network “Please slow down!”]

Problem
• Sliding window uses pipelining to keep the network busy
  – What if the receiver is overloaded?

[Figure: “Big Iron” streaming video to “Wee Mobile”, which responds “Arg …”]

Sliding Window – Receiver
• Consider receiver with W buffers
  – LAS = LAST ACK SENT, app pulls in-order data from buffer with recv() call

[Figure: receiver sequence number line with W = 5. Numbers up to LAS are finished, the next W numbers are acceptable, and higher numbers are too high.]

Sliding Window – Receiver (2)
• Suppose the next two segments arrive but app does not call recv()

Sliding Window – Receiver (3)
• LAS rises, but we can’t slide window!

[Figure: same number line; two buffers hold acked segments, so only three numbers remain acceptable.]

Sliding Window – Receiver (4)
• If further segments arrive (even in order) we can fill the buffer
  – Must drop segments until app recvs!

[Figure: same number line; all five buffers hold acked segments and nothing is acceptable.]

Sliding Window – Receiver (5)
• App recv() takes two segments
  – Window slides (phew)

[Figure: same number line; three buffers hold acked segments and two numbers are acceptable again.]

Flow Control
• Avoid loss at receiver by telling sender the available buffer space
  – WIN = #Acceptable, not W (from LAS)

[Figure: receiver number line with W = 5; two buffers hold acked segments that the app has not read, so three numbers are acceptable.]

Flow Control (2)
• Sender uses the lower of the sliding window and flow control window (WIN) as the effective window size

[Figure: same receiver state, advertised to the sender as WIN = 3.]
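The interaction between the two windows comes down to a `min`. The sketch below counts whole segments rather than bytes, and the function names are illustrative: the receiver advertises how many buffers are still free, and the sender limits itself to the smaller of its own window and that advertisement.

```python
def advertised_window(buffer_slots, held_unread):
    """Receiver side: WIN = number of acceptable segments (free buffers)."""
    return max(buffer_slots - held_unread, 0)

def effective_window(sliding_window, win):
    """Sender side: use the lower of the sliding window and the flow control window."""
    return min(sliding_window, win)

def can_send(lfs, lar, sliding_window, win):
    """True if one more segment may be sent without exceeding the effective window."""
    return lfs - lar < effective_window(sliding_window, win)

if __name__ == "__main__":
    # W = 5 buffers, two acked segments not yet read by the app
    win = advertised_window(buffer_slots=5, held_unread=2)
    print("WIN =", win)
    print("effective window =", effective_window(5, win))
    print("can send with 2 outstanding:", can_send(lfs=2, lar=0, sliding_window=5, win=win))
    print("can send with 3 outstanding:", can_send(lfs=3, lar=0, sliding_window=5, win=win))
```

When the app finally calls recv(), `held_unread` drops, the next ACK carries a larger WIN, and the sender speeds up again.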

Flow Control (3)
• TCP-style example
  – SEQ/ACK sliding window
  – Flow control with WIN
  – SEQ + length < ACK + WIN
  – 4 KB buffer at receiver
  – Circular buffer of bytes

Retransmission Timeouts (§6.5.9)
• How to set the timeout for sending a retransmission
  – Adapting to the network path

[Figure: a sender wondering “Lost?” about a segment in the network]

Retransmissions
• With sliding window, the strategy for detecting loss is the timeout
  – Set timer when a segment is sent
  – Cancel timer when ack is received
  – If timer fires, retransmit data as lost

[Figure: sender exclaiming “Retransmit!”]

Timeout Problem
• Timeout should be “just right”
  – Too long wastes network capacity
  – Too short leads to spurious resends
  – But what is “just right”?
• Easy to set on a LAN (Link)
  – Short, fixed, predictable RTT
• Hard on the Internet (Transport)
  – Wide range, variable RTT

Example of RTTs
[Figure: round trip time (ms, 0 to 1000) versus time (seconds, 0 to 200) for the path BCN→SEA→BCN]

Example of RTTs (2)
[Figure: same plot, annotated “Propagation (+ transmission) delay ≈ 2D” and “Variation due to queuing at routers, changes in network paths, etc.”]

Example of RTTs (3)
[Figure: same plot, annotated “Timer too high!” and “Timer too low!”]
• Need to adapt to the network conditions

Adaptive Timeout
• Keep smoothed estimates of the RTT (1) and variance in RTT (2)
  – Update estimates with a moving average
    1. $SRTT_{N+1} = 0.9 \cdot SRTT_N + 0.1 \cdot RTT_{N+1}$
    2. $Svar_{N+1} = 0.9 \cdot Svar_N + 0.1 \cdot |RTT_{N+1} - SRTT_{N+1}|$
• Set timeout to a multiple of estimates
  – To estimate the upper RTT in practice
  – $\text{TCP Timeout}_N = SRTT_N + 4 \cdot Svar_N$
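The two moving averages and the timeout rule fit in a small class. This sketch uses the 0.9/0.1 weights and the factor of 4 given above; the class name `RttEstimator` and the initialization (first sample as the initial SRTT, Svar starting at zero) are assumptions made here, since the update rules do not say how to start.

```python
class RttEstimator:
    """Smoothed RTT and variation estimates with an adaptive timeout."""

    def __init__(self, first_rtt, weight=0.9):
        self.weight = weight     # weight on the old estimate (0.9 above)
        self.srtt = first_rtt    # assumption: seed SRTT with the first sample
        self.svar = 0.0          # assumption: no variation known yet

    def update(self, rtt):
        """Fold in a new RTT sample and return the new timeout."""
        w = self.weight
        self.srtt = w * self.srtt + (1 - w) * rtt                    # rule 1
        self.svar = w * self.svar + (1 - w) * abs(rtt - self.srtt)   # rule 2 (uses new SRTT)
        return self.timeout()

    def timeout(self):
        return self.srtt + 4 * self.svar

if __name__ == "__main__":
    estimator = RttEstimator(first_rtt=100.0)          # milliseconds
    for sample in (100.0, 100.0, 200.0, 100.0, 100.0):
        print(f"RTT {sample:6.1f} ms -> timeout {estimator.update(sample):6.1f} ms")
```

A single 200 ms spike barely moves SRTT but raises Svar, so the timeout widens quickly and then decays back as the samples settle.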

Example of Adaptive Timeout
[Figure: RTT (ms, 0 to 1000) versus time (seconds, 0 to 200), with the SRTT and Svar curves overlaid]

Example of Adaptive Timeout (2)
[Figure: same plot with the timeout curve (SRTT + 4*Svar) overlaid; one early timeout is marked]

Adaptive Timeout (2)
• Simple to compute, does a good job of tracking actual RTT
  – Little “headroom” to lower
  – Yet very few early timeouts
• Turns out to be important for good performance and robustness

Transmission Control Protocol (TCP) (§6.5)
• How TCP works!
  – The transport protocol used for most content on the Internet

[Figure: hosts around the network chanting “TCP TCP TCP” and “We love TCP/IP!”]

TCP Features
• A reliable bytestream service
• Based on connections
• Sliding window for reliability
  – With adaptive timeout
• Flow control for slow receivers
• Congestion control to allocate network bandwidth [next time; the features above are covered this time]

Reliable Bytestream
• Message boundaries not preserved from send() to recv()
  – But reliable and ordered (receive bytes in same order as sent)

[Figure: the sender transmits four segments, each with 512 bytes of data and carried in an IP packet; the receiver delivers 2048 bytes of data to the app in a single recv() call]

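Reliable Bytestream (2)
• Bidirectional data transfer
  – Control information (e.g., ACK) piggybacks on data segments in reverse direction

[Figure: hosts A and B exchanging segments. Segments from B to A carry data B→A plus ACK A→B; segments from A to B carry data A→B plus ACK B→A.]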

TCP Header (1)
• Ports identify apps (socket API)
  – 16-bit identifiers

TCP Header (2)
• SEQ/ACK used for sliding window
  – Selective Repeat, with byte positions

TCP Sliding Window – Receiver
• Cumulative ACK tells next expected byte sequence number (“LAS+1”)
• Optionally, selective ACKs (SACK) give hints for receiver buffer state
  – List up to 3 ranges of received bytes

[Figure: receiver sends “ACK up to 100 and 200-299”]

TCP Sliding Window – Sender
• Uses adaptive retransmission timeout to resend data from LAS+1
• Uses heuristics to infer loss quickly and resend to avoid timeouts
  – “Three duplicate ACKs” treated as loss

[Figure: the sender receives ACK 100; then ACK 100, 200-299; then ACK 100, 200-399; then ACK 100, 200-499. Sender decides 100-199 is lost.]
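The duplicate-ACK heuristic amounts to counting repeats of the same cumulative ACK number. The sketch below looks only at the cumulative ACK and ignores the SACK ranges; the class name `DupAckDetector` is illustrative, and what the sender does after inferring loss (retransmission, window adjustment) is left out.

```python
class DupAckDetector:
    """Infer loss when the same cumulative ACK is repeated `threshold` times."""

    def __init__(self, threshold=3):
        self.threshold = threshold   # "three duplicate ACKs"
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack):
        """Return the byte position to resend from if loss is inferred, else None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.threshold:
                return ack           # data starting at `ack` is presumed lost
        else:
            self.last_ack = ack      # new data acknowledged: start over
            self.dup_count = 0
        return None

if __name__ == "__main__":
    detector = DupAckDetector()
    # ACK 100 arrives four times: the original plus three duplicates
    print([detector.on_ack(100) for _ in range(4)])
```

The fourth ACK for 100 is the third duplicate, so the detector reports that the data starting at byte 100 should be resent without waiting for the retransmission timer.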

TCP Header (3)
• SYN/FIN/RST flags for connections
  – Flag indicates segment is a SYN etc.

TCP Header (4)
• Window size for flow control
  – Relative to ACK, and in bytes

Other TCP Details
• Many, many quirks you can learn about its operation
  – But they are the details
• Biggest remaining mystery is the workings of congestion control
  – We’ll tackle this next time!