http and web content delivery and web content delivery ... http/1.1 304 not modified

37
HTTP and Web Content Delivery COS 461: Computer Networks Spring 2011 Mike Freedman hGp://www.cs.princeton.edu/courses/archive/spring11/cos461/

Upload: trankien

Post on 23-May-2018

224 views

Category:

Documents


1 download

TRANSCRIPT

HTTPandWebContentDelivery

COS461:ComputerNetworksSpring2011

MikeFreedmanhGp://www.cs.princeton.edu/courses/archive/spring11/cos461/

Outline

•  Layering•  HTTP•  HTTPconnecLonmanagementandcaching

•  ProxyingandcontentdistribuLonnetworks– Webproxiesandhierarchicalnetworks– ModerndistributedCDNs(Akamai)

•  Assignment#1(availablenextweek):– WriteabasicWebproxy

•  (Itwillworkwithyourbrowserandrealwebpages!)

2

HTTPBasics

•  HTTPlayeredoverbidirecLonalbytestream

•  InteracLon– Clientsendsrequesttoserver,followedbyresponsefromservertoclient

– Requests/responsesareencodedintext

•  Stateless– Servermaintainsnoinfoaboutpastclientrequests

• WhataboutpersonalizaLon?Datastoredinback‐enddatabase;clientsends“webcookie”usedtolookupdata

3

HTTPneedsastreamofdata

CircuitSwitching

4

http://www.tcpipguide.com/free/t_CircuitSwitchingandPacketSwitchingNetworks-2.htm

Packetswitching

Today’snetworksprovidepacketdelivery,notstreams!

WhatiftheDataDoesn’tFit?5

Problem: Packet size

Solution: Split the data across multiple packets

•  Typical Web page is 10 kbytes

•  On Ethernet, max IP packet is 1500 bytes

GET / cours es/arc hive/s

GET index.html

GET /courses/archive/spr09/cos461/ HTTP/1.1 Host: www.cs.princeton.edu User-Agent: Mozilla/4.03 CRLF

Request

Layering=FuncLonalAbstracLon•  Sub‐dividetheproblem

–  Eachlayerreliesonservicesfromlayerbelow–  Eachlayerexportsservicestolayerabove

•  InterfacebetweenlayersdefinesinteracLon–  HidesimplementaLondetails–  Layerscanchangewithoutdisturbingotherlayers

6

Link hardware

Host-to-host connectivity

Application-to-application channels

Application Sockets:• streams–TCP• datagrams‐UDPPackets

LayerEncapsulaLoninHTTP

Get index.html

Connection ID

Source/Destination

Link Address

User A User B

Link hardware

Host-to-host connectivity

App-to-app channels

Application

7

IPSuite:EndHostsvs.Routers8

HTTP

TCP

IP

Ethernet interface

HTTP

TCP

IP

Ethernet interface

IP IP

Ethernet interface

Ethernet interface

SONET interface

SONET interface

host host

router router

HTTP message

TCP segment

IP packet IP packet IP packet

HTTPRequestExample

GET/HTTP/1.1

Host:sns.cs.princeton.eduAccept:*/*

Accept‐Language:en‐usAccept‐Encoding:gzip,deflate

User‐Agent:Mozilla/5.0(Macintosh;U;IntelMacOSX10.5;en‐US;rv:1.9.2.13)Gecko/20101203Firefox/3.6.13

ConnecLon:Keep‐Alive

9

HTTPRequest

10

10

HTTPResponseExampleHTTP/1.1200OKDate:Wed,02Feb201104:01:21GMTServer:Apache/2.2.3(CentOS)X‐Pingback:hGp://sns.cs.princeton.edu/xmlrpc.phpLast‐Modified:Wed,01Feb201112:41:51GMTETag:"7a11f‐10ed‐3a75ae4a"Accept‐Ranges:bytesContent‐Length:4333Keep‐Alive:Lmeout=15,max=100ConnecLon:Keep‐Alive

<!DOCTYPEhtmlPUBLIC"‐//W3C//DTDXHTML1.0TransiLonal//EN""hGp://www.w3.org/TR/xhtml1/DTD/xhtml1‐transiLonal.dtd">

<htmlxmlns="hGp://www.w3.org/1999/xhtml"dir="ltr"lang="en‐US">

11

HowtoMarkEndofMessage?

•  CloseconnecLon– Onlyservercandothis

•  Content‐Length– Mustknowsizeoftransferinadvance

•  Impliedlength– E.g.,304(NOTMODIFIED)neverhavebodycontent

•  Transfer‐Encoding:chunked(HTTP/1.1)– Arerheaders,eachchunkiscontentlengthinhex,CRLF,thenbody.Finalchunkislength0.

12

Example:ChunkedEncoding

HTTP/1.1 200 OK <CRLF>

Transfer-Encoding: chunked <CRLF> <CRLF> 25 <CRLF> This is the data in the first chunk <CRLF> 1A <CRLF> and this is the second one <CRLF> 0 <CRLF>

•  Especiallyusefulfordynamically‐generatedcontent,aslengthisnotaprioriknown

13

SingleTransferExample

Client Server

ACK

ACK

DAT DAT

Server reads from disk

Client sends HTTP request for HTML

Client parses HTML

14

SingleTransferExample

Client Server SYN

SYN ACK

ACK

ACK

DAT DAT

FIN ACK

0 RTT

1 RTT

2 RTT

Server reads from disk

FIN

Client opens TCP connection

Client sends HTTP request for HTML

Client parses HTML

15

SingleTransferExample

Client Server SYN

SYN

SYN

SYN

ACK

ACK

ACK

ACK

ACK

DAT DAT

DAT

DAT

FIN ACK

0 RTT

1 RTT

2 RTT

3 RTT

4 RTT

Server reads from disk

FIN

Server reads from disk

Client opens TCP connection

Client sends HTTP request for HTML

Client parses HTML Client opens TCP

connection

Client sends HTTP request for image

Image begins to arrive

16

Problemswithsimplemodel•  MulLpleconnecLonsetups

–  Three‐wayhandshakeeachLme(TCP“synchronizing”stream)

•  Shorttransfersarehardonstreamprotocol(TCP)–  Howmuchdatashoulditsendatonce?–  CongesLonavoidance:Takesawhileto“rampup”tohighsendingrate(TCP“slowstart”)

–  Lossrecoveryispoorwhennot“rampedup”

•  LotsofextraconnecLons–  Increasesserverstate/processing–  ServerforcedtokeepconnecLonstate

17

Outline

•  Layering•  HTTP•  HTTPconnecLonmanagementandcaching

•  ProxyingandcontentdistribuLonnetworks– Webproxiesandhierarchicalnetworks– ModerndistributedCDNs(Akamai)

•  Assignment#1(availablenextweek):– WriteabasicWebproxy

•  (Itwillworkwithyourbrowserandrealwebpages!)

18

PersistentConnecLonExample

Client Server

ACK

ACK

DAT

DAT

ACK

0 RTT

1 RTT

2 RTT

Server reads from disk

Client sends HTTP request for HTML

Client parses HTML Client sends HTTP request for image

Image begins to arrive

DAT Server reads from

disk

DAT

19

PersistentHTTPNon‐persistentHTTPissues:•  Requires2RTTsperobject•  OSmustallocateresources

foreachTCPconnecLon

•  ButbrowsersorenopenparallelTCPconnecLonstofetchreferencedobjects

PersistentHTTP:•  ServerleavesconnecLon

openarersendingresponse

•  SubsequentHTTPmessagesbetweensameclient/serveraresentoverconnecLon

Persistentwithoutpipelining:•  Clientissuesnewrequestonly

whenpreviousresponsehasbeenreceived

•  OneRTTforeachobject

Persistentwithpipelining:•  DefaultinHTTP/1.1spec•  Clientsendsrequestsassoonas

itencountersreferencedobject•  AsliGleasoneRTTforallthe

referencedobjects•  Servermusthandleresponses

insameorderasrequests

20

“Persistentwithoutpipelining”mostcommon•  Whendoespipeliningworkbest?

–  Smallobjects,equalLmetoserveeachobject

–  SmallbecausepipeliningsimplyremovesaddiLonal1RTTdelaytorequestnewcontent

•  AlternaLvedesign?– MulLpleparallelconnecLons(~2‐4).Easierserverparallelism

–  No“head‐of‐lineblocking”problemlikepipelining

•  DynamiccontentmakesHOLblockingpossibilityworse

•  InpracLce,manyserversdon’tsupport,andmanybrowsersdonotdefaulttopipelining

21

HTTPCaching•  Clientsorencachedocuments

– Whenshouldoriginbecheckedforchanges?–  EveryLme?Everysession?Date?

•  HTTPincludescachinginformaLoninheaders–  HTTP0.9/1.0used:“Expires:<date>”;“Pragma:no‐cache”–  HTTP/1.1has“Cache‐Control”

•  “No‐Cache”,“Private”,“Max‐age:<seconds>”•  “E‐tag:<opaquevalue>”

•  Ifnotexpired,usecachedcopy•  Ifexpired,usecondiLonGETrequesttoorigin

–  “If‐Modified‐Since:<date>”,“If‐None‐Match:<etag>”–  304(“NotModified”)or200(“OK”)response

22

HTTPCondiLonalRequestGET/HTTP/1.1Host:sns.cs.princeton.edu

User‐Agent:Mozilla/5.0(Macintosh;U;IntelMacOSX10.5;en‐US;rv:1.9.2.13)

ConnecLon:Keep‐AliveIf‐Modified‐Since:Tue,1Feb201117:54:18GMTIf‐None‐Match:"7a11f‐10ed‐3a75ae4a"

23

HTTP/1.1304NotModifiedDate:Wed,02Feb201104:01:21GMTServer:Apache/2.2.3(CentOS)ETag:"7a11f‐10ed‐3a75ae4a"Accept‐Ranges:bytesKeep‐Alive:Lmeout=15,max=100ConnecLon:Keep‐Alive

WebProxyCaches

•  Userconfiguresbrowser:Webaccessesviacache

•  BrowsersendsallHTTPrequeststocache–  Objectincache:cachereturnsobject

–  Else:cacherequestsobjectfromorigin,thenreturnstoclient

client

Proxy server

client origin server

origin server

24

Whenasinglecacheisn’tenough•  Whatiftheworkingsetis>proxydisk?

–  CooperaLon!

•  AstaLchierarchy–  Checklocal–  Ifmiss,checksiblings–  Ifmiss,fetchthroughparent

•  InternetCacheProtocol(ICP)–  ICPv2inRFC2186(&2187)–  UDP‐based,shortLmeout

25

public Internet

Parent web cache

Webtraffichascacheableworkload

“Zipf”or“power‐law”distribuLon

26

Characteris-csofWWWClient‐basedTracesCarlosR.Cunha,AzerBestavros,MarkE.Crovella,BU‐CS‐95‐01

ContentDistribuLonNetworks(CDNs)

•  ContentprovidersareCDNcustomers

ContentreplicaLon•  CDNcompanyinstallsthousands

ofserversthroughoutInternet–  Inlargedatacenters–  Or,closetousers

•  CDNreplicatescustomers’content

•  Whenproviderupdatescontent,CDNupdatesservers

origin server in North America

CDN distribution node

CDN server in S. America CDN server

in Europe

CDN server in Asia

27

ContentDistribuLonNetworks&ServerSelecLon

•  Replicatecontentonmanyservers•  Challenges

– Howtoreplicatecontent– Wheretoreplicatecontent– Howtofindreplicatedcontent– Howtochooseamongknowreplicas– Howtodirectclientstowardsreplica

28

ServerSelecLon

•  Whichserver?– Lowestload:tobalanceloadonservers– Bestperformance:toimproveclientperformance

•  BasedonGeography?RTT?Throughput?Load?– Anyalivenode:toprovidefaulttolerance

•  HowtodirectclientstoaparLcularserver?– AspartofrouLng:anycast,clusterloadbalancing– AspartofapplicaLon:HTTPredirect– Aspartofnaming:DNS

29

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Nearby Akamai cluster

GET index.html

30

http://cache.cnn.com/cnn.com/foo.jpg

HTTP

Akamai cluster

Akamai global DNS server

Akamai regional DNS server

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Nearby Akamai cluster

31

DNS lookup cache.cnn.com

Akamai cluster

3

4 ALIAS: g.akamai.net

Akamai global DNS server

Akamai regional DNS server

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Akamai global DNS server

Akamai regional DNS server

Nearby Akamai cluster

32

Akamai cluster

3

4 6

5

ALIAS a73.g.akamai.net

DNS lookup g.akamai.net

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Akamai global DNS server

Akamai regional DNS server

Nearby Akamai cluster

33

Akamai cluster

3

4 6

5

8

7

DNS a73.g.akamai.net

Address 1.2.3.4

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Akamai global DNS server

Akamai regional DNS server

Nearby Akamai cluster

34

Akamai cluster

3

4 6

5

8

7

9

GET /foo.jpg Host: cache.cnn.com

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Akamai global DNS server

Akamai regional DNS server

Nearby Akamai cluster

35

Akamai cluster

3

4 6

5

8

7

9

GET /foo.jpg Host: cache.cnn.com

12 11

GET foo.jpg

HTTP

HowAkamaiWorks

End‐user

cnn.com (content provider) DNS root server

1 2

Akamai global DNS server

Akamai regional DNS server

Nearby Akamai cluster

36

Akamai cluster

3

4 6

5

8

7

9

12 11

10

Summary•  HTTP:Simpletext‐basedfileexchangeprotocol

–  Supportforstatus/errorresponses,authenLcaLon,client‐sidestatemaintenance,cachemaintenance

•  InteracLonswithTCP–  ConnecLonsetup,reliability,statemaintenance–  PersistentconnecLons

•  Howtoimproveperformance–  PersistentandpipelinedconnecLons–  Caching–  ReplicaLon:Webproxies,cooperaLveproxies,andCDNs

37