HTTP and Web Content Delivery
COS 461: Computer Networks, Spring 2011
Mike Freedman
http://www.cs.princeton.edu/courses/archive/spring11/cos461/

Outline
• Layering
• HTTP
• HTTP connection management and caching
• Proxying and content distribution networks
  – Web proxies and hierarchical networks
  – Modern distributed CDNs (Akamai)
• Assignment #1 (available next week):
  – Write a basic Web proxy
  – (It will work with your browser and real web pages!)

HTTP Basics
• HTTP layered over bidirectional byte stream
• Interaction
  – Client sends request to server, followed by response from server to client
  – Requests/responses are encoded in text
• Stateless
  – Server maintains no info about past client requests
• What about personalization? Data stored in back-end database; client sends “web cookie” used to look up data

HTTP needs a stream of data
[Figure: circuit switching vs. packet switching, from http://www.tcpipguide.com/free/t_CircuitSwitchingandPacketSwitchingNetworks-2.htm]
Today’s networks provide packet delivery, not streams!

What if the Data Doesn’t Fit?
Problem: Packet size
• Typical Web page is 10 kbytes
• On Ethernet, max IP packet is 1500 bytes
Solution: Split the data across multiple packets
[Figure: the request "GET /courses/archive/spr09/cos461/ HTTP/1.1, Host: www.cs.princeton.edu, User-Agent: Mozilla/4.03, CRLF" (shorthand: "GET index.html") is split into packet-sized pieces such as "GET /", "cours", "es/arc", "hive/s".]
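
As a rough illustration of the split, the sketch below cuts a 10,000-byte page into pieces of at most 1500 bytes. It assumes, for simplicity, that the full 1500 bytes is available for data; in practice IP and TCP headers take part of each packet, so the usable payload is smaller. The name `split_into_packets` is illustrative and not from the lecture.

```python
def split_into_packets(data: bytes, max_size: int) -> list[bytes]:
    """Cut data into consecutive pieces of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [data[i:i + max_size] for i in range(0, len(data), max_size)]


# Assumption: a 10 kbyte page is modeled as 10,000 zero bytes.
page = bytes(10_000)
packets = split_into_packets(page, 1500)
print(len(packets), len(packets[-1]))
```

With these numbers the page needs seven packets, the last one carrying the remaining 1000 bytes. The receiver recovers the page by concatenating the pieces in order, which is part of what a stream protocol has to arrange on top of packet delivery.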

Layering = Functional Abstraction
• Sub-divide the problem
  – Each layer relies on services from layer below
  – Each layer exports services to layer above
• Interface between layers defines interaction
  – Hides implementation details
  – Layers can change without disturbing other layers
[Figure: layer stack, bottom to top: link hardware; host-to-host connectivity; application-to-application channels; application. Sockets offer streams (TCP) and datagrams (UDP); the layers below carry packets.]

Layer Encapsulation in HTTP
[Figure: User A sends "Get index.html" to User B through the stack application, app-to-app channels, host-to-host connectivity, link hardware. On the way down the message gains a connection ID, then source/destination, then a link address.]
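
The nesting in the encapsulation figure can be sketched in a few lines. This is a toy model under stated assumptions: the bracketed header strings are invented for illustration (real headers are binary structures), and `encapsulate` and `decapsulate` are hypothetical helper names.

```python
def encapsulate(payload: bytes, headers: list[bytes]) -> bytes:
    """Prepend headers, listed from the highest layer to the lowest."""
    frame = payload
    for header in headers:
        frame = header + frame
    return frame


def decapsulate(frame: bytes, header: bytes) -> bytes:
    """Strip one layer's header, as the receiving layer would."""
    if not frame.startswith(header):
        raise ValueError("unexpected header")
    return frame[len(header):]


# Made-up headers: connection ID, source/destination, link address.
layers = [b"[conn=1]", b"[src=A,dst=B]", b"[link=B]"]
frame = encapsulate(b"GET index.html", layers)
print(frame)
```

The lowest layer's header ends up outermost, so the receiver peels headers in the reverse order, each layer seeing only its own header and an opaque payload.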

IP Suite: End Hosts vs. Routers
[Figure: each of the two hosts runs HTTP over TCP over IP over an Ethernet interface. The two routers between them run only IP, over Ethernet and SONET interfaces. The HTTP message and the TCP segment are exchanged between the hosts; the IP packet is handled on each of the three hops.]

HTTP Request Example
GET / HTTP/1.1
Host: sns.cs.princeton.edu
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13
Connection: Keep-Alive
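
Because requests are plain text, producing one is mostly string formatting. The sketch below is a minimal illustration, assuming a request with no body; `build_request` is a hypothetical helper, not part of any library. Each line ends in CRLF, and an empty line marks the end of the headers.

```python
def build_request(method: str, path: str, headers: dict[str, str]) -> bytes:
    """Serialize a body-less HTTP/1.1 request: request line, headers, blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "GET",
    "/",
    {
        "Host": "sns.cs.princeton.edu",
        "Accept": "*/*",
        "Connection": "Keep-Alive",
    },
)
print(request.decode("ascii"))
```

The resulting bytes could be written to a TCP socket connected to the server; the sketch stops short of that so it runs without network access.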

HTTP Response Example
HTTP/1.1 200 OK
Date: Wed, 02 Feb 2011 04:01:21 GMT
Server: Apache/2.2.3 (CentOS)
X-Pingback: http://sns.cs.princeton.edu/xmlrpc.php
Last-Modified: Wed, 01 Feb 2011 12:41:51 GMT
ETag: "7a11f-10ed-3a75ae4a"
Accept-Ranges: bytes
Content-Length: 4333
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US">
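
A response has the same shape as a request: a status line, header lines, a blank line, then the body. The following sketch splits those parts apart. It is a simplified illustration (it ignores repeated headers and folded header lines), and `parse_response_head` and the sample bytes are made up for the example.

```python
def parse_response_head(raw: bytes) -> tuple[int, str, dict[str, str], bytes]:
    """Return (status code, reason phrase, headers, body bytes received so far)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)  # version, code, optional reason
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return code, reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 5\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
    b"hello"
)
status, reason, headers, body = parse_response_head(sample)
print(status, reason, headers["content-length"], body)
```

Lower-casing the header names is a design choice for easy lookup; it works because HTTP header names are case-insensitive.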

How to Mark End of Message?
• Close connection
  – Only server can do this
• Content-Length
  – Must know size of transfer in advance
• Implied length
  – E.g., 304 (NOT MODIFIED) responses never have body content
• Transfer-Encoding: chunked (HTTP/1.1)
  – After headers, each chunk is content length in hex, CRLF, then body. Final chunk is length 0.

Example: Chunked Encoding
HTTP/1.1 200 OK <CRLF>
Transfer-Encoding: chunked <CRLF>
<CRLF>
25 <CRLF>
This is the data in the first chunk <CRLF>
1A <CRLF>
and this is the second one <CRLF>
0 <CRLF>
• Especially useful for dynamically-generated content, as length is not known a priori
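
The chunk framing can be sketched as an encoder and decoder pair. This is a minimal illustration, assuming the whole message body is already in memory and ignoring trailers; `encode_chunked` and `decode_chunked` are hypothetical names. Note that hex 25 is 37 bytes: the 35 characters of the first line plus a CRLF that belongs to the data itself.

```python
def encode_chunked(chunks: list[bytes]) -> bytes:
    """Frame each chunk as: size in hex, CRLF, data, CRLF; end with a 0-size chunk."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk is reserved for the terminator
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n\r\n"
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from a complete chunked message body."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";", 1)[0]  # drop chunk extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # final chunk
        body += data[pos:pos + size]
        pos += size + 2  # skip the data and the CRLF that follows it
    return bytes(body)


chunks = [b"This is the data in the first chunk\r\n", b"and this is the second one"]
wire = encode_chunked(chunks)
print(wire)
```

The sender never needs the total length, only the length of the chunk it is about to send, which is why this suits content generated on the fly.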

Single Transfer Example
[Figure: client/server timeline, built up over three slides. 0 RTT: client opens TCP connection (SYN exchange, ACK). 1 RTT: client sends HTTP request for HTML; server reads from disk and returns data (DAT), which the client ACKs; the connection is closed (FIN, ACK). 2 RTT: client parses HTML and opens a new TCP connection. 3 RTT: client sends HTTP request for image; server reads from disk. 4 RTT: image begins to arrive.]

Problems with simple model
• Multiple connection setups
  – Three-way handshake each time (TCP “synchronizing” stream)
• Short transfers are hard on stream protocol (TCP)
  – How much data should it send at once?
  – Congestion avoidance: Takes a while to “ramp up” to high sending rate (TCP “slow start”)
  – Loss recovery is poor when not “ramped up”
• Lots of extra connections
  – Increases server state/processing
  – Server forced to keep connection state

Persistent Connection Example
[Figure: client/server timeline over a connection that stays open. 0 RTT: client sends HTTP request for HTML; server reads from disk and returns data (DAT). 1 RTT: client parses HTML and sends HTTP request for image; server reads from disk. 2 RTT: image begins to arrive.]

Persistent HTTP
Non-persistent HTTP issues:
• Requires 2 RTTs per object
• OS must allocate resources for each TCP connection
• But browsers often open parallel TCP connections to fetch referenced objects
Persistent HTTP:
• Server leaves connection open after sending response
• Subsequent HTTP messages between same client/server are sent over connection
Persistent without pipelining:
• Client issues new request only when previous response has been received
• One RTT for each object
Persistent with pipelining:
• Default in HTTP/1.1 spec
• Client sends requests as soon as it encounters referenced object
• As little as one RTT for all the referenced objects
• Server must handle responses in same order as requests
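
The per-object costs above can be turned into a back-of-the-envelope comparison. The sketch below is a simplification under stated assumptions: a page is one HTML file plus some referenced objects, fetches over non-persistent connections happen one after another, a TCP handshake costs one RTT, and transmission time, slow start, and server processing are ignored. The function names are illustrative.

```python
def rtts_nonpersistent(referenced: int) -> int:
    """New connection per object: handshake + request/response, 2 RTTs each."""
    return 2 * (1 + referenced)


def rtts_persistent(referenced: int) -> int:
    """One handshake, then one RTT per object (HTML plus referenced objects)."""
    return 1 + (1 + referenced)


def rtts_pipelined(referenced: int) -> int:
    """One handshake, one RTT for the HTML, one RTT for all referenced objects."""
    return 1 + 1 + (1 if referenced > 0 else 0)


for name, model in [
    ("non-persistent", rtts_nonpersistent),
    ("persistent", rtts_persistent),
    ("pipelined", rtts_pipelined),
]:
    print(name, model(10))
```

For a page with 10 referenced objects this model gives 22, 12, and 3 RTTs respectively. Parallel connections, which browsers often use, would lower the non-persistent figure and are not modeled here.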

“Persistent without pipelining” most common
• When does pipelining work best?
  – Small objects, equal time to serve each object
  – Small because pipelining simply removes additional 1 RTT delay to request new content
• Alternative design?
  – Multiple parallel connections (~2-4). Easier server parallelism
  – No “head-of-line blocking” problem like pipelining
• Dynamic content makes HOL blocking possibility worse
• In practice, many servers don’t support, and many browsers do not default to pipelining
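
Head-of-line blocking follows from the in-order rule: a pipelined response cannot be sent before every earlier response has been sent. The toy model below illustrates this, assuming all requests arrive at time 0, the server works on them concurrently, and network delay is ignored; the parallel case assumes one connection per request. Function names and service times are invented for the example.

```python
from itertools import accumulate


def pipelined_delivery(service_times: list[float]) -> list[float]:
    """In-order responses: each one waits for the slowest request ahead of it."""
    return list(accumulate(service_times, max))


def parallel_delivery(service_times: list[float]) -> list[float]:
    """Independent connections: each response leaves when it is ready."""
    return list(service_times)


# Assumption: one slow dynamic page (5 s) followed by two small static objects.
times = [5.0, 0.1, 0.1]
print(pipelined_delivery(times))
print(parallel_delivery(times))
```

In the pipelined case both small objects are held until the 5-second response has gone out, which is why slow dynamic content makes the problem worse.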

HTTP Caching
• Clients often cache documents
  – When should origin be checked for changes?
  – Every time? Every session? Date?
• HTTP includes caching information in headers
  – HTTP/1.0 used: “Expires: <date>”; “Pragma: no-cache”
  – HTTP/1.1 has “Cache-Control”
    • “no-cache”, “private”, “max-age=<seconds>”
  – HTTP/1.1 also has “ETag: <opaque value>”
• If not expired, use cached copy
• If expired, use conditional GET request to origin
  – “If-Modified-Since: <date>”, “If-None-Match: <etag>”
  – 304 (“Not Modified”) or 200 (“OK”) response
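
The expire-then-revalidate logic can be sketched as follows. This is a simplified model, assuming freshness is governed only by a max-age value and that time is passed in explicitly; `CacheEntry` and the helper functions are hypothetical names, and a real cache would also pick up new validators and lifetimes from a 200 response.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CacheEntry:
    body: bytes
    stored_at: float                    # when stored or last validated (seconds)
    max_age: float                      # freshness lifetime in seconds
    etag: Optional[str] = None
    last_modified: Optional[str] = None


def is_fresh(entry: CacheEntry, now: float) -> bool:
    """Not expired: the cached copy can be used without contacting the origin."""
    return now - entry.stored_at < entry.max_age


def conditional_headers(entry: CacheEntry) -> dict[str, str]:
    """Headers that turn a plain GET into a conditional GET."""
    headers = {}
    if entry.etag is not None:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified is not None:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def after_validation(entry: CacheEntry, status: int, body: bytes, now: float) -> CacheEntry:
    """304 keeps the cached body; 200 replaces it. Either way the copy is fresh again."""
    if status == 304:
        return replace(entry, stored_at=now)
    if status == 200:
        return replace(entry, body=body, stored_at=now)
    raise ValueError(f"unhandled status {status}")


entry = CacheEntry(
    body=b"old page",
    stored_at=0.0,
    max_age=60.0,
    etag='"7a11f-10ed-3a75ae4a"',
    last_modified="Tue, 1 Feb 2011 17:54:18 GMT",
)
print(is_fresh(entry, now=30.0), conditional_headers(entry))
```

A 304 costs a round trip but no body transfer, which is the saving a conditional GET is after.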

HTTP Conditional Request
GET / HTTP/1.1
Host: sns.cs.princeton.edu
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.13)
Connection: Keep-Alive
If-Modified-Since: Tue, 1 Feb 2011 17:54:18 GMT
If-None-Match: "7a11f-10ed-3a75ae4a"

HTTP/1.1 304 Not Modified
Date: Wed, 02 Feb 2011 04:01:21 GMT
Server: Apache/2.2.3 (CentOS)
ETag: "7a11f-10ed-3a75ae4a"
Accept-Ranges: bytes
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive

Web Proxy Caches
• User configures browser: Web accesses via cache
• Browser sends all HTTP requests to cache
  – Object in cache: cache returns object
  – Else: cache requests object from origin, then returns to client
[Figure: clients connect to a proxy server, which connects to the origin servers.]
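
The hit-or-fetch rule of a proxy cache fits in a small class. The sketch below is a minimal illustration, assuming objects never expire and the cache has unlimited space; `ProxyCache` and the fake origin function are invented for the example, and the origin fetch is passed in as a callable so the code runs without a network.

```python
from typing import Callable


class ProxyCache:
    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        self._fetch = fetch_from_origin
        self._store: dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        """Return the object from the cache, fetching from the origin on a miss."""
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body  # keep a copy for later requests
        return body


origin_requests = []


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server; records each request it receives."""
    origin_requests.append(url)
    return b"content of " + url.encode("ascii")


cache = ProxyCache(fake_origin)
first = cache.get("http://example.com/a")
second = cache.get("http://example.com/a")
print(cache.hits, cache.misses, len(origin_requests))
```

The second request is served from the cache, so the origin sees only one request.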

When a single cache isn’t enough
• What if the working set is > proxy disk?
  – Cooperation!
• A static hierarchy
  – Check local
  – If miss, check siblings
  – If miss, fetch through parent
• Internet Cache Protocol (ICP)
  – ICPv2 in RFC 2186 (& 2187)
  – UDP-based, short timeout
[Figure: cache hierarchy with a parent web cache connected to the public Internet.]
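
The three-step lookup order of the static hierarchy can be sketched with dictionaries standing in for caches. This is a toy model: in a real deployment the sibling check would be a network query (the slide mentions ICP over UDP with a short timeout), and `hierarchical_lookup` is a hypothetical name.

```python
from typing import Callable


def hierarchical_lookup(
    url: str,
    local: dict[str, bytes],
    siblings: list[dict[str, bytes]],
    fetch_via_parent: Callable[[str], bytes],
) -> tuple[bytes, str]:
    """Check local, then siblings, then fetch through the parent. Returns (body, source)."""
    if url in local:
        return local[url], "local"
    for sibling in siblings:
        if url in sibling:
            local[url] = sibling[url]  # keep a local copy for next time
            return local[url], "sibling"
    body = fetch_via_parent(url)
    local[url] = body
    return body, "parent"


local_cache = {"/a": b"A"}
sibling_caches = [{"/b": b"B"}]
results = [
    hierarchical_lookup(u, local_cache, sibling_caches, lambda url: b"from parent")
    for u in ["/a", "/b", "/c", "/c"]
]
print([source for _, source in results])
```

The second request for "/c" is a local hit because the first one left a copy behind.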

Web traffic has cacheable workload
“Zipf” or “power-law” distribution
[Figure from: Characteristics of WWW Client-based Traces, Carlos R. Cunha, Azer Bestavros, Mark E. Crovella, BU-CS-95-01]
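
To see why a Zipf-like workload is cacheable, the sketch below estimates the share of requests that go to the most popular objects. It assumes an idealized model not taken from the cited trace study: the object of popularity rank r is requested with probability proportional to 1/r^alpha, requests are independent, and the cache holds exactly the top-ranked objects.

```python
def zipf_hit_fraction(cache_size: int, num_objects: int, alpha: float = 1.0) -> float:
    """Fraction of requests that hit a cache holding the cache_size most popular objects."""
    if not 0 <= cache_size <= num_objects:
        raise ValueError("cache_size must be between 0 and num_objects")
    weights = [1 / rank ** alpha for rank in range(1, num_objects + 1)]
    return sum(weights[:cache_size]) / sum(weights)


# Assumption: 10,000 distinct objects, cache holds the 100 most popular (1%).
fraction = zipf_hit_fraction(100, 10_000)
print(round(fraction, 2))
```

Under these assumptions, caching 1% of the objects serves a bit over half of the requests, which is the intuition behind the slide's claim.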

Content Distribution Networks (CDNs)
• Content providers are CDN customers
Content replication
• CDN company installs thousands of servers throughout Internet
  – In large datacenters
  – Or, close to users
• CDN replicates customers’ content
• When provider updates content, CDN updates servers
[Figure: an origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Europe, and Asia.]

Content Distribution Networks & Server Selection
• Replicate content on many servers
• Challenges
  – How to replicate content
  – Where to replicate content
  – How to find replicated content
  – How to choose among known replicas
  – How to direct clients towards replica

Server Selection
• Which server?
  – Lowest load: to balance load on servers
  – Best performance: to improve client performance
    • Based on Geography? RTT? Throughput? Load?
  – Any alive node: to provide fault tolerance
• How to direct clients to a particular server?
  – As part of routing: anycast, cluster load balancing
  – As part of application: HTTP redirect
  – As part of naming: DNS
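
The three selection goals can be expressed as policies over a list of replicas. The sketch below is illustrative only: the `Replica` fields, policy names, and sample values are assumptions, and it presumes the selector already has liveness, load, and RTT measurements for each replica.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    name: str
    alive: bool
    load: float    # fraction of capacity in use
    rtt_ms: float  # measured round-trip time to the client


def pick_replica(replicas: list[Replica], policy: str) -> Replica:
    """Choose among live replicas according to the named policy."""
    candidates = [r for r in replicas if r.alive]
    if not candidates:
        raise LookupError("no replica is alive")
    if policy == "lowest_load":
        return min(candidates, key=lambda r: r.load)
    if policy == "lowest_rtt":
        return min(candidates, key=lambda r: r.rtt_ms)
    if policy == "any_alive":
        return candidates[0]
    raise ValueError(f"unknown policy: {policy}")


replicas = [
    Replica("europe", alive=False, load=0.1, rtt_ms=20.0),
    Replica("asia", alive=True, load=0.2, rtt_ms=180.0),
    Replica("s-america", alive=True, load=0.7, rtt_ms=60.0),
]
print(pick_replica(replicas, "lowest_load").name)
print(pick_replica(replicas, "lowest_rtt").name)
```

In this example the two goals disagree: the least-loaded live replica is "asia" while the closest one is "s-america", which is the tension the slide's questions point at.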

How Akamai Works
[Figure, built up over seven slides. Entities: end-user, cnn.com (content provider), DNS root server, Akamai global DNS server, Akamai regional DNS server, nearby Akamai cluster, other Akamai clusters. Labeled steps:
  1-2: HTTP "GET index.html"; the page references http://cache.cnn.com/cnn.com/foo.jpg
  3-4: DNS lookup cache.cnn.com; answer is ALIAS g.akamai.net
  5-6: DNS lookup g.akamai.net; answer is ALIAS a73.g.akamai.net
  7-8: DNS lookup a73.g.akamai.net; answer is address 1.2.3.4
  9: HTTP "GET /foo.jpg, Host: cache.cnn.com"
  11-12: HTTP "GET foo.jpg"
  10: last step, shown without a label]
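
The name-based redirection in the figure amounts to following a chain of aliases until an address comes back. The sketch below replays that chain with a lookup table. It is a toy model: the table holds only the names and the address shown on the slides, all records live in one dictionary rather than on separate DNS servers, and `resolve` is a hypothetical helper.

```python
# Records taken from the figure's labels; "ALIAS" and "ADDRESS" are the slide's terms.
RECORDS = {
    "cache.cnn.com": ("ALIAS", "g.akamai.net"),
    "g.akamai.net": ("ALIAS", "a73.g.akamai.net"),
    "a73.g.akamai.net": ("ADDRESS", "1.2.3.4"),
}


def resolve(name: str, records: dict[str, tuple[str, str]], max_hops: int = 8) -> tuple[str, list[str]]:
    """Follow aliases until an address is found. Returns (address, names visited)."""
    chain = [name]
    for _ in range(max_hops):
        kind, value = records[name]
        if kind == "ADDRESS":
            return value, chain
        name = value  # alias: look up the new name next
        chain.append(name)
    raise RuntimeError("alias chain too long")


address, chain = resolve("cache.cnn.com", RECORDS)
print(" -> ".join(chain), "->", address)
```

Because the final answer comes from the CDN's own name servers, the CDN can return a different address to different clients, which is how naming doubles as server selection.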

Summary
• HTTP: Simple text-based file exchange protocol
  – Support for status/error responses, authentication, client-side state maintenance, cache maintenance
• Interactions with TCP
  – Connection setup, reliability, state maintenance
  – Persistent connections
• How to improve performance
  – Persistent and pipelined connections
  – Caching
  – Replication: Web proxies, cooperative proxies, and CDNs