csx760 computer networks1 http. csx760 computer networks2 the web: some jargon r web page: m...
TRANSCRIPT
CSx760 Computer Networks 1
HTTP
CSx760 Computer Networks 2
The Web: Some Jargon
Web page: consists of “objects” addressed by a URL
Most Web pages consist of: base HTML page, and several referenced
objects. URL has three
components: host name, port number and path name:
User agent for Web is called a browser: MS Internet Explorer Netscape Navigator
Server for Web is called Web server: Apache (public domain) MS Internet Information
Server
snowball.cs.uga.edu:80/~cs6760/index.html
CSx760 Computer Networks 3
The Web: the HTTP Protocol
HTTP: hypertext transfer protocol
Web’s application layer protocol
client/server model client: browser that
requests, receives, “displays” Web objects
server: Web server sends objects in response to requests
http1.0: RFC 1945 http1.1: RFC 2068
Windows runningExplorer
Server running
Apache Webserver
Linux runningNavigator
http request
http re
quest
http response
http re
sponse
CSx760 Computer Networks 4
The HTTP Protocol: Message Flow
HTTP: TCP transport service:
client initiates TCP connection (creates socket) to server, port 80
server accepts TCP connection from client
http messages (application-layer protocol messages) exchanged between browser (http client) and Web server (http server)
TCP connection closed
http is “stateless” server maintains no
information about past client requests
Protocols that maintain “state” are complex!
past history (state) must be maintained
if server/client crashes, their views of “state” may be inconsistent, must be reconciled
When is “state” desired?
aside
CSx760 Computer Networks 5
HTTP Example
Suppose user enters URL www.cs.uga.edu/index.html
1a. http client initiates TCP connection to http server (process) at www.cs.uga.edu. Port 80 is default for http server.
2. http client sends http request message (containing URL) into TCP connection socket
1b. http server at host www.cs.uga.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
3. http server receives request message, forms response message containing requested object (index.html), sends message into socket (the sending speed increases slowly, which is called slow-start)
time
CSx760 Computer Networks 6
HTTP Example (cont.)
5. http client receives response message containing html file, parses html file, finds embedded image
6. Steps 1-5 repeated for each of the embedded images
4. http server closes TCP connection.
time
CSx760 Computer Networks 7
HTTP Request Message: General Format
CSx760 Computer Networks 8
The HTTP Protocol: Message Format
Two types of HTTP messages: request, response HTTP request message:
ASCII (human-readable format)
GET /~cs6760/index.html HTTP/1.1Host: snowball.cs.uga.eduConnection: close User-agent: Mozilla/4.0 Accept: text/html, image/gif, image/jpeg Accept-language: en
(extra carriage return, line feed)
request line(GET, POST,
HEAD commands)
header lines
Carriage return, line feed
indicates end of message
CSx760 Computer Networks 9
HTTP Message Format: Response
HTTP/1.1 200 OK Date: Sun, 06 Aug 2002 12:00:15 GMT Server: Apache/1.3.0 (Unix) Last-Modified: Wed, 2 Oct 2002 …... Content-Length: 6821 Content-Type: text/html data data data data data ...
status line(protocol
status codestatus phrase)
header lines
data, e.g., requestedhtml file
CSx760 Computer Networks 10
HTTP Response Status Codes
200 OK request succeeded, requested object later in this message
301 Moved Permanently requested object moved, new location specified later in this
message (Location:)
400 Bad Request request message not understood by server
404 Not Found requested document not found on this server
505 HTTP Version Not Supported
In first line in server->client response message.A few sample codes:
CSx760 Computer Networks 11
More about Message Format
Why encode HTTP message in ASCII, why not use binary?
Loss a few encoding bytes on the wire, and in RAM …
But human-readable format is very important for high level network protocols typing interactively to the system or sending scripts
directly into the system is.
CSx760 Computer Networks 12
Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
Opens TCP connection to port 80(default http server port) at www.cs.uga.edu.Anything typed in sent to port 80 at www.cs.uga.edu
telnet www.cs.uga.edu 80
2. Type in a GET http request:
GET /index.html HTTP/1.0 By typing this in (hit carriagereturn twice), you sendthis minimal (but complete) GET request to http server
3. Look at response message sent by http server!
CSx760 Computer Networks 13
Response time modeling
Definition of RTT: time to send a small packet to travel from client to server and back.
Web Response time: one RTT to initiate TCP
connection one RTT for HTTP request
and first few bytes of HTTP response to return
file transmission timetotal = 2RTT+transmit time
time to transmit file
initiate TCPconnection
RTT
requestfile
RTT
filereceived
time time
CSx760 Computer Networks 14
Non-Persistent and Persistent Connections
Non-persistent HTTP/1.0 server parses request,
responds, and closes TCP connection
2 RTTs to fetch each object
Each object transfer suffers from slow start
Persistent default for HTTP/1.1 on same TCP connection:
server parses request, responds, parses new request, …
Client sends requests for all referenced objects as soon as it receives base HTML.
Fewer RTTs and less slow start.
But most 1.0 browsers useparallel TCP connections.
CSx760 Computer Networks 15
Pipelining
Persistent without pipelining:
client issues new request only when previous response has been received
one RTT for each referenced object
Persistent with pipelining: default in HTTP/1.1
“It is unfortunately not well supported by many web servers and proxies”, therefore Mozilla (and probably all other browers disable it by default)
client sends requests as soon as it encounters a referenced object
as little as one RTT for all the referenced objects
CSx760 Computer Networks 16
User-Server Interaction: Authentication
Authentication goal: control access to server documents
stateless: client must present authorization in each request
authorization: typically name, password Authorization: header
line in request if no authorization
presented, server refuses access, sendsWWW-authenticate: header line in response
client server
usual http request msg401: authorization req.
WWW-authenticate:
usual http request msg
+ Authorization:lineusual http response
msg
usual http request msg
+ Authorization:lineusual http response
msg
timeBrowser caches name & password sothat user does not have to repeatedly enter it.
CSx760 Computer Networks 17
User-server Interaction: Cookies
server sends “cookie” to client in response msgSet-cookie: 1678453
client presents cookie in later requestsCookie: 1678453
server matches presented-cookie with server-stored info authentication remembering user
preferences, previous choices
client server
usual http request msgusual http response
+Set-cookie: #
usual http request msg
Cookie: #usual http response
msg
usual http request msg
Cookie: #usual http response msg
cookie-spectificaction
cookie-spectificaction
CSx760 Computer Networks 18
HTTPS
Secure Socket Layer (SSL)
Certification Authority (CA)
Symmetric Key Cryptography
Asymmetric Key Cryptography Public Key & Private
Key
What if Sniffing? Replaying?
client serverhttps request msg
(cryptographic preferences)
Cryptographic preference, public key, and CA
certificate
time
If the client has the CA’s public key, then the client
can verify the CA certificate
Client generates a random symmetric key, encrypts it using the server’s public
key
Server extracts the symmetric key and encrypt further communication with
it
CSx760 Computer Networks 19
Certification Example
A digital certificate contains the name of a company, web site or individual, along with a cryptographic key that can be used to encrypt information that must be sent to that individual
A list of CAs is pre-loaded in your browser
CSx760 Computer Networks 20
User-Server Interaction: Conditional GET
Goal: don’t send object if client has up-to-date stored (cached) version
client: specify date of cached copy in http requestIf-modified-since:
<date> server: response contains
no object if cached copy up-to-date: HTTP/1.0 304 Not
Modified
client server
http request msgIf-modified-since:
<date>
http responseHTTP/1.0
304 Not Modified
object not
modified
http request msgIf-modified-since:
<date>
http responseHTTP/1.1 200 OK
…
<data>
object modified
CSx760 Computer Networks 21
Web Caches (Proxy Server)
user sets browser: Web accesses via web cache
client sends all http requests to web cache
if object at web cache, web cache immediately returns object in http response
else requests object from origin server, then returns http response to client
Why do we need proxy?
Goal: satisfy client request without involving origin server
client
Proxyserver
client
http request
http re
quest
http response
http re
sponse
http re
quest
http re
sponse
http requesthttp response
origin server
origin server
CSx760 Computer Networks 22
Why Web Caching?
Assume: cache is “close” to client (e.g., in same network)
smaller response time: cache “closer” to client
decrease traffic to distant servers link out of
institutional/local ISP network often bottleneck
originservers
public Internet
institutionalnetwork 10 Mbps LAN
1.5 Mbps access link
institutionalcache
CSx760 Computer Networks 23
Content distribution networks (CDNs)
The content providers are the CDN customers.
Content replication CDN company installs
hundreds of CDN servers throughout Internet in lower-tier ISPs, close
to users CDN replicates its
customers’ content in CDN servers. When provider updates content, CDN updates servers
origin server in North America
CDN distribution node
CDN serverin S. America CDN server
in Europe
CDN serverin Asia
CSx760 Computer Networks 24
CDN example
origin server www.foo.com distributes HTML Replaces: http://www.foo.com/sports.ruth.gif
with
http://www.cdn.com/www.foo.com/sports/ruth.gif
HTTP request for
www.foo.com/sports/sports.html
DNS query for www.cdn.com
HTTP request for
www.cdn.com/www.foo.com/sports/ruth.gif
1
2
3
Origin server
CDNs authoritative DNS server
NearbyCDN server
CDN company cdn.com distributes gif files uses its authoritative
DNS server to route redirect requests
CSx760 Computer Networks 25
Content Distribution Networks
BBB.COM
client server surrogate
B
B
B
B
B
B
cache
A
A
A
A
A
A
AAA.COM
C
C
C
C
C
C
CCC.COM
redirector
CSx760 Computer Networks 26
Partial Replication on CDN
BBB.COM
client server surrogate
B
B
B
B
A
A
A
A
AAA.COM
C
C
C
B B
A
A
C
C
C
CCC.COM
redirector
CSx760 Computer Networks 27
Partial Replication on CDN
BBB.COM
client server surrogate
B
B
B
B
A
A
A
A
AAA.COM
C
C
C
C
CCC.COM
redirector
CSx760 Computer Networks 28
Factors Affecting Redirection
Goals Increase system throughput under load Reduce response latency perceived by clients
Server load Pick least loaded server
Network proximity Pick closest server
Cache locality Pick server just served the object
What’s the tradeoff across different loads?
Often conflict
CSx760 Computer Networks 29
More about CDNs
routing requests CDN creates a “map”,
indicating distances from leaf ISPs and CDN nodes
when query arrives at authoritative DNS server: server determines ISP
from which query originates
uses “map” to determine best CDN server
not just Web pages streaming stored
audio/video streaming real-time
audio/video CDN nodes create
application-layer overlay network
CSx760 Computer Networks 30
Summary
HTTP Stateless application layer protocols HTTP/1.1 Persistent connection & pipeline Authentication, Cookies, and HTTPS HTTP is not limited to web server/browser Proxy and CDN for performance improvement