HTTP
Hypertext Transfer Protocol
used in the WWW: the protocol for communication between web
browsers and web servers
client-server paradigm
TCP port 80
RFC 1945
Ing. Pierluigi Gallo
Introduction to HTTP
80% of Internet flows are HTTP connections
Early protocol is HTTP 0.9: read only
Today we use HTTP 1.0: read, input, delete, ...
New version: HTTP 1.1, with performance optimizations
HTTP Overview
Client (browser) sends HTTP request to server
Request specifies affected URL
Request specifies desired operation
Server performs operation on URL
Server sends response
Request and reply headers are in pure text
Static Content and HTML
Most static web content is written in HTML
HTML allows:
text formatting commands
embedded objects
links to other objects
Server need not understand or interpret HTML
URI, URN, URL
RFC 3305
Uniform Resource Identifier (URI): identifies a resource
Uniform Resource Name (URN): the name of the resource within a namespace, like a person’s name
Uniform Resource Locator (URL): a URI that says how to find the resource, like a person’s street address
URI concept is more general than its use in web pages (XML, …)
HTTP - URLs
URL Uniform Resource Locator
protocol (http, ftp, news)
host name (name.domain name)
port (80, 8080, …)
directory path to the resource
resource name
absolute
relative
http://www.tti.unipa.it/~pg/pg/Teaching.html
http://xxx.myplace.com:80/cgi-bin/t.exe
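The URL components listed above can be pulled apart with Python's standard `urllib.parse` module. This is a minimal sketch; the helper name `describe_url` and the dictionary keys are illustrative choices, not part of the slides.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the components named on the slide."""
    parts = urlsplit(url)
    # Everything before the last "/" is the directory path,
    # everything after it is the resource name.
    directory, _, resource = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,  # None when the URL does not state a port
        "directory": directory + "/",
        "resource": resource,
    }


info = describe_url("http://xxx.myplace.com:80/cgi-bin/t.exe")
print(info)
```

When the port is omitted from an http URL, `port` is `None` and the client falls back to TCP port 80.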
URI examples
http://example.org/absolute/URI/with/absolute/path/to/resource.txt
ftp://example.org/resource.txt
urn:issn:1535-3613
/relative/URI/with/absolute/path/to/resource.txt
relative/path/to/resource.txt
../../../resource.txt
./resource.txt#frag01
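A relative URI only makes sense against a base URI. The sketch below resolves the relative examples above with `urllib.parse.urljoin`, assuming (for illustration only) that the first absolute URI in the list is the base.

```python
from urllib.parse import urljoin

# Assumed base: the first absolute URI from the examples.
BASE = "http://example.org/absolute/URI/with/absolute/path/to/resource.txt"

RELATIVE_EXAMPLES = [
    "/relative/URI/with/absolute/path/to/resource.txt",
    "relative/path/to/resource.txt",
    "../../../resource.txt",
    "./resource.txt#frag01",
]

# Map each relative reference to the absolute URI it denotes.
resolved = {ref: urljoin(BASE, ref) for ref in RELATIVE_EXAMPLES}

for ref, absolute in resolved.items():
    print(f"{ref:50} -> {absolute}")
```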
HTTP - methods
Methods
GET
retrieve a URL from the server
simple page request
depending on the requested page:
run a CGI program
run a CGI with arguments attached to the URL
POST
preferred method for forms processing
run a CGI program
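parameterized data passed to the program on standard input (stdin)
more private than GET: data travels in the request body, not in the URL (it is still unencrypted without SSL)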
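The difference between the two methods shows up in where the form parameters travel. The sketch below builds both requests as plain text; the function names, the parameter `name=pg`, and the reuse of the `/cgi-bin/t.exe` path from the earlier URL slide are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_get(path: str, params: dict, host: str) -> str:
    """GET: the arguments are attached to the URL as a query string."""
    query = urlencode(params)
    return f"GET {path}?{query} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def build_post(path: str, params: dict, host: str) -> str:
    """POST: the arguments travel in the body, after the blank line."""
    body = urlencode(params)  # ASCII only, so len() equals the byte count
    return (
        f"POST {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


print(build_get("/cgi-bin/t.exe", {"name": "pg"}, "xxx.myplace.com"))
print(build_post("/cgi-bin/t.exe", {"name": "pg"}, "xxx.myplace.com"))
```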
HTTP - methods
Methods (cont.)
PUT
used to transfer a file from the client to the server
HEAD
requests only the status line and headers for a URL
used for conditional URL handling in performance enhancement schemes
retrieve a URL only if it is not in the local cache or its date is more recent than the cached copy
DELETE
deletes page from server
request-response approach
client-server network protocol
in use by the World-Wide Web since 1990
request-response: clients send HTTP request messages for HTML pages, images,
scripts and style sheets. Web servers handle these requests by returning response
messages that contain the requested resource.
Example of an HTTP Exchange
[Figure: the client sends GET www.cs.virginia.edu to the server; the server retrieves the data from disk]
Fetching Multiple Objects
Most web-pages contain embedded objects (e.g., images, backgrounds, etc)
Browser requests HTML page
Server sends HTML file
Browser parses file and requests embedded objects
Server sends requested objects
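The "browser parses file" step can be sketched with the standard `html.parser` module. This is a simplified illustration that only looks at `<img src=...>` tags; real browsers also fetch style sheets, scripts and other objects. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class EmbeddedObjectFinder(HTMLParser):
    """Collect the URLs of embedded images found in an HTML page."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.objects.append(value)


def embedded_objects(html_text: str) -> list:
    """Return the URLs the browser would request after parsing the page."""
    finder = EmbeddedObjectFinder()
    finder.feed(html_text)
    return finder.objects


page = '<html><body><h1>Hello</h1><img src="image.gif"></body></html>'
print(embedded_objects(page))
```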
Fetching Embedded Objects
[Figure: the client sends GET www.cs.virginia.edu and the server retrieves the data from disk; the client then sends GET image.gif and the server retrieves the image from disk]
HTTP Request Packets
Sent from client to server
Consists of an HTTP header (hidden in the browser environment)
contains:
content type / MIME type
content length
user agent - browser issuing request
content types user agent can handle
and a URL
GET /simtec/httpgallery/introduction/ HTTP/1.1
Accept: */*
Accept-Language: en-gb
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0)
Host: www.httpwatch.com
Connection: Keep-Alive
HTTP 1.1 is the latest version
HTTP Request Headers
Follow the request line and precede the request body
headers are terminated by a blank line
Header Fields:
From
Accept
Accept-Encoding
Accept-Language
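Because the headers are plain text terminated by a blank line, a request can be split with ordinary string operations. The sketch below parses the request from the HTTP Request Packets slide; `parse_request` is a hypothetical helper, and lower-casing the header names is a convenience choice, not something HTTP requires.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into request line, headers and body."""
    # The first blank line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


RAW_REQUEST = (
    "GET /simtec/httpgallery/introduction/ HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Accept-Language: en-gb\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0)\r\n"
    "Host: www.httpwatch.com\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)

method, target, version, headers, body = parse_request(RAW_REQUEST)
print(method, target, version)
print(headers)
```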
HTTP 1.1 vs 1.0
Additional Methods (PUT, DELETE, TRACE, CONNECT + GET, HEAD, POST)
Additional Headers
Transfer Coding (chunked encoding)
Persistent Connections (content-length matters)
Request Pipelining
HTTP 1.0
Client opens a separate TCP connection for each requested object
Object is served and connection is closed
Advantages: maximum concurrency
Limitations:
TCP connection setup/tear-down overhead
TCP slow start overhead
HTTP 1.0
Client / Server exchange (one TCP connection per object):
connect(): SYN; SYN, ACK
ACK, GET www.cs.virginia.edu
Server: Retrieve Data From Disk, write()
close()
connect(): SYN; SYN, ACK
ACK, GET image.gif
Server: Retrieve Image From Disk, write()
close()
HTTP 1.1
To avoid a connection per object model, HTTP 1.1 supports persistent connections
Client opens TCP connection to server
All requests use same connection
Problems:
Less concurrency
Server does not know when to close idle connections
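A persistent connection can be observed locally with the standard library alone. The sketch below starts a throwaway HTTP/1.1 server on the loopback interface and fetches several objects over one `http.client.HTTPConnection`. The handler, the object paths and the `connections` bookkeeping are illustrative assumptions; no real site is contacted.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the stdlib server keep connections open
    connections = set()            # distinct client (ip, port) pairs seen so far

    def do_GET(self):
        Handler.connections.add(self.client_address)
        body = f"object {self.path}".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_all(paths):
    """Fetch every path over a single TCP connection to a local server."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the next request is sent.
            results.append((response.status, response.read().decode()))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results
```

Calling `fetch_all(["/index.html", "/image.gif"])` serves both objects while the server sees a single client address, i.e. one connection.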
HTTP 1.1
Client / Server exchange (one persistent TCP connection):
connect(): SYN; SYN, ACK
ACK, GET www.cs.virginia.edu
Server: Retrieve Data From Disk, write()
GET image.gif
Server: Retrieve Image From Disk, write()
close()
Server Side Close()
Client / Server exchange (the server closes the idle connection):
connect(): SYN; SYN, ACK
ACK, GET www.cs.virginia.edu
Server: Retrieve Data From Disk, write(), set timeout
GET image.gif
Server: Retrieve Image From Disk, write(), reset timeout
Timeout! close()
CGI Scripts
Common Gateway Interface
web server software can delegate the generation of web pages to a stand-alone application, an executable file.
CGI scripts are typically requested through URLs with a .cgi extension
The script is a program (written in, e.g., C, Java, …)
When the URL is requested, server invokes the named script, passing to it client info
Script outputs HTML page to standard output (redirected to server)
Server sends page to client
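A CGI program can be very small: it reads the client info the server passes in, then prints a header, a blank line and the page to standard output. The sketch below assumes the argument arrives in the standard `QUERY_STRING` environment variable; the parameter `name` and the greeting are made up for illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def render_page(environ) -> str:
    """Build the CGI output: header, blank line, then the HTML page."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it inside HTML.
    page = f"<html><body>Hello, {html.escape(name)}</body></html>"
    return "Content-Type: text/html\r\n\r\n" + page


if __name__ == "__main__":
    # The web server captures standard output and sends it to the client.
    sys.stdout.write(render_page(os.environ))
```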
If-Modified-Since:
Used with GET to make a conditional GET
if the requested document has not been modified since the specified date, a 304 Not Modified status is sent back to the client instead of the document; the client can then display its cached version
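The server-side decision for a conditional GET can be sketched as a date comparison. The function below is a simplified illustration: the name `conditional_get` and its return shape (status code plus an optional Last-Modified value) are assumptions, and the dates must be timezone-aware UTC values.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def conditional_get(
    last_modified: datetime, if_modified_since: Optional[str]
) -> Tuple[int, Optional[str]]:
    """Return (304, None) if the client's copy is still fresh,
    otherwise (200, <Last-Modified header value>)."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304, None
    return 200, format_datetime(last_modified, usegmt=True)


document_date = datetime(1994, 10, 29, 19, 43, 31, tzinfo=timezone.utc)
print(conditional_get(document_date, "Sat, 29 Oct 1994 19:43:31 GMT"))
print(conditional_get(document_date, "Fri, 28 Oct 1994 19:43:31 GMT"))
```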
Status Header
“HTTP/1.0 sp code”
Codes: 1xx - reserved for future use
2xx - successful, understood and accepted
3xx - further action needed to complete
4xx - bad syntax in client request
5xx - server can’t fulfill good request
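The first digit of the code selects the class, so a client can classify a response without knowing every individual code. A minimal sketch follows; the helper names are illustrative and the descriptions are copied from the list above.

```python
STATUS_CLASSES = {
    1: "reserved for future use",
    2: "successful, understood and accepted",
    3: "further action needed to complete",
    4: "bad syntax in client request",
    5: "server can't fulfill good request",
}


def parse_status_line(line: str):
    """Split 'HTTP/1.0 200 OK' into (version, code, reason phrase)."""
    parts = line.strip().split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


def status_class(code: int) -> str:
    """Map a status code to its class using the hundreds digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


version, code, reason = parse_status_line("HTTP/1.0 404 Not Found")
print(code, reason, "->", status_class(code))
```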
HTTP Response Headers
Sent by server to client browser
Status Header
Entities
Content-Encoding:
Content-Length:
Content-Type:
Expires:
Last-Modified:
extension-header
Body – content (usually html)
Status Codes
200 OK
201 created
202 accepted
204 no content
301 moved perm.
302 moved temp
304 not modified
400 bad request
401 unauthorized
403 forbidden
404 not found
500 int. server error
501 not impl.
502 bad gateway
503 svc not avail
Statelessness
Because of the Connect, Request, Response, Disconnect nature of HTTP, it is said to be a stateless protocol: from one web page to the next, there is nothing in the
protocol that allows a web program to maintain program “state” (as a desktop program does).
“state” can be maintained by “witchery” or “trickery” if it is needed
Maintaining program “state”
Hidden variables (<input type=hidden>)
Sessions: special header tags interpreted by the server
Used by ASP, PHP, JSP
Implemented at the language API level
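One common way to carry a session in headers is a cookie holding a session identifier, with the actual state kept on the server. The sketch below assumes that mechanism; the cookie name `sid`, the in-memory `sessions` dictionary and both helper functions are hypothetical, and frameworks such as PHP or JSP hide equivalent logic behind their own APIs.

```python
import secrets
from http.cookies import SimpleCookie

# Server-side store: session id -> program state (in memory, for illustration).
sessions = {}


def start_session(state: dict) -> str:
    """Create a session and return the Set-Cookie header line for the reply."""
    sid = secrets.token_hex(16)
    sessions[sid] = state
    cookie = SimpleCookie()
    cookie["sid"] = sid
    return cookie.output(header="Set-Cookie:")


def load_session(cookie_header: str):
    """Look up the state for the Cookie header sent back by the browser."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sid")
    return sessions.get(morsel.value) if morsel else None


reply_header = start_session({"user": "pg"})
# The browser returns the name=value pair on its next request.
cookie_value = reply_header.split(": ", 1)[1]
print(load_session(cookie_value))
```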
SSL vs S-HTTP
Secure Sockets Layer (SSL) protocol
has become the Internet’s key “secure protocol”
supports security across a variety of Internet transfer protocols (FTP, HTTP, IRC, etc…)
Supports public and private keys.
Introduction to S-HTTP
Secure Hypertext Transfer Protocol (S-HTTP)
- Is a modified version of the Hypertext Transfer Protocol (HTTP).
- Encryption for Web documents.
- Provides the client (browser) the ability to verify a message by using a Message Authentication Code (MAC).
- Primary purpose is to enable commercial transactions within a wide range of applications.
S-HTTP vs HTTP
HTTP message: message header + message body
S-HTTP message: message header + encrypted message body
3 steps: how the server creates an S-HTTP message
Server compares the encryption method lists and selects a method.
Server Encrypt Methods List: PKCS-7, RSA, Diffie-Hellman
Client Encrypt Methods List: PKCS-7, RSA, Diffie-Hellman
Encrypt Method: PKCS-7
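The comparison of the two lists can be sketched as picking the first method both sides support. Treating the server's list order as a preference order is an assumption made here for illustration; the slides do not say how ties are broken, and `select_method` is a hypothetical helper.

```python
from typing import List, Optional


def select_method(server_methods: List[str], client_methods: List[str]) -> Optional[str]:
    """Return the first server method that the client also supports."""
    client_supported = set(client_methods)
    for method in server_methods:
        if method in client_supported:
            return method
    return None  # no common method: the message cannot be protected


server_list = ["PKCS-7", "RSA", "Diffie-Hellman"]
client_list = ["PKCS-7", "RSA", "Diffie-Hellman"]
print(select_method(server_list, client_list))
```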
from the plain text to the encrypted message
[Figure: the plain text "This is my message" is encrypted with PKCS-7 into ciphertext]
Public-Key Cryptography Standards (PKCS)
RFC 2315. Used to sign and/or encrypt messages under a PKI. Used also for certificate dissemination (for instance as a response to a PKCS#10 message). Formed the basis for S/MIME, which is as of 2010 based on RFC 5652, an updated Cryptographic Message Syntax Standard (CMS). Often used for single sign-on.
[Figure: the S-HTTP message is the S-HTTP header followed by the encrypted form of "<html> This is my message </html>"; labels: PKCS-7, Client’s PKCS-7, Session Key]
Cryptographic Algorithm and digital signature modes for S-HTTP
S-HTTP provides message protection in 3 ways:
Digital signature
Message authentication
Message encryption
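Message authentication with a MAC can be illustrated with Python's `hmac` module. This is a generic sketch of the idea, not S-HTTP's own message format or algorithm choice: HMAC-SHA256, the shared key and the helper names are all assumptions.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> str:
    """Compute a MAC over the message with a key shared by both parties."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_mac(key: bytes, message: bytes, mac: str) -> bool:
    """Recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), mac)


shared_key = b"example shared secret"  # illustrative value only
message = b"<html> This is my message </html>"
mac = make_mac(shared_key, message)

print(verify_mac(shared_key, message, mac))           # message untouched
print(verify_mac(shared_key, message + b"!", mac))    # message altered in transit
```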
Further readings
Hypertext Transfer Protocol -- HTTP/1.1 RFC 2616
Hypertext Transfer Protocol -- HTTP/1.0 RFC 1945
Upgrading to TLS Within HTTP/1.1 RFC 2817
HTTP Over TLS RFC 2818
HTTP Authentication: Basic and Digest Access Authentication RFC 2617
HTTP State Management Mechanism (Cookies) RFC 2109
HTTP State Management Mechanism (Cookie2) RFC 2965
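HTTP proxy
[Figure: clients, proxy, servers; each request (Req.) goes from a client to the proxy and from the proxy to a server, and each reply travels back the same way]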
The proxy sits between the client and the server. In the simplest case, instead of sending requests directly to the server, the client sends all its requests to the proxy. The proxy then opens a connection to the server and passes on the client's request. The proxy receives the reply from the server and then sends that reply back to the client. The proxy is acting like:
an HTTP client (to the remote server)
an HTTP server (to the initial client)
how the proxy works
The client sends requests to the proxy.
If the requested document is in its cache, the proxy serves the request from its cache.
Otherwise, the proxy forwards the request to the server.
The server replies to the request through the proxy (the proxy keeps a copy of the requested document).
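The steps above can be sketched as a cache lookup wrapped around a fetch function. This is a minimal illustration: `CachingProxy`, the hit/miss counters and the fake origin server are hypothetical, and a real proxy would also honour expiry and validation headers.

```python
from typing import Callable, Dict


class CachingProxy:
    """Serve documents from a cache, contacting the server only on a miss."""

    def __init__(self, fetch_from_server: Callable[[str], bytes]):
        self._fetch = fetch_from_server
        self._cache: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def handle(self, url: str) -> bytes:
        if url in self._cache:
            self.hits += 1
            return self._cache[url]      # served from the cache
        self.misses += 1
        document = self._fetch(url)      # forward the request to the server
        self._cache[url] = document      # keep a copy for later requests
        return document


server_requests = []


def fake_origin_server(url: str) -> bytes:
    """Stand-in for the remote web server; records every request it receives."""
    server_requests.append(url)
    return f"content of {url}".encode()


proxy = CachingProxy(fake_origin_server)
first = proxy.handle("http://example.org/resource.txt")
second = proxy.handle("http://example.org/resource.txt")
print(proxy.hits, proxy.misses, len(server_requests))
```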
proxy caching
performance improvement
Reduce the user-perceived latency associated with obtaining Web documents.
Lower the network traffic from the Web servers.
Reduce the service demands on content providers.
Proxy and privacy
The proxy can:
inspect the requested URL and selectively block access to certain domains
reformat web pages (for instance, by stripping out images to make a page easier to display on a handheld or other limited-resource client)
perform other transformations and filtering.
Normally, web servers log all incoming requests for resources:
client IP address
the browser or other client program being used (called the User-Agent)
date and time
the requested file.
A proxy hides the client's personally identifiable information:
all requests coming from clients using the same proxy appear to come from the IP address and User-Agent of the proxy itself
Other ideas (taken from papers)
[Figure: clients, web proxy server (cache), firewall, remote web servers; plot of bandwidth over time (0, Day 1, Day 2) against the peak level]
Web Proxy Servers and Off-peak Prefetching
Caching, proxying, filtering
Content delivery networks: Codeen
Content filtering: Dansguardian, squidguard, …
HTTP proxy: squid, tinyproxy, Apache Traffic Server, …
Firewall: iptables, firehol, …
get your hands dirty
/usr/local/etc/dansguardian.conf
/usr/local/etc/dansguardian/lists/bannedsitelist
/usr/local/etc/tinyproxy.con
browser configuration (to use proxy)
run a web server
in our lab we already have a proxy! We need to instruct our testing proxy to contact the ‘official’ one.
http://alien.slackbook.org/dokuwiki/doku.php?id=slackware:parentalcontrol