communicating on the web
DESCRIPTION
HTTP (Hypertext Transfer Protocol) regulates simple conversations between clients and servers, much like placing an order in a restaurant. There are some gotchas, however, such as the server's short-term memory, which requires the client to repeat itself. But don’t despair: HTTP helps reduce confusion with standardized requests and responses. By following these conventions, developers can create amazing things that are not possible with just POST requests and 200 OK responses. In this talk Adrian Cardenas will review examples of clients and servers, as well as the stateless nature of HTTP. He will then go into more detail about requests, discussing request methods and common request headers. Good conversations cannot be one-sided, so he will also cover common response headers as well as useful response status codes.
TRANSCRIPT
About Me
● Developer at ServerGrove
● All around nerd
● Systems Administrator for 7 years
● @aramonc in all the places
CAN’T COMMUNICATE WELL WITHOUT COMMON GROUND
HYPERTEXT TRANSFER PROTOCOL
● Designed side by side with HTML
● Before it came the bulletin boards
● Question & Answer style 2 way communication
● Machine-to-machine (M2M) communication method composed of text documents
THE CLIENT
The client is any application that initiates an HTTP communication
THE SERVER
A server is any application that receives a request and finishes by sending a response
HTTP IS STATELESS
STATELESS IS THE OPPOSITE OF STATEFUL
● Stateless, in this context, means short-term memory: the server does not remember earlier requests
● Stateless communication allows for
  ○ distributed systems
  ○ load balancing
  ○ managing state separately
● Makes caching more difficult
● Makes real time apps more difficult
● Application is responsible for preserving state
SHORT/LONG POLLING
● Used to update client side application state in “real time” applications
● Usually initiated by JavaScript
● Can be initiated by any client side technology, like Objective-C
● Short polling initiates short lived connections to check if state changed
● Long polling initiates long lived connections that stay open until state changes
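A minimal sketch of short polling, assuming a hypothetical `check_state` callable that stands in for one short-lived HTTP request and returns `None` while nothing has changed. The function name, the interval, and the attempt limit are illustrative choices, not part of the talk.

```python
import time
from typing import Callable, Optional


def short_poll(check_state: Callable[[], Optional[str]],
               interval: float = 1.0,
               max_attempts: int = 5) -> Optional[str]:
    """Ask repeatedly until the state changes or the attempts run out."""
    for attempt in range(max_attempts):
        state = check_state()            # one short-lived "request" per attempt
        if state is not None:            # None means "nothing changed yet"
            return state
        if attempt < max_attempts - 1:
            time.sleep(interval)         # wait before asking again
    return None


# Stand-in for a server that reports a change on the third request.
responses = iter([None, None, "order ready"])
result = short_poll(lambda: next(responses), interval=0.01)
print(result)
```

Long polling would instead make a single request that the server holds open until it has something to report, so the client loop waits on the response rather than on a timer.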
THE REQUEST
GET https://www.google.com/ HTTP/1.1
:version: HTTP/1.1
:method: GET
:scheme: https
:host: www.google.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8,es-419;q=0.6,es;q=0.4
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
cookie: OGP=-3904011:; HSID=A0hmwhHriSEJzPSI; SSID=AKHSzv76RXaggJwJ; APISID=PXmCmOabqgrdcm_z/A7eIE7i4enNC0Hn0;
THE REQUEST
● Human readable text document
● Composed of the request line, a set of headers, and an optional content body
● Headers are key value pairs separated by a colon & terminated by a new line
● Headers describe the request and offer additional metadata
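The structure described above can be sketched as a small parser. This is an illustration only, assuming a plain HTTP/1.1 message with CRLF line endings; the `parse_request` name and the sample request are made up for the example, and repeated headers or names that start with a colon are not handled.

```python
def parse_request(raw: str):
    """Split a raw HTTP/1.x request into (request_line, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")    # a blank line ends the headers
    lines = head.split("\r\n")
    request_line = lines[0]                      # first line of the document
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")     # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = (
    "POST /orders HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "one coffee"
)
request_line, headers, body = parse_request(raw)
print(request_line)
print(headers)
print(body)
```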
THE REQUEST LINE
GET https://www.google.com/ HTTP/1.1
● The request line is the first line of the document
● Composed of 3 parts
● From the right: HTTP version
  ○ Lets the server know which headers it can expect
THE REQUEST
GET https://www.google.com/ HTTP/1.1
● URL (Uniform Resource Locator)
● Every request is for a resource
● Like interacting with a bank teller
● Composed of the scheme, the host, the path, and optionally a query string
http://server/path/?query=string
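As a quick illustration, Python's standard `urllib.parse` module can take the sample URL apart into the pieces listed above. The variable names are arbitrary.

```python
from urllib.parse import urlsplit, parse_qs

parts = urlsplit("http://server/path/?query=string")

print(parts.scheme)            # http
print(parts.netloc)            # server
print(parts.path)              # /path/
print(parse_qs(parts.query))   # {'query': ['string']}
```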
THE REQUEST
GET https://www.google.com/ HTTP/1.1
● A verb indicating what you would like to do with the resource
● Withdraw money, create a new account, deposit money, or even rob the bank
COMMON METHODS
GET, POST, PUT, DELETE, HEAD, OPTIONS
● Also called verbs
● Describe the intent of the request
● The CRUD operations (create, read, update, delete) are the most common
● These are only a small subset
● Some, like PATCH, were added later in their own RFC (RFC 5789)
COMMON HEADERS
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36
● Describes the client
● Set by the client
● Can be changed programmatically
● The Mozilla/5.0 compatibility token is a holdover from the Netscape years
COMMON HEADERS
accept-encoding: gzip,deflate,sdch
accept-language: en-US,en;q=0.8,es-419;q=0.6,es;q=0.4
accept-charset: utf-8
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
ACCEPT FAMILY
● Describes the type of content the client can understand
● The accept header is a list of MIME types
● ;q= indicates preference level, from 0 to 1 (1 when omitted)
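To make the `;q=` preference levels concrete, here is a rough sketch that ranks the entries of an accept-style header. The `parse_accept` helper is hypothetical and ignores parameters other than `q`.

```python
def parse_accept(header: str):
    """Return (value, q) pairs sorted from most to least preferred."""
    items = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        value, q = pieces[0], 1.0            # q defaults to 1 when omitted
        for param in pieces[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        items.append((value, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(items, key=lambda item: item[1], reverse=True)


ranked = parse_accept("en-US,en;q=0.8,es-419;q=0.6,es;q=0.4")
print(ranked)
```

With the accept-language value from the slide, US English comes first and generic Spanish last.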
COMMON MIME TYPES
● text/html
● text/css
● text/javascript
● text/xml
● text/plain
● application/json
● application/rss+xml
● multipart/form-data
● image/jpeg
● image/gif
● image/png
● audio/mpeg
● video/mpeg
● video/x-flv
COMMON HEADERS
cookie: SSID=AKHSzv76RXaggJwJ;
● Describes the contents of a cookie file set by a previous connection to the same host
● Used to persist data across HTTP connections
● Stored in files locally or in memory in the client process
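On the receiving side, a cookie header is just a list of name=value pairs. A small sketch using Python's standard `http.cookies` module, with two of the cookie values from the request example:

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie.load("HSID=A0hmwhHriSEJzPSI; SSID=AKHSzv76RXaggJwJ")

# Each entry is a Morsel; .value holds the cookie's content.
values = {name: morsel.value for name, morsel in cookie.items()}
print(values)
```

An application would typically use one such value as a key to look up the state it stored for that client.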
NOT SO COMMON
authorization: Basic QWpIlc2FtZQ==
● Describes login credentials to password protected URLs
● Two methods, Basic and Digest
● Digest is more secure, but more complicated to set up
● If not included, the server responds by requesting a set of credentials
● Best if used in combination with TLS/SSL
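A Basic authorization value is only the base64 encoding of `username:password`, which is why it is best combined with TLS/SSL. The sketch below uses a hypothetical `basic_auth_header` helper and made-up credentials.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic method."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


header = basic_auth_header("Aladdin", "open sesame")
print(header)

# Anyone who sees the header can reverse it: base64 is encoding, not encryption.
print(base64.b64decode(header.split(" ", 1)[1]).decode("utf-8"))
```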
NOT SO COMMON
x-hello: world
hello: world
● x- used to describe a custom header
● Deprecated by one of the latest RFCs (RFC 6648)
● Still used by some APIs
● New form is not to use the x-
● Future proof
REQUEST BODY
Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

Larry
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--
REQUEST BODY
● Optional content for POST, PUT, etc. requests
● Typically used to send data from HTML forms
● Simple form data is formatted as key value pairs with no boundary
● Multipart is the most complicated
● Multipart form data is separated by boundaries & terminated by the boundary plus --
● File uploads need to be done with multipart
● Content-Type is a MIME type describing the contents of the file
● Could be base64 representation of binary data
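The boundary rules above can be sketched as a small builder. This is a simplified illustration, not a complete implementation: `build_multipart` is a hypothetical helper, it does no escaping of names or filenames, and it assumes the boundary string never appears inside the data.

```python
def build_multipart(fields, files, boundary):
    """Build a multipart/form-data body as bytes.

    fields: dict of name -> text value
    files:  dict of name -> (filename, content_type, data_bytes)
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",                                   # blank line before the value
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,
        ]
    lines += [delimiter + b"--", b""]              # closing boundary plus --
    return b"\r\n".join(lines)


body = build_multipart(
    {"submit-name": "Larry"},
    {"files": ("file1.txt", "text/plain", b"hello")},
    "AaB03x",
)
print(body.decode("utf-8"))
```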
THE RESPONSE
HTTP/1.1 200 OK
status: 200 OK
version: HTTP/1.1
content-encoding: gzip
content-type: text/html; charset=UTF-8
date: Wed, 20 Nov 2013 01:48:58 GMT
set-cookie: PREF=ID=26af7b02617ef537:U=9bc26b9e4; expires=Fri, 20-Nov-2015 01:48:58 GMT; path=/; domain=.google.com
COMMON HEADERS
content-encoding: gzip
content-type: text/html; charset=UTF-8
● The content body can be anything from binary, to JSON, to HTML
● The content returned is described by the content-type & content-encoding headers
● Related to the accept family of request headers
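A client usually reads the content-type before deciding how to decode the body. The sketch below splits the header into the MIME type and its parameters; `parse_content_type` is a hypothetical helper that handles only simple `key=value` parameters.

```python
def parse_content_type(header: str):
    """Return (mime_type, params) for a content-type header value."""
    media_type, *params = [p.strip() for p in header.split(";")]
    options = {}
    for param in params:
        if not param:                    # tolerate a trailing semicolon
            continue
        key, _, value = param.partition("=")
        options[key.strip().lower()] = value.strip().strip('"')
    return media_type.lower(), options


mime_type, params = parse_content_type("text/html; charset=UTF-8")
print(mime_type)
print(params)
```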
COMMON HEADERS
set-cookie: PREF=ID=26af7b02617ef537:U=9bc26b9e4; expires=Fri, 20-Nov-2015 01:48:58 GMT; path=/; domain=.google.com
● Sets or overrides a cookie in the client’s system
● Cookie content
● Optional expiration date
● Path & domain the cookie applies to
● Localhost is not a valid domain. When testing it’s preferable not to set the domain
THE RESPONSE
HTTP/1.1 200 OK
● The status line is the only thing required to be sent back
● Sometimes the only thing sent back
● Apache always sends back all the SHOULD headers
STATUS CODES
200 OK, 404 NOT FOUND, 500 INTERNAL SERVER ERROR
STATUS CODE FAMILIES
● 1xx: Informational Messages
● 2xx: Success Messages
● 3xx: Redirection Messages
● 4xx: Client Error
● 5xx: Server Error
● Specific codes convey specific messages
● Sometimes sending the status code is enough to communicate a message
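Because the first digit carries the family, a client can react sensibly even to a code it has never seen. A small sketch using Python's standard `http.HTTPStatus`; the `describe` helper and the fallback text are illustrative.

```python
from http import HTTPStatus

FAMILIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Name the family of a status code, plus its standard phrase if known."""
    family = FAMILIES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:                   # not a code this Python version knows
        phrase = "(non-standard)"
    return f"{code} {phrase}: {family}"


for code in (200, 404, 500, 420):
    print(describe(code))
```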
1XX STATUS CODES
● 100 CONTINUE
● 101 SWITCHING PROTOCOLS
● Not very common
● Perfect for use with polling techniques for asynchronous tasks
2XX STATUS CODES
● 201 CREATED
● 202 ACCEPTED
3XX STATUS CODES
● 301 MOVED PERMANENTLY
● 302 FOUND
● 304 NOT MODIFIED
● 305 USE PROXY
4XX STATUS CODES
● 401 UNAUTHORIZED
● 402 PAYMENT REQUIRED
● 403 FORBIDDEN
● 429 TOO MANY REQUESTS
5XX STATUS CODES
● 501 NOT IMPLEMENTED
● 502 BAD GATEWAY
● 503 SERVICE UNAVAILABLE
NOT JUST STANDARD
418 & 420
● 418 is I AM A TEAPOT, IETF April Fool’s Joke
● 420 used by Twitter for a while to indicate too many connections
WHY DOES ANY OF IT MATTER?
FORMS
● POST requests are marginally more secure, but not really
● Requests that carry content can carry more content in the body than in the query string
● Forms can send both query strings and content
● Can submit forms through XMLHttpRequest with extra headers
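The same form data can travel in the query string or in the body. A brief sketch with the standard `urllib.parse.urlencode`; the field names and the `http://server/path/` URL are placeholders.

```python
from urllib.parse import urlencode

form = {"name": "Larry", "drink": "flat white"}
encoded = urlencode(form)                    # key=value pairs joined by &

# GET: the data rides on the URL as a query string.
get_url = "http://server/path/?" + encoded

# POST: the same pairs go in the body, described by the Content-Type header.
post_body = encoded.encode("ascii")
post_headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Content-Length": str(len(post_body)),
}

print(get_url)
print(post_headers)
print(post_body)
```

Either way the data is plain text on the wire, which is why POST alone adds little security.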
BETTER SECURITY
● Use of Auth headers
● Use of custom headers
  ○ Server can reply with CSRF Tokens
  ○ Client can send OAuth Tokens
● Still not as secure as using SSL, but better than nothing at all.
APIs
● Not just about hypermedia; all of it is important
● Well documented
● URLs that point to actual resources
● Use of request methods & headers
● Use of proper response codes
● Standard communication without vendor sponsorship
WHAT WE LEFT OUT
● Caching
● Proxies
● Load balancing
● TLS
THE FUTURE
● New RFCs and specifications
  ○ Patch method
  ○ New status codes
  ○ HTTP 2.0
● SPDY
  ○ Experimental protocol for a faster web
  ○ Pronounced speedy
  ○ Implementation before standardization
  ○ Claims of a 64% reduction in page load time over HTTP in lab tests
  ○ Many concurrent requests over one TCP connection
RESOURCES
● http://net.tutsplus.com/tutorials/tools-and-tips/http-the-protocol-every-web-developer-must-know-part-1/
● http://net.tutsplus.com/sessions/http-succinctly/
● http://en.wikipedia.org/wiki/List_of_HTTP_status_codes#1xx_Informational
● http://en.wikipedia.org/wiki/Internet_media_type
● http://www.nczonline.net/blog/2009/05/05/http-cookies-explained/
● http://www.chromium.org/spdy/spdy-whitepaper
● http://http2.github.io/
● http://xkcd.com/869/
● http://blog.servergrove.com/2013/12/16/talking-http/