…and other stuff
that makes the web work
Bits ‘bout Moi!
Senor Bipin Upadhyay
Developer, Directi Pvt. Ltd.
Lead, NULL Open Security Group – Mumbai Chapter
OWASP ESAPI-PHP Committer
Part of IHP (Honeynet Project)
Amateur Photographer
I know Kung-fu…
If Only it was true…
Think about the possibilities…
I know Kung-fu
Me too..
Me three..
Sigh! But it ain’t true, yet!
Agenda
http://icanhascheezburger.files.wordpress.com/2009/02/funny-pictures-cat-has-naps-on-his-agenda.jpg
Agenda
Intro: What & Why???
OSI model: Back to the basics
10,000-foot view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
Bit of History
Mar’89 – Tim Berners-Lee presents “Information Management: A Proposal”
Aug’91 – Berners-Lee announces the WWW
Mar’93 – Mosaic announced
Mar’94 – Netscape founded
Oct’94 – W3C founded by Tim Berners-Lee
Web 2.0, uh!
http://www.wagnerblog.com/images/AjaxDarkSide.jpg
HTTP: What is it?
Part of the Application Layer of the TCP/IP protocol suite
A set of grammatical rules for a client and server to communicate
http://www.flickr.com/photos/joshfassbind/4584323789/
HTTP is what powers the WWW
…but
http://www.flickr.com/photos/quinnanya/4456123452/
Why should I bother?
Because:
web development sucks
http://www.flickr.com/photos/sneeu/1589152071/
Even your grandmom knows, ‘tis all about fundamentals
Why should I bother?
Also:
facilitates debugging,
improves understanding of security & performance
Agenda
Intro: What & Why???
OSI model: Back to the basics
10,000-foot view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
http://www.flickr.com/photos/stephenpoff/2312981944/
OSI & TCP/IP protocol suite
OSI is a reference model
http://blog.uad.ac.id/imam_riadi/files/2009/01/osi-layer.jpg
OSI & TCP/IP protocol suite…
The TCP/IP protocol suite is what the Internet actually runs on; its layers map roughly onto the OSI layers
http://www.hill2dot0.com/wiki/index.php?title=Image:G0209_TCPIP_vs_OSI.jpg
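The picture boils down to a lookup table. Here is a minimal sketch of the usual textbook mapping between the seven OSI layers and the four TCP/IP layers. The names `OSI_TO_TCPIP`, `EXAMPLE_PROTOCOLS`, `tcpip_layer` and `osi_layers_for` are made up for this illustration, and the protocol examples are only for orientation.

```python
# OSI layers (top to bottom) mapped onto the four TCP/IP layers.
# This is the common textbook mapping, not something TCP/IP itself defines.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data Link": "Link",
    "Physical": "Link",
}

# A few well-known protocols per TCP/IP layer, for orientation only.
EXAMPLE_PROTOCOLS = {
    "Application": ["HTTP", "DNS"],
    "Transport": ["TCP", "UDP"],
    "Internet": ["IP"],
    "Link": ["Ethernet"],
}


def tcpip_layer(osi_layer: str) -> str:
    """Return the TCP/IP layer that covers the given OSI layer."""
    return OSI_TO_TCPIP[osi_layer]


def osi_layers_for(tcpip: str) -> list:
    """Return the OSI layers folded into the given TCP/IP layer."""
    return [osi for osi, mapped in OSI_TO_TCPIP.items() if mapped == tcpip]
```

HTTP sits in the top row: everything from OSI's Session layer upward is simply "Application" in TCP/IP terms.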
OSI & TCP/IP protocol suite…
Visual learning: Wireshark, baby
http://www.wireshark.org/
Agenda
Intro: What & Why???
OSI model: Back to the basics
10,000-foot view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
The Communication
My favorite interview question:
http://www.flickr.com/photos/terryhart/2890904949/
What happens between the moment when:
we click on a hyperlink
and the page is completely rendered in the browser?
Browser -> Internetz -> Proxy -> LB -> Web Server -> DB Server
Client <-> Server (null.co.in)
null.co.in -> Browser cache / hosts file / DNS server -> 74.53.228.212
TCP Connection: There, bro?
Client -> Server: SYN
TCP Connection: Yo!
Server -> Client: SYN-ACK
TCP Connection: Cool!
Client -> Server: ACK
HTTP: Got this file?
Client -> Server: GET /
HTTP: Yup! Here ‘tis.
Server -> Client: 200 OK (index.html)
HTTP: Can I have these as well?
Client -> Server: GET /js.js, GET /pic.jpg
HTTP: Sure!
Server -> Client: 200 OK (more content…)
TCP Connection: Arigato, am done.
Client -> Server: FIN
TCP Connection: Sayonara!
Server -> Client: FIN-ACK
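The whole exchange above can be replayed with nothing but a raw socket. This is a rough sketch, not how a real browser does it: a throwaway local server stands in for null.co.in so nothing leaves your machine, and `IndexHandler` and `fetch_index` are made-up names for the demo.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class IndexHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the web server: answers GET with a tiny page."""

    def do_GET(self):
        body = b"<html><body>index.html</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_index() -> str:
    """Resolve a name, open a TCP connection, send GET /, return the raw response."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), IndexHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Name lookup: the hosts file or a DNS server answers this.
        ip = socket.gethostbyname("localhost")
        port = server.server_address[1]
        # create_connection() triggers the SYN / SYN-ACK / ACK handshake.
        with socket.create_connection((ip, port), timeout=5) as conn:
            request = (
                "GET / HTTP/1.1\r\n"
                "Host: localhost\r\n"
                "Connection: close\r\n"
                "\r\n"
            )
            conn.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # the server closed its side
                    break
                chunks.append(data)
        # Leaving the with-block closes our side of the connection (FIN).
        return b"".join(chunks).decode("iso-8859-1")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_index()` returns the status line, the headers, a blank line and then the HTML, exactly the bytes Wireshark would show for the HTTP part of the conversation.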
The Communication
…or simply
The Communication
Web 2.0 has blurred the distinction between client and server
Conventionally, the client sends an HTTP request
The server responds with an HTTP response
The Communication: HTTP Request
Request Line
Request Method
Requested Resource
HTTP Version used
Headers
General Headers
Request Headers
Entity Headers
Content (Optional)
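To see the three parts of a request, here is a small sketch that splits a raw request into its request line, headers and content. `parse_request`, `RAW_REQUEST` and the `/login` resource are hypothetical, and the parser ignores corner cases such as repeated or folded headers.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into request line, headers and content."""
    # A blank line (CRLF CRLF) separates the headers from the content.
    head, _, content = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    # Request line: method, requested resource, HTTP version.
    method, resource, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "method": method,
        "resource": resource,
        "version": version,
        "headers": headers,
        "content": content,
    }


# Hypothetical request, made up for this example.
RAW_REQUEST = (
    "POST /login HTTP/1.1\r\n"
    "Host: null.co.in\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 8\r\n"
    "\r\n"
    "user=moi"
)
```

`parse_request(RAW_REQUEST)["headers"]["Host"]` gives back `null.co.in`; a GET request looks the same, only with an empty content part.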
The Communication: HTTP Response
Status Line
HTTP version(s) understood by server
Status code (3 digit numerical value)
Status description
Headers
General Headers
Response Headers
Entity Headers
Content (Optional)
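The response has the same shape, with a status line on top. A minimal sketch, assuming a well-formed response; `Response`, `parse_response`, `RAW_RESPONSE` and the `demo-server` value are invented for illustration.

```python
from typing import Dict, NamedTuple


class Response(NamedTuple):
    version: str
    status_code: int
    description: str
    headers: Dict[str, str]
    content: str


def parse_response(raw: str) -> Response:
    """Split a raw HTTP response into status line, headers and content."""
    head, _, content = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # Status line: HTTP version, 3-digit status code, status description.
    version, code, description = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return Response(version, int(code), description, headers, content)


# Hypothetical response, made up for this example.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Server: demo-server\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<h1>null</h1>"
)
```

Splitting the status line on the first two spaces only keeps multi-word descriptions such as "Not Found" in one piece.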
Agenda
Intro: What & Why???
OSI model: Back to the basics
10,000-foot view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
http://www.saynotocrack.com/wp-content/uploads/2007/06/flinstones-anatomy.jpg
Anatomy
An HTTP request and response are made up of several components:
Request Methods
Response Status Codes
Request Headers
Response Headers
General Headers
Entity Headers
Content (MIME Media Types)
Anatomy: Request Methods
Humans can convey emotions in several ways
Why should HTTP clients lag behind?!
HTTP methods tell the server what kind of action the client wants performed on a resource
GET POST HEAD OPTIONS
TRACE PUT DELETE CONNECT
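Same resource, different method, different conversation. The sketch below sends a few methods to a throwaway local server using Python's standard `http.client`. `DemoHandler`, `PAGE` and `send` are made-up names, and the server only implements GET and HEAD.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<h1>null</h1>"


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server that understands only GET and HEAD."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)  # headers plus content

    def do_HEAD(self):
        self._send_headers()  # same headers, no content

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def send(method: str, path: str = "/"):
    """Send one request; return (status code, Content-Length header, content)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request(method, path)
        response = conn.getresponse()
        result = (
            response.status,
            response.getheader("Content-Length"),
            response.read(),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

`send("HEAD")` reports the same Content-Length as `send("GET")` but carries no content. A method this demo server does not implement, such as `send("DELETE")`, gets a 501 from it.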
Anatomy: Response Status Codes
Indicate the outcome of a request (the server’s mood, if you like)
A combination of a numeric code and a short description
Can be grouped into 5 categories:
1xx -- Informational
2xx -- Successful
3xx -- Redirection
4xx -- Client Error
5xx -- Server Error
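The first digit is all you need to find the category. A quick sketch using the standard library's `http.HTTPStatus` for the description; `CATEGORIES` and `describe` are names invented for this example.

```python
from http import HTTPStatus

# First digit of the status code -> category, as listed above.
CATEGORIES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return 'code description (category)' for an HTTP status code."""
    category = CATEGORIES.get(code // 100)
    if category is None:
        raise ValueError(f"not an HTTP status code: {code}")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The category is still known even when the exact code is not.
        phrase = "Unknown"
    return f"{code} {phrase} ({category})"
```

A client that meets a code it has never seen can still act on the category, which is why the fallback keeps the class even when the description is unknown.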
Anatomy: Request Headers
Specific to an HTTP Request
Carry information about the client, and the type of request
Facilitate better understanding between client and server
Host Accept-Language If-Modified-Since Referer
User-Agent Authorization If-None-Match Expect
Accept Proxy-Authorization If-Range From
Accept-Charset Max-Forwards If-Unmodified-Since TE
Accept-Encoding If-Match Range
Anatomy: Response Headers
Specific to an HTTP Response
Carry information about the server, and the type of response
Accept-Ranges ETag Retry-After WWW-Authenticate
Age Location Server Proxy-Authenticate
Vary
Anatomy: General Headers
Carry information about the HTTP transaction
Can be a part of request, as well as response
Cache-Control Keep-Alive Pragma Via
Connection Upgrade Trailer Warning
Transfer-Encoding Date
Anatomy: Entity Headers
Carry information about the content
Mainly a part of HTTP response
Allow Content-Language Content-Location Content-Range
Content-Encoding Content-Length Content-MD5 Content-Type
Expires Last-Modified
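With all four families listed, sorting a header into its family is a dictionary lookup. The sketch below uses exactly the header names from the lists above; `HEADER_FAMILIES` and `classify_header` are made-up names, and anything not on those lists comes back as "unknown".

```python
# Header families, using the names from the lists above.
HEADER_FAMILIES = {
    "request": [
        "Host", "Accept-Language", "If-Modified-Since", "Referer",
        "User-Agent", "Authorization", "If-None-Match", "Expect",
        "Accept", "Proxy-Authorization", "If-Range", "From",
        "Accept-Charset", "Max-Forwards", "If-Unmodified-Since", "TE",
        "Accept-Encoding", "If-Match", "Range",
    ],
    "response": [
        "Accept-Ranges", "ETag", "Retry-After", "WWW-Authenticate",
        "Age", "Location", "Server", "Proxy-Authenticate", "Vary",
    ],
    "general": [
        "Cache-Control", "Keep-Alive", "Pragma", "Via", "Connection",
        "Upgrade", "Trailer", "Warning", "Transfer-Encoding", "Date",
    ],
    "entity": [
        "Allow", "Content-Language", "Content-Location", "Content-Range",
        "Content-Encoding", "Content-Length", "Content-MD5", "Content-Type",
        "Expires", "Last-Modified",
    ],
}

# Header names are case-insensitive, so index them in lower case.
_LOOKUP = {
    name.lower(): family
    for family, names in HEADER_FAMILIES.items()
    for name in names
}


def classify_header(name: str) -> str:
    """Return 'request', 'response', 'general', 'entity' or 'unknown'."""
    return _LOOKUP.get(name.strip().lower(), "unknown")
```

Handy when staring at a Wireshark capture: `classify_header("etag")` says "response", `classify_header("Content-Type")` says "entity".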
Anatomy: Content
IANA maintains a list of valid content types
The content type of a message is specified by the Content-Type entity header
Content types are grouped into 9 top-level MIME media types:
application audio example image
message model multipart text
video
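A content type is written as `type/subtype`, for example `text/html`. Here is a small sketch that guesses it from a file name with Python's standard `mimetypes` module, roughly what a static file server does before setting Content-Type. `media_type` is a made-up helper, and the guess depends on the extension only, not on the file's bytes.

```python
import mimetypes


def media_type(filename: str):
    """Guess (top-level type, subtype) from a file name, or None if unknown."""
    content_type, _encoding = mimetypes.guess_type(filename)
    if content_type is None:
        return None
    top_level, _, subtype = content_type.partition("/")
    return top_level, subtype
```

For the files from the earlier walk-through, `media_type("index.html")` lands in `text` and `media_type("pic.jpg")` lands in `image`.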
Agenda
Intro: What & Why???
OSI model: Back to the basics
10,000-foot view: How the web works
RFC 2616: Anatomy
RFC 2965: Handling Statelessness
Handling Statelessness
HTTP is a stateless protocol
i.e., server’s got a bad memory
Handling Statelessness
Cookies to the rescue
http://www.flickr.com/photos/lij/283869088/
Handling Statelessness
Cookies:
are small pieces of text stored by the client browser
maintain a session by storing information that the browser sends back with each request
are non-executable
Handling Statelessness
Cookie attributes:
name=value
expires=value
domain=value
path=value
Secure
HttpOnly (not a part of the RFC 2965 spec)
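These attributes travel in a Set-Cookie response header, and the browser hands the name=value pair back in a Cookie request header. A minimal sketch with Python's standard `http.cookies` module; `build_set_cookie`, `parse_cookie_header`, the `sid` cookie and the expiry date are all made up for the example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name: str, value: str) -> str:
    """Server side: build a Set-Cookie header line with the attributes above."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["expires"] = "Wed, 01 Jan 2031 00:00:00 GMT"  # hypothetical date
    morsel["domain"] = "null.co.in"
    morsel["path"] = "/"
    morsel["secure"] = True    # only send over HTTPS
    morsel["httponly"] = True  # hide from page scripts
    return cookie.output()


def parse_cookie_header(header_value: str) -> dict:
    """Server side, next request: read back the Cookie header the browser sent."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}
```

The server's memory is still bad. It just reads the `sid` it handed out earlier from every incoming request, e.g. `parse_cookie_header("sid=abc123")`, and looks up the session from there.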
Conclusion
The single biggest problem in communication
is the illusion… that it has taken place.
--George Bernard Shaw
Think about it
Q&A!!!
Got queries? Raise your hands.
Arigato!
Contact info:
Om—At—[projectbee.org/null.co.in]
http://projectbee.org/
Twitter - @bipinu
Flickr -- projectbee