
Post on 22-Jan-2017


Understanding the Web through HTTP

Olivia Brundage

Agenda

• Overall Flow of Data

• How HTTP Requests Work

• Introduction to HTTPS

Meet Datum!

He’s just a baby now, but let’s see how he grows into big data through the OSI (Open Systems Interconnection) model.

Physical Layer

Datum’s home town.

He communicates with everyone through a physical medium (like wires).

His language is bits (0’s and 1’s).

Data Link Layer

To talk to his neighbors, Datum’s bits get encapsulated in a frame so that the receiver knows where the message starts and ends.

This layer provides node-to-node data transfer.

Frame: Header (1010) | DATA (101001000100111) | Trailer

• Header: bit pattern that specifies the start of the frame

• Trailer: bit pattern that specifies the end of the frame

Network Layer

In order to reach the outside world, Datum has to be transformed into a packet.

Routers are responsible for directing data to the correct machine.

This is where you’ll find IP addresses.


Transport Layer

To reach his destination, Datum must be transported via a segment or datagram.

Segments are sent through the Transmission Control Protocol (TCP), which is connection-oriented: both ends set up a connection before any data flows.

Datagrams are sent through the User Datagram Protocol (UDP), which is connectionless: data is sent without setting up a connection first (e.g., streaming).
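To make the TCP/UDP contrast concrete, here is a minimal sketch using Python's standard `socket` module. It assumes everything runs on the local machine over the loopback address; the function names are made up for illustration.

```python
import socket

LOOPBACK = "127.0.0.1"  # assumption: the demo stays on the local machine


def send_over_udp(payload: bytes) -> bytes:
    """Send one datagram with no connection setup and return what arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind((LOOPBACK, 0))   # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _ = receiver.recvfrom(4096)
        return data


def send_over_tcp(payload: bytes) -> bytes:
    """Open a connection first, then send the payload over it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind((LOOPBACK, 0))
        listener.listen(1)
        # connect() performs the TCP handshake before any data is sent.
        with socket.create_connection(listener.getsockname(), timeout=2.0) as client:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(payload)
                client.shutdown(socket.SHUT_WR)  # signal "no more data"
                chunks = []
                while True:
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    chunks.append(chunk)
                return b"".join(chunks)
```

Both functions return the bytes that reached the receiving side, so `send_over_udp(b"datum")` and `send_over_tcp(b"datum")` should each give back `b"datum"`; the difference is only in whether a connection is established first.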


Session Layer

Upon arriving at his destination, Datum must open a session with the client so that they can continue their business.

Once here, Datum evolves into his final form: Data.


Presentation Layer

Datum is now on the full screen!

This layer translates the data into a form the application can use (character encoding, compression, encryption) and delivers it to the final layer.


Application Layer

This layer is the end-user product and contains high-level APIs like resource sharing and remote file access.

This is also the layer you develop in!


What’s with this abstraction?

• Gives us a framework for how data is transformed as it moves through the network.

• But it’s a little too specific; real networks are a lot muddier than this.

OSI vs TCP/IP

TCP/IP combines some layers of the OSI model, making it a more succinct fit for the messier way real networks behave.
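One common way to picture the combination is a lookup table from each OSI layer to the TCP/IP layer that absorbs it. The sketch below assumes the usual four-layer TCP/IP model; the exact grouping varies a little between textbooks, so treat it as illustrative.

```python
# Assumed mapping: the common four-layer view of TCP/IP.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data Link": "Network Interface",
    "Physical": "Network Interface",
}


def tcpip_layers():
    """Return the TCP/IP layers from top to bottom, without duplicates."""
    return list(dict.fromkeys(OSI_TO_TCPIP.values()))


def osi_layers_in(tcpip_layer):
    """Return the OSI layers that fold into the given TCP/IP layer."""
    return [osi for osi, tcpip in OSI_TO_TCPIP.items() if tcpip == tcpip_layer]
```

For example, `osi_layers_in("Application")` returns the three top OSI layers, which is where the seven layers shrink to four.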

Where does HTTP fit into this?

First, let’s go over what HTTP is

• HTTP stands for Hypertext Transfer Protocol.

• This protocol is plain text and stateless.

• This protocol resides in the application layer.

Let’s break down these requests!

Make the Request!

You type the URL (Uniform Resource Locator) in the browser:

http://www.google.com

Hey, wait! What’s a URL?

http://www.domain.com:1234/path/to/resource?a=b&x=y

• protocol: http

• host: www.domain.com

• port: 1234

• resource path: /path/to/resource

• query: a=b&x=y
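The same breakdown can be reproduced with Python's standard `urllib.parse` module. This is a small sketch; the function name `describe_url` and the dictionary keys (which mirror the slide's labels) are chosen here for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Split a URL into the parts labeled on the slide."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "resource path": parts.path,
        # parse_qs maps each query key to a list of its values
        "query": parse_qs(parts.query),
    }


example = describe_url("http://www.domain.com:1234/path/to/resource?a=b&x=y")
```

Note that `port` comes back as `None` when the URL does not spell one out; the browser then falls back to the protocol's default port.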

Now, time to get the IP address!

After you type the address, another application layer protocol is used to get the IP address: the Domain Name System (DNS).

Web Browser: What’s the IP of Google’s server?

Domain Name Server: Google’s IP is 65.246.5.22
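In code, the browser's question to the name server can be sketched with `socket.getaddrinfo` from Python's standard library, which asks the operating system's resolver. The helper name `resolve_ipv4` is an assumption for this example, and the addresses returned for a real host name depend on when and where you ask.

```python
import socket


def resolve_ipv4(hostname):
    """Ask the system resolver for the IPv4 addresses of a host name."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry ends with a (address, port) pair; keep the unique addresses.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_ipv4("www.google.com")` needs network access and will likely return different addresses than the one on the slide, since large sites answer with many servers.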

HTTP makes the Request Packet

Generic Structure of HTTP Requests

message = <start-line>
          *(<message-header>)
          CRLF                  # Carriage Return Line Feed (i.e., new line)
          [<message-body>]

• Start line contains the initial request

• Message headers give more details about the request you’re making (e.g., the host, how to maintain the connection, how to handle cookies).

NB: GET requests do not contain a message body, but POST requests can.

tl;dr:

A simple request looks like:

VERB RESOURCE-URL PROTOCOL
MESSAGE-HEADERS

GET / HTTP/1.1
HOST: www.google.com
CONNECTION: keep-alive
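As a rough sketch of how that text is assembled on the wire, the helper below (a hypothetical `build_request`, not part of any library) joins the start line and headers with CRLF and ends the header section with an empty line, following the generic structure above.

```python
CRLF = "\r\n"  # Carriage Return + Line Feed


def build_request(verb, resource, headers, body=""):
    """Assemble a plain-text HTTP/1.1 request as bytes."""
    start_line = f"{verb} {resource} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    head = CRLF.join([start_line, *header_lines])
    # An empty line (CRLF CRLF) separates the headers from the optional body.
    return (head + CRLF + CRLF + body).encode("utf-8")


request = build_request(
    "GET", "/", {"Host": "www.google.com", "Connection": "keep-alive"}
)
```

Header names are not case-sensitive, so `Host` here and `HOST` on the slide mean the same thing to the server.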

HTTP Main Verbs

• GET: fetches a resource determined by the URL

• The server sends the resource in the message body if the status code is 200

• POST: creates a new resource, where the request specifies the data needed for the resource

• Params are carried in the body of the request instead of the URL, making this a more ‘secure’ type of request (the body is still plain text unless HTTPS is used)

• PUT: updates a resource

• DELETE: deletes a resource

NB: PUT and DELETE can be considered specialized versions of POST.
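To see how the four main verbs map onto actions, here is a toy in-memory "server". Everything in it is hypothetical (the `ResourceStore` class, the path scheme, and the choice of 201 and 204 as success codes for POST and DELETE); real servers decide these details themselves.

```python
class ResourceStore:
    """Hypothetical in-memory server that maps HTTP verbs to actions."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def handle(self, verb, path, body=None):
        """Return a (status code, payload) pair for the request."""
        if verb == "POST":
            # Create a new resource under the given collection path.
            new_path = f"{path.rstrip('/')}/{self.next_id}"
            self.next_id += 1
            self.resources[new_path] = body
            return 201, new_path
        if path not in self.resources:
            return 404, None
        if verb == "GET":
            return 200, self.resources[path]
        if verb == "PUT":
            self.resources[path] = body  # update the existing resource
            return 200, body
        if verb == "DELETE":
            del self.resources[path]
            return 204, None
        return 405, None  # verb not supported for this resource


store = ResourceStore()
created = store.handle("POST", "/notes", "hello")
fetched = store.handle("GET", "/notes/1")
updated = store.handle("PUT", "/notes/1", "hello again")
deleted = store.handle("DELETE", "/notes/1")
missing = store.handle("GET", "/notes/1")
```

After the DELETE, the final GET comes back with 404 because the resource no longer exists.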

Lesser-known Verbs

• HEAD: Requests only the server headers. Primarily used for checking if the resource has changed via timestamps.

• TRACE: Echoes the request back as the server received it, so the client can see what the hops along the round trip to the server changed. Used for network diagnostic purposes.

• OPTIONS: Retrieves the server capabilities. On the client side, it can be used to modify the request based on what the server can support.

HTTP Packet gets ready for Transportation!

HTTP Packet: HTTP Request

TCP Packet: TCP | HTTP Request

The TCP information maintains the session. Now to the IP layer!

TCP now hands it over to the Internet Protocol

• Local/Sender Address: Your PC’s IP

• Receiver Address: Google’s Server IP

• Post Service Nodes: Routers

IP further encapsulates the data.

IP Packet: IP | TCP | HTTP Request

Now we head over to the last layer!

The Network Interface Layer makes the Ethernet frame.

Ethernet Frame: Header | IP | TCP | HTTP Request | Trailer

We can finally send this HTTP request out!
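The nesting built up over the last few slides can be sketched as plain byte concatenation. The bracketed labels below are placeholders invented for this example; real TCP, IP, and Ethernet headers are binary structures with many fields.

```python
HTTP_REQUEST = b"GET / HTTP/1.1\r\nHost: www.google.com\r\n\r\n"


def wrap(header, payload, trailer=b""):
    """Put a header in front of the payload and an optional trailer behind it."""
    return header + payload + trailer


def encapsulate(http_request):
    """Wrap an HTTP request layer by layer, from TCP down to the Ethernet frame."""
    tcp_packet = wrap(b"[TCP]", http_request)        # placeholder TCP header
    ip_packet = wrap(b"[IP]", tcp_packet)            # placeholder IP header
    ethernet_frame = wrap(b"[Header]", ip_packet, b"[Trailer]")
    return ethernet_frame


frame = encapsulate(HTTP_REQUEST)
```

Each layer treats everything handed down from above as opaque payload, which is why the HTTP request ends up in the middle of the frame, untouched.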

The HTTP request is out!

So let’s recap.

But there’s still more to HTTP!!

Some important notes about the response:

• The server will send the status code along with the message payload.

• The status code tells the client how to interpret the server response.

1xx: Informational Messages

• These are provisional codes that provide informational messages like:

• Keep this connection alive (i.e., still sending information)

• Tell the client to continue sending its message

• Ignore the next response

• This class was introduced in HTTP/1.1. Version 1.0 ignores these messages.

2xx: Successful

Your request made it!

200 OK: Request was completely successful.

204 No Content: Request was successful, but the response has no message body.

3xx: Redirection

Your request needs to be directed elsewhere.

301 Moved Permanently: Resource has moved to a new URL.

304 Not Modified: Resource has not been modified since the last request.

4xx: Client Error

When the server thinks the client made a bad request.

400 Bad Request: Request can’t be fulfilled due to bad syntax.

401 Unauthorized: Specifically used when authentication has failed.

403 Forbidden: Request was valid, but the server refuses to fulfill it.

404 Not Found: Resource can’t be found. Try again later?

405 Method Not Allowed: Method isn’t supported (like using a GET on a form that requires a POST method).

5xx: Server Error

The server failed to fulfill the request.

500 Internal Server Error: The infamous, generic server error.

501 Not Implemented: The server doesn’t recognize the request method or can’t fulfill it.

502 Bad Gateway: The server was acting as a proxy and received something bad from the upstream server.

504 Gateway Timeout: The server was acting as a proxy and did not receive a timely response from the upstream server.
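Because the first digit of a status code names its class, a client can branch on the class before looking at the exact code. The sketch below uses `HTTPStatus` from Python's standard `http` module for the official phrases; the `describe_status` helper is made up for this example.

```python
from http import HTTPStatus

# The five classes covered above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe_status(code):
    """Return the code, its standard phrase, and its class."""
    status_class = STATUS_CLASSES[code // 100]
    phrase = HTTPStatus(code).phrase  # raises ValueError for unknown codes
    return f"{code} {phrase} ({status_class})"
```

For instance, `describe_status(404)` gives `"404 Not Found (Client Error)"`.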

Want more status codes? Here’s your source: https://httpstatusdogs.com/

Overall HTTP Interaction

So where does HTTPS come into play?

That is: HTTP over TLS, HTTP over SSL, and HTTP Secure

What HTTPS Is

• HTTPS provides authentication of the website and protects the privacy and integrity of the exchanged data.

• Security is brought to you by the Secure Sockets Layer (SSL) or the improved Transport Layer Security (TLS).

• Encryption is brought to you by Public Key Encryption and Symmetric Key Encryption.

• This security layer sits between HTTP and TCP: the secure session is set up once the TCP connection is open and before any HTTP request is sent.

HTTPS Happens Before the HTTP Request is Sent

How HTTPS Works

• Client/Server Hellos

• Authenticate the server (and, optionally, the client) with cryptography

• Generate session keys

• Further interactions are encrypted with the session keys
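From a client program's point of view, these steps are usually handled by a TLS library. Here is a minimal sketch with Python's standard `ssl` module, assuming default settings; the two function names are hypothetical, and the second one needs network access to run.

```python
import socket
import ssl


def make_client_context():
    """Build a TLS context that verifies the server's certificate and host name."""
    return ssl.create_default_context()


def negotiated_tls_version(host, port=443):
    """Open a TCP connection, run the TLS handshake, and report the version used."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5.0) as tcp_sock:
        # wrap_socket performs the hellos, authentication, and key generation.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


context = make_client_context()
```

Anything written to the wrapped socket after the handshake, including the HTTP request itself, is encrypted with the session keys.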

Questions?

Resources

• “What is the role of the OSI layers when making a request to a website?” https://www.quora.com/What-is-the-role-of-OSI-layers-when-we-open-a-webpage

• “HTTP: The Protocol Every Web Developer Must Know - Part 1” https://code.tutsplus.com/tutorials/http-the-protocol-every-web-developer-must-know-part-1--net-31177

• “HTTP: The Protocol Every Web Developer Must Know - Part 2” https://code.tutsplus.com/tutorials/http-the-protocol-every-web-developer-must-know-part-2--net-31155

• “Understanding HTTP Basics” http://learn.onemonth.com/understanding-http-basics