Internet Engineering Course Web Servers

Upload: xiang

Post on 25-Feb-2016


csci5211: Computer Networks and Data Communications


Introduction
- A company needs to provide various web services:
  - Hosting intranet applications
  - Company web site
  - Various internet applications
- Therefore there is a need to provide an HTTP server.
- First we look at what the HTTP protocol is.
- Then we talk about Web servers, and Apache as the leading Web server application.

The World Wide Web (WWW)
- Global hypertext system
- Initially developed in 1989
  - By Tim Berners-Lee at the European Laboratory for Particle Physics (CERN) in Switzerland
  - To give a geographically dispersed group of scientists an easy way to share and edit research documents
- In 1993, started to grow rapidly
  - Mainly due to NCSA developing a Web browser called Mosaic (an X Window-based application)
  - First graphical interface to the Web; more convenient browsing
  - A flexible way for people to navigate worldwide resources on the Internet and retrieve them

Web Browsers
- Provide access to a Web server
- Basic components
  - HTML interpreter
  - HTTP client used to retrieve HTML pages
- Some also support
  - FTP, NNTP, POP, SMTP, ...

Web Servers
- Definitions
  - A computer responsible for accepting HTTP requests from clients and serving them Web pages
  - A computer program that provides the above-mentioned functionality
- Common features
  - Accepting HTTP requests from the network
  - Providing an HTTP response to the requester
    - Typically consists of an HTML document
  - Usually capable of logging
    - Client requests / server responses

Web Servers cont.
- Returned content
  - Static: comes from an existing file
  - Dynamic: generated on the fly by some other program or script called by the Web server
- Path translation
  - Translates the path component of a URL into a local file system resource
  - The path specified by the client is relative to the server's root directory

[Figure: Overall organization of the Web.]
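Path translation, as described under "Web Servers cont.", could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `translate_path` and the example root `/srv/www` are hypothetical, and a real server such as Apache applies many more rules (aliases, directory indexes, per-directory configuration).

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def translate_path(url_path: str, doc_root: str) -> Path:
    """Map the path component of a request URL to a file below doc_root."""
    # Keep only the path (drop any ?query or #fragment) and undo %-escapes.
    path = unquote(urlsplit(url_path).path)
    root = Path(doc_root).resolve()
    # The client path is relative to the server root, so strip leading slashes
    # before joining; otherwise "/x" would replace the root entirely.
    candidate = (root / path.lstrip("/")).resolve()
    # Refuse anything that escapes the root, e.g. via "..".
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{url_path!r} is outside the document root")
    return candidate
```

For example, `translate_path("/docs/page.html", "/srv/www")` yields a path under `/srv/www/docs/`, while a request for `/../etc/passwd` is rejected instead of being served.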

Basic Client/Server Architecture in WWW
- The basic operation is to fetch documents
  - The client issues requests; the browser displays the document
  - The server is responsible for retrieving the document from its local file system
  - Client/server communication is based on the HTTP protocol

Dynamic Content
- Parts of documents may be specified via scripts/programs
- Client-side (executed on the client machine, e.g., within the browser)
  - Client-side script: script embedded in the HTML document
  - Applet: pre-compiled program passed to the client
- Server-side (executed on the server machine)
  - Server-side script: script embedded in the document
  - Servlet: pre-compiled program executed within the server's address space
  - CGI scripts

[Figure: The principle of using server-side CGI programs.]

Common Gateway Interface (CGI)
- Allows documents to be generated dynamically, on the fly
- Provides a standard way for a Web server to execute a program using user-provided data as input
- To the server, the CGI program appears as the program responsible for fetching the requested document

Architectural Overview
[Figure: Architectural details of a client and server in the Web.]

- Document fetch (and possibly server-side script): steps 2b-3b
- Execute CGI script (separate process): steps 2c-3c-4c
- Execute servlet program (runs within the server): steps 2a-3a-4a

HTTP protocol
- Defines the communication between a Web server and a client
- Used to deliver virtually all files and other data (collectively called resources) on the World Wide Web
- A browser is an HTTP client because it sends requests to an HTTP server (Web server)
- The standard (and default) port for HTTP servers to listen on is 80, though they can use any port

Structure of HTTP transactions
- Request/response, text-based protocol
- Format of an HTTP message:

    <initial line, different for request vs. response>
    Header1: value1
    Header2: value2
    Header3: value3
    <blank line>
    <optional message body>

The Format of a Request

    method SP URL SP version CRLF            (request line)
    header ":" value CRLF                    (header lines)
    ...
    header ":" value CRLF
    CRLF
    Entity Body

Request Example (each line ends with CRLF)

    GET /index.html HTTP/1.1                 (method, request URL, version)
    Accept: image/gif, image/jpeg            (headers)
    User-Agent: Mozilla/4.0
    Host: www.ui.ac.ir:80
    Connection: Keep-Alive
    [blank line here]

The Format of a Response

    version SP status-code SP phrase CRLF    (status line)
    header ":" value CRLF                    (header lines)
    ...
    header ":" value CRLF
    CRLF
    Entity Body

Response Example

    HTTP/1.0 200 OK                          (version, status code, reason phrase)
    Date: Fri, 31 Dec 1999 23:59:59 GMT      (headers)
    Content-Type: text/html
    Content-Length: 1354

    Hello World                              (message body)
    (more file contents)
    . . .

Initial line
- A typical initial request line:
    GET /path/to/file/index.html HTTP/1.0
- Initial response line:
    HTTP/1.0 200 OK
    HTTP/1.0 404 Not Found
- Status code:
  - 1xx indicates an informational message only
  - 2xx indicates success of some kind
  - 3xx redirects the client to another URL
  - 4xx indicates an error on the client's part
  - 5xx indicates an error on the server's part
- Common status codes:
  - 200 OK
  - 404 Not Found
  - 301 Moved Permanently
  - 302 Moved Temporarily
  - 303 See Other (HTTP 1.1 only)
  - 500 Server Error
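The request format and the status code classes above can be illustrated with a small parser. This is a minimal sketch, not a complete HTTP implementation: the names `parse_request` and `status_class` are made up for this example, repeated headers simply overwrite each other, and no validation is done beyond splitting the text.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into (method, url, version, headers, body)."""
    # The first blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Request line: method SP URL SP version
    method, url, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Header line: name ":" value (only the first colon separates them).
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


def status_class(code: int) -> str:
    """Return the meaning of the first digit of a status code."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

Feeding it the request example from the slides gives the method `GET`, the URL `/index.html`, the version `HTTP/1.1`, and a header dictionary whose `host` entry is `www.ui.ac.ir:80`.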

Header lines
- Typical request headers:
  - From: email address of the requester
  - User-Agent: for example, User-Agent: Mozilla/3.0Gold
- Typical response headers:
  - Server: for example, Server: Apache/1.2b3-dev
  - Last-Modified: for example, Last-Modified: Sun, 19 Feb 2006 23:59:59 GMT

Message body
- In a response, this is where the requested resource is returned to the client (the most common use of the message body), or perhaps explanatory text if there's an error.
- In a request, this is where user-entered data or uploaded files are sent to the server.
- If an HTTP message includes a body, there are usually header lines in the message that describe the body. In particular:
  - The Content-Type: header gives the MIME type of the data in the body, such as text/html or image/gif.
  - The Content-Length: header gives the number of bytes in the body.

MIME Media types
- Multipurpose Internet Mail Extensions
- HTTP sends the media type of the file using the Content-Type: header
- Some important media types are
  - text/plain, text/html
  - image/gif, image/jpeg
  - audio/basic, audio/wav
  - model/vrml
  - video/mpeg, video/quicktime
  - application/*: application-specific data that does not fall under any other MIME category, e.g. application/octet-stream

Sample HTTP exchange
- To retrieve the file at the URL http://www.somehost.com/path/file.html
- Request:

    GET /path/file.html HTTP/1.0
    From: [email protected]
    User-Agent: HTTPTool/1.0
    [blank line here]

- Response:

    HTTP/1.0 200 OK
    Date: Fri, 31 Dec 1999 23:59:59 GMT
    Content-Type: text/html
    Content-Length: 1354

    Happy New Millennium!
    (more file contents)
    . . .

HTTP methods
- GET: requests a resource by URL.
- HEAD: just like a GET request, except it asks the server to return the response headers only, and not the actual resource (i.e. no message body). This is useful to check characteristics of a resource without actually downloading it, thus saving bandwidth.
- POST: used to send data to the server to be processed in some way, like by a CGI script.
  - There's a block of data sent with the request, in the message body. There are usually extra headers to describe this message body, like Content-Type: and Content-Length:.
  - The request URI is not a resource to retrieve; it's usually a program to handle the data you're sending.
  - The HTTP response is normally program output, not a static file.

HTTP 1.1
- It is a superset of HTTP 1.0. Improvements include:
  - Faster response, by allowing multiple transactions to take place over a single persistent connection.
  - Faster response and great bandwidth savings, by adding cache support.
  - Faster response for dynamically generated pages, by supporting chunked encoding, which allows a response to be sent before its total length is known.
  - Efficient use of IP addresses, by allowing multiple domains to be served from a single IP address.
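Chunked encoding, mentioned in the HTTP 1.1 improvements above, frames the body as a series of chunks, each preceded by its size in hexadecimal, and ends with a zero-size chunk. The sketch below illustrates that framing only; the function names are hypothetical, and optional trailer headers after the last chunk are ignored.

```python
def encode_chunked(parts) -> bytes:
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for part in parts:
        if part:  # skip empty parts: a zero-size chunk would end the body early
            out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk (size 0) followed by the closing CRLF
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the original body from a chunked-encoded byte string."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # The size line may carry ";extension" parameters, which are ignored here.
        size = int(data[pos:eol].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            return bytes(body)
        start = eol + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that closes this chunk
```

Because each chunk announces only its own size, the sender can start transmitting a dynamically generated page before it knows the total length, which is why no Content-Length header is needed in this mode.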

Manually Experimenting with HTTP

    > telnet eng.ui.ac.ir 80
    Trying 192.168.50.84
    Connected to eng.ui.ac.ir
    Escape character is ^].

Sending a Request

    > GET /~ladani/index.htm HTTP/1.0
    [blank line]

The Response

    HTTP/1.1 200 OK
    Date: Fri, 29 Feb 2008 08:23:33 GMT
    Server: Apache/2.0.52 (CentOS)
    Last-Modified: Wed, 07 Nov 2007 12:27:44 GMT
    ETag: "6ccb6-741c-43e55e05a5000"
    Accept-Ranges: bytes
    Content-Length: 29724
    Connection: close
    Content-Type: text/html; charset=WINDOWS-1256

    . . .
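The telnet experiment above can also be reproduced with a raw TCP socket. The following is a rough sketch rather than a robust client: `http_get_raw` and `split_response` are names invented for this example, and the code relies on the HTTP/1.0 behaviour of the server closing the connection once the response has been sent.

```python
import socket


def http_get_raw(host: str, port: int, path: str, timeout: float = 5.0) -> bytes:
    """Send a bare HTTP/1.0 GET, as typed into telnet, and return the raw reply."""
    # Request line plus the blank line that ends the (empty) header section.
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)


def split_response(raw: bytes):
    """Split a raw response into (status line, list of header lines, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    return status_line, header_lines, body
```

Calling `split_response(http_get_raw("eng.ui.ac.ir", 80, "/~ladani/index.htm"))` corresponds to the session shown in the slides, assuming that host is reachable from where the code runs.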

Request:  GET /~ladani/index.htm HTTP/1.0
Response: HTTP/1.1 200 OK, followed by the HTML code

Request:  GET /~ladani/no-such-page.htm HTTP/1.0
Response: HTTP/1.1 404 Not Found, followed by the HTML code

Request:  GET /index.html HTTP/1.1
Response: HTTP/1.1 400 Bad Request, followed by the HTML code
- Why is it a Bad Request?
  - It is an HTTP/1.1 request without a Host header

Session-persistent State
- What does session-persistent state mean?
  - State information that is preserved between browsing sessions.
  - Information that is stored semi-permanently (i.e., on disk) for later access.
- Why was the calculator example not session-persistent?
  - Sum, current display, etc. were not preserved if we went to a different website and back to the calculator.

Why session-persistence?
- User-based customizations
  - MyYahoo, E*Trade, etc.
- Long transactions
  - Electronic shopping carts
  - Order preparation
- Server-side state maintenance
  - Large amounts of state info that you don't want to pass back and forth

Cookie Overview
- HTTP cookies are a mechanism for creating and using session-persistent state.
- Cookies are simple string values that are associated with a set of URLs.
- Servers set cookies using an HTTP header.
- The client transmits the cookie as part of the HTTP request whenever an associated URL is visited in the future.

Anatomy of a cookie
- A cookie has 6 parts:
  - Name
  - Value
  - Domain
  - Path
  - Expiration
  - Security flag
- Name and Value are required; the others have default values.
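The six parts listed above map directly onto the fields of a Set-Cookie header. As a hedged illustration, the sketch below uses Python's standard `http.cookies` module to assemble such a header value; the helper name `build_set_cookie` and the sample values in the usage note are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, domain=None, path=None, expires=None, secure=False):
    """Return the value of a Set-Cookie header built from the six cookie parts."""
    cookie = SimpleCookie()
    cookie[name] = value          # Name and Value are the only required parts
    morsel = cookie[name]
    if domain:
        morsel["domain"] = domain
    if path:
        morsel["path"] = path
    if expires:
        morsel["expires"] = expires   # a preformatted date string
    if secure:
        morsel["secure"] = True       # the security flag carries no value
    return morsel.OutputString()
```

For instance, `build_set_cookie("session", "abc123", domain=".eng.ui.ac.ir", path="/~ladani", secure=True)` returns a semicolon-separated string that a server would send after `Set-Cookie:`; leaving out the optional arguments yields just `session=abc123`, and the client falls back to its defaults for the rest.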

Setting a cookie
- A cookie is set using the Set-Cookie header in an HTTP response.
- The string value of the Set-Cookie header is parsed into semicolon-separated fields that define the different parts of the cookie.
- The cookie is stored by the client.

Sending cookies
- Every time a client makes an HTTP request, it tests every cookie for a match.
- A cookie matches if
  - The cookie domain is a suffix of the URL's server name.
  - The cookie expiration has not passed.
  - The cookie path is a prefix of the URL path.
  - The cookie security flag is off, or the connection is secure.
- If a match is made, the name/value pair of the cookie is sent as a Cookie header in the request.

Setting a Cookie
- Full cookie:

    Set-Cookie: my_cookie = This is my cookie value; domain=.eng.ui.ac.ir; path=/~ladani; expires=Thu, 06-Mar-08 12:00:00 GMT

- A response can have more than one Set-Cookie header, or can combine more than one cookie in one header by separating them with ","

Cookie Matching
- Biggest misunderstanding:
  - Servers do not RETRIEVE cookies!
  - Servers RECEIVE cookies previously planted.
- Step 1:
  - Some response by the server installs a cookie with the Set-Cookie header.
  - The client saves the cookie to disk.
- Step 2:
  - The browser goes to some page that matches a previously received cookie.
  - The cookie name and value are sent in the request as a Cookie HTTP header.
- Step 3:
  - The CGI program detects the presence of the cookie and uses it.
  - Where is the cookie info? In the environment variable HTTP_COOKIE.
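The client-side matching test from the "Sending cookies" slide can be sketched as a small function. This is a simplified model under stated assumptions: the `Cookie` dataclass and the function names are hypothetical, a cookie without an expiration is treated as never expiring, a secure cookie is sent only over https, and real browsers apply stricter domain and path rules than plain suffix and prefix checks.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    expires: Optional[datetime] = None  # None: no expiration recorded
    secure: bool = False


def cookie_matches(cookie: Cookie, url: str, now: datetime) -> bool:
    """Apply the four matching tests to one cookie and one request URL."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = cookie.domain.lower().lstrip(".")
    # Domain must be a suffix of the server name, on a label boundary.
    domain_ok = host == domain or host.endswith("." + domain)
    # Cookie path must be a prefix of the URL path.
    path_ok = (parts.path or "/").startswith(cookie.path)
    # Expiration must not have passed.
    fresh = cookie.expires is None or now < cookie.expires
    # A cookie with the security flag is sent only over a secure connection.
    secure_ok = (not cookie.secure) or parts.scheme == "https"
    return domain_ok and path_ok and fresh and secure_ok


def cookie_header(cookies, url: str, now: datetime) -> Optional[str]:
    """Build the Cookie request header for url, or None if nothing matches."""
    pairs = [f"{c.name}={c.value}" for c in cookies if cookie_matches(c, url, now)]
    return "Cookie: " + "; ".join(pairs) if pairs else None
```

Note that the server never asks for a cookie here: the client decides on every request which stored cookies to attach, which is the point made under "Cookie Matching".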

Where are cookies stored on client?
- Client-specific locations; there is no standard.
- Latest IE stores them in a folder called Temporary Internet Files
  - Each cookie is stored in a separate file.
- Netscape stores them in cookies.txt

Typical Cookie Usages
- Cookies as database index
  - Most common use of cookies.
  - State information is kept in some sort of database and the cookie acts as an index.
- Cookies as state variables
  - The name of the cookie is like a variable name.
  - The value of the cookie is the state information.
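The "cookie as database index" usage can be illustrated from the point of view of a CGI-style program, which finds the cookie text in the HTTP_COOKIE environment variable. In this sketch the in-memory dictionary `SESSIONS`, the cookie name `session_id`, and the stored values are all hypothetical stand-ins for a real database.

```python
from http.cookies import SimpleCookie

# Hypothetical stand-in for "some sort of database", keyed by the cookie value.
SESSIONS = {"abc123": {"user": "alice", "cart": ["book"]}}


def load_session(environ, cookie_name="session_id", store=SESSIONS):
    """Look up server-side state using the cookie value as the index."""
    jar = SimpleCookie()
    # The web server passes the Cookie request header in HTTP_COOKIE.
    jar.load(environ.get("HTTP_COOKIE", ""))
    morsel = jar.get(cookie_name)
    if morsel is None:
        return None  # the client sent no such cookie
    return store.get(morsel.value)  # None if the index is unknown
```

A real CGI program would call `load_session(os.environ)`; passing a plain dictionary such as `{"HTTP_COOKIE": "session_id=abc123"}` works the same way for experimentation. Only the short index travels with each request, while the bulky state stays on the server.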

Cookie Security
- The security flag restricts when the browser will send a cookie back to the server.
  - Requires a secure connection.
  - For example: https in effect.
- What does this mean about when the cookie was set?

First Web Server
- Berners-Lee wrote two programs
  - A browser called WorldWideWeb
  - The world's first Web server, which ran on NeXTSTEP
- The machine is on exhibition at CERN's public museum

Most Famous Web Servers
- Apache HTTP Server from the Apache Software Foundation
- Internet Information Services (IIS) from Microsoft
- Google Web Server (GWS)
  - Started from May 2007
- Lighttpd
  - Powers several popular Web 2.0 sites like YouTube, Wikipedia and meebo

Web Servers Usage Statistics
- The most popular Web servers used for public Web sites are tracked by the Netcraft Web Server Survey
  - Details are given in the Netcraft Web Server Reports
- Apache has been the most popular since April 1996
- Currently (February 2008), about
  - 50.93% Apache
  - 35.56% Microsoft (IIS, PWS, etc.)
  - 5.16% Google
  - 0.99% Lighttpd

Web Servers Usage Statistics cont.
[Figure: Total Sites Across All Domains, August 1995 - February 2008]
[Figure: Market Share for Top Servers Across All Domains, August 1995 - February 2008]
[Figure: Totals for Active Servers Across All Domains, June 2000 - February 2008]

Apache (A PAtCHy) Web Server
- Origins: NCSA (Univ. of Illinois, Urbana/Champaign)
- Now: Apache Software Foundation (www.apache.org), developers world-wide
- Most widely used Web server today [Netcraft Web survey, 2/2008]
- Open source software
- Geographically distributed developers
- Needed a modular, extensible design in which third-party developers could override or extend basic characteristics

Web Server Processing Steps
1. Accept client connection
2. Read HTTP request header
3. Find file
4. Send HTTP response header
5. Read file
6. Send data
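The six processing steps can be traced in a very small static file server. This is a teaching sketch under several assumptions, not how Apache is implemented: the names `build_response` and `serve_once` are invented here, only GET is handled, one connection is served at a time, there is no directory index, and the response is assembled in memory before being sent.

```python
import mimetypes
import socket
from pathlib import Path


def build_response(request_head: bytes, doc_root: Path) -> bytes:
    """Steps 3 to 6 for one request: find the file, build the header, read the file."""
    try:
        request_line = request_head.split(b"\r\n", 1)[0].decode("ascii")
        method, target, _version = request_line.split(" ", 2)
    except (UnicodeDecodeError, ValueError):
        return b"HTTP/1.0 400 Bad Request\r\nContent-Length: 0\r\n\r\n"
    if method != "GET":
        return b"HTTP/1.0 501 Not Implemented\r\nContent-Length: 0\r\n\r\n"
    # Step 3: find the file (the URL path is relative to the document root).
    root = doc_root.resolve()
    path = (root / target.split("?", 1)[0].lstrip("/")).resolve()
    if root not in path.parents or not path.is_file():
        return b"HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    # Step 4: response header, with the MIME type guessed from the file name.
    ctype = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    size = path.stat().st_size
    header = f"HTTP/1.0 200 OK\r\nContent-Type: {ctype}\r\nContent-Length: {size}\r\n\r\n"
    # Steps 5 and 6: read the file; the caller sends header and data together.
    return header.encode("ascii") + path.read_bytes()


def serve_once(listener: socket.socket, doc_root: Path) -> None:
    """Handle a single connection on an already listening socket."""
    conn, _addr = listener.accept()           # Step 1: accept client connection
    with conn:
        head = b""
        while b"\r\n\r\n" not in head:        # Step 2: read HTTP request header
            data = conn.recv(4096)
            if not data:
                break
            head += data
        conn.sendall(build_response(head, doc_root))
```

Keeping the request handling in `build_response`, separate from the socket code, makes it possible to try the logic on byte strings without opening a network connection.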

Apache HTTP Server
- Apache core
  - Receives the client request
    - Typically allocates a new process for each incoming request
  - Allocates a request record
  - Invokes handlers on individual modules in sequence
    - Modules register handlers during configuration
- Handler
  - The request record is passed as a single parameter
  - Each handler reads/modifies the request record

Web Server Phases
- The Apache core invokes a handler for each phase:
  1. Resolve the document reference (URI) to a local file name (or CGI program + parameters)
  2. Client authentication (verify client identity)
  3. Client access control (determine access rights)
  4. Request access control (check if access is allowed)
  5. MIME type determination of the response
  6. General phase for handling leftovers (e.g., check syntax of the returned response, build up user profile)
  7. Transmission of the response to the client
  8. Logging data on the processing of the request

References
- http://www.jmarshall.com/easy/http/
- TCP/IP Tutorial and Technical Overview, Rodriguez, Gatrell, Karas, Peschke, IBM Redbooks, August 2001
- Wikipedia, the free encyclopedia
- Apache: The Definitive Guide, 2nd edition, Ben Laurie, Peter Laurie, O'Reilly, February 1999
- Webmaster in a Nutshell, 1st edition, Stephen Spainhour, Valerie Quercia, O'Reilly, October 1996
- Netcraft: February 2006 Web Server Survey