1 chapter 1 web components (introduction) web protocols and practice
TRANSCRIPT
1
Chapter 1
Web Components (Introduction)
Web Protocols and Practice
2
Topics
Web Protocols and Practice
INTRODUCTION
Web History
Web Definition
Semantic Components of the Web
Content on the Web
Software Components
Underlying Network
Standardization
Web Traffic and Performance
Web Applications
3
Web History
1945: Vannevar Bush proposed the Memex, a device to extend human memory by providing large-scale indexing of text.
1965: Hypertext: non-sequential writing that presents information as a collection of linked nodes.
1960-1970: The U.S. Department of Defense extended the use of its communication infrastructure (ARPANET) for connected computers. The deployment of TCP/IP in the early 1980s caused rapid growth in the size and scope of ARPANET.
1989: Tim Berners-Lee proposed using hypertext to access the information on the computers at CERN.
4
Web History
During 1980-1990 these systems were widely used on the Internet to access information:
FTP: For file transfer. It works only if the user knows the FTP server.
Gopher: Provided a way for users to search the servers in the network.
WAIS (Wide Area Information Servers): Allowed users to send queries to databases around the network.
Archie: Global index of FTP servers that allowed users to search by file name.
1992: The first official release of a Web browser.
1993: The first widely used graphical Web browser (Mosaic).
5
Web Definition
The World Wide Web, or simply the Web, is the universe of information accessible via networked computers.
The Internet is different from the Web. It is a network of computers, and a computer on it does not necessarily act as a Web client or Web server.
6
Semantic Components of the Web
The three main semantic components of the Web are:
A naming infrastructure (URI)
A document language (HTML)
A message exchange protocol (HTTP)
7
URI (Uniform Resource Identifier)
Accessing and manipulating resources distributed throughout the Web requires a way to identify them. A URI is a universal naming mechanism for identifying a resource on the Web, independent of its current location or value.
A URI can be thought of as a pointer to a black box to which request methods can be applied to generate different responses at different times. A request method is a simple operation such as fetching, changing, or deleting a resource.
For example, at a high level, a string such as http://www.foo.com/coolpic.gif is a URI. Later we will see how a URI differs from a URL.
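As a small illustration of how a URI string breaks down into parts, the following sketch uses Python's standard urllib.parse module on the example URI from this slide. The variable names are only illustrative.

```python
from urllib.parse import urlsplit

# The example URI from the slide (written without the stray space).
uri = "http://www.foo.com/coolpic.gif"
parts = urlsplit(uri)

print(parts.scheme)  # protocol used to reach the resource: http
print(parts.netloc)  # host that serves the resource: www.foo.com
print(parts.path)    # name of the resource on that host: /coolpic.gif
```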
8
HTML (Hypertext Markup Language)
HTML provides a standard representation for hypertext documents in ASCII format.
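To make the idea of a hypertext document concrete, here is a minimal sketch that extracts the links from a small HTML page with Python's standard html.parser module. The page content and the LinkCollector class are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# A hypothetical page: plain ASCII text with markup that links to other resources.
page = """<html><head><title>Example page</title></head>
<body><p>See the <a href="http://www.foo.com/coolpic.gif">picture</a>
and the <a href="/about.html">about page</a>.</p></body></html>"""

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

The links are what turn a flat text file into a node in a web of documents: each href is a URI that a browser can follow.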
9
HTTP (Hypertext Transfer Protocol)
HTTP is the most common way of transferring resources on the Web.
HTTP defines the format and meaning of messages exchanged between web components, such as clients and servers.
HTTP is simply a language with a specific syntax and with semantics associated with the use of its language elements.
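The syntax of HTTP messages can be sketched in a few lines. The helper names build_request and parse_response are hypothetical, and the sample response is hand-written rather than captured from a real server; this is a simplified view of the HTTP/1.1 message layout (start line, headers, blank line, body), not a full parser.

```python
def build_request(method, path, host):
    """Format a minimal HTTP/1.1 request: request line, one header, blank line."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a response into status code, reason phrase, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


request = build_request("GET", "/coolpic.gif", "www.foo.com")

# Hand-written sample response, used only to exercise the parser.
sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
status, reason, headers, body = parse_response(sample)
print(status, reason, headers["content-type"], body)
```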
10
HTTP (Hypertext Transfer Protocol)
HTTP is a request-response protocol
The client sends a request message and then the server replies with a response message.
HTTP is a stateless protocol
Clients and servers treat each message exchange independently and are not required to maintain any state across requests and responses.
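Statelessness can be illustrated with a handler that computes each response from the request alone. This is a toy sketch: the RESOURCES table and the handle function are assumptions made for the example, and no real networking is involved.

```python
# Hypothetical content held by the server.
RESOURCES = {"/index.html": "<h1>Home</h1>"}


def handle(request_line):
    """Answer one request using only the information carried in that request."""
    method, path, _version = request_line.split(" ")
    if method != "GET":
        return "HTTP/1.1 501 Not Implemented"
    if path not in RESOURCES:
        return "HTTP/1.1 404 Not Found"
    return "HTTP/1.1 200 OK\r\n\r\n" + RESOURCES[path]


# The server keeps no memory of earlier exchanges, so repeating a request
# produces the same response no matter what was requested before.
first = handle("GET /index.html HTTP/1.1")
missing = handle("GET /nothing.html HTTP/1.1")
second = handle("GET /index.html HTTP/1.1")
print(first == second)
```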
11
Table 1.1. Common Web terms
Term       Definition
WWW/Web    World Wide Web, the universe of information accessible via networked computers
Hypertext  Nonlinear writing or linking related documents for navigation
Internet   Worldwide collection of interconnected networks using the Internet Protocol (IP)
Web page   Document accessible on the Web via a URI
Web site   Collection of related Web pages
Browser    Application for requesting and displaying Web resources
12
Content on the Web
Each resource may be available in different formats, for example:
HTML
PostScript
A resource may be:
A static file on a machine
Generated dynamically at the time of the request
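The difference between static and dynamically generated resources can be sketched as follows. The paths, the in-memory STATIC_FILES table, and the current_time_page generator are all hypothetical stand-ins for files on disk and server-side programs.

```python
from datetime import datetime, timezone

# Static resources: the same bytes are returned on every request.
# The same document is offered here in two formats (HTML and PostScript).
STATIC_FILES = {
    "/paper.html": b"<p>Paper</p>",
    "/paper.ps": b"%!PS-Adobe-3.0\n",
}


def current_time_page():
    """Dynamic resource: the content is produced at the time of the request."""
    now = datetime.now(timezone.utc).isoformat()
    return f"<p>Generated at {now}</p>".encode("ascii")


DYNAMIC_RESOURCES = {"/time": current_time_page}


def get_resource(path):
    """Return the bytes for a path, or None if the resource does not exist."""
    if path in STATIC_FILES:
        return STATIC_FILES[path]
    if path in DYNAMIC_RESOURCES:
        return DYNAMIC_RESOURCES[path]()
    return None


print(get_resource("/paper.html"))
print(get_resource("/time"))
```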
13
Content on the Web
Each HTTP transfer consists of two messages:
The request message
» Sent by the client
The response message
» Sent by the server
14
Table 1.2. Terminology related to Web resources and HTTP messages
Term             Definition
Resource         Network data object or service identified by a URI
Message          Basic unit of communication in HTTP
Sender/receiver  Component responsible for sending/receiving a message
Header           Control portion of a message
Entity           Information transferred in the body of a message
15
Software Components
User agent
A user agent can be a Web browser that generates requests on behalf of a user and performs a variety of other tasks, such as displaying Web pages and storing the user's bookmarks.
Proxy
A proxy is an intermediary between clients and servers that performs a variety of functions:
» Filtering requests to undesirable Web sites
» Providing a degree of anonymity to clients
» Caching popular resources
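The proxy functions listed above can be sketched with a small class. Everything here is illustrative: the Proxy class, the fake_origin function that stands in for a real origin server, and the host names are assumptions, and no network traffic is involved.

```python
class Proxy:
    """Toy intermediary that filters requests and caches responses."""

    def __init__(self, fetch_from_origin, blocked_hosts):
        self.fetch_from_origin = fetch_from_origin
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}

    def get(self, host, path):
        # Filtering: refuse requests to undesirable sites.
        if host in self.blocked_hosts:
            return "403 Forbidden"
        # Caching: contact the origin only the first time a resource is requested.
        key = (host, path)
        if key not in self.cache:
            # Anonymity: the origin is called by the proxy, not by the client,
            # so it never learns which client asked.
            self.cache[key] = self.fetch_from_origin(host, path)
        return self.cache[key]


origin_calls = []


def fake_origin(host, path):
    """Stand-in for an origin server; records every request it receives."""
    origin_calls.append((host, path))
    return f"200 OK: content of {host}{path}"


proxy = Proxy(fake_origin, blocked_hosts={"bad.example"})
first = proxy.get("www.foo.com", "/coolpic.gif")
second = proxy.get("www.foo.com", "/coolpic.gif")  # served from the cache
blocked = proxy.get("bad.example", "/")            # never reaches the origin
print(len(origin_calls))
```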
16
Software Components
Server
The server may instruct the user agent to retain state across a series of requests and responses by storing a cookie. We will discuss cookies later.
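As a preview of how a cookie carries state, the sketch below uses Python's standard http.cookies module. The cookie name "session" and its value are made up for the example; the server-side and client-side halves are shown together only for brevity.

```python
from http.cookies import SimpleCookie

# Server side: ask the user agent to store a cookie by adding a
# Set-Cookie header to the response.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"
outgoing["session"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# User agent side: on later requests the stored cookie is sent back in a
# Cookie header, which the server parses to recover the state.
incoming = SimpleCookie()
incoming.load("session=abc123")
print(incoming["session"].value)
```

HTTP itself stays stateless: the state travels inside the messages instead of being remembered by the protocol.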
17
Table 1.3. Terminology related to the software components of the Web
Term           Definition
User agent     Client program that initiates a request (e.g., a browser)
Web client     Program that sends an HTTP request to a Web server
Web server     Program that receives an HTTP request from a Web client and transmits a response
Origin server  Server where the requested resource resides or is created
Intermediary   Web component in the path between the user agent and an origin server (e.g., a proxy, gateway)
Proxy          Intermediary program that functions as a server to a client and as a client to a server
Cookie         State information passed between the user agent and the origin server
18
Underlying Network
A Web client identifies the Web server by its hostname (e.g., www.att.com) rather than by an IP address. The Domain Name System (DNS) translates the hostname into an IP address.
The two applications exchange HTTP messages using the Transmission Control Protocol (TCP).
The client and the server first establish a TCP connection.
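The sequence of name resolution, TCP connection setup, and HTTP exchange can be sketched on a single machine. To keep the example self-contained it resolves the name "localhost" and talks to a throwaway server on the loopback interface; the serve_once function and the canned response are assumptions for illustration, and a real client would resolve a hostname such as www.att.com and connect to port 80.

```python
import socket
import threading


def serve_once(listener):
    """Accept one TCP connection, read the request, send a canned response."""
    conn, _addr = listener.accept()
    with conn:
        conn.recv(1024)
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")


# Throwaway server on the loopback interface; port 0 lets the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
thread = threading.Thread(target=serve_once, args=(listener,))
thread.start()

# Step 1: translate the hostname into an IP address.
family, socktype, proto, _canon, address = socket.getaddrinfo(
    "localhost", port, socket.AF_INET, socket.SOCK_STREAM
)[0]

# Step 2: establish a TCP connection, then exchange HTTP messages over it.
with socket.socket(family, socktype, proto) as client:
    client.connect(address)
    client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
    reply = b""
    while chunk := client.recv(1024):
        reply += chunk

thread.join()
listener.close()
print(reply.decode("ascii"))
```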
19
Table 1.4. Terminology related to the Internet and its protocols of the Web
Term        Definition
Host        Computer or machine connected to the network
Packet      Basic unit of communication in the Internet
IP          Internet Protocol, a protocol that coordinates the delivery of individual packets between hosts
IP address  32-bit numerical address identifying an Internet host
Hostname    Case-insensitive string identifying an Internet host
DNS         Domain Name System, a distributed infrastructure for translating between hostnames and IP addresses
TCP         Transmission Control Protocol, a protocol that provides the abstraction of a reliable, bidirectional connection
Connection  Logical communication channel between two hosts
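Two of the definitions in Table 1.4 can be checked directly: an IPv4 address is a 32-bit number usually written in dotted form, and hostnames compare without regard to case. The sketch uses Python's standard ipaddress module; the address 192.0.2.1 is taken from a range reserved for documentation, and same_host is a hypothetical helper.

```python
import ipaddress

# Dotted-decimal notation is a readable spelling of one 32-bit number.
addr = ipaddress.IPv4Address("192.0.2.1")
as_int = int(addr)
print(as_int)                         # 3221225985
print(as_int.bit_length() <= 32)      # True
print(ipaddress.IPv4Address(as_int))  # 192.0.2.1


def same_host(a, b):
    """Hostnames are case-insensitive, so compare them in lowercase."""
    return a.lower() == b.lower()


print(same_host("WWW.ATT.COM", "www.att.com"))  # True
```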
20
Standardization
A protocol standard is needed for interoperation of the components.
The Internet Engineering Task Force (IETF) is an open community that deals with Internet standardization through a series of official publications called Requests for Comments (RFCs).
Not all Internet Drafts become RFCs. RFCs are divided into different tracks: standards, historic, informational, and experimental.
21
Standardization
Standards documents state compliance requirements at the following levels:
Any compliant implementation has to meet all the MUST-level requirements.
An implementation that meets all the MUST-level requirements but not all the SHOULD-level requirements is conditionally compliant; one that also meets all the SHOULD-level requirements is unconditionally compliant.
The MAY-level requirements are optional for an implementation to meet.
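The requirement levels can be turned into a small classification rule. The sketch follows the terminology of RFC 2616, section 1.2: failing any MUST-level requirement means not compliant, meeting all MUST-level but not all SHOULD-level requirements means conditionally compliant, and meeting both means unconditionally compliant. The function name and the input format are assumptions made for the example.

```python
def compliance_level(results):
    """Classify an implementation from (level, met) pairs.

    level is "MUST", "SHOULD", or "MAY"; met says whether the
    implementation satisfies that requirement.
    """
    must_ok = all(met for level, met in results if level == "MUST")
    should_ok = all(met for level, met in results if level == "SHOULD")
    if not must_ok:
        return "not compliant"
    if should_ok:
        return "unconditionally compliant"
    return "conditionally compliant"


# MAY-level requirements never affect the outcome.
print(compliance_level([("MUST", True), ("SHOULD", True), ("MAY", False)]))
print(compliance_level([("MUST", True), ("SHOULD", False)]))
print(compliance_level([("MUST", False), ("SHOULD", True)]))
```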
22
Standardization
A standards document proceeds through three stages:
Proposed Standard
Draft Standard
Internet Standard
Some RFCs reflect the Best Current Practices (BCP)
Standards do not last forever; they can be retired and replaced by a superior specification.
23
Standardization
The World Wide Web Consortium (W3C) was founded in 1994 to encourage the growth of the Web.
The W3C works on
» The representation of Web content, such as the HTML language, rather than the networking aspects
» Architectural issues
» User-interface issues
   Formats
   Languages
» Social issues
» Legal and public policy matters
» Accessibility issues, to ensure that people with disabilities are able to have access to the technology
24
Table 1.5. Terminology related to Internet protocol standards
Term            Definition
IETF            Internet Engineering Task Force, an open community contributing to the evolution of the Internet
Working Group   IETF group chartered to work on a particular standards specification
Internet Draft  Informal version of a standards document reflecting work in progress
RFC             Request for Comments, an official document related to Internet standards
25
Web Traffic and Performance
User expectations for quick responses have focused attention on performance issues.
High user-perceived latency can result from a variety of factors, such as:
DNS overhead
Network congestion
Load on the server
Analysis of logs is useful for learning workload characteristics, such as the time between requests, the size of the requests, and resource popularity, all of which have important implications for Web performance.
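The workload characteristics named above can be computed from a log in a few lines. The log records below are invented for illustration, and the three-field layout (timestamp in seconds, resource, size in bytes) is an assumption; real server logs carry more fields.

```python
from collections import Counter

# Hypothetical log records: (timestamp in seconds, resource, size in bytes).
log = [
    (0.0, "/index.html", 2048),
    (1.5, "/coolpic.gif", 51200),
    (2.0, "/index.html", 2048),
    (6.0, "/about.html", 1024),
]

# Time between consecutive requests.
times = [t for t, _, _ in log]
interarrival = [later - earlier for earlier, later in zip(times, times[1:])]

# Average size per logged transfer.
mean_size = sum(size for _, _, size in log) / len(log)

# Resource popularity: how often each resource was requested.
popularity = Counter(resource for _, resource, _ in log)

print(interarrival)               # [1.5, 0.5, 4.0]
print(mean_size)                  # 14080.0
print(popularity.most_common(1))  # [('/index.html', 2)]
```

Popularity matters for caching decisions, while interarrival times and sizes indicate the load a server or network link has to sustain.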
26
Table 1.6. Terminology related to Web traffic and performance
Term                    Definition
Latency                 Time between the initiation of an action and the first indication of a response
User-perceived latency  Time between a user action and the initial display of the content
Bandwidth               Amount of traffic that can be carried per unit time
Workload                Inputs received by a Web component over time
Log                     Record of transactions performed by a Web component
27
Web Applications
Important applications are:
Web caching
» Caching moves content closer to the user.
» A cache can be located at
   A user's browser
   An origin server
   A machine in the path between the user and the origin server
Multimedia streaming
» The client plays the samples and frames as they arrive from the server, rather than downloading the content in its entirety before beginning playout.
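The difference between streaming and download-then-play can be shown by recording the order of events. This is a simulation only: server_frames is a hypothetical generator standing in for data arriving over the network, and the frame names are made up.

```python
def server_frames(frames, events):
    """Deliver frames one at a time, recording when each one arrives."""
    for frame in frames:
        events.append(f"recv {frame}")
        yield frame


def streaming_playout(frames):
    """Play each frame as soon as it arrives."""
    events = []
    for frame in server_frames(frames, events):
        events.append(f"play {frame}")
    return events


def download_then_play(frames):
    """Receive the whole content first, then begin playout."""
    events = []
    received = list(server_frames(frames, events))
    for frame in received:
        events.append(f"play {frame}")
    return events


print(streaming_playout(["f1", "f2"]))   # ['recv f1', 'play f1', 'recv f2', 'play f2']
print(download_then_play(["f1", "f2"]))  # ['recv f1', 'recv f2', 'play f1', 'play f2']
```

With streaming, playback of the first frame starts before the second frame has arrived, which is the overlap of transmission and playback that reduces the wait before playout begins.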
28
Table 1.7. Terminology related to Web caching and multimedia streaming
Term                  Definition
Cache                 Store of messages used to reduce user-perceived latency and load on the network and server
Cache coherency       Mechanism to lower the possibility of returning out-of-date messages from the cache
Replication           Duplication of resources on multiple origin servers
Content distribution  Delivery of resources on behalf of an origin server
Audio/video stream    Sequence of audio samples or video frames
Streaming             Overlap of the server transmission and client playback of audio/video data
Media player          Helper application for playing multimedia streams