Upload: dorothy-byrd

Post on 28-Dec-2015

CSI 5389 (E-Commerce Technologies) 1

XML and Web Services

Outline

Why XML?

An Introduction to XML

Web Services

Why XML?

What’s Wrong with HTML?

HTML (Hypertext Markup Language) was developed by Tim Berners-Lee in the early 1990s as a simple application of SGML (Standard Generalized Markup Language).

It is a simple language, well suited for hypertext, multimedia, and the display of small and reasonably simple documents.

SGML is a standard language for defining and using document formats (ISO 8879).

SGML is too complicated to understand and to use (accessible only to experts).

Although HTML is workable for simple documents, it mixes up the structure of a document and the display of that document.

What’s Wrong with HTML? (cont.)

HTML has been extended in disorganized and incompatible ways by Netscape and Microsoft.

To compete with each other, these two companies have added their own HTML tags and implemented different interpretations of the same tags.

Many Web sites today contain tagging that is written for a specific browser.

These Web pages will work properly only with their intended specific browser (and therefore not work properly with other browsers).

What’s Wrong with HTML? (cont.)

In addition, HTML has other limitations:

Extensibility: HTML does not allow users to specify their own tags or attributes in order to parameterize or semantically qualify their data.

Structure: HTML does not support the specification of deep structures needed to represent database schemas or object-oriented hierarchies.

Validation: HTML does not support the kind of language specification that allows consuming applications to check data for structural validity on importation.

The XML Effort

XML (Extensible Markup Language) was developed starting in 1996 by a working group of the W3C (World Wide Web Consortium).

XML is a standardized language to represent structured data as text files.

XML advantages:

XML provides strong separation of the structure of a document and the display of that document.

Information providers can define new tags and attributes at will.

Document structures can be nested to any level of complexity.

Any XML document can contain an optional description of its grammar for use by applications that need to perform structural validation.

The Main Point

By defining our own markup language, we can encode the information in our documents much more precisely than is possible with HTML.

This means that programs processing these documents can “understand” them much better and can therefore process the information in ways that are impossible with HTML.

Example: Imagine that we mark up recipes (say, for seafood dishes) according to some definition in which we enter the amount of each ingredient needed for making each dish.

We can write a program that, given a list of the contents of our fridge, goes through the list of recipes and makes a list of the dishes we could make with the available ingredients.

Given nutritional information about the ingredients, the program could sort the dishes by the amount of calories in each dish.

Given the price information for the ingredients, the program could sort the dishes by the price of each dish, and so on.

The possibilities are almost endless, because the information is encoded in a way that the computer can “understand”.
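
The recipe idea can be sketched in a few lines of Python. This is only an illustration: the tag and attribute names (recipes, recipe, ingredient, name, amount) and the sample dishes are assumptions made up for this sketch, not part of any standard recipe format.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe markup; the vocabulary is invented for illustration.
RECIPES_XML = """
<recipes>
  <recipe name="Grilled Salmon">
    <ingredient name="salmon" amount="200"/>
    <ingredient name="lemon" amount="1"/>
  </recipe>
  <recipe name="Shrimp Pasta">
    <ingredient name="shrimp" amount="150"/>
    <ingredient name="pasta" amount="100"/>
  </recipe>
</recipes>
"""

def dishes_we_can_make(recipes_xml, fridge):
    """Return the names of recipes whose ingredients are all in the fridge."""
    root = ET.fromstring(recipes_xml)
    possible = []
    for recipe in root.findall("recipe"):
        needed = {i.get("name"): float(i.get("amount"))
                  for i in recipe.findall("ingredient")}
        # A dish is possible if the fridge holds enough of every ingredient.
        if all(fridge.get(name, 0) >= amount for name, amount in needed.items()):
            possible.append(recipe.get("name"))
    return possible

fridge = {"salmon": 250, "lemon": 2, "pasta": 500}
print(dishes_we_can_make(RECIPES_XML, fridge))
```

Because the amounts are marked up explicitly, the program never has to guess which words in the text are ingredients or quantities.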

Web Applications of XML

The applications that need XML are those that cannot be accomplished within the limitations of HTML.

These applications can be divided into four categories:

1. Applications that require the Web client to mediate between two or more heterogeneous databases.

2. Applications that attempt to distribute a significant proportion of the processing load from the Web server to the Web client.

3. Applications that require the Web client to present different views of the same data to different users.

4. Applications in which intelligent Web agents attempt to tailor information discovery to the needs of individual users.

Web Applications of XML: An Example

Let’s consider a typical example of the first category of XML applications: the information tracking system for a home health care agency.

A patient entering a home health care agency is represented to the information system by a large collection of paper-based materials documenting the patient’s medical history.

The major task in accepting the patient into the system is the manual entry of these materials into the agency’s database.

Web Applications of XML: An Example (cont.)

First solution (commonly used in practice):

1. Log into the hospital’s Web site.

2. Become an authorized user.

3. Access the patient’s medical records using a Web browser.

4. Print out the records from the Web browser.

5. Manually key in the data from the printouts.

Second solution (slightly better): Instead of printing out the patient’s medical records, the operator reads the records from the Web browser and directly keys the data into the agency’s online forms in a separate window.

This solution saves the paper that would have been needed for the printouts, but does nothing to address the root of the problem.

Web Applications of XML: An Example (cont.)

Desired solution:

1. Log into the hospital’s Web site.

2. Become an authorized user.

3. Access the patient’s medical records in a Web-based interface that represents the patient’s records as a folder icon.

4. Drag the folder from the Web application over to the internal database application.

5. Drop the folder into the database.

This solution is not possible within the limitations of HTML, for three reasons:

1. The HTML tag set is too limited to represent or identify multiple database fields in the mixture of the medical documents.

2. HTML is incapable of representing the variety of structures in those documents.

3. HTML does not have any mechanism to check data for structural validity before the application attempts to import the data into the target database.

Web Applications of XML: An Example (cont.)

One technically feasible solution is to require all hospitals and health care agencies to use a single standard system dictated by the government.

However, in an environment where many health care agencies and hospitals are in financial difficulty, it is hardly practical to require them to replace their existing heterogeneous systems with a single new system.

The other way to enable interchange between heterogeneous systems is to adopt a single industry-wide interchange format that serves as the single output format for all exporting systems, and as the single input format for all importing systems.

In other words, we need a standard language to export and import data: XML.
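
A minimal sketch of such an interchange, assuming a made-up PatientRecord vocabulary: one function plays the exporting hospital system, the other the importing agency system with a different internal layout. The element names, the field names, and the sample record are all hypothetical.

```python
import xml.etree.ElementTree as ET

def export_record(record):
    """Exporting side: serialize an internal record (a dict here) as XML text."""
    root = ET.Element("PatientRecord")
    for field, value in record.items():
        ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

def import_record(xml_text, required=("Name", "DateOfBirth")):
    """Importing side: check the structure, then map it to the local schema."""
    root = ET.fromstring(xml_text)
    if root.tag != "PatientRecord":
        raise ValueError("unexpected document type: " + root.tag)
    missing = [field for field in required if root.find(field) is None]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # The importer uses its own column names; only the XML format is shared.
    return {"full_name": root.findtext("Name"), "dob": root.findtext("DateOfBirth")}

# Fictional sample data, for illustration only.
hospital_row = {"Name": "Jane Doe", "DateOfBirth": "1950-04-12", "Ward": "3B"}
agency_row = import_record(export_record(hospital_row))
print(agency_row)
```

Neither system needs to know how the other stores its data; both only need to agree on the interchange format.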

An Introduction to XML

XML: A Simple Example

<?xml version="1.0"?>
<Address>
  <Name> Larry Stewart </Name>
  <Street> 11 Serissa Circle </Street>
  <City> Wayland </City>
  <State> MA </State>
  <Zip> 01778 </Zip>
</Address>

The above XML fragment contains an address in the U.S.

We are free to define new tags such as <Name>, <Street>, etc. to identify parts of the address.

This arrangement makes XML very easy for disparate software tools to create and use.
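
As a small illustration of how easily a tool can consume the address fragment, the following sketch reads it with Python's standard xml.etree.ElementTree module. The choice of Python and the variable names are assumptions of this sketch; any XML parser would do.

```python
import xml.etree.ElementTree as ET

ADDRESS_XML = """<?xml version="1.0"?>
<Address>
  <Name> Larry Stewart </Name>
  <Street> 11 Serissa Circle </Street>
  <City> Wayland </City>
  <State> MA </State>
  <Zip> 01778 </Zip>
</Address>"""

address = ET.fromstring(ADDRESS_XML)

# Each child element becomes one named field; surrounding spaces are trimmed.
fields = {child.tag: child.text.strip() for child in address}
print(fields["City"], fields["Zip"])
```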

Well Formed and Valid XML Documents

An XML document is said to be well formed if it has correct syntax, and is said to be valid if it specifies a document type definition (DTD) and complies with the constraints expressed in that DTD.

Any XML parser can process a well-formed document; a validating parser additionally checks the document against its DTD.

A DTD is a schema for a class of XML documents, appropriate for a given domain.

A DTD acts as a rule book that allows authors to create new documents with the same characteristics as the base document.

XML provides strong separation of the structure of a document and the display of that document.

The structure is encoded in XML, while the display is managed by the Extensible Stylesheet Language (XSL).
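
The well-formedness half of this distinction can be checked with Python's standard library, as sketched below. Note the limitation: xml.etree.ElementTree is a non-validating parser, so this sketch does not check a document against a DTD; validation would need a validating parser. The helper name is_well_formed is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<Address><Name>Larry Stewart</Name></Address>"
bad = "<Address><Name>Larry Stewart</Address>"  # <Name> is never closed

print(is_well_formed(good), is_well_formed(bad))
```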

XML Entities

Elements

Attributes

XML Elements

XML elements are similar to records in a programming language.

An element declaration has the following form:

<!ELEMENT ElementName (ElementContents)>

This declaration defines the relationships among the elements, the order of occurrences of the elements, and their number of occurrences.

XML Elements (cont.)

If an element X consists of elements A, B, and C in that order, then this is declared as follows:

<!ELEMENT X (A, B, C)>

XML DTDs, unlike SGML, have no "&" connector for "A, B, and C in any order"; the permitted orders must be listed explicitly as alternatives.

If only one of A, B, or C is used, then the declaration is:

<!ELEMENT X (A | B | C)>

If element X consists of zero or more As followed by one or more Bs, then the declaration is:

<!ELEMENT X (A*, B+)>

A question mark after an element means that the element can be omitted:

<!ELEMENT X (A, B?, C?)>

Note that elements can be nested.
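
One way to read these content models is as regular expressions over the sequence of child element names. The sketch below makes that reading concrete in Python; the translation of each model into a regular expression is done by hand and is an assumption of this sketch, not something an XML parser exposes.

```python
import re

# Hand-translated content models. Each child name is followed by a comma
# so that names cannot run together when the sequence is flattened.
CONTENT_MODELS = {
    "(A, B, C)": r"A,B,C,",
    "(A | B | C)": r"(A,|B,|C,)",
    "(A*, B+)": r"(A,)*(B,)+",
    "(A, B?, C?)": r"A,(B,)?(C,)?",
}

def children_match(model, child_names):
    """Check whether a sequence of child element names fits a content model."""
    sequence = "".join(name + "," for name in child_names)
    return re.fullmatch(CONTENT_MODELS[model], sequence) is not None

print(children_match("(A*, B+)", ["A", "A", "B"]))     # fits: As, then Bs
print(children_match("(A*, B+)", ["A"]))               # no B present
print(children_match("(A, B?, C?)", ["A", "C"]))       # B skipped
print(children_match("(A, B?, C?)", ["A", "C", "B"]))  # wrong order
```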

XML Element Types

#PCDATA: Parsed character data. The element contains text, which the XML parser scans for markup such as entity references.

ANY: The element can contain character data and any declared elements, in any order. The content is still parsed.

EMPTY: The element has no content.

XML Attributes

Attribute declarations describe information about an element.

More than one attribute can be defined for one element.

Attributes are contained within the start tag of an element.

They are defined as follows:

<!ATTLIST ElementName
    AttributeName1 DeclaredValue1 DefaultValue1
    AttributeName2 DeclaredValue2 DefaultValue2
    ...
    AttributeNameN DeclaredValueN DefaultValueN >

Declared value is either a list of permissible values or one of the predefined data types.

Default value specifies whether the attribute must be present and, if it may be omitted, which value applies by default.

XML Attributes: Declared Value Types

CDATA: Character data. Any characters can be used except the quotation mark that delimits the attribute value (" or '); the characters < and & must be written as entity references.

NMTOKEN: A name token, that is, any combination of letters, digits, and a few special characters (such as ".", "-", "_", and ":"). Unlike an XML name, it may start with a digit. No spaces are allowed.

NMTOKENS: One or more NMTOKEN values separated by spaces.

XML Attributes: Declared Value Types (cont.)

ID: Identifier. The value must be an XML name and must be unique among all ID values in the document.

IDREF: The value of this attribute matches the value of some ID attribute of an element in the same XML document. It is used to point to that element.

IDREFS: One or more IDREF values separated by spaces.

XML Attributes: Default Value Types

#REQUIRED: Some value must be specified for this attribute.

#IMPLIED: The attribute is optional and has no default. When it is not specified, the parser reports no value, and the application decides what to do.

'value': The value specified is the default, used when the attribute is omitted. Other permissible values may also be used.

#FIXED 'value': The attribute always has the value specified. If it is written in the document, it must have exactly this value.
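
The four kinds of default can be mimicked in a few lines of Python to make their differences concrete. This is a hand-written sketch of the rules, not how a real validating parser is implemented; the function name apply_attribute_rule and the LANG attribute are hypothetical.

```python
import xml.etree.ElementTree as ET

REQUIRED, IMPLIED = "#REQUIRED", "#IMPLIED"

def apply_attribute_rule(element, name, default, fixed=False):
    """Apply one ATTLIST-style default rule and return the resulting value."""
    value = element.get(name)
    if default == REQUIRED:
        if value is None:
            raise ValueError(f"{name} is required on <{element.tag}>")
    elif default == IMPLIED:
        pass  # optional, no default: a missing attribute stays missing
    elif fixed:
        if value is not None and value != default:
            raise ValueError(f"{name} must be {default!r}")
        element.set(name, default)
    elif value is None:
        element.set(name, default)  # ordinary default fills the gap
    return element.get(name)

question = ET.fromstring("<Q/>")
print(apply_attribute_rule(question, "NO", IMPLIED))  # no value reported
print(apply_attribute_rule(question, "LANG", "en"))   # default supplied
```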

XML Example: FAQ Document

<?xml version="1.0"?>
<!DOCTYPE FAQ SYSTEM "http://www.server.com/DTDs/faq.dtd">
<FAQ>
  <INFO>
    <SUBJECT> XML </SUBJECT>
    <AUTHOR> Lars Marius Garshol </AUTHOR>
    <EMAIL> [email protected] </EMAIL>
    <VERSION> 1.0 </VERSION>
    <DATE> June 20 2005 </DATE>
  </INFO>

  <PART NO="1">
    <Q NO="1">
      <QTEXT> What is XML? </QTEXT>
      <A> Simplified SGML. </A>
    </Q>
    <Q NO="2">
      <QTEXT> What can I use it for? </QTEXT>
      <A> Anything. </A>
    </Q>
  </PART>
</FAQ>

Slide callouts: the <!DOCTYPE> line accesses the DTD; <FAQ>, <INFO>, <Q>, and so on are elements, written as tags; NO="1" is an attribute.

XML Abstract Syntax Tree

FAQ
  INFO
    SUBJECT
    AUTHOR
    EMAIL
    VERSION
    DATE
  PART
    Q
      QTEXT
      A
    Q
      QTEXT
      A
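
The tree can also be produced mechanically. The sketch below parses a trimmed copy of the FAQ document and prints one line per element, indented by depth. It assumes Python's standard xml.etree.ElementTree module; the DOCTYPE line and the optional EMAIL element are left out of the sample to keep it short.

```python
import xml.etree.ElementTree as ET

FAQ_XML = """<FAQ>
  <INFO>
    <SUBJECT> XML </SUBJECT>
    <AUTHOR> Lars Marius Garshol </AUTHOR>
    <VERSION> 1.0 </VERSION>
    <DATE> June 20 2005 </DATE>
  </INFO>
  <PART NO="1">
    <Q NO="1"><QTEXT> What is XML? </QTEXT><A> Simplified SGML. </A></Q>
    <Q NO="2"><QTEXT> What can I use it for? </QTEXT><A> Anything. </A></Q>
  </PART>
</FAQ>"""

def outline(element, depth=0):
    """Return one line per element, indented two spaces per nesting level."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(ET.fromstring(FAQ_XML))))
```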

DTD for the FAQ System (faq.dtd)

<!ELEMENT FAQ (INFO, PART+)>

<!ELEMENT INFO (SUBJECT, AUTHOR, EMAIL?, VERSION?, DATE?)>
<!ELEMENT SUBJECT (#PCDATA)>
<!ELEMENT AUTHOR (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
<!ELEMENT VERSION (#PCDATA)>
<!ELEMENT DATE (#PCDATA)>

<!ELEMENT PART (Q+)>
<!ELEMENT Q (QTEXT, A)>
<!ELEMENT QTEXT (#PCDATA)>
<!ELEMENT A (#PCDATA)>

<!ATTLIST PART
    NO CDATA #IMPLIED
    TITLE CDATA #IMPLIED>
<!ATTLIST Q NO CDATA #IMPLIED>

Linking in XML

XML links can connect two or more resources, which can be either files (not necessarily XML or HTML files) or elements in files.

A link is an element with attributes:

<!ELEMENT simplink ANY>
<!ATTLIST simplink
    ACTUATE (AUTO|USER) "USER"
    SHOW (REPLACE|EMBED|NEW) "REPLACE"
    … >

The ACTUATE attribute specifies when a link is followed: either when the user explicitly makes a request, for instance by clicking (if the value is USER), or automatically when the system reads the linking element (if the value is AUTO).

Linking in XML (cont.)

The SHOW attribute specifies what happens when a link is followed. It can take the following values:

EMBED: The resource the link points to is inserted into the document.

REPLACE: The resource the link points to replaces the linking element. (Hence, if you have two different versions of a paragraph, you can link them in such a way that one can see the other version in the same context by following the link.)

NEW: The resource the link points to is processed or displayed in a new context (e.g., a new page). Ordinary HTML links, where the new page is displayed in place of the previous one, behave like REPLACE applied to the whole page rather than like NEW.

XML Processing

SAX (Simple API for XML): SAX is an event-driven API, providing functions to be called whenever specific XML constructs are encountered during parsing. It is used to transform or output data as the XML document is parsed.

DOM (Document Object Model): DOM is also an API, focused on the data structure. It provides functions that the client uses to traverse the structure of an XML document, and functions for creating and altering the in-memory structure of a new document.

XPath (XML Path Language): XPath provides query syntax for addressing parts of an XML document (i.e., addressing nodes in the abstract syntax tree).

XSLT (Extensible Stylesheet Language Transformations): XSLT provides rules to transform an XML document into other XML formats or into other formats (such as HTML).
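
The first three styles can be compared side by side using Python's standard library, as sketched below on a trimmed copy of the FAQ document. Two caveats: xml.etree.ElementTree supports only a limited subset of XPath, and the standard library has no XSLT processor, so XSLT is not shown. The class and variable names are assumptions of this sketch.

```python
import xml.sax
import xml.dom.minidom
import xml.etree.ElementTree as ET

FAQ_XML = """<FAQ>
  <PART NO="1">
    <Q NO="1"><QTEXT>What is XML?</QTEXT><A>Simplified SGML.</A></Q>
    <Q NO="2"><QTEXT>What can I use it for?</QTEXT><A>Anything.</A></Q>
  </PART>
</FAQ>"""

# SAX: the parser calls back into our handler as each start tag is seen.
class QuestionCounter(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "Q":
            self.count += 1

counter = QuestionCounter()
xml.sax.parseString(FAQ_XML.encode("utf-8"), counter)

# DOM: the whole document is loaded into memory, then traversed.
dom = xml.dom.minidom.parseString(FAQ_XML)
first_answer = dom.getElementsByTagName("A")[0].firstChild.data

# XPath (subset): a path expression addresses a node in the tree.
root = ET.fromstring(FAQ_XML)
second_question = root.find("./PART/Q[@NO='2']/QTEXT").text

print(counter.count, first_answer, second_question)
```

SAX never holds the whole document in memory, which suits large inputs; DOM and path queries trade memory for convenient random access.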

XML on the Web

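[Figure: a browser client and servers, backed by a DB, communicating over HTTP. The labels pair each task with an XML technology: HTML with XSLT, GUI with DOM, and parse-and-process with SAX.]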

Web Services

A Simple Example

Web services are simply applications made accessible over the Web.

Consider a shipping rate calculator provided by a logistics company.

Turning this calculator into a Web service requires the following steps:

1. Encapsulate the logic of the calculator (but not the user interface) into a subroutine.

2. Define the API for the calculator using the Web Services Description Language (WSDL).

3. Host the subroutine on a Web server supporting the Simple Object Access Protocol (SOAP).

4. Publish the calculator definition to an appropriate UDDI (Universal Description, Discovery, and Integration) directory.

A Simple Example (cont.)

Now, a programmer who wants to use the rate calculator from an e-commerce system can do the following:

1. Look up the service in the UDDI directory.

2. Use SOAP to make a remote call from the client application to the rate calculator.

3. Use the results of the call in the application.

Web services make it easy for service providers to make business logic available for remote use.

A Simple Example (cont.)

[Figure: over the Internet, the Web services host publishes its service to the UDDI registry; the Web services client looks up the service there, then sends a SOAP call to the host and receives a SOAP response.]

The Vision of Web Services

Web services provide a straightforward and interoperable means for programs to communicate with each other over the Web.

Web services also provide directories so that providers can advertise and users can search for services.

It is possible to develop a market for heavyweight remote services, such as payment systems, logistics, and business messaging.

Remote Procedure Calls

Web services are built on the concept of remote procedure calls (RPC).

In an RPC, the calling program, rather than invoking a local subroutine, invokes a client stub, which has the same API as the desired subroutine.

The client stub communicates with a remote server, where a server stub makes the actual call to the actual subroutine.

In addition, the calling program must bind its interface to the appropriate server by using a network directory service.

The service directory is implemented using UDDI, and the API is defined using WSDL, which is an XML-based language.

Actual parameters and return values are encoded in text form in XML.

Web services are built on standard Web servers and HTTP.

Taken together, these decisions make use of the existing Internet infrastructure for communications between programs.
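
The stub arrangement can be sketched in Python without any network at all. In this sketch the "transport" is a direct function call standing in for HTTP, the call and result element names are invented, and the rate formula is a made-up placeholder; none of this is real SOAP.

```python
import xml.etree.ElementTree as ET

def shipping_rate(weight_kg, distance_km):
    """The actual subroutine on the server (hypothetical rate formula)."""
    return round(5.0 + 0.5 * weight_kg + 0.01 * distance_km, 2)

def server_stub(request_xml):
    """Decode the XML request, call the real subroutine, encode the result."""
    call = ET.fromstring(request_xml)
    args = [float(param.text) for param in call.findall("param")]
    result = ET.Element("result")
    result.text = repr(shipping_rate(*args))
    return ET.tostring(result, encoding="unicode")

def client_stub(weight_kg, distance_km, transport=server_stub):
    """Same API as shipping_rate, but the work happens on the other side."""
    call = ET.Element("call", name="shipping_rate")
    for value in (weight_kg, distance_km):
        ET.SubElement(call, "param").text = repr(float(value))
    # In a real system this text would travel over HTTP to a remote server.
    response_xml = transport(ET.tostring(call, encoding="unicode"))
    return float(ET.fromstring(response_xml).text)

print(client_stub(10, 300))
```

The calling program sees only client_stub, which looks like a local function; parameters and the return value cross the boundary as XML text.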

SOAP

The Simple Object Access Protocol (SOAP) is the specification of how RPCs are implemented over the Web.

There are three aspects to SOAP:

1. The SOAP calling conventions explain how to represent calls to remote procedures and their responses.

2. The SOAP encoding rules explain how to represent application data, namely the arguments and return values from the remote procedure calls.

3. The SOAP envelope defines the contents of a SOAP message and the rules for processing it.

SOAP is almost always used with HTTP as the transport protocol, but it can also be used with other communications systems.
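
To make the envelope idea concrete, the sketch below builds and reads back a minimal SOAP 1.1 style message with Python's standard library. The envelope namespace is the SOAP 1.1 one; the service namespace urn:example:shipping, the GetRate method, and its parameter names are hypothetical. Nothing is sent over the network here; in practice the message would typically be the body of an HTTP POST.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SERVICE_NS = "urn:example:shipping"  # hypothetical service namespace

def build_call(method, **params):
    """Wrap a remote procedure call in an Envelope/Body structure."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{method}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def read_call(message):
    """Recover the method name and arguments from the message body."""
    body = ET.fromstring(message).find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    method = call.tag.split("}", 1)[1]  # drop the namespace part of the tag
    return method, {param.tag: param.text for param in call}

message = build_call("GetRate", weightKg=10, distanceKm=300)
print(read_call(message))
```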

WSDL

The Web Services Description Language (WSDL) is the interface definition language for Web services.

Most commonly, WSDL is used to describe services that are available via SOAP and HTTP.

WSDL defines Web services in terms of the following six concepts:

1. Types: The data type definitions that are used to describe messages.

2. Message: An abstract definition of the data being transmitted.

3. Port Type: A set of abstract operations, each of which has input and output messages.

4. Binding: The concrete protocol and data format specifications.

5. Port: An address for a single communication endpoint.

6. Service: The aggregation of a set of related ports.

UDDI

Universal Description, Discovery, and Integration (UDDI) is not a protocol so much as a process.

The idea is to operate directories or registries of business entities and business services so that people and programs can find providers of the Web services they need.

See www.uddi.org for further information.

References

Dr. Stan Matwin’s lecture slides.

Dr. Thomas Tran’s slides.

An Introduction to XML, by Lars Marius Garshol (http://www.garshol.priv.no/download/text/xml-intro/index-en.html).

XML, Java, and the Future of the Web, by Jon Bosak (http://www.ibiblio.org/pub/sun-info/standards/xml/why/xmlapps.htm).