towards a knowledge society
![Page 1: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/1.jpg)
Towards a Knowledge Society
Why & How?
XML
![Page 2: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/2.jpg)
What information can we see…
WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …
Speakers confirmed
Tim berners-lee
Tim is the well known inventor of the Web, …
Ian Foster
Ian is the pioneer of the Grid, the next generation internet …
![Page 3: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/3.jpg)
What information can a machine see…
WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact
Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire
Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …
Speakers confirmed
Tim berners-lee
Tim is the well known inventor of the Web, …
Ian Foster
Ian is the pioneer of the Grid, the next generation internet …
![Page 4: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/4.jpg)
Solution: markup with “meaningful” tags?
<name>WWW2002 The eleventh international world wide webcon</name>
<location>Sheraton waikiki hotel Honolulu, hawaii, USA</location>
<date>7-11 may 2002</date>
<slogan>1 location 5 days learn interact</slogan>
<participants>Registered participants coming from australia, canada, chile, denmark, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire</participants>
<introduction>Register now On the 7th May Honolulu will provide the backdrop of the eleventh prestigious Speakers confirmed</introduction>
<speaker>Tim berners-lee</speaker>
<bio>Tim is the well known inventor of the Web,</bio>
…
![Page 5: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/5.jpg)
Structured Web Documents in XML
XML, a language that lets one write structured Web documents with a user-defined vocabulary
![Page 6: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/6.jpg)
A web page containing information about a particular book, in HTML
<h2>Nonmonotonic Reasoning: Context-Dependent Reasoning</h2>
<i>by <b>V. Marek</b> and <b>M. Truszczynski</b></i><br>
Springer 1993<br>ISBN 0387976892
![Page 7: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/7.jpg)
A typical representation in XML
<book>
  <title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>
  <author>V. Marek</author>
  <author>M. Truszczynski</author>
  <publisher>Springer</publisher>
  <year>1993</year>
  <ISBN>0387976892</ISBN>
</book>
![Page 8: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/8.jpg)
Imagine an intelligent agent trying to retrieve the authors of the particular book
From html From xml
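To make the comparison concrete, here is a minimal Python sketch of what such an agent might do with the two book pages shown earlier. The function names are hypothetical, and the HTML heuristic (authors are the bold runs inside the italic "by ..." line) is an assumption about this one page layout, not a general rule.

```python
import re
import xml.etree.ElementTree as ET

HTML_PAGE = """<h2>Nonmonotonic Reasoning: Context-Dependent Reasoning</h2>
<i>by <b>V. Marek</b> and <b>M. Truszczynski</b></i><br>
Springer 1993<br>ISBN 0387976892"""

XML_PAGE = """<book>
  <title>Nonmonotonic Reasoning: Context-Dependent Reasoning</title>
  <author>V. Marek</author>
  <author>M. Truszczynski</author>
  <publisher>Springer</publisher>
  <year>1993</year>
  <ISBN>0387976892</ISBN>
</book>"""


def authors_from_html(page):
    # Guesswork tied to presentation: find the italic "by ..." line,
    # then treat every bold run inside it as an author name.
    byline = re.search(r"<i>by (.*?)</i>", page, re.S)
    return re.findall(r"<b>(.*?)</b>", byline.group(1)) if byline else []


def authors_from_xml(page):
    # The tag name itself says what the content is.
    return [author.text for author in ET.fromstring(page).findall("author")]
```

The HTML version stops working as soon as the page designer changes the fonts; the XML version depends only on the agreed tag name.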
![Page 9: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/9.jpg)
XML allows one to represent information in a form that is also machine-accessible.
XML separates content from use and presentation.
![Page 10: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/10.jpg)
Another Example
<h2>Relationship matter-energy</h2>
<i> E = M × c² </i>
<equation>
  <meaning>Relationship matter-energy</meaning>
  <leftside> E </leftside>
  <rightside> M × c² </rightside>
</equation>
![Page 11: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/11.jpg)
XML is a meta-language: it does not have a fixed set of tags, but allows users to define tags of their own.
Applications on the WWW must agree on common vocabularies if they need to communicate and collaborate
Communities and business sectors are in the process of defining their specialized vocabularies, creating XML applications
![Page 12: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/12.jpg)
mathematics (MathML)
bioinformatics (BSML)
human resources (HRML)
astronomy (AML)
news (NewsML)
investment (IRML)
systems biology (SBML)
Bioinformatic Sequence Markup Language (BSML)
MicroArray and Gene Expression Markup Language (MAGE-ML)
![Page 13: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/13.jpg)
Chemical Markup Language
Molecular Dynamics [Markup] Language (MoDL)
StarDOM - Transforming Scientific Data into XML
Bioinformatic Sequence Markup Language (BSML)
BIOpolymer Markup Language (BIOML)
CellML
Gene Expression Markup Language (GEML)
GeneX Gene Expression Markup Language (GeneXML)
Genome Annotation Markup Elements (GAME)
Microarray Markup Language (MAML)
XML for Multiple Sequence Alignments (MSAML)
Systems Biology Markup Language (SBML)
OMG Gene Expression RFP
Protein Extensible Markup Language (PROXIML)
![Page 14: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/14.jpg)
The XML Language
An XML document consists of a prolog and a number of elements and attributes
![Page 15: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/15.jpg)
Prolog of an XML Document
The prolog consists of an XML declaration and an optional reference to external structuring documents
<?xml version="1.0" encoding="UTF-16"?>
<!DOCTYPE book SYSTEM "book.dtd">
![Page 16: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/16.jpg)
XML Elements
The “things” the XML document talks about, e.g. books, authors, publishers
An element consists of:
an opening tag
the content
a closing tag
<lecturer>David Billington</lecturer>
![Page 17: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/17.jpg)
XML Elements (continued)
Tag names can be chosen almost freely.
The first character must be a letter, an underscore, or a colon.
No name may begin with the string “xml” in any combination of cases, e.g. “Xml”, “xML”.
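The naming rules above can be sketched as a small Python check. This is a simplified, ASCII-only approximation: the function name is hypothetical, and the set of characters allowed after the first one (letters, digits, underscore, colon, period, hyphen) comes from the XML specification rather than from the slide. The specification also permits many non-ASCII letters, which this sketch ignores.

```python
import re

# First character: letter, underscore, or colon (ASCII only in this sketch).
# Later characters: also digits, periods, and hyphens.
_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_permissible_tag_name(name):
    """Return True if name follows the simplified tag-name rules."""
    if _NAME.fullmatch(name) is None:
        return False
    # Names starting with "xml" in any mix of cases are reserved.
    return not name.lower().startswith("xml")
```

For example, `lecturer` and `_note` pass, while `1st` (starts with a digit) and `xML-data` (reserved prefix) do not.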
![Page 18: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/18.jpg)
Content of XML Elements
Content may be text, or other elements, or nothing
<lecturer>
  <name>David Billington</name>
  <phone> +61 − 7 − 3875 507 </phone>
</lecturer>
If there is no content, then the element is called empty; it is abbreviated as follows:
<lecturer/> for <lecturer></lecturer>
![Page 19: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/19.jpg)
XML Attributes
An empty element is not necessarily meaningless
It may have some properties in terms of attributes
An attribute is a name-value pair inside the opening tag of an element
<lecturer name="David Billington" phone="+61 − 7 − 3875 507"/>
![Page 20: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/20.jpg)
XML Attributes: An Example
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
![Page 21: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/21.jpg)
The Same Example without Attributes
<order>
  <orderNo>23456</orderNo>
  <customer>John Smith</customer>
  <date>October 15, 2002</date>
  <item>
    <itemNo>a528</itemNo>
    <quantity>1</quantity>
  </item>
  <item>
    <itemNo>c817</itemNo>
    <quantity>3</quantity>
  </item>
</order>
![Page 22: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/22.jpg)
XML Elements vs Attributes
Attributes can be replaced by elements
When to use elements and when attributes is a matter of taste and need
But attributes cannot be nested
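The claim that attributes can be replaced by elements can be illustrated with a short Python sketch that rewrites the attribute form of the order example into the element form. The function name is hypothetical, and the sketch assumes elements carry no text of their own (true for the order example), so any text content would be dropped.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(elem):
    """Return a copy of elem in which every attribute becomes a child element."""
    converted = ET.Element(elem.tag)
    # Each name-value pair turns into <name>value</name>.
    for name, value in elem.attrib.items():
        ET.SubElement(converted, name).text = value
    # Existing children are converted recursively and kept in order.
    for child in elem:
        converted.append(attributes_to_elements(child))
    return converted


order = ET.fromstring(
    '<order orderNo="23456" customer="John Smith" date="October 15, 2002">'
    '<item itemNo="a528" quantity="1"/>'
    '<item itemNo="c817" quantity="3"/>'
    "</order>"
)
as_elements = attributes_to_elements(order)
```

The reverse direction is not always possible: an element with nested children has no attribute equivalent, which is the sense in which attributes cannot be nested.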
![Page 23: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/23.jpg)
Further Components of XML Docs
Comments
A piece of text that is to be ignored by the parser
<!-- This is a comment -->
Processing Instructions (PIs)
Define procedural attachments
<?stylesheet type="text/css" href="mystyle.css"?>
![Page 24: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/24.jpg)
Well-Formed XML Documents
Syntactically correct documents
Some syntactic rules:
Only one outermost element (called the root element)
Each element contains an opening and a corresponding closing tag
Tags may not overlap; for example, this is not allowed: <author><name>Lee Hong</author></name>
Attributes within an element have unique names
Element and tag names must be permissible
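Any XML parser enforces these rules. As a minimal sketch, Python's standard `xml.etree.ElementTree` parser raises `ParseError` on a document that is not well-formed; the helper name below is hypothetical.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return None if text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return str(err)
    return None


ok = well_formedness_error("<author><name>Lee Hong</name></author>")
overlap = well_formedness_error("<author><name>Lee Hong</author></name>")
two_roots = well_formedness_error("<author/><author/>")
duplicate_attribute = well_formedness_error('<item itemNo="a528" itemNo="c817"/>')
```

Only the first call returns `None`; the other three documents each break one of the rules listed above.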
![Page 25: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/25.jpg)
The Tree Model of XML Documents: An Example
<email>
  <head>
    <from name="Michael Maher" address="[email protected]"/>
    <to name="Grigoris Antoniou" address="[email protected]"/>
    <subject>Where is your draft?</subject>
  </head>
  <body>
    Grigoris, where is the draft of the paper you promised me last week?
  </body>
</email>
![Page 26: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/26.jpg)
The Tree Model of XML Documents: An Example
![Page 27: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/27.jpg)
The Tree Model of XML Docs
The tree representation of an XML document is an ordered labeled tree:
There is exactly one root
There are no cycles
Each non-root node has exactly one parent
Each node has a label
The order of elements is important
… but the order of attributes is not important
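A minimal Python sketch of the tree model, using the standard `xml.etree.ElementTree` module. The helper names and the placeholder attribute values are made up for illustration; the comparison treats child order as significant and attribute order as irrelevant, as the slide states.

```python
import xml.etree.ElementTree as ET


def outline(elem, depth=0):
    """Return one (depth, label) pair per element node, in document order."""
    rows = [(depth, elem.tag)]
    for child in elem:
        rows.extend(outline(child, depth + 1))
    return rows


def same_tree(a, b):
    """Compare two trees: child order matters, attribute order does not."""
    return (
        a.tag == b.tag
        and a.attrib == b.attrib  # dict comparison ignores attribute order
        and (a.text or "").strip() == (b.text or "").strip()
        and len(a) == len(b)
        and all(same_tree(x, y) for x, y in zip(a, b))
    )


head = ET.fromstring('<head><from name="M" address="x"/><to name="G" address="y"/></head>')
attrs_swapped = ET.fromstring('<head><from address="x" name="M"/><to address="y" name="G"/></head>')
children_swapped = ET.fromstring('<head><to name="G" address="y"/><from name="M" address="x"/></head>')
```

Here `same_tree(head, attrs_swapped)` holds, while `same_tree(head, children_swapped)` does not.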
![Page 28: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/28.jpg)
XML is not enough to ensure valid data structure!
Any XML document that conforms to the XML syntax (for example, every opening tag must have a corresponding closing tag) is considered to be well-formed
However, this does not mean that the structure of the data is what you wanted. For instance, you may want to enforce:
That a particular data field is present for each child
That data fields in each child appear in the same order
That a data field may not be present more than once in a child node
How do machines know about the structure of the data they process?
![Page 29: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/29.jpg)
Issues: Validation and Interoperability
How can an application verify the data it receives from the outside world?
Does the XML document follow the specified structure?
Does the XML document respect the restrictions on its elements and attributes?
How can we ensure that all XML documents follow that particular structure?
![Page 30: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/30.jpg)
Structuring XML Documents
An XML document is valid if
it is well-formed
it respects the structuring information it uses
There are two ways of defining the structure of XML documents:
DTDs (the older and more restricted way)
XML Schema (offers extended possibilities)
![Page 31: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/31.jpg)
2nd Lecture
![Page 32: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/32.jpg)
DTD: Document Type Definition
• A DTD specifies grammar rules for an XML document, enforcing its structure
• This allows several XML documents prepared from various sources to be validated using a single set of grammar rules
• An XML document that adheres to a DTD is called valid
• A DTD specifies rules for elements and how they can be expanded into sub-elements (child nodes)
• A DTD consists of element declarations, attribute lists, data types, etc.
• DTDs are difficult to create!
CCTM: Course material developed by James King ([email protected])
![Page 33: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/33.jpg)
DTD: Element Type Definition
<lecturer>
  <name>David Billington</name>
  <phone> +61 − 7 − 3875 507 </phone>
</lecturer>
DTD for above element (and all lecturer elements):
<!ELEMENT lecturer (name,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
![Page 34: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/34.jpg)
The Meaning of the DTD
The element types lecturer, name, and phone may be used in the document
A lecturer element contains a name element and a phone element, in that order (sequence)
A name element and a phone element may have any content
In DTDs, #PCDATA is the only atomic type for elements
![Page 35: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/35.jpg)
DTD: Disjunction in Element Type Definitions
We express that a lecturer element contains either a name element or a phone element as follows:
<!ELEMENT lecturer (name|phone)>
A lecturer element contains a name element and a phone element in any order:
<!ELEMENT lecturer ((name,phone)|(phone,name))>
![Page 36: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/36.jpg)
Example of an XML Element
<order orderNo="23456" customer="John Smith" date="October 15, 2002">
  <item itemNo="a528" quantity="1"/>
  <item itemNo="c817" quantity="3"/>
</order>
![Page 37: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/37.jpg)
The Corresponding DTD
<!ELEMENT order (item+)>
<!ATTLIST order
  orderNo ID #REQUIRED
  customer CDATA #REQUIRED
  date CDATA #REQUIRED>
<!ELEMENT item EMPTY>
<!ATTLIST item
  itemNo ID #REQUIRED
  quantity CDATA #REQUIRED
  comments CDATA #IMPLIED>
![Page 38: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/38.jpg)
Comments on the DTD
The item element type is defined to be empty
+ (after item) is a cardinality operator:
?: appears zero times or once
*: appears zero or more times
+: appears one or more times
No cardinality operator means exactly once
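Sequence (`,`), choice (`|`), and the cardinality operators make a DTD content model behave like a regular expression over child element names. The Python sketch below exploits that to check an element's children against a content model. It is an illustration under stated assumptions, not a DTD validator: the helper names are hypothetical, only pure element content is handled (no `#PCDATA`, `EMPTY`, `ANY`, or mixed content), and element names are assumed to be plain ASCII.

```python
import re
import xml.etree.ElementTree as ET


def content_model_to_regex(model):
    """Translate e.g. '(from,to+,cc*,subject)' into a regex over 'name;' tokens."""
    compact = "".join(model.split())
    # Each element name becomes a group matching that name plus a ';' separator.
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group(0)) + ";)",
        compact,
    )
    # ',' is plain concatenation; '|', '?', '*', '+', '(' and ')' already
    # mean the same thing in regex syntax.
    return pattern.replace(",", "")


def children_match(elem, model):
    """Return True if the child elements of elem satisfy the content model."""
    children = "".join(child.tag + ";" for child in elem)
    return re.fullmatch(content_model_to_regex(model), children) is not None


lecturer = ET.fromstring("<lecturer><phone>p</phone><name>n</name></lecturer>")
order = ET.fromstring("<order><item/><item/></order>")
```

With these inputs, `lecturer` fails `(name,phone)` because the order is wrong but passes `((name,phone)|(phone,name))`, and `order` passes `(item+)`.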
![Page 39: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/39.jpg)
Comments on the DTD (cont.)
In addition to defining elements, we define attributes
This is done in an attribute list containing:
Name of the element type to which the list applies
A list of triplets of attribute name, attribute type, and value type
Attribute name: A name that may be used in an XML document using a DTD
![Page 40: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/40.jpg)
DTD: Attribute Types
Similar to predefined data types, but limited selection
The most important types are
CDATA, a string (sequence of characters)
ID, a name that is unique across the entire XML document
IDREF, a reference to another element with an ID attribute carrying the same value as the IDREF attribute
IDREFS, a series of IDREFs
(v1| . . . |vn), an enumeration of all possible values
Limitations: no dates, number ranges etc.
![Page 41: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/41.jpg)
DTD: Attribute Value Types
#REQUIRED: the attribute must appear in every occurrence of the element type in the XML document
#IMPLIED: the appearance of the attribute is optional
#FIXED "value": every element of this type has this attribute, and always with this value
"value": this specifies the default value for the attribute
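The four value types can be sketched as a small Python function that applies attribute declarations to the attributes actually written on an element. The function name, the constants, and the declaration table format are all hypothetical; a validating parser does this internally.

```python
REQUIRED, IMPLIED, FIXED, DEFAULT = "#REQUIRED", "#IMPLIED", "#FIXED", "default"


def apply_declarations(attrs, declarations):
    """Return the effective attributes, or raise ValueError on a violation.

    declarations maps attribute name -> (kind, value); value is only
    used for FIXED and DEFAULT.
    """
    result = dict(attrs)
    for name, (kind, value) in declarations.items():
        if kind == REQUIRED and name not in result:
            raise ValueError(f"missing required attribute {name!r}")
        if kind == FIXED and result.setdefault(name, value) != value:
            raise ValueError(f"attribute {name!r} must have value {value!r}")
        if kind == DEFAULT:
            # Supply the default only when the document omits the attribute.
            result.setdefault(name, value)
        # IMPLIED: nothing to check, the attribute may simply be absent.
    return result


# Illustrative declarations: one attribute with a default, one required.
declarations = {"encoding": (DEFAULT, "mime"), "file": (REQUIRED, None)}
effective = apply_declarations({"file": "draft.txt"}, declarations)
```

Here `effective` gains `encoding="mime"` even though the element only wrote `file`.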
![Page 42: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/42.jpg)
Referencing with IDREF and IDREFS
<!ELEMENT family (person*)>
<!ELEMENT person (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST person
  id ID #REQUIRED
  mother IDREF #IMPLIED
  father IDREF #IMPLIED
  children IDREFS #IMPLIED>
![Page 43: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/43.jpg)
An XML Document Respecting the DTD
<family>
  <person id="bob" mother="mary" father="peter">
    <name>Bob Marley</name>
  </person>
  <person id="bridget" mother="mary">
    <name>Bridget Jones</name>
  </person>
  <person id="mary" children="bob bridget">
    <name>Mary Poppins</name>
  </person>
  <person id="peter" children="bob">
    <name>Peter Marley</name>
  </person>
</family>
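What ID, IDREF, and IDREFS require of this document can be sketched in Python: every ID value is unique, and every reference points at an existing ID. The function name and its parameters are hypothetical, and the attribute names are passed in by hand because the standard `xml.etree.ElementTree` parser does not read them from the DTD.

```python
import xml.etree.ElementTree as ET

FAMILY = """<family>
  <person id="bob" mother="mary" father="peter"><name>Bob Marley</name></person>
  <person id="bridget" mother="mary"><name>Bridget Jones</name></person>
  <person id="mary" children="bob bridget"><name>Mary Poppins</name></person>
  <person id="peter" children="bob"><name>Peter Marley</name></person>
</family>"""


def check_references(root, id_attr="id", idref_attrs=("mother", "father"),
                     idrefs_attrs=("children",)):
    """Return a list of problems; an empty list means all references resolve."""
    problems = []
    seen = set()
    # First pass: collect IDs and report duplicates.
    for elem in root.iter():
        value = elem.get(id_attr)
        if value is None:
            continue
        if value in seen:
            problems.append(f"duplicate ID {value!r}")
        seen.add(value)
    # Second pass: every IDREF, and every token of an IDREFS, must be a known ID.
    for elem in root.iter():
        refs = [elem.get(a) for a in idref_attrs if elem.get(a) is not None]
        for a in idrefs_attrs:
            refs.extend((elem.get(a) or "").split())
        for ref in refs:
            if ref not in seen:
                problems.append(f"{elem.tag} refers to unknown ID {ref!r}")
    return problems


family_problems = check_references(ET.fromstring(FAMILY))
```

For the family document, `family_problems` is empty; changing `mother="mary"` to a value that no `id` carries would produce one entry.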
![Page 44: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/44.jpg)
A DTD for an Email Element
<!ELEMENT email (head,body)>
<!ELEMENT head (from,to+,cc*,subject)>
<!ELEMENT from EMPTY>
<!ATTLIST from
  name CDATA #IMPLIED
  address CDATA #REQUIRED>
<!ELEMENT to EMPTY>
<!ATTLIST to
  name CDATA #IMPLIED
  address CDATA #REQUIRED>
![Page 45: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/45.jpg)
A DTD for an Email Element (cont.)
<!ELEMENT cc EMPTY>
<!ATTLIST cc
  name CDATA #IMPLIED
  address CDATA #REQUIRED>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT body (text,attachment*)>
<!ELEMENT text (#PCDATA)>
<!ELEMENT attachment EMPTY>
<!ATTLIST attachment
  encoding (mime|binhex) "mime"
  file CDATA #REQUIRED>
![Page 46: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/46.jpg)
Interesting Parts of the DTD
A head element contains (in that order):
a from element
at least one to element
zero or more cc elements
a subject element
In from, to, and cc elements
the name attribute is not required
the address attribute is always required
![Page 47: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/47.jpg)
Interesting Parts of the DTD (cont.)
A body element contains
a text element
possibly followed by a number of attachment elements
The encoding attribute of an attachment element
must have either the value “mime” or “binhex”
“mime” is the default value
![Page 48: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/48.jpg)
Remarks on DTDs
Recursive definitions are possible in DTDs
<!ELEMENT bintree ((bintree,root,bintree)|emptytree)>
![Page 49: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/49.jpg)
DTD Problems
Attribute values
It is possible to specify all the values an attribute can have, or to allow it to have any value.
It is not possible to perform type checking on an attribute’s value, so you cannot specify that an attribute’s value is an integer or a float…
DTDs do not follow XML syntax
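Because a DTD can only say that an attribute such as quantity is CDATA, any type check has to live in the application. A minimal Python sketch of such a check on an order element follows; the function name is hypothetical, and "integer" is taken here to mean a run of ASCII digits.

```python
import re
import xml.etree.ElementTree as ET


def non_integer_quantities(order):
    """Return (itemNo, quantity) pairs whose quantity is not a plain integer."""
    return [
        (item.get("itemNo"), item.get("quantity"))
        for item in order.iter("item")
        if re.fullmatch(r"[0-9]+", item.get("quantity") or "") is None
    ]


# Both documents satisfy "quantity CDATA #REQUIRED"; only the first is sensible.
good = ET.fromstring(
    '<order><item itemNo="a528" quantity="1"/><item itemNo="c817" quantity="3"/></order>'
)
bad = ET.fromstring(
    '<order><item itemNo="a528" quantity="1"/><item itemNo="c817" quantity="three"/></order>'
)
```

The DTD accepts both documents; only the application-level check flags the second one.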
![Page 50: Towards a Knowledge Society](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815fe8550346895dceed09/html5/thumbnails/50.jpg)
Next Lecture: XML Schema