sdpl 2002notes 2: document instances and grammars1 2.5 xml schemas n a quick introduction to xml...
TRANSCRIPT
SDPL 2002 Notes 2: Document Instances and Grammars
1
2.5 XML Schemas2.5 XML Schemas
A quick introduction to XML SchemaA quick introduction to XML Schema– W3C Recommendation, May 2, 2001:W3C Recommendation, May 2, 2001:
» XML Schema Part 0: Primer (readable non-XML Schema Part 0: Primer (readable non-normative introduction; Recommended)normative introduction; Recommended)
» XML Schema Part 1: StructuresXML Schema Part 1: Structures
» XML Schema Part 2: DatatypesXML Schema Part 2: Datatypes
– Also under development:Also under development:» XML Schema: Formal Description, W3C XML Schema: Formal Description, W3C
Working Draft, 25 September 2001Working Draft, 25 September 2001
SDPL 2002 Notes 2: Document Instances and Grammars
2
Schema terminologySchema terminology
SchemaSchema: a formal description for the structure : a formal description for the structure and allowed content of a set of data (esp. in and allowed content of a set of data (esp. in databases)databases)
““XML Schema” is often used for each of … XML Schema” is often used for each of … 1.1. XML SchemaXML Schema, the W3C Rec. that defines … , the W3C Rec. that defines … 2.2. XML Schema Definition LanguageXML Schema Definition Language ( (XSDLXSDL), ),
an XML-based markup language for an XML-based markup language for expressing ... expressing ... 3.3. schema documentsschema documents, each of which describes a , each of which describes a schema (schema ( DTD) for a set of XML document instances DTD) for a set of XML document instances
(This may cause some confusion!)(This may cause some confusion!)
SDPL 2002 Notes 2: Document Instances and Grammars
3
Advantages of XSDL (1)Advantages of XSDL (1)
XML syntaxXML syntax– schema documents easier to manipulate by schema documents easier to manipulate by
programs (than the special DTD syntax)programs (than the special DTD syntax) Compatibility with namespacesCompatibility with namespaces
– can validate documents using declarations from can validate documents using declarations from multiple sourcesmultiple sources
Content datatypesContent datatypes– 44 built-in datatypes (including primitive Java 44 built-in datatypes (including primitive Java
datatypes, datatypes of SQL, and XML attribute datatypes, datatypes of SQL, and XML attribute types)types)
– mechanisms to derive user-defined datatypesmechanisms to derive user-defined datatypes
SDPL 2002 Notes 2: Document Instances and Grammars
4
XSDL built-in types (Part 2, Chap. XSDL built-in types (Part 2, Chap. 3)3)
SDPL 2002 Notes 2: Document Instances and Grammars
5
Advantages of XSDL (2)Advantages of XSDL (2)
Independence of element names and Independence of element names and content types; Compare with content types; Compare with – DTDs: 1-to-1 correspondence btw. element type DTDs: 1-to-1 correspondence btw. element type
names and their content models names and their content models – CFGs: 1-to-1 correspondence btw. nonterminals CFGs: 1-to-1 correspondence btw. nonterminals
and their productionsand their productions For example, could define For example, could define titlestitles of people of people
as “Mr.”/”Mrs.”/”Ms.” and as “Mr.”/”Mrs.”/”Ms.” and titlestitles of of chapters as stringschapters as strings
SDPL 2002 Notes 2: Document Instances and Grammars
6
Advantages of XSDL (3)Advantages of XSDL (3)
Support for schema documentation Support for schema documentation – element element annotationannotation with sub-elements with sub-elements
documentation documentation (for human readers) and(for human readers) andappInfo appInfo (for applications)(for applications)
Ability to specify uniqueness and keys Ability to specify uniqueness and keys within selected parts of documentwithin selected parts of document– for example, that for example, that titletitless of chapters should be of chapters should be
uniqueunique
SDPL 2002 Notes 2: Document Instances and Grammars
7
Disadvantages of XSDLDisadvantages of XSDL
Complexity of XSDL (esp. of Rec. Part 1!) Complexity of XSDL (esp. of Rec. Part 1!) – > a long learning curve> a long learning curve
Possible immaturity of implementations (?)Possible immaturity of implementations (?)– W3C XML Schema Web site mentions a dozen of W3C XML Schema Web site mentions a dozen of
tools or processors tools or processors ((http://www.w3.org/XML/Schema#Tools, March http://www.w3.org/XML/Schema#Tools, March 2002)2002)
– Open-source Apache XML parsers (Xerces C++ Open-source Apache XML parsers (Xerces C++ 1.7.0 and Xerces Java 1.4.4) seem reasonable 1.7.0 and Xerces Java 1.4.4) seem reasonable implementations, but also document implementations, but also document limitations/problems in their XML Schema supportlimitations/problems in their XML Schema support
SDPL 2002 Notes 2: Document Instances and Grammars
8
XSDL through ExampleXSDL through Example
– Next: walk-through of an XML schema exampleNext: walk-through of an XML schema example– from XML Schema Rec, Part 0, Chap. 2from XML Schema Rec, Part 0, Chap. 2– Modelling purchase orders like below:Modelling purchase orders like below:
<purchaseOrder orderDate="1999-10-20"><purchaseOrder orderDate="1999-10-20"> <shipTo country="US"> <shipTo country="US"> <name>Alice Smith</name> <name>Alice Smith</name> <street>123 Maple Street</street> <street>123 Maple Street</street> <city>Mill Valley</city> <city>Mill Valley</city> <state>CA</state> <state>CA</state> <zip>90952</zip> <zip>90952</zip> </shipTo> </shipTo>
SDPL 2002 Notes 2: Document Instances and Grammars
9
purchaseOrder purchaseOrder instance instance continuescontinues
<billTo country="US"><billTo country="US"> <name>Robert Smith</name> <name>Robert Smith</name> <street>8 Oak Avenue</street> <street>8 Oak Avenue</street> <city>Old Town</city> <city>Old Town</city> <state>PA</state> <state>PA</state> <zip>95819</zip> </billTo> <zip>95819</zip> </billTo> <comment>Hurry, my lawn is wild!</comment> <comment>Hurry, my lawn is wild!</comment>
<items><item partNum="872-AA"><items><item partNum="872-AA"> <productName>Lawnmower</productName><productName>Lawnmower</productName> <quantity>1</quantity><quantity>1</quantity> <USPrice>148.95</USPrice><USPrice>148.95</USPrice> <comment>Only if electric</comment><comment>Only if electric</comment> </item></item>
SDPL 2002 Notes 2: Document Instances and Grammars
10
End of the example instanceEnd of the example instance
<item partNum="926-AA"><item partNum="926-AA"> <productName>Baby Phone</productName><productName>Baby Phone</productName> <quantity>1</quantity><quantity>1</quantity> <USPrice>39.98</USPrice><USPrice>39.98</USPrice> <shipDate>1999-05-21</shipDate><shipDate>1999-05-21</shipDate> </item></item> </items></items></purchaseOrder></purchaseOrder>
Next: A schema for this kind of purchase ordersNext: A schema for this kind of purchase orders
SDPL 2002 Notes 2: Document Instances and Grammars
11
The Purchase Order Schema (1/5)The Purchase Order Schema (1/5)
<xs:schema <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="purchaseOrder” type="POrdType"/><xs:element name="purchaseOrder” type="POrdType"/>
<xs:element name="comment" type="xs:string"/><xs:element name="comment" type="xs:string"/>
<xs:complexType name="POrdType"> <xs:complexType name="POrdType"> <xs:sequence> <xs:sequence> <xs:element name="shipTo” type="USAddr"/> <xs:element name="shipTo” type="USAddr"/> <xs:element name="billTo" type="USAddr"/> <xs:element name="billTo" type="USAddr"/> <xs:element ref="comment" minOccurs="0"/> <xs:element ref="comment" minOccurs="0"/> <xs:element name="items" type="Items"/> <xs:element name="items" type="Items"/> </xs:sequence> </xs:sequence>
<xs:attribute name="ordDate” type="xs:date"/> <xs:attribute name="ordDate” type="xs:date"/> </xs:complexType></xs:complexType>
SDPL 2002 Notes 2: Document Instances and Grammars
12
The Purchase Order Schema (2/5)The Purchase Order Schema (2/5)
<xs:complexType name="USAddr"> <xs:complexType name="USAddr"> <xs:sequence> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="name" type="xs:string"/> <xs:element name="street" <xs:element name="street"
type="xs:string"/> type="xs:string"/> <xs:element name="city" <xs:element name="city" type="xs:string"/> type="xs:string"/> <xs:element name="state" <xs:element name="state" type="xs:string"/> type="xs:string"/> <xs:element name="zip" <xs:element name="zip" type="xs:decimal"/> type="xs:decimal"/> </xs:sequence> </xs:sequence>
<xs:attribute name="country" <xs:attribute name="country" type="xs:NMTOKEN" fixed="US"/> type="xs:NMTOKEN" fixed="US"/>
</xs:complexType> </xs:complexType>
SDPL 2002 Notes 2: Document Instances and Grammars
13
The Purchase Order Schema (3/5)The Purchase Order Schema (3/5)
<xs:complexType name="Items"> <xs:complexType name="Items"> <xs:sequence> <xs:sequence> <xs:element name="item" <xs:element name="item"
minOccurs="0" maxOccurs="unbounded"> minOccurs="0" maxOccurs="unbounded"> <xs:complexType> <xs:complexType> <xs:sequence> <xs:sequence>
<xs:element name="productName" <xs:element name="productName" type="xs:string"/> type="xs:string"/>
<xs:element name="quantity"> <xs:element name="quantity"> <xs:simpleType> <xs:simpleType>
<xs:restriction <xs:restriction base="xs:positiveInteger"> base="xs:positiveInteger">
<xs:maxExclusive value="100"/> <xs:maxExclusive value="100"/> </xs:restriction> </xs:restriction>
</xs:simpleType> </xs:simpleType> </xs:element> </xs:element>
anonymous type for anonymous type for itemitem
anon. type anon. type for quantityfor quantity
SDPL 2002 Notes 2: Document Instances and Grammars
14
The Purchase Order Schema (4/5)The Purchase Order Schema (4/5)
<xs:element name="USPrice" <xs:element name="USPrice" type="xs:decimal"/> type="xs:decimal"/>
<xs:element ref="comment" <xs:element ref="comment" minOccurs="0"/> minOccurs="0"/>
<xs:element name="shipDate" <xs:element name="shipDate" type="xs:date" type="xs:date"
minOccurs="0"/> minOccurs="0"/> </xs:sequence> </xs:sequence> <xs:attribute name="partNum" type="SKU" <xs:attribute name="partNum" type="SKU"
use="required"/> use="required"/> </xs:complexType> </xs:complexType> </xs:element> </xs:element> </xs:sequence> </xs:sequence>
</xs:complexType> </xs:complexType>
SDPL 2002 Notes 2: Document Instances and Grammars
15
The Purchase Order Schema (5/5)The Purchase Order Schema (5/5)
<!-- Type for Stock Keeping Units, <!-- Type for Stock Keeping Units, (codes for identifying products): --> (codes for identifying products): -->
<xs:simpleType name="SKU"> <xs:simpleType name="SKU"> <xs:restriction base="xs:string"> <xs:restriction base="xs:string"> <xs:pattern value="\d{3}-[A-Z]{2}" /> <xs:pattern value="\d{3}-[A-Z]{2}" /> </xs:restriction> </xs:restriction>
</xs:simpleType> </xs:simpleType>
</xs:schema> </xs:schema>
SDPL 2002 Notes 2: Document Instances and Grammars
16
XSDL Content Models XSDL Content Models
Element content of Element content of complexType complexType can be can be regulated using regulated using – group elements group elements sequencesequence, , choice choice and and all all
and and
– occurrence constraint attributes occurrence constraint attributes minOccurs minOccurs and and maxOccursmaxOccurs
Elements Elements sequencesequence and and choice choice correspond to catenation and alternativity correspond to catenation and alternativity ( | ) in regular expressions ( | ) in regular expressions
SDPL 2002 Notes 2: Document Instances and Grammars
17
XSDL occurrence constraints XSDL occurrence constraints
– optionality (E?) can be expressed by optionality (E?) can be expressed by minOccurs=’0’minOccurs=’0’
– iteration (Eiteration (E**) can be expressed by ) can be expressed by minOccurs=’0’minOccurs=’0’andand maxOccurs=’unbounded’maxOccurs=’unbounded’
– Exactly five occurrences of element A:Exactly five occurrences of element A:<xs:element name=”A” <xs:element name=”A” minOccurs=’5’ minOccurs=’5’
maxOccurs=’5’ />maxOccurs=’5’ />
– 10 to 900 occurrences of element A:10 to 900 occurrences of element A:<xs:element name=”A” <xs:element name=”A” minOccurs=’10’ minOccurs=’10’
maxOccurs=’900’ />maxOccurs=’900’ />
default of both is default of both is 11
SDPL 2002 Notes 2: Document Instances and Grammars
18
Regular expression vs an XSDL content model Regular expression vs an XSDL content model
AA+ + | B (C D)| B (C D)** could be expressed by could be expressed by <xs:choice> <xs:choice> <xs:element name=”A” type=”typeA” <xs:element name=”A” type=”typeA”
maxOccurs=’unbounded’ />maxOccurs=’unbounded’ /> <xs:sequence> <xs:sequence>
<xs:element name=”B” type=”typeB” /><xs:element name=”B” type=”typeB” /> <xs:sequence minOccurs=’0’ <xs:sequence minOccurs=’0’
maxOccurs=’unbounded’ > maxOccurs=’unbounded’ > <xs:element name=”C” type=”typeC” /> <xs:element name=”C” type=”typeC” /> <xs:element name=”D” type=”typeD” /> <xs:element name=”D” type=”typeD” />
</xs:sequence> </xs:sequence> </xs:sequence> </xs:sequence> </xs:choice> </xs:choice>
SDPL 2002 Notes 2: Document Instances and Grammars
19
The The allall group group
XSDL XSDL all all group is a restricted version of group is a restricted version of the SGML &-connectorthe SGML &-connector– E1 & … & En allows sequences corresponding E1 & … & En allows sequences corresponding
to any permutation of E1, …, Ento any permutation of E1, …, En
– XSDL XSDL all all restricted to appear as the only restricted to appear as the only content model group of a content model group of a complexTypecomplexType, and its , and its children to (possibly optional) elementchildren to (possibly optional) element
SDPL 2002 Notes 2: Document Instances and Grammars
20
The The allall group: An example group: An example
For example For example <xs:<xs:allall> > <xs:element name=”A” /> <xs:element name=”A” /> <xs:element name=”<xs:element name=”BB” /> ” /> <xs:element name=”<xs:element name=”CC” ” minminOccurs=’Occurs=’00’ />’ /></xs:</xs:allall>>accepts all of the following element sequences:accepts all of the following element sequences:
A B C; A C B; B A C; B C A; C A B; C B A;A B C; A C B; B A C; B C A; C A B; C B A;A B; and B A; A B; and B A;