creating extensible content models xml schemas: best practices a set of guidelines for designing xml...

27
Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Upload: allison-holland

Post on 11-Jan-2016

229 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Creating Extensible Content Models

XML Schemas: Best Practices

A set of guidelines for designing XML Schemas

Created by discussions on xml-dev

Page 2: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Definition

• An element has an extensible content model if in instance documents that element can contain elements and data above and beyond what was specified by the schema.

Page 3: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Static, Fixed Content Model<xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:element>

<Book> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher></Book>

Book is rigidly defined to containfive child elements - Title,Author, Date, ISBN, andPublisher.

Instance document authors are restricted to just supplying title, author, date, ISBN, and publisherdata for a book.

Book's content model is static/fixed!

Page 4: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Static, Fixed Content Model

• Sometimes it is desirable to explicitly specify an element's content model.

• Sometimes, however, we want to give instance document authors more flexibility in what data they can provide for an element.

• How do we design a schema such that Book's content model is extensible?

• We will look at two methods for implementing extensible content models.

Page 5: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extensibility via Type Substitution

<xsd:complexType name="BookType"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence></xsd:complexType>

<xsd:complexType name="BookTypePlusReviewer"> <xsd:complexContent> <xsd:extension base="BookType" > <xsd:sequence> <xsd:element name="Reviewer" type="xsd:string"/> </xsd:sequence> </xsd:extension> </xsd:complexContent></xsd:complexType>

<xsd:element Book type="BookType"/>

Page 6: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

<Book> -- content --</Book>

Principle of Type Substitutability

The content model of Book can be eitherBookType, or any type which derives

from BookType, e.g., BookTypePlusReviewer

Extensibility via Type Substitution

Page 7: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

<Book xsi:type="BookTypePlusReviewer"> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> <Reviewer>Roger L. Costello</Reviewer></Book>

The type substitutability mechanism enables instance document authors to extend Book's content model by substituting its type with a derived type.

Here we see that BookType has been substituted by the type:BookTypePlusReviewer. Thus, <Book> now contains a new element, <Reviewer>.

Extensibility via Type Substitution

Do Lab1

Page 8: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extend a Schema, without Touching it!

• In the last example BookTypePlusReviewer derived from BookType, and both types were in the same schema.

• What if we need to extend BookType, but BookCatalogue.xsd is read-only?

• In a separate schema, we can create a type which extends BookType. The instance document can do type substitution using the new type. Thus, we are able to extend a schema, without touching it!

Page 9: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extend a Schema, without Touching it!

<xsd:complexType name="BookType"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence></xsd:complexType>

<xsd:complexType name="BookTypePlusReviewer"> <xsd:complexContent> <xsd:extension base="BookType" > <xsd:sequence> <xsd:element name="Reviewer" type="xsd:string"/> </xsd:sequence> </xsd:extension> </xsd:complexContent></xsd:complexType>

<xsd:element Book type="BookType"/>

BookCatalogue.xsd

<xsd:include schemaLocation="BookCatalogue.xsd"/>

xmlns=" http://www.publishing.org"xmlns=" http://www.publishing.org"

MyTypeDefinitions.xsd

Page 10: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extend a Schema, without Touching it!

<Book xsi:type="BookTypePlusReviewer"> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> <Reviewer>Roger L. Costello</Reviewer></Book>

xsi:schemaLocation="http://www.publishing.org MyTypeDefinitions.xsd"

xmlns="http://www.publishing.org"

We have type-substitutedBook's content with thetype specified in the newschema. Thus, we haveextended BookCatalogue.xsdwithout touching it!

Page 11: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Disadvantages of using Type Substitution for Extending an

Element's Content Model

• Location Restricted Extensibility: – The extensibility is restricted to appending

elements onto the end of the content model (after the <Publisher> element). What if we wanted to extend <Book> by adding elements to the beginning (before <Title>), or in the middle, etc? We can't do it with this mechanism.

Page 12: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Disadvantages of using Type Substitution for Extending an

Element's Content Model

• Unexpected Extensibility:

<xsd:complexType name="BookType"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence></xsd:complexType>

<xsd:element Book type="BookType"/>

Simply looking at these componentsyou would think that <Book> willalways contain just Title, Author,Date, ISBN, and Publisher. It is easy to forget that someone couldextend the content model using thetype substitution mechanism.Extensibility is unexpected!Consequently, if you create a program to process Book's contentyou may forget to take into accountthat Book may contain different content.

Page 13: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Here's what we Desire for Extending Content Models

• Location Independent Extensibility:– We would like to be able to extend Book's content at any

location, not just at the end. For example, we might wish to add elements at the top (before Author), or in the middle (after Date), etc.

• Explicit Indication of where Extensibility may Occur:– It would be nice if there was a way to explicitly flag places

where extensibility may occur: "hey, instance documents may extend <Book> at this point, so be sure to write your code taking this possibility into account."

Page 14: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extensibility via the <any> Element

<xsd:element name="Book"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> <xsd:any namespace="##any" minOccurs="0"/> </xsd:sequence></xsd:element>

"The content of Book is Title, Author, Date, ISBN, Publisherand then (optionally) any well-formed element. The new element may come from any namespace."

Note: the <any> element may be inserted at any point, e.g,it could be inserted at the top, in the middle, etc.

Page 15: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Extensibility via the <any> Element

<xsd:element name="Book"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> <xsd:any namespace="##any" minOccurs="0"/> </xsd:sequence></xsd:element>

<xsd:element name="Reviewer"> <xsd:complexType> <xsd:sequence> <xsd:element name="Name"> <xsd:complexType> <xsd:sequence> <xsd:element name="First" type="xsd:string"/> <xsd:element name="Last" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType></xsd:element>

xmlns=" http://www.MyRepository.org"

MyRepository.xsd

xmlns=" http://www.publishing.org"

BookCatalogue.xsd

In an instance document I can insert this Reviewer element after Publisher.

Page 16: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

<Book> <Title>The First and Last Freedom</Title> <Author>J. Krishnamurti</Author> <Date>1954</Date> <ISBN>0-06-064831-7</ISBN> <Publisher>Harper &amp; Row</Publisher> <rev:Reviewer> <rev:Name> <rev:First>Roger</rev:First> <rev:Last>Costello</rev:Last> </rev:Name> </rev:Reviewer></Book>

xsi:schemaLocation="http://www.publishing.org BookCatalogue.xsd http://www.MyRepository.org MyRepository.xsd"

xmlns="http://www.publishing.org"xmlns:rev="http://www.MyRepository.org"

This instance document author hasextended Book with an element that the schema designer may havenever even envisioned.

We have empowered the author to create instance documents whichcontains all the data he/she requires.

Page 17: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Alternate Schema for Book<xsd:complexType name="BookType"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> <xsd:any namespace="##any" minOccurs="0"/> </xsd:sequence></xsd:complexType>

<xsd:element Book type="BookType"/>

This is a better design than the previous version since now we have a nice reusable BookType component.

However, we are back to the "unexpected extensibility"problem. Consequently, after the Publisher element theremay be any well-formed XML element, and after thatanything could be present (due to type substitutability).

Page 18: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Controlling Extensibility using the block Attribute

• We can add a block attribute to the element declaration to prohibit type substitution:

<xsd:complexType name="BookType"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> <xsd:any namespace="##any" minOccurs="0"/> </xsd:sequence></xsd:complexType>

<xsd:element Book type="BookType" block="#all"/>

Page 19: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Control over where and how much Extensibility

• We can put the <any> element specifically where we desire extensibility.

• If we desire extensibility at multiple locations, we can insert multiple <any> elements.

• With maxOccurs we can specify "how much" extensibility we will allow.

<xsd:complexType name="BookType"> <xsd:sequence> <xsd:any namespace="##any" minOccurs="0" maxOccurs="2"/> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence></xsd:complexType>

- We are restricting extensions to occur at the top of the content model.- We are restricting the amount of

extensibility to two elements.

Page 20: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Recognizing our Limitations

• The <any> element allows a schema designer to recognize that he/she is not able to anticipate all the varieties of data that an instance document author might need to use in creating an instance document: "I'm smart enough to know that I'm not smart enough to anticipate all possible needs!"

Do Lab 2

Page 21: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Non-determinism and the <any> element

<xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:any namespace="##any" minOccurs="0" maxOccurs="2"/> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:element>

<Book> <Title>My Life and Times</Title> ...</Book>

Schema:

Instance: Does this element correspond to the <any>element, or the Title element declaration?Impossible to determine without "lookingahead" to the next element. The Book element has a non-deterministic content model. Non-determinism content modelsare illegal.

Page 22: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Definition of Non-determinism

• Defn: A non-deterministic content model is one where, upon encountering an element in an instance document, it is ambiguous which path was taken in the schema document.

Page 23: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Non-determinism and the <any> element

<xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:any namespace="##other" minOccurs="0" maxOccurs="2"/> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:element>

<Book> <Title>My Life and Times</Title> ...</Book>

Schema:

Instance: Clearly this element must have come fromthe Title element declaration, because the<any> element requires new elements tocome from a namespace other than thetargetNamespace. Thus, this schema hasa deterministic content model and is legal.

Page 24: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Non-determinism and the <any> element

<xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:any namespace="##other" minOccurs="0" maxOccurs="2"/> <xsd:element ref="bk:Title"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:element>

<Book> <bk:Title>My Life and Times</bk:Title> ...</Book>

Schema:

Instance: Does this Title element come from the <any>element, or from the Title element beingref'ed? There is no way of knowing,without looking ahead. Thus, this isnon-deterministic, and illegal.

Suppose that theBook element is comprised of components from avariety of namespaces.

Page 25: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

<any> --> Quite Restricted

• As we have seen, the requirement that all elements have deterministic content models imposes serious restrictions on the use of the <any> element.

• So, what do you do when you want to enable extensibility at arbitrary locations?

• Answer: embed the <any> element within an <other> element.

Page 26: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

<xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element name="other" minOccurs="0"> <xsd:complexType> <xsd:sequence> <xsd:any namespace="##any"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:year"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:element>

<Book> <Title>My Life and Times</Title> ...</Book>

Schema:

Instance:

Now we are forcing any additionalelements to be embedded within anoptional <other> element.

No more ambiguity about where this Title elementcomes from, i.e., no more non-determinism!

Embed Additional Elements within an <other> Element

Page 27: Creating Extensible Content Models XML Schemas: Best Practices A set of guidelines for designing XML Schemas Created by discussions on xml-dev

Commentary

• Requiring instance document authors to nest additional element within an <other> element is poor, at best.

• The whole reason that this design pattern was forced upon us is due to XML Schemas rule outlawing non-deterministic content models.

– They did this to simplify implementations of Schema validators.

• Write to the XML Schema Working Group requesting that they remove the rule outlawing non-deterministic content models.