what is metadata? - nc state computer science · exercise produce an example xml document...


Upload: nguyenanh

Post on 04-Jun-2018

Module 4: XML Representation

Concepts
Parsing and Validation
Schemas

c©Munindar P. Singh, CSC 513, Spring 2008 p.106

What is Metadata?

Literally, data about data

Description of data that captures some useful property regarding its
  Structure and meaning
  Provenance: origins
  Treatment as permitted or allowed: storage, representation, processing, presentation, or sharing

Markup is metadata pertaining to media artifacts (documents, images), generally specified for suitable parsable units

Motivations for Metadata

Mediating information structure (surrogate for meaning) over time and space
  Storage: extend life of information
  Interoperation for business
  Interoperation (and storage) for regulatory reasons

General themes
  Make meaning of information explicit
  Enable reuse across applications: repurposing (compare to screen-scraping)
  Enable better tools to improve productivity

Reduce need for detailed prior agreements

Markup History

How much prior agreement do you need?

No markup: significant prior agreement
Comma Separated Values (CSV): no nesting
Ad hoc tags
SGML (Standard Generalized Markup Language): complex, few reliable tools; used for document management
HTML (HyperText Markup Language): simplistic, fixed, unprincipled vocabulary that mixes structure and display
XML (eXtensible Markup Language): simple, yet extensible subset of SGML to capture custom vocabularies
  Machine processible
  Comprehensible to people: easier debugging

Uses of XML

Supporting arms-length relationships
Exchanging information across software components, even within an administrative domain
Storing information in nonproprietary format
Representing semistructured descriptions:
  Products, services, catalogs
  Contracts
  Queries, requests, invocations, responses (as in SOAP): basis for Web services

Example XML Document

    <?xml version="1.0"?> <!-- processing instruction -->
    <topelem attr0="foo"> <!-- exactly one root -->
      <subelem attr1="v1" attr2="v2">
        Optional text (PCDATA) <!-- parsed character data -->
        <subsubelem attr1="v1" attr2="v2"/>
      </subelem>
      <null_elem/>
      <short_elem attr3="v3"/>
    </topelem>
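To see how the example document maps onto a tree, the following minimal sketch parses it with Python's standard xml.etree.ElementTree module. The choice of Python and of this library is an assumption made for illustration; the slides do not prescribe a parser.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, with comments removed.
EXAMPLE = """<?xml version="1.0"?>
<topelem attr0="foo">
  <subelem attr1="v1" attr2="v2">
    Optional text (PCDATA)
    <subsubelem attr1="v1" attr2="v2"/>
  </subelem>
  <null_elem/>
  <short_elem attr3="v3"/>
</topelem>"""

root = ET.fromstring(EXAMPLE)              # exactly one root element
child_tags = [child.tag for child in root]  # e-children, in document order
subelem = root.find("subelem")
pcdata = subelem.text.strip()               # text before the first subelement
```

Attributes are reached through root.attrib (a dictionary), separately from the element children.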

Exercise

Produce an example XML document corresponding to a directed graph

Compare with Lisp

List processing language
S-expressions
Cons pairs: car and cdr
Lists as nil-terminated s-expressions
Arbitrary structures built from few primitives
Untyped
Easy parsing
Regularity of structure encourages recursion
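The comparison with Lisp can be made concrete by converting an XML element into a nested list, in the spirit of an s-expression. This is only a sketch: the helper name to_sexpr and the [tag, attributes, children...] layout are hypothetical choices, and text that follows a subelement is ignored for brevity.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Return [tag, attributes, *content], built recursively (hypothetical layout)."""
    items = [elem.tag, dict(elem.attrib)]
    if elem.text and elem.text.strip():
        items.append(elem.text.strip())
    for child in elem:
        # The regular structure makes the recursive call the natural choice.
        items.append(to_sexpr(child))
    return items

doc = ET.fromstring('<a x="1"><b>hi</b><c/></a>')
sexpr = to_sexpr(doc)
```

As with Lisp lists, a handful of primitives (a tag, an attribute map, nested lists) is enough to represent arbitrary structure.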

Exercise

Produce an example XML document corresponding to
  An invoice from Locke Brothers for 100 units of door locks at $19.95 each, ordered on 15 January and delivered to Custom Home Builders
  Factor in certified delivery via UPS for $200.00 on 18 January
  Factor in addresses and contact info for each party
  Factor in late payments

Meaning in XML

Relational DBMSs work for highly structured information, but rely on column names for meaning
Same problem in XML (reliance on names for meaning) but better connections to richer meaning representations

XML Namespaces: 1

Because XML supports custom vocabularies and interoperation, there is a high risk of name collision
A namespace is a collection of names
Namespaces must be identical or disjoint
  Crucial to support independent development of vocabularies
  MAC addresses
  Postal and telephone codes
  Vehicle identification numbers
  Domains as for the Internet
On the Web, use URIs for uniqueness

XML Namespaces: 2

    <?xml version="1.0"?>
    <!-- names beginning with xml are reserved -->
    <!-- xmlns="a URI" below declares the default namespace -->
    <arbit:top xmlns="a URI"
        xmlns:arbit="http://wherever.it.might.be/arbit-ns"
        xmlns:random="http://another.one/random-ns">
      <arbit:aElem attr1="v1" attr2="v2">
        Optional text (PCDATA)
        <arbit:bElem attr1="v1" attr2="v2"/>
      </arbit:aElem>
      <random:simple_elem/>
      <random:aElem attr3="v3"/>
      <!-- compare arbit:aElem -->
    </arbit:top>
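A parser resolves each prefix to its namespace URI, so two elements with the same local name stay distinct. The sketch below uses Python's xml.etree.ElementTree, which reports names in {URI}local form. The default-namespace URI http://example.org/default-ns and the plain element are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<arbit:top xmlns="http://example.org/default-ns"
    xmlns:arbit="http://wherever.it.might.be/arbit-ns"
    xmlns:random="http://another.one/random-ns">
  <arbit:aElem attr1="v1"/>
  <random:aElem attr3="v3"/>
  <plain/>
</arbit:top>"""

root = ET.fromstring(DOC)
tags = [child.tag for child in root]   # expanded names, prefixes are gone

# Lookups use a caller-chosen prefix map; only the URI has to match.
ns = {"r": "http://another.one/random-ns"}
random_a = root.find("r:aElem", ns)
```

The prefix r in the lookup differs from the prefix random in the document, which shows that the URI, not the prefix, identifies the namespace.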

Uniform Resource Identifier

URIs are abstract
What matters is their (purported) uniqueness
URIs have no proper syntax per se
Kinds of URIs
  URLs, as in browsing: not used in standards any more
  URNs, which leave the mapping of names to locations up in the air
Good design: the URI resource exists
  Ideally, as a description of the resource in RDDL
  Use a URL or URN

RDDL

Resource Directory Description Language
Meant to solve the problem that a URI may not have any real content, but people expect to see some (human readable) content
Captures namespace description for people
  XML Schema
  Text description

Well-Formedness and Parsing

An XML document maps to a parse tree (if well-formed; otherwise not XML)
  Each element must end (exactly once): obvious nesting structure (one root)
  An attribute can have at most one occurrence within an element; an attribute's value must be a quoted string
Well-formed XML documents can be parsed
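The well-formedness rules can be checked by handing small documents to a parser and seeing which ones it rejects. This is a minimal sketch using Python's xml.etree.ElementTree; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a complete XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

checks = {
    "ok": is_well_formed("<a><b/></a>"),
    "unclosed element": is_well_formed("<a><b></a>"),
    "two roots": is_well_formed("<a/><b/>"),
    "duplicate attribute": is_well_formed('<a x="1" x="2"/>'),
    "unquoted attribute value": is_well_formed("<a x=1/>"),
}
```

Only the first entry is accepted; each of the others violates one of the rules listed on the slide.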

XML InfoSet

A standardization of the low-level aspects of XML
  What an element looks like
  What an attribute looks like
  What comments and namespace references look like
  Ordering of attributes is irrelevant
  Representations of strings and characters
Primarily directed at tool vendors

Elements Versus Attributes: 1

Elements are essential for XML: structure and expressiveness
  Have subelements and attributes
  Can be repeated
  Loosely might correspond to independently existing entities
  Can capture all there is to attributes

Elements Versus Attributes: 2

Attributes are not essential
  End of the road: no subelements or attributes
  Like text; restricted to string values
  Guaranteed unique for each element
  Capture adjunct information about an element
  Great as references to elements
Good idea to use in such cases to improve readability

Elements Versus Attributes: 3

    <invoice>
      <price currency='USD'>
        19.95
      </price>
    </invoice>

Or

    <invoice amount='19.95' currency='USD'/>

Or even

    <invoice amount='USD 19.95'/>
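The three encodings carry the same information but differ in how much work the consumer must do. The sketch below, using Python's xml.etree.ElementTree, extracts the currency and amount from each variant; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

v1 = ET.fromstring("<invoice><price currency='USD'>19.95</price></invoice>")
v2 = ET.fromstring("<invoice amount='19.95' currency='USD'/>")
v3 = ET.fromstring("<invoice amount='USD 19.95'/>")

# Variant 1: amount is element content, currency is an attribute of price.
price = v1.find("price")
from_element = (price.get("currency"), float(price.text))

# Variant 2: both facts are separate attributes.
from_attrs = (v2.get("currency"), float(v2.get("amount")))

# Variant 3: one attribute packs both facts, so the value itself needs parsing.
currency, amount = v3.get("amount").split()
from_packed = (currency, float(amount))
```

The third variant works only because the consumer knows the internal format of the attribute value, which the markup itself does not express.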

Validating

Verifying whether a document matches a given grammar (assumes well-formedness)
  Applications have an explicit or implicit syntax (i.e., grammar) for their particular elements and attributes
    Explicit is better: have definitions
    Best to refer to definitions in separate documents
  When docs are produced by external software components or by human intervention, they should be validated

Specifying Document Grammars

Verifying whether a document matches a given grammar
  Implicitly in the application
    Worst possible solution, because it is difficult to develop and maintain
  Explicit in a formal document; languages include
    Document Type Definition (DTD): in essence obsolete
    XML Schema: good and prevalent
    Relax NG: (supposedly) better but not as prevalent

XML Schema

Same syntax as regular XML documents
Local scoping of subelement names
Incorporates namespaces
(Data) Types
  Primitive (built-in): string, integer, float, date, ID (key), IDREF (foreign key), ...
  simpleType constructors: list, union
  Restrictions: intervals, lengths, enumerations, regex patterns
Flexible ordering of elements
Key and referential integrity constraints

XML Schema: complexType

Specifies types of elements with structure:
  Must use a compositor if ≥ 1 subelements
  Subelements with types
  Min and max occurrences (default 1) of subelements
Elements with text content are easy
EMPTY elements: easy
  Example?
  Compare to nulls, later

XML Schema: Compositors

Sequence: ordered list
  Can occur within other compositors
  Allows varying min and max occurrence
All: unordered
  Must occur directly below root element
  Max occurrence of each element is 1
Choice: exclusive or
  Can occur within other compositors

XML Schema: Main Namespaces

Part of the standard
xsd: http://www.w3.org/2001/XMLSchema
  Terms for defining schemas: schema, element, attribute, ...
  The schema element has an attribute targetNamespace
xsi: http://www.w3.org/2001/XMLSchema-instance
  Terms for use in instances: schemaLocation, noNamespaceSchemaLocation, nil, type
targetNamespace: user-defined

XML Schema Instance Doc

    <!-- Comment -->
    <Music xmlns="http://a.b.c/Muse"
           xmlns:xsi="the standard-xsi"
           xsi:schemaLocation="schema-URI schema-location-URL">
      <!-- Notice space character in above string -->
      ...
    </Music>

Define null values as

    <aElem xsi:nil="true"/>

XML Schema: Nillable

An xsd:element declaration may state nillable='true'
  An instance of the element might state xsi:nil="true"
  The instance would be valid even if no content is present, even if content is required by default
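On the consuming side, an application can detect a nil element by looking for the xsi:nil attribute in the XML Schema-instance namespace. The sketch below only reads the flag with Python's xml.etree.ElementTree; it does not validate against a schema (the Python standard library has no XML Schema validator), and the helper name is_nil and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

def is_nil(elem):
    """Return True if elem carries xsi:nil with a true value ('true' or '1')."""
    return elem.get(f"{{{XSI}}}nil") in ("true", "1")

doc = ET.fromstring(
    f'<Music xmlns:xsi="{XSI}"><aElem xsi:nil="true"/><bElem>x</bElem></Music>'
)
nil_flags = {child.tag: is_nil(child) for child in doc}
```

Because xsi:nil is a namespace-qualified attribute, the lookup key is the expanded name, not the literal string xsi:nil.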

Creating XML Schema Docs: 1

Included into the same namespace as the including doc

    <xsd:schema xmlns:xsd="the-standard-xsd"
                targetNamespace="the-target">
      <xsd:include schemaLocation="part-one.xsd"/>
      <xsd:include schemaLocation="part-two.xsd"/>
      <!-- schemaLocation as in xsd, not xsi -->
    </xsd:schema>

Creating XML Schema Docs: 2

Use import instead of include
  Imports may have different targets
  Included schemas have the same target
  Specify namespaces from which schemas are to be imported
  Location of schemas not required and may be ignored if provided

Foreign Attributes in XML Schema

XML Schema elements allow attributes that are foreign, i.e., with a namespace other than the xsd namespace
  Must have an explicit namespace
  Can be used to insert any additional information, not interpreted by a processor
  Specific usage is with attributes from the xlink: namespace

    <xsd:schema>
      <xsd:element name='course' type='cT'
          xlink:role='work' ncsu:offering='true'/>
    </xsd:schema>

XML Schema Style Guidelines: 1

Flatten the structure of the schema
  Don't nest declarations as you would a desired instance document
  Make sure that element names are not reused
  Unqualified attributes cannot be global
  If dealing with legacy documents with the same element names having different meanings, place them in different namespaces where possible
Use named types where appropriate

XML Schema Style Guidelines: 2

Don't have elements with mixed content
Don't have attribute values that need parsing
Add unique IDs for information that may repeat
Group information that may repeat
Emphasize commonalities and reuse
  Derive types from related types
  Create attribute groups

XML Schema Documentation

xsd:annotation
  Should be the first subelement, except for the whole schema
  Container for two mixed-content subelements
    xsd:documentation: for humans
    xsd:appinfo: for machine-processible data
      Such as application-specific metadata
      Possibly using the Dublin Core vocabulary, which describes library content and other media

Module 5: XML Manipulation

Key XML query and manipulation languages include
  XPath
  XQuery
  XSLT

Metaphors for Handling XML: 1

How we conceptualize what XML documents are determines our approach for handling such documents

Text: an XML document is text
  Ignore any structure and perform simple pattern matches
Tags: an XML document is text interspersed with tags
  Treat each tag as an "event" during reading a document, as in SAX (Simple API for XML)
  Construct regular expressions as in screen scraping

Metaphors for Handling XML: 2

Tree: an XML document is a tree
  Walk the tree using DOM (Document Object Model)
Template: an XML document has regular structure
  Let XPath, XSLT, XQuery do the work
Thought: an XML document represents a graph structure
  Access knowledge via RDF or OWL
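The tags and tree metaphors correspond to two different programming styles. The sketch below contrasts them using Python's standard xml.sax and xml.dom.minidom modules on a small hypothetical document; the class name TagLogger and the function walk are illustrative names, not part of either API.

```python
import xml.sax
from xml.dom import minidom

DOC = "<songs><Song><group>Led Zeppelin</group></Song><Song/></songs>"

class TagLogger(xml.sax.ContentHandler):
    """Tags metaphor: record each start and end tag as an event."""
    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

handler = TagLogger()
xml.sax.parseString(DOC.encode("utf-8"), handler)

def walk(elem, depth=0):
    """Tree metaphor: visit element nodes recursively, tracking depth."""
    found = [(depth, elem.tagName)]
    for child in elem.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            found.extend(walk(child, depth + 1))
    return found

tree = walk(minidom.parseString(DOC).documentElement)
```

The SAX handler never holds the whole document, whereas the DOM walk requires the full tree in memory but can move freely within it.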

XPath

Used as part of XPointer, SQL/XML, XQuery, and XSLT
Models XML documents as trees with nodes
  Elements
  Attributes
  Text (PCDATA)
  Comments
  Root node: above root of document

Achtung!

Parent in XPath is like parent as traditionally in computer science
Child in XPath is confusing:
  An attribute is not a child of its parent
  Makes a difference for recursion (e.g., in XSLT apply-templates)
Our terminology follows computer science:
  e-children, a-children, t-children
  Sets via et-, ta-, and so on

XPath Location Paths: 1

Relative or absolute
Reminiscent of file system paths, but much more subtle
  Name of an element to walk down
  Leading /: root
  /: indicates walking down a tree
  .: currently matched (context) node
  ..: parent node

XPath Location Paths: 2

@attr: to check existence or access value of the given attribute
text(): extract the text
comment(): extract the comment
[ ]: generalized array accessors
Variety of axes, discussed below

XPath Navigation

Select children according to position, e.g., [j], where j could be 1 ... last()
Descendant-or-self operator, //
  .//elem finds all elems under the current node
  //elem finds all elems in the document
Wildcard, *:
  collects e-children (subelements) of the node where it is applied, but omits the t-children
  @*: finds all attribute values

XPath Queries (Selection Conditions)

Attributes: //Song[@genre="jazz"]
Text: //Song[starts-with(.//group, "Led")]
Existence of attribute: //Song[@genre]
Existence of subelement: //Song[group]
Boolean operators: and, not, or
Set operator: union (|), analogous to choice
Arithmetic operators: >, <, ...
String functions: contains(), concat(), string-length(), starts-with(), ends-with()
distinct-values()
Aggregates: sum(), count()
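Several of these selection conditions can be tried with Python's xml.etree.ElementTree, which implements only a subset of XPath. The sketch below is an illustration under that assumption: paths start with .// because the library does not accept absolute paths on an element, and starts-with() is emulated in Python because it is outside the supported subset. The sample document and the helper name titles are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<songs>
  <Song genre="jazz" ti="So What"><group>Miles Davis Sextet</group></Song>
  <Song genre="rock" ti="Kashmir"><group>Led Zeppelin</group></Song>
  <Song ti="Untitled"/>
</songs>"""
root = ET.fromstring(DOC)

def titles(path):
    """Return the ti attribute of every Song selected by path."""
    return [song.get("ti") for song in root.findall(path)]

jazz = titles(".//Song[@genre='jazz']")   # attribute value test
has_genre = titles(".//Song[@genre]")     # existence of attribute
has_group = titles(".//Song[group]")      # existence of subelement
last_song = titles("./Song[last()]")      # positional predicate

# starts-with(.//group, "Led") emulated with a Python string method.
led = [s.get("ti") for s in root.iter("Song")
       if (s.findtext(".//group") or "").startswith("Led")]
```

A full XPath engine would evaluate all of these, including the string functions, directly in the path expression.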

XPath Axes: 1

Axes are addressable node sets based on the document tree and the current node
  Axes facilitate navigation of a tree
  Several are defined
  Mostly straightforward but some of them order the nodes as the reverse of others
  Some captured via special notation
    current, child, parent, attribute, ...

XPath Axes: 2

preceding: nodes that precede the start of the context node (not ancestors, attributes, namespace nodes)
following: nodes that follow the end of the context node (not descendants, attributes, namespace nodes)
preceding-sibling: preceding nodes that are children of the same parent, in reverse document order
following-sibling: following nodes that are children of the same parent

XPath Axes: 3

ancestor: proper ancestors, i.e., element nodes (other than the context node) that contain the context node, in reverse document order
descendant: proper descendants
ancestor-or-self: ancestors, including self (if it matches the next condition)
descendant-or-self: descendants, including self (if it matches the next condition)

XPath Axes: 4

Longer syntax: child::Song
Some captured via special notation
  self::*
  child::node(): node() matches all nodes
  preceding::*
  descendant::text()
  ancestor::Song
  descendant-or-self::node(), which abbreviates to //
  Compare /descendant-or-self::Song[1] (first descendant Song) and //Song[1] (first Songs (children of their parents))
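The difference between the first descendant Song and the first Song under each parent can be observed with a small document. Python's xml.etree.ElementTree does not support the descendant axis in its path syntax, so this sketch emulates "first descendant" with iter(), which visits elements in document order; the sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<lib>
  <album><Song ti="a1"/><Song ti="a2"/></album>
  <album><Song ti="b1"/><Song ti="b2"/></album>
</lib>"""
root = ET.fromstring(DOC)

# .//Song[1]: every Song that is the first Song child of its own parent.
first_per_parent = [s.get("ti") for s in root.findall(".//Song[1]")]

# Emulation of the first Song among all descendants, in document order.
first_overall = next(root.iter("Song")).get("ti")
```

The positional predicate applies per parent in the abbreviated form, which is why it selects one Song from each album.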

XPath Axes: 5

Each axis has a principal node kind
  attribute: attribute
  namespace: namespace
  All other axes: element
* matches whatever is the principal node kind of the current axis
node() matches all nodes

XPointer

Enables pointing to specific parts of documents
  Combines XPath with URLs
  URL to get to a document; XPath to walk down the document
  Can be used to formulate queries, e.g.,
    Song-URL#xpointer(//Song[@genre="jazz"])
    The part after # is a fragment identifier
  Fine-grained addressability enhances the Web architecture
High-level "conceptual" identification of node sets

XQuery

The official query language for XML, now a W3C recommendation, as version 1.0
Given a non-XML syntax, easier on the human eye than XML
An XML rendition, XQueryX, is in the works

XQuery Basic Paradigm

The basic paradigm mimics the SQL (SELECT–FROM–WHERE) clause

    for $x in doc('q2.xml')//Song
    where $x/@lg = 'en'
    return
      <English-Sgr name='{$x/Sgr/@name}' ti='{$x/@ti}'/>
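For readers more familiar with general-purpose languages, the same for/where/return shape can be imitated with a Python list comprehension over a parsed document. This is an analogy, not XQuery: the inline document below is a hypothetical stand-in for q2.xml, whose actual contents are not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for q2.xml.
Q2 = """<songs>
  <Song lg="en" ti="Kashmir"><Sgr name="Robert Plant"/></Song>
  <Song lg="fr" ti="La Vie en rose"><Sgr name="Edith Piaf"/></Song>
</songs>"""
doc = ET.fromstring(Q2)

results = [
    # return <English-Sgr name='{$x/Sgr/@name}' ti='{$x/@ti}'/>
    ET.Element("English-Sgr",
               {"name": x.find("Sgr").get("name"), "ti": x.get("ti")})
    for x in doc.iter("Song")    # for $x in doc(...)//Song
    if x.get("lg") == "en"       # where $x/@lg = 'en'
]
```

Each surviving binding of x yields one constructed element, just as the return clause is evaluated once per selected binding.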

FLWOR Expressions

Pronounced "flower"
  For: iterative binding of variables over range of values
  Let: one shot binding of variables over vector of values
  Where (optional)
  Order by (sort: optional)
  Return (required)
Need at least one of for or let

XQuery For Clause

The for clause
  Introduces one or more variables
  Generates possible bindings for each variable
  Acts as a mapping functor or iterator
In essence, all possible combinations of bindings are generated: like a Cartesian product in relational algebra
The bindings form an ordered list

XQuery Where Clause

The where clause
  Selects the combinations of bindings that are desired
  Behaves like the where clause in SQL, in essence producing a join based on the Cartesian product
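The interplay of for and where can be sketched in Python: two for variables generate every combination of bindings, and the where condition keeps only the matching ones, which amounts to a join. The data below is hypothetical, and the comments show the XQuery clause each line imitates.

```python
from itertools import product

# Hypothetical data: (title, group id) and (group id, group name).
songs = [("Kashmir", "g1"), ("So What", "g2"), ("Black Dog", "g1")]
groups = [("g1", "Led Zeppelin"), ("g2", "Miles Davis Sextet")]

# for $s in songs, $g in groups: all combinations, as an ordered list
combos = list(product(songs, groups))

# where $s/gid = $g/id: keep only the desired combinations (a join)
joined = [(title, name)
          for (title, gid), (group_id, name) in combos
          if gid == group_id]
```

The order of the result follows the order in which the bindings were generated, with the first variable varying slowest.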

XQuery Return Clause

The return clause
  Specifies what node-sets are returned based on the selected combinations of bindings

XQuery Let Clause

The let clause
  Like for, introduces one or more variables
  Like for, generates possible bindings for each variable
  Unlike for, generates the bindings as a list in one shot (no iteration)

XQuery Order By Clause

The order by clause
  Specifies how the vector of variable bindings is to be sorted before the return clause
  Sorting expressions can be nested by separating them with commas
  Variants allow specifying
    descending or ascending (default)
    empty greatest or empty least to accommodate empty elements
    stable sorts: stable order by
    collations: order by $t collation collation-URI (obscure, so skip)

XQuery Positional Variables

The for clause can be enhanced with a positional variable
  A positional variable captures the position of the main variable in the given for clause with respect to the expression from which the main variable is generated
  Introduce a positional variable via the at $var construct
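A rough Python analogue of a positional variable is enumerate with a starting index of 1, since XQuery positions start at 1. The sketch below uses hypothetical data and shows that the position refers to the sequence being iterated, not to the filtered result.

```python
titles = ["Kashmir", "So What", "Black Dog"]

# for $t at $i in $titles return ($i, $t)
numbered = [(i, t) for i, t in enumerate(titles, start=1)]

# Adding a where condition does not renumber the positions.
long_titles = [(i, t) for i, t in enumerate(titles, start=1) if len(t) > 7]
```

In the second comprehension the surviving title keeps position 3, its place in the original sequence.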

XQuery Declarations

The declare clause specifies things like
  Namespaces: declare namespace pref='value'
    Predefined prefixes include XML, XML Schema, XML Schema-Instance, XPath, and local
  Settings: declare boundary-space preserve (or strip)
  Default collation: a URI to be used for collation when no collation is specified

XQuery Quantification: 1

Two quantifiers: some and every
Each quantifier expression evaluates to true or false
Each quantifier introduces a bound variable, analogous to for

    for $x in ...
    where some $y in ...
          satisfies $y ... $x
    return ...

Here the second $x refers to the same variable as the first

XQuery Quantification: 2

A typical useful quantified expression would use variables that were introduced outside of its scope
  The order of evaluation is implementation-dependent: enables optimization
  If some bindings produce errors, this can matter
  some: trivially false if no variable bindings are found that satisfy it
  every: trivially true if no variable bindings are found
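The behavior of some and every, including the trivial cases for empty input, matches Python's built-in any and all. The sketch below wraps them in two hypothetical helpers, some and every, and uses a variable x introduced outside the quantified expression, as the slide suggests is typical.

```python
def some(items, satisfies):
    """some $y in items satisfies ...: true if at least one binding passes."""
    return any(satisfies(y) for y in items)

def every(items, satisfies):
    """every $y in items satisfies ...: true if no binding fails."""
    return all(satisfies(y) for y in items)

x = 3                      # variable introduced outside the quantifier
values = [1, 2, 5]

some_greater = some(values, lambda y: y > x)
every_greater = every(values, lambda y: y > x)
some_empty = some([], lambda y: y > x)     # no bindings: trivially false
every_empty = every([], lambda y: y > x)   # no bindings: trivially true
```

Note that any and all stop at the first deciding element, which is one permissible evaluation order among several.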

Variables: Scoping, Bound, and Free

for, let, some, and every introduce variables
The visibility of a variable follows typical scoping rules
A variable referenced within a scope is
  Bound if it is declared within the scope
  Free if it is not declared within the scope

    for $x in ...
    where some $x in ...
          satisfies ...
    return ...

Here the two $x refer to different variables

XQuery Conditionals

Like a classical if-then-else clause
The else is not optional
Empty sequences or node sets, written ( ), indicate that nothing is returned

XQuery Constructors

Braces { } to delimit expressions that are evaluated to generate the content to be included; analogous to macros
  document { }: to create a document node with the specified contents
  element { } { }: to create an element
    element foo { 'bar' }: creates <foo>bar</foo>
    element { 'foo' } { 'bar' }: also evaluates the name expression
  attribute { } { }: likewise
  text { body }: simpler, because anonymous

XQuery Effective Boolean Value

Analogous to Lisp, a general value can be treated as if it were a Boolean
  A xs:boolean value maps to itself
  Empty sequence maps to false
  Sequence whose first member is a node maps to true
  A numeric that is 0 or NaN maps to false, else true (negative numbers map to true)
  An empty string maps to false, others to true
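The effective boolean value rules can be written out as a small function. This is a sketch in Python that models an XQuery sequence as a list and a node as an xml.etree.ElementTree element; both modeling choices and the function name are assumptions. It follows the XQuery 1.0 definition, under which a numeric value is false only when it is 0 or NaN, so negative numbers are true.

```python
import math
import xml.etree.ElementTree as ET

def effective_boolean_value(seq):
    """Return the effective boolean value of seq, a list modeling a sequence."""
    if len(seq) == 0:
        return False                      # empty sequence
    first = seq[0]
    if isinstance(first, ET.Element):
        return True                       # first member is a node
    if len(seq) > 1:
        raise TypeError("no effective boolean value for this sequence")
    if isinstance(first, bool):           # check bool before int
        return first
    if isinstance(first, str):
        return len(first) > 0
    if isinstance(first, (int, float)):
        return not (first == 0 or math.isnan(first))
    raise TypeError("no effective boolean value for this item")
```

The bool check comes before the numeric check because Python treats bool as a subtype of int.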

Defining Functions

    declare function local:itemftop($t)
    {
      local:itemf($t, ())
    };

Here local: is the namespace of the query
The arguments are specified in parentheses
All of XQuery may be used within the defining braces
Such functions can be used in place of XPath expressions

Functions with Types

    declare function local:itemftop($t as element())
      as element()*
    {
      local:itemf($t, ())
    };

Return types as above
Also possible for parameters, but ignore such for this course

XSLT

A programming language with a functional flavor
  Specifies (stylesheet) transforms from documents to documents
  Can be included in a document (best not to)

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl"
        href="URL-to-xsl-sheet"?>
    <main-element>
      ...
    </main-element>

XQuery versus XSLT: 1

Competitors in some ways, but
  Share a basis in XPath
  Consequently share the same data model
  Same type systems (in the type-sensitive versions)
  XSLT got out first and has a sizable following, but XQuery has strong backing among vendors and researchers

XQuery versus XSLT: 2

XQuery is geared for querying databases
  Supported by major relational DBMS vendors in their XML offerings
  Supported by native XML DBMSs
  Offers superior coverage of processing joins
  Is more logical (like SQL) and potentially more optimizable
XSLT is geared for transforming documents
  Is functional rather than declarative
  Based on template matching

XQuery versus XSLT: 3

There is a bit of an arms race between them
  Types
    XSLT 1.0 didn't support types
    XQuery 1.0 does
    XSLT 2.0 does too
  XQuery presumably will be enhanced with capabilities to make updates, but XSLT could too

XSLT Stylesheets

A programming language that follows XML syntax
  Use the XSLT namespace (conventionally abbreviated xsl)
  Includes a large number of primitives, especially:
    <copy-of> (deep copy)
    <copy> (shallow copy)
    <value-of>
    <for-each select="...">
    <if test="...">
    <choose>

XSLT Templates: 1

A pattern to specify where the given transform should apply: an XPath expression

This match only works on the root:

    <xsl:template match="/">
      ...
    </xsl:template>

Example: Duplicate text in an element

    <xsl:template match="text()">
      <xsl:value-of select='.'/>
      <xsl:value-of select='.'/>
    </xsl:template>

XSLT Templates: 2

If no pattern is specified, apply recursively on et-children via <xsl:apply-templates/>
By default, if no other template matches, recursively apply to et-children of current node (ignores attributes) and to root:

    <xsl:template match="*|/">
      <xsl:apply-templates/>
    </xsl:template>

XSLT Templates: 3

Copy text node by default
Use an empty template to override the default:

    <xsl:template match="X"/>
    <!-- X = desired pattern -->

Confine ourselves to the examples discussed in class (ignore explicit priorities, for example)
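
The template rules above can be tried out from Java. The following is a minimal sketch, assuming the JDK's built-in javax.xml.transform processor; the class name TemplateDemo, the stylesheet, and the element names doc, a, b, and skip are hypothetical. It combines the text-duplicating template with an empty template that suppresses one element, and relies on the default rule to recurse through everything else.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TemplateDemo {
    // Hypothetical stylesheet: duplicate every text node, suppress <skip> subtrees.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='text()'>"
      + "<xsl:value-of select='.'/><xsl:value-of select='.'/>"
      + "</xsl:template>"
      + "<xsl:template match='skip'/>"   // empty template overrides the default rule
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the text output. */
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<doc><a>hi</a><skip>no</skip><b>yo</b></doc>"));
    }
}
```

No template matches doc, a, or b, so the default rule walks into their children; only the text() and skip templates change the output.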

XSLT Templates: 4

Templates can be named
Templates can have parameters
  Values for parameters are supplied at invocation
  A parameter with no supplied value and no declared default is the empty string
  Additional parameters are ignored

XSLT Variables

Explicitly declared
Values are typically node sets (any XPath value is allowed)
Convenient way to document templates

Document Object Model (DOM)

Basis for parsing XML, which provides a node-labeled tree in its API

Conceptually simple: traverse by requesting element, its attribute values, and its children
Processing program reflects document structure, as in recursive descent
Can edit documents
Inefficient for large documents: parses them first entirely even if a tiny part is needed
Can validate with respect to a schema

DOM Example

    // DOMParser is assumed to be a Xerces-style parser class
    // (e.g., org.apache.xerces.parsers.DOMParser); other types are from org.w3c.dom.
    DOMParser p = new DOMParser();
    p.parse("filename");
    Document d = p.getDocument();
    Element s = d.getDocumentElement();
    NodeList l = s.getElementsByTagName("member");
    Element m = (Element) l.item(0);
    String code = m.getAttribute("code");  // getAttribute returns a String
    NodeList kids = m.getChildNodes();
    Node kid = kids.item(0);
    // The cast assumes the first child is an element, not a text node
    String elemName = ((Element) kid).getTagName(); ...
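
The same traversal can be sketched with only the JDK's own parser. This assumes javax.xml.parsers.DocumentBuilderFactory instead of the DOMParser class used above; the class name DomDemo and the sample element names are hypothetical. Unlike the slide code, it skips whitespace text nodes before casting a child to Element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomDemo {
    /** Returns "code:firstChildElementName" for the first member element, or null if none. */
    static String firstMember(String xml) {
        try {
            // The whole document is parsed into a tree before any node is visited.
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList members = d.getDocumentElement().getElementsByTagName("member");
            if (members.getLength() == 0) return null;
            Element m = (Element) members.item(0);
            String code = m.getAttribute("code");
            NodeList kids = m.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                Node kid = kids.item(i);
                // Children may be text nodes (e.g., whitespace), so check before casting.
                if (kid.getNodeType() == Node.ELEMENT_NODE) {
                    return code + ":" + ((Element) kid).getTagName();
                }
            }
            return code + ":";
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMember("<team><member code='42'>\n <name>Ann</name></member></team>"));
    }
}
```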

Simple API for XML (SAX)

Parser generates a sequence of events: startElement, endElement, ...

Programmer implements these as callbacks
  More control for the programmer

Processing program does not necessarily reflect document structure

SAX Example: 1

    // Fields code, inProject, and buffer are assumed to be declared elsewhere in the class
    class MemberProcess extends DefaultHandler {
      public void startElement(String uri, String n,
                               String qName, Attributes attrs) {
        if (n.equals("member")) code = attrs.getValue("code");
        if (n.equals("project")) inProject = true;
        buffer.reset();
      }

      ...

SAX Example: 2

      ...

      public void endElement(String uri, String n, String qName) {
        if (n.equals("project")) inProject = false;
        if (n.equals("member") && !inProject)
          ... do something ...
      }
    }
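
A complete, runnable variant of the handler idea might look as follows. This is a sketch under assumptions: it uses the JDK's SAXParserFactory, matches on qName because the default factory is not namespace-aware, and records member codes at startElement rather than acting at endElement. The class name SaxDemo and the sample document are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxDemo {
    /** Collects the code attribute of every member that is not nested inside a project. */
    static List<String> memberCodesOutsideProjects(String xml) {
        final List<String> codes = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private int projectDepth = 0;  // the state must be tracked by hand

            @Override
            public void startElement(String uri, String local, String qName, Attributes attrs) {
                if (qName.equals("project")) projectDepth++;
                if (qName.equals("member") && projectDepth == 0) codes.add(attrs.getValue("code"));
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals("project")) projectDepth--;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(memberCodesOutsideProjects(
            "<org><member code='a'/><project><member code='b'/></project><member code='c'/></org>"));
    }
}
```

The handler never sees a tree; the projectDepth counter is what stands in for the document structure.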

SAX Filters

A component that mediates between an XMLReader (parser) and a client

A filter would present a modified set of events to the client
Typical uses:
  Make minor modifications to the structure
  Search for patterns efficiently
    What kinds of patterns, though?
  Ideally modularize treatment of different event patterns
In general, a filter can alter the structure of the document
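
As an illustration of a filter that alters structure, here is a small sketch built on org.xml.sax.helpers.XMLFilterImpl. The class names FilterDemo and DropDrafts and the element name draft are hypothetical; the filter hides every draft subtree, so the client handler only sees the remaining events.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class FilterDemo {
    /** Filter that swallows all events inside a draft element. */
    static final class DropDrafts extends XMLFilterImpl {
        private int depth = 0;  // > 0 while inside a suppressed subtree

        DropDrafts(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String u, String l, String q, Attributes a) throws SAXException {
            if (depth > 0 || q.equals("draft")) { depth++; return; }
            super.startElement(u, l, q, a);
        }

        @Override
        public void endElement(String u, String l, String q) throws SAXException {
            if (depth > 0) { depth--; return; }
            super.endElement(u, l, q);
        }

        @Override
        public void characters(char[] ch, int start, int len) throws SAXException {
            if (depth == 0) super.characters(ch, start, len);
        }
    }

    /** Returns the element names that reach the client behind the filter. */
    static List<String> elementNamesSeenByClient(String xml) {
        final List<String> seen = new ArrayList<>();
        try {
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLReader filter = new DropDrafts(parser);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    seen.add(q);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(elementNamesSeenByClient("<doc><p/><draft><p/></draft><q/></doc>"));
    }
}
```

Because a filter is itself an XMLReader, several filters can be chained, each handling one event pattern.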

Creating XML from Legacy Sources

Often need to read in information from non-XML sources

From relational databases
  Easier because of structure
  Supported by vendor tools

From flat files, CSV documents, HTML Web pages
  Bit of a black art: lots of heuristics
  Tools based on regular expressions

Programming with XML

Limitations
  Difficult to construct and maintain documents
  Internal structures are cumbersome; hence the criticisms of DOM parsers

Emerging approaches provide superior binding from XML to
  Programming languages
  Relational databases

Check pull-based versus push-based parsers
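
For contrast with push-based SAX, a pull-based parser lets the program ask for the next event and stop whenever it likes. The sketch below assumes the JDK's StAX API (javax.xml.stream); the class name PullDemo and the element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class PullDemo {
    /** Returns the text of the first name element, reading no further than needed. */
    static String firstName(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                // The program drives the parser by pulling one event at a time.
                if (r.next() == XMLStreamConstants.START_ELEMENT
                        && r.getLocalName().equals("name")) {
                    return r.getElementText();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstName(
            "<team><member><name>Ann</name></member><member><name>Bob</name></member></team>"));
    }
}
```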

Module 6: XML Storage

The major aspects of storing XML include
  XML Keys
  Concepts: Data and Document Centrism
  Storage
  Mapping to relational schemas
  SQL/XML

Integrity Constraints in XML

Entity: xsd:unique and xsd:key
Referential: xsd:keyref
Data type: XML Schema specifications
Value: Solve custom queries using XPath or XQuery

Entity and referential constraints are based on XPath

XML Keys: 1

Keys serve as generalized identifiers, and are captured via XML Schema elements:

Unique: candidate key
  The selected elements yield unique field tuples
Key: primary key, which means candidate key plus
  The tuples exist for each selected element
Keyref: foreign key
  Each tuple of fields of a selected element corresponds to an element in the referenced key

XML Keys: 2

Two subelements built using restricted application of XPath from within XML Schema

Selector: specify a set of objects: this is the scope over which uniqueness applies
Field: specify what is unique for each member of the above set: this is the identifier within the targeted scope
  Multiple fields are treated as ordered to produce a tuple of values for each member of the set
  The order matters for matching keyref to key

Selector XPath Expression

A selector finds descendant elements of the context node

The sublanguage of XPath used allows
  Children via ./child or ./* or child
  Descendants via .// (not within a path)
  Choice via |

The subset of XPath used does not allow
  Parents or ancestors
  text()
  Attributes
  Fancy axes such as preceding, preceding-sibling, ...

Field XPath Expression

A field finds a unique descendant element (simple type only) or attribute of the context node

The subset of XPath used allows
  Children via ./child or ./*
  Descendants via .// (not within a path)
  Choice via |
  Attributes via @attribute or @*

The subset of XPath used does not allow
  Parents or ancestors
  text()
  Fancy axes such as preceding, ...

An element yields its text()

XML Foreign Keys

    <keyref name="..." refer="primary-key-name">
      <selector xpath="..."/>
      <field xpath="..."/>
    </keyref>

Relational requirement: foreign keys don’t have to be unique or non-null, but if one component is null, then all components must be null.
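
To see key and keyref enforced, the following sketch validates small documents with the JDK's javax.xml.validation API. The schema, the class name KeyrefDemo, and the element names school, student, and enrollment are all hypothetical: student/@id is declared as a key, and enrollment/@student as a keyref that refers to it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class KeyrefDemo {
    // Hypothetical schema; the key and keyref are attached to the school element.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='school'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='student' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:attribute name='id' type='xs:string' use='required'/></xs:complexType>"
      + "</xs:element>"
      + "<xs:element name='enrollment' minOccurs='0' maxOccurs='unbounded'>"
      + "<xs:complexType><xs:attribute name='student' type='xs:string' use='required'/></xs:complexType>"
      + "</xs:element>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:key name='studentKey'>"
      + "<xs:selector xpath='student'/><xs:field xpath='@id'/>"
      + "</xs:key>"
      + "<xs:keyref name='enrollmentRef' refer='studentKey'>"
      + "<xs:selector xpath='enrollment'/><xs:field xpath='@student'/>"
      + "</xs:keyref>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if the document satisfies the schema, including its key and keyref. */
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;  // schema violation or malformed XML
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<school><student id='s1'/><enrollment student='s1'/></school>"));
        System.out.println(isValid("<school><student id='s1'/><enrollment student='s9'/></school>"));
    }
}
```

The selector paths are relative to school, the element on which the constraints are declared.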

Placing Keys in Schemas

Keys are associated with elements, not with types
Thus the . in a key selector expression is bound
Could have been (but are not) associated with types where the . could be bound to whichever element was an instance of the type

Data-Centric View: 1

    <relation name='Student'>
      <tuple>
        <attr1>V11</attr1>
        ...
        <attrn>V1n</attrn>
      </tuple>
      ...
    </relation>

Extract and store via mapping to DB model
Regular, homogeneous structure

Data-Centric View: 2

Ideally, no mixed content: an element contains text or subelements, not both
Any mixed content would be templatic, i.e.,
  Generated from a database via suitable transformations
  Generated via a form that a user or an application fills out
Order among siblings likely irrelevant (as is order among relational columns)

Expensive if documents are repeatedly parsed and instantiated

Document-Centric View

Irregular: doesn’t map well to a relation
Heterogeneous data
Depends on the entire document for application-specific meaning

Data- vs Document-Centric Views

Data-centric: data is the main thing
  XML simply renders the data for transport
  Store as data
  Convert to/from XML as needed
  The structure is important

Document-centric: documents are the main thing
  Documents are complex (e.g., design documents) and irregular
  Store documents wherever
  Use DBMS where it facilitates performing important searches

Storing Documents in Databases

Use character large objects (CLOBs) within DB: searchable only as text
Store paths to external files containing docs
  Simple, but no support for integrity
Use some structured elements for easy search as well as unstructured CLOBs or files
Heterogeneity complicates mappings to typed OO programming languages

Storing documents in their entirety may sometimes be necessary for external reasons, such as regulatory compliance

Database Features

Storage: schema definition language
Querying: query language
Transactions: concurrency
Recovery

Potential DBMS Types for XML: 1

Object-oriented
  Nice structure
  Intellectual basis of many XML concepts, including schema representations and path expressions
  Not highly popular in standalone products

Relational
  Limited structuring ability (1NF: each cell is atomic)
  Extremely popular
  Well optimized for flat queries

Potential DBMS Types for XML: 2

Object relational: hybrids of above
  Not highly popular in standalone products

Custom XML stores or native XML databases
  Emerging ideas: may lack core database features (e.g., recovery, ...)
  Enable fancier content management systems
  Leading open source products:
    Apache Xindice (server; XPath)
    Berkeley DB XML (libraries; XQuery)

XML to Relational Databases

Using large objects
Flatten XML structures
Referring to external files

Recall that for a relational schema, its entire set of attributes is necessarily a superkey

Artificial Representation: Repetitious

Capturing an object hierarchy in a relation
Imagine an artificial identifier for each node
Construct a relation with three main relational attributes or columns
  One column for the identifier
  One column for the name of an attribute (i.e., element name)
  One column for the value (assumes the value would fit into the same relational type: potentially this could be CLOB or BLOB)

Artificial Representation: Graph

Use four generic relations to represent a graph

Vertices
  Element ID, Name
Contents
  Element ID, Text, number (to allow multiple text nodes)
Attributes
  ID, Attribute name, Attribute value
Edges
  Source ID, Target ID

Better typed than repetitious style because this has no nulls
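
The four relations can be illustrated by walking a DOM tree and emitting one row per vertex, text node, attribute, and parent-child edge. This is only a sketch: the class name GraphShredder is hypothetical, rows are kept as comma-separated strings in in-memory lists rather than in database tables, and identifiers are assigned in document order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class GraphShredder {
    final List<String> vertices = new ArrayList<>();    // "elementId,name"
    final List<String> contents = new ArrayList<>();    // "elementId,number,text"
    final List<String> attributes = new ArrayList<>();  // "elementId,attrName,attrValue"
    final List<String> edges = new ArrayList<>();       // "sourceId,targetId"
    private int nextId = 1;                              // artificial identifiers

    static GraphShredder fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            GraphShredder g = new GraphShredder();
            g.shred(root);
            return g;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Adds rows for the element and its subtree; returns the element's identifier. */
    private int shred(Element e) {
        int id = nextId++;
        vertices.add(id + "," + e.getTagName());
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node a = attrs.item(i);
            attributes.add(id + "," + a.getNodeName() + "," + a.getNodeValue());
        }
        int textNo = 0;  // numbers the text nodes of this element
        for (Node kid = e.getFirstChild(); kid != null; kid = kid.getNextSibling()) {
            if (kid.getNodeType() == Node.ELEMENT_NODE) {
                int childId = shred((Element) kid);
                edges.add(id + "," + childId);
            } else if (kid.getNodeType() == Node.TEXT_NODE
                    && !kid.getNodeValue().trim().isEmpty()) {
                contents.add(id + "," + (++textNo) + "," + kid.getNodeValue().trim());
            }
        }
        return id;
    }

    public static void main(String[] args) {
        GraphShredder g = fromXml("<a x='1'>t<b>u</b></a>");
        System.out.println(g.vertices + " " + g.contents + " " + g.attributes + " " + g.edges);
    }
}
```

Every row is fully populated, which is the "no nulls" property noted above.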

Shallow Representation: 1

The “natural” approaches are based on tuple-generating elements (TGEs)

Choose one XML element type as the TGE
  TGE corresponds to a tuple
  The key is based on an ID attribute or text of the TGE
A relational attribute (column) for each subelement or attribute
Easiest if there is an attribute for IDs and there are no other attributes

Shallow Representation: 2

Consequences
  Nulls for missing subelements can proliferate
  Subelements with structure (subelements or attributes) aren’t represented well
  Ancestors cannot be searched for
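
A shallow mapping can be sketched as follows: each TGE element becomes one row, keyed by its id attribute, with one column per named child element. The class name ShallowMapper, the attribute name id, and the sample element names are assumptions for illustration; rows are plain maps rather than database tuples. A missing subelement shows up as null.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ShallowMapper {
    /** One row per TGE element: its id attribute plus one column per listed child element. */
    static List<Map<String, String>> rows(String xml, String tge, String... columns) {
        List<Map<String, String>> rows = new ArrayList<>();
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList tges = d.getElementsByTagName(tge);
            for (int i = 0; i < tges.getLength(); i++) {
                Element e = (Element) tges.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                row.put("id", e.getAttribute("id"));
                for (String c : columns) {
                    row.put(c, childText(e, c));  // null when the subelement is missing
                }
                rows.add(row);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return rows;
    }

    /** Text of the first direct child element with the given name, or null if absent. */
    private static String childText(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(name)) {
                return n.getTextContent();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(rows(
            "<class><student id='s1'><name>Ann</name></student>"
          + "<student id='s2'><name>Bob</name><email>b@x.org</email></student></class>",
            "student", "name", "email"));
    }
}
```

Only direct children are mapped, so nested structure and ancestors are lost, as the consequences above describe.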

Deep Representation

Also called shredding an XML document
Choose a TGE as before
A column for each descendant, except that
  Can skip wrapper elements (no text, only subelements), but must reconstruct them to create an XML document

Consequences
  Nulls for missing subelements
  Lots of columns in a relation
  Ancestors cannot be searched for
  Loses structural information

Representing Ancestors

Ancestors are the elements that are above the scope of the given TGE

Choose a TGE as before
A column for each descendant as before
A column for each ancestor (that needs to be searched)
  Appropriate attributes or text fields to make the search worthwhile

Consequences
  Nulls for missing subelements
  Lots of columns in a relation

Generalized TGE

Each element is a TGE, yielding a different relation
A column for each terminal child: attribute or text
A column for each ancestor to capture the entire path from root to this node
  Must promote uniquifying content so that each TGE yields unique tuples

Consequences
  Nulls for missing subelements
  Lots of relations
  Lots of columns in a relation

Variations in Structure

Create separate relations for each variant
Consequences
  Lots of possible structures to store
  Queries would not be succinct
  Acceptable only if we know in advance that the number of variants is small and the data in each is substantial

Semistructured Representation

Create two (sets of) relations
Specific part: one (or more) relations based on one of the natural approaches
Generic part: one relation based on an artificial approach

Thoughtful Design

The above approaches are not sensitive to the meaning and motivation behind the XML structure
Understand the XML structure via a conceptual model (in terms of entities and relationships)
Avoid unnecessary nesting in the XML structure, if possible
Design a corresponding relational schema by hand

This is not always possible, though

Evaluation

How does the above work for data-centric and document-centric views?

Compare with respect to
  Document structure
  Document “roundtripping” (compare &, &amp;, #a39)
  Normalization
    Are the documents unique?
    Are the documents unique up to “isomorphism”?

Schema Evolution

A big problem for databases in practical settings
For relational schemas, certain kinds of updates are simpler than others
Can have consequences on optimization
XML schemas can be evolved by using XSLT to map old data to new schema

From Relations to XML

Mapping a relation schema (set of relations plus functional dependencies) to an XML document

Map relation R to an element RE with key or unique constraints
Map column C of R to an attribute of RE or equivalently a child element with just text
Map relation S with a foreign key to R to
  A child element SE of RE (omit foreign key content from SE): works if only one such RE for SE; OR
  An element SE that includes the foreign key content, and includes a keyref to RE

Null Value: 1

A special value, not in any domain, but combinable with any domain

Need?
Possible meanings
  Not applicable
  Unknown: missing
  Questionable existence
  Absent (known but absent)

Hazards of null values?

Null Value: 2

XML Schema enables developing custom null values for each domain

Create an arbitrary value that
  Matches the given data type
  Is not a valid value of the domain, however
Design applications to understand specific restricted type

XML Schema Null

<elem/> (equivalently <elem></elem>) means that the element contains the empty string
  This is not null
xsi defines the attribute nil
  Used as <elem xsi:nil="true"/> if elem is declared nillable (via nillable="true")
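
An application can tell the two cases apart by looking for the nil attribute in the XML Schema instance namespace. The sketch below assumes a namespace-aware JDK DOM parser and does no schema validation, so it does not check that the element was declared nillable; the class name NilDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NilDemo {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

    /** Classifies the document element as "nil", "empty", or "value". */
    static String classify(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);  // needed to look up xsi:nil by namespace
            Element e = f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            String nil = e.getAttributeNS(XSI, "nil");  // "" when the attribute is absent
            if (nil.equals("true") || nil.equals("1")) return "nil";
            return e.getTextContent().isEmpty() ? "empty" : "value";
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("<elem/>"));
        System.out.println(classify(
            "<elem xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:nil='true'/>"));
    }
}
```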

Quick Look at SQL

Structured Query Language
Data Definition Language: CREATE TABLE
Data Manipulation Language: SELECT, INSERT, DELETE, UPDATE
Basic paradigm for SELECT

    SELECT t1.column-1, t1.column-2 ... tm.column-n
    FROM table-1 t1, table-m tm
    WHERE t1.column-3 = t4.column-4 AND ...