xml schemas in oracle xml db ravi murthy sandeepan banerjee oracle corporation

30
XML Schemas in Oracle XML DB Ravi Murthy Sandeepan Banerjee Oracle Corporation

Upload: primrose-french

Post on 27-Dec-2015

227 views

Category:

Documents


1 download

TRANSCRIPT

XML Schemas in Oracle XML DB

Ravi MurthySandeepan Banerjee

Oracle Corporation

Talk Overview

Why XML in DB ? Background

– XML Schema– Object-Relational DB

XML Schema Support in Oracle XML DB – XML Storage– XML Query – XML Update

Future directions and Conclusions

Why XML in DB ?

Trends in industry– More XML content being generated– More applications dealing with (structured) data and (semi-

structured) documents– Need for unified management for all kinds of data– XML Schema language provides strong typing

Many benefits to storing XML in DB– Better Queriability– Optimized Updates– Stronger Validation

Fidelity of XML very important

Oracle XML DB Overview

Native XML data type Supports W3C XML Schema data model XML/SQL Duality Support for XML standards

– Namespaces, XPath, XSLT, SQL/XML

High performance XML repository– Hierarchical File System Abstraction– Protocols : FTP, HTTP/WebDAV– Access Control (ACL)– Versioning

XML Schema W3C Recommendation Large number of built-in scalar data types Simple type definitions

– Length, pattern and other constraints Complex type definitions

– sequence / choice / all– Mixed content

Extensible– Derivation by extension and restriction– Substitution Groups– Wildcards

Gradually replacing the traditional DTD Commonly used as a validation mechanism

Object Relational DB Basics

Object types Collection types Object References LOBs

XML Schema Construct

Object Construct

Complex type Object type

Local complex type Embedded object type

Complex type with maxOccurs > 1

Collection type

Derived complex type

Subtype

XMLType Native data type for XML Used to define columns of tables and views,

arguments to stored procedures, etc. XML specific methods and operators for

– Querying and extracting XML using XPath– Transforming XML using XSLT– Validating XML using XML Schema

Multiple Storage Options– Unstructured Storage in CLOB– Structured Storage into object-relational rows and columns– Hybrid Storage

Maintains application transparency to physical storage choice

XML DB and XML Schema

XML Schema controls all aspects of processing – Storage mappings – In-memory representations– Language Bindings

XML Schema Registration Process– Associates XML Schema with URL– Generates Object types– Creates default tables

XMLType column can be constrained to a global element of registered schema

XML Schema Example

<schema targetNamespace=“http://www.oracle.com/PO.xsd” xmlns:po=“http://www.oracle.com/PO.xsd”elementFormDefault=”qualified”xmlns="http://www.w3.org/2001/XMLSchema">

<complexType name="PurchaseOrderType"> <sequence>

<element name="PONum" type="decimal"/> <element name="Company"> <simpleType>

<restriction base="string"> <maxLength value="100"/> </restriction>

</simpleType> </element>

XML Schema Example (contd)

<element name="Item" maxOccurs="1000"> <complexType> <sequence>

<element name="Part"> <simpleType> <restriction base="string">

<maxLength value="1000"/> </restriction> </simpleType> </element> <element name="Price" type="float"/>

</sequence> </complexType> </element>

</sequence> </complexType>

<element name="PurchaseOrder" type="po:PurchaseOrderType"/> </schema>

Generated Object Types

TYPE "Item_T" (part varchar2(1000), price number);

TYPE "Item_COLL" AS VARRAY(1000) OF "Item_T";

TYPE "PurchaseOrderType_T" (ponum number, company varchar2(100), item Item_COLL);

Generated Tables

TABLE po_tab

OF XMLTYPE

XMLSCHEMA “http://www.oracle.com/PO.xsd"

ELEMENT "PurchaseOrder“

VARRAY(item) STORE AS item_tab;

XML Document Example<PurchaseOrder xmlns="http://www.oracle.com/PO.xsd"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.oracle.com/PO.xsd

http://www.oracle.com/PO.xsd"> <PONum>1001</PONum> <Company>Oracle Corp</Company> <Item>

<Part>9i Doc Set</Part> <Price>2550</Price>

</Item> <Item>

<Part>8i Doc Set</Part> <Price>350</Price>

</Item> </PurchaseOrder>

XML Storage – PO_TAB and LINE_TAB

Row ID ponum company

1 1001 Oracle Corp

Parent Row ID Array Index part price

1 1 9i Doc Set 2250

1 2 8i Doc Set 350

Structured Storage Attributes and single-valued elements

– Stored as columns in single row– SQL data types correspond to XML Schema types– SQL constraints correspond to XML Schema constraints

Multi-valued elements (collections) stored in separate nested tables

– One row per item in collection– Nested table row stores parent key– Array Index column stores the position information

Number column – uses full range of floating points Elements inserted in the middle of the collection get

(previous_array_index + next_array_index)/2 Supports multiple levels of nesting

– Embedded object types– Embedded collection types with multiple nested tables

DOM Fidelity

Structured storage guarantees DOM fidelity– No whitespace fidelity

System binary attribute in object types– SYS_XDBPD$

PD attribute stores non-relational information– Ordering of elements – Comments – Processing Instructions– Namespace declarations – Prefix information – Mixed content – text nodes that are intermixed with

elements are stored in the system column

XDB Attributes in XML Schema

Mapping details captured as new attributes within the XML Schema

– Oracle attributes use namespace : http://xmlns.oracle.com/xdb

Input XML Schema does not need to be annotated– Default assumptions for all XDB attributes

Input XML Schema can explicitly specify XDB attributes

– SQLName : Name of column or type attribute– SQLType : Name of object type – SQLCollType : Name of collection type– MaintainOrder : Permits turning off order maintenance

Hybrid Storage of XML

Two ends of storage spectrum– Store entire content in a single LOB– Full shredding of XML data

Hybrid Storage Options– Combination of shredding and LOB storage– xdb:SQLType=“CLOB” applied to <complexType>– CLOB storage ideal for fragments that are not intended for

query or partial updates– Example : XHTML content embedded within resource

descriptors (metadata)

Querying XMLType

XPath based operators existsNode

– Boolean operator– Checks for existence of node identified by XPath

extract– Extracts a fragment identified by XPath

extractValue– Retrieves the raw value of leaf node identified by XPath

Namespace Aware ANSI SQL/XML Standards effort

Query Rewrite

Automatic rewrite of XPaths during query compilation

Rewritten query directly accesses underlying relational columns

Introduces joins with nested tables Enables use of indexes

Query Rewrite - Example

Original QuerySELECT extractValue(value(p),

'/PurchaseOrder/PONum') FROM po_tab pWHERE existsNode(value(p),

'/PurchaseOrder[Company=Oracle]') = 1; Rewritten Query

SELECT p.ponum FROM po_tab pWHERE p.company = 'Oracle';

Query Rewrite Examples

XPath Expression Rewrite LogicSimple XPath expressions /PurchaseOrder/Company/PurchaseOrder/@poDate

Traversals of object attributes1 or more levels

Collection traversals/PurchaseOrder/LineItem/Part

Joins with appropriate nested tables

Predicates[Company=“Oracle”]

Relational predicates

List indexesLineitem[1]

Operator to access i-th item of collectionUses index on array_index column

Updating XMLType

Updating entire XML document Partial Update

– Uses UpdateXML() operator– XPath identifies element or attribute to be

updated– New value for the updated node is specified– Rewritten to directly update underlying column(s)

Similar mechanisms for – Inserting new nodes– Deleting node(s)

Update Rewrite - Example

Original StatementUPDATE po_tab p SET value(p) = updatexml(value(p),

'/PurchaseOrder/PONum/text()',

9999)WHERE existsNode(value(p),

‘/PurchaseOrder[Company=Oracle]’) = 1; Rewritten Statement

UPDATE po_tab p SET p.ponum = 9999WHERE p.Company = ‘Oracle’;

XMLType Views

Provide an XML view of relational data Good evolutionary strategy

– New XML apps on XML abstraction of existing data

SQL/XML Standard operators used to generate XML Views can generate schema based XML Insert / Update / Delete operations via ‘Instead of’

Triggers Queries over XML views are rewritten to directly

access underlying relational columns

XMLType View - Example

CREATE VIEW po_view of XMLTYPEXMLSCHEMA "po.xsd" ELEMENT "PurchaseOrder" AS SELECT

XMLElement("PurchaseOrder", XMLForest(p.ponum "PONum",

p.company "Company"), (SELECT XMLAGG(

XMLElement("Item", XMLForest(i.part "Part",

i.price "Price")) FROM items_rel_tab i WHERE i.po_id = p.id))

FROM po_rel_tab p;

Complex XML Schemas

Cyclic Definitions– Object types created with references– Table row contains reference to other rows stored in same

or different table– Keys to document and parent rows stored in nested rows

Complex type derivation – Mapped to object type inheritance

Wildcards– Mapped to CLOB attributes

Indexing XMLType

Multiple index types B-Tree and bitmap indexes Function-based indexes

– Create index on specific XPath expressions

Text indexes– Inverted lists provide section-based search– Also support keyword based search within textual

content

Future Directions

Optimize XML Query support– Translate XML Query to SQL– Works against distributed databases using

Oracle connectivity solutions

Optimize support for highly variable documents

– New storage and indexing techniques

Conclusions

Oracle XML DB is a robust platform for building XML applications

Strong support for XML Schema based storage, query, indexing and updates

Structured storage maintains XML fidelity Support for standards such as SQL/XML,

XPath and XSLT Powerful XML/SQL Duality