
Page 1: XML Processing Moves Forward XSLT 2.0 and XQuery 1.0 Michael Kay Prague 2005

XML Processing Moves Forward XSLT 2.0 and XQuery 1.0

Michael Kay

Prague 2005

Page 2

About me

• Database background

• Started using XML in 1998 for content management applications

• Author of XSLT Programmer’s Reference

• Developer of Saxon XSLT processor

• Member of W3C XSL and XQuery Working Groups

• Founded SAXONICA March 2004

Page 3

Contents

• A tour of the new specs

• What’s significant about XSLT 2.0

• A quick demo

• Why XQuery?

Page 4

The QT Specification Family

[Diagram of the specification family: XSLT 2.0, XQuery 1.0, XPath 2.0, Data Model, XML Schema, Functions and Operators]

Page 5

Standards maturity

[Chart: maturity plotted against time, with the CR and REC levels marked. Specifications shown: XML; XML Schema; XSLT 1.0 and XPath 1.0; XQuery, XSLT 2.0 and XPath 2.0]

Page 6

A family of standards

[Diagram: XML Schema, XPath 1.0, XPath 2.0, XQuery 1.0, XSLT 1.0, XSLT 2.0]

Page 7

XSLT and XQuery

[Diagram: XSLT and XQuery placed between two poles labelled Documents and Data]

Page 8

What’s new in XSLT 2.0

• New Processing Model

• Major Features
  – grouping
  – regular expressions
  – functions
  – schema support

• Many “minor” features

Page 9

Some “minor” features

XSLT 2.0

• Temporary trees

• Multiple Output Files

• Format date/time

• Tunnel parameters

• Declared variable types

• Multi-mode templates

• xsl:next-match

• conditional compilation

• XHTML serialization

• xsl:namespace

• separator=","

• character maps

XPath 2.0

• Sequences

• if..then..else

• for $x in X return f($x)

• some/every

• except/intersect

• $n is $m

Function library

• String functions

• Regex functions

• Date/time arithmetic

• URI handling

• min(), max(), avg()
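Several of the features listed above can be seen together in one small stylesheet. This is only a sketch: the variable `prices` and the output element names are invented for illustration, and any well-formed XML document will do as input because the template only matches the document node.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:template match="/">
    <!-- Declared variable type; the value is a sequence of integers.
         "prices" is a made-up name for this illustration. -->
    <xsl:variable name="prices" as="xs:integer*" select="(12, 30, 7)"/>
    <examples>
      <!-- "for ... return" maps an expression over a sequence;
           separator="," joins the resulting items in the output -->
      <doubled>
        <xsl:value-of select="for $p in $prices return $p * 2" separator=","/>
      </doubled>
      <!-- if..then..else is an expression, so it can sit inside select -->
      <size>
        <xsl:value-of select="if (max($prices) gt 20) then 'large' else 'small'"/>
      </size>
      <!-- every ... satisfies is a quantified expression yielding a boolean -->
      <all-positive>
        <xsl:value-of select="every $p in $prices satisfies $p gt 0"/>
      </all-positive>
      <!-- avg() comes from the new function library -->
      <average>
        <xsl:value-of select="avg($prices)"/>
      </average>
    </examples>
  </xsl:template>

</xsl:stylesheet>
```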

Page 10

Handling unstructured text

• unparsed-text() function
  – reads a text file into a string

• tokenize() function
  – splits a string into substrings

• xsl:analyze-string
  – parses a string and generates markup
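The three facilities above combine naturally when turning a plain text file into XML. The sketch below assumes a hypothetical file `people.csv` holding one `name, city` record per line; the file name, the `input-uri` parameter, the `main` template and the output element names are all illustrative. The stylesheet would be started at the named template rather than with a source document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Hypothetical input: a text file with one "name, city" record per line -->
  <xsl:param name="input-uri" select="'people.csv'"/>

  <xsl:template name="main">
    <people>
      <!-- unparsed-text() reads the whole file into one string;
           tokenize() splits it at line endings; the predicate drops blank lines -->
      <xsl:for-each select="tokenize(unparsed-text($input-uri, 'UTF-8'), '\r?\n')[normalize-space()]">
        <!-- xsl:analyze-string applies a regex to each line and builds markup
             from the captured groups -->
        <xsl:analyze-string select="." regex="^([^,]+),\s?(.+)$">
          <xsl:matching-substring>
            <person>
              <name><xsl:value-of select="regex-group(1)"/></name>
              <city><xsl:value-of select="regex-group(2)"/></city>
            </person>
          </xsl:matching-substring>
          <!-- Lines that do not fit the pattern are kept rather than lost -->
          <xsl:non-matching-substring>
            <unparsed><xsl:value-of select="."/></unparsed>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:for-each>
    </people>
  </xsl:template>

</xsl:stylesheet>
```

Note that the `regex` attribute is an attribute value template, so a regex containing curly braces would need them doubled; this pattern avoids braces for that reason.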

Page 11

Regular expression functions

• matches()
  tests whether a string matches a regex
  if (matches($in, '[A-Z]{3}[0-9]{3}')) ...

• tokenize()
  splits a string into substrings; the regex matches the separator
  for $s in tokenize($in, ',\s?') ...

• replace()
  replaces every occurrence of a match
  replace($in, '\s', '%20')
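The three functions can be exercised together in a short stylesheet. This is a sketch only: the function `f:valid-codes`, its namespace URI and the sample strings are invented for illustration. The pattern is anchored with `^` and `$` because matches() otherwise succeeds when any substring of the input matches.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: split a comma-separated list and keep only the
       items of the form three capitals followed by three digits -->
  <xsl:function name="f:valid-codes" as="xs:string*">
    <xsl:param name="in" as="xs:string"/>
    <xsl:sequence
        select="tokenize($in, ',\s?')[matches(., '^[A-Z]{3}[0-9]{3}$')]"/>
  </xsl:function>

  <xsl:template match="/">
    <codes>
      <!-- Of the three sample items, only ABC123 and DEF456 fit the pattern -->
      <xsl:for-each select="f:valid-codes('ABC123, xyz, DEF456')">
        <code><xsl:value-of select="."/></code>
      </xsl:for-each>
      <!-- replace() substitutes every whitespace character with %20 -->
      <uri><xsl:value-of select="replace('my file.xml', '\s', '%20')"/></uri>
    </codes>
  </xsl:template>

</xsl:stylesheet>
```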

Page 12

Grouping

• Takes any sequence as input

• Divides the items into groups

• Applies processing to each group

group-by: items with a common value for a grouping key

group-adjacent: adjacent items with a common grouping key

group-starting-with: pattern to match first item in each group

group-ending-with: pattern to match last item in each group
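As an illustration of the positional variants, group-starting-with can add structure to a flat run of headings and paragraphs. The sketch below assumes an input whose `body` element directly contains a sequence of `h2` and `p` elements; the output element names `section` and `title` are made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <body> holding a flat sequence of <h2> and <p> elements -->
  <xsl:template match="body">
    <body>
      <!-- Every h2 opens a new group; the elements that follow it, up to
           the next h2, belong to the same group -->
      <xsl:for-each-group select="*" group-starting-with="h2">
        <section>
          <title><xsl:value-of select="current-group()[self::h2]"/></title>
          <xsl:copy-of select="current-group()[self::p]"/>
        </section>
      </xsl:for-each-group>
    </body>
  </xsl:template>

</xsl:stylesheet>
```

If the input has paragraphs before the first `h2`, they form a leading group with an empty title.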

Page 13

Grouping by Value

<xsl:for-each-group select="book" group-by="publisher">
  <!-- Output the groups in order of publisher -->
  <xsl:sort select="current-grouping-key()"/>
  <h2>Publisher: <xsl:value-of select="current-grouping-key()"/></h2>
  <!-- Within each group, list the books in order of title -->
  <xsl:for-each select="current-group()">
    <xsl:sort select="title"/>
    <p>author: <xsl:value-of select="author"/></p>
    <p>title: <xsl:value-of select="title"/></p>
  </xsl:for-each>
</xsl:for-each-group>

Page 14

User-defined Functions

• Written like named templates

• Called from XPath

• Return a result

<xsl:function name="ged:date-to-ISO" as="xs:date">
  <xsl:param name="in" as="ged:date"/>
  <!-- Input is read as DD MMM YYYY: day at 1-2, month name at 4-6, year at 8-11 -->
  <xsl:sequence select="xs:date(concat(
      substring($in, 8, 4), '-',
      format-number(index-of(('JAN', 'FEB', ...), substring($in, 4, 3)), '00'), '-',
      substring($in, 1, 2)))"/>
</xsl:function>

<xsl:sort select="ged:date-to-ISO(@birth-date)"/>

Page 15

XQuery 1.0

• Designed to query XML databases

• Also handles in-memory transformations

• Well supported by database vendors

Page 16

XQuery Example: Join two tables

xquery version "1.0";

<results> {
  for $p in doc("auction.xml")/site/people/person
  let $a := for $t in doc("auction.xml")/site/closed_auctions/closed_auction
            where $t/buyer/@person = $p/@id
            return $t
  return <item person="{$p/name}"> {count($a)} </item>
} </results>

XMark Q8

Page 17

XSLT Equivalent

<result xsl:version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:for-each select="/site/people/person">
    <xsl:variable name="a"
        select="/site/closed_auctions/closed_auction[buyer/@person = current()/@id]"/>
    <item person="{name}">
      <xsl:value-of select="count($a)"/>
    </item>
  </xsl:for-each>
</result>

XMark Q8

Page 18

Optimization

• With multi-GB databases, using indexes is essential

• XQuery does not have template rules

• This makes it possible to do static analysis and join optimization
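Where a processor does not find the join for itself, an XSLT author can supply the index by hand with xsl:key. The following is a sketch rather than part of the original slides: it assumes the auction.xml structure used in the XMark Q8 example, and the key name `auctions-by-buyer` is invented. It replaces the nested scan of all closed auctions for each person with a keyed lookup.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hand-declared index: closed auctions keyed by the id of the buyer -->
  <xsl:key name="auctions-by-buyer"
           match="closed_auction"
           use="buyer/@person"/>

  <xsl:template match="/">
    <result>
      <xsl:for-each select="/site/people/person">
        <item person="{name}">
          <!-- key() looks up this person's auctions instead of filtering
               the full list of closed auctions once per person -->
          <xsl:value-of select="count(key('auctions-by-buyer', @id))"/>
        </item>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that the author, not the optimizer, has to recognize the join and decide that an index is worthwhile.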

Page 19

XMark Q8 results (msecs)

                 1Mb      4Mb     10Mb
XSLT
  Xalan         1503    11006    65855
  xt             160     2253    16414
  MSXML           33      519     4248
  Saxon 8.4       90     1340    11126
XQuery
  Saxon 8.4      136     1575    11947
  Qizx           351      711     1813
  Galax         1870     6672    16625

Complexity labels on the slide: O(n²), O(n)

Page 20

Two can play at that game!

                 1Mb      4Mb     10Mb
XSLT
  Xalan         1503    11006    65855
  xt             160     2253    16414
  MSXML           33      519     4248
  Saxon 8.5       27       26       45
XQuery
  Saxon 8.5       16       16       31
  Qizx           351      711     1813
  Galax         1870     6672    16625

Complexity labels on the slide: O(n²), O(n)

caveat: this is one query only!

Page 21

Conclusions

• XSLT 2.0 and XQuery 1.0 are nearly ready

• XSLT 2.0 has many powerful new features, making new applications possible

• XQuery 1.0 is designed for optimization against very large databases