carrot: an appetizing hybrid of xquery and xslt

Post on 16-Jan-2015

DESCRIPTION

Balisage paper link: http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html

TRANSCRIPT

Carrot: An appetizing hybrid of XQuery and XSLT

Evan Lenz, Software Developer, Community

MarkLogic Corporation

On reinventing the wheel

XSLT-UK 2001

“XQuery: Reinventing the wheel?” http://xmlportfolio.com/xquery.html

Disclaimer

My own personal project & opinions

Praises for XSLT

Template rules are really elegant and powerful

It’s mature in its set of features

Powerful modularization features (<xsl:import>)

Praises for XQuery

Concise syntax

Highly composable syntax
  An element constructor is an expression
  So you can write things like: foo/<bar/>

Gripes about XSLT

Two layers of syntax, which can’t be freely composed

You can’t nest an instruction inside an expression
  E.g., you can’t apply templates inside an XPath expression

Verbose syntax
  In general
  In particular, for function definitions and parameter-passing

Gripes about XQuery

Conflation of modules and namespaces

Don’t like being forced to use namespace URIs

Distinction between main and library modules
  You can’t reuse a main module
  Reuse requires refactoring

Lack of a distinction between:
  the default namespace for XPath, and
  the default namespace for element constructors

No template rules!

A lot in common

The same data model (XPath 2.0)

Much of the same syntax (XPath 2.0)

Feeling boxed-in

XSLT’s lack of composability

XQuery’s lack of template rules

Don’t like having to pick between two languages all the time

Solution: add a third one. ;-)

And I’m calling it…

YAASFXSLT “Yet Another Alternative Syntax For XSLT”

Sam Wilmott’s RXSLT http://www.wilmott.ca/rxslt/rxslt.html

Paul Tchistopolskii’s XSLScript http://markmail.org/message/niumiluelzho6bmt

XSLTXT http://savannah.nongnu.org/projects/xsltxt

Actually: “Carrot”

More than just an alternative syntax

Carrot combines:
  the friendly syntax and composability of XQuery expressions
  the power and flexibility of template rules in XSLT

A “host language” for XQuery expressions

Motivation & inspiration

My boxed-in feelings

OH: “I will never write code in XML.”

James Clark’s element constructor proposal back in 2001 http://www.jclark.com/xml/construct.html

Haskell

Overall design approach

95% of semantics defined by reference to XQuery and XSLT

90% of syntax defined by reference to XQuery

Haskell similarities

Haskell defines functions using equations:

foo = "bar"

Carrot defines variables, functions, and rules similarly:

$foo := "bar";
my:foo() := "bar";
^foo(*) := "bar";

Everything on the RHS is an XQuery expression (plus a few extensions)

Intro by example

A rule definition in XSLT:

<xsl:template match="para">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>

A rule definition in Carrot:

^(para) := <p>{^()}</p>;

Intro by example

This:

^()

Is short for this:

^(node())

Just as, in XSLT, this:

<xsl:apply-templates/>

Is short for this:

<xsl:apply-templates select="node()"/>

Intro by example

Another rule definition in Carrot:

^toc(section) := <li>{ ^toc() }</li>;

The same rule definition in XSLT:

<xsl:template match="section" mode="toc">
  <li>
    <xsl:apply-templates mode="toc"/>
  </li>
</xsl:template>

The identity transform

In Carrot:

^(@*|node()) := copy{ ^(@*|node()) };

In XSLT:

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

Note the asymmetry

This definition is illegal (missing pattern):

^() := <foo/>;

Just as this template rule is illegal:

<xsl:template match=""><foo/></xsl:template>

However, when invoking, you can omit the argument:

^()

Just as in XSLT:

<xsl:apply-templates/>

An XSLT example

<xsl:transform version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <html>
      <head>
        <xsl:copy-of select="/doc/title"/>
      </head>
      <body>
        <xsl:apply-templates select="/doc/para"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="para">
    <p>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

</xsl:transform>

The equivalent in Carrot

^(/) :=
  <html>
    <head>{ /doc/title }</head>
    <body>{ ^(/doc/para) }</body>
  </html>;

^(para) := <p>{ ^() }</p>;

Carrot expressions

Same as an expression in XQuery, with these additions:

1. ruleset invocations: ^mode(nodes)
2. shallow copy{…} constructors
3. text node literals: `my text node`

1. Ruleset invocations

In XSLT, each mode can be thought of as the name of a polymorphic function

The syntax of Carrot makes this explicit

1. Ruleset invocations

“ruleset” == “mode”

In Carrot:

^myMode(myExpr)

In XSLT:

<xsl:apply-templates mode="myMode" select="myExpr"/>

2. Shallow copy constructors

In Carrot:

copy{…}

In XSLT:

<xsl:copy>…</xsl:copy>

Why extend XQuery here?

The lack of shallow copy constructors in XQuery makes modified identity transforms impractical (specifically, for preserving namespace nodes)
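
For context, here is a minimal sketch of the usual workaround in plain XQuery 1.0: a recursive function that rebuilds each element with a computed constructor. The function name local:copy and the sample input are invented for illustration. Because the element is rebuilt rather than shallow-copied, only the namespace bindings needed for its own name and attribute names are carried over, so an unused declaration such as xmlns:x below is expected to be lost.

```xquery
(: Hypothetical identity-style copy; local:copy is an illustrative name. :)
declare function local:copy($n as node()) as node() {
  typeswitch ($n)
    case element() return
      (: Rebuild the element by name, then copy attributes and recurse. :)
      element { node-name($n) } {
        $n/@*,
        for $child in $n/node() return local:copy($child)
      }
    case document-node() return
      document { for $child in $n/node() return local:copy($child) }
    (: Text, comments, and processing instructions are returned as-is. :)
    default return $n
};

local:copy(<a xmlns:x="urn:example:unused" id="1"><b/>text</a>)
```

A "modified" identity transform would add more specific cases ahead of the generic element() case, which is the job template rules do declaratively.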

3. Text node literals

In Carrot:

`my text node`

In XSLT (a literal text node):

my text node

In XQuery (dynamic text node constructor):

text{ "my text node" }

Why aren’t dynamic text constructors sufficient? After all, they take only six more characters (text{…})

3. Text node literals

Consider an example:

<xsl:template match="/doc">
  <result>
    <xsl:apply-templates mode="file-name" select="."/>
    <xsl:apply-templates mode="file-ext" select="."/>
  </result>
</xsl:template>

<xsl:template mode="file-name" match="doc">doc</xsl:template>

<xsl:template mode="file-ext" match="doc">.xml</xsl:template>

And the result:

<result>doc.xml</result>

3. Text node literals

Same example, rewritten “naturally” in Carrot:

^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;

^file-name(doc) := "doc";
^file-ext (doc) := ".xml";

And the result (uh-oh):

<result>doc .xml</result>

3. Text node literals

Sequences of atomic values have spaces added upon conversion to a text node

Various fixes, none satisfactory:
  Wrap the two invocations in text{} or concat() (high coupling: what if one of them returns an element?)
  Wrap the strings in text{} (burdensome fix: regular strings work 90% of the time)
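
The spacing rule can be seen in plain XQuery, independent of Carrot. The sketch below is a standalone query written for this illustration; it contrasts two adjacent atomic values with two adjacent text nodes inside an element constructor.

```xquery
(: Two adjacent atomic values: joined with a single space. :)
<result>{ "doc", ".xml" }</result>,

(: Two adjacent text nodes: merged with no separator. :)
<result>{ text{ "doc" }, text{ ".xml" } }</result>
```

The first constructor yields <result>doc .xml</result> and the second yields <result>doc.xml</result>, which is the difference the ruleset example above runs into.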

3. Text node literals

Note the imbalance:

Returning a text node is more concise in XSLT than in XQuery (!)
  XSLT: Hello
  XQuery: text{"Hello"}

Returning a string is more concise in XQuery than in XSLT
  XQuery: "Hello"
  XSLT: <xsl:sequence select="'Hello'"/>

Text node literals in Carrot redress the imbalance
  Text nodes in Carrot: `Hello`
  Strings in Carrot: "Hello"

3. Text node literals

Rewritten properly in Carrot:

^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;

^file-name(doc) := `doc`;
^file-ext (doc) := `.xml`;

Simple guideline now:
  Use text node literals when you are constructing part of a result document
  Use string literals when you know you want to return a string

Expression semantics

Same as XQuery!

With this exception (remember the earlier gripe):
  Namespace attribute declarations on element constructors do not affect the default element namespace for XPath expressions.
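
To make the gripe concrete, here is a small sketch of the standard XQuery behavior that Carrot departs from. The $doc variable and its content are made up for illustration. In standard XQuery, the xmlns attribute on the html constructor also becomes the default element namespace for path expressions nested inside it, so the unprefixed step title no longer matches the no-namespace title element.

```xquery
(: Hypothetical input, in no namespace. :)
let $doc := <doc><title>Carrots</title></doc>
return
  <html xmlns="http://www.w3.org/1999/xhtml">{
    (: Inside this constructor, "title" is resolved in the XHTML namespace,
       so this path selects nothing in standard XQuery. :)
    $doc/title
  }</html>
```

Under the exception stated above, the same expression in Carrot would presumably select the no-namespace title element, because the declaration would apply only to the constructed elements.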

Okay, enough about namespaces

Carrot definitions

A Carrot module

Consists of a set of unordered definitions

Three kinds of definitions:
  Global variables
  Functions
  Rules

Unlike XQuery, there is no top-level expression, only definitions
  Carrot is like XSLT in this regard

Global variables

In Carrot:

$foo := "a string value";

Equivalent to this XQuery:

declare variable $foo := "a string value";

Functions

In Carrot:

my:foo() := "return value";
my:bar($str as xs:string) as xs:string := upper-case($str);

Equivalent to this XQuery:

declare function my:foo() { "return value" };
declare function my:bar($str as xs:string) as xs:string { upper-case($str) };

Rule definitions

In Carrot:

^foo(*) := "return value";

Equivalent to this XSLT:

<xsl:template match="*" mode="foo">
  <xsl:sequence select="'return value'"/>
</xsl:template>

Rule parameters

In Carrot:

^foo(* ; $str as xs:string) := concat($str, .);

Equivalent to this XSLT:

<xsl:template match="*" mode="foo">
  <xsl:param name="str" as="xs:string"/>
  <xsl:sequence select="concat($str, .)"/>
</xsl:template>

Rule parameters

In Carrot:

^foo(* ; tunnel $str as xs:string) := concat($str, .);

Equivalent to this XSLT:

<xsl:template match="*" mode="foo">
  <xsl:param name="str" as="xs:string" tunnel="yes"/>
  <xsl:sequence select="concat($str, .)"/>
</xsl:template>

Conflict resolution

Same as XSLT
  Import precedence first
  Then priority

Explicit priorities

In Carrot:

^author-listing( author[1] ) 1 := ^();
^author-listing( author ) := `, ` , ^();
^author-listing( author[last()] ) := ` and ` , ^();

Equivalent to this XSLT:

<xsl:template mode="author-listing" match="author[1]" priority="1">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template mode="author-listing" match="author">
  <xsl:text>, </xsl:text>
  <xsl:apply-templates/>
</xsl:template>

<xsl:template mode="author-listing" match="author[last()]">
  <xsl:text> and </xsl:text>
  <xsl:apply-templates/>
</xsl:template>

Multiple modes

In Carrot:

^foo|bar(*) := `result`;

Equivalent to this XSLT:

<xsl:template mode="foo bar" match="*">result</xsl:template>

Why I like composability

A couple of examples

Pipelines are easier

A typical pipeline approach in XSLT:

<xsl:variable name="stage1-result">
  <xsl:apply-templates mode="stage1" select="."/>
</xsl:variable>
<xsl:variable name="stage2-result">
  <xsl:apply-templates mode="stage2" select="$stage1-result"/>
</xsl:variable>
<xsl:apply-templates mode="stage3" select="$stage2-result"/>

In Carrot:

^stage1(.)/^stage2(.)/^stage3(.)
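
For comparison, a rough XQuery 1.0 analogue of the one-line pipeline, using ordinary functions in place of rulesets. The function names local:stage1, local:stage2, and local:stage3, their bodies, and the sample input are all invented for illustration; the point is only that each step of a path expression can be a function call applied to the context node, which is the composability the Carrot version relies on.

```xquery
(: Hypothetical stages: each one simply wraps its input in a marker element. :)
declare function local:stage1($n as node()) as element() { <stage1>{ $n }</stage1> };
declare function local:stage2($n as node()) as element() { <stage2>{ $n }</stage2> };
declare function local:stage3($n as node()) as element() { <stage3>{ $n }</stage3> };

let $input := <doc/>
return
  (: Each step receives the result of the previous step as its context node. :)
  $input/local:stage1(.)/local:stage2(.)/local:stage3(.)
```

This should produce <stage3><stage2><stage1><doc/></stage1></stage2></stage3>. Unlike rulesets, these functions do not dispatch on patterns, so the sketch shows only the chaining, not the polymorphism.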

Fewer features needed

XSLT 2.1/3.0 promises to add some convenience features, like:

<xsl:copy select="foo">…</xsl:copy>

With Carrot, just use foo/copy{…}

What about feature X?

Two instruction categories

XSLT instructions that aren’t needed
  <xsl:for-each>: use for expressions (or the mapping operator!)
  <xsl:variable>: use let instead
  <xsl:sort>: use order by instead
  <xsl:choose>, <xsl:if>: use if…then…else instead
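
As an illustration of these substitutions, here is a small standalone XQuery 1.0 sketch. The input document, the rank attribute, and the variable names are assumptions made up for this example; each clause is annotated with the XSLT instruction it stands in for.

```xquery
(: Hypothetical input: <para> elements carrying a made-up @rank attribute. :)
let $doc :=
  <doc>
    <para rank="2">second</para>
    <para rank="1">first</para>
    <para rank="3"/>
  </doc>
return
  <ul>{
    for $p in $doc/para              (: instead of <xsl:for-each> :)
    let $label := string($p)         (: instead of <xsl:variable> :)
    order by xs:integer($p/@rank)    (: instead of <xsl:sort> :)
    return
      <li>{
        (: instead of <xsl:choose> / <xsl:if> :)
        if ($label) then $label else "(empty)"
      }</li>
  }</ul>
```

Since the right-hand side of a Carrot definition is an XQuery expression, a FLWOR expression like this one could presumably appear there directly.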

XSLT instructions whose Carrot syntax is TBD
  <xsl:for-each-group>
  <xsl:result-document>
  <xsl:analyze-string>
  Etc., etc.

XQuery 3.0 may add grouping…

You can always import XSLT 2.0 stylesheets into Carrot

Top-level elements

Imports and includes:

import navigation.crt;
include widgets.crt;

Top-level parameters:

param $message as xs:string;

Etc. Still in mock-up stage. See some examples:

http://github.com/evanlenz/carrot/examples

Implementation strategy

Compile to XSLT 2.0

1:1 module mapping
  Each Carrot module compiles to an XSLT 2.0 module

Carrot can include and import other Carrot modules or XSLT modules.

Carrot can also import XQuery modules, but since this is not supported directly in XSLT 2.0, the semantics depend on your target XSLT processor, e.g.:
  <saxon:import-query> in Saxon
  <xdmp:import-module> in MarkLogic Server

Steps to implementation

Generate a parser
  1. Create a BNF grammar for Carrot
     a) Hand-convert the EBNF grammar for XQuery expressions to BNF
     b) Extend the resulting BNF to support Carrot definitions and expressions
  2. Use yapp-xslt to generate the Carrot parser from the Carrot BNF
     http://www.o-xml.org/yapp/
  Already running into trouble with this approach…

Write a compiler (in XSLT, naturally)

Project goals

At this point, just explore the raw ideas (and share them)

Solicit feedback

Placeholder for now: http://github.com/evanlenz/carrot

Help me cook this carrot
  You write the parser, I’ll write the compiler?

Future possibilities

XML-oriented browser scripting

XQIB

Saxon-CE

Carrot, or something like it, could make XSLT more palatable to Web developers

W3C activity

Provide seeds for ideas in the XSL/XQuery WGs?

Carrot will grow with XPath/XQuery/XSLT

Mode merging

Random XSLT feature idea: invoke more than one mode

<xsl:apply-templates mode="foo bar"/>

In Carrot:

^foo|bar()

A static mode extension mechanism

Handy for multi-stage transformations where each stage is similar to but not the same as the next

Could be added to Carrot even if not supported directly in XSLT
  Carrot as a research playground for new language ideas

Questions?
