Inside an XSLT Processor Michael Kay, ICL 19 May 2000


Page 1: Inside an XSLT Processor

Michael Kay, ICL
19 May 2000

Page 2: About me

¶ ICL Fellow, systems architect
¶ Database background
¶ Developer of SAXON
¶ Author of XSLT Programmer's Reference, published by Wrox Press
¶ Recently joined XSL WG as invited expert

Page 3: About this talk

¶ The XSLT Processing Model
¶ Structure of an XSLT Processor
¶ Performance
  » current limitations
  » possible ways forward
¶ Ideas on future development of the language

Page 4: The XSLT Processing Model (first approximation)

[Diagram: the Source Document and the Stylesheet feed the Transformation Process, which produces the Result Document.]

Page 5: The XSLT Processing Model (in more detail)

[Diagram: Parsing turns the Source Document into a Source Tree and the Stylesheet into a Stylesheet Tree. The Transformation Process builds a Result Tree. Serialization turns the Result Tree into the Result Document.]
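The three stages in this model (parsing, transformation, serialization) can be sketched in a few lines of Python. This is only an illustration of the data flow: the `transform` function is a hand-written stand-in for a stylesheet, not an XSLT engine, and all function names are made up for the example.

```python
import xml.etree.ElementTree as ET

def parse(source_text):
    """Parsing: source document (text) -> source tree."""
    return ET.fromstring(source_text)

def transform(source_tree):
    """Transformation: source tree -> result tree.

    Hypothetical stand-in for a stylesheet: each <para> becomes a <p>.
    """
    result_tree = ET.Element("html")
    body = ET.SubElement(result_tree, "body")
    for para in source_tree.iter("para"):
        p = ET.SubElement(body, "p")
        p.text = para.text
    return result_tree

def serialize(result_tree):
    """Serialization: result tree -> result document (text)."""
    return ET.tostring(result_tree, encoding="unicode")

def run(source_text):
    """The whole pipeline: text in, text out, trees in between."""
    return serialize(transform(parse(source_text)))
```

The point of the split is that the transformation itself never sees angle brackets: it reads one tree and writes another.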

Page 6: An XSLT Template Rule

    <xsl:template match="appendix/para[1]">
      <h4>
        <xsl:number level="single"/>
        <xsl:value-of select="@title"/>
      </h4>
      <p>
        <xsl:apply-templates/>
      </p>
    </xsl:template>

Callouts on the slide:
¶ Pattern: the match attribute, appendix/para[1]
¶ XPath Expression: the select attribute, @title
¶ Instruction: the xsl: elements inside the template
¶ Result Element: the literal h4 and p elements
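One way to picture how a processor might test the pattern appendix/para[1] against a candidate node is to read the pattern right to left: is the node a para, is its parent an appendix, and is it the first para child of that parent? The sketch below assumes Python's ElementTree as the tree model; the function names are hypothetical and this is not a description of how SAXON does it.

```python
import xml.etree.ElementTree as ET

def build_parent_map(root):
    """ElementTree has no parent pointers, so record them once."""
    return {child: parent for parent in root.iter() for child in parent}

def matches_appendix_para_1(node, parents):
    """Test the pattern appendix/para[1] against one node, right to left."""
    # Last step first: the node itself must be a <para>.
    if node.tag != "para":
        return False
    # Previous step: its parent must be an <appendix>.
    parent = parents.get(node)
    if parent is None or parent.tag != "appendix":
        return False
    # Predicate [1]: the node must be the first <para> child of its parent.
    return parent.find("para") is node
```

Checking the cheapest condition (the element name) first means most nodes are rejected without looking at their parents at all.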

Page 7: Architecture of an XSLT processor

[Diagram of processor components:
  Stylesheet side: Stylesheet, XML Parser, Tree Builder, XSLT compiler, XPath compiler, Compiled Stylesheet
  Source side: Source, XML Parser, Tree Builder, Source Tree
  Run time: XSLT interpreter, XPath interpreter, Output Manager
  Output: XML serializer, HTML serializer, Text serializer, Result]
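The split between compilers and interpreters can be illustrated with attribute value templates such as item-{@id}.html. In the sketch below the template string is parsed once into a list of parts (the "compiled" form), and that list is then evaluated against many nodes. The function names are hypothetical, and as a loud simplification the only expressions supported inside braces are attribute references of the form @name; unbalanced braces are not diagnosed.

```python
import re
import xml.etree.ElementTree as ET

# Matches an escaped brace ({{ or }}) or a {expression}.
_AVT_TOKEN = re.compile(r"\{\{|\}\}|\{([^{}]*)\}")

def compile_avt(template):
    """Compile time: split an attribute value template into parts, once."""
    parts, pos = [], 0
    for m in _AVT_TOKEN.finditer(template):
        if m.start() > pos:
            parts.append(("text", template[pos:m.start()]))
        if m.group(0) == "{{":
            parts.append(("text", "{"))
        elif m.group(0) == "}}":
            parts.append(("text", "}"))
        else:
            expr = m.group(1).strip()
            if not expr.startswith("@"):
                raise ValueError("this sketch only supports @name: " + expr)
            parts.append(("attr", expr[1:]))
        pos = m.end()
    if pos < len(template):
        parts.append(("text", template[pos:]))
    return parts

def evaluate_avt(parts, node):
    """Run time: evaluate the precompiled parts against a context node."""
    return "".join(
        value if kind == "text" else node.get(value, "")
        for kind, value in parts
    )
```

Syntax errors surface when `compile_avt` runs, before any source document is read, and the per-node work is reduced to a join over a short list.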

Page 8: At compile time

¶ Parse and validate the stylesheet
¶ Parse and validate all XPath expressions
  » and attribute value templates
¶ Build rule base for matching patterns
¶ Resolve references to named variables, functions, and templates
¶ Flatten the import tree
¶ Optimize XPath expressions
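A rule base for matching patterns could, as one possible design, index template rules by the element name in the last step of the pattern, so that only a handful of candidate rules are tested for each node. The sketch below is an assumption about how such an index might look, not SAXON's actual data structure. The `RuleBase` class and its methods are hypothetical. Ties on priority are resolved in favour of the later rule, which is the recovery behaviour XSLT 1.0 permits, and import precedence is left out.

```python
from collections import defaultdict

class RuleBase:
    """Template rules indexed by the element name of the pattern's last step."""

    def __init__(self):
        self._by_name = defaultdict(list)
        self._count = 0  # declaration order, used to break priority ties

    def add(self, name, priority, test, template):
        """Register a rule; `test(node)` checks the rest of the pattern."""
        self._count += 1
        self._by_name[name].append((priority, self._count, test, template))
        # Highest priority first; among equal priorities, latest rule first.
        self._by_name[name].sort(key=lambda r: (r[0], r[1]), reverse=True)

    def find(self, node):
        """Return the template of the best matching rule, or None."""
        for _priority, _seq, test, template in self._by_name.get(node.tag, ()):
            if test(node):
                return template
        return None
```

Under XSLT 1.0 default priorities, a bare name such as para would be added with priority 0 and a multi-step pattern such as appendix/para[1] with priority 0.5, so the more specific rule is tried first.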

Page 9: Where does the time go?

[Chart of processing time by phase: Build Source Tree, Compile Stylesheet, Process Templates, Serialize Output]

Page 10: Is Performance a Problem?

¶ Client side: usually not
  » XSLT processing is generally faster than download speed
¶ Server side: sometimes
  » CPU usage when handling very high throughput
  » Memory problems when handling very large documents

Page 11: Some performance tips

¶ Keep documents small: split them first
¶ Process once, at publishing time
  » or use caching
¶ Do several simple transforms in series
¶ Avoid complex patterns in template rules
¶ Use keys
¶ Use external functions
¶ Avoid "//item"
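The difference between "//item" and a key can be sketched outside XSLT. In the Python below, `find_by_scan` plays the role of an expression like //item[@code=$code], which walks the whole tree on every evaluation, while `build_key` and `find_by_key` play the role of xsl:key and key(): one pass builds an index, after which each lookup is a dictionary access. The function names and the code attribute are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def find_by_scan(root, code):
    """Like //item[@code=$code]: walks the whole tree on every call."""
    return [e for e in root.iter("item") if e.get("code") == code]

def build_key(root, tag, attribute):
    """Like xsl:key: one pass over the tree builds a value -> nodes index."""
    index = defaultdict(list)
    for e in root.iter(tag):
        value = e.get(attribute)
        if value is not None:
            index[value].append(e)
    return index

def find_by_key(index, code):
    """Like key(): a dictionary lookup that does not revisit the tree."""
    return index.get(code, [])
```

If a stylesheet performs the lookup once, the scan is fine. If it performs it once per node, the scan costs a full tree walk each time, whereas the index is built once and reused.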

Page 12: Performance progress

[Chart of processing time per megabyte of source:
  Today: 20 sec/MB
  After simple optimization: 5 sec/MB
  After advanced optimization: 1 sec/MB]

Simple optimization:
¶ Stylesheet compilation
¶ Java code optimization
¶ Lazy evaluation
¶ Simple XPath optimization
¶ Tail recursion

Advanced optimization:
¶ Incremental parsing
¶ Pipelining
¶ Use of schema
¶ Pattern matching
¶ Full XPath optimization
¶ Compile to bytecodes
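Lazy evaluation is the easiest of these to show in miniature. In the sketch below a node sequence is produced by a generator, so a test such as boolean(//item) stops at the first match instead of collecting every item in the document. The `visited` counter exists only to make the saving observable; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def descendants(root, tag, visited):
    """Lazily yield matching nodes in document order.

    `visited` is a one-element list used only to count nodes touched.
    """
    for node in root.iter():
        visited[0] += 1
        if node.tag == tag:
            yield node

def exists(node_sequence):
    """Like boolean(//item): only the first node is ever requested."""
    return next(iter(node_sequence), None) is not None
```

For a document whose first item appears early, `exists` touches only the nodes up to that item, while materialising the full sequence touches every node in the tree.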

Page 13: Interesting research areas

¶ Database integration: transforming a document without loading into memory
¶ Applying regular expression theory
¶ Execution as a sequence of serial passes
¶ Using schema knowledge at compile time
¶ Eager node numbering
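As a rough illustration of a single serial pass that avoids holding the whole document in memory, the sketch below uses ElementTree's `iterparse` to emit one output line per item as soon as its end tag is seen, then discards the item's content. It handles only a trivially simple transformation and says nothing about how a general XSLT stylesheet could be executed this way, which is the open research question. The function name and the name and price attributes are assumptions.

```python
import xml.etree.ElementTree as ET

def items_to_lines(stream):
    """Yield 'name,price' for each <item>, in one pass over the input."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            yield "{},{}".format(elem.get("name", ""), elem.get("price", ""))
            # Drop the item's attributes and children once it has been used.
            # The emptied element itself stays attached to its parent.
            elem.clear()
```

Because the function is a generator, output can be written while input is still being read, for example by iterating over `items_to_lines(open("order.xml", "rb"))` with a hypothetical input file.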

Page 14: Potential language features

¶ Serial transformation language?
¶ Multi-pass stylesheets
¶ Higher-level "relational" constructs: grouping, joins, logical quantifiers
¶ Richer data types
¶ Assignment statement ????

Page 15: Summary

¶ XSLT language is now stable
¶ XSLT processor technology is starting to be well understood
¶ First crop of products are capable of significant performance
¶ Now the research needs to start on the next phase of optimization techniques