Università degli Studi di Pisa
Speaker: Giovanni Conforti
Joint work with: Orlando Ferrara and Giorgio Ghelli
TQL Algebra and its Implementation
IFIP TCS @ 2002 Montreal, 28th August
What I’m going to talk about…
• Short introduction to SSD and SSD query languages.
• Tree logic and TQL overview.
• TQL Algebra motivations.
• TQL Algebra presentation.
• Translation algorithm.
• Translation correctness.
• Our implementation model.
• Conclusions and future work.
Semi-structured Data
• Semi-Structured Data (SSD) are used to:
  • model and query the web (HTML, XML, …);
  • store experimental data;
  • integrate heterogeneous databases;
  • …
• The structure of Semi-Structured Data (SSD) is:
  • irregular;
  • implicit;
  • always evolving;
  • …
Data model: SSD as labelled trees (Example)
[Figure: the example drawn as a labelled tree rooted at articles, with two article subtrees.]
The same data as a term:
articles[
  article[ author[Cardelli] | author[Gordon] | title[Anywhere] | date[Apr, 2000] ] |
  article[ author[Ghelli] | title[TQL] | conf[ETAPS] | date[ month[Feb] | year[2001] ] ]
]
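The term notation can be mirrored by a small data structure. The following Python sketch is only an illustration and is not taken from the TQL system: the names `Tree`, `leaf` and `show` are hypothetical, a forest is modelled as a tuple of trees, and leaf values such as Cardelli are encoded as labels with an empty forest, which is an assumption about the data model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Tree:
    """A labelled tree: a label plus a forest of children (hypothetical model)."""
    label: str
    children: Tuple["Tree", ...] = ()


def leaf(text: str) -> Tree:
    # Assumption: a value such as Cardelli is encoded as a childless label.
    return Tree(text)


def show(forest: Tuple[Tree, ...]) -> str:
    """Render a forest in the term syntax of the slide; 0 is the empty forest."""
    if not forest:
        return "0"
    parts = []
    for t in forest:
        parts.append(t.label if not t.children else f"{t.label}[{show(t.children)}]")
    return " | ".join(parts)


articles = Tree("articles", (
    Tree("article", (
        Tree("author", (leaf("Cardelli"),)),
        Tree("author", (leaf("Gordon"),)),
        Tree("title", (leaf("Anywhere"),)),
        Tree("date", (leaf("Apr, 2000"),)),
    )),
    Tree("article", (
        Tree("author", (leaf("Ghelli"),)),
        Tree("title", (leaf("TQL"),)),
        Tree("conf", (leaf("ETAPS"),)),
        Tree("date", (Tree("month", (leaf("Feb"),)), Tree("year", (leaf("2001"),)))),
    )),
))

print(show((articles,)))
```

Using an ordered tuple for the children is a simplification: composition with | is usually read as unordered, so a multiset would be the more faithful choice.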
SSD query languages
• Just as tabular data have SQL and relational algebra, we want to define a query language and an algebra for SSD.
• Specifying and developing a good query language for SSD (in particular for XML) is one of the main current challenges for the database and web research communities.
• After several proposals (Lorel, YATL, XML-QL, XDuce, etc.), the W3C has introduced the XQuery standard, whose specification and implementation are still work in progress.
TQL – the idea
• Extend the ambient logic to describe properties of SSD, obtaining a tree logic.
• The tree logic is a modal logic well suited to expressing:
  • properties of the horizontal and vertical structure of SSD;
  • properties whose specification requires negation, recursion or universal quantification;
  • constraints and types of SSD.
• Introduce free variables inside tree logic formulas and use a pattern-matching approach to bind these variables to values inside a given data source ⇒ a new SSD query strategy: TQL.
TQL – the language
• Based on three clauses:
  • matching;
  • filtering;
  • reconstruction.
• Fused in the binding operator: Q ⊨ A
• The possibility of integrating logic expressions and queries inside the same language gives several advantages in terms of expressivity and optimization (e.g. rewriting based on types).
• This talk, however, is not about the TQL language but about the TQL Algebra, so I will introduce only the aspects of TQL needed to understand our work on the algebra.
• If you want to learn more about TQL, see the two articles [WebDB2002] and [ETAPS2000].
Tree Logics – syntax
A, B ::= T | ¬A | A ∧ B | ∃x. A | ∃X. A
       | 0 | L[A] | A | B | L ~ L' | X
       | ξ | μξ. A
Negation allows the definition of derived operators:
F, A ∨ B, ∀x. A, ∀X. A, L[⇒A], A || B
Path Expressions:
• regular expressions;
• compact way to express constraints on paths over trees;
• can be defined using Tree Logics formulas.
E.g. .m.n[A] is defined as m[ n[A] | T ] | T
Tree Logics – describing sets of trees (forests)
F ⊨ 0 iff F = 0
F ⊨ A ∧ B iff F ⊨ A and F ⊨ B
F ⊨ m[A] iff ∃F'. F = m[F'] and F' ⊨ A
F ⊨ A | B iff ∃F', F''. F = F' | F'' and F' ⊨ A and F'' ⊨ B
F ⊨ m[⇒A] iff ∀F'. F = m[F'] ⇒ F' ⊨ A
F ⊨ A || B iff ∀F', F''. F = F' | F'' ⇒ F' ⊨ A or F'' ⊨ B
F ⊨ T always
F ⊨ X iff F = ρ(X), where ρ is the current substitution
F ⊨ ¬A iff ¬(F ⊨ A)
… … …
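The satisfaction rules for the variable-free part of the logic can be executed directly on finite forests. The Python sketch below is an illustration under stated assumptions, not the TQL implementation: forests are tuples of `(label, children)` pairs, formulas are nested tuples, all function names are hypothetical, and variables, quantifiers and recursion are left out. The rule for A | B is checked by brute force over every way of splitting the forest in two.

```python
from itertools import product

# Formula constructors (hypothetical encoding as nested tuples).
T, ZERO = ("T",), ("0",)
def Not(a): return ("not", a)
def And(a, b): return ("and", a, b)
def Loc(m, a): return ("loc", m, a)   # m[A]
def Par(a, b): return ("par", a, b)   # A | B

# Derived operators, obtained through negation.
F = Not(T)
def Or(a, b): return Not(And(Not(a), Not(b)))
def LocImp(m, a): return Not(Loc(m, Not(a)))         # m[=>A]
def ParDual(a, b): return Not(Par(Not(a), Not(b)))   # A || B


def sat(forest, a):
    """Return True when the finite forest satisfies formula a."""
    tag = a[0]
    if tag == "T":
        return True
    if tag == "0":
        return len(forest) == 0
    if tag == "not":
        return not sat(forest, a[1])
    if tag == "and":
        return sat(forest, a[1]) and sat(forest, a[2])
    if tag == "loc":
        # The forest must be exactly one tree labelled m whose content satisfies A.
        return len(forest) == 1 and forest[0][0] == a[1] and sat(forest[0][1], a[2])
    if tag == "par":
        # Try every split of the forest into a left and a right part.
        for mask in product((False, True), repeat=len(forest)):
            left = tuple(t for t, m in zip(forest, mask) if m)
            right = tuple(t for t, m in zip(forest, mask) if not m)
            if sat(left, a[1]) and sat(right, a[2]):
                return True
        return False
    raise ValueError(f"unknown formula tag: {tag}")


# The path expression .m.n[0], written out as m[ n[0] | T ] | T
path = Par(Loc("m", Par(Loc("n", ZERO), T)), T)
forest = (("m", (("n", ()), ("k", ()))), ("j", ()))
print(sat(forest, path))
```

The split enumeration is exponential in the width of the forest, so this only serves to make the semantics concrete.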
TQL Queries
Syntax:
Q, Q' ::= 0 | X | L[Q] | f(Q) | Q | Q' | from Q ⊨ A select Q'
Example:
result[ from $articles ⊨ articles[ article[ title[$T] | date[$D] | T ] | T ]
        select article[ title[$T] | date[$D] ] ]
Bindings:
$T            $D
{Anywhere}    {Apr, 2000}
{TQL}         { month[Feb] | year[2001] }
Result:
result[ article[ title[Anywhere] | date[Apr, 2000] ] | article[ title[TQL] | date[ month[Feb] | year[2001] ] ] ]
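The two phases of the example query, binding and reconstruction, can be traced with a few lines of Python. This is a hand-written sketch for this one pattern only, not a general TQL evaluator: the helper names `node`, `bindings` and `select` are hypothetical, and trees are encoded as `(label, children)` tuples, which is an assumption.

```python
def node(label, *children):
    """Build a tree as a (label, forest) pair; a leaf value has no children."""
    return (label, tuple(children))


db = node(
    "articles",
    node("article",
         node("author", node("Cardelli")), node("author", node("Gordon")),
         node("title", node("Anywhere")), node("date", node("Apr, 2000"))),
    node("article",
         node("author", node("Ghelli")), node("title", node("TQL")),
         node("conf", node("ETAPS")),
         node("date", node("month", node("Feb")), node("year", node("2001")))),
)


def bindings(root):
    """Matcher for articles[ article[ title[$T] | date[$D] | T ] | T ]."""
    rows = []
    if root[0] != "articles":
        return rows
    for art in root[1]:
        if art[0] != "article":
            continue
        titles = [c[1] for c in art[1] if c[0] == "title"]
        dates = [c[1] for c in art[1] if c[0] == "date"]
        # Every choice of one title child and one date child is a match.
        for t in titles:
            for d in dates:
                row = {"$T": t, "$D": d}
                if row not in rows:  # the binding table is a set of rows
                    rows.append(row)
    return rows


def select(rows):
    """Reconstruction: one article[ title[$T] | date[$D] ] per row."""
    return node("result", *(
        node("article", ("title", r["$T"]), ("date", r["$D"])) for r in rows
    ))


table = bindings(db)
result = select(table)
print(len(table), "rows")
```

Each row binds a variable to a whole forest, which is why $D can hold both the single value Apr, 2000 and the structured content month[Feb] | year[2001].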
TQL Algebra motivations – in general
In general, an intermediate algebra ensures:
• transformability;
• executability.
[Figure: query processing pipeline. Stages: Parser, Translation, Execution. Artifacts: TQL query, algebraic expression. Optimization steps: TQL rewriting, TQL Algebra rewriting, physical optimization.]
TQL Algebra motivations – TQL case
• No current algebra for XML supports the TQL operators (negation, quantification, horizontal navigation, etc.), so we define a new one.
• Because of negation and the derived operators, this algebra must support infinite bindings (a variable bound to an infinite number of values).
• We want an algebra whose semantics is formally specified, so that we can prove its correctness w.r.t. the TQL semantics.
• We want a running prototype, so we have to implement the data structures and the translation and evaluation algorithms for the TQL Algebra.
TQL Algebra – sorts and their semantics
• It is an algebra of tables and trees, defined on four sorts:
  • label expressions L: denoting labels;
  • tree expressions Q: denoting forests (sets of trees);
  • row expressions R^V: denoting rows over V (tuples with type V);
  • table expressions T^V: denoting finite or infinite tables (sets of rows) with schema V.
• The basic sort is the table sort, which is used to represent the evaluation of a TQL binding operation Q ⊨ A.
• SSD and TQL query results are naturally represented by tree expressions.
TQL Algebra – table expressions
• One-row tables
  {R^V} | {(x ↦ L)} | {(x ↦ Q)}
• Relational operators (union, cartesian product, projection and restriction)
  T ∪^{V,V'} T | T ×^{V,V'} T | Π_V T | σ_{L ~ L'} T
• Universe and Complement
  1^V | Co^V(T)
• Vertical test and horizontal iterator of trees
  if Q = y[Y] then T^{Y,y} else T | ⋃_{Y|Y'} {Q = Y|Y'}
• Recursion
  letrec M = λY. T^{M,Y} in T^M | M(Q)
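For finite tables, the relational operators behave as in ordinary relational algebra. The Python sketch below covers only that finite case and is not the system's implementation: rows are modelled as frozensets of `(variable, value)` pairs, all function names are hypothetical, and plain equality stands in for the label comparison ~, which is an assumption. The universe and complement operators need a symbolic representation and are not covered here.

```python
def row(**fields):
    """A row over a schema, modelled as a frozenset of (variable, value) pairs."""
    return frozenset(fields.items())


def union(t1, t2):
    """Union of two tables over the same schema."""
    return t1 | t2


def product(t1, t2):
    """Cartesian product; assumes the two schemas are disjoint."""
    return {r1 | r2 for r1 in t1 for r2 in t2}


def project(t, keep):
    """Projection onto the variables in keep."""
    return {frozenset((k, v) for k, v in r if k in keep) for r in t}


def restrict(t, x, y):
    """Keep the rows where columns x and y hold equal labels
    (equality stands in for the comparison ~, an assumption)."""
    return {r for r in t if dict(r)[x] == dict(r)[y]}


t1 = {row(x="a"), row(x="b")}
t2 = {row(y="a")}
joined = restrict(product(t1, t2), "x", "y")
print(sorted(dict(r).items()) for r in joined)
```

Because tables are Python sets, duplicate rows collapse automatically, matching the reading of a table as a set of rows.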
TQL Algebra – tree expressions
Q ::= R(X) | Y | 0 | Q | Q' | L[Q] | f(Q) | Par_{r ∈ T} Q_r
The tree algebra mirrors the TQL operators used to build trees (queries). The differences are:
• X does not denote a variable, but a field name of a row;
• we have a new metavariable Y ranging over tree variables;
• the from-select clause is replaced by the tree construction (multiset union) Par_{r ∈ T} Q_r, whose informal semantics is:
  “Compute the union of all Q_r where r is a row belonging to T”.
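The informal semantics of Par can be written as a short loop. The Python sketch below is an illustration only: the name `par` is hypothetical, a forest is modelled as a `collections.Counter` of hashable trees so that multiplicities are kept, and a table is a plain list of dictionaries. These are assumptions, not the representation used in the TQL system.

```python
from collections import Counter


def par(table, body):
    """Multiset union of body(r) over all rows r of the table."""
    result = Counter()
    for r in table:
        result += body(r)
    return result


# Three distinct rows; two of them agree on $T.
table = [
    {"$T": "Anywhere", "$D": "2000"},
    {"$T": "TQL", "$D": "2001"},
    {"$T": "TQL", "$D": "2002"},
]


def title_of(r):
    # Q_r = title[$T], encoded as a (label, children) tuple.
    return Counter([("title", ((r["$T"], ()),))])


forest = par(table, title_of)
print(forest)
```

The two rows that agree on $T contribute two copies of title[TQL], which is the difference between a multiset union and a set union.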
TQL Algebra – derived table expressions
• We can define several useful table expressions by translation:
  • intersection, junction, extension;
  • co-projection (the dual of projection);
  • other structural tests on the tree.
• These operators are very useful for translating the derived operators of the tree logic!
• All of them are implemented in the current system.
Translation from TQL to TQL Algebra
• The core of the translation is the binder translation. We perform a semantic inversion, transforming a formula (a function from substitutions to sets of trees) into a function that, given a tree, returns a set of substitutions (a table expression).
  Notation: ╓ A ╖ subscripted with Q, R^V and an environment.
• The translation is defined by structural recursion on A.
• It actually depends on the current schema V.
• Q and R are only plugged somewhere inside the expression.
• The environment maps logical recursion variables to algebraic ones.
Translation from TQL to TQL Algebra – example
• Example:
  ╓ from Q ⊨ A x. x[$Z] select Q' ╖_{R^V} = Par_{r ∈ T} ╓ Q' ╖_{R^V ; r}
  where T = ╓ A x. x[$Z] ╖ subscripted with Q, R^V and the environment.
[Figure: annotations on the subformulas A and x[$Z] (schema {$Z}), …]
Translation – operators
Formula          Algebraic operator    Dual formula     Dual algebraic op.
T                Universe              F                Empty
¬A               Complement
A ∧ B            Junction              A ∨ B            Ext. Union
∃x. A, ∃X. A     Projection            ∀x. A, ∀X. A     Co-Projection
0, L[A]          Test                  L[⇒A]            Test (inv)
L ~ L'           Restriction
A | B            Union Iterator        A || B           Join Iterator
X                Singleton
μξ. A            Recursion (minfix)    νξ. A            maxfix
Translation correctness
• The formal approach we have taken allows us to prove the correctness of the translation, that is:
Theorem
If FV(R^V) ⊆ dom(e) and FV(Q) ⊆ V, then
[[ Q ]]_{e(R^V)} = ╙ ╓ Q ╖_{R^V} ╜_e
The semantics of the query Q in e(R^V) is equivalent to the semantics of the translation of Q in R^V.
• The core of the proof is the from-select case, in which we prove the correctness of the binder translation.
Implementing the algebra – model description
• Problem: representing possibly infinite tables in finite space.
• We use disjunctive constraints (closely related to proposals in constraint databases).
• For each algebraic operator we define and implement the corresponding operator working on disjunctive constraints.
• New algorithms for complex operators (complement, co-projection, tree navigation).
Example:
$X       $Y
{ a }    NotIn { a, b }
{ b }    { a }
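One way to picture disjunctive constraints is to let each cell hold either a finite set of allowed values or a finite set of excluded values, and to read a table as the disjunction of its rows. The Python sketch below follows that reading. It is an illustration built on assumptions, not the algorithm used in the TQL system: the names `In`, `NotIn`, `neg`, `meet`, `member` and `complement` are hypothetical, and the complement is computed with De Morgan's laws applied row by row.

```python
def In(*vals):
    return ("in", frozenset(vals))


def NotIn(*vals):
    return ("notin", frozenset(vals))


def neg(c):
    """Negate a cell constraint."""
    kind, s = c
    return ("notin", s) if kind == "in" else ("in", s)


def meet(c1, c2):
    """Conjunction of two cell constraints."""
    (k1, s1), (k2, s2) = c1, c2
    if k1 == "in" and k2 == "in":
        return ("in", s1 & s2)
    if k1 == "in":
        return ("in", s1 - s2)
    if k2 == "in":
        return ("in", s2 - s1)
    return ("notin", s1 | s2)


def is_empty(c):
    return c[0] == "in" and not c[1]


def holds(c, value):
    return (value in c[1]) == (c[0] == "in")


def member(table, assignment):
    """True when some row admits the assignment."""
    return any(all(holds(r[v], assignment[v]) for v in r) for r in table)


def complement(table, schema):
    """Complement of a table with respect to the universe over schema."""
    universe = {v: NotIn() for v in schema}
    result = [universe]
    for r in table:
        # not (c1 and ... and cn) = (not c1) or ... or (not cn)
        negated = [{**universe, v: neg(r[v])}
                   for v in schema if not is_empty(neg(r[v]))]
        result = [m for a in result for b in negated
                  for m in [{v: meet(a[v], b[v]) for v in schema}]
                  if not any(is_empty(c) for c in m.values())]
    return result


table = [
    {"$X": In("a"), "$Y": NotIn("a", "b")},
    {"$X": In("b"), "$Y": In("a")},
]
co = complement(table, ["$X", "$Y"])
print(member(table, {"$X": "a", "$Y": "c"}), member(co, {"$X": "a", "$Y": "c"}))
```

The first row of the example table stands for infinitely many concrete rows, yet both the table and its complement stay finite. The complement can grow quickly with the number of rows, so this sketch makes no claim about efficiency.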
Implementing the algebra – The TQL System
[Figure: system architecture. Components: Tql Engine, Sys Interface, Tql GUI, Tql Servlet, Tql Applet. Data sources: DB, XML, file system, World Wide Web.]
• Implemented in Java and ported to C#.
• Some stats:
• ~20,000 LoC;
• 182 classes.
• Download at:
http://tql.di.unipi.it/tql
Conclusions
• TQL Algebra:
  • designed as a tool for executing TQL;
  • seems to be quite general;
  • is implemented (with some restrictions);
  • deals with infinite tables.
• Future work:
  • rewritings (with types and constraints);
  • static safety analysis;
  • cost model and physical optimizations;
  • extension to the graph model (graph logic).
The End