XML-to-SQL Query Mapping in the Presence of Multi-valued Schema Mappings and Recursive XML Schemas
Speaker: Artem Chebotko
Mustafa Atay, Artem Chebotko, Shiyong Lu, and Farshad Fotouhi
Department of Computer Science, Wayne State University, Detroit, MI 48202 USA
{matay, artem, shiyong, fotouhi}@wayne.edu
September 5, 2007 DEXA'07, Regensburg, Germany
Outline of Talk
Motivation
  XML-to-Relational Mappings
  Recursion in Query Mapping
Problem Statement and Contributions
Our Proposed Solution
  Path-based XML-to-Relational Mapping
  Unfolded XML Schema Graph
  Proposed generic query mapping algorithm
Conclusions and Future Work
XML-to-Relational Mappings
Single-valued
  Each XML element type is mapped to exactly one relation
  e.g., Shared, Shanmugasundaram et al., VLDB, 1999
  e.g., ODTDMap, Atay et al., Information Systems, 2007
Multi-valued
  Each XML element type can be mapped to multiple relations
  e.g., Basic and Hybrid, Shanmugasundaram et al., VLDB, 1999
Single-valued vs. Multi-valued
Single-valued vs. Multi-valued
Motivating Example
XPath expression
  /A/B1/C/D3
SQL query
  Select T4.ID
  From (A) T1, (B1) T2, (C) T3, (D3) T4
  Where T1.ID=T2.parentID And T2.ID=T3.parentID And T3.ID=T4.parentID
Recursion in Query Mapping
Challenge: When there is recursion both in an XML query and in its underlying schema, there might be infinitely many matching paths
Solutions
  Krishnamurthy et al., ICDE, 2004: requires the with construct of SQL’99
  Fan et al., VLDB, 2005: requires LFP (least fixpoint operator)
Problem Statement
i. Existing query mapping algorithms only support single-valued mappings
A query mapping algorithm supporting multi-valued mappings is missing
ii. Existing query mapping algorithms need special operators to deal with recursion
There is a need for a query mapping algorithm which can be implemented for any RDBMS
Contributions
We propose a generic query mapping algorithm which supports both single-valued and multi-valued mappings
Our proposed algorithm requires only the traditional relational operators to handle recursion
  It can be implemented in any RDBMS
Our Proposed Solution
Path-based XML-to-Relational Mapping
  p-Mapping supports multi-valued mappings (i)
Unfolded XML Schema Graph
  UXG helps identify a finite number of paths for a given recursive query (ii)
Proposed generic query mapping algorithm
  ID-XMLtoSQL
p-Mapping
Provides a solution to the problem of supporting multi-valued mappings
Combines the following to find a mapping
  XML-to-Relational Mapping (-Mapping)
  Path structure of input XML query p=e1/e2/…/en
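The slides do not spell out how a p-Mapping is computed, so the following is only a minimal sketch of the idea. It assumes a hypothetical lookup table `SIGMA` keyed by the relation of the parent element and the element name, with entries chosen to agree with the paths used elsewhere in this talk. Under that assumption, the same element type (here D3) resolves to different relations depending on the path in which it occurs.

```python
# Hypothetical multi-valued mapping: (relation of parent, element) -> relation.
# The table contents are illustrative assumptions, not taken from the paper.
SIGMA = {
    (None, "A"): "A",
    ("A", "D3"): "A",    # D3 stored inside its parent's relation A
    ("A", "E"): "E",
    ("E", "D1"): "D1",
    ("D1", "E"): "E",
    ("A", "B1"): "B1",
    ("B1", "C"): "B1",
    ("B1", "D3"): "B1",  # the same element type D3, now stored in B1
    ("B1", "E"): "E",
}


def p_mapping(path):
    """Return [(element, relation)] for a simple path such as '/A/D3/E'."""
    pairs = []
    parent_relation = None
    for element in path.strip("/").split("/"):
        relation = SIGMA[(parent_relation, element)]
        pairs.append((element, relation))
        parent_relation = relation
    return pairs


first = p_mapping("/A/D3/E")
second = p_mapping("/A/B1/C/D3/E")
```

Each path element ends up with exactly one relation, even though the underlying mapping is multi-valued.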
p-Mapping Example
/A/B1/C/D3
SQL query
  Select T4.ID
  From (A) T1, (B1) T2, (C) T3, (D3) T4
  Where T1.ID=T2.parentID And T2.ID=T3.parentID And T3.ID=T4.parentID
UXG (Unfolded XML Schema Graph)
UXG provides a solution to the problem of finding a finite number of matching paths for a recursive XML query
We convert a cyclic XML schema graph to a directed acyclic graph by unfolding the cycles in the original graph (UXG)
  Static and dynamic approaches
UXG always guarantees a finite number of matching paths for an arbitrary XML query
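To illustrate why an acyclic graph yields finitely many matches, here is a minimal sketch that enumerates the paths matching a simple path expression (only `/` and `//` steps) over a DAG. The function `matching_paths`, the node identifiers, and the shape of the unfolded graph are assumptions made for illustration; the graph is chosen so that `/A/D3//E` produces the two paths shown in the worked example later in the talk. How the unfolding itself is performed is not covered here.

```python
def matching_paths(children, label, root, steps):
    """Enumerate label paths in a DAG that match a simple path expression.

    children: node id -> list of child node ids (must be acyclic)
    label:    node id -> element name
    steps:    list of (axis, name), where axis is "/" or "//"
    """
    found = []

    def visit(node, prefix, i):
        # `node` is a candidate for steps[i]; `prefix` holds the labels above it.
        axis, name = steps[i]
        here = prefix + [label[node]]
        if label[node] == name:
            if i + 1 == len(steps):
                found.append("/" + "/".join(here))
            else:
                for child in children.get(node, []):
                    visit(child, here, i + 1)
        if axis == "//":
            # A descendant step may also skip this node and match further down.
            for child in children.get(node, []):
                visit(child, here, i)

    visit(root, [], 0)
    return list(dict.fromkeys(found))  # drop duplicates, keep order


# Hypothetical unfolded graph: an assumed E -> D1 -> E cycle unrolled once.
LABEL = {"a": "A", "d3": "D3", "e": "E", "d1": "D1", "e2": "E"}
CHILDREN = {"a": ["d3"], "d3": ["e"], "e": ["d1"], "d1": ["e2"]}

paths = matching_paths(CHILDREN, LABEL, "a", [("/", "A"), ("/", "D3"), ("//", "E")])
```

Because the traversal never revisits a node along a cycle, the recursion always terminates and the result list is finite.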
UXG Example
Algorithm ID-XMLtoSQL
Given a path expression P and UXG Gu
1. Extract all matching paths
2. Identify the p-Mappings for each pi
3. Call SPathToSQL() to generate an SQL query for each pi
4. Get the union of output SQL queries
Clustering
A cluster is the set of consecutive elements in a path expression which are mapped to the same relation
We use the notion of a cluster for optimizing the output SQL query in SPathToSQL()
e.g., /A/B1/C/D3/E
  relations: A, B1, B1, B1, E
  clusters: c1, c2, c3
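The cluster definition can be sketched in a few lines. This assumes that a p-Mapping is represented as a list of (element, relation) pairs and that, for /A/B1/C/D3/E, the elements map to relations A, B1, B1, B1, E in that order; the function name `clusters` is hypothetical.

```python
from itertools import groupby


def clusters(p_mapping):
    """Group consecutive (element, relation) pairs that share a relation."""
    return [
        (relation, [element for element, _ in group])
        for relation, group in groupby(p_mapping, key=lambda pair: pair[1])
    ]


# Assumed p-Mapping for /A/B1/C/D3/E.
example = [("A", "A"), ("B1", "B1"), ("C", "B1"), ("D3", "B1"), ("E", "E")]
result = clusters(example)
```

Here `result` holds three clusters. The middle one covers B1, C and D3, so a query generator can read all three elements from a single B1 tuple instead of joining three relation instances.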
Algorithm SPathToSQL
Example
/A/D3//E is given
extracted paths
  /A/D3/E
  /A/D3/E/D1/E
identified p-Mappings
  {(A,A), (D3,A), (E,E)}
  {(A,A), (D3,A), (E,E), (D1,D1), (E,E)}
Output SQL query
  Select E.ID
  From A, E
  Where A.D3.ID=E.parentID
  UNION ALL
  Select T2.ID
  From A, E T1, D1, E T2
  Where A.D3.ID=T1.parentID And T1.ID=D1.parentID And D1.ID=T2.parentID
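The last two steps of the algorithm (one query per path, then a union) can be sketched as follows. This is not the SPathToSQL procedure from the paper, whose details are not in this transcript. It is a simplified stand-in that assumes one relation instance per cluster, an alias T1..Tn for every cluster, and the column notation inferred from the example above (`alias.ID` when an element is stored in its own relation, `alias.element.ID` when it is stored in another relation). The function names are hypothetical.

```python
from itertools import groupby


def id_column(alias, element, relation):
    # Notation inferred from the slide example; an assumption, not a specification.
    return f"{alias}.ID" if element == relation else f"{alias}.{element}.ID"


def spath_to_sql(p_mapping):
    """Build one SELECT for a single path from its [(element, relation)] list."""
    clusters = [(rel, [e for e, _ in grp])
                for rel, grp in groupby(p_mapping, key=lambda pair: pair[1])]
    aliases = [f"T{i}" for i in range(1, len(clusters) + 1)]
    from_clause = ", ".join(f"{rel} {alias}"
                            for (rel, _), alias in zip(clusters, aliases))
    joins = []
    for k in range(1, len(clusters)):
        prev_rel, prev_elems = clusters[k - 1]
        parent_id = id_column(aliases[k - 1], prev_elems[-1], prev_rel)
        joins.append(f"{parent_id} = {aliases[k]}.parentID")
    last_rel, last_elems = clusters[-1]
    sql = f"SELECT {id_column(aliases[-1], last_elems[-1], last_rel)} FROM {from_clause}"
    if joins:
        sql += " WHERE " + " AND ".join(joins)
    return sql


def union_all(p_mappings):
    """Combine the per-path queries with UNION ALL."""
    return "\nUNION ALL\n".join(spath_to_sql(m) for m in p_mappings)


query = union_all([
    [("A", "A"), ("D3", "A"), ("E", "E")],
    [("A", "A"), ("D3", "A"), ("E", "E"), ("D1", "D1"), ("E", "E")],
])
```

Apart from aliasing every relation instance, `query` has the same structure as the output SQL query in the example: elements within a cluster need no join, and only cluster boundaries contribute an ID/parentID condition.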
Performance Study
We compared our ID-XMLtoSQL to SQLGen of Krishnamurthy et al., ICDE, 2004
We selected 9 queries from the XMark benchmark
ID-XMLtoSQL outperformed SQLGen in all the test queries
Conclusions and Future Work
We proposed a generic query mapping algorithm for schema-based relational XML storage
We proposed an efficient way of handling recursion in query mapping which is applicable to all RDBMSs
We consider augmenting our proposed ID-based algorithm with interval-based and path-based mapping schemes as potential future work
Thank you!
Questions?