April 4, 2002 Updating XML Views of Relational Data 1 Updating XML Views of Relational Data Master’s Thesis Update Talk For Mukesh Mulchandani Advisor : Prof. Elke Rundensteiner Reader : Prof. George Heineman


April 4, 2002 Updating XML Views of Relational Data


Updating XML Views of Relational Data

Master’s Thesis Update Talk
For Mukesh Mulchandani
Advisor: Prof. Elke Rundensteiner
Reader: Prof. George Heineman


Outline
• Motivation
• Background Material
• Rainbow System
• My Extension to Rainbow System
• Work in Literature
• Assumptions and Restrictions
• Approach
• Status
• Evaluation
• Related Work


Motivation
• XML is here to stay …
  – Universal representation of data
  – De facto standard for information exchange
• But RDBMS is mature
  – Mature query optimization techniques
  – High query performance
• Hence extensive research on storing XML in RDBs and publishing XML
  – Florescu and Kossmann, Storing and Querying XML Data Using RDBMS, Bulletin of the Technical Committee on Data Engineering, 1999
  – Shanmugasundaram, He, Tufte, Zhang, DeWitt and Naughton, Relational Databases for Querying XML Documents: Limitations and Opportunities, VLDB 1999
  – Zhang, Lee, Mitchell and Rundensteiner, Clock: Synchronizing Internal Relational Storage with External XML Documents, RIDE-DM 2000


Bridging XML & RDBs
[Figure: XPERANTO & Rainbow architecture. Labels: Relations, RDBMS, Default XML View, XML Virtual View, XQuery, XML Documents, SQL, Query Executor, Tuples.]
• XQuery to Query Translation
  – Query Composition
  – Query Rewriting
  – Computation Pushdown
  – SQL Generation
• Updates? Not yet supported


My Thesis Goals
• Specify updates on XML views
• Push updates correctly into underlying relations
  – Irrespective of the mapping chosen between XML and RDB

Issues to be addressed:
  – How to specify updates?
  – What kind of updates?
  – Meaning of updates at the RDB end (schema or data updates)
  – How to propagate?
  – How to know that propagation is correct?
  – How is the performance?


Example of XML document & XQuery

<authors>
  <author>
    <first>Michael</first>
    <last>Savitch</last>
    <title>Data Structures in C++</title>
    <year>2000</year>
  </author>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authors>

User XQuery:
<SavitchBooks>
  FOR $author IN document("authorview")/author
  WHERE $author/last = "Savitch"
  RETURN $author/title
</SavitchBooks>

Result:
<SavitchBooks>
  <title>Data Structures in C++</title>
</SavitchBooks>
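As a rough illustration of what the query above computes, the sketch below evaluates the same FOR / WHERE / RETURN logic in Python with the standard-library ElementTree module. It is not how Rainbow evaluates XQuery; the function name `savitch_books` and the constant `AUTHORS_XML` are made up for this example.

```python
import xml.etree.ElementTree as ET

# The example document from the slide, held in a string for the sketch.
AUTHORS_XML = """
<authors>
  <author><first>Michael</first><last>Savitch</last><title>Data Structures in C++</title><year>2000</year></author>
  <author><first>Peter</first><last>Naughton</last><title>JAVA 2 Reference</title><year>1998</year></author>
</authors>
"""


def savitch_books(xml_text: str) -> ET.Element:
    """Hand-written equivalent of the slide's query (illustrative only)."""
    root = ET.fromstring(xml_text)
    result = ET.Element("SavitchBooks")
    for author in root.findall("author"):           # FOR $author IN .../author
        if author.findtext("last") == "Savitch":    # WHERE $author/last = "Savitch"
            result.extend(author.findall("title"))  # RETURN $author/title
    return result


result = savitch_books(AUTHORS_XML)
print(ET.tostring(result, encoding="unicode"))
```

The loop, the `if`, and the `extend` call line up one-to-one with the three clauses of the query, which is the correspondence the later algebra (XAT) slides rely on.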

XML views of Relational Data

Author
First    Last      BookId
Michael  Savitch   1
Peter    Naughton  2

Book
BookId  Title                   Year
1       Data Structures in C++  2000
2       JAVA 2 Reference        1998

Default View
<DB>
  <Author>
    <row>
      <first>Michael</first>
      <last>Savitch</last>
      <bookid>1</bookid>
    </row>
    <row>
      <first>Peter</first>
      <last>Naughton</last>
      <bookid>2</bookid>
    </row>
  </Author>
  <Book>
    <row>
      <bookid>1</bookid>
      <title>Data Structures in C++</title>
      <year>2000</year>
    </row>
    <row>
      <bookid>2</bookid>
      <title>JAVA 2 Reference</title>
      <year>1998</year>
    </row>
  </Book>
</DB>
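The default view is a one-to-one rendering of the tables: one element per table, one `row` element per tuple, one child element per column. A minimal sketch of that rendering follows, assuming an in-memory SQLite database in place of the real RDBMS; `build_default_view` is a hypothetical helper, not part of Rainbow.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_default_view(conn: sqlite3.Connection, tables: list) -> ET.Element:
    """Render each table as <Table><row><col>value</col>...</row>...</Table> under <DB>."""
    db = ET.Element("DB")
    for table in tables:
        table_el = ET.SubElement(db, table)
        # Table names come from a trusted list here; identifiers cannot be bound as parameters.
        cursor = conn.execute(f"SELECT * FROM {table}")
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            row_el = ET.SubElement(table_el, "row")
            for column, value in zip(columns, row):
                ET.SubElement(row_el, column).text = str(value)
    return db


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Book (bookid INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE Author (first TEXT, last TEXT, bookid INTEGER REFERENCES Book(bookid),
                     PRIMARY KEY (first, last));
INSERT INTO Book VALUES (1, 'Data Structures in C++', 2000), (2, 'JAVA 2 Reference', 1998);
INSERT INTO Author VALUES ('Michael', 'Savitch', 1), ('Peter', 'Naughton', 2);
""")

default_view = build_default_view(conn, ["Author", "Book"])
print(ET.tostring(default_view, encoding="unicode"))
```

Because the rendering is purely structural, every XML element in the default view can be traced back to exactly one table, tuple, and column, which is what makes update propagation possible at all.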


XML views of Relational Data (Contd)

Mapping View
<authorview>
  <author>
    <first>Michael</first>
    <last>Savitch</last>
    <title>Data Structures in C++</title>
    <year>2000</year>
  </author>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authorview>

Mapping Query
<authorview>
  FOR $authortuple IN document("Default.xml")/author/row,
      $booktuple IN document("Default.xml")/book/row
  WHERE $authortuple/bookid = $booktuple/bookid
  RETURN
    <author>
      $authortuple/first
      $authortuple/last
      $booktuple/title
      $booktuple/year
    </author>
</authorview>
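The mapping query is a join of the Author and Book rows of the default view on `bookid`, followed by a projection into `author` elements. The sketch below spells that out as a nested loop over a hand-written copy of the default view. It assumes the capitalized element names `Author` and `Book` from the default-view slide (the mapping query writes them in lower case), and `author_view` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

DEFAULT_XML = """
<DB>
  <Author>
    <row><first>Michael</first><last>Savitch</last><bookid>1</bookid></row>
    <row><first>Peter</first><last>Naughton</last><bookid>2</bookid></row>
  </Author>
  <Book>
    <row><bookid>1</bookid><title>Data Structures in C++</title><year>2000</year></row>
    <row><bookid>2</bookid><title>JAVA 2 Reference</title><year>1998</year></row>
  </Book>
</DB>
"""


def author_view(default_root: ET.Element) -> ET.Element:
    """Nested-loop join on bookid, then build one <author> element per matching pair."""
    view = ET.Element("authorview")
    for author_row in default_root.findall("Author/row"):      # FOR $authortuple ...
        for book_row in default_root.findall("Book/row"):      #     $booktuple ...
            if author_row.findtext("bookid") == book_row.findtext("bookid"):  # WHERE
                author = ET.SubElement(view, "author")          # RETURN <author>...</author>
                for source, tag in ((author_row, "first"), (author_row, "last"),
                                    (book_row, "title"), (book_row, "year")):
                    ET.SubElement(author, tag).text = source.findtext(tag)
    return view


view = author_view(ET.fromstring(DEFAULT_XML))
print(ET.tostring(view, encoding="unicode"))
```

Each `author` element draws two children from an Author tuple and two from a Book tuple, so a single update on the view may have to be split across both tables.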


Rainbow Query Engine
[Figure: architecture of the Rainbow query engine. Components: XML View Manager, Kweelt XQuery Parser, XAT Generator, View Composition, XAT Rewrite, SQL Generator, XAT Executor, RDBMS. Data-flow labels: User Query (XQuery), Mapping Query, Default Schema, Parsed Tree, XAT, SQL, Tuples, User Query Results in XML.]


What Changes Do I Need To Make?
• Extend:
  – XQuery language to support XML updates
  – XQuery Parser (Kweelt) to parse updates
  – “XAT Generator” to generate update nodes in XAT
• Add functionality:
  – “View Analyzer” to extract knowledge of underlying tables and relationships among them
  – “Update Decomposer” to decompose a given XML update into relational updates
  – “Update Translator” to make a correct SQL translation using semantic information about relational tables


Update Manager within Rainbow
[Figure: Rainbow architecture extended with an update manager. Components: XML View Manager, Kweelt XQuery Parser, Update Parser, XAT Generator, View Composer, XAT Rewrite, View Analyzer, Update Decomposer, Update Translator, SQL Node Generation, XAT Executor, RDBMS. Data-flow labels: User Query, User Update, Mapping Query, Default Schema, Parsed Tree, XAT, Table Relationships, Single Table Updates, Multiple SQL updates, SI, SQL, Tuples, User Query Results in XML.]


Relational View Update Scenario

Author
First    Last      BookId
Michael  Savitch   1
Peter    Naughton  2

Book
BookId  Title                   Year
1       Data Structures in C++  2000
2       JAVA 2 Reference        1998

AuthorView
First    Last      Title                   Year
Michael  Savitch   Data Structures in C++  2000
Peter    Naughton  JAVA 2 Reference        1998

• Root Table
• Ownership or Subset Relationships
• Reference Relationships


Work in Literature
• On the updatability of relational views
  – Dayal and Bernstein, IEEE transaction 1978
• A Relational Database View Update Translation Mechanism
  – Yoshifumi Masunaga, VLDB 1984
• Algorithms for translating view updates to database updates for views involving selections, projections, and joins
  – A. M. Keller, ACM SIGACT-SIGMOD 1985
• Updating Relational Databases through Object-Based Views
  – A. M. Keller, Barsalou, Siambela and Wiederhold, SIGMOD 1991


Term Definitions (Keller, SIGMOD 1991)
• Root Relation
  – A relational table under the view
  – Key of the view elements is the same as the key of this relation

<authorview>
  <author>
    <first>Michael</first>
    <last>Savitch</last>
    <title>Data Structures in C++</title>
    <year>2000</year>
  </author>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authorview>

Author (Key: (First, Last))
First    Last      BookId
Michael  Savitch   1
Peter    Naughton  2

Book (Key: Bookid)
BookId  Title                   Year
1       Data Structures in C++  2000
2       JAVA 2 Reference        1998

Root: Author


Term Definitions (Contd.)

Author (Key: (First, Last))
First    Last      BookId
Michael  Savitch   1
Peter    Naughton  2

Book (Key: Bookid)
BookId  Title                   Year
1       Data Structures in C++  2000
2       JAVA 2 Reference        1998

• Ownership Connection (R2 is owned by R1)
  – Foreign key (also part of key) of R2, not unique, referring to key of R1 (1:n)
• Subset Connection (R2 is subset of R1)
  – Foreign key (also key) of R2, unique, referring to key of R1 (1:1)
• Reference Connection (R2 references R1)
  – Foreign key of R2, null allowed, not unique, referring to key of R1
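The three connection types differ only in how the foreign key of R2 relates to the key of R2, so they can be told apart from catalog information alone. The sketch below encodes one possible reading of the definitions: foreign key equal to the key gives a subset connection, foreign key strictly inside the key gives ownership, anything else is treated as a reference. The `Connection` class, the `classify` function, and the `Chapter` and `BookDetail` relations are all hypothetical; only the Author/Book pair comes from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    child: str             # R2, the relation holding the foreign key
    parent: str            # R1, the relation whose key is referenced
    fk_columns: frozenset  # foreign-key columns of R2
    child_key: frozenset   # key columns of R2


def classify(conn: Connection) -> str:
    """Classify a foreign-key connection under the simplified reading described above."""
    if conn.fk_columns == conn.child_key:
        return "subset"      # FK is the whole key of R2, hence unique (1:1)
    if conn.fk_columns < conn.child_key:
        return "ownership"   # FK is a proper part of the key of R2 (1:n)
    return "reference"       # FK lies outside the key of R2


# From the slides: Author(first, last, bookid) with key (first, last) points at Book.
author_book = Connection("Author", "Book",
                         frozenset({"bookid"}), frozenset({"first", "last"}))
# Hypothetical relations, added only to exercise the other two cases.
chapter_book = Connection("Chapter", "Book",
                          frozenset({"bookid"}), frozenset({"bookid", "chapterno"}))
detail_book = Connection("BookDetail", "Book",
                         frozenset({"bookid"}), frozenset({"bookid"}))

for c in (author_book, chapter_book, detail_book):
    print(c.child, "->", c.parent, ":", classify(c))
```

Under this reading the Author-to-Book link of the running example is a reference connection, which is consistent with Book staying outside the dependency island later in the talk.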


Term Definitions (Contd.)
• Dependency island of view
  – Maximal sub-tree of the tree of projections such that
    • Root of sub-tree is root relation
    • All directed paths beginning at root relation contain exclusively ownership and subset connections
• Referencing Peninsula
  – Is a relation
    • directly connected to any relation of dependency island
    • via reference connection
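Both definitions amount to a small graph traversal over the connections between the relations of the view. A minimal sketch, assuming each connection is given as a `(child, parent, kind)` triple where `child` holds the foreign key: the island grows from the root along ownership and subset connections, and a peninsula is any relation outside the island that references an island relation. The function names and the `Chapter` relation are illustrative assumptions.

```python
def dependency_island(root, connections):
    """Relations reachable from the root via ownership/subset connections only."""
    island = {root}
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        for child, target, kind in connections:
            if target == parent and kind in ("ownership", "subset") and child not in island:
                island.add(child)
                frontier.append(child)
    return island


def referencing_peninsulas(island, connections):
    """Relations outside the island that reference an island relation."""
    return {child for child, target, kind in connections
            if kind == "reference" and target in island and child not in island}


# Running example from the slides: Author references Book, and Author is the root.
running = [("Author", "Book", "reference")]
island = dependency_island("Author", running)
peninsulas = referencing_peninsulas(island, running)
print(island, peninsulas)

# Hypothetical variation with Book as root and an owned Chapter relation.
variation = [("Chapter", "Book", "ownership"), ("Author", "Book", "reference")]
island2 = dependency_island("Book", variation)
peninsulas2 = referencing_peninsulas(island2, variation)
print(island2, peninsulas2)
```

For the running example this yields the island {Author} and no peninsulas, matching the analysis result shown on the Step 1 slide.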


Restrictions on views
• View should be in SPJ form
  – Only selects, projects and joins are allowed in the view definition
• View has at least one root relation
  – Key of table is also key of XML elements in view
• No aggregation in view definition
  – No aggregate columns such as max( ), count( ) etc.
• No explicit external dependencies
  – No element value computed from values of other elements


Assumptions for type of updates
• Only data updates are performed
• Only complete valid updates are issued

<authorview>
  <author>
    <first>Michael</first>
    <last>Savitch</last>
    <title>Data Structures in C++</title>
    <year>2000</year>
  </author>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authorview>

FOR $root IN document("authorview.xml"),
    $author IN $root/author[first = "Michael"][last = "Savitch"]
UPDATE $root {
  DELETE $author
}

FOR $root IN document("authorview.xml")
UPDATE $root {
  INSERT <author>
           <first>Joe</first>
           <last>Smith</last>
           <title>First Book</title>
           <year>2002</year>
         </author>
}
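To make the intended effect of the two updates concrete, the sketch below applies them to a materialized copy of the view using ElementTree. In the thesis setting the view is virtual and the updates must instead be pushed down to the relations; this is only a reference for what the view should look like afterwards. `delete_author` and `insert_author` are names invented for the example.

```python
import xml.etree.ElementTree as ET

VIEW_XML = """
<authorview>
  <author><first>Michael</first><last>Savitch</last><title>Data Structures in C++</title><year>2000</year></author>
  <author><first>Peter</first><last>Naughton</last><title>JAVA 2 Reference</title><year>1998</year></author>
</authorview>
"""


def delete_author(root: ET.Element, first: str, last: str) -> int:
    """DELETE $author for every author matching [first = ...][last = ...]."""
    victims = [a for a in root.findall("author")
               if a.findtext("first") == first and a.findtext("last") == last]
    for author in victims:
        root.remove(author)
    return len(victims)


def insert_author(root: ET.Element, first: str, last: str, title: str, year: str) -> ET.Element:
    """INSERT a complete <author> element, i.e. one carrying all four children."""
    author = ET.SubElement(root, "author")
    for tag, value in (("first", first), ("last", last), ("title", title), ("year", year)):
        ET.SubElement(author, tag).text = value
    return author


root = ET.fromstring(VIEW_XML)
deleted = delete_author(root, "Michael", "Savitch")
insert_author(root, "Joe", "Smith", "First Book", "2002")
print(ET.tostring(root, encoding="unicode"))
```

Both operations act on whole `author` elements, which is the "complete valid update" assumption: the insert supplies every child the mapping query produces, so each underlying table can receive a full tuple.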


XQuery Update Grammar
• A FLWU Expression

  FOR $binding1 IN Xpath-expr, …
  LET $binding := Xpath-expr, …
  WHERE predicate1, …
  updateOp, …

  where updateOp is defined as:
    UPDATE $binding { subOp {, subOp}* }
  and subOp is:
    DELETE $child |
    RENAME $child TO new_name |
    INSERT ( $bind [BEFORE | AFTER $child]
           | new_attribute(name, value)
           | new_ref(name, value)
           | content [BEFORE | AFTER $child] ) |
    REPLACE $child WITH ( new_attribute(name, value)
                        | new_ref(name, value)
                        | content ) |
    FOR $sub_binding IN Xpath-subexpr, …
    WHERE predicate1, …
    updateOp

Running Example

Mapping Query
<authorview>
  FOR $authortuple IN document("Default.xml")/author/row,
      $booktuple IN document("Default.xml")/book/row
  WHERE $authortuple/bookid = $booktuple/bookid
  RETURN
    <author>
      $authortuple/first
      $authortuple/last
      $booktuple/title
      $booktuple/year
    </author>
</authorview>

Mapping View
<authorview>
  <author>
    <first>Michael</first>
    <last>Savitch</last>
    <title>Data Structures in C++</title>
    <year>2000</year>
  </author>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authorview>

User Update
FOR $root IN document("authorview.xml"),
    $author IN $root/author[first = "Michael"][last = "Savitch"]
UPDATE $root {
  DELETE $author
}

Expected Effect
<authorview>
  <author>
    <first>Peter</first>
    <last>Naughton</last>
    <title>JAVA 2 Reference</title>
    <year>1998</year>
  </author>
</authorview>


My Approach
• Analyze the mapping query XAT to find
  – underlying relations
  – root relation for the view
  – relationship of other relations with the root relation
• Generate XAT for the user update on the mapping view
• Massage the composed tree to decompose the update into updates against individual relations
• Use semantic information for correct translation of updates
• Generate SQL updates to be executed in the relational engine


Step 1 : Analysis of mapping query XAT

Mapping Query
<authorview>
  FOR $authortuple IN document("Default.xml")/author/row,
      $booktuple IN document("Default.xml")/book/row
  WHERE $authortuple/bookid = $booktuple/bookid
  RETURN
    <author>
      $authortuple/first
      $authortuple/last
      $booktuple/title
      $booktuple/year
    </author>
</authorview>

Default View
<DB>
  <Author> <row> ……… </row> ……. </Author>
  <Book> <row> …… </row> …….. </Book>
</DB>

[Figure: XAT of the mapping query. Operator nodes:]
  S("Default.xml"):S1
  (S1, /author/row):$authortuple
  ($authortuple, bookid):$col1
  S("Default.xml"):S2
  (S2, /book/row):$booktuple
  ($booktuple, bookid):$col2
  ($col1=$col2)
  ($authortuple, first):$col3
  ($authortuple, last):$col4
  ($booktuple, title):$col5
  ($booktuple, year):$col6
  T(<author>[$col3], [$col4], [$col5], [$col6]</author>):col7
  T(<authorview>[col7]</authorview>):col8
  Agg()

Results of the analysis
  Relations: Author, Book
  Database Catalog (keys): Author: First, Last; Book: Bookid
  Root Relation: Author
  Dependency Island: Author
  Referencing Peninsulas: None


Step 2 : Generation of XAT for user update

User Update
FOR $root IN document("authorview.xml"),
    $author IN $root/author[first = "Michael"][last = "Savitch"]
UPDATE $root {
  DELETE $author
}

[Figure: XAT of the user update. Operator nodes:]
  S("authorview.xml"):S1
  (S1, ):$root
  ($root, author):$author
  ($author, first):$col1
  ($col1="Michael")
  ($author, last):$col2
  ($col2="Savitch"):$author
  Delete($root, $author)


Composing two XATs
[Figure: the mapping-query XAT from Step 1 and the user-update XAT from Step 2 shown together before composition. The operator nodes are the same as on those two slides, with the two sources of the mapping query labelled S("Default.xml"):S1 and S("Default.xml"):S2.]


Composed Algebra Tree
[Figure: composed XAT. Operator nodes:]
  S("Default.xml"):S1
  (S1, /author/row):$authortuple
  ($authortuple, bookid):$col1
  S("Default.xml"):S2
  (S2, /book/row):$booktuple
  ($booktuple, bookid):$col2
  ($col1=$col2):$root
  ($author, first):$col3
  ($col3="Michael")
  ($author, last):$col4
  ($col4="Savitch"):$author
  Delete($root, $author)


Step 3 : Decomposing the Update
[Figure: the composed algebra tree, repeated as the input to decomposition. The operator nodes are identical to those on the Composed Algebra Tree slide.]


Decomposing the Update (Contd.)
[Figure: the composed update split into two single-relation updates. Each is drawn over the same source and navigation nodes, S("Default.xml"):S1 with (S1, /author/row):$authortuple and S("Default.xml"):S2 with (S2, /book/row):$booktuple, under Delete($root, $author).]

Update against Author:
  Delete(author, first = "Michael", last = "Savitch", Bookid = book.Bookid)

Update against Book:
  Delete(book, Bookid = author.Bookid, author.first = "Michael", author.last = "Savitch")


Step 4 : Correct Translation of Update
• Delete Algorithm (A. M. Keller, SIGMOD 1991)
  – Isolate dependency island
  – For each projection in dependency island, delete all matching tuples from underlying relation
  – Identify referencing peninsulas
  – For each peninsula, perform a replacement on the foreign key of each matching tuple

Decomposed updates:
  Delete(author, first = "Michael", last = "Savitch", Bookid = book.Bookid)
  Delete(book, Bookid = author.Bookid, author.first = "Michael", author.last = "Savitch")

Root Relation: Author
Dependency Island: Author
Referencing Peninsulas: None
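The delete algorithm can be sketched as a function that turns the island and peninsula information into a list of SQL statements. This is a simplified illustration, not the thesis implementation: `translate_delete` is a hypothetical helper, the foreign-key "replacement" is assumed to be setting the key to NULL, peninsula keys are assumed to be single columns, and SQL is built by plain string formatting for readability (real code would use bound parameters). The peninsula updates are emitted first so that their subqueries still see the tuples about to be deleted. The `loan` relation is invented to exercise the peninsula branch.

```python
def translate_delete(island_predicates, peninsula_refs=()):
    """Translate a view delete into SQL statements.

    island_predicates: {relation: WHERE clause} for each relation in the dependency island.
    peninsula_refs: (peninsula, fk_column, island_relation, key_column) tuples.
    """
    statements = []
    # Peninsulas: replace (here: null out) foreign keys that point at tuples to be deleted.
    for peninsula, fk_col, island_rel, key_col in peninsula_refs:
        where = island_predicates[island_rel]
        statements.append(
            f"UPDATE {peninsula} SET {fk_col} = NULL "
            f"WHERE {fk_col} IN (SELECT {key_col} FROM {island_rel} WHERE {where})"
        )
    # Island: delete all matching tuples from each underlying relation.
    for relation, where in island_predicates.items():
        statements.append(f"DELETE FROM {relation} WHERE {where}")
    return statements


# Running example: the island is {Author} and there are no peninsulas,
# so only the delete against author survives translation.
running = translate_delete(
    {"author": "first = 'Michael' AND last = 'Savitch' "
               "AND bookid IN (SELECT bookid FROM book)"}
)

# Hypothetical case with one referencing peninsula.
with_peninsula = translate_delete(
    {"book": "bookid = 1"},
    [("loan", "bookid", "book", "bookid")],
)

for sql in running + with_peninsula:
    print(sql)
```

In the running example Book lies outside the dependency island, so the decomposed delete against book is dropped and the book tuple is kept.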


Step 5 : Generation of SQL Update

Translated update:
  Delete(author, first = "Michael", last = "Savitch", Bookid = book.Bookid)

Generated SQL:
  DELETE FROM author
  WHERE first = 'Michael' AND last = 'Savitch'
    AND bookid IN (SELECT bookid FROM book)
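One way to check the generated statement is to run it against a small copy of the tables and rebuild the view with a join afterwards. The sketch below does this with an in-memory SQLite database standing in for the real RDBMS; the table definitions, `VIEW_SQL`, and `author_view` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (bookid INTEGER PRIMARY KEY, title TEXT, year INTEGER);
CREATE TABLE author (first TEXT, last TEXT, bookid INTEGER REFERENCES book(bookid),
                     PRIMARY KEY (first, last));
INSERT INTO book VALUES (1, 'Data Structures in C++', 2000), (2, 'JAVA 2 Reference', 1998);
INSERT INTO author VALUES ('Michael', 'Savitch', 1), ('Peter', 'Naughton', 2);
""")

# Relational counterpart of the mapping query: join author and book on bookid.
VIEW_SQL = """
SELECT a.first, a.last, b.title, b.year
FROM author a JOIN book b ON a.bookid = b.bookid
ORDER BY a.last
"""


def author_view(connection):
    """Reconstruct the author view as a list of (first, last, title, year) tuples."""
    return connection.execute(VIEW_SQL).fetchall()


before = author_view(conn)

# The SQL update generated in Step 5.
conn.execute("""
DELETE FROM author
WHERE first = 'Michael' AND last = 'Savitch'
  AND bookid IN (SELECT bookid FROM book)
""")

after = author_view(conn)
remaining_books = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(before, after, remaining_books)
```

After the delete, the reconstructed view contains only the Naughton entry, matching the expected effect of the running example, while both book tuples remain in place.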


Status
• XQuery to support updates on XML
• XQuery Parser (Kweelt) to parse updates
• “XAT Generator” to generate update nodes in XAT
• View Analyzer
• Update Decomposer
• Update Translator
• Implementation: ongoing
• Evaluation: not done yet


Evaluation
• Correct propagation of the updates
  – Manual check on the view reconstructed from the tables after the update
• Works irrespective of the mapping chosen to map XML documents into the RDB
  – Experiment with different mapping approaches
• Time taken to propagate the update
  – Using different views with different numbers of underlying tables


Expected Contributions
• XQuery support for XML updates
  – Work done with MQP group
• Framework for correct propagation of XML updates to underlying relations
  – Application of object-based view updating strategy to XML views
  – First solution for updating XML views
• Implementation of the system as a proof of concept, incorporated into the full working Rainbow system
• Experimental evaluation


Related Work
• Querying XML views of Relational Data
  – J. Shanmugasundaram, Jerry Kiernan, Eugene Shekita, Catalina Fan and John Funderburk, VLDB 2001
• Updating XML
  – Tatarinov, Ives, Alon Halevy and Daniel Weld, SIGMOD 2001
• Updating Relational Databases through Object-Based Views
  – A. M. Keller, Barsalou, Siambela and Wiederhold, SIGMOD 1991
• Algorithms for translating view updates to database updates for views involving selections, projections, and joins
  – A. M. Keller, ACM SIGACT-SIGMOD 1985
• A Relational Database View Update Translation Mechanism
  – Yoshifumi Masunaga, VLDB 1984
• On the updatability of relational views
  – Dayal and Bernstein, IEEE transaction 1978