
Post on 18-Jan-2016


Page 1: XML INTEROPERABILITY Manjusha Ravindranath


XML INTEROPERABILITY

Manjusha Ravindranath


CONTENTS

Introduction
Interoperability
XSSQL syntax
Use cases document
Group By
- Without aggregation
- With aggregation
- Multiple XML Databases
Restructuring Queries
Implementation
Conclusion


INTRODUCTION

The goal of this research is

- to study XML interoperability.
- to develop XSSQL, a SQL-oriented query language for querying XML documents, in comparison to procedure-oriented languages like XQuery.
- to study the mapping between the flat representation of relational data and the hierarchical representation of XML data.


INTEROPERABILITY

Interoperability is the ability to uniformly share, interpret, query and manipulate data across component databases.

XSSQL supports the key features of an interoperable language by

- being independent of XML Schemas and Document Type Definitions (DTDs).
- permitting the restructuring of one XML document into another through view definition capabilities.


XSSQL SYNTAX

select <tag_Name attrib_Name> {$var/Qname} </tag_Name>
from document("doc_name.xml")//Qname $var
where whereConditions

Variables are declared in the from clause as

- document("doc_name.xml")/Qname $var for an element at the top level of the document.
- document("doc_name.xml")//Qname $var for an element at an intermediate level of the document.

Group by is given inside the opening tag in the select clause, as <tag_name group by $var>.


QUERIES FROM USECASES DOCUMENT

1. “XMP” Queries

2. Tree Queries

3. “SEQ” Queries

4. “R” Queries

5. “SGML” Queries

6. “STRING” Queries

7. “NS” Queries

8. “PARTS”

9. “STRONG”


“XMP” QUERY

Sample Data - “bib.xml”

<bib>
    <book year="1994">
        <title>TCP/IP Illustrated</title>
        <author>
            <last>Stevens</last><first>W.</first>
        </author>
        <publisher>Addison-Wesley</publisher>
    </book>
    <book year="2000">
        <title>Data on the web</title>
        <author>
            <last>Suciu</last><first>Dan</first>
        </author>
        <publisher>Morgan Kaufmann</publisher>
    </book>
</bib>


QUERY (XSSQL)

Solution in XSSQL

Q1. List books published by Addison-Wesley after 1991, including their year and title.

select <book year="{$b/@year}">{$b/title}</book>
from document("bib.xml")//book $b
where $b/publisher = "Addison-Wesley" and $b/@year > 1991

Expected Result

<book year="1994"><title>TCP/IP Illustrated</title></book>


QUERY (XQUERY)

Solution in XQuery

XQuery uses the “FLWR” expression, which consists of FOR, LET, WHERE and RETURN clauses.

for $b in document("bib.xml")//book
where $b/publisher = "Addison-Wesley" and $b/@year > 1991
return
    <book year="{$b/@year}">
        {$b/title}
    </book>

The XQuery solution produces the same expected result as the XSSQL solution.
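For readers without an XQuery engine at hand, the selection in Q1 can be checked with a minimal Python sketch. It is not part of the original slides: the function name `books_by_publisher_after` and the use of Python's standard `xml.etree.ElementTree` module are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the "bib.xml" slide.
BIB_XML = """
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book year="2000">
    <title>Data on the web</title>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann</publisher>
  </book>
</bib>
"""


def books_by_publisher_after(root, publisher, year):
    """Hypothetical helper: build <book year=...><title/></book> for matching books."""
    results = []
    for b in root.iter("book"):  # plays the role of //book $b
        # Corresponds to: where $b/publisher = ... and $b/@year > ...
        if b.findtext("publisher") == publisher and int(b.get("year")) > year:
            out = ET.Element("book", {"year": b.get("year")})
            ET.SubElement(out, "title").text = b.findtext("title")
            results.append(out)
    return results


q1_result = books_by_publisher_after(ET.fromstring(BIB_XML), "Addison-Wesley", 1991)
q1_strings = [ET.tostring(b, encoding="unicode") for b in q1_result]
print(q1_strings)
```

On the two sample books, only the 1994 Addison-Wesley title satisfies both conditions, which agrees with the expected result on the XSSQL slide.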


TREE QUERY - FUNCTIONS IN XSSQL

Sample Data - “book.xml”

<book>
    <title>Data on the web</title>
    <author>Dan Suciu</author>
    <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <p>Text...</p>
        <section>
            <title>Audience</title>
            <p>Text...</p>
        </section>
        <section>
            <title>Web Data and the Two Cultures</title>
            <figure height="400" width="400">
                <image source="pic.gif"/>
            </figure>
        </section>
    </section>
</book>


QUERY (XSSQL)

Solution in XSSQL

Q2. Prepare a nested table of contents for Book1, listing all the sections and their titles and preserving the attributes of each <section> element, if any.

create function toc ($e as element)
return element * as
begin {
    declare $n = local-name($e)
    if ($n = "section")
        select <section> {$e/@*} {toc($e/*)} </section>
    if ($n = "title")
        select <title> {$e/text()} </title>
} end

<toc>{ toc(document("book.xml")/book) }</toc>


EXPECTED RESULT

<toc>
    <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <section>
            <title>Audience</title>
        </section>
        <section>
            <title>Web Data and the Two Cultures</title>
        </section>
    </section>
</toc>


QUERY (XQUERY)

Solution in XQuery

define function toc ($e as element) as element *
{
    let $n := local-name($e)
    return
        if ($n = "section")
        then <section> {$e/@*} {toc($e/*)} </section>
        else if ($n = "title")
        then <title> {$e/text()} </title>
        else ()
}

<toc>{ toc(document("book.xml")/book) }</toc>
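The recursion behind the table-of-contents function can be sketched in Python as follows. This is an illustrative assumption, not the author's implementation: the names `toc_entry` and `build_toc` are hypothetical, and the sketch starts the recursion at the book's top-level sections so that the output has the shape of the expected result.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the "book.xml" slide.
BOOK_XML = """
<book>
  <title>Data on the web</title>
  <author>Dan Suciu</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <p>Text...</p>
    <section>
      <title>Audience</title>
      <p>Text...</p>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <figure height="400" width="400"><image source="pic.gif"/></figure>
    </section>
  </section>
</book>
"""


def toc_entry(e):
    """Hypothetical counterpart of toc($e): dispatch on the element's local name."""
    if e.tag == "section":
        # Copy the attributes ({$e/@*}) and recurse into the children ({toc($e/*)}).
        section = ET.Element("section", dict(e.attrib))
        for child in e:
            entry = toc_entry(child)
            if entry is not None:
                section.append(entry)
        return section
    if e.tag == "title":
        title = ET.Element("title")
        title.text = (e.text or "").strip()
        return title
    return None  # every other element (p, figure, ...) is dropped


def build_toc(book):
    """Wrap the entries for the book's top-level sections in a <toc> element."""
    toc = ET.Element("toc")
    for section in book.findall("section"):
        toc.append(toc_entry(section))
    return toc


toc_tree = build_toc(ET.fromstring(BOOK_XML))
print(ET.tostring(toc_tree, encoding="unicode"))
```

The two branches mirror the `section` and `title` cases of the slide's function; anything else contributes nothing to the result.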


GROUP BY

In XSSQL, each node has its own grouping, and each child inherits the grouping of its parent. The cases studied under group by are

- Group by without aggregation
- Group by with aggregation
- Multiple XML Databases

The following queries are based on the document “sales.xml”. This document gives the daily sales of the stores in two cities in each month, starting from January of the current year. For the sake of simplicity, two stores in each of two cities in NC are taken, and the sales for a couple of days in January and February are discussed.


WITHOUT AGGREGATION

Sample Data - “sales.xml”

<entries>
    <entry>
        <state>NC</state>
        <city>Greensboro</city>
        <store>Harris Teeter</store>
        <month>January</month>
        <day>1</day><sales>100.00</sales>
        <day>2</day><sales>110.00</sales>
    </entry>
    <entry>
        <state>NC</state>
        <city>Greensboro</city>
        <store>Food Lion</store>
        <month>January</month>
        <day>1</day><sales>100.00</sales>
        <day>2</day><sales>200.00</sales>
    </entry>
</entries>


QUERY (XSSQL)

Q3. List all stores in each city.

<root>
select
    <city group by $c> distinct($c/text())
        <store group by $s> distinct($s/text()) </store>
    </city>
</root>
from document("sales.xml")/entries/entry $e,
     $e/city $c, $e/store $s


SEMANTICS OF GROUP BY IN XSSQL

The variable instantiations after the group by would look like the following:

$c            $s
Greensboro    Harris Teeter
Greensboro    Harris Teeter
Greensboro    Food Lion
Greensboro    Food Lion
Raleigh       Harris Teeter
Raleigh       Harris Teeter
Raleigh       Lowes
Raleigh       Lowes


SEMANTICS OF GROUP BY IN XSSQL

The output instance is shown graphically below. With <city group by $c>, $c binds to every <city>…</city> element in the document. Duplicate city names are eliminated by distinct($c/text()).

root
    Gso
        HT
        FL
    Raleigh
        HT
        Lowes

(Gso = Greensboro, HT = Harris Teeter, FL = Food Lion)


EXPECTED RESULT

<root>
    <city>Greensboro
        <store>Harris Teeter</store>
        <store>Food Lion</store>
    </city>
    <city>Raleigh
        <store>Harris Teeter</store>
        <store>Lowes</store>
    </city>
</root>


QUERY(XQUERY)

<root> {
    for $c in distinct-values(document("sales.xml")//city)
    return
        <city> {$c/text()} {
            for $e in document("sales.xml")/entries/entry
            where some $ca in $e/city satisfies deep-equal($ca, $c)
            for $s in distinct-values($e/store)
            return
                <store> {$s/text()} </store>
        } </city>
} </root>


NEW GROUP BY PROPOSAL

The above example can be written in XQuery using a new GROUP BY proposal by Prof. Dan Suciu.

<root>
    for $e in document("sales.xml")/entries/entry,
        $c in $e/city, $s in $e/store
    return GROUPBY $c IN
        <city>$c/text()
            GROUPBY $s IN $s
        </city>
</root>
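The nested grouping that Q3 expresses (stores grouped under each distinct city) can be illustrated with a small Python sketch. It is an assumption-laden illustration rather than the XSSQL semantics themselves: the helper names `stores_by_city` and `to_xml` are hypothetical, and the sample entries keep only the city and store fields, using the four city/store pairs listed on the semantics slide.

```python
import xml.etree.ElementTree as ET

# Reduced "sales.xml": only the fields Q3 needs, one entry per city/store pair.
SALES_XML = """
<entries>
  <entry><city>Greensboro</city><store>Harris Teeter</store></entry>
  <entry><city>Greensboro</city><store>Food Lion</store></entry>
  <entry><city>Raleigh</city><store>Harris Teeter</store></entry>
  <entry><city>Raleigh</city><store>Lowes</store></entry>
</entries>
"""


def stores_by_city(entries):
    """Group distinct store names under each distinct city, in first-seen order."""
    groups = {}
    for e in entries.findall("entry"):
        stores = groups.setdefault(e.findtext("city"), [])
        store = e.findtext("store")
        if store not in stores:  # the role of distinct($s/text())
            stores.append(store)
    return groups


def to_xml(groups):
    """Render the groups as <root><city>name<store/>...</city>...</root>."""
    root = ET.Element("root")
    for city, stores in groups.items():
        city_elem = ET.SubElement(root, "city")
        city_elem.text = city
        for store in stores:
            ET.SubElement(city_elem, "store").text = store
    return root


q3_groups = stores_by_city(ET.fromstring(SALES_XML))
q3_xml = ET.tostring(to_xml(q3_groups), encoding="unicode")
print(q3_xml)
```

The dictionary keyed by city plays the part of the outer `group by $c`, and the per-city list plays the part of the inner `group by $s`, which is why the store grouping is evaluated within its parent city.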


QUERY (XSSQL)

Q4. Give the monthly sales in all stores in each city.

<root>
select
    <city group by $c> distinct($c/text())
        <store group by $s> distinct($s/text())
            <month group by $m> distinct($m/text())
                <total_sales> SUM($i) </total_sales>
            </month>
        </store>
    </city>
</root>
from document("sales.xml")/entries/entry $e,
     $e/city $c, $e/store $s, $e/month $m, $e/sales $i


EXPECTED RESULT

<root>
    <city>Greensboro
        <store>Harris Teeter
            <month>January
                <total_sales>210</total_sales>
            </month>
            <month>February
                <total_sales>730</total_sales>
            </month>
        </store>
        <store>Food Lion
            <month>January
                <total_sales>300</total_sales>
            </month>
            <month>February
                <total_sales>830</total_sales>
            </month>
        </store>
    </city>
</root>


QUERY (XQUERY)

<root> {
    for $c in distinct-values(document("sales.xml")//city),
        $e in document("sales.xml")/entries/entry
    where some $ca in $e/city satisfies deep-equal($ca, $c)
    return
        <city> {distinct($c/text())} {
            for $s in distinct-values(document("sales.xml")//store)
            where some $sa in $e/store satisfies deep-equal($sa, $s)
            return
                <store> {distinct($s/text())} {
                    for $m in distinct-values(document("sales.xml")//month)
                    let $i := $e/sales
                    where some $ma in $e/month satisfies deep-equal($ma, $m)
                    return
                        <month> {distinct($m/text())} {
                            <total_sales> {sum($i)} </total_sales>
                        } </month>
                } </store>
        } </city>
} </root>
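The aggregation in Q4 amounts to summing the sales values per (city, store, month) group. The Python sketch below illustrates that arithmetic on the two January entries shown in the “sales.xml” sample slide; the February data behind the expected result is not shown in the slides, so it is left out here. The function name `monthly_totals` is a hypothetical choice for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The two January entries from the "sales.xml" sample slide.
SALES_XML = """
<entries>
  <entry>
    <state>NC</state><city>Greensboro</city><store>Harris Teeter</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>110.00</sales>
  </entry>
  <entry>
    <state>NC</state><city>Greensboro</city><store>Food Lion</store>
    <month>January</month>
    <day>1</day><sales>100.00</sales>
    <day>2</day><sales>200.00</sales>
  </entry>
</entries>
"""


def monthly_totals(entries):
    """Sum <sales> per (city, store, month), the three nested group by levels of Q4."""
    totals = defaultdict(float)
    for e in entries.findall("entry"):
        key = (e.findtext("city"), e.findtext("store"), e.findtext("month"))
        totals[key] += sum(float(s.text) for s in e.findall("sales"))
    return dict(totals)


q4_totals = monthly_totals(ET.fromstring(SALES_XML))
for (city, store, month), total in q4_totals.items():
    print(city, store, month, total)
```

For January this gives 100 + 110 = 210 for Harris Teeter and 100 + 200 = 300 for Food Lion, matching the January totals in the expected result.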


MULTIPLE XML DATABASES

Suppose we have multiple XML databases with similar and possibly overlapping data.

Sample Data: “Univ1.xml” and “Univ2.xml” contain student information for different majors.

“Univ1.xml”

<entries>
    <entry>
        <major>Mathematics</major>
        <student>Stephen Providence</student>
        <student>Dale Borget</student>
    </entry>
    <entry>
        <major>Computer Science</major>
        <student>Barbara McMasters</student>
    </entry>
</entries>


MULTIPLE XML DATABASES

Sample Data “Univ2.xml”

<entries>
    <entry>
        <major>Mathematics</major>
        <student>Dale Borget</student>
        <student>Mary Rierson</student>
    </entry>
    <entry>
        <major>English</major>
        <student>Robin Mooney</student>
    </entry>
</entries>


QUERY (XSSQL)

define function students ($a as element entry) as xs:string
{
    declare $b = $a/student
    return $b
}

<merge>
select
    <entry>
        {$e1/major}
        {students($e1)}
        {$e2[student NOT IN (select $sa
                             from document("Univ1.xml")//entry $ea,
                                  $ea/major $ma, $ea/student $sa
                             where $ma/text() = $m2/text())]/student}
    </entry>
from document("Univ1.xml")//entry $e1,
     document("Univ2.xml")//entry $e2, $e2/major $m2
UNION
select
    {$a}
</merge>
from document("Univ2.xml")//entry $a,
     $a/major $n
where $n not in document("Univ1.xml")//entry/major


EXPECTED RESULT

<merge>
    <entry>
        <major>Mathematics</major>
        <student>Stephen Providence</student>
        <student>Dale Borget</student>
        <student>Mary Rierson</student>
    </entry>
    <entry>
        <major>Computer Science</major>
        <student>Barbara McMasters</student>
    </entry>
    <entry>
        <major>English</major>
        <student>Robin Mooney</student>
    </entry>
</merge>
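The intended effect of the merge (one entry per major, with each student listed once) can be sketched in Python. This is only an illustration of the expected result, not a translation of the XSSQL query: the helpers `students_by_major` and `merge` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the "Univ1.xml" and "Univ2.xml" slides.
UNIV1_XML = """
<entries>
  <entry>
    <major>Mathematics</major>
    <student>Stephen Providence</student>
    <student>Dale Borget</student>
  </entry>
  <entry>
    <major>Computer Science</major>
    <student>Barbara McMasters</student>
  </entry>
</entries>
"""

UNIV2_XML = """
<entries>
  <entry>
    <major>Mathematics</major>
    <student>Dale Borget</student>
    <student>Mary Rierson</student>
  </entry>
  <entry>
    <major>English</major>
    <student>Robin Mooney</student>
  </entry>
</entries>
"""


def students_by_major(xml_text):
    """Map each major to its list of students, in document order."""
    majors = {}
    for entry in ET.fromstring(xml_text).findall("entry"):
        names = majors.setdefault(entry.findtext("major").strip(), [])
        for student in entry.findall("student"):
            name = student.text.strip()
            if name not in names:
                names.append(name)
    return majors


def merge(first, second):
    """Union of two major -> students maps; overlapping students appear once."""
    merged = {major: list(names) for major, names in first.items()}
    for major, names in second.items():
        target = merged.setdefault(major, [])
        for name in names:
            if name not in target:  # the role of the NOT IN subquery
                target.append(name)
    return merged


merged_majors = merge(students_by_major(UNIV1_XML), students_by_major(UNIV2_XML))
for major, names in merged_majors.items():
    print(major, names)
```

Dale Borget appears in both documents under Mathematics but only once in the merged output, and English, which exists only in “Univ2.xml”, is carried over as its own entry.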


RESTRUCTURING QUERIES

The following two documents, “doc1.xml” and “doc2.xml”, contain the same information about company stocks but have different hierarchical structures.

Views have been created to demonstrate the restructuring capabilities of XSSQL.


RESTRUCTURING QUERIES

Sample Data “doc1.xml”

<entries>
    <stock>
        <date>8/8/03</date>
        <ticker>IBM</ticker>
        <value>5881</value>
    </stock>
    <stock>
        <date>8/8/03</date>
        <ticker>MSFT</ticker>
        <value>6681</value>
    </stock>
    <stock>
        <date>8/9/03</date>
        <ticker>IBM</ticker>
        <value>5981</value>
    </stock>
    <stock>
        <date>8/9/03</date>
        <ticker>MSFT</ticker>
        <value>6981</value>
    </stock>
</entries>


RESTRUCTURING QUERIES

Sample Data “doc2.xml”

<entries>
    <stock>
        <date>8/8/02</date>
        <IBM>5681</IBM>
        <MSFT>6681</MSFT>
    </stock>
    <stock>
        <date>8/9/02</date>
        <IBM>5981</IBM>
        <MSFT>6981</MSFT>
    </stock>
</entries>


RULES OF RESTRUCTURING

a. The element name in /doc2/entries/stock/IBM is a value of /doc1/entries/stock/ticker.

b. If x is a ticker, then /doc2/entries/stock/x/text() corresponds to /doc1/entries/stock/value.


QUERY (XSSQL)

create view doc1_to_doc2 as
<entries>
select
    <stock group by $d>
        <date> distinct($d/text()) </date>
        <$t/text()> $v/text() </$t/text()>
    </stock>
</entries>
from document("doc1.xml")//stock $s,
     $s/date $d, $s/ticker $t, $s/value $v

Expected Result: “doc2.xml”
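The restructuring performed by this view, where ticker values become element names and the stocks are grouped by date, can be illustrated with a short Python sketch. It is an assumption for illustration, not the XSSQL view mechanism: the function name `doc1_to_doc2` simply borrows the view's name, and the sketch assumes every ticker text is a valid XML element name.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the "doc1.xml" slides.
DOC1_XML = """
<entries>
  <stock><date>8/8/03</date><ticker>IBM</ticker><value>5881</value></stock>
  <stock><date>8/8/03</date><ticker>MSFT</ticker><value>6681</value></stock>
  <stock><date>8/9/03</date><ticker>IBM</ticker><value>5981</value></stock>
  <stock><date>8/9/03</date><ticker>MSFT</ticker><value>6981</value></stock>
</entries>
"""


def doc1_to_doc2(doc1_root):
    """Pivot doc1-style stocks into one <stock> per date with one child per ticker."""
    out = ET.Element("entries")
    stocks_by_date = {}  # date text -> output <stock> element (the group by $d)
    for s in doc1_root.findall("stock"):
        date = s.findtext("date")
        stock = stocks_by_date.get(date)
        if stock is None:
            stock = ET.SubElement(out, "stock")
            ET.SubElement(stock, "date").text = date
            stocks_by_date[date] = stock
        # Rule (b): the ticker text becomes the tag, the value becomes its text.
        ET.SubElement(stock, s.findtext("ticker")).text = s.findtext("value")
    return out


doc2_text = ET.tostring(doc1_to_doc2(ET.fromstring(DOC1_XML)), encoding="unicode")
print(doc2_text)
```

Each output `<stock>` corresponds to one distinct date, and each input row contributes one child element named after its ticker, which is the doc2-style layout.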


IMPLEMENTATION

XSSQL queries are translated into XQuery using naive algorithms.

General algorithm used to translate XSSQL into XQuery:

- Read and tokenize the input XSSQL string on white space (Java's StringTokenizer class can be used).
- Translate the XSSQL tokens into XQuery tokens using functions.
- Finally, concatenate the XQuery tokens to produce the output string.
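The three steps above can be sketched for the simplest case as follows. This is a hypothetical illustration in Python rather than the Java translator described in the slides: the function name `translate` is invented here, and the sketch assumes a query with one select template, exactly one `path $var` binding in the from clause, an optional where clause, and no group by.

```python
def translate(xssql):
    """Rewrite 'select T from PATH $v where C' as 'for $v in PATH where C return T'."""
    # Step 1: tokenize the input string on white space.
    tokens = xssql.split()
    lowered = [t.lower() for t in tokens]
    i_select = lowered.index("select")
    i_from = lowered.index("from")
    i_where = lowered.index("where") if "where" in lowered else len(tokens)

    template = tokens[i_select + 1:i_from]
    bindings = tokens[i_from + 1:i_where]
    condition = tokens[i_where + 1:]
    if len(bindings) != 2:
        raise ValueError("this sketch supports exactly one 'path $var' binding")

    # Step 2: translate the XSSQL tokens into XQuery tokens.
    path, var = bindings
    out = ["for", var, "in", path]
    if condition:
        out += ["where"] + condition
    out += ["return"] + template

    # Step 3: concatenate the XQuery tokens into the output string.
    return " ".join(out)


Q1_XSSQL = (
    'select <book year="{$b/@year}">{$b/title}</book> '
    'from document("bib.xml")//book $b '
    'where $b/publisher = "Addison-Wesley" and $b/@year > 1991'
)

xquery = translate(Q1_XSSQL)
print(xquery)
```

Applied to Q1, the select template moves into the return clause and the from binding becomes a for clause, giving the same shape as the XQuery solution shown for Q1. Queries with group by or several bindings would need more than this token reordering.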


CONCLUSION

In conclusion,

- We introduced XSSQL as a SQL-oriented query language for querying XML documents.
- We developed a formal syntax for XSSQL akin to SQL and provided novel algorithms for translating XSSQL to XQuery.
- We have shown that XSSQL deals extensively with group by, with and without aggregation, in single and multiple XML documents using several levels of nesting.

This work leads to many important directions for future work, such as

- optimization of views in XML documents.
- merging of multiple (more than two) XML documents.
- developing a standalone engine for XSSQL.


REFERENCES

Lakshmanan, L.V.S., Sadri, F., and Subramanian, S.N. (2001). SchemaSQL: An Extension to SQL for Multi-database Interoperability.

W3C Working Draft. XML Query Use Cases. http://www.w3.org/TR/xmlquery-use-cases/

Cotton, P., and Robie, J. (Jan 30, 2002). Querying XML Documents. Unicode Conference.

Berners-Lee, T., Hendler, J., and Lassila, O. (May 17, 2001). The Semantic Web. Scientific American.

W3C Recommendation (May 2, 2001). XML Schema Part 0: Primer. http://www.w3.org/TR/2001/REC-xmlschema-0-20010502/