Designing Valid XML Views

Ya Bing Chen, Tok Wang Ling, Mong Li Lee

Department of Computer Science, National University of Singapore

Outline

I. Introduction
II. ORA-SS data model
III. Designing Valid XML Views
IV. Comparison with Related Work
V. Conclusion & Future Work

I. Introduction

Background

XML views are views in XML form on top of the underlying data.

XML views enable presentation and exchange of data in underlying databases in XML form on the Internet.

XML views are analogous to relational views. They provide logical data independence, data protection, and flexibility of data presentation.

Related Works

ActiveViews System [2]

Based on XML documents, the system offers a novel declarative view specification language for describing views that contain the data and activities relevant to each actor participating in an electronic commerce application.

The novelty lies in the combination with active features. However, the views considered are much simpler than views in databases: in general, they support only simple query operators such as selection.

[2] S. Abiteboul, B. Amann, S. Cluet, A. Eyal, L. Mignet, and T. Milo. Active views for electronic commerce. In Int. Conf. on Very Large DataBases (VLDB), Edinburgh, Scotland, pages 138-149, 1999.

Related Works (cont.)

Mediation of Information using XML (MIX) [4]

MIX provides users with an integrated XML view of the underlying heterogeneous sources. The sources may be relational databases, OO databases or HTML files.

MIX uses XML DTD as the data model of XML views and a declarative query language called XMAS to define views.

The novelty of MIX is a graphical user interface that integrates browsing and querying of XML views. It supports the selection operator.

However, a DTD is not expressive enough to capture the semantics that hold in XML data.

For example, a DTD cannot distinguish whether an attribute belongs to an object class or to a relationship type.

[4] C. Baru, A. Gupta, B. Ludaescher, R. Marciano, Y. Papakonstantinou, and P. Velikhov. XML-Based Information Mediation with MIX. ACM-SIGMOD, Philadelphia, PA, pages 597-599, 1999.

Our contribution

Problems in the related work:

They did not support validation of XML views. That is, designed views may violate implicit semantics.

They did not support more complex operators, such as join, projection and swap.

We use a systematic approach to solve the two problems above:

1. Transform XML documents into an ORA-SS schema diagram.

2. Enrich the schema diagram with semantics.

3. Propose a set of rules to guide the design of valid XML views.

II. ORA-SS data model

Main Concepts

ORA-SS: Object-Relationship-Attribute model for Semi-Structured data.

Three main concepts: object class, relationship type and attribute.

An object class is similar to an element in XML documents.

A relationship type describes a relationship among object classes.

An attribute is a property of an object class or a relationship type.

An example of ORA-SS schema diagram

An object class is represented as a labeled rectangle.

An attribute is represented as a labeled circle.

A key attribute is represented as a filled circle.

Attributes of a relationship type carry the name of that relationship type as a label on their incoming edges, while attributes of an object class have no such label.

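[Figure 1: An ORA-SS schema diagram. Object classes: supplier (attribute sno), part (attribute pno) and project (attribute jno). Relationship types: js (2, 1:n, 1:n), sp (2, 1:n, 1:n) and spj (3, 1:n, 1:n). Relationship attributes: price (edge label sp) and qty (edge label spj).]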

A relationship type is described by the label name, n, p, c, where:

name is the name of the relationship type;

n is the degree of the relationship type;

p is the participation constraint of the parent object class in the relationship type;

c is the participation constraint of the child object class in the relationship type.
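
The three ORA-SS concepts and the relationship label can be pictured as plain data structures. The following is only a minimal sketch: the class names, field names and the parse_label helper are illustrative assumptions, not part of ORA-SS or of any existing library.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    is_key: bool = False          # drawn as a filled circle in the diagram


@dataclass
class ObjectClass:
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class RelationshipType:
    name: str                     # e.g. "sp"
    degree: int                   # n
    parent_participation: str     # p, e.g. "1:n"
    child_participation: str      # c, e.g. "1:n"
    attributes: list = field(default_factory=list)


def parse_label(label):
    """Parse an edge label of the form 'name, n, p, c' (hypothetical helper)."""
    name, degree, p, c = [piece.strip() for piece in label.split(",")]
    return RelationshipType(name, int(degree), p, c)


# The binary relationship type sp from Figure 1, with its attribute price.
sp = parse_label("sp, 2, 1:n, 1:n")
sp.attributes.append(Attribute("price"))
```

Keeping relationship attributes on the relationship type, rather than on an object class, is exactly the distinction that a DTD cannot express.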

III. Designing Valid XML Views

An introduction

Before designing valid XML views, we carry out two pre-processing steps:

1. Transform the XML documents into an ORA-SS schema diagram.

2. Enrich the ORA-SS schema diagram with semantics.

Based on the enriched ORA-SS schema diagram, we begin to design XML views.

Four operators can be applied to the XML views: selection, projection, join and swap.

The first three operators are similar to selection, projection and join in relational databases.

The fourth operator exchanges the positions of parent and child object classes.

Selection operator

A selection operator filters data by using predicates. For example, we design a view that depicts the projects for which there exist suppliers for which there exist parts with a price > 80.

[Figure: the selection operator applied to the source schema of Figure 1. The resulting schema has the same structure as the source schema, with the condition price > 80 attached to the attribute price.]

Selection operator (cont.)

Features of the selection operator:

Selection operators put predicates on the source schema to filter data.

They do not restructure the source schema.

The resulting view schema contains the conditions specified in the selection operators.
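
The price > 80 example can be sketched on instance data as follows. The nested-dictionary layout (project, then supplier, then part), the sample values and the function name select_parts are assumptions made for illustration; the slides define the operator on the schema diagram only.

```python
def select_parts(projects, predicate):
    """Keep the projects for which there exist suppliers for which there
    exist parts satisfying the predicate. The nesting is left unchanged."""
    result = []
    for project in projects:
        suppliers = []
        for supplier in project["suppliers"]:
            parts = [part for part in supplier["parts"] if predicate(part)]
            if parts:
                suppliers.append({**supplier, "parts": parts})
        if suppliers:
            result.append({**project, "suppliers": suppliers})
    return result


# Hypothetical sample data.
projects = [
    {"jno": "j1", "suppliers": [
        {"sno": "s1", "parts": [{"pno": "p1", "price": 100},
                                {"pno": "p2", "price": 50}]},
        {"sno": "s2", "parts": [{"pno": "p3", "price": 20}]},
    ]},
    {"jno": "j2", "suppliers": [
        {"sno": "s3", "parts": [{"pno": "p4", "price": 70}]},
    ]},
]

view = select_parts(projects, lambda part: part["price"] > 80)
```

The view keeps the project, supplier, part nesting of the source, which mirrors the point that selection filters data without restructuring the schema.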

Projection operator

A projection operator selects or drops object classes or attributes in the source schema. The source semantics may be affected.

For example, the following view drops the object class supplier and its attributes.

[Figure: the projection operator. Source schema: as in Figure 1. View schema: object classes project (attribute jno) and part (attribute pno), relationship type jp (2, 1:n, 1:n), and the aggregate attribute avg_price (edge label jp).]

Projection operator (cont.)

Several changes occur in the view schema:

The attribute sno has been dropped together with supplier.

The relationship types js, spj and sp have been dropped.

The attribute price has been mapped into an aggregate attribute, avg_price, which represents the average price of each part in a given project.

The attribute qty has been dropped.

The example shows that flexible views can be designed based on ORA-SS with its additional semantics. However, we need to handle the views properly so that the semantics are not violated.

Projection operator (cont.)

The rules for applying projection operators:

Rule Proj1. If an object class has been dropped, its attributes must be dropped too.

Rule Proj2. If an object class has been dropped, all relationship types containing the object class must be dropped too. The attributes of these relationship types must be dropped, mapped into attributes with an aggregate function such as avg, max/min or sum, or mapped into attributes typed as a bag of values if they cannot be aggregated.

Based on these rules, the designed views are guaranteed to be valid when projection operators are applied.
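
Rules Proj1 and Proj2 can be sketched as a function over a dictionary-based schema. This is a rough illustration under several assumptions: the schema layout, the participant lists of js, sp and spj, the function name drop_object_class and the attr_actions parameter are all hypothetical. The sketch also does not decide where a derived attribute is attached in the view (the slide example introduces a new relationship type jp for that).

```python
import copy


def drop_object_class(schema, dropped, attr_actions=None):
    """Drop an object class and repair the schema following Proj1 and Proj2.

    attr_actions maps a relationship attribute to "drop" (the default),
    "bag", or the name of an aggregate function such as "avg".
    """
    attr_actions = attr_actions or {}
    view = copy.deepcopy(schema)

    # Rule Proj1: the attributes of the object class go with it.
    del view["object_classes"][dropped]

    kept, derived = {}, []
    for name, rel in view["relationship_types"].items():
        if dropped not in rel["participants"]:
            kept[name] = rel
            continue
        # Rule Proj2: the relationship type is dropped; each of its
        # attributes is dropped, aggregated, or turned into a bag of values.
        for attr in rel["attributes"]:
            action = attr_actions.get(attr, "drop")
            if action == "drop":
                continue
            prefix = "bag_of" if action == "bag" else action
            derived.append(f"{prefix}_{attr}")

    view["relationship_types"] = kept
    view["derived_attributes"] = derived
    return view


# Assumed encoding of the source schema in Figure 1.
schema = {
    "object_classes": {"project": ["jno"], "supplier": ["sno"], "part": ["pno"]},
    "relationship_types": {
        "js": {"participants": ["project", "supplier"], "attributes": []},
        "sp": {"participants": ["supplier", "part"], "attributes": ["price"]},
        "spj": {"participants": ["supplier", "part", "project"],
                "attributes": ["qty"]},
    },
}

view = drop_object_class(schema, "supplier", {"price": "avg"})
```

With these inputs the sketch reproduces the changes listed for the example view: supplier and sno disappear, js, sp and spj are dropped, qty is dropped, and price becomes avg_price.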

Join operator

A join operator joins two object classes and their attributes through a key-foreign key reference.

For example, the following view joins project and project’ together.

[Figure: the join operator. Source schema: object classes supplier, part (attribute pno) and project (attribute jno) with relationship types sp and spj and relationship attributes price and qty; and object classes employee (attributes eno, ename) and project' (attributes jno, jname) with relationship type mj (2, 1:n, 1:n) and relationship attribute progress. View schema: supplier, part (pno) and project (jno, jname) with relationship types sp and spj and relationship attributes price and qty.]

Join operator (cont.)

In the view, the attributes jno and jname of project are selected and placed below the object class project.

However, the attribute progress is dropped because it belongs to the relationship type mj, which does not exist in the view.

Alternatively, the attribute progress can be mapped into an attribute typed as a bag of values if users require it.

Join operator (cont.)

The rules for applying join operators:

Rule Join1. When a join operator is applied to two object classes, if there are relationship types below the referenced object class that contain object classes above the referenced object class, then these relationship types must be dropped.

The attributes of these relationship types must also be dropped, mapped into attributes with an aggregate function, or mapped into attributes typed as a bag of values if they cannot be aggregated.

Join operator (cont.)

The rules for applying join operators (cont.):

Rule Join2. When a join operator is applied to two object classes, if there are relationship types below the referenced object class that do not contain any object classes above the referenced object class, then these relationship types can be selected or dropped in the view according to the users' requirements.

The attributes of these relationship types can be selected or dropped too according to the users’ requirement.
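
The distinction between Rule Join1 and Rule Join2 comes down to one test per relationship type: does it involve an object class above the referenced object class? A minimal sketch of that test follows. The function name, the dictionary layout, and the relationship type pt with its object class task are hypothetical, added only so that both rules fire.

```python
def classify_for_join(relationship_types, ancestors_of_referenced):
    """Split the relationship types below the referenced object class into
    those that must be dropped (Join1) and those that are optional (Join2)."""
    ancestors = set(ancestors_of_referenced)
    must_drop, optional = [], []
    for name, participants in relationship_types.items():
        if ancestors & set(participants):
            must_drop.append(name)    # Rule Join1
        else:
            optional.append(name)     # Rule Join2
    return must_drop, optional


# project' is the referenced object class and employee sits above it.
below_referenced = {
    "mj": ["employee", "project'"],   # from the join example
    "pt": ["project'", "task"],       # hypothetical, not in the slides
}

must_drop, optional = classify_for_join(below_referenced, ["employee"])
```

In the join example this is why progress cannot simply be copied into the view: it belongs to mj, which involves employee and therefore falls under Rule Join1.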

Swap operator

A swap operator exchanges the positions of a parent object class and one of its child object classes.

For example, the following view swaps supplier and part.

[Figure: the swap operator. Source schema: part (attribute pno) is the parent of supplier (attribute sno), with project (attribute jno), relationship types sp and spj, and relationship attributes price (edge label sp) and qty (edge label spj). View schema: supplier and part exchange positions; sno stays with supplier, while price and qty are placed below part.]

Swap operator (cont.)

In the view, the parent object class part and its child object class supplier are swapped, and the attribute sno moves with its object class supplier.

However, the attribute price does not move with supplier. Because price is an attribute of the relationship type sp, it stays below the new lowest object class (part) of sp in the view.

Similarly, since the attribute qty belongs to the relationship type spj, it also stays below the lowest object class (part) of spj.

Swap operator (cont.)

The rules for applying swap operators.

Rule Swap1. If two object classes are swapped in the view, then the attributes of each of the object classes must stay with the object class.

Rule Swap2. If two object classes are swapped in the view, then the attributes of relationship types involving the two object classes must stay below the lowest participating object class in the relationship types.
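
The two swap rules can be sketched by recomputing where each attribute hangs after the swap. The sketch below models a single root-to-leaf path as a list. The position of project at the top of the path, the participant lists and both function names are assumptions made for illustration.

```python
def swap(path, a, b):
    """Exchange the positions of object classes a and b in a root-to-leaf path."""
    i, j = path.index(a), path.index(b)
    new_path = list(path)
    new_path[i], new_path[j] = new_path[j], new_path[i]
    return new_path


def attribute_placement(path, object_attrs, relationship_types):
    """Return, for each object class, the attributes drawn below it."""
    # Rule Swap1: object class attributes stay with their object class.
    placement = {oc: list(object_attrs.get(oc, [])) for oc in path}
    for rel in relationship_types.values():
        # Rule Swap2: relationship attributes hang below the lowest
        # participating object class, i.e. the one deepest in the path.
        lowest = max(rel["participants"], key=path.index)
        placement[lowest].extend(rel["attributes"])
    return placement


source_path = ["project", "part", "supplier"]   # assumed nesting
object_attrs = {"project": ["jno"], "part": ["pno"], "supplier": ["sno"]}
relationship_types = {
    "sp": {"participants": ["supplier", "part"], "attributes": ["price"]},
    "spj": {"participants": ["supplier", "part", "project"],
            "attributes": ["qty"]},
}

before = attribute_placement(source_path, object_attrs, relationship_types)
view_path = swap(source_path, "part", "supplier")
after = attribute_placement(view_path, object_attrs, relationship_types)
```

Before the swap, price and qty sit below supplier. After it they sit below part, while sno has moved with supplier, which matches the example view.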

Views on schema with IDD relationship

IDentifier Dependency Relationship (IDD)

[Figure: ORA-SS source schema diagram of an IDD relationship type. Labels in the diagram: employee, eno, child, cname, sex, DoB, and the relationship type IDD, 2, 0:n, 1:1.]

Definition 1. An object class A is said to be ID dependent (IDD) on its parent object class B if A does not have its own identifier attributes, and an A object can only be identified by its parent's key value (say k1) together with some of its own attributes (say k2). That is, the key of A is {k1, k2}. The relationship type between A and B is then called an IDD relationship type.

Views on schema with IDD relationship (cont.)

If projection, join or swap operators are applied to an IDD relationship type, the rules need to be modified.

For example, we design a view that swaps employee and child. In the view, the key attribute of employee, eno, is added under the object class child so that {eno, cname} becomes a composite key for child.

[Figure: the swap operator on an IDD relationship type. Source schema: employee (attribute eno) with child object class child (attribute cname) and relationship type IDD, 2, 1:n, 1:1. View schema after swapping employee and child: child (attributes eno, cname) with child object class employee (attribute eno, derived). Constraint: the eno of employee must be the same as the eno of child.]

Views on schema with IDD relationship (cont.)

The rules for IDD relationship types:

Rule Proj_IDD. If the parent object class of an IDD relationship type is dropped in the view, then its key attribute must be added to the child object class to construct a key for the child.

Rule Join_IDD. If a child object class of an IDD relationship type is referenced by another object class in the view, then the key attribute of the parent object class must be added to the child to construct a key for the child.

Rule Swap_IDD. If two object classes of an IDD relationship type are swapped in the view, then the key attribute of the parent object class must be added to the child object class to construct a key for the child.
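
All three IDD rules add the parent's key to the child for the same reason: outside its parent, the child's own attributes no longer identify it. The following sketch shows this on hypothetical sample rows. The function names and the data are assumptions made for illustration.

```python
def composite_key(parent_key, child_local_key):
    """Key of an IDD child once it no longer sits under its parent:
    the parent's key followed by the child's own identifying attributes."""
    return list(parent_key) + [k for k in child_local_key if k not in parent_key]


def is_key(rows, attrs):
    """True if the given attributes identify every row uniquely."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    return len(values) == len(set(values))


# Hypothetical child objects listed without their employee parents,
# as in a view that drops employee or swaps employee and child.
children = [
    {"eno": "e1", "cname": "Tom"},
    {"eno": "e2", "cname": "Tom"},
    {"eno": "e2", "cname": "Ann"},
]

child_key = composite_key(["eno"], ["cname"])
```

Here cname alone is not a key because two employees each have a child named Tom, whereas {eno, cname} identifies every child.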

View validation algorithm

All given rules are integrated into an algorithm to validate XML views.

The algorithm monitors the view design process until the view is completely designed.

For each operator, the algorithm applies the corresponding rules to modify the view schema and keep it valid.

Once an operator is applied to the view, the algorithm first checks whether an IDD relationship type is involved and, if so, applies the IDD rules.

Then the algorithm applies the normal rules for the operator.
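
The order in which the algorithm consults the rules can be sketched as a simple dispatcher. This is not the authors' algorithm, only an outline of the ordering described above. The table and function names are assumptions, and selection is given no rules because the slides state that it does not restructure the schema.

```python
NORMAL_RULES = {
    "selection": [],
    "projection": ["Proj1", "Proj2"],
    "join": ["Join1", "Join2"],
    "swap": ["Swap1", "Swap2"],
}

IDD_RULES = {
    "projection": "Proj_IDD",
    "join": "Join_IDD",
    "swap": "Swap_IDD",
}


def rules_for(operator, involves_idd):
    """Rules to check for one design step, IDD rule first when it applies."""
    rules = []
    if involves_idd and operator in IDD_RULES:
        rules.append(IDD_RULES[operator])
    rules.extend(NORMAL_RULES[operator])
    return rules


def validate(design_steps):
    """Walk through the design steps, given as (operator, involves_idd) pairs."""
    return [(operator, rules_for(operator, idd)) for operator, idd in design_steps]


plan = validate([("selection", False), ("swap", True), ("projection", False)])
```

A full implementation would apply each rule to the view schema instead of only naming it.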

IV. Comparison with Related Work

Comparison with related work

Feature                    ActiveViews system   MIX system   Our approach
Data model                 XML                  XML DTD      ORA-SS
Projection operator        No                   No           Yes
Join operator              No                   No           Yes
Swap operator              No                   No           Yes
Validate views             No                   No           Yes
Design views graphically   No                   No           Yes

V. Conclusion & Future Work

Conclusion

We proposed a systematic approach to designing valid XML views:

1. Transform an XML document into an ORA-SS schema diagram.

2. Enrich the ORA-SS schema diagram with additional semantics.

3. Develop a set of rules to guide the design of valid XML views.

The approach guarantees the validity of XML views and supports four operators: selection, projection, join and swap.

The approach also handles IDD relationships.

Future work

View definition generation: generate the view definition in XQuery from the graphical view schema that has been designed.

Query rewriting: rewrite queries on views into queries on the source data.

View update: determine which views are updatable and which are not, and how to update the updatable views.

Q&A

References

1. S. Abiteboul. On views and XML. In Proceedings of the Eighteenth ACM Symposium on Principles of Database Systems, ACM Press, pages 1-9, 1999.

2. S. Abiteboul, B. Amann, S. Cluet, A. Eyal, L. Mignet, and T. Milo. Active views for electronic commerce. In Int. Conf. on Very Large DataBases (VLDB), Edinburgh, Scotland, pages 138-149, 1999.

3. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The Lorel query language for semistructured data. International Journal of Digital Libraries, Volume 1, No. 1, pages 68-88, 1997.

4. C. Baru, A. Gupta, B. Ludaescher, R. Marciano, Y. Papakonstantinou, and P. Velikhov. XML-Based Information Mediation with MIX. ACM-SIGMOD, Philadelphia, PA, pages 597-599, 1999.

5. Gillian Dobbie, Xiaoying Wu, Tok Wang Ling, Mong Li Lee. ORA-SS: An Object-Relationship-Attribute Model for Semi-Structured Data. Technical Report TR21/00, School of Computing, National University of Singapore, 2000.

6. Tok Wang Ling, Mong Li Lee, Gillian Dobbie. Application of ORA-SS: An Object-Relationship-Attribute Model for Semi-Structured Data. In Proceedings of the Third International Conference on Information Integration and Web-based Applications & Services (IIWAS), Linz, Austria, 2001.

7. http://www.w3.org/TR/xquery.

8. http://www.w3.org/XML/Schema.