
Omar Benjelloun – Active XML

Active XML: A data-centric perspective on Web services

Omar Benjelloun
INRIA Futurs

With: Serge Abiteboul, Tova Milo, and many others.

April 30th, 2004


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

Novel issues
• Exchanging Active XML data
• Querying Active XML data

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations

Applications

Conclusion


Introduction


Distributed data management in P2P

Information is everywhere
• Data warehouses
• Databases
• Web sites
• PC, PDA, cell phones, home appliances, cars…

[Figure: peers holding XML data and offering services, connected through the Internet, next to Web services]


The golden triangle of distributed data management

XML: a standard for data representation & exchange
• Extensible Markup Language
• Labeled ordered trees
• Types: XML Schema / tree automata

Query languages
• XPath, XQuery

Web services: standards for distributed computing
• SOAP, WSDL, UDDI
• Activation of methods on remote servers
• Many burgeoning standard proposals (choreography, QoS, user interface, etc.)

[Figure: a triangle whose corners are XML, XQuery/XPath, and SOAP/WSDL]


What is Active XML (AXML)?

AXML is a declarative language for distributed information management, and an infrastructure to support this language, in a peer-to-peer framework.


Active XML


Active XML documents

XML documents with embedded calls to Web services

Intensional
• Some of the data is given explicitly
• Some is given intensionally (i.e., the means to acquire data when needed are given)

Dynamic
• If the external sources change, the same document will provide different information
• Reaction to world changes


Not a new idea in databases, nor on the Web

Mixing calls and data is an old idea
• Procedural attributes in relational systems
• Basis of object-oriented databases

In Web programming
• Sun’s JSP, PHP+MySQL

Calls to Web services inside documents
• Macromedia FLEX, Apache Jelly, Microsoft XAML

What is new is the exploitation of the idea…


Web services in brief

A number of standards
• XML
• SOAP: Exchange of messages between applications
• WSDL: Description of service interfaces (e.g. input/output types)
• UDDI: Advertisement and discovery of services
• … other proposed standards (choreography, security, etc.)

For us: means to provide, invoke and describe remote functions with XML input/output.

They make AXML documents universally understandable.


A sample AXML document

<?xml version="1.0" ?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents">
    exhibits
  </call>
</newspaper>

[Figure: the same document drawn as a tree: newspaper with children title ("Le Monde"), date ("06/10/2003"), a GetTemp call with parameter city "Paris", and a GetEvents call with parameter "Exhibits"]

AXML documents may contain calls:
• to any existing Web services (e-bay.net, google.com…)
• to any AXML Web services (to be defined)


Materialization

[Figure: the sample newspaper document and its tree. The GetTemp call is sent to Yahoo (Y!) as a SOAP call; the result <temp>16°C</temp> is added to the tree as temp "16°C".]

We will see later that:
• Replacing the call by its result is not the only option
• Calls are not necessarily RPC-style synchronous invocations
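Materialization, in its simplest form (replace the call by its result), can be sketched in a few lines. This is only an illustration under stated assumptions: services are local Python callables rather than SOAP endpoints, `get_temp` is a hypothetical stub, and `materialize` is a name chosen here, not part of the AXML system.

```python
import xml.etree.ElementTree as ET

AXML_DOC = """<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp"><city>Paris</city></call>
  <call svc="TimeOut.GetEvents">exhibits</call>
</newspaper>"""


def get_temp(call):
    # Hypothetical local stub standing in for the remote Yahoo.GetTemp service.
    temp = ET.Element("temp")
    temp.text = "16°C"
    return [temp]


# Assumption: services are plain Python callables instead of SOAP endpoints.
SERVICES = {"Yahoo.GetTemp": get_temp}


def materialize(node, services):
    """Replace every <call> under `node` whose service is known by its result."""
    children = []
    for child in list(node):
        svc = child.get("svc") if child.tag == "call" else None
        if svc in services:
            children.extend(services[svc](child))
        else:
            materialize(child, services)  # calls to unknown services stay in place
            children.append(child)
        node.remove(child)
    node.extend(children)


root = ET.fromstring(AXML_DOC)
materialize(root, SERVICES)
print([child.tag for child in root])
```

Only the GetTemp call is replaced here; the GetEvents call stays intensional because no service is registered for it.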


AXML Web services

Parameters: AXML data

Result: AXML data

Distribute computations: by sending as parameters data that contains service calls, one can delegate some work to other peers.

Partial computations: by returning data that contains service calls, one can give the receiver control over these calls.

Great flexibility


Calling an AXML service

<?xml version="1.0" ?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <temp>16°C</temp>
  <call svc="TimeOut.GetEvents">
    exhibits
  </call>
</newspaper>

SOAP call (still…) to TimeOut (T!), which returns AXML data:

<exhibits>
  <call svc="Yahoo.GetExhibits">
    <city>Paris</city>
  </call>
</exhibits>

[Figure: the newspaper tree with temp "16°C" already materialized; the GetEvents call returns an exhibits element that itself contains a GetExhibits call with parameter City "Paris"]

Materialization is a recursive process

Termination is an issue
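Because results may themselves contain calls, a materializer has to recurse, and something must bound that recursion. The sketch below uses a call-nesting budget, which is an assumption made for illustration and not necessarily how the AXML system handles termination. The stubs and the `Loop` service are hypothetical, and the "Le Louvre" payload is made-up data.

```python
import xml.etree.ElementTree as ET


def get_events(call):
    # Hypothetical stub: the answer is itself intensional (it contains a call).
    return [ET.fromstring(
        '<exhibits><call svc="Yahoo.GetExhibits"><city>Paris</city></call></exhibits>')]


def get_exhibits(call):
    # Hypothetical stub with made-up data.
    return [ET.fromstring("<exhibit>Le Louvre</exhibit>")]


def loop(call):
    # Hypothetical service that always answers with a call to itself.
    return [ET.fromstring('<call svc="Loop"/>')]


SERVICES = {"TimeOut.GetEvents": get_events,
            "Yahoo.GetExhibits": get_exhibits,
            "Loop": loop}


def expand(elem, services, budget):
    """Return the elements `elem` materializes to, nesting at most `budget` calls."""
    svc = elem.get("svc") if elem.tag == "call" else None
    if svc in services and budget > 0:
        out = []
        for result in services[svc](elem):
            out.extend(expand(result, services, budget - 1))
        return out
    if elem.tag != "call":
        children = [r for c in list(elem) for r in expand(c, services, budget)]
        for c in list(elem):
            elem.remove(c)
        elem.extend(children)
    # Unknown or over-budget calls are left in place, parameters untouched.
    return [elem]


doc = ET.fromstring('<newspaper><call svc="TimeOut.GetEvents">exhibits</call></newspaper>')
[full] = expand(doc, SERVICES, 2)
print(ET.tostring(full, encoding="unicode"))
```

With a budget of 1 the inner GetExhibits call would remain in the result, and the self-referencing `Loop` service stops once the budget is used up instead of running forever.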


Organization

Novel issues raised by the AXML language
• Exchange of AXML data
• Querying AXML data

Supporting infrastructure
• AXML peers:
– Management of persistent AXML data
– Declarative AXML services

Applications


Novel issues


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

Novel issues
• Exchanging Active XML data (SIGMOD 2003)
• Querying Active XML data

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations

Applications

Conclusion


To call or not to call?

Materialization can be performed by the sender, before sending a document… or by the receiver, after receiving it.

[Figure: the newspaper tree shown twice, on the sender side and on the receiver side; the GetTemp call (city "Paris") can be materialized by Yahoo (Y!) into temp "16°C" on either side, while the GetEvents call on "Exhibits" stays in the document]


Why control the materialization of calls?

For added functionality, e.g.
• Intensional data lets the receiver get up-to-date information.

For security reasons or capabilities, e.g.
• I don’t trust this Web service/domain,
• I don’t have the right credentials to invoke it,
• It costs money,
• Maybe the receiver doesn’t know Active XML!

For performance reasons, e.g.
• A proxy can invoke all the services on behalf of a PDA.

… and many more reasons you can think of!


How to control it? Using types

We extend XML Schema with intensional types: XML Schema_int

Static analysis algorithms use signatures of services: WSDL_int

[Figure: a sender and a receiver, each with its own capabilities, ACL, cost, …, agree on a data exchange schema; the sender's document, rooted at r, contains calls to services f, g and q]


The extended schema language

To simplify, we use here a DTD-like syntax.

Data:
  newspaper = title.date.(GetTemp|temp).(GetEvents|exhibit*)
  title = data
  date = data
  temp = data
  city = data
  exhibit = title.(GetDate|date)

Functions:
  GetTemp(city) -> temp
  GetEvents(data) -> (exhibit|performance)*
  GetDate(title) -> date

Rewriting: replace call(s) by an arbitrary output of the service.

[Figure: the newspaper tree with title "Le Monde", date "06/10/2003", a GetTemp call on city "Paris", and a GetEvents call on "Exhibits"]
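To make the DTD-like content models concrete, the sketch below checks whether the children of a node (function calls included) form a word of the declared regular expression. It is a simplification that only looks at one level of the tree, and `compile_model` and `matches` are names invented here.

```python
import re

# Content models copied from the slide ("." is concatenation).
SCHEMA = {
    "newspaper": "title.date.(GetTemp|temp).(GetEvents|exhibit*)",
    "exhibit": "title.(GetDate|date)",
}


def compile_model(model):
    """Translate a content model into a regex over space-terminated labels."""
    # Every label becomes a group matching "label "; "|", "*", "(" and ")"
    # keep their usual regex meaning, and "." (concatenation) is dropped.
    pattern = re.sub(r"\w+", lambda m: "(?:" + m.group() + " )", model)
    return re.compile(pattern.replace(".", ""))


MODELS = {label: compile_model(model) for label, model in SCHEMA.items()}


def matches(label, children):
    """True if the sequence of child labels is allowed under `label`."""
    word = "".join(child + " " for child in children)
    return MODELS[label].fullmatch(word) is not None


print(matches("newspaper", ["title", "date", "GetTemp", "GetEvents"]))
print(matches("newspaper", ["title", "date", "temp", "performance"]))
```

The first word is accepted because the schema allows both calls to stay intensional; the second is rejected because `performance` is not an allowed child of `newspaper`.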


Rewritings

The goal: given
• an intensional document d
• a schema s,
can we rewrite d so that it matches s?

Safe rewriting: one that for sure leads to s (we know without making any call).

Possible rewriting: one that may lead to s (depending on the answers of services).
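The difference between safe and possible rewritings can be illustrated by brute force on the newspaper example. The sketch assumes one level of rewriting and replaces each output type by a small hand-picked sample of outputs (`OUTPUTS`), so it can refute safety but cannot prove it in general. The target schema is a hypothetical receiver schema that requires `temp` to be materialized; the automata-based algorithm of the talk is not reproduced here.

```python
import re
from itertools import product

# Hypothetical target: title.date.temp.(GetEvents|exhibit*), over "label " tokens.
TARGET = re.compile(r"title date temp (?:GetEvents |(?:exhibit )*)")

# Finite samples of what each service may return (assumption: the real output
# types, such as (exhibit|performance)*, are infinite regular languages).
OUTPUTS = {
    "GetTemp": [["temp"]],
    "GetEvents": [[], ["exhibit"], ["performance"], ["exhibit", "performance"]],
}


def rewritings(word, invoked):
    """Words obtained by replacing the calls at positions `invoked` by an output."""
    choices = [OUTPUTS[label] if i in invoked else [[label]]
               for i, label in enumerate(word)]
    return [[label for part in combo for label in part]
            for combo in product(*choices)]


def classify(word, invoked):
    """'safe' if every sampled outcome matches, 'possible' if at least one does."""
    hits = [TARGET.fullmatch("".join(l + " " for l in w)) is not None
            for w in rewritings(word, invoked)]
    return "safe" if all(hits) else "possible" if any(hits) else "impossible"


WORD = ["title", "date", "GetTemp", "GetEvents"]
print(classify(WORD, {2}))      # invoke GetTemp only
print(classify(WORD, {2, 3}))   # invoke GetTemp and GetEvents
```

Invoking only GetTemp always yields a matching word, while also invoking GetEvents matches only when no `performance` comes back.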


Difficulties

Infinite search space
• Vertical
• Horizontal

Main problem
• The result of a Web service call is unknown,
• We just know a signature (input/output types)

We want a very efficient solution.

Foundations of the problem
• String & tree automata,
• with existential and universal transitions.


Results

The general problem is undecidable [MSS03]

Restrictions on the considered rewritings
• Left-to-right: no “going back and forth”
• K-depth: bound on the nesting of function calls (search space still infinite but finitely representable)

Under these restrictions
• We have algorithms to find safe/possible rewritings.
• They are PTIME (for deterministic schemas).
• We can also do it between schemas.

Implementation
• Demo at VLDB 2003 (customizable news syndication)


Safe rewriting algorithm

Sketch
• Deal with function parameters first,
• Top-down traversal of the tree,
• For each data node:
– rewrite its children (viewed as a word),
– to match the target type (a regular expression)
– using regular automata techniques, and smart marking.


Safe rewriting algorithm (2)

Build an FSA that accepts all k-depth rewritings of the initial word.

Build an FSA that recognizes the complement of the target type.

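[Figure: two finite-state automata. The first (states q0 to q7) reads the word title, date, GetTemp, GetEvents and its rewritings, with additional transitions on temp, exhibit and performance. The second (states p0 to p6) has transitions on title, date, temp, GetEvents and exhibit, plus wildcard (*) transitions.]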


Safe rewriting algorithm (3)

Compute the intersection of these languages:

A smart marking determines whether a safe rewriting exists.

Then run the word on the marked automaton to find an actual rewriting.

Optimization: lazy construction of the automata

[Figure: the product automaton, with states that are pairs such as (q0,p0), (q1,p1), (q2,p2), (q3,p3), (q4,p4) and transitions labeled title, date, temp, GetTemp, GetEvents, exhibit and performance]


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

Novel issues
• Exchanging Active XML data
• Querying Active XML data (SIGMOD 2004)

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations

Applications

Conclusion


Querying AXML Data

Given a (tree pattern) query:
/newspaper[temp > 18°C]/exhibits//exhibit[location="Le Louvre"]

Materialize the document?

Call only the services that may contribute data to the query answer.

The problem: lazy evaluation of service calls

To call or not to call, this time when evaluating a query

[Figure: the newspaper tree with title "Le Monde", a getDate call, a GetTemp call on city "Paris" (result temp "19°C"), a GetEvents call on "Exhibits", and an exhibits element containing a GetExhibits call on City "Paris"]


Lazy evaluation

Difficulties:
• Calls can be found everywhere in the document
• May appear dynamically (as a result of previous calls)
• May become (ir)relevant due to previous invocations
• Need to take signatures of calls into consideration

A possible approach: modify the query processor
• Top-down evaluation
• Trigger the calls found on the way
• Not so great:
– Computation is blocked
– Optimization opportunities are lost


Our solution

Given a query to evaluate: derive a set of “node-focused” queries (NFQs) that find the relevant calls when evaluated on the document.

The NFQs need to be reevaluated as the document evolves!

[Figure: the tree pattern query (newspaper with temp > 18°C and exhibits / exhibit / location "Le Louvre") and node-focused queries derived from it, in which parts of the pattern are replaced by wildcards (*), etc.]
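The idea of finding only the relevant calls can be sketched for a much simpler query class than tree patterns. The code below assumes a linear, child-axis-only path and treats a call as relevant when it sits where the next step of the path could be produced. The optional `signatures` argument (service name to possible output labels) stands in for the type-based refinement. The function name and the `Ads.GetBanner` service are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = ET.fromstring("""<newspaper>
  <title>Le Monde</title>
  <call svc="Yahoo.GetTemp"><city>Paris</city></call>
  <exhibits>
    <call svc="Yahoo.GetExhibits"><city>Paris</city></call>
  </exhibits>
  <ads><call svc="Ads.GetBanner"/></ads>
</newspaper>""")


def relevant_calls(node, steps, signatures=None):
    """Calls that may contribute to the child-axis path `steps` rooted at `node`."""
    signatures = signatures or {}
    if not steps or node.tag != steps[0]:
        return []  # off the query path: nothing below can contribute
    found = []
    for child in node:
        if child.tag == "call":
            svc = child.get("svc")
            outputs = signatures.get(svc)  # None means the signature is unknown
            if len(steps) > 1 and (outputs is None or steps[1] in outputs):
                found.append(svc)
        else:
            found.extend(relevant_calls(child, steps[1:], signatures))
    return found


PATH = ["newspaper", "exhibits", "exhibit"]
print(relevant_calls(DOC, PATH))
print(relevant_calls(DOC, PATH, {"Yahoo.GetTemp": {"temp"}}))
```

Since invoking a call can bring new calls into the document, such a function would have to be run again after each round of invocations.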


Optimizations

Service calls sequencing
• Analysis of the relationship between calls (through the NFQs)
• Layering, and parallelization inside each layer.

Refinement via type analysis
• Matching output types of services with data expected of queries

“Pushing” queries to capable services

Acceleration:
• Via relaxation:
– NFQ approximation
– Superset of the relevant calls
• Via a special access structure, similar to a DataGuide:
– Restricted to paths that lead to service calls
– Indexes the calls

Experimental assessment
• 10x speed-up when combining optimizations


Active XML peers


Distributed data management in P2P

[Figure: the peer-to-peer picture from the introduction, with peers holding XML data and offering services over the Web next to Web services; the peers are now labeled AXML]


What do we need from an AXML system ?

Persistent, manageable, dynamic AXML data.

Easy ways to define services

Control of the exchanged data (parameters & results of service calls)

Peer-to-peer architecture, where each AXML peer is a:
• Repository: manages persistent AXML data
• Client: uses (AXML) Web services
• Server: provides AXML services

[Figure: AXML peers communicating through SOAP]


Global architecture

[Figure: AXML peers S1, S2 and S3 exchange AXML data through SOAP. A peer comprises an AXML store, service descriptions, an AXML engine and a query engine (query, read, update), together with a SOAP client and a SOAP service; a SOAP wrapper exchanges plain XML.]


Implementation

SUN’s Java SDK 1.4 (includes XML parser, XPath processor, XSLT engine)

Apache Tomcat 4.1 servlet engine

Apache Axis SOAP toolkit 1.1

X-OQL query processor, persistent DOM repository

JSP-based Web user interface, using JSTL 1.0 standard tag library

Also, a lightweight implementation for PDA/phone (J2ME, CLDC profile), used for [ABB03demo].


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

New issues
• Exchanging Active XML data
• Querying Active XML data

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations

Applications
• P2P auctions
• News syndication
• Other applications

Conclusion


Managing persistent AXML data

“Our newspaper should have its temperature information refreshed daily. New exhibits should be fetched every week and archived for 6 months”

Service call results enrich the document (calls can be kept for possible future reuse)

Main issues:
• When to activate a service call?
• What to do with its result?


When to activate a service call?

Explicit pull mode
• Daily, weekly, or after some event: e.g., when another call occurs
• This aspect of the problem is related to active databases

Implicit pull mode
• Detect which intensional information (the service calls) may contribute to the answer of a query (lazy evaluation)
• This aspect of the problem is related to deductive databases

Push mode
• Based on a query subscription; the service provider pushes information to the client (e.g., for synchronization purposes)
• This is related to stream and subscription queries


Managing service call results

How long does the returned data remain valid?

• Just long enough to answer a query: Mediation

• 1 day, 1 week, … or unbounded: Caching / Warehousing

• Various portions of the document may follow different policies: Hybrid

For repeated service call invocations: merge policy

• append,

• replace,

• Fusion (using XML Schema-like keys),

• Specific merge policies can be provided as Web services


Example: AXML document with control attributes

<?xml version="1.0" ?>
<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp" mode="lazy" valid="1 day" merge="replace">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents" mode="every Monday morning" valid="6 months" merge="append">
    exhibits
  </call>
</newspaper>
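As a rough illustration of how a peer might read these control attributes, the sketch below extracts `mode`, `valid` and `merge` for each call and implements the two simplest merge policies. The validity parser (with a month counted as 30 days) and the list-based `merge` helper are assumptions made for this example, not the AXML implementation.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta

DOC = """<newspaper>
  <title>Le Monde</title>
  <date>06/10/2003</date>
  <call svc="Yahoo.GetTemp" mode="lazy" valid="1 day" merge="replace">
    <city>Paris</city>
  </call>
  <call svc="TimeOut.GetEvents" mode="every Monday morning" valid="6 months" merge="append">
    exhibits
  </call>
</newspaper>"""

UNIT_DAYS = {"day": 1, "week": 7, "month": 30}  # assumption: a month is 30 days


def parse_validity(text):
    """Turn a string such as '6 months' into a timedelta."""
    count, unit = text.split()
    return timedelta(days=int(count) * UNIT_DAYS[unit.rstrip("s")])


def merge(previous, new, policy):
    """Combine earlier results of a call with new ones (append or replace only)."""
    if policy == "replace":
        return list(new)
    if policy == "append":
        return list(previous) + list(new)
    raise ValueError("unsupported merge policy: " + policy)


controls = {
    call.get("svc"): (call.get("mode"), parse_validity(call.get("valid")), call.get("merge"))
    for call in ET.fromstring(DOC).iter("call")
}
print(controls["Yahoo.GetTemp"])
```

Fusion with keys and merge policies provided as Web services are left out of this sketch.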


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

Novel issues
• Exchanging Active XML data
• Querying Active XML data

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations

Applications

Conclusion


Declarative AXML services

Services can be defined by queries or updates over the AXML documents of the repository (XQuery, XPath, XUpdate)

Which (lazy) service calls may contribute to the answer?

let service GetExhibitsByLocation($loc) be
  for $a in document("newspaper.xml")/newspaper/exhibits,
      $b in $a//exhibit
  where $b/@name = $loc
  return <exhibits> {$b} </exhibits>
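For readers less familiar with XQuery, the same service can be approximated in Python over an already materialized document. This is a sketch under assumptions: the function takes the parsed `newspaper` element instead of opening `newspaper.xml`, it ignores any calls still present in the document, and the sample data is made up.

```python
import copy
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only.
SAMPLE = """<newspaper>
  <exhibits>
    <exhibit name="Le Louvre"><title>A</title></exhibit>
    <group>
      <exhibit name="Orsay"/>
      <exhibit name="Le Louvre"/>
    </group>
  </exhibits>
</newspaper>"""


def get_exhibits_by_location(newspaper, loc):
    """One <exhibits> wrapper per <exhibit> below /newspaper/exhibits named `loc`."""
    if newspaper.tag != "newspaper":
        return []
    results = []
    for a in newspaper.findall("exhibits"):   # $a in /newspaper/exhibits
        for b in a.iter("exhibit"):           # $b in $a//exhibit (any depth)
            if b.get("name") == loc:          # where $b/@name = $loc
                wrapper = ET.Element("exhibits")
                wrapper.append(copy.deepcopy(b))
                results.append(wrapper)
    return results


answers = get_exhibits_by_location(ET.fromstring(SAMPLE), "Le Louvre")
print(len(answers))
```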


Other means to define services

Other programming languages:
• XSLT transformations (through Apache Xalan)
• Java classes (through Axis)

Composition of existing services:
• BPEL4WS (through IBM’s BPEL4J implementation)


Active XML - Outline

Introduction

Active XML
• Active XML documents
• Active XML services

New issues
• Exchanging Active XML data
• Querying Active XML data

Active XML Peers
• The peer as a client
• The peer as a server
• Theoretical foundations (PODS 2004)

Applications

Conclusion


Theoretical foundations: Positive AXML

Restricted framework
• Data model
– Set-based (unordered) AXML trees
– Call results are accumulated in documents
• Services
– Monotone
– Positive: defined by conjunctive fragment of XQuery

Results
• Well-defined (possibly infinite) fix-point semantics
• Termination, lazy evaluation…

Connections to:
• Regular (infinite) trees, Query-Sub-Query [AM04],…
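The fix-point semantics can be illustrated on a deliberately flattened model: the document is a set of facts rather than a tree, and each service is a monotone function from the current set to a set of new facts. Results are accumulated until nothing changes. The names `fixpoint` and `compose` and the `max_rounds` cut-off are assumptions of this sketch, not part of the Positive AXML formalism.

```python
def compose(facts):
    # A monotone, conjunctive "service": join pairs that share an endpoint.
    return {(a, d) for (a, b) in facts for (c, d) in facts if b == c}


def fixpoint(document, services, max_rounds=100):
    """Accumulate service results into `document` until no new fact appears."""
    current = frozenset(document)
    for _ in range(max_rounds):
        new = current.union(*(service(current) for service in services))
        if new == current:
            return current
        current = new
    # The semantics may be infinite, so the iteration has to be cut off.
    raise RuntimeError("no fix-point reached within max_rounds")


edges = {(1, 2), (2, 3), (3, 4)}
closure = fixpoint(edges, [compose])
print(sorted(closure))
```

A service that keeps producing fresh facts, for example one that returns the successor of the largest number seen so far, never stabilizes; the cut-off then raises an error, which mirrors the "possibly infinite" case on the slide.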


Applications


Demos

Peer-to-peer auctions (VLDB 2002 demo)
• Discovery of new peers/auctions through intensional answers

RSS News syndication (VLDB 2003 demo 1)
• Customization of services through schemas + news subscriptions

Distributed workspaces (VLDB 2003 demo 2)

Web warehousing (ECDL 2003 demo)

A powerful framework for the fast development of distributed, data-centric applications.


Other applications

E.dot, a dynamic warehouse on food risk management
• Use AXML as the platform for the warehouse definition, construction and maintenance

Network configuration
• Use AXML exchange of information to configure hardware/software components

Software distribution
• Use AXML to customize distributions and keep your view of the software fresh

Decentralized user profile/patient data management
• Use AXML to coordinate the integration of data, and privacy enforcement services in a uniform way


Conclusion


AXML documents and services

A simple paradigm…

…that allows for new, powerful features.

• Intensional parameters and results: AXML documents can be exchanged
• Support for continuous services (streams of answers)
• Control over the exchange of AXML data

Issues: control of call activation via typing, lazy evaluation, replication and distribution, security, mobility, termination, implementation, foundations, …


Current/Future work

Security and privacy (with Bell Labs)

Editor/browser plug-in for AXML

Mass storage XML DB (with Xyleme Corp.)

P2P infrastructure


To know more…

http://purl.org/net/axml
• Implementation becomes open-source
• Already available for research
• Will be released publicly very soon.

Selected publications
• S. Abiteboul, O. Benjelloun, T. Milo: Positive Active XML, PODS, 2004.
• S. Abiteboul, O. Benjelloun, B. Cautis, I. Manolescu, T. Milo, N. Preda: Lazy Query Evaluation for Active XML, SIGMOD, 2004.
• T. Milo, S. Abiteboul, B. Amann, O. Benjelloun, F. Dang Ngoc: Exchanging Intensional XML Data, SIGMOD, 2003 (full version to appear in TODS).
• S. Abiteboul, O. Benjelloun, I. Manolescu, T. Milo, R. Weber: Active XML: A Data-Centric Perspective on Web Services (book chapter), in Web Dynamics, Springer, 2004.
• S. Abiteboul, A. Bonifati, G. Cobena, I. Manolescu, T. Milo: Dynamic XML Documents with Distribution and Replication, SIGMOD, 2003.


Thank you


Extra slides


Asynchronous/Continuous services

The client subscribes and then is notified

The server decides when to send data
• E.g., promotional offers

Change control:
• Management of replication [ABCMM03]
• What to do when a change is detected
– Send the new state of data
– Send the delta between old and new state
– Dual of merge policies


Peer-to-peer auctions (VLDB 2002 demo)

Each peer proposes auctions:
• Document myauctions.xml with the peer’s items and their current bids
• Services offered:
– getLocalAuctions(),
– status(auctionId)

Each peer bids on auctions:
• Document mybids.xml with the peer’s bids
• Services offered:
– bid(peer, auctionId, amount)
– bidUpTo(peer, auctionId, increment, limit)

Each peer knows about other peers’ auctions:
• Document allauctions.xml contains calls to other peers that transitively retrieve their known auctions.
• Service offered: getAllAuctions()

When an auction closes, the winner is notified.
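The transitive retrieval behind getAllAuctions() raises the termination question again, since peers may know each other in cycles. The sketch below is one way to picture it, not the demo's actual mechanism: peers are entries of a local dictionary, names and items are made up, and a `visited` set (an assumption of this example) stops the recursion from looping.

```python
# Hypothetical peers: each has local auctions and a list of known peers.
PEERS = {
    "alice": {"auctions": ["lamp"], "knows": ["bob"]},
    "bob": {"auctions": ["bike"], "knows": ["carol", "alice"]},
    "carol": {"auctions": [], "knows": ["alice"]},
}


def get_all_auctions(peer, visited=None):
    """Local auctions of `peer` plus, transitively, those of the peers it knows."""
    visited = set() if visited is None else visited
    if peer in visited:
        return []  # already asked: avoids looping on cyclic acquaintance
    visited.add(peer)
    auctions = [(peer, item) for item in PEERS[peer]["auctions"]]
    for other in PEERS[peer]["knows"]:
        auctions.extend(get_all_auctions(other, visited))
    return auctions


print(get_all_auctions("alice"))
```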


News syndication (VLDB 2003 demo)

News sources:
• GetStory(id)
• GetNewsAbout(kwd)

Aggregators:
• GetNewsAbout(kwd)
• …but several versions, more or less intensional

Clients:
• PC, laptops, PDAs


Service customization using schemas

Customizing the output of services
• News sources/aggregators provide different versions of GetNewsAbout with different output schemas
• The output is automatically transformed into the desired schema
• Clients can also specify a desired output schema as a parameter

Customizing the input of services
• Location-aware continuous services for mobile users
• The context of the user is given by intensional parameters

Distributed logging mechanism
• Also customizable through the use of schemas


Call parameters

To call or not to call (before invoking)?

XML:
<temp>
  <call svc="[email protected]"><city>"Denver"</city></call>
</temp>

AXML:
<temp>
  <call svc="[email protected]">
    <city>
      <call svc="[email protected]">"colorado"</call>
    </city>
  </call>
</temp>

XPath:
<temp>
  <call svc="[email protected]">../../city</call>
</temp>