Active documents and Active XML
Serge Abiteboul, INRIA Saclay, Collège de France, ENS Cachan
4/4/12
Organization
– Introduction
– Modeling data intensive distributed systems
– Query optimization in distributed systems
– Monitoring in distributed systems
– Task sequencing in distributed systems
– Conclusion
Introduction
Context: Web data management
Scale: lots of servers, large volume of data
Servers are autonomous (heterogeneous also)
Data may be very dynamic, heavy update rates
Peers are possibly moving
Relation → Tree
Centralized → Distributed
Precise data → Incomplete, probabilistic
Precise schemas → Ontologies
The focus in this class
The lesson from the past
The success of the relational model with 2D tables on local servers
– A logic for defining tables
– An algebra for describing query plans over tables
We should do similarly for trees in a distributed environment
– A logic for defining distributed trees and data services
– An algebra for optimizing queries over trees/services
Roadmap
1. Modeling: the AXML model of active documents
– Views: to capture intensional data (key concept for data management)
– Streams: to capture exchanges of data and evolution (key concept for distribution and evolution)
2. Optimization: an algebra for AXML
3. Monitoring: based on AXML documents
4. Task sequencing: a workflow based on AXML documents
– In the spirit of business artifacts
Modeling data intensive distributed systems
Active XML
Active XML
Based on Web standards: XML + Web services + XPath/XQuery
Idea: exchange XML documents with embedded function calls (active, evolving documents)
XML: unordered, unranked, labeled trees
– Internal nodes are labeled by tags
– Leaves are labeled by tags, data, or function symbols
– Set semantics: no isomorphic sibling subtrees
The functions are interpreted as calls to external services
– Embedding calls in data is an old idea in databases
[Figure: example tree with nodes labeled a, b, c, d]
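The data model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Node` and `Call`, the `services` registry, and the `materialize` function are hypothetical and are not part of the AXML system. The sketch also simplifies in two ways: a call is replaced by its result rather than kept next to it, and set semantics is not enforced.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Node:
    """A passive node: a tag (internal node) or a tag/data value (leaf)."""
    label: str
    children: List["Tree"] = field(default_factory=list)


@dataclass
class Call:
    """An embedded function call, written !name on the slides."""
    name: str


Tree = Union[Node, Call]

# Hypothetical registry: each service returns a forest, which may itself
# contain further calls.
Services = Dict[str, Callable[[], List[Tree]]]


def materialize(tree: Tree, services: Services, max_depth: int = 10) -> List[Tree]:
    """Pull mode: replace each call by the forest returned by its service."""
    if isinstance(tree, Call):
        if max_depth == 0 or tree.name not in services:
            return [tree]  # leave the call in place (data stays intensional)
        result: List[Tree] = []
        for sub in services[tree.name]():
            result.extend(materialize(sub, services, max_depth - 1))
        return result
    new_children: List[Tree] = []
    for child in tree.children:
        new_children.extend(materialize(child, services, max_depth))
    return [Node(tree.label, new_children)]


def show(tree: Tree) -> str:
    """Render a tree as a compact string; calls are prefixed with '!'."""
    if isinstance(tree, Call):
        return "!" + tree.name
    if not tree.children:
        return tree.label
    return tree.label + "(" + ", ".join(show(c) for c in tree.children) + ")"


doc = Node("songs", [Node("t", [Node("t1"), Node("m1")]), Call("songs@p2")])
services = {"songs@p2": lambda: [Node("t", [Node("t2"), Node("m2")])]}
print(show(doc))
print(show(materialize(doc, services)[0]))
```

The `max_depth` bound is there because services may return further calls, possibly recursively; a real system needs a policy for when to stop activating them.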
Example
[Figure: a document root@p1 whose songs element contains the calls !songs@p2 and !songs@p3 and the entries t (t1, m1), t (t2, m2), t (t3, m3, !f3); a second tree contains the call !songs@p1 and the entries t (t4, m4), t (t5, m5, !f5)]
Leads to evolving trees
– Intensional data: get the data only when desired
– Dynamic data: if data sources change, the document changes
– Flexible data: adapt to the needs
– Functions in push & pull mode
Query root/songs/t
[Figure: the same documents as in the previous example, annotated “Recursive calls”]
root//t[//singer/“Brel”]
[Figure: the same documents as in the previous example]
Push queries to data sources
– !songs@p3: root//t[//singer/“Brel”]
– !songs@p2: root//t[//singer/“Brel”]
– !songs@p1: root//t[//singer/“Brel”]
– Distributed query/subquery (or Magic Set)
This is distributed datalog over trees
songs@p1(x,y) :- t@p1(x,y)
songs@p1(x,y) :- songs@p2(x,y)
songs@p1(x,y) :- songs@p3(x,y)
songs@p2(x,y) :- t@p1(x,y)
songs@p2(x,y) :- songs@p1(x,y)
songs@p2(x,y) :- songs@p3(x,y)
songs@p3(x,y) :- t@p1(x,y)
songs@p3(x,y) :- songs@p1(x,y)
songs@p3(x,y) :- songs@p2(x,y)
:- songs@p1(x, y), P(x)
:- songs@p2(x, y), P(x)
:- songs@p3(x, y), P(x)
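The rules above can be evaluated with a naive bottom-up fixpoint. The sketch below is a centralized simulation only: the rule list is transcribed from the slide, while the sample facts, the `fixpoint` and `query` helpers, and the predicate standing in for P are assumptions made for illustration. It does not implement query/subquery or Magic Set, which would push P towards the sources instead of computing every relation in full.

```python
from typing import Callable, Dict, List, Set, Tuple

Fact = Tuple[str, str]

# Rules "head :- body" transcribed from the slide; each rule copies a relation.
RULES: List[Tuple[str, str]] = [
    ("songs@p1", "t@p1"), ("songs@p1", "songs@p2"), ("songs@p1", "songs@p3"),
    ("songs@p2", "t@p1"), ("songs@p2", "songs@p1"), ("songs@p2", "songs@p3"),
    ("songs@p3", "t@p1"), ("songs@p3", "songs@p1"), ("songs@p3", "songs@p2"),
]


def fixpoint(edb: Dict[str, Set[Fact]]) -> Dict[str, Set[Fact]]:
    """Naive evaluation: apply all rules until no new fact is derived."""
    db = {name: set(facts) for name, facts in edb.items()}
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            new = db.get(body, set()) - db.setdefault(head, set())
            if new:
                db[head] |= new
                changed = True
    return db


def query(db: Dict[str, Set[Fact]], relation: str,
          p: Callable[[str], bool]) -> Set[Fact]:
    """Answer ':- relation(x, y), P(x)' on the computed database."""
    return {(x, y) for (x, y) in db.get(relation, set()) if p(x)}


# Hypothetical sample data for the local relation t@p1.
edb = {"t@p1": {("t1", "m1"), ("t2", "m2")}}
db = fixpoint(edb)
print(query(db, "songs@p3", lambda x: x == "t1"))
```

The loop terminates because facts are only ever added and the set of derivable facts is finite, which is also why the mutually recursive calls between peers are harmless in this setting.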
Fun issues: the semantics of calls
When to activate the call?
– Explicit pull mode: active databases
– Implicit pull mode: deductive databases
– Push mode: query subscription
What to do with its result?
How long is the returned data valid?
Sending an AXML document: evaluate the service calls before sending or not?
Exchanging AXML data
Web services exchange intensional documents
Materialization can be performed
– by the sender, before sending a document, or
– by the receiver, after receiving it.
[Figure: a newspaper document with title “Le Monde”, date “06/10/2003”, a call GetTemp on city “Paris”, and a call GetEvents on “Exhibits”; the document is transferred between peers, and materializing GetEvents yields exhibit entries (“Matisse...”)]
Some reasons for not materializing data before sending the document
Freshness
– The receiver will get up-to-date information when needed
Security
– Only the receiver has the credentials to call the service
– One needs to record who is actually using the data
Performance
– To save on the bandwidth of the sender
– To delegate work to someone else
How to specify it: casting based on types ☞ jewel section
Complex issues
Brings into a single setting
– distributed db
– deductive db
– active db
– stream data
– warehousing & mediation
This seems to us necessary for capturing all the facets of data management in distributed systems
Is this unreasonable? Yes!
Query optimization in distributed systems
Active XML Algebra optimization
AXML system
A system = a set of peers
– Each peer provides storage and query processing
– Each peer hosts active documents: extensional data and intensional data (query calls in the document)
Problem: given a query q at some peer, evaluate the answer to q with optimal response time
[Figure: architecture of a peer, with a query processor, an optimizer, a communication module, AXML docs, stats, and a workspace]
Local and global query processing
Local processing ☛ Input/output streams
Local query optimization
Global processing ☛ Streams for communications
Global query optimization ☛ Delegate work to other peers
[Figure: a query plan with two input streams, operators π, ⨝, π, σ, and an output stream, distributed over peers p1, p2, p3, p4]
Example 1: Local and global optimization
p3 asks for σ ( R@p1 ∪ S@p2 )
[Figure: three query plans. Initial plan: Peer 1 sends R (snd1) and Peer 2 sends S (snd2); Peer 3 receives both (rcv1, rcv2), takes the union, then applies σ. Local rewriting: selection and union commute, so Peer 3 applies σ to each received stream before the union. Global rewriting: push selections to the sources, so σ is applied to R at Peer 1 and to S at Peer 2 before sending.]
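The two rewritings in Example 1 rest on the identity σ(R ∪ S) = σ(R) ∪ σ(S). A minimal sketch, assuming relations are Python sets of hypothetical (title, singer) rows and counting shipped rows as a stand-in for communication cost (the function names are illustrative, not part of the AXML algebra):

```python
from typing import Callable, Set, Tuple

Row = Tuple[str, str]  # hypothetical (title, singer) rows


def select(pred: Callable[[Row], bool], rows: Set[Row]) -> Set[Row]:
    """The selection operator σ."""
    return {r for r in rows if pred(r)}


def plan_initial(r: Set[Row], s: Set[Row],
                 pred: Callable[[Row], bool]) -> Tuple[Set[Row], int]:
    """Ship R and S unfiltered to peer 3, then union and select there."""
    shipped = len(r) + len(s)
    return select(pred, r | s), shipped


def plan_pushed(r: Set[Row], s: Set[Row],
                pred: Callable[[Row], bool]) -> Tuple[Set[Row], int]:
    """Apply σ at peers 1 and 2 and ship only the matching rows."""
    r_sel, s_sel = select(pred, r), select(pred, s)
    return r_sel | s_sel, len(r_sel) + len(s_sel)


R = {("Amsterdam", "Brel"), ("Yesterday", "Beatles")}
S = {("Ne me quitte pas", "Brel"), ("Imagine", "Lennon")}
is_brel = lambda row: row[1] == "Brel"

answer_a, cost_a = plan_initial(R, S, is_brel)
answer_b, cost_b = plan_pushed(R, S, is_brel)
print(answer_a == answer_b, cost_a, cost_b)
```

Both plans return the same answer; the pushed plan ships fewer rows whenever the selection is selective, at the price of doing work at the source peers.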
Example 2: MapReduce
[Figure: a plan in which Peer 1 and Peer 2 send R and S to Peer 3 (snd1/rcv1, snd2/rcv2), and its MapReduce-style rewriting, in which map operators run next to the data and intermediate results are routed through Middleware 1 and Middleware 2 over send/receive channels before reaching Peer 3]
The Active XML algebra
Passive nodes: annotated with labels (e.g. root, a, b)
Query nodes: annotated with queries, for instance tree-pattern queries (e.g. q)
Send/receive nodes: annotated with channel ids (e.g. snd2, rcv2, rcv1)
– Internal channel: a send node and a receive node (snd2, rcv2)
– Input channel: a receive node with no send node (rcv1)
[Figure: a tree with root, passive nodes a and b, a query node q, and nodes rcv1, rcv2, snd2]
Evolution of a system
A system evolves by activating:
– a query node
– a send/receive node on an internal channel
– a receive node on an input channel
Equivalence problem for AXML systems

| | No query | TPQ | TPQ with XPath joins | TPQ with joins | TPQ with constructor |
|---|---|---|---|---|---|
| No input | PTIME | PTIME | PTIME | Hard | Undecidable |
| Input | PTIME | Hard | Hard | ? | Undecidable |

Complexity increases with:
– richer query language
– the presence of input
Axiomatization of equivalence in absence of queries
Optimization
As usual
– Use algebraic rewriting rules
– Use simplistic estimators for query plans
– Use heuristics to prune the search space
Examples of performance optimization techniques
Externalize data in devices with limited capabilities
– Cell phones, tablets, home appliances…
– Limited storage space, computational power, network bandwidth
Replicate documents and services
– To allow for “local” computation
– To increase parallelism
Externalization and replication
Monitoring in distributed systems
The Axlog system
Monitoring distributed systems
Distributed applications are often very dynamic
– Content changes rapidly
– Intense communications
– Peers sometimes join and leave
Such systems are complex and hard to control
– Many peers
– Peers are distributed & autonomous
– Peers are sometimes unreliable and selfish
Goal: monitor such systems
Architecture
[Figure: publishers and alerters (e.g. RSS) produce streams; stream processors, including the Axlog processor, consume them and trigger actions]
Axlog principle = ac#ve document & query
Incoming streams of updates The outgoing stream is defined
by a query Q (e.g. TPQ) Each #me an incoming
message arrives, it modifies the document so possibly the query result
The output stream specifies how the view is modified
Incremental view maintenance
Query
AXML document
Updates
32
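The principle above, updates in and view changes out, can be sketched as follows. This is a deliberately small model under stated assumptions: the document is a set of strings, the query is a plain selection predicate rather than a tree-pattern query, and the class name `MonitoredView` is hypothetical. It is not the Axlog engine, which relies on datalog and the Δ technique.

```python
from typing import Callable, List, Set, Tuple

Update = Tuple[str, str]  # ("+", item) inserts an item, ("-", item) deletes it


class MonitoredView:
    """Keeps a selection view up to date and emits the changes to the view."""

    def __init__(self, query: Callable[[str], bool]) -> None:
        self.query = query
        self.document: Set[str] = set()
        self.view: Set[str] = set()

    def apply(self, update: Update) -> List[Update]:
        """Apply one incoming update; return the resulting output deltas."""
        op, item = update
        if op == "+" and item not in self.document:
            self.document.add(item)
            if self.query(item):
                self.view.add(item)
                return [("+", item)]
        elif op == "-" and item in self.document:
            self.document.remove(item)
            if item in self.view:
                self.view.remove(item)
                return [("-", item)]
        return []  # the update did not change the query result


alerts = MonitoredView(lambda msg: "error" in msg)
for incoming in [("+", "error: disk"), ("+", "ok"), ("-", "error: disk")]:
    print(incoming, "->", alerts.apply(incoming))
```

Each update is handled without re-evaluating the query over the whole document, which is the point of incremental view maintenance; an update that does not affect the view produces no output.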
Axlog engine
Datalog is used to evaluate queries, with benefits from
– Incremental view maintenance in datalog: Δ technique
– Query optimization in datalog: MagicSet
– Constraint query languages: CQL
Specific techniques
– Push queries to the sources to avoid loading irrelevant data
– Use of FSA on XML inputs: YFilter
Task sequencing in distributed systems
Task sequencing and verification
• Task sequencing is a major difficulty for distributed systems
– Difficulty to integrate workflow systems (sequencing tasks) and database systems (DBMSs exchanging data)
• Verification of temporal properties is hard
– Typically verification is harder than evaluation
• Evaluating an FO query is in PTIME data complexity
• Verifying that Q ⊆ Q’ is undecidable
– Verification will be the topic of the seminar by Victor Vianu
Example: Dell Supply Chain
[Figure: supply chain linking Customer, Web Store, Bank, Plant, Warehouse, Shipping, and Supplier]
AXML as business artifacts
Concept introduced by IBM [Nigam & Caswell 03, Hull & Su 07]
Data-centric workflows
− A process is described by a document (possibly moving in the enterprise)
− The behavior of an artifact is specified by some constraints on its evolution
Vs. state-transition-based workflows
• Based on some form of state transition diagrams (BPEL, Petri nets, …)
• Mostly ignore data
Example artifact:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Product: committed
Ref: PC 456
Factory: Milano
Parts: waiting
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account …
Delivery: not-active
AXML artifacts move between peers
In webStore:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: on-going
Ref: PC 456
Factory: undecided
Parts: not-active
orderDate: 2009/07/24
Site: http://d555.com
Payment: pending
Delivery: not-active
In plant:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: committed
Ref: PC 456
Factory: Milano
Parts: on-going
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account …
Delivery: not-active
In delivery:
webOrder id=7787780
Customer
Name: John Doe
Address: Sèvres
Order selection: committed
Ref: PC 456
Factory: Milano
Parts: done
orderDate: 2009/07/24
Site: http://d555.com
Payment: done
Bank-account: CEIF-4457889
Delivery: on-going
Address: Orsay
[Figure: peers WEBSTORE, PLANT, DELIVERY, CREDIT APPROVAL, WAREHOUSE, and ARCHIVE, together with a catalogue]
Sequencing of operations
Different ways of expressing sequencing of tasks
– Guards: preconditions for function calls
– Transition-based diagrams
– Formulas in temporal logic
Study how they can simulate each other using some “scratch paper”
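Of the three styles listed above, guards are the easiest to sketch. The example below is only an illustration: the field names and status values are taken from the webOrder artifact shown earlier in the deck, while the function names `pay` and `deliver`, their guards, and their effects are assumptions invented for the example and do not describe the actual AXML artifact model.

```python
from typing import Callable, Dict, List

Artifact = Dict[str, str]

# Hypothetical guards: a function may be called only if its guard holds.
GUARDS: Dict[str, Callable[[Artifact], bool]] = {
    "pay": lambda a: a["Order selection"] == "committed"
    and a["Payment"] == "pending",
    "deliver": lambda a: a["Payment"] == "done"
    and a["Parts"] == "done"
    and a["Delivery"] == "not-active",
}

# Hypothetical effects: how a call changes the artifact.
EFFECTS: Dict[str, Callable[[Artifact], None]] = {
    "pay": lambda a: a.update({"Payment": "done"}),
    "deliver": lambda a: a.update({"Delivery": "on-going"}),
}


def enabled(artifact: Artifact) -> List[str]:
    """Names of the functions whose guard currently holds."""
    return sorted(name for name, guard in GUARDS.items() if guard(artifact))


def call(artifact: Artifact, name: str) -> None:
    """Activate a function call, refusing it if the guard does not hold."""
    if not GUARDS[name](artifact):
        raise ValueError(f"guard of {name!r} does not hold")
    EFFECTS[name](artifact)


order = {"Order selection": "committed", "Payment": "pending",
         "Parts": "done", "Delivery": "not-active"}
print(enabled(order))
call(order, "pay")
print(enabled(order))
```

Here the sequencing "pay before deliver" is never stated explicitly; it follows from the guards reading the data, which is the data-centric view, in contrast to a transition diagram that would fix the order directly.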
A jewel of active documents
Casting a document to a target type
The casting problem
Given
– An active document I
– The signature of the functions
– And a target type T
Which functions to call to be sure to reach T?
2-player game
– Juliet chooses which function to call
– Romeo chooses a value within the domain of the function
Juliet wins if she can reach a document in T
An abstraction: active context-free games
On words instead of trees
– Game (Σ, R, T)
• Σ is a finite alphabet
• R is a set of CF rules
• T is a regular target language
– w is the start word
Output: true if Juliet has a winning strategy
Alternation of ∃ states (Juliet picks the next function to call) and ∀ states (the adversary Romeo picks the answer)
Examples
Rules: a→abc*; b→(ba)*b; c→ab
Target: abab(ab)*
• Winning
• Start word aba
• Strategy
– Call the second a
– Call all the c’s
– Obtain a word in Target
• Losing
• Start word ab
• No strategy
– Initially #(a) – #(b) = 0
– If I call a or b, #(a) – #(b) < 0
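The winning example can be checked mechanically for a range of Romeo's answers. When Juliet calls the second a of aba, Romeo picks some n and answers abcⁿ; calling each c then leaves him no choice, since c→ab has a single outcome. The sketch below replays that strategy; the helper names and the bound on n are assumptions, and a bounded check of this kind is an illustration, not a proof or a general game solver (the general problem is undecidable).

```python
import re

# Target language abab(ab)* from the slide.
TARGET = re.compile(r"abab(ab)*")


def juliet_plays(n: int) -> str:
    """Replay Juliet's strategy on 'aba' when Romeo answers a -> a b c^n."""
    word = "ab" + "ab" + "c" * n      # call the second a of 'aba'
    return word.replace("c", "ab")    # call every c: c -> ab is forced


def strategy_wins(max_n: int) -> bool:
    """Check that the strategy reaches the target for every n up to max_n."""
    return all(TARGET.fullmatch(juliet_plays(n)) is not None
               for n in range(max_n + 1))


print(juliet_plays(2))
print(strategy_wins(20))
print(TARGET.fullmatch("ab") is None)  # the losing start word is not in Target
```

Whatever n Romeo chooses, the resulting word is abab(ab)ⁿ, which is why the strategy wins for every answer and not only for the bounded range tested here.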
Fun rewriting game
The problem is undecidable in general
Interesting decidable subcases
– Muscholl, Schwentick, Segoufin
– Juliet has to traverse the string from left to right
– No recursion among function calls
– Function calls are “linear”
Also in practice, very efficient casting based on unambiguous grammars
Conclusion
Some works around AXML
The AXML system – open-source (on server, on smartphone)
The useful: Replication and query optimization
– How to evaluate a query efficiently by taking advantage of replication
The useful: Lazy query evaluation
– How to evaluate a query without calling all embedded services
The fun: Casting problem
– Which functions to call to “match” a target type
– Active context-free games
The exotic
– Diagnosis of communication systems based on datalog optimization
– Access control
– Distributed design
– Probabilistic generation of documents
We will come back to distribution
Lesson 6: datalog, recursion is essential
Lesson 7: distributed data management in general
Lesson 8: distributed knowledge bases
Acknowledgements
With many colleagues, in particular:
– Tova Milo (Tel Aviv)
– Victor Vianu (UCSD)
– Luc Segoufin (INRIA)
– Ioana Manolescu (INRIA)
– Georg Gottlob (Oxford)
– Alkis Polyzotis (UCSC)
– Angela Bonifati (Lille)
– Marie-Christine Rousset (Grenoble)
– Balder ten Cate (UCSC)
– Yannis Katsis (UCSD)
And PhD students
– Omar Benjelloun (Google)
– Bogdan Marinoiu (SAP)
– Pierre Bourhis (INRIA)
– Alban Galland (INRIA)
– Marco Manna (Calabria)
– Nicoleta Preda (Versailles)
– Zoe Abrams (Google)
– Emmanuel Taropa (Google)
– Bogdan Cautis (Telecom)
– Spyros Zoupanos (Max-Planck-Institut)
And others
Thank you!
Static Analysis and Verification
Victor Vianu, U.C. San Diego
PhD from USC, 1983
Sabbaticals: INRIA, ENS Cachan, Ulm, Telecom
Interests: database theory, computational logic, Web data
Co-author of Foundations of Databases
– Aka the Alice book
Vianu has served as
– General Chair of SIGMOD, PODS
– Program Chair of PODS, ICDT
Editor-in-Chief of the J. ACM
ACM Fellow