Self Maintenance of materialized XML views with non-cooperative data sources
DBDBD – 2006
Virginie Sans, ETIS/CNRS Laboratory, MIDI Team

Summary
1) Issue and context
  1) Pre-requisite
  2) The issue
  3) Context
  4) State of the art
2) Contributions
  1) View computation with the XAlgebra
  2) Detection and identification of source updates
  3) View maintenance
  4) Applications and performances
Conclusion

Mediation architecture
• Introduced by Wiederhold
• The architecture: mediator, wrappers, sources, query language
1.1 Pre-requisite

Mediation architecture
• Mediator
  - Handles the user request: canonization, atomization
  - Sends each atomic request to a source via its wrapper
• Wrappers
  - Translate a query coming from the mediator into a query in the native language of the web source
  - Return the answer to the mediator in XML
• Data sources
  - Heterogeneous
  - Distributed
  - In a web context: partially unavailable
[Figure: mediator, wrapper, and SQL source. The mediator sends an atomic request to the wrapper, which sends SQL to the source; tuples come back to the wrapper, which returns XML to the mediator.]

Views
• What about views?
  - Data integration
  - Access control, security
  - Data warehouses
• Why?
  - Interoperability
  - Heterogeneous data
• Materializing views
  - Fast access to complex queries
  - Better availability
  - Request optimization
[Figure: materialized views stored at the mediator, above three wrappers over RDB, SQL, and HTML sources.]

Issue: view maintenance
• When data sources are updated, the view must be kept consistent with them
• Maintenance process
  - Recomputation: recompute the whole view from scratch
  - Incremental maintenance: compute the changes to the view in response to changes to the base sources
[Figure: an update turns Source t into Source t+1. View t is obtained from Source t by view computation; View t+1 is obtained either by recomputation from Source t+1 or by incremental maintenance from View t.]
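
The difference between the two strategies can be sketched in Python. This is only an illustration under stated assumptions: the view (titles of books cheaper than a limit), the book ids, the delta format, and the inserted book are all hypothetical and do not come from the talk.

```python
# Hypothetical view: titles of the books cheaper than `limit`.
# A source maps a book id to (title, price); ids and the inserted book are made up.

def compute_view(source, limit=60.0):
    """Recomputation: rebuild the whole view from the full source."""
    return {bid: title for bid, (title, price) in source.items() if price < limit}

def apply_delta(view, delta, limit=60.0):
    """Incremental maintenance: touch only the entries named in the delta."""
    view = dict(view)  # work on a copy
    for op, bid, payload in delta:
        if op == "delete":
            view.pop(bid, None)
        else:  # "insert" or "replace"
            title, price = payload
            if price < limit:
                view[bid] = title
            else:
                view.pop(bid, None)
    return view

source_t = {
    1: ("Advanced Programming in the Unix environment", 65.95),
    2: ("Data on the Web", 39.95),
}
view_t = compute_view(source_t)

# Update of the source between t and t+1, expressed as a delta.
delta = [("insert", 3, ("Hypothetical cheap book", 20.0)), ("delete", 2, None)]
source_t1 = {
    1: ("Advanced Programming in the Unix environment", 65.95),
    3: ("Hypothetical cheap book", 20.0),
}

view_t1 = apply_delta(view_t, delta)
print(view_t1 == compute_view(source_t1))
```

Both paths should reach the same view; the incremental path only needs the delta, not the whole source.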
1.2 Issue

Context: semi-structured XML data
• XML views are materialized at the mediator level
• Hierarchical data
• No schema, except the one implied by the query

<bib>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
  </book>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <price>39.95</price>
    <title>Data on the Web</title>
    <title>Données sur le Web</title>
  </book>
</bib>
1.3 Context

Context: XQuery
• XQuery
  - Dedicated to XML data
  - Relational operators (projection, selection, join, union, …)
  - XML operators (tagging, unnesting, aggregation, …)
  - FLWOR syntax (pronounced "flower")

<result> {
  for $b in doc("bib.xml")/bib/book
  let $a := $b/author
  where $b/price/text() < 60
  order by $b/year
  return <cheap_book> { $b/title } </cheap_book>
} </result>

FLWOR syntax
  for $var in forest [, $var in forest]*
  let $var := sub-tree
  where condition
  return result
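
To make the reading of each FLWOR clause concrete, here is a rough Python equivalent of the cheap-book query, using the standard `xml.etree.ElementTree` module. It is a sketch, not how an XQuery engine evaluates the query; the sample document is an assumption, and the `year` values were added for illustration because the slide's sample data does not show them.

```python
import xml.etree.ElementTree as ET

# Sample data: titles and prices follow the slides; the years are assumed.
BIB = """
<bib>
  <book><year>1992</year><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><year>1994</year><title>TCP/IP Illustrated</title></book>
  <book><year>2000</year><price>39.95</price><title>Data on the Web</title></book>
</bib>
"""

def cheap_books(xml_text, limit=60.0):
    root = ET.fromstring(xml_text.strip())
    selected = []
    for b in root.findall("book"):                      # for $b in .../bib/book
        price = b.findtext("price")                     # $b/price/text()
        if price is not None and float(price) < limit:  # where ... < 60
            selected.append(b)
    # order by $b/year
    selected.sort(key=lambda b: int(b.findtext("year", default="0")))
    result = ET.Element("result")                       # return <cheap_book>...</cheap_book>
    for b in selected:
        cheap = ET.SubElement(result, "cheap_book")
        for t in b.findall("title"):
            ET.SubElement(cheap, "title").text = (t.text or "").strip()
    return result

print(ET.tostring(cheap_books(BIB), encoding="unicode"))
```

A book without a `price` element is skipped, which mirrors the way the `where` comparison fails on an empty sequence.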

Context: other specificities
• Views are computed using the XAlgebra (cf. View computation)
• Wrappers have limited resources
  - Few computation capabilities
  - A component named the logger stores the last modification date and a checksum of each source
• Non-cooperative web sources
  - No information about their updates
  - Not always available
  - Not enough granularity

State of the art (1/2)
• Relational views
  - Not suited to semi-structured data
• Abiteboul et al.
  - OEM (Object Exchange Model)
  - LOREL language
  - Some operators are missing
• VOX, Rainbow team
  - Needs to know the exact position in the XML tree where the update occurred
1.4 State of the art

State of the art (2/2)
• Cobena et al.
  - XDiff, an algorithm for comparing XML files
  - Needs a copy of the source at the wrapper level
• Bonnet et al. / Papadimos et al.
  - Parachute queries
  - Mutant query plans
• What happens when sources are really unavailable?
• Our goal:
  - Reduce source accesses to the minimum
  - Use the information stored in the view

View maintenance: the process
• View computation
  - An algebraic approach using the XAlgebra
  - Extension of the XAlgebra (identifiers)
• Update detection
  - Comparison of the source's information with that stored in the logger
• Update identification
  - Recovery process
  - Diff algorithm
• View maintenance
  - Propagation rules for each operator
2.1 View computation

View computation
Steps:
[Figure: steps of the view computation]

The XAlgebra data model
• Data structures: XRelation, XTuple, XAttributes
• Operators: XSource, XConstruct, XUnion, …

XSource operator: step 1
• XQuery analysis

  for $f in doc("informations.xml")/personnes/personne
  let $a := $f/nom
  where $f/age < 27 and $a = "Durand"
  return (<nom>{$a}</nom>, <prenom>{$f/prenom}</prenom>)

• We obtain:
  - A context
  - A set of patterns
• Path extraction: paths are optional, mandatory, or hidden

XSource operator: steps 2 and 3
• From XML sub-trees to the tabular structure
  - 1 sub-tree => 1 XTuple
  - XRelation = set of XTuples
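
The mapping from sub-trees to a tabular structure can be sketched as follows. This is an assumed, simplified reading of XSource: the function name `xsource`, the dictionary representation of an XTuple, and the sample ages are hypothetical; only the element names come from the query of step 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document; the ages are made up for illustration.
SOURCE = """
<personnes>
  <personne><nom>Durand</nom><prenom>Marcel</prenom><age>25</age></personne>
  <personne><nom>Durand</nom><prenom>Eric</prenom><age>26</age></personne>
  <personne><nom>Leclerc</nom><prenom>John</prenom></personne>
</personnes>
"""

def xsource(xml_text, context, patterns):
    """Build an XRelation: one XTuple per sub-tree matched by the context."""
    root = ET.fromstring(xml_text.strip())
    xrelation = []
    for subtree in root.findall(context):
        # Each pattern becomes an attribute holding the list of matched values;
        # an optional path that is absent simply yields an empty list.
        xtuple = {p: [e.text for e in subtree.findall(p)] for p in patterns}
        xrelation.append(xtuple)
    return xrelation

xr = xsource(SOURCE, "personne", ["nom", "prenom", "age"])
for xtuple in xr:
    print(xtuple)
```

Keeping a list per attribute, rather than a single value, leaves room for repeated elements such as the two `title` elements of the bibliography example.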

XSource operator: extending the algebra
• Adding identifiers: XTids
• An XTid is a set of pairs:
  {(idsource, idfragment), …}

View computation: XOperator
XProject

View computation: XOperator
XJoin
• XTid propagation: card(XTID) > 1 for some nodes
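
A small Python sketch of how XTids might be carried through a join. The representation (`XValue` as a value plus a frozen set of `(idsource, idfragment)` pairs), the function `xjoin`, and the `ville` attribute are assumptions made for illustration; the slides only state that an XTid is a set of pairs and that a join can yield nodes with more than one pair.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Tuple

XTid = FrozenSet[Tuple[int, int]]  # set of (idsource, idfragment) pairs

class XValue(NamedTuple):
    value: str
    xtid: XTid

XTuple = Dict[str, XValue]

def xjoin(xr1: List[XTuple], xr2: List[XTuple], key: str) -> List[XTuple]:
    """Equi-join on `key`; the joined key value remembers both origins."""
    out = []
    for t1 in xr1:
        for t2 in xr2:
            if t1[key].value == t2[key].value:
                joined = {**t1, **t2}
                # The join value comes from both sources: union of the XTids.
                joined[key] = XValue(t1[key].value, t1[key].xtid | t2[key].xtid)
                out.append(joined)
    return out

xr1 = [{"nom": XValue("Durand", frozenset({(1, 11)})),
        "prenom": XValue("Marcel", frozenset({(1, 11)}))}]
xr2 = [{"nom": XValue("Durand", frozenset({(2, 4)})),
        "ville": XValue("Cergy", frozenset({(2, 4)}))}]  # hypothetical second source

view = xjoin(xr1, xr2, "nom")
print(sorted(view[0]["nom"].xtid))
```

Only the join attribute ends up with two pairs here; the other values keep the single pair of the fragment they came from.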

Update detection and identification
• Detection
  - Comparison of the source's information with that stored in the logger:
    the last modification date and the checksum of the source
• Identification
  - Partial recovery of the source information based on XTids
  - Comparison of the recovered XRelation with the updated source
  - Δ computation
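
The detection step can be illustrated with a few lines of Python. This is a sketch under assumptions: the talk does not say which checksum the logger uses (SHA-256 is chosen here arbitrarily), and the record layout, function names, and timestamps are hypothetical.

```python
import hashlib

def checksum(content: bytes) -> str:
    # Assumption: SHA-256; the slides do not name the checksum algorithm.
    return hashlib.sha256(content).hexdigest()

def log_entry(last_modified: str, content: bytes) -> dict:
    """What the logger keeps for one source."""
    return {"last_modified": last_modified, "checksum": checksum(content)}

def source_updated(logged: dict, last_modified: str, content: bytes) -> bool:
    """An update is suspected if either logged piece of information differs."""
    return (last_modified != logged["last_modified"]
            or checksum(content) != logged["checksum"])

old = b"<bib><book><title>Data on the Web</title></book></bib>"
new = b"<bib><book><title>TCP/IP Illustrated</title></book></bib>"
logged = log_entry("2006-01-10T08:00:00", old)  # hypothetical timestamp

print(source_updated(logged, "2006-01-10T08:00:00", old))
print(source_updated(logged, "2006-01-10T08:00:00", new))
```

Checking both values is a design choice of this sketch: a non-cooperative source may not report a reliable modification date, so the checksum acts as a second signal.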
2.2 Update detection and identification

XRecover
Step 1: project XRv on the XR1 patterns

XRecover
Step 2: filtering XTuple values

XRecover
Step 3: re-ordering XTuples
XTidUnnest
• XTuples are unnested according to their XTids

XRecover
Step 3: re-ordering XTuples
XTidNest
• XTuples are nested by their XTids
• XTuples are re-ordered
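
One possible reading of this third step, sketched in Python. The slides do not define XTidUnnest and XTidNest precisely, so everything below is an assumption: values recovered from the view are flattened to one row per `(idsource, idfragment)` pair, then regrouped per pair, and the groups are sorted by fragment identifier to restore the source order. The attribute names and sample values are hypothetical.

```python
from collections import defaultdict

def xtid_unnest(xvalues, source_id):
    """One row per pair of the XTid that belongs to the given source."""
    rows = []
    for attr, value, xtid in xvalues:
        for sid, fid in xtid:
            if sid == source_id:
                rows.append(((sid, fid), attr, value))
    return rows

def xtid_nest(rows):
    """Group rows by pair, then re-order the rebuilt XTuples by pair."""
    groups = defaultdict(dict)
    for pair, attr, value in rows:
        groups[pair].setdefault(attr, []).append(value)
    return [(pair, groups[pair]) for pair in sorted(groups)]

# Values as they might sit in the view after projection and filtering.
view_values = [
    ("prenom", "Alain", {(1, 3), (1, 4)}),
    ("nom", "Leclerc", {(1, 3)}),
    ("nom", "Durand", {(1, 4)}),
]

recovered = xtid_nest(xtid_unnest(view_values, source_id=1))
for pair, xtuple in recovered:
    print(pair, xtuple)
```

Under this reading, a value shared by two fragments (here `Alain`) reappears in both recovered XTuples.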

Update identification: comparison algorithm
• Comparison of XR1(t+1) with XR'(t)
  - XR1(t+1) is the XRelation obtained by applying XSource to source 1 at t+1
  - XR'(t) is the partial recovery of the XRelation of source 1 at t
• Remark: XR1(t+1) can also be filtered using predicates before comparison
• The Diff algorithm is based on Unix diff (Hunt & McIlroy); the symbol compared is the XTuple instead of the line

Update identification: Diff algorithm
• Delta made of hunks:
  - insert(pos; XTuple)
  - delete(pos; XTuple)
  - replace(pos; XTuple_old, XTuple_new)
• Examples:
  insert(2, { {Leclerc, Avide, {(1,3)}}, {John, Avide, {(1,3)}} })
  delete(4, { {Durand, Avide, {(1,11)}}, {Marcel, Avide, {(1,11)}}, {Eric, Avide, {(1,11)}} })
  etc.
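
The idea of diffing sequences of XTuples instead of lines can be tried out with Python's standard `difflib`. Note the substitution: `difflib.SequenceMatcher` is not the Hunt & McIlroy algorithm named in the talk, it merely produces comparable insert/delete/replace hunks. The tuple representation, the 0-based positions, and the name `Martin Paul` are assumptions of this sketch.

```python
from difflib import SequenceMatcher

def xdiff(old, new):
    """Compare two lists of hashable XTuples and return a list of hunks.

    Positions are 0-based indexes into the old list.
    """
    hunks = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            hunks.append(("insert", i1, new[j1:j2]))
        elif tag == "delete":
            hunks.append(("delete", i1, old[i1:i2]))
        elif tag == "replace":
            hunks.append(("replace", i1, old[i1:i2], new[j1:j2]))
        # "equal" runs produce no hunk
    return hunks

# Recovered relation at t and relation read from the source at t+1.
recovered_t = [("Durand", "Marcel"), ("Durand", "Eric"), ("Martin", "Paul")]
source_t1 = [("Durand", "Marcel"), ("Leclerc", "John"), ("Durand", "Eric")]

for hunk in xdiff(recovered_t, source_t1):
    print(hunk)
```

Each XTuple is treated as one indivisible symbol, so a change to a single value inside an XTuple shows up as a replace hunk for the whole XTuple.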

Maintenance rules: from delta to view maintenance
Case of a deletion: delete(pos, XTuple)
• A source XTuple is associated with an XTid {(x)} such that card = 1. Each XValue of the view carries a set of XTids, noted XTID.
1) From each XValue, delete the pair x from its XTID whenever x ∈ XTID
   Example: the XTuple whose XTid is x = (1,3) has been deleted. The XValue {Alain}(1,3);(1,4) becomes the XValue {Alain}(1,4).
2) Delete each XValue such that card(XTID) = 0
   Example: if the XValue {Alain}(1,3) becomes the XValue {Alain} with no XTid left, the whole XValue is deleted.
3) If the XValue was involved in the predicate, delete the XTuple
   - Join and restriction case
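
The three deletion rules can be sketched in Python as follows. The data layout (a view as a list of XTuples, each mapping an attribute to a list of `(value, XTID)` pairs) and the `predicate_attrs` parameter are assumptions of this sketch, as is the reading of rule 3 as "the XValue that disappears belongs to an attribute used by the predicate".

```python
def propagate_delete(view, x, predicate_attrs=()):
    """Propagate the deletion of the source XTuple identified by pair `x`.

    Assumes every XValue starts with a non-empty XTID.
    """
    new_view = []
    for xtuple in view:
        kept, drop_xtuple = {}, False
        for attr, xvalues in xtuple.items():
            survivors = []
            for value, xtid in xvalues:
                remaining = xtid - {x}            # rule 1: remove x from the XTID
                if remaining:
                    survivors.append((value, remaining))
                elif attr in predicate_attrs:     # rule 3: predicate value vanished
                    drop_xtuple = True
                # rule 2: otherwise the XValue with an empty XTID is dropped
            kept[attr] = survivors
        if not drop_xtuple:
            new_view.append(kept)
    return new_view

view = [{
    "prenom": [("Alain", frozenset({(1, 3), (1, 4)}))],
    "nom": [("Leclerc", frozenset({(1, 3)}))],  # hypothetical attribute and value
}]

print(propagate_delete(view, (1, 3)))
print(propagate_delete(view, (1, 3), predicate_attrs=("nom",)))
```

In the first call `Alain` keeps the pair (1,4), matching the slide's example, while the `nom` value disappears; in the second call the whole XTuple goes because `nom` is treated as a predicate attribute.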
2.3 View maintenance

Maintenance rules: from delta to view maintenance
Case of an insertion: insert(pos; XTuple)
1) A new XTid is created
   Goal: preserve the order of XTuples for a later recovery
2) Depending on the operator, we obtain different maintenance instructions
   - Projection: insert the projection of the XTuple
   - Select: if the XTuple satisfies the predicate, insert it
   - Join XR1 * XR2: compute XT = XTuple * XR2; if XT ≠ ∅, insert XT
     Depending on the predicate, we can query either XR2 or its recovery
   - Union and Intersect: duplicates are preserved; Union is handled like a Select whose predicate is always true, Intersect like a Join
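
The per-operator insertion rules can be illustrated with a short Python sketch. XTuples are plain dictionaries here, XTids and positions are left out for brevity, and the function names, the `ville` attribute, and the sample values are hypothetical.

```python
def insert_project(view, xtuple, attrs):
    """Projection: insert the projection of the new XTuple."""
    return view + [{a: xtuple[a] for a in attrs if a in xtuple}]

def insert_select(view, xtuple, predicate):
    """Select: insert the XTuple only if it satisfies the predicate."""
    if predicate(xtuple):
        return view + [xtuple]
    return view

def insert_join(view, xtuple, xr2, key):
    """Join: compute XT = xtuple * XR2 and insert it if it is not empty.

    `xr2` may be the second relation itself or its recovery from the view.
    """
    xt = [{**xtuple, **t2} for t2 in xr2 if t2.get(key) == xtuple.get(key)]
    return view + xt

def insert_union(view, xtuple):
    """Union: a Select whose predicate is always true (duplicates are kept)."""
    return insert_select(view, xtuple, lambda _t: True)

x = {"nom": "Leclerc", "prenom": "John", "age": 25}  # age is made up
xr2 = [{"nom": "Leclerc", "ville": "Cergy"}]         # hypothetical second relation

print(insert_project([], x, ["nom", "prenom"]))
print(insert_select([], x, lambda t: t["age"] < 27))
print(insert_join([], x, xr2, "nom"))
```

Passing the recovery of the second relation instead of the relation itself is what lets the join rule avoid a source access when the view holds enough information.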

Maintenance rules: from delta to view maintenance
Case of a modification: replace(pos; XTuple_old, XTuple_new)
• XTuple modification = XValue modification, OR XValue deletion followed by insertion
• Project and Union: modification of the XValues concerned
• Select and Intersect: if the modification applies to an XValue that must satisfy the condition, deletion of the XTuple; else modification of the XValues
  - Intersect is handled like Select
• Join: deletion followed by insertion

Maintenance rules: missing information
• Missing information (join?)
  - Source recovery
  - Multi-view strategy
  - Source request
• Goal: limited access to the sources
• Example: View = S1 * S2
  - XTuple x is inserted in S1
  - Computation of S2'
  - Insertion: x * S2'
[Figure: materialized views at the mediator, above wrappers over SQL and HTML sources.]

Applications
• On the web
  - When the necessary sources are unavailable
  - Goal: limited access to them
• With sensors (ANR project)
  - With wireless sensors
  - Goal: preserve power resources
2.4 Applications and performances

Performances
• Comparison between XRecover and Recomputation

Contributions
• Maintenance process in the context of non-cooperative web sources
• Contribution to the XAlgebra
  - New operators: XRecover, XTidUnnest, XTidNest
  - New data structure: XTids
• Future work
  - Order-sensitive view maintenance
  - A better Diff algorithm
Conclusion

Thanks for your attention!
Any questions?