putting the spirit of the web back into semantic web … · 2010-11-22 · motivation vision:...
PUTTING THE SPIRIT OF THE WEB BACK INTO SEMANTIC WEB QUERYING
Cosmin Basca, Abraham Bernstein
Motivation
Vision: towards a globally query-able and truly open Semantic Web
We want to:
- Query the Web of Data (WoD) on demand
- Provide up-to-date results (within the query execution interval, typically seconds)
- Impose no or limited restrictions on data publishers
- Be flexible regarding participating triple stores
- Preserve the “openness” of the WoD
Openness
By “openness” we mean:
- Assume that servers are:
  - Independent (unaware of other servers)
  - Heterogeneous
- Assume no control and limited knowledge over their distribution & availability
- Data publishing: not having to adhere to fixed guidelines
Motivating example
Consider sites holding LOD Linked Movie and DBpedia data. Find out which movies, and related information, were produced by “Producers Circle” studios:

    SELECT ?title ?photoCollection ?name WHERE {
      ?film dc:title ?title ;
            movie:actor ?actor ;
            owl:sameAs ?sameFilm .   # link to other datasets
      ?actor a foaf:Person ;
             movie:actor_name ?name .
      ?sameFilm dbpedia:hasPhotoCollection ?photoCollection .
      ?sameFilm dbpedia:studio "Producers Circle" .
    }
Problem
- Key space
  - A given in the Semantic Web, via URIs
  - Tradeoff between globalism and performance (address space vs. size in bytes)
- Joining datasets
- Currently no system / algorithm achieves the goal entirely
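Problem
[Figure: existing systems plotted by restrictiveness (low to high) against intended addressing space (local, clustered, cloud, global). Approach labels: fixed id partitioning, triple level federation, instance level federation, URI. Systems shown: Sesame, AllegroGraph, 4Store, YARS2, SemWiq, DARQ, RDF Peers, Hartig et al. Also marked: “Goal”, “?”, and an arrow labeled “Closer to Goal”.]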
Avalanche
[Figure: an Avalanche SPARQL endpoint and an Endpoints Directory or Search Engine, with steps numbered 1, 2 and 3.]
Challenges and Implications
- Web of Data is growing: LOD ~25B triples (Sept 2010)
- Lack of (high) quality statistics (join estimations)
- Physical constraints
  - Bandwidth, latency, unavailability, many sites
- Completeness not considered
  - First K results
- Exponential search space due to flexibility
  - Efficient heuristics to search
Architecture
[Figure: the AVALANCHE Mediator Execution Pipeline, shown with AVALANCHE endpoints and a Web Directory or Search Engine. Phases: query preprocessing, query execution. Components: Query Parser, Statistics Requester, Plan Generator, Plans Queue, Executors, Finished Plans Queue, Materializers, Results Queue, Query Stopper. Input: Query. Output: Results.]
Planning
- Greedy multipath search inspired by Best First Search
- Total space is O(n^3)!, but size increases by M * H with each exploratory step (H = number of sites, M = number of paths)
- In practice the space is tractable: most queries are not fully connected graphs!
- Can be further reduced
  - Windowed approach
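
The search strategy named on this slide can be illustrated with a generic best-first search that optionally keeps only a window of the best open states. This is a minimal sketch, not Avalanche's planner: the state representation, the `SCORES` table, and the `expand`, `utility` and `is_complete` helpers are hypothetical placeholders, since the slides do not specify them.

```python
import heapq
from itertools import count

def best_first_plans(start, expand, utility, is_complete, window=None):
    """Yield complete plans, always expanding the highest-utility state first."""
    tie = count()  # tie-breaker so states themselves are never compared
    frontier = [(-utility(start), next(tie), start)]
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if is_complete(state):
            yield state
            continue
        for nxt in expand(state):
            heapq.heappush(frontier, (-utility(nxt), next(tie), nxt))
        if window is not None and len(frontier) > window:
            # Windowed variant: keep only the `window` best open states.
            # A sorted list is a valid heap, so no re-heapify is needed.
            frontier = heapq.nsmallest(window, frontier)

# Toy data (hypothetical): a score for running each subquery on each host.
SCORES = {"q1": {"A": 5, "B": 1}, "q2": {"A": 2, "C": 7}}
ORDER = ["q1", "q2"]

def expand(state):
    """Assign the next subquery to every host that can answer it."""
    sub = ORDER[len(state)]
    return [state + ((sub, host),) for host in SCORES[sub]]

def utility(state):
    return sum(SCORES[sub][host] for sub, host in state)

def is_complete(state):
    return len(state) == len(ORDER)

plans = list(best_first_plans((), expand, utility, is_complete))
print(plans[0])
```

With `window=1` the same call explores a single path and returns only the greedy plan, which trades completeness for a smaller search space.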
Planning
- 7 triple patterns and 6 unbound variables
  - If the graph is undirected and fully connected: $n(n-1)2^{n-3} = 240$ possible paths for $n = 6$
  - In practice we have a sparse directed graph: 11 paths
- Search step: each path is assigned to all servers involved, i.e. for 100 hosts: 1100 states
- Join (average) paths to form the full query graph
  - 4 average joins to the full graph: 4400 plans (ordered)

Example query:

    SELECT ?title ?photoCollection ?name WHERE {
      ?film dc:title ?title ;
            movie:actor ?actor ;
            owl:sameAs ?sameFilm .   # link to other datasets
      ?actor a foaf:Person ;
             movie:actor_name ?name .
      ?sameFilm dbpedia:hasPhotoCollection ?photoCollection .
      ?sameFilm dbpedia:studio "Producers Circle" .
    }
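
The numbers on this slide can be checked with a few lines of arithmetic. The sketch below only reproduces the slide's own figures; the variable names are illustrative.

```python
def paths_fully_connected(n):
    """Paths in an undirected, fully connected graph over n variables: n(n-1)2^(n-3)."""
    return n * (n - 1) * 2 ** (n - 3)

n_variables = 6      # unbound variables in the example query
sparse_paths = 11    # paths in the actual sparse directed query graph
hosts = 100          # servers each path is assigned to
average_joins = 4    # average joins needed to cover the full query graph

upper_bound = paths_fully_connected(n_variables)
states = sparse_paths * hosts
plans = states * average_joins
print(upper_bound, states, plans)  # 240 1100 4400
```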
Planning Heuristics
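Default:

$$U = \begin{cases} \dfrac{Edges(N_1)}{CNT_{N_1}} & \text{first node} \\ \min(CNT_{N_1}, CNT_{N_2}) & \text{otherwise} \end{cases}$$

$$C = \begin{cases} 1 & \text{first node} \\ \left(L + \dfrac{CNT_{N_2}}{B} + \dfrac{CNT_{N_1} + CNT_{N_2}}{CNT_{N_1}}\right) \dfrac{Edges(Query)}{Edges(N_2)} & \text{otherwise} \end{cases}$$

- L = latency, B = bandwidth
- C combines the cost to execute the remote subquery, the cost to execute the local subquery, and a scaling factor (aids convergence)

Extended:

$$EU = \begin{cases} w_1 \cdot JOIN_{N_1,N_2} + w_2 \cdot U_{N_1,N_2} & N_1, N_2 \text{ selective} \\ w_2 \cdot U_{N_1,N_2} & \text{otherwise} \end{cases}$$

$$JOIN_{N_1,N_2} \approx -\frac{1}{k} \cdot \frac{\ln\left(m \cdot \dfrac{Z_1 + Z_2 - Z_{12}}{Z_1 \cdot Z_2}\right)}{\ln\left(1 - \dfrac{1}{m}\right)}$$

- Bloom filters (expensive: only for selective queries)
- $Z_i$ = number of 0 bits in Bloom filter i
- k = number of Bloom hash functions
- m = size in bits of the Bloom filter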
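
The Bloom filter join estimate can be tried out on small sets. The sketch below is an illustration under stated assumptions, not Avalanche's code: the slide does not define $Z_{12}$, so it is taken here to be the number of 0 bits in the bitwise AND of the two filters, which is the reading under which the formula recovers the overlap size. The helper names `bloom_bits` and `estimate_join` and the SHA-256 based hashing are hypothetical choices.

```python
import hashlib
import math

def bloom_bits(items, m, k):
    """Build a Bloom filter as a list of m bits, using k salted hash functions."""
    bits = [0] * m
    for item in items:
        for i in range(k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            bits[int.from_bytes(digest[:8], "big") % m] = 1
    return bits

def estimate_join(bits1, bits2, k):
    """Estimate how many items two equally sized Bloom filters have in common."""
    m = len(bits1)
    z1 = bits1.count(0)
    z2 = bits2.count(0)
    # Assumption: Z12 counts the 0 bits of the bitwise AND of both filters.
    z12 = sum(1 for a, b in zip(bits1, bits2) if (a & b) == 0)
    if z1 == 0 or z2 == 0:
        raise ValueError("a saturated filter carries no information")
    ratio = m * (z1 + z2 - z12) / (z1 * z2)
    return -math.log(ratio) / (k * math.log(1 - 1 / m))

# Two sets of 1000 made-up URIs sharing exactly 300 members.
left = [f"uri:{i}" for i in range(1000)]
right = [f"uri:{i}" for i in range(700, 1700)]
m, k = 32768, 4
estimate = estimate_join(bloom_bits(left, m, k), bloom_bits(right, m, k), k)
print(round(estimate))
```

The estimate is approximate: it should land near the true overlap of 300, with an error that shrinks as the filters get larger relative to the sets.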
Execution
[Figure: subqueries q1, q2 and q3, linked by the join variables ?sameFilm and ?actor.]

Subqueries:

    ?sameFilm dbpedia:hasPhotoCollection ?photoCollection . ?sameFilm dbpedia:studio "Producers Circle" .
    ?actor a foaf:Person . ?actor movie:actor_name ?name .
    ?film dc:title ?title . ?film movie:actor ?actor . ?film owl:sameAs ?sameFilm .

Steps:
1) Join(q1,q2)
2) R1=Execute(q1)
3) Send(R1)
4) FR2=ExecuteFilter(R1)
5) Join(q2,q3)
6) Send(FR2)
7) FR3=ExecuteFilter(FR2)
8) Update(q3,q2)
9) R3=FR3
10) Send(R3)
11) Update(q2,q1)
12) R2=Filter(FR2, FR3)
13) Send(R2)
14) R1=Filter(R1, R2)
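
One way to read the numbered steps is as a forward pass of filtered executions followed by a backward pass that prunes earlier results. The sketch below simulates that reading in memory and makes several assumptions: `ExecuteFilter` and `Filter` are interpreted as semi-joins on the shared variable, `Send` is simulated by passing lists between function calls, the `Join` and `Update` bookkeeping steps are omitted, and q1, q2, q3 are assigned to the ?sameFilm, ?film and ?actor subqueries respectively. All sample bindings are made up.

```python
def semi_join(rows, other, var):
    """Keep the rows whose binding for `var` also occurs in `other`."""
    seen = {row[var] for row in other}
    return [row for row in rows if row[var] in seen]

def run_plan(q1_rows, q2_rows, q3_rows, var12, var23):
    # Forward pass: each site filters its local bindings with what it receives.
    r1 = q1_rows                           # 2) R1 = Execute(q1)
    fr2 = semi_join(q2_rows, r1, var12)    # 3-4) Send(R1); FR2 = ExecuteFilter(R1)
    fr3 = semi_join(q3_rows, fr2, var23)   # 6-7) Send(FR2); FR3 = ExecuteFilter(FR2)
    r3 = fr3                               # 9) R3 = FR3
    # Backward pass: prune earlier results with the final bindings.
    r2 = semi_join(fr2, r3, var23)         # 10, 12) Send(R3); R2 = Filter(FR2, FR3)
    r1 = semi_join(r1, r2, var12)          # 13-14) Send(R2); R1 = Filter(R1, R2)
    return r1, r2, r3

# Hypothetical bindings held by three different sites.
q1_rows = [{"sameFilm": "db:F1"}, {"sameFilm": "db:F2"}]
q2_rows = [
    {"film": "m:1", "sameFilm": "db:F1", "actor": "m:a1"},
    {"film": "m:2", "sameFilm": "db:F9", "actor": "m:a2"},
    {"film": "m:3", "sameFilm": "db:F2", "actor": "m:a3"},
]
q3_rows = [{"actor": "m:a1", "name": "Ann"}, {"actor": "m:a2", "name": "Bob"}]

r1, r2, r3 = run_plan(q1_rows, q2_rows, q3_rows, "sameFilm", "actor")
print(r1, r2, r3)
```

After both passes every site holds only bindings that take part in a complete answer, so only those need to be materialized.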
Materializing and Stopping
- Materialization: same as execution, but requests the string representation from the endpoints that completed the plan
- Stopping:
  - timeout
  - relative saturation: new results received over a sliding window
  - first K results
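
The three stopping conditions can be combined in a small helper. This is a sketch under assumptions: the slide does not define relative saturation precisely, so it is modeled here as the share of new results contributed by the last `window` finished plans falling below a threshold. The class name, parameters and threshold are hypothetical.

```python
from collections import deque

class QueryStopper:
    """Decide when to stop a query: timeout, first K results, or saturation."""

    def __init__(self, timeout_s, first_k, window, min_new_ratio):
        self.timeout_s = timeout_s          # hard time limit in seconds
        self.first_k = first_k              # stop once K results were found
        self.min_new_ratio = min_new_ratio  # saturation threshold (assumed)
        self.recent = deque(maxlen=window)  # new-result counts of the last plans
        self.total = 0

    def record(self, new_results):
        """Register how many previously unseen results a finished plan added."""
        self.recent.append(new_results)
        self.total += new_results

    def should_stop(self, elapsed_s):
        """Return the reason to stop, or None to keep executing plans."""
        if elapsed_s >= self.timeout_s:
            return "timeout"
        if self.total >= self.first_k:
            return "first K results"
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and self.total > 0:
            if sum(self.recent) / self.total < self.min_new_ratio:
                return "relative saturation"
        return None

stopper = QueryStopper(timeout_s=10, first_k=100, window=3, min_new_ratio=0.05)
for new in (40, 1, 0, 0):
    stopper.record(new)
print(stopper.should_stop(elapsed_s=2.0))  # relative saturation
```

Passing the elapsed time in as an argument keeps the helper deterministic and easy to test; a real mediator would read it from a clock.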
Preliminary Results: 5 sites, 35 million triples
[Figure: execution time in seconds (axis 0 to 13.5) for queries Q1, Q2 and Q3. Series: first results and total results, each for the default and the extended heuristic.]
Preliminary Results: 5 sites, 35 million triples
[Figure: number of unique results (axis 0 to 280) for queries Q1, Q2 and Q3. Series: first results and total results, each for the default and the extended heuristic.]
Preliminary Results: 5 sites, 35 million triples
[Figure: planner convergence. Number of new results (axis 0 to 180) against number of total results (logarithmic axis, 1 to 10000) for Q1, Q2 and Q3, with the saturation of each query marked.]
Conclusions
Avalanche:
- Makes no or limited assumptions about data distribution, partitioning and availability
- Provides up-to-date results, as exposed by the endpoints
- Is flexible, since it requires no knowledge about triple store structure
Demo
See Avalanche live: visit us at the ISWC demo and poster session.
Thank you