How Caching Improves Efficiency and Result Completeness for Querying Linked Data
DESCRIPTION
These are the slides with which I presented my paper at the Linked Data on the Web workshop (LDOW) at WWW 2011, Hyderabad, India.
TRANSCRIPT
How Caching Improves Efficiency and Result Completeness for Querying Linked Data
Olaf Hartig
http://olafhartig.de/foaf.rdf#olaf
Database and Information Systems Research Group
Humboldt-Universität zu Berlin
Olaf Hartig - How Caching Improves Efficiency and Result Completeness for Querying Linked Data 2
SELECT DISTINCT ?i ?label
WHERE {
  ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
        foaf:topic_interest ?i .
  OPTIONAL {
    ?i rdfs:label ?label
    FILTER( LANG(?label)="en" || LANG(?label)="" )
  }
}
ORDER BY ?label

Can we query the Web of Data as if it were a single, giant database?
Our approach: Link Traversal Based Query Execution [ISWC'09]

Main Idea
● Intertwine query evaluation with traversal of data links
● We alternate between:
    ● Evaluate parts of the query (triple patterns) on a continuously augmented set of data
    ● Look up URIs in intermediate solutions and add retrieved data to the query-local dataset

Main Idea (example walk-through)
[Figure, animated over several slides: the query, the query-local dataset, and the intermediate solutions]
Query (triple patterns):
    http://bob.name  knows    ?acq
    ?acq             project  ?prj
    ?prj             name     ?prjName
1. Look up http://bob.name, the URI in the query. The retrieved data (labeled “Descriptor object” on the slide) is added to the query-local dataset.
2. Evaluate the first triple pattern on the query-local dataset: the triple (http://bob.name, knows, http://alice.name) matches, giving the intermediate solution ?acq = http://alice.name.
3. Look up http://alice.name and add the retrieved data to the query-local dataset.
4. Evaluate the second triple pattern: the triple (http://alice.name, project, http://.../AlicesPrj) matches, giving ?acq = http://alice.name, ?prj = http://.../AlicesPrj.
5. Evaluate the third triple pattern: the triple (http://.../AlicesPrj, name, “ … ”) matches, giving ?prj = http://.../AlicesPrj, ?prjName = “ … ”.
6. Resulting solution: ?acq = http://alice.name, ?prj = http://.../AlicesPrj, ?prjName = “ … ”.
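
The alternation between evaluating triple patterns and looking up URIs can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation behind these slides: the Web is replaced by a hypothetical in-memory dictionary `WEB`, triple patterns are evaluated one after another in the given order, and `http://example.org/AlicesPrj` and the project name are invented placeholders for the elided values on the slides.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Toy stand-in for the Web (an assumption): URI -> triples returned by a look-up.
# "http://example.org/AlicesPrj" and "Alice's Project" are invented placeholders.
WEB: Dict[str, List[Triple]] = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "project", "http://example.org/AlicesPrj")],
    "http://example.org/AlicesPrj": [("http://example.org/AlicesPrj", "name", "Alice's Project")],
}

QUERY: List[Triple] = [
    ("http://bob.name", "knows", "?acq"),
    ("?acq", "project", "?prj"),
    ("?prj", "name", "?prjName"),
]


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable must keep the value it is already bound to.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return extended


def look_up(term: str, dataset: Set[Triple], seen: Set[str]) -> None:
    """Dereference a URI once and add the retrieved triples to the dataset."""
    if term.startswith("http://") and term not in seen:
        seen.add(term)
        dataset.update(WEB.get(term, []))


def execute(patterns: List[Triple]) -> List[Binding]:
    dataset: Set[Triple] = set()   # the query-local dataset
    seen: Set[str] = set()
    for pattern in patterns:       # seed: look up the URIs mentioned in the query
        for term in pattern:
            look_up(term, dataset, seen)
    solutions: List[Binding] = [{}]
    for pattern in patterns:
        extended_solutions = []
        for binding in solutions:
            for triple in list(dataset):
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        # Traverse links: look up URIs bound in the new intermediate solutions.
        for binding in extended_solutions:
            for value in binding.values():
                look_up(value, dataset, seen)
        solutions = extended_solutions
    return solutions


RESULT = execute(QUERY)
```

With the toy data above, `RESULT` holds a single solution that binds `?acq` to `http://alice.name`. Data that is never linked from an intermediate solution is never retrieved.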

Characteristics
● Link traversal based query execution:
    ● Evaluation on a continuously augmented dataset
    ● Discovery of potentially relevant data during execution
    ● Discovery driven by intermediate solutions
● Main advantage:
    ● No need to know all data sources in advance
● Limitations:
    ● Query has to contain a URI as a starting point
    ● Ignores data that is not reachable* by the query execution
* formal definition in the paper

The Issue
[Figure, animated over several slides: a second query, its query-local dataset, and the intermediate solutions]
Query (triple patterns):
    http://bob.name  knows     ?acq
    ?acq             interest  ?i
    ?i               label     ?iLabel
● Each query is executed on its own query-local dataset: http://bob.name is looked up again and the triple (http://bob.name, knows, http://alice.name) is retrieved again, although the previous query (http://bob.name knows ?acq, ?acq project ?prj, ?prj name ?prjName) already retrieved it.

Reusing the Query-Local Dataset
[Figure, animated over several slides: both queries use the same query-local dataset]
● The reused query-local dataset already contains the triple (http://bob.name, knows, http://alice.name), which gives the intermediate solution ?acq = http://alice.name for the second query.
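
The effect of reusing the query-local dataset on URI look-ups can be sketched as follows. This is an illustrative model under assumptions, not the system evaluated in these slides: each query is reduced to the hypothetical list of URIs it looks up, `WEB` is a toy dictionary, and `http://example.org/topic` is an invented URI. The hit rate is computed as look-ups answered from the cache divided by all look-up requests.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Toy stand-in for the Web (an assumption); "http://example.org/topic" is invented.
WEB: Dict[str, List[Triple]] = {
    "http://bob.name": [("http://bob.name", "knows", "http://alice.name")],
    "http://alice.name": [("http://alice.name", "interest", "http://example.org/topic")],
}


class QueryLocalDataset:
    """Triples retrieved so far, plus the set of URIs already looked up."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.retrieved: Set[str] = set()

    def look_up(self, uri: str) -> bool:
        """Return True if the look-up was answered from the cache."""
        if uri in self.retrieved:
            return True
        self.retrieved.add(uri)
        self.triples.update(WEB.get(uri, []))
        return False


def hit_rate(lookups: List[str], dataset: QueryLocalDataset) -> float:
    """Look-ups answered from the cache divided by all look-up requests."""
    hits = sum(dataset.look_up(uri) for uri in lookups)
    return hits / len(lookups) if lookups else 0.0


# Hypothetical look-up sequences of the two example queries.
FIRST_QUERY_LOOKUPS = ["http://bob.name", "http://alice.name"]
SECOND_QUERY_LOOKUPS = ["http://bob.name", "http://alice.name", "http://example.org/topic"]

# No reuse: the second query starts with an empty query-local dataset.
no_reuse = hit_rate(SECOND_QUERY_LOOKUPS, QueryLocalDataset())

# Reuse: both queries share one query-local dataset.
shared = QueryLocalDataset()
hit_rate(FIRST_QUERY_LOOKUPS, shared)
with_reuse = hit_rate(SECOND_QUERY_LOOKUPS, shared)
```

In this toy run, two of the three look-ups of the second query are answered from the shared dataset, while none are when the dataset is not reused.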

Hypothesis
Re-using the query-local dataset (a.k.a. data caching) may benefit query performance + result completeness

Contributions
● Systematic analysis of the impact of data caching
    ● Theoretical foundation*
    ● Conceptual analysis*
    ● Empirical evaluation of the potential impact
● Out of scope: Caching strategies (replacement, invalidation)
* see paper

Experiment – Scenario
● Information about the distributed social network of FOAF profiles
    ● 5 types of queries
● Experiment Setup:
    ● 23 persons
    ● Sequential use
    ➔ 115 queries

Experiment – Complete Sequence
● no reuse experiment:
    ● No data caching
● given order experiment:
    ● Reuse of the query-local dataset for the complete sequence of all 115 queries
● Hit rate: $\frac{\text{look-ups answered from cache}}{\text{all look-up requests}}$
[Figure: charts comparing "no reuse" and "given order" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]

Summary
● Contributions:
    ● Theoretical foundation
    ● Conceptual analysis
    ● Empirical evaluation
● Main findings:
    ● Additional results possible (for semantically similar queries)
    ● Impact on performance may be positive but also negative
● Future work:
    ● Analysis of caching strategies in our context
    ● Main issue: invalidation
Backup Slides

Contributions
● Theoretical foundation (extension of the original definition)
    ● Reachability by a $D_{\text{seed}}$-initialized execution of a BGP query $b$
    ● $D_{\text{seed}}$-dependent solution for a BGP query $b$
    ● Reachability $R(B)$ for a serial execution of $B = b_1, \dots, b_n$
    ➔ Each solution for $b_{\text{cur}}$ is also an $R(B)$-dependent solution for $b_{\text{cur}}$
● Conceptual analysis of the impact of data caching
    ● Performance factor: $p(b_{\text{cur}}, B) = c(b_{\text{cur}}, [\,]) - c(b_{\text{cur}}, B)$
    ● Serendipity factor: $s(b_{\text{cur}}, B) = b(b_{\text{cur}}, B) - b(b_{\text{cur}}, [\,])$
● Empirical verification of the potential impact
● Out of scope: Caching strategies (replacement, invalidation)

Query Template Contact
SELECT * WHERE {
  <PERSON> foaf:knows ?p .
  OPTIONAL { ?p foaf:name ?name }
  OPTIONAL { ?p foaf:firstName ?firstName }
  OPTIONAL { ?p foaf:givenName ?givenName }
  OPTIONAL { ?p foaf:givenname ?givenname }
  OPTIONAL { ?p foaf:familyName ?familyName }
  OPTIONAL { ?p foaf:family_name ?family_name }
  OPTIONAL { ?p foaf:lastName ?lastName }
  OPTIONAL { ?p foaf:surname ?surname }
  OPTIONAL { ?p foaf:birthday ?birthday }
  OPTIONAL { ?p foaf:img ?img }
  OPTIONAL { ?p foaf:phone ?phone }
  OPTIONAL { ?p foaf:aimChatID ?aimChatID }
  OPTIONAL { ?p foaf:icqChatID ?icqChatID }
  OPTIONAL { ?p foaf:jabberID ?jabberID }
  OPTIONAL { ?p foaf:msnChatID ?msnChatID }
  OPTIONAL { ?p foaf:skypeID ?skypeID }
  OPTIONAL { ?p foaf:yahooChatID ?yahooChatID }
}

Query Template UnsetProps
SELECT DISTINCT ?result ?resultLabel WHERE {
  ?result rdfs:isDefinedBy <http://xmlns.com/foaf/0.1/> .
  ?result rdfs:domain foaf:Person .
  OPTIONAL { <PERSON> ?result ?var0 }
  FILTER ( !bound(?var0) )
  <PERSON> foaf:knows ?var2 .
  ?var2 ?result ?var3 .
  ?result rdfs:label ?resultLabel .
  ?result vs:term_status ?var1 .
}
ORDER BY ?var1

Query Template Incoming
SELECT DISTINCT ?result WHERE {
  ?result foaf:knows <PERSON> .
  OPTIONAL {
    ?result foaf:knows ?var1 .
    FILTER ( <PERSON> = ?var1 )
    <PERSON> foaf:knows ?result .
  }
  FILTER ( !bound(?var1) )
}

Query Template 2ndDegree1
SELECT DISTINCT ?result WHERE {
  <PERSON> foaf:knows ?p1 .
  <PERSON> foaf:knows ?p2 .
  FILTER ( ?p1 != ?p2 )
  ?p1 foaf:knows ?result .
  FILTER ( <PERSON> != ?result )
  ?p2 foaf:knows ?result .
  OPTIONAL {
    <PERSON> ?knows ?result .
    FILTER ( ?knows = foaf:knows )
  }
  FILTER ( !bound(?knows) )
}

Query Template 2ndDegree2
SELECT DISTINCT ?result WHERE {
  <PERSON> foaf:knows ?p1 .
  <PERSON> foaf:knows ?p2 .
  FILTER ( ?p1 != ?p2 )
  ?result foaf:knows ?p1 .
  FILTER ( <PERSON> != ?result )
  ?result foaf:knows ?p2 .
  OPTIONAL {
    <PERSON> ?knows ?result .
    FILTER ( ?knows = foaf:knows )
  }
  FILTER ( !bound(?knows) )
}
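
The five query templates each contain a `<PERSON>` placeholder, and instantiating them for 23 persons yields the 115 queries of the experiment. A minimal sketch of that instantiation, assuming plain string substitution, is shown below. The person URIs are hypothetical, only the Incoming template is spelled out, and the per-person ordering of the sequence is an assumption. The templates also omit PREFIX declarations, which a SPARQL engine would need.

```python
from typing import List, Tuple

# The "Incoming" template as shown on the slide (PREFIX declarations omitted there).
INCOMING_TEMPLATE = """SELECT DISTINCT ?result WHERE {
  ?result foaf:knows <PERSON> .
  OPTIONAL {
    ?result foaf:knows ?var1 .
    FILTER ( <PERSON> = ?var1 )
    <PERSON> foaf:knows ?result .
  }
  FILTER ( !bound(?var1) )
}"""

# Names of the five templates from the backup slides.
TEMPLATE_NAMES = ["Contact", "UnsetProps", "2ndDegree1", "2ndDegree2", "Incoming"]


def instantiate(template: str, person_uri: str) -> str:
    """Replace every <PERSON> placeholder with the person's URI in angle brackets."""
    return template.replace("<PERSON>", f"<{person_uri}>")


# Hypothetical URIs standing in for the 23 persons of the experiment.
PERSONS: List[str] = [f"http://example.org/person/{i}" for i in range(1, 24)]

# One (person, template) pair per query; grouping by person is an assumption.
SEQUENCE: List[Tuple[str, str]] = [
    (person, name) for person in PERSONS for name in TEMPLATE_NAMES
]

first_incoming_query = instantiate(INCOMING_TEMPLATE, PERSONS[0])
```

Here `len(SEQUENCE)` is 5 × 23 = 115, matching the number of queries stated in the experiment setup.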

Experiment – Single Query
● no reuse experiment:
    ● No data caching
● upper bound experiment:
    ● Reuse of query-local dataset for 3 executions of each query
    ● Third execution measured
● Hit rate: $\frac{\text{look-ups answered from cache}}{\text{all look-up requests}}$
[Figure: charts comparing "no reuse" and "upper bound" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]

Experiment – Single Query
● In the ideal case for $B_{\text{upper}} = [\, b_{\text{cur}}, b_{\text{cur}} \,]$:
    ● $p_{\text{upper}}(b_{\text{cur}}, B_{\text{upper}}) = c(b_{\text{cur}}, [\,]) - c(b_{\text{cur}}, B_{\text{upper}}) = c(b_{\text{cur}}, [\,])$
    ● $s_{\text{upper}}(b_{\text{cur}}, B_{\text{upper}}) = b(b_{\text{cur}}, B_{\text{upper}}) - b(b_{\text{cur}}, [\,]) = 0$

Experiment      Avg.¹ number of query results (std.dev.)   Avg.¹ hit rate (std.dev.)   Avg.¹ query execution time (std.dev.)
no reuse        4.983 (11.658)                             0.576 (0.182)               30.036 s (46.708)
upper bound     5.070 (11.813)                             0.996 (0.017)               1.943 s (11.375)
¹ Averaged over all 115 queries

Experiment – Single Query
● Summary (measurement errors aside):
    ● Same number of query results
    ● Significant improvements in query performance

Experiment – Complete Sequence
● given order experiment:
    ● Reuse of the query-local dataset for the complete sequence of all 115 queries
[Figure: charts comparing "no reuse", "upper bound", and "given order" for queries No. 36 to 40 (ContactInfoPhillipe, UnsetPropsPhillipe, 2ndDegree1Phillipe, 2ndDegree2Phillipe, IncomingPhillipe); panels: hit rate, query execution time (in seconds), number of query results]

Experiment – Complete Sequence
[Figure: charts of hit rate, query execution time (in seconds), and number of query results for queries No. 36 to 40; series: no reuse, upper bound, given order]
$B_{\text{given order}} = [\, q_1, \dots, q_{38} \,]$
$s(q_{39}, B_{\text{given order}}) = b(q_{39}, B_{\text{given order}}) - b(q_{39}, [\,]) = 9 - 1 = 8$

Experiment – Complete Sequence
[Figure: charts of hit rate, query execution time (in seconds), and number of query results for queries No. 36 to 40; series: no reuse, upper bound, given order]
$B_{\text{given order}} = [\, q_1, \dots, q_{38} \,]$
$p'(q_{39}, B_{\text{given order}}) = c'(q_{39}, [\,]) - c'(q_{39}, B_{\text{given order}}) = 31.48\,\text{s} - 68.64\,\text{s} = -37.16\,\text{s}$
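
The two factors for query No. 39 can be recomputed with a small Python sketch. The function and variable names are illustrative assumptions. The input values are the ones stated on the slides: 9 versus 1 query results, and 31.48 s versus 68.64 s execution time.

```python
def performance_factor(cost_without_cache: float, cost_with_cache: float) -> float:
    """p = c(b_cur, []) - c(b_cur, B); positive means caching saved time."""
    return cost_without_cache - cost_with_cache


def serendipity_factor(results_with_cache: int, results_without_cache: int) -> int:
    """s = b(b_cur, B) - b(b_cur, []); positive means additional results."""
    return results_with_cache - results_without_cache


# Measurements for query No. 39 as reported on the slides.
p_q39 = performance_factor(cost_without_cache=31.48, cost_with_cache=68.64)
s_q39 = serendipity_factor(results_with_cache=9, results_without_cache=1)

print(f"serendipity factor: {s_q39}")         # 8 additional results
print(f"performance factor: {p_q39:.2f} s")   # -37.16 s, i.e. slower with the cache
```

The negative performance factor goes together with the positive serendipity factor here: the cached data yields eight additional results for this query, and the execution takes longer.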

Experiment – Complete Sequence
● Summary:
    ● Data cache may provide for additional query results
    ● Impact on performance may be positive but also negative

Experiment      Avg.¹ number of query results (std.dev.)   Avg.¹ hit rate (std.dev.)   Avg.¹ query execution time (std.dev.)
no reuse        4.983 (11.658)                             0.576 (0.182)               30.036 s (46.708)
upper bound     5.070 (11.813)                             0.996 (0.017)               1.943 s (11.375)
given order     6.878 (12.158)                             0.932 (0.139)               39.845 s (145.898)
¹ Averaged over all 115 queries

Experiment – Complete Sequence
● Executing the query sequence in a random order results in measurements similar to the given order.

Experiment      Avg.¹ number of query results (std.dev.)   Avg.¹ hit rate (std.dev.)   Avg.¹ query execution time (std.dev.)
no reuse        4.983 (11.658)                             0.576 (0.182)               30.036 s (46.708)
upper bound     5.070 (11.813)                             0.996 (0.017)               1.943 s (11.375)
given order     6.878 (12.158)                             0.932 (0.139)               39.845 s (145.898)
random orders   6.652 (11.966)                             0.954 (0.036)               36.994 s (118.700)
¹ Averaged over all 115 queries
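
To read the table as differences against the no reuse baseline, the averages can be subtracted directly. The sketch below is only an illustration: the dictionary layout and the function name are assumptions, and the numbers are the averages from the table.

```python
from typing import Dict

# Averages over all 115 queries, copied from the table (standard deviations omitted).
AVERAGES: Dict[str, Dict[str, float]] = {
    "no reuse":      {"results": 4.983, "hit_rate": 0.576, "time_s": 30.036},
    "upper bound":   {"results": 5.070, "hit_rate": 0.996, "time_s": 1.943},
    "given order":   {"results": 6.878, "hit_rate": 0.932, "time_s": 39.845},
    "random orders": {"results": 6.652, "hit_rate": 0.954, "time_s": 36.994},
}


def impact(experiment: str) -> Dict[str, float]:
    """Average change relative to the no reuse baseline."""
    base, row = AVERAGES["no reuse"], AVERAGES[experiment]
    return {
        # More results with caching gives a positive value.
        "additional_results": round(row["results"] - base["results"], 3),
        # Less time with caching gives a positive value.
        "time_saved_s": round(base["time_s"] - row["time_s"], 3),
    }


IMPACT = {name: impact(name) for name in AVERAGES if name != "no reuse"}
for name, values in IMPACT.items():
    print(name, values)
```

For the given order, this gives about 1.9 additional results per query on average, at the price of roughly 9.8 s more execution time. The upper bound experiment saves about 28 s per query while returning almost the same number of results.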

These slides have been created by Olaf Hartig
http://olafhartig.de
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License
(http://creativecommons.org/licenses/by-sa/3.0/)