PAGOdA poster


Upload: dbonto

Post on 09-Jul-2015


Category:

Technology



DESCRIPTION

Abstract: We present an enhanced hybrid approach to OWL query answering that combines an RDF triple store with an OWL reasoner in order to provide scalable pay-as-you-go performance. The enhancements presented here include an extension to deal with arbitrary OWL ontologies and optimisations that significantly improve scalability. We have implemented these techniques in a prototype system, a preliminary evaluation of which has produced very encouraging results.

TRANSCRIPT

Page 1: PAGOdA poster

Pay-as-you-go OWL Query Answering Using a Triple Store

Yujiao Zhou, Yavor Nenov, Bernardo Cuenca Grau and Ian Horrocks

Pay-as-you-go Approach

Intuition
‣ to delegate the bulk of the computational workload to a highly scalable datalog reasoner
‣ to minimise the use of a fully-fledged reasoner

Evaluation
‣ Evaluated on LUBM(100,1000), UOBM(1, 60, 500), FLY, DBPedia+travel and NPD FactPages.

Average time without OWL 2 reasoning

Average time

Acknowledgements
This work was supported by the Royal Society, the EPSRC projects Score!, ExODA, and MaSI³, and the FP7 project OPTIQUE.

[Diagram: PAGOdA workflow. Inputs: ontology, data D and query. Datalog engines (triple store) compute the lower bounds (Lower, ELHO Lower) and the upper bound (Upper), with L = LRL ∪ LEL. If L = U: done. Otherwise: tracking by datalog encoding yields a fragment F; summarisation with the full reasoner (OWL 2 reasoner) rules out non-answers, using σ(cert(q, F)) ⊆ cert(q, σ(F)); dependency analysis (incomplete endomorphisms) arranges calls to the full reasoner according to the dependencies heuristically; output.]

Over-approx to datalog

‣ upper bound U: the answer of q w.r.t. the resulting set of rules U(Σ) and D.

Lower bounds
‣ basic lower bound LRL: the answer of q w.r.t. the datalog fragment of Σ and D;
‣ EL lower bound LEL: the answer of q w.r.t. the ELHO fragment of Σ and D.
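The interplay of the bounds can be illustrated with a minimal sketch. It covers only the basic lower bound, restricts rules to unary atoms, and assumes that the over-approximation turns each disjunction into a conjunction. The predicates `PhD`, `Grad`, `Undergrad` and `Person` and the helpers `saturate` and `answers` are hypothetical and not taken from the poster.

```python
def saturate(rules, facts):
    """Close unary facts (predicate, individual) under rules body(x) -> head(x)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            for pred, ind in list(facts):
                if pred == body and (head, ind) not in facts:
                    facts.add((head, ind))
                    changed = True
    return facts


def answers(facts, query_pred):
    """Answers to the atomic query q(x) <- query_pred(x)."""
    return {ind for pred, ind in facts if pred == query_pred}


# Hypothetical toy ontology: one datalog rule and one disjunctive rule.
datalog_rules = [("PhD", "Grad")]                        # PhD(x) -> Grad(x)
disjunctive_rules = [("Person", ["Undergrad", "Grad"])]  # Person(x) -> Undergrad(x) v Grad(x)

# Assumed over-approximation: treat each disjunction as a conjunction.
upper_rules = datalog_rules + [
    (body, head) for body, heads in disjunctive_rules for head in heads
]

data = {("PhD", "a"), ("Person", "b")}

lower = answers(saturate(datalog_rules, data), "Grad")  # sound, possibly incomplete
upper = answers(saturate(upper_rules, data), "Grad")    # complete, possibly unsound
gap = upper - lower                                     # only these need the full reasoner
```

In this toy case `a` is confirmed by the lower bound, and only `b` remains in the gap between the bounds, so only `b` would be passed on to a fully-fledged reasoner.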

Tracking encoding in datalog
Intuition: to compute all the rules and facts that participate in a proof of q(a) in Σ ∪ D. This goal can be achieved using a datalog encoding.
‣ Example: if r = B1(x1), …, Bm(xm) → H(x) is a rule in U(Σ), then H^t(x), B1(x1), …, Bm(xm) → S(c_r) ∧ B1^t(x1) ∧ … ∧ Bm^t(xm) is added to the tracking rules.
‣ Involved rules: {r | S(c_r) is derived}
‣ Involved facts: {P(a) ∈ D | P^t(a) is derived}
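The tracking encoding can be tried out on a toy rule set. The sketch below is an illustration under stated assumptions, not the authors' implementation: the naive evaluator (`match`, `saturate`), the generator `tracking_rules`, the rule names `r0` to `r2` and all predicates are hypothetical. The conjunctive head is split into one datalog rule per head atom, and the query is atomic, so tracking is seeded directly with `Person^t(a)` instead of encoding the query as a rule.

```python
def match(atom, fact, subst):
    """Extend subst so that atom equals fact, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for term, const in zip(atom[1:], fact[1:]):
        if term.startswith("?"):              # variables are written ?x
            if subst.setdefault(term, const) != const:
                return None
        elif term != const:
            return None
    return subst


def saturate(rules, facts):
    """Naive forward chaining; each rule is (list_of_body_atoms, head_atom)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                new = (head[0],) + tuple(s.get(t, t) for t in head[1:])
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts


def tracking_rules(rules):
    """For r: B1,...,Bm -> H emit H^t,B1,...,Bm -> S(c_r) and one rule per Bi^t."""
    out = []
    for idx, (body, head) in enumerate(rules):
        t_body = [(head[0] + "^t",) + head[1:]] + list(body)
        out.append((t_body, ("S", f"r{idx}")))
        for atom in body:
            out.append((t_body, (atom[0] + "^t",) + atom[1:]))
    return out


rules = [
    ([("PhD", "?x")], ("Grad", "?x")),         # r0
    ([("Grad", "?x")], ("Person", "?x")),      # r1
    ([("Employee", "?x")], ("Person", "?x")),  # r2
]
data = {("PhD", "a"), ("Employee", "b")}

materialised = saturate(rules, data)
seed = {("Person^t", "a")}                     # track the answer a of q(x) <- Person(x)
tracked = saturate(tracking_rules(rules), materialised | seed)

involved_rules = {f[1] for f in tracked if f[0] == "S"}
involved_facts = {(f[0][:-2],) + f[1:] for f in tracked if f[0].endswith("^t")} & data
```

Here only `r0`, `r1` and the fact `PhD(a)` take part in the proof of `Person(a)`; rule `r2` and `Employee(b)` are left out of the fragment.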

Summarisation & dependency between answers
‣ Let σ be the summary function; then σ(cert(q, F)) ⊆ cert(q, σ(F)).
‣ If there is an endomorphism from a to b in F, then a ∈ cert(q, F) implies b ∈ cert(q, F).

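‣ Existential knowledge
[Figure: individuals x1 and x2, each labelled {A, …}, with R-edges to two separate nodes labelled {C}; in a second drawing both R-edges lead to a single node c labelled {C}.]

‣ Disjunctive knowledge
[Figure: node labels {…, A ⊔ B, …} and {…, A ⊓ B, …}.]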
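The endomorphism property stated under "Summarisation & dependency between answers" can be used to save calls to the full reasoner: a positive result for a carries over to b, and a negative result for b carries back to a. The sketch below is a hedged illustration. The function `resolve`, the `implies` map and the `oracle` callback standing in for the full reasoner are hypothetical names, and the order of the candidate list plays the role of the heuristic.

```python
def resolve(candidates, implies, oracle):
    """Decide candidates using as few oracle calls as the given order allows.

    implies[a] is the set of b such that "a is an answer" implies "b is an answer".
    """
    status = {}                                # candidate -> True / False
    calls = 0
    implied_by = {}                            # reverse edges: b -> {a, ...}
    for a, bs in implies.items():
        for b in bs:
            implied_by.setdefault(b, set()).add(a)

    def propagate(start, value, edges):
        stack = [start]
        while stack:
            x = stack.pop()
            for y in edges.get(x, ()):
                if y not in status:
                    status[y] = value
                    stack.append(y)

    for c in candidates:
        if c in status:
            continue                           # already decided by propagation
        calls += 1
        status[c] = oracle(c)
        # Positive results flow forward (a -> b), negative ones flow backward.
        propagate(c, status[c], implies if status[c] else implied_by)
    return {c for c in candidates if status[c]}, calls


true_answers = {"c"}                           # hypothetical ground truth
implies = {"a1": {"b"}, "a2": {"b"}}           # endomorphisms from a1 and a2 to b
found, calls = resolve(["b", "a1", "a2", "c"], implies, lambda x: x in true_answers)
```

Checking `b` first rules out `a1` and `a2` without further calls, so four candidates are decided with two calls in this example.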

           DL      Ontology   Dataset      Queries
LUBM(n)    SHI     93         ~100,000n    14 (std) + 10
UOBM(n)    SHIN    314        ~200,000n    15
FLY        SRI     144,407    6,308        5
DBPedia    SHOIN   1,757      12,119,662   441 (atomic)
NPD        SHIF    819        3,817,079    329 (atomic)

           LUBM(1000)   UOBM(100)   FLY   DBPedia   NPD
Queries    22/24        12/15       5/5   439/441   294/329
Time (s)   18.4         0.7         0.2   0.3       0.1

           LUBM(100)    UOBM(1)     FLY   DBPedia   NPD
Time (s)   29.6         1.8         0.2   3         3

Problem Setting
‣ Ontology Σ: a set of rules of the form φ(x) → ⋁_i ∃y_i ψ(x, y_i)
‣ Data D: a set of ground atoms of the form P(a)
‣ Conjunctive queries: FO formulas of the form q(x) ← ∃y ψ(x, y)
where ψ and φ are conjunctions of atoms.
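As a small worked instance of this setting, consider the following ontology, data and query. All predicate and constant names are hypothetical and chosen only for illustration.

```latex
\begin{align*}
\Sigma &= \{\, \mathit{Student}(x) \rightarrow \mathit{Undergrad}(x) \vee \mathit{Grad}(x),\;
            \mathit{Grad}(x) \rightarrow \exists y\, (\mathit{advisedBy}(x, y) \wedge \mathit{Professor}(y)) \,\} \\
D &= \{\, \mathit{Grad}(a),\; \mathit{Student}(b) \,\} \\
q(x) &\leftarrow \exists y\, \mathit{advisedBy}(x, y)
\end{align*}
```

Under these assumptions a is a certain answer to q, because the second rule forces an advisor for a in every model. b is not a certain answer, since there is a model in which b is an undergraduate without an advisor.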