Universität Innsbruck Leopold Franzens
Technical Fair Copyright 2007 DERI Innsbruck www.deri.at 11th December 2007
Technical Fair: Reasoning
Barry Bishop, DERI Innsbruck – University of Innsbruck
Introduction
• The reasoning group
• Achievements over the last year (or so)
• Recent progress
• Future plans
The Reasoning Group
Barry Bishop: Chair/Software Engineer
Stijn Heymans: Post-doc Researcher
Florian Fischer: Undergraduate
Uwe Keller: PhD Researcher
Graham Hench: Software Engineer
Richard Pöttler: Software Engineer
Nathalie Steinmetz: Graduate Researcher
Holger Lausen: PhD Student
The Reasoning Group meets every two weeks to co-ordinate project tasks and push forward the development of DERI software components.
Established components
• Integrated Rule Inference System (IRIS)
– safe Datalog
– default negation
– built-in predicates
– WSML (XML-Schema) data types
p(X,Z) :- q(X,Y), r(X), s(Z), X != Y    (join; X != Y is a built-in predicate)
path(X,Y) :- path(X,Z), path(Z,Y)    (transitive closure)
diff(X) :- q(X), not r(X)    (set difference of q and r)
Established components
• WSML2Reasoner
– a framework for all WSML language variants
[Diagram: WSML2Reasoner sits between the WSML variants (WSML-Core, WSML-Flight, WSML-Rule, WSML-DL, WSML-Full) and the underlying reasoning engines (IRIS, MINS, KAON2, Pellet, TPTP, "(any)").]
Established components
• RDFSReasoner
– framework for reasoning over RDFS graphs
– executes WSML conjunctive queries
– RDF, RDFS and eRDFS entailment regimes
– uses IRIS as the underlying reasoning engine
Evaluating Datalog
• Evaluating pure Datalog is straightforward, because
– there is no negation, function symbols or built-in predicates
– when starting with a finite set of facts, a finite minimal model is guaranteed
– all programs can be evaluated
• However, IRIS is more sophisticated!
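Naive bottom-up evaluation of pure Datalog can be sketched in a few lines of Python. This is an illustrative sketch only, not the IRIS implementation: the tuple encoding of atoms, the convention that capitalised strings are variables, and the `edge` base relation are assumptions made for the example.

```python
def is_var(term):
    # Assumption: variables are capitalised strings, as on the slides (X, Y, Z).
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, env):
    """Extend env so that atom equals fact, or return None if they clash."""
    (pred, args), (fpred, fargs) = atom, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    env = dict(env)
    for arg, value in zip(args, fargs):
        if is_var(arg):
            if env.setdefault(arg, value) != value:
                return None
        elif arg != value:
            return None
    return env

def evaluate(rules, facts):
    """Apply every rule to all known facts until nothing new is derived."""
    facts = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            envs = [{}]
            for atom in body:
                envs = [e for env in envs for fact in facts
                        if (e := match(atom, fact, env)) is not None]
            for env in envs:
                derived.add((head[0], tuple(env.get(a, a) for a in head[1])))
        if derived <= facts:
            return facts  # fixed point reached: the finite minimal model
        facts |= derived

rules = [
    (("path", ("X", "Y")), [("edge", ("X", "Y"))]),
    (("path", ("X", "Y")), [("path", ("X", "Z")), ("path", ("Z", "Y"))]),
]
facts = {("edge", ("a", "b")), ("edge", ("b", "c")), ("edge", ("c", "d"))}
model = evaluate(rules, facts)
paths = sorted(args for pred, args in model if pred == "path")
```

Because there are no function symbols or built-ins, only the constants already present in the facts can appear in derived tuples, which is why the loop is guaranteed to stop.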
Evaluating Datalog
• IRIS supports more data types
– Datalog has only integer and string
– WSML has all the XML schema types
• Now we have some ambiguity:
p(X,Y) :- q(X,Y), X = 2
p(2,Y) :- q(2,Y)
(what's the difference?)
Evaluating Datalog
• Built-in predicates (e.g. EQUALS, LESS, ADD, etc)
• Test for safe rules requires that every variable that occurs in a built-in also occurs in a non-negated ordinary body predicate (or is equated with one)
e.g.
p(X,Y) :- q(X), X < Y
is not safe.
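The safety test described above can be sketched as follows. The rule encoding (`kind` tags, capitalised variables) is an assumption for illustration, and the extra checks on head variables and negated literals follow the textbook definition of safety rather than anything confirmed about the IRIS code.

```python
def is_var(term):
    # Assumption: variables are capitalised strings; everything else is a constant.
    return isinstance(term, str) and term[:1].isupper()

def is_safe(head, body):
    """head: (pred, args); body: list of (kind, pred, args),
    where kind is 'pos', 'neg' or 'builtin'."""
    # Variables bound by a non-negated ordinary predicate are "limited".
    limited = {a for kind, _, args in body if kind == "pos"
               for a in args if is_var(a)}
    # A variable equated with a constant or a limited variable is limited too.
    changed = True
    while changed:
        changed = False
        for kind, pred, args in body:
            if kind == "builtin" and pred == "=":
                left, right = args
                for x, y in ((left, right), (right, left)):
                    if is_var(x) and x not in limited and (not is_var(y) or y in limited):
                        limited.add(x)
                        changed = True
    # Every variable in the head, in a built-in or in a negated literal must be limited.
    needed = {a for a in head[1] if is_var(a)}
    needed |= {a for kind, _, args in body if kind != "pos"
               for a in args if is_var(a)}
    return needed <= limited

unsafe_rule = (("p", ("X", "Y")), [("pos", "q", ("X",)), ("builtin", "<", ("X", "Y"))])
equated_rule = (("p", ("X", "Y")), [("pos", "q", ("X",)), ("builtin", "=", ("Y", "X"))])
```

Here `is_safe(*unsafe_rule)` is false because Y occurs only in the comparison, while `is_safe(*equated_rule)` is true because Y is equated with the limited variable X.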
Evaluating Datalog
• However, it can be useful to relax this rule, e.g.
p(Z) :- q(X,Y), X+Y=Z
• But beware:
p(0)
p(Y) :- p(X), X+1=Y
does not converge! (More about this later)
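The two rules above behave very differently under bottom-up evaluation, which a small hand-translation into Python sets makes visible. The round limit `max_rounds` is a hypothetical safeguard added for the sketch; it is not a documented IRIS setting.

```python
def sums(q):
    """p(Z) :- q(X,Y), X+Y=Z.
    Z is computed from already-bound X and Y and never feeds back into q,
    so a single pass over q is enough."""
    return {x + y for x, y in q}

def successor_closure(max_rounds):
    """p(0).  p(Y) :- p(X), X+1=Y.
    Every round derives a fact that was not there before, so the
    evaluation is cut off after max_rounds rounds."""
    p = {0}
    for _ in range(max_rounds):
        new = {x + 1 for x in p} - p
        if not new:
            return p, True    # converged (never happens for this program)
        p |= new
    return p, False           # gave up: the minimal model is infinite

facts, converged = successor_closure(5)
```

After five rounds `facts` holds the integers 0 to 5 and `converged` is still false, however large the limit is made.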
Evaluating Datalog
• NEGATION
– Monotonicity – essential to know at each evaluation step that any new fact is always true.
p(X) :- r(X), not s(X)
s(X) :- t(X)
– Requires that the second rule is evaluated before the first rule.
– This layering of rules is called STRATIFICATION (More about this later)
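One common way to compute such a layering is to push a rule head one stratum above every predicate it negates, repeating until nothing changes. The sketch below works at the predicate level only and uses an assumed encoding of rules; it is not taken from IRIS.

```python
def stratify(rules):
    """rules: list of (head_pred, [(body_pred, negated), ...]).
    Returns a dict mapping each predicate to its stratum number."""
    preds = {head for head, _ in rules} | {b for _, body in rules for b, _ in body}
    stratum = dict.fromkeys(preds, 0)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                # A head must sit at least as high as a positive dependency
                # and strictly above a negated one.
                required = stratum[pred] + (1 if negated else 0)
                if stratum[head] < required:
                    if required >= len(preds):
                        # More strata than predicates means a cycle through negation.
                        raise ValueError("program is not stratified")
                    stratum[head] = required
                    changed = True
    return stratum

slide_rules = [
    ("p", [("r", False), ("s", True)]),   # p(X) :- r(X), not s(X)
    ("s", [("t", False)]),                # s(X) :- t(X)
]
strata = stratify(slide_rules)
```

For the two rules on the slide, s lands in stratum 0 and p in stratum 1, so the second rule is evaluated to completion before the first one consults `not s(X)`.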
Evaluating Datalog
• Computing the fixed point is expensive!
• Although, when answering a query such as
?- p(?X,?Y)
the fixed point computation will have to be done anyway.
• However, when the query contains constants, e.g.
?- p('a', ?Y)
then an optimisation is possible: Magic Sets
Evaluating Datalog
• Magic sets
– This technique generates a modified program in order to take advantage of constants in the query
– A bottom-up evaluation procedure is used to evaluate the new program
Evaluating Datalog
Magic sets example
Before:
sg(?X, ?X) :- person(?X).
sg(?X, ?Y) :- parent(?X, ?Xp), sg(?Xp, ?Yp), parent(?Y, ?Yp).
?- sg('a', ?W).
After:
m_sg('a').
m_sg(?Xp) :- sup2_1(?X, ?Xp).
sup2_1(?X,?Xp) :- m_sg(?X), parent(?X, ?Xp).
sg(?X, ?X) :- m_sg(?X), person(?X).
sg(?X, ?Y) :- sup2_1(?X, ?Xp), sg(?Xp, ?Yp), parent(?Y, ?Yp).
?- sg('a', ?W).
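The effect of the rewriting can be checked by running both programs through the same naive bottom-up evaluator. The evaluator and the small `parent`/`person` data set below are assumptions made for this sketch (the data is hypothetical, not from the talk); only the two rule sets are taken from the slide.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, env):
    """Extend env so that atom equals fact, or return None if they clash."""
    (pred, args), (fpred, fargs) = atom, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    env = dict(env)
    for arg, value in zip(args, fargs):
        if is_var(arg):
            if env.setdefault(arg, value) != value:
                return None
        elif arg != value:
            return None
    return env

def evaluate(rules, facts):
    """Naive bottom-up evaluation to a fixed point."""
    facts = set(facts)
    while True:
        derived = set()
        for head, body in rules:
            envs = [{}]
            for atom in body:
                envs = [e for env in envs for fact in facts
                        if (e := match(atom, fact, env)) is not None]
            for env in envs:
                derived.add((head[0], tuple(env.get(a, a) for a in head[1])))
        if derived <= facts:
            return facts
        facts |= derived

# Hypothetical data: parent(child, parent).
parent = {("parent", pair) for pair in
          [("a", "p1"), ("b", "p1"), ("c", "p2"), ("d", "p2")]}
person = {("person", (x,)) for x in ["a", "b", "c", "d", "p1", "p2"]}

plain = [
    (("sg", ("?X", "?X")), [("person", ("?X",))]),
    (("sg", ("?X", "?Y")), [("parent", ("?X", "?Xp")), ("sg", ("?Xp", "?Yp")),
                            ("parent", ("?Y", "?Yp"))]),
]
magic = [
    (("m_sg", ("?Xp",)), [("sup2_1", ("?X", "?Xp"))]),
    (("sup2_1", ("?X", "?Xp")), [("m_sg", ("?X",)), ("parent", ("?X", "?Xp"))]),
    (("sg", ("?X", "?X")), [("m_sg", ("?X",)), ("person", ("?X",))]),
    (("sg", ("?X", "?Y")), [("sup2_1", ("?X", "?Xp")), ("sg", ("?Xp", "?Yp")),
                            ("parent", ("?Y", "?Yp"))]),
]

def sg_facts(model):
    return {args for pred, args in model if pred == "sg"}

sg_plain = sg_facts(evaluate(plain, parent | person))
sg_magic = sg_facts(evaluate(magic, parent | person | {("m_sg", ("a",))}))
answers_plain = {w for x, w in sg_plain if x == "a"}
answers_magic = {w for x, w in sg_magic if x == "a"}
```

Both programs give the same answers for `sg('a', ?W)`, but the rewritten program derives far fewer `sg` tuples because the `m_sg` facts restrict evaluation to individuals relevant to 'a'.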
Recent Work
• Stabilisation of IRIS
– more unit/functional tests
  • fixing bugs found
– identifying undefined behaviour
  • built-ins with incompatible data types
  • negated built-ins
– documentation
  • design (UML)
  • user guide
Recent Work
• Locally Stratified Negation
– a rule has a negative dependency on itself, BUT
– the input and output of the rule are partitioned due to the presence of constants, e.g.
p('a',X) :- r(X), ¬ p('b',X)
Add a tuple ('a',X) to p if X is in r, unless ('b',X) is in p
Recent Work
• Why is local stratification important?
– because of how WSML is evaluated, e.g.
axiom isSingle definedBy
?x[family_status hasValue single] :- ?x memberOf Human and naf ( ?x[married_to hasValue ?y] )
– is translated into Datalog as:
_has_values(?x, 'family_status', 'single') :- _member_of(?x, 'Human'), ¬ _has_values(?x, 'married_to', ?y)
Recent Work
• Locally stratified negation – trickier example
p('a',X) :- r(X), not q('b',X)
q(X,Y) :- p(X,Y)
– Which rule should be evaluated first?
– What are the strata?
• IRIS re-writes these rules to:
Stratum 0: q('b',Y) :- p('b',Y)
Stratum 1: p('a',X) :- r(X), not q('b',X)
           q(X,Y) :- p(X,Y), X != 'b'
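To see why the rewritten program can be evaluated stratum by stratum, the three rules can be hand-translated into Python set operations. This is only a sketch of the evaluation order; the function name and the input facts are hypothetical.

```python
def evaluate_rewritten(r, p_facts):
    """r: set of values X with r(X); p_facts: initial set of p tuples."""
    p = set(p_facts)
    # Stratum 0: q('b',Y) :- p('b',Y)
    q = {(x, y) for x, y in p if x == "b"}
    # Stratum 1 only ever adds p('a', ...) tuples, so the q('b', ...) part
    # computed above is already complete and safe to negate.
    while True:
        # p('a',X) :- r(X), not q('b',X)
        new_p = {("a", x) for x in r if ("b", x) not in q}
        # q(X,Y) :- p(X,Y), X != 'b'
        new_q = {(x, y) for x, y in p | new_p if x != "b"}
        if new_p <= p and new_q <= q:
            return p, q
        p |= new_p
        q |= new_q

p_result, q_result = evaluate_rewritten({1, 2}, {("b", 1)})
```

With r = {1, 2} and the single starting fact p('b',1), the value 1 is blocked by q('b',1), so only p('a',2) and q('a',2) are added.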
Recent Work
• Query Containment
– is a reasoning-related task
• Will the results of query q1 contain the results of query q2? Always?
• Query containment could be used in web service discovery:
– Plug-in (goal is completely achieved)
– Subsumption (goal subsumes web-service)
– Exact match
Recent Work
• Query Containment
– The current algorithm (frozen fact*) is limited to positive datalog only
– Further work will produce an algorithm that also works in the presence of negation
*This algorithm is presented in Ramakrishnan, R., Y. Sagiv, J. D. Ullman and M. Y. Vardi (1989). Proof-Tree Transformation Theorems and their Applications. 8th ACM Symposium on Principles of Database Systems, pp. 172 - 181, Philadelphia
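For positive conjunctive queries the frozen-fact idea can be sketched as follows: replace the variables of q2 by fresh constants, treat its body as a tiny database, and check whether q1 returns the frozen head of q2 over that database. The encoding and function names are assumptions for illustration, not the WSML2Reasoner API.

```python
def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def answers(query, facts):
    """Evaluate a conjunctive query (head_args, body_atoms) over ground facts."""
    head, body = query
    envs = [{}]
    for pred, args in body:
        next_envs = []
        for env in envs:
            for fpred, fargs in facts:
                if fpred != pred or len(fargs) != len(args):
                    continue
                e = dict(env)
                if all((e.setdefault(a, v) == v) if is_var(a) else a == v
                       for a, v in zip(args, fargs)):
                    next_envs.append(e)
        envs = next_envs
    return {tuple(e.get(a, a) for a in head) for e in envs}

def contains(q1, q2):
    """True if every answer of q2 is also an answer of q1 on any database."""
    head2, body2 = q2

    def freeze(term):
        # Fresh constants; assumed not to clash with constants used in the queries.
        return "frozen_" + term[1:] if is_var(term) else term

    frozen_facts = {(pred, tuple(freeze(a) for a in args)) for pred, args in body2}
    frozen_head = tuple(freeze(a) for a in head2)
    return frozen_head in answers(q1, frozen_facts)

# q1: ans(?X) :- parent(?X, ?Y)
q1 = (("?X",), [("parent", ("?X", "?Y"))])
# q2: ans(?X) :- parent(?X, ?Y), parent(?Y, ?Z)
q2 = (("?X",), [("parent", ("?X", "?Y")), ("parent", ("?Y", "?Z"))])
```

Here `contains(q1, q2)` holds, since anyone with a grandparent has a parent, while `contains(q2, q1)` does not.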
Recent Work
• Query extensions
– Formal languages group have specified a query language called WSML-Flight-A
– Implemented using post-processing on WSML2Reasoner
SELECT ?place, COUNT(*)
FROM _"file:/C:/deri/workspace/WSMLQuery/test/files/the-simpsons-ontology.wsml"
WHERE ?employee[hasWorkingPlace hasValue ?place]
ORDER BY COUNT(*) DESC
GROUP BY ?place
HAVING COUNT(*) > 1

PLACE                    COUNT(*)
springfield_elementary   6
channel_6                5
nuclear_plant            4
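The post-processing step can be pictured as ordinary grouping and sorting applied to the variable bindings the reasoner returns. The sketch below is an assumption about what such a step might look like; the bindings are hypothetical sample data and do not reproduce the ontology or the counts shown above.

```python
from collections import Counter

def count_by_place(bindings, more_than=1):
    """Mimics GROUP BY ?place, HAVING COUNT(*) > more_than, ORDER BY COUNT(*) DESC.
    bindings: iterable of (employee, place) pairs from the WHERE clause."""
    counts = Counter(place for _employee, place in bindings)
    rows = [(place, n) for place, n in counts.items() if n > more_than]
    # Ties are broken by place name; that tie-break is an assumption.
    return sorted(rows, key=lambda row: (-row[1], row[0]))

# Hypothetical bindings for ?employee[hasWorkingPlace hasValue ?place].
bindings = [
    ("employee_1", "nuclear_plant"), ("employee_2", "nuclear_plant"),
    ("employee_3", "channel_6"), ("employee_4", "channel_6"),
    ("employee_5", "channel_6"), ("employee_6", "place_with_one_employee"),
]
rows = count_by_place(bindings)
```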
Recent Work
• Reasoning over external data sources
• Motivation:
– Not convenient to include all instance data in a WSML document
– To answer a query, the reasoner may not need all instance data
– Impossible to provide all instance data when the data set is very large
– Allows reasoner to use data in any format
Recent Work
Overview at the WSML layer
[Diagram: the WSMLReasoner receives a WSML Document (ontologies, web services, goals, mediators...) and Queries, and gets instance data from a UserDataSourceClass, shown with the ExternalDataSource interface.]
ExternalDataSource
HasValue[] hasValue(IRI id, IRI name, Term value)
MemberOf[] memberOf(IRI id, IRI concept)
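A user-supplied data source in this style might look like the following Python analogue. Everything here is hypothetical: the class, the snake_case method names, the use of `None` as "unbound, match anything", and the sample rows are assumptions for the sketch and are not the actual WSML2Reasoner interface.

```python
class InMemoryDataSource:
    """Hypothetical stand-in for a user class that serves instance data on demand."""

    def __init__(self, has_value_rows, member_of_rows):
        self._has_value = list(has_value_rows)    # (instance, attribute, value)
        self._member_of = list(member_of_rows)    # (instance, concept)

    @staticmethod
    def _select(rows, pattern):
        # Assumption: None in a pattern position means the argument is unbound.
        return [row for row in rows
                if all(p is None or p == v for p, v in zip(pattern, row))]

    def has_value(self, instance=None, name=None, value=None):
        return self._select(self._has_value, (instance, name, value))

    def member_of(self, instance=None, concept=None):
        return self._select(self._member_of, (instance, concept))

source = InMemoryDataSource(
    has_value_rows=[("employee_1", "hasWorkingPlace", "nuclear_plant"),
                    ("employee_2", "hasWorkingPlace", "channel_6")],
    member_of_rows=[("employee_1", "Human"), ("employee_2", "Human")],
)
at_plant = source.has_value(name="hasWorkingPlace", value="nuclear_plant")
```

The point of the design is that the reasoner asks only for the tuples a query needs, so the rows could equally come from a database or a file in any format.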
Future Work
• Reminder
– WSML language variants:
  • different levels of logical expressiveness and
  • different language paradigms
– WSML-Core, -Flight and -Rule can be evaluated using Datalog
  • Core: Function free and negation free
  • Flight: With inequality and locally stratified negation
  • Rule: Unsafe/unstratified rules and function symbols
Future Work
• Concentrate on WSML-Rule
– Function symbols
– Unsafe rules
– Unstratified logic programs
• Continue evolution of framework
– Plug-ins
  • Program optimisers
  • Storage algorithms
  • Evaluation strategies
– Description Logics
– Rule Interchange Format (RIF)
Future Work
• Function Symbols
– A generalisation of Datalog
– Extra work for storage and selection of facts
– Required for WSML-Rule
axiom aaMapingRule14 definedBy
mediated1(?X11,Name)[lName hasValue ?Y12] memberOf Name
:- ?X11[o1#hasLastName hasValue ?Y12] memberOf o1#person and ?X11 memberOf ?SC13 and mappedConcepts(?SC13,Name).
Future Work
• Without function symbols
– straightforward to match tuples in a relation, e.g.
p(X,Y) :- q(X,Y), r(X,2), s(Y,Y)
q(X,Y): everything
r(X,2): tuples whose last term is 2
s(Y,Y): tuples whose two terms are equal
• With function symbols
– term matching is harder, e.g.
p(f(X),Y) :- q(g(X), h(X,Y))

Relation Q                     match
q(g(1), h(1, 2))               X=1, Y=2
q(g(1), h(2, 2))               no match
q(g(k(3)), h(k(3), m(2,3)))    X=k(3), Y=m(2,3)
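Matching a body literal that contains function symbols against ground facts amounts to one-way matching over nested terms. A minimal sketch, assuming constructed terms are encoded as tuples `(functor, arg1, ...)` and variables as capitalised strings (both conventions are assumptions of the example):

```python
def match(pattern, term, env):
    """Match a pattern (may contain variables) against a ground term.
    Returns the extended bindings, or None if there is no match."""
    if isinstance(pattern, str) and pattern[:1].isupper():   # variable
        if pattern in env:
            return env if env[pattern] == term else None
        return {**env, pattern: term}
    if isinstance(pattern, tuple):                           # f(t1, ..., tn)
        if (not isinstance(term, tuple) or len(term) != len(pattern)
                or pattern[0] != term[0]):
            return None
        for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
            env = match(sub_pattern, sub_term, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None                  # constant

# The body literal q(g(X), h(X,Y)) from the slide.
pattern = ("q", ("g", "X"), ("h", "X", "Y"))
relation_q = [
    ("q", ("g", 1), ("h", 1, 2)),
    ("q", ("g", 1), ("h", 2, 2)),
    ("q", ("g", ("k", 3)), ("h", ("k", 3), ("m", 2, 3))),
]
results = [match(pattern, fact, {}) for fact in relation_q]
```

The three results correspond to the rows of the table above: a binding, no match, and a binding of X and Y to constructed terms.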
Future Work
• With function symbols, safe rules can have an infinite minimal model:
p(f(X)) :- p(X)
• Which is the same convergence problem as before!
• Possible solutions
– time constraint
– space constraint
– complexity constraint
Future Work
• Stratified Logic Programs
– Have acyclic negation
– Have a unique minimal model
– Have an intuitive meaning
• Unstratified Logic Programs
– Often do not have a clear meaning
– Often have several minimal models or a minimal model that is the empty set
• WSML-Rule allows unsafe and unstratified rules
Future Work
• Stable-model semantics
– Gives meaningful results to some programs
– However, some programs have no stable model, e.g.
p(X) :- not p(X) (odd loop)
– And some programs have two stable models, e.g.
p(X) :- not q(X) (even loop)
q(X) :- not p(X)
(hinting at disjunctive information)
– Worst of all, computationally hard!
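The odd and even loops can be checked with a brute-force stable-model test based on the Gelfond-Lifschitz reduct. The sketch works on ground (propositional) versions of the rules, so `p(X)` is simplified to the atom `p`; trying every candidate set is exponential and is meant only to illustrate the definition.

```python
from itertools import combinations

def least_model(positive_rules):
    """Least model of a negation-free ground program of (head, positive_body) rules."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model

def stable_models(rules):
    """rules: list of (head, positive_body_set, negative_body_set)."""
    atoms = set()
    for head, pos, neg in rules:
        atoms |= {head} | pos | neg
    found = []
    for size in range(len(atoms) + 1):
        for candidate in map(set, combinations(sorted(atoms), size)):
            # Reduct: drop rules whose negated atoms are in the candidate,
            # then drop the remaining negative literals.
            reduct = [(h, pos) for h, pos, neg in rules if not (neg & candidate)]
            if least_model(reduct) == candidate:
                found.append(candidate)
    return found

odd_loop = [("p", set(), {"p"})]                          # p :- not p
even_loop = [("p", set(), {"q"}), ("q", set(), {"p"})]    # p :- not q.  q :- not p.
```

`stable_models(odd_loop)` is empty, while `stable_models(even_loop)` contains exactly the two models {p} and {q}.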
Future Work
• Well-founded semantics
– Every rule set has a unique well-founded model
– Several published evaluation techniques
– However, even loops give an empty set (no disjunction)
• Interesting papers:
– Kemp, Stuckey and Srivastava on well-founded semantics and magic sets
– Zukowski, alternative computation of well-founded semantics
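One published way to compute the well-founded model is the alternating fixpoint, sketched here for ground (propositional) programs. This is a generic illustration under an assumed rule encoding; it is not the method of either paper listed above, nor a planned IRIS design.

```python
def least_model(positive_rules):
    """Least model of a negation-free ground program of (head, positive_body) rules."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and pos <= model:
                model.add(head)
                changed = True
    return model

def gamma(rules, interpretation):
    """Least model of the reduct of the program with respect to interpretation."""
    reduct = [(h, pos) for h, pos, neg in rules if not (neg & interpretation)]
    return least_model(reduct)

def well_founded(rules):
    """rules: list of (head, positive_body_set, negative_body_set).
    Returns (true_atoms, false_atoms, undefined_atoms)."""
    atoms = set()
    for head, pos, neg in rules:
        atoms |= {head} | pos | neg
    true = set()
    while True:
        possible = gamma(rules, true)        # overestimate of what may be true
        next_true = gamma(rules, possible)   # underestimate of what must be true
        if next_true == true:
            return true, atoms - possible, possible - true
        true = next_true

even_loop = [("p", set(), {"q"}), ("q", set(), {"p"})]    # p :- not q.  q :- not p.
stratified = [("r", set(), set()), ("p", {"r"}, {"s"}), ("s", {"t"}, set())]
```

For the even loop nothing is derived as true and both atoms stay undefined, which matches the remark that even loops give an empty set; for the stratified program the result is fully two-valued.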
Summary
• Over the last 18 months much has been achieved
– Datalog reasoner created
– Framework for reasoning with RDFS and WSML
• However, there is much more to do
– Function symbols – end of February 2008
– Unsafe/unstratified programs – end of March 2008
• Target
– a complete in-house WSML-Rule reasoner by start of April at the latest (bye-bye MINS)