Constraint Generation and Reasoning in OWL
Dissertation Defense
Thomas H. Briggs, VI
Advisor: Dr. Yun Peng
University of Maryland, Baltimore County
Introduction
• Property Constraints
  • Important to defining the semantics of an ontology
  • Properties may have domain / range constraints
  • Global consequences from local assertions
  • 75% of properties are unconstrained
• Property Constraint Generation
  • Uses information in the ontology to generate constraints
  • Can be used to determine missing, suggest new, or analyze existing constraints
  • Creates default knowledge that must be treated differently than other asserted or inferred knowledge.
Thesis
The purpose of this research is to investigate methods for generating a property's domain and range constraints from its defining ontology and to evaluate the quality of this generation. This work will also investigate the default reasoning necessary to support generated constraints. A specific focus will be on the management of default facts in the knowledge base, including tracking default facts and efficient retraction operations to restore consistency.
Research Outcomes
• Outcomes of this work are:
  • An algorithmic framework to generate and evaluate domain and range constraints,
  • A quantitative comparison of the relationship between generated and specified constraints, and
  • An inference procedure that will enable a limited form of default reasoning that maintains the completeness and correctness of OWL reasoners.
Description Logics
• Description Logics:
  • are a branch of crisp logics
  • include well-researched languages (AL, CLASSIC, RACER)
  • have a long history
  • are the basis of the Semantic Web
  • have fast and efficient reasoners for some DLs (FaCT, Pellet)
Description Logics
• Describe some world by defining classes, properties, and individuals
  • Classes define types of individuals
  • Properties define relationships between individuals
  • Individuals are things that are instances of classes, and are related to other individuals through properties.
• Similar to first-order logic
Constraints
• An assertion about the types of the fillers of a property
  • The subject is a type of the property's domain
  • The object is a type of the property's range
  • Unconstrained defaults to Thing / Top
• Different interpretation than in traditional languages
  • Constraints define valid types of individuals
  • Traditional languages may force a type cast, but error otherwise
void foo(double z) { printf("%f\n", z); }

char x[] = "33.0";
foo(x);   /* Error: strings cannot be doubles! */
teaches: domain(Teacher), range(Student)
teaches(Adam, Bill)
Adam is a teacher, Bill a student
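The teaches example above can be sketched to show the contrast with the C code: a DL reasoner infers types from the domain/range axiom rather than rejecting the assertion. The dict-based KB below is a toy assumption for illustration, not real OWL tooling:

```python
# Toy sketch: domain/range axioms let a DL reasoner *infer* types for the
# subject and object of a property assertion, instead of raising a type error.
def apply_domain_range(assertions, domain, range_):
    types = {}
    for subj, obj in assertions:
        types.setdefault(subj, set()).add(domain)   # subject gets the domain type
        types.setdefault(obj, set()).add(range_)    # object gets the range type
    return types

types = apply_domain_range([("Adam", "Bill")], "Teacher", "Student")
print(types)  # {'Adam': {'Teacher'}, 'Bill': {'Student'}}
```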
Open World Assumption

• Open World Assumption (OWA)
  • Anything that isn't asserted is considered unknown.
  • Leads to monotonicity in the reasoner.
• Closed World Assumption (CWA)
  • Assume all facts are known
  • Default knowledge

hasChild(ALICE, BOB)
Does Alice have exactly one child?
Closed World: Yes!  Open World: Unknown.
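The hasChild example can be sketched as two query modes over the same fact set (a toy triple store, assumed for illustration):

```python
# Toy sketch: the same KB answers the question differently under
# closed-world vs open-world semantics.
facts = {("hasChild", "ALICE", "BOB")}

def children(person):
    return {o for (p, s, o) in facts if p == "hasChild" and s == person}

def has_exactly_one_child(person, closed_world):
    if closed_world:
        # CWA: anything not asserted is false, so Bob is the only child
        return len(children(person)) == 1
    # OWA: other, unasserted children may exist
    return "unknown"

print(has_exactly_one_child("ALICE", closed_world=True))   # True
print(has_exactly_one_child("ALICE", closed_world=False))  # unknown
```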
Unique Name Assumption
• Assumption that the name of an item is sufficient to make it unique (UNA).
  • We make this assumption for classes and properties
  • We do not make it for individuals
True only when they are the same individual.

Open World Assumption – because we didn't say they were different, the reasoner can conclude that they are the same to make the model true.
Constraint Generation
Unconstrained Properties
• Domain and range assert types to the fillers of a property
• Unconstrained properties lack these type assertions
• Possible reasons:
  • The information is unknown
  • An artifact of an ontology generator
  • Avoiding conflicts with reuse
  • Faulty semantics
Constraint Generation
• Unconstrained properties are a problem
• Constraint generation is a non-trivial process:
  • Omitted constraints may or may not be intentional
  • Open World Assumption – information may not be there
• Two sources of information on constraints:
  • ABox
  • TBox
ABox Generation
• ABox generation is problematic
  • Depends on individuals' class membership
  • Individuals may not be defined / UNA
  • Individuals frequently do not have a complete set of class assertions
  • Class assertions overlap

What should the domain and range of drives be?
TBox Generation
• Terminology provides the definition of the relationships between classes.
Generation Lemma:
Class: Vehicle
  SubClassOf: Thing and (drivenBy some Person)
Class: Civic
  SubClassOf: Thing and (madeBy only HONDA) and (drivenBy some Person)

Domain must subsume: Vehicle union Civic

Candidate domains: Vehicle? Vehicle or Civic? Vehicle or X? X?
Finding “Best”
• Using terminology to find the "best" constraint
  • Intractable – exponential growth
  • Requires a utility function to measure goodness
  • Requires future knowledge or omniscience
Generation Methods
• Generation Methods
  • Construct a constraint that satisfies the Generation Lemma
• Three Generation Methods
  • Disjunction Method
  • Least Common Named Subsumer
  • Vivification
Disjunction
• Based on the Generation Lemma
• Computes the Least Common Subsumer (LCS)
  • In languages with disjunction, the LCS is simply the disjunction of the concepts
  • Generation time is linear w.r.t. the number of classes and properties
  • Reasoning time is exponential.
Disjunction Algorithm
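The disjunction method can be sketched over a toy TBox; the dict-of-restrictions representation is an assumption for illustration, not the dissertation's actual data structures:

```python
# Toy sketch of the disjunction method: the domain of a property is the
# disjunction of all classes whose definition restricts that property.
def disjunction_domain(tbox, prop):
    users = [cls for cls, restrictions in tbox.items()
             if any(p == prop for p, _ in restrictions)]
    if not users:
        return "Thing"              # unused property: default to Top
    return " or ".join(sorted(users))

tbox = {
    "Vehicle": [("drivenBy", "some Person")],
    "Civic":   [("drivenBy", "some Person"), ("madeBy", "only HONDA")],
    "Person":  [],
}
print(disjunction_domain(tbox, "drivenBy"))  # Civic or Vehicle
```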
Disjunction Example
Domain for P:
Range for P: C
Disjunction Discussion
• Disjunction is good because
  • It is simple to compute
  • It is the most specific / accurate statement of the constraint
• Disjunction is bad because
  • It does not add useful information
  • Disjunction adds non-determinism to the reasoner
Least Common Named Subsumer
• Select a named concept that subsumes the concepts
  • Trade-off: gives up specificity in the concept description
  • Quality depends on the existence of named concepts
  • May be expensive to compute
• Runtime is
LCNS Algorithm
Subsumption checking is the dominant cost.
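A minimal sketch of the LCNS computation over a toy named-class hierarchy; the child-to-parent map is assumed to be a tree here for simplicity, whereas a real ontology needs full subsumption checking:

```python
# Toy sketch of Least Common Named Subsumer: intersect the ancestor chains
# of all input classes and return the deepest common named ancestor.
def ancestors(cls, parent):
    chain = [cls]
    while cls in parent:
        cls = parent[cls]
        chain.append(cls)
    return chain

def lcns(classes, parent):
    if not classes:
        return "Thing"
    common = set(ancestors(classes[0], parent))
    for c in classes[1:]:
        common &= set(ancestors(c, parent))
    if not common:
        return "Thing"
    # "least" = the common ancestor deepest in the hierarchy
    return max(common, key=lambda c: len(ancestors(c, parent)))

parent = {"Civic": "Car", "Car": "Vehicle", "Truck": "Vehicle",
          "Vehicle": "Thing"}
print(lcns(["Civic", "Truck"], parent))  # Vehicle
```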
LCNS Example
LCNS Domain of P: A
LCNS Range of P: C
Disjunction Domain of P:
LCNS Discussion
• LCNS is good because• It selects a named class in the ontology• Runtime bound to cost for subsumption
checking• Generalizes concepts from disjunction
• LCNS is bad because• Requires existence of a named class or LCNS is
Thing• Tends to over-generalize in other case as well• Over-generalization discards too much
information
Vivification
• Balances specificity and over-generalization
• First proposed by Cohen & Hirsh, 1992
• The difference here is partial absorption
• Starts with the disjunction and, using the inheritance relationship, summarizes terms with common direct super-classes.
• Only terms that do not share a common super-class remain in the disjunction
Absorption
• Moderates the generalization process
• Uses the class inheritance structure for its operation
Vivification Algorithm
Vivify a concept list L for a given absorption criterion β.

Perform Absorption
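The vivification step can be sketched as partial absorption over a toy hierarchy; the `beta` threshold used here (minimum number of siblings needed to absorb) is a simplified stand-in for the slide's absorption criterion β:

```python
# Toy sketch of vivification: disjuncts that share a direct named superclass
# are absorbed into that superclass; outliers stay in the disjunction.
from collections import defaultdict

def vivify(disjuncts, parent, beta=2):
    by_parent = defaultdict(list)
    for c in disjuncts:
        by_parent[parent.get(c, "Thing")].append(c)
    result = []
    for p, children in by_parent.items():
        if len(children) >= beta and p != "Thing":
            result.append(p)          # absorb siblings into the superclass
        else:
            result.extend(children)   # keep outliers as-is
    return sorted(result)

parent = {"Car": "Vehicle", "Truck": "Vehicle", "Boat": "Craft"}
print(vivify(["Car", "Truck", "Boat"], parent))  # ['Boat', 'Vehicle']
```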
Vivification Example
Property P is used in the definition of the three yellow classes.
Disjunction Domain:
LCNS Domain:
Vivification Domain:
Vivification Discussion
• Vivification is good because
  • It creates general concepts that summarize over common super-classes, selecting named subsumers
  • It preserves outliers
  • It is fast
• Vivification is bad because
  • Disjunctions may remain after summarization
  • It depends on the completeness of the terminology
  • It ignores individual assertions
Results
Results - Domain
Row 1 – The generated constraint was equal to the originally specified one. A positive outcome: the constraint was correctly generated with equal specificity.
Domain Results

                                            Disjunction       LCNS              Vivification
   Relationship                             # props    %      # props    %      # props    %
 1 Original Equals Generated                    801   2.8%        833   2.9%        808   2.8%
 2 Original More Specific Than Generated          7   0.0%          7   0.0%         63   0.2%
 3 Original More General Than Generated         141   0.5%        103   0.4%         74   0.3%
 4 Original and Generated Top                   800   2.8%       1111   3.8%        807   2.8%
 5 Original Top, Generated More Specific       2427   8.4%       2112   7.3%       2414   8.4%
 6 Generated Top, Original More Specific         27   0.1%         71   0.2%         25   0.1%
 7 Property Unused, Original Specified         3201  11.1%       3204  11.1%       3190  11.0%
 8 Property Unused, Original Unspecified      21385  74.0%      21406  74.1%      21267  73.6%
 9 Processor Failed                              64   0.2%         46   0.2%        201   0.7%
10 Reasoner Failed                               49   0.2%          9   0.0%         53   0.2%
   Total                                      28902             28902             28902
Row 2 – Original more specific than generated. In all of these cases the generated constraint subsumed the original, making the original more specific than the generated one.
Row 3 – Original more general than generated. A negative-to-neutral outcome: the original constraint was more general than its present usage.
Row 4 – Original and generated Top. Both the original and the generated concepts were Top. This is a special case of row 1, where the concepts are equal.
Row 5 – Original Top, generated more specific. Strongly positive results: a constraint was generated for a property that previously lacked one.
Row 6 – Generated Top, original more specific. A neutral-to-negative result: a constraint was generated as Top when the original was not Top. One example was an ontology that defined hasAunt as the union of Niece and Nephew, which was equivalent to Person, and Person was equivalent to everything; hence the generated constraint was Top.
Rows 7–8 – Property unused. Neutral results: a constraint could not be generated because there were no role restrictions to define it.
Rows 9–10 – Processor or reasoner failed. There was a runtime failure of the processor or the reasoner.
Results – Range
Range results were similar to domain.
Range Results

                                            Disjunction       LCNS              Vivification
   Relationship                             # props    %      # props    %      # props    %
 1 Original Equals Generated                    231   0.8%        248   0.9%        255   0.9%
 2 Original More Specific Than Generated          6   0.0%          6   0.0%         17   0.1%
 3 Original More General Than Generated         172   0.6%        147   0.5%        138   0.5%
 4 Original and Generated Top                   647   2.2%        930   3.2%        657   2.3%
 5 Original Top, Generated More Specific       2113   7.3%       1839   6.4%       2097   7.3%
 6 Generated Top, Original More Specific        361   1.2%        392   1.4%        365   1.3%
 7 Property Unused, Original Specified         3403  11.8%       3428  11.9%       3416  11.8%
 8 Property Unused, Original Unspecified      21824  75.5%      21834  75.5%      20959  72.5%
 9 Processor Failed                             102   0.4%         63   0.2%        955   3.3%
10 Reasoner Failed                               43   0.1%         15   0.1%         43   0.1%
   Total                                      28902             28902             28902
Results - Normalized
Generation strategies created improved constraints almost 80% of the time.
Vivification created constraints nearly as specific as Disjunction.
Domain

                                          Disjunction       LCNS              Vivified
Relationship                              # props    %      # props    %      # props    %
Original Equals Generated                     801  19.1%        833  19.7%        808  19.6%
Original More General Than Generated          141   3.4%        103   2.4%         74   1.8%
Original Top, Generated Top                   800  19.1%       1111  26.3%        807  19.5%
Original Top, Generated More Specific        2427  57.8%       2112  49.9%       2414  58.5%
Generated Top, Original More Specific          27   0.6%         71   1.7%         25   0.6%
Total                                        4196              4230              4128

Range

                                          Disjunction       LCNS              Vivified
Relationship                              # props    %      # props    %      # props    %
Original Equals Generated                     231   6.6%        248   7.0%        255   7.3%
Original More General Than Generated          172   4.9%        147   4.1%        138   3.9%
Original Top, Generated Top                   647  18.4%        930  26.2%        657  18.7%
Original Top, Generated More Specific        2113  60.0%       1839  51.7%       2097  59.7%
Generated Top, Original More Specific         361  10.2%        392  11.0%        365  10.4%
Total                                        3524              3556              3512
Results - Runtime
Time – Load / Reasoning Time

Algorithm       min (s)   max (s)   mean (s)   std. dev (s)
No generation      3.80     14.62       5.29           1.71
LCS                3.85     23.37       5.43           2.29
LCNS               3.79     23.92       5.34           2.20
Vivification       3.79     22.56       5.29           2.08

Time – Load, Reason, Generate, Build, Reason – 1000 Ontologies

Algorithm       mean (s)   std. dev (s)
LCS                 0.22           1.16
LCNS                0.14           0.14
Vivification        0.08           0.82
Vivification was faster than Disjunction at a 92.6% confidence level.

Vivification was faster than LCNS at a 76.4% confidence level.
Hypothesis Testing
Results – Discussion
• Generation
  • Removing unused properties gives a better picture of the future as technologies mature.
  • Generation is a viable method
• Vivification was the dominant method
  • Generated constraints with near-equal specificity to LCS
  • Able to generalize at appropriate times
  • Avoided the over-generalization of LCNS
  • All-around best performance for generation and reasoning
Default Reasoning
• Monotonicity
  • One goal of OWL is to maintain monotonicity – the property of a reasoner that adding new facts to the knowledge base does not cause existing facts to be retracted.
• Default Knowledge / Rules
  • Default knowledge and rules about the terminology make use of closed-world semantics and give up monotonicity.
  • A default rule may conflict with future statements
  • Statements must then be retracted.
Contraction
• When a clash occurs in a knowledge base with default statements, those default facts must be removed to restore consistency. This is called a contraction.
• How to tell default from non-default?
  • Inference leads to a multi-path problem
  • Default and non-default facts can be used to infer new facts
  • Default facts may block non-default facts from being generated
Default Example
Before generation:

Class: A  SubClassOf: Thing, P some B
Class: B  SubClassOf: Thing
Class: C  SubClassOf: Thing
ObjectProperty: P
  Domain: Thing
  Range: Thing
Individual: J
Individual: I
Facts: P(I,J)

After property generation, the domain and range on P are generated / default:

Class: A  SubClassOf: Thing, P some B
Class: B  SubClassOf: Thing
Class: C  SubClassOf: Thing
ObjectProperty: P
  Domain: A
  Range: B
Individual: J
  Types: B
Individual: I
  Types: A
Facts: P(I,J)

What if the domain expert adds C SubClassOf: P some B? Now the domain of P is generated as A union C, and I can no longer be inferred to be in A!
Modifications
• Default Descriptor
  • Indicates the defaultness of a statement or assertion.
  • Does not change the meaning of the term
• Inference
  • Inference rules are modified to propagate the descriptor
  • A non-default statement must replace a default statement
Concept Strength
• Concept strength between concepts C and D
• Strength relationship:
  • If C is default and D is not, then C is weaker than D
  • If C and D have the same defaultness, then they are equal
  • If C is not default and D is, then C is stronger than D
Reasoner
• The reasoner was implemented from transformation rules such as:
• It depends on the contains and union operations.
Modified Reasoner
• The reasoner's rules are modified
• Contains Rule
  • The contains(x) predicate is modified: return true if A contains some y such that y = x and x is not stronger than y.
• Union Update Procedure
  • The union operator that updates the KB is modified to replace any equivalent weaker term with a stronger term
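The modified contains and union operations can be sketched with a per-fact default flag; this toy representation is an assumption for illustration, not the implemented reasoner:

```python
# Toy sketch: facts carry a default flag; a non-default (asserted) fact is
# "stronger" than a default one and replaces it on update.
class Fact:
    def __init__(self, stmt, default):
        self.stmt, self.default = stmt, default
    def stronger_than(self, other):
        # non-default beats default; equal defaultness means equal strength
        return (not self.default) and other.default

def contains(kb, fact):
    # modified contains: true only if the KB holds an equivalent fact that
    # `fact` is not stronger than
    return any(f.stmt == fact.stmt and not fact.stronger_than(f) for f in kb)

def union_update(kb, fact):
    # modified union: a stronger fact replaces an equivalent weaker one
    kb[:] = [f for f in kb
             if not (f.stmt == fact.stmt and fact.stronger_than(f))]
    if not contains(kb, fact):
        kb.append(fact)

kb = [Fact("B(J)", default=True)]              # generated (default) fact
union_update(kb, Fact("B(J)", default=False))  # asserted fact arrives
print([(f.stmt, f.default) for f in kb])       # [('B(J)', False)]
```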
Contraction Triggering
• An inconsistent knowledge base contains either a true clash or a default clash.
  • True Clash – a clash between two non-default statements
  • Default Clash – a clash involving at least one default statement
• A knowledge base that contains only default clashes can be contracted by removing all default facts.
  • Default facts can be rebuilt using the new state of the KB
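The contraction trigger can be sketched as a retract-then-regenerate pass; `has_default_clash` and `generate_defaults` are hypothetical stand-ins for the reasoner's clash detection and the constraint-generation step:

```python
# Toy sketch of contraction: on a default clash, retract every default fact,
# then regenerate defaults from the new state of the KB.
class Fact:
    def __init__(self, stmt, default=False):
        self.stmt, self.default = stmt, default

def contract(kb, has_default_clash, generate_defaults):
    if has_default_clash(kb):
        kb = [f for f in kb if not f.default]  # retract all default facts
        kb.extend(generate_defaults(kb))       # rebuild from the new state
    return kb

kb = [Fact("A(I)"), Fact("domain(P) = A", default=True)]
kb = contract(kb,
              has_default_clash=lambda kb: True,  # assume a clash fired
              generate_defaults=lambda kb: [])    # regeneration stubbed out
print([f.stmt for f in kb])  # ['A(I)']
```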
Reasoner Completeness
Extends Baader and Nutt’s Completeness of Tableau Reasoner
Reasoner Soundness
Extends Baader’s Soundness Theorem
Assume the transformation rules defined for a non-default description logic are truth-preserving. Assume the ABox S′ is obtained from a finite set of ABoxes S by application of a transformation rule, including the modified contains and union operations. Then S is consistent if and only if S′ is.
Reasoner Example
Reasoner
* indicates defaultness
Reasoner Conclusion
• Default rules can create clashes
  • True clashes are different from default clashes
  • Default clashes can be contracted and resolved
• Defaultness can be propagated through inference
  • Modify the inference rules, contains, and union
  • Sound and complete
Conclusion
• Constraints can be generated
  • Disjunction – most specific, but slow
  • LCNS – tends to over-generalize, slowest
  • Vivification – balanced generalization, fast
• Default Reasoning
  • Track defaultness
  • Retract default statements
  • Balanced by efficient generation and reasoning
Future Work
• Future Work
  • Investigate individual assertions
  • Extend to support OWL 1.1 domain/range pairings
  • Use external data sources (e.g. Cyc, WordNet) to improve constraint generation
  • Investigate application to improving search performance and results
  • Extend default reasoning to support SWRL terminology rules.
Final Words
• Thesis statement was supported
  • An algorithm for constraint generation was described
  • Its impact on reasoner performance was assessed
  • Default reasoning, sufficient for this work, was described
• Expected outcomes were met
  • A set of tools to generate property constraints was created
  • A qualitative assessment of generation was applied to all available ontologies
  • A default reasoner using the described rules was implemented