Constraint Generation and Reasoning in OWL
Dissertation Defense
Thomas H. Briggs, VI
Advisor: Dr. Yun Peng
University of Maryland, Baltimore County
Introduction
• Property Constraints
  • Important to defining the semantics of an ontology
  • Properties may have domain / range constraints
  • Global consequences from local assertions
  • 75% of properties are unconstrained
• Property Constraint Generation
  • Uses information in the ontology to generate constraints
  • Can be used to determine missing, suggest new, or analyze existing constraints
  • Creates default knowledge that must be treated differently than other asserted or inferred knowledge.
Thesis
The purpose of this research is to investigate methods for generating a property's domain and range constraints from its defining ontology and to evaluate the quality of this generation. This work will also investigate the default reasoning necessary to support generated constraints. A specific focus will be on the management of default facts in the knowledge base, including tracking default facts and efficient retraction operations to restore consistency.
Research Outcomes
• Outcomes of this work are:
  • An algorithmic framework to generate and evaluate domain and range constraints,
  • A quantitative comparison of the relationship between generated and specified constraints, and
  • An inference procedure that will enable a limited form of default reasoning that maintains the completeness and correctness of OWL reasoners.
Description Logics
• Description Logics:
  • are a branch of crisp logics
  • include well-researched languages (AL, CLASSIC, RACER)
  • have a long history
  • are the basis of the Semantic Web
  • have fast and efficient reasoners for some DLs (FaCT, Pellet)
Description Logics
• Describe some world by defining classes, properties, and individuals
  • Classes define types of individuals
  • Properties define relationships between individuals
  • Individuals are things that are instances of classes, and are related to other individuals through properties.
• Similar to first-order logic
Constraints
• An assertion about the types of the fillers of a property
  • The subject is a type of the property's domain
  • The object is a type of the property's range
  • Unconstrained defaults to Thing / Top
• Different interpretation than in traditional languages
  • Constraints define valid types of individuals
  • Traditional languages may force a type cast, but error otherwise
void foo(double z) { printf("%f\n", z); }

char x[] = "33.0";
foo(x);   /* Error: strings cannot be doubles! */
teaches: domain(Teacher), range(Student)
teaches(Adam, Bill)
Adam is a teacher, Bill a student
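The teaches example above can be sketched to show the contrast with the C code: a DL reasoner infers types from the domain/range axiom rather than rejecting the assertion. The dict-based KB below is a toy assumption for illustration, not real OWL tooling:

```python
# Toy sketch: domain/range axioms let a DL reasoner *infer* types for the
# subject and object of a property assertion, instead of raising a type error.
def apply_domain_range(assertions, domain, range_):
    types = {}
    for subj, obj in assertions:
        types.setdefault(subj, set()).add(domain)   # subject gets the domain type
        types.setdefault(obj, set()).add(range_)    # object gets the range type
    return types

types = apply_domain_range([("Adam", "Bill")], "Teacher", "Student")
print(types)  # {'Adam': {'Teacher'}, 'Bill': {'Student'}}
```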
Open World Assumption

• Open World Assumption (OWA)
  • Anything that isn't asserted is considered unknown.
  • Leads to monotonicity in the reasoner.
• Closed World Assumption (CWA)
  • Assume all facts are known
  • Default knowledge

hasChild(ALICE, BOB)
Does Alice have exactly one child?
Closed World: Yes!  Open World: Unknown.
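The hasChild example can be sketched as two query modes over the same fact set (a toy triple store, assumed for illustration):

```python
# Toy sketch: the same KB answers the question differently under
# closed-world vs open-world semantics.
facts = {("hasChild", "ALICE", "BOB")}

def children(person):
    return {o for (p, s, o) in facts if p == "hasChild" and s == person}

def has_exactly_one_child(person, closed_world):
    if closed_world:
        # CWA: anything not asserted is false, so Bob is the only child
        return len(children(person)) == 1
    # OWA: other, unasserted children may exist
    return "unknown"

print(has_exactly_one_child("ALICE", closed_world=True))   # True
print(has_exactly_one_child("ALICE", closed_world=False))  # unknown
```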
Unique Name Assumption
• Assumption that the name of an item is sufficient to make it unique (UNA).
  • We make this assumption for classes and properties
  • We do not make it for individuals
True only when they are the same individual.

Open World Assumption – because we didn't say they were different, the reasoner can conclude that they are the same to make the model true.
Constraint Generation
Unconstrained Properties
• Domain and range assert types to the fillers of a property
• Unconstrained properties lack these type assertions
• Possible reasons:
  • The information is unknown
  • An artifact of an ontology generator
  • Avoiding conflicts with reuse
  • Faulty semantics
Constraint Generation
• Unconstrained properties are a problem
• Constraint generation is a non-trivial process:
  • Omitted constraints may or may not be intentional
  • Open World Assumption – information may not be there
• Two sources of information on constraints:
  • ABox
  • TBox
ABox Generation
• ABox generation is problematic
  • Depends on individuals' class membership
  • Individuals may not be defined / UNA
  • Individuals frequently do not have a complete set of class assertions
  • Class assertions overlap

What should the domain and range of drives be?
TBox Generation
• Terminology provides the definition of the relationships between classes.
Generation Lemma:
Class: Vehicle
  SubClassOf: Thing and (drivenBy some Person)
Class: Civic
  SubClassOf: Thing and (madeBy only HONDA) and (drivenBy some Person)

Domain must subsume: Vehicle union Civic

Candidate domains: Vehicle? Vehicle or Civic? Vehicle or X? X?
Finding “Best”
• Using terminology to find the "best" constraint
  • Intractable – exponential growth
  • Requires a utility function to measure goodness
  • Requires future knowledge or omniscience
Generation Methods
• Generation Methods
  • Construct a constraint that satisfies the Generation Lemma
• Three Generation Methods
  • Disjunction Method
  • Least Common Named Subsumer
  • Vivification
Disjunction
• Based on the Generation Lemma
• Computes the Least Common Subsumer (LCS)
  • In languages with disjunction, the LCS is simply the disjunction of the concepts
  • Generation time is linear w.r.t. the number of classes and properties
  • Reasoning time is exponential.
Disjunction Algorithm
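The disjunction method can be sketched over a toy TBox; the dict-of-restrictions representation is an assumption for illustration, not the dissertation's actual data structures:

```python
# Toy sketch of the disjunction method: the domain of a property is the
# disjunction of all classes whose definition restricts that property.
def disjunction_domain(tbox, prop):
    users = [cls for cls, restrictions in tbox.items()
             if any(p == prop for p, _ in restrictions)]
    if not users:
        return "Thing"              # unused property: default to Top
    return " or ".join(sorted(users))

tbox = {
    "Vehicle": [("drivenBy", "some Person")],
    "Civic":   [("drivenBy", "some Person"), ("madeBy", "only HONDA")],
    "Person":  [],
}
print(disjunction_domain(tbox, "drivenBy"))  # Civic or Vehicle
```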
Disjunction Example
Domain for P:
Range for P: C
Disjunction Discussion
• Disjunction is good because
  • It is simple to compute
  • It is the most specific / accurate statement of the constraint
• Disjunction is bad because
  • It does not add useful information
  • Disjunction adds non-determinism to the reasoner
Least Common Named Subsumer
• Select a named concept that subsumes the concepts
  • Trade-off: gives up specificity in the concept description
  • Quality depends on the existence of named concepts
  • May be expensive to compute
• Runtime is
LCNS Algorithm
Subsumption checking is the dominant cost.
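A minimal sketch of the LCNS computation over a toy named-class hierarchy; the child-to-parent map is assumed to be a tree here for simplicity, whereas a real ontology needs full subsumption checking:

```python
# Toy sketch of Least Common Named Subsumer: intersect the ancestor chains
# of all input classes and return the deepest common named ancestor.
def ancestors(cls, parent):
    chain = [cls]
    while cls in parent:
        cls = parent[cls]
        chain.append(cls)
    return chain

def lcns(classes, parent):
    if not classes:
        return "Thing"
    common = set(ancestors(classes[0], parent))
    for c in classes[1:]:
        common &= set(ancestors(c, parent))
    if not common:
        return "Thing"
    # "least" = the common ancestor deepest in the hierarchy
    return max(common, key=lambda c: len(ancestors(c, parent)))

parent = {"Civic": "Car", "Car": "Vehicle", "Truck": "Vehicle",
          "Vehicle": "Thing"}
print(lcns(["Civic", "Truck"], parent))  # Vehicle
```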
LCNS Example
LCNS Domain of P: A
LCNS Range of P: C
Disjunction Domain of P:
LCNS Discussion
• LCNS is good because• It selects a named class in the ontology• Runtime bound to cost for subsumption
checking• Generalizes concepts from disjunction
• LCNS is bad because• Requires existence of a named class or LCNS is
Thing• Tends to over-generalize in other case as well• Over-generalization discards too much
information
Vivification
• Balances specificity and over-generalization
• First proposed by Cohen & Hirsh, 1992
• The difference here is partial absorption
• Starts with the disjunction and, using the inheritance relationship, summarizes terms with common direct super-classes.
• Only terms that do not share a common super-class remain in the disjunction
Absorption
• Moderates the generalization process
• Uses the class inheritance structure for its operation
Vivification Algorithm
Vivify a concept list L for a given absorption criterion β.

Perform Absorption
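The vivification step can be sketched as partial absorption over a toy hierarchy; the `beta` threshold used here (minimum number of siblings needed to absorb) is a simplified stand-in for the slide's absorption criterion β:

```python
# Toy sketch of vivification: disjuncts that share a direct named superclass
# are absorbed into that superclass; outliers stay in the disjunction.
from collections import defaultdict

def vivify(disjuncts, parent, beta=2):
    by_parent = defaultdict(list)
    for c in disjuncts:
        by_parent[parent.get(c, "Thing")].append(c)
    result = []
    for p, children in by_parent.items():
        if len(children) >= beta and p != "Thing":
            result.append(p)          # absorb siblings into the superclass
        else:
            result.extend(children)   # keep outliers as-is
    return sorted(result)

parent = {"Car": "Vehicle", "Truck": "Vehicle", "Boat": "Craft"}
print(vivify(["Car", "Truck", "Boat"], parent))  # ['Boat', 'Vehicle']
```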
Vivification Example
Property P is used in the definition of the three yellow classes.
Disjunction Domain:
LCNS Domain:
Vivification Domain:
Vivification Discussion
• Vivification is good because
  • It creates general concepts that summarize over common super-classes, selecting named subsumers
  • It preserves outliers
  • It is fast
• Vivification is bad because
  • Disjunctions may remain after summarization
  • It depends on the completeness of the terminology
  • It ignores individual assertions
Results
Results - Domain
Row 1 – The generated constraint was equal to the originally specified one. A positive outcome: the constraint was correctly generated with equal specificity.
Domain Results

                                            Disjunction       LCNS              Vivification
   Relationship                             # props    %      # props    %      # props    %
 1 Original Equals Generated                    801   2.8%        833   2.9%        808   2.8%
 2 Original More Specific Than Generated          7   0.0%          7   0.0%         63   0.2%
 3 Original More General Than Generated         141   0.5%        103   0.4%         74   0.3%
 4 Original and Generated Top                   800   2.8%       1111   3.8%        807   2.8%
 5 Original Top, Generated More Specific       2427   8.4%       2112   7.3%       2414   8.4%
 6 Generated Top, Original More Specific         27   0.1%         71   0.2%         25   0.1%
 7 Property Unused, Original Specified         3201  11.1%       3204  11.1%       3190  11.0%
 8 Property Unused, Original Unspecified      21385  74.0%      21406  74.1%      21267  73.6%
 9 Processor Failed                              64   0.2%         46   0.2%        201   0.7%
10 Reasoner Failed                               49   0.2%          9   0.0%         53   0.2%
   Total                                      28902             28902             28902
Row 2 – Original more specific than generated. In all of these cases the generated constraint subsumed the original, making the original more specific than the generated one.
Row 3 – Original more general than generated. A negative-to-neutral outcome: the original constraint was more general than its present usage.
Row 4 – Original and generated Top. Both the original and the generated concepts were Top. This is a special case of row 1, where the concepts are equal.
Row 5 – Original Top, generated more specific. Strongly positive results: a constraint was generated for a property that previously lacked one.
Row 6 – Generated Top, original more specific. A neutral-to-negative result: a constraint was generated as Top when the original was not Top. One example was an ontology that defined hasAunt as the union of Niece and Nephew, which was equivalent to Person, and Person was equivalent to everything; hence the generated constraint was Top.
Rows 7–8 – Property unused. Neutral results: a constraint could not be generated because there were no role restrictions to define it.
Rows 9–10 – Processor or reasoner failed. There was a runtime failure of the processor or the reasoner.
Results – Range
Range results were similar to domain.
Range Results

                                            Disjunction       LCNS              Vivification
   Relationship                             # props    %      # props    %      # props    %
 1 Original Equals Generated                    231   0.8%        248   0.9%        255   0.9%
 2 Original More Specific Than Generated          6   0.0%          6   0.0%         17   0.1%
 3 Original More General Than Generated         172   0.6%        147   0.5%        138   0.5%
 4 Original and Generated Top                   647   2.2%        930   3.2%        657   2.3%
 5 Original Top, Generated More Specific       2113   7.3%       1839   6.4%       2097   7.3%
 6 Generated Top, Original More Specific        361   1.2%        392   1.4%        365   1.3%
 7 Property Unused, Original Specified         3403  11.8%       3428  11.9%       3416  11.8%
 8 Property Unused, Original Unspecified      21824  75.5%      21834  75.5%      20959  72.5%
 9 Processor Failed                             102   0.4%         63   0.2%        955   3.3%
10 Reasoner Failed                               43   0.1%         15   0.1%         43   0.1%
   Total                                      28902             28902             28902
Results - Normalized
Generation strategies created improved constraints almost 80% of the time.
Vivification created constraints nearly as specific as Disjunction.
Domain

                                          Disjunction       LCNS              Vivified
Relationship                              # props    %      # props    %      # props    %
Original Equals Generated                     801  19.1%        833  19.7%        808  19.6%
Original More General Than Generated          141   3.4%        103   2.4%         74   1.8%
Original Top, Generated Top                   800  19.1%       1111  26.3%        807  19.5%
Original Top, Generated More Specific        2427  57.8%       2112  49.9%       2414  58.5%
Generated Top, Original More Specific          27   0.6%         71   1.7%         25   0.6%
Total                                        4196              4230              4128

Range

                                          Disjunction       LCNS              Vivified
Relationship                              # props    %      # props    %      # props    %
Original Equals Generated                     231   6.6%        248   7.0%        255   7.3%
Original More General Than Generated          172   4.9%        147   4.1%        138   3.9%
Original Top, Generated Top                   647  18.4%        930  26.2%        657  18.7%
Original Top, Generated More Specific        2113  60.0%       1839  51.7%       2097  59.7%
Generated Top, Original More Specific         361  10.2%        392  11.0%        365  10.4%
Total                                        3524              3556              3512
Results - Runtime
Time – Load / Reasoning Time

Algorithm       min (s)   max (s)   mean (s)   std. dev (s)
No generation      3.80     14.62       5.29           1.71
LCS                3.85     23.37       5.43           2.29
LCNS               3.79     23.92       5.34           2.20
Vivification       3.79     22.56       5.29           2.08

Time – Load, Reason, Generate, Build, Reason – 1000 Ontologies

Algorithm       mean (s)   std. dev (s)
LCS                 0.22           1.16
LCNS                0.14           0.14
Vivification        0.08           0.82
Vivification was faster than Disjunction at a 92.6% confidence level.

Vivification was faster than LCNS at a 76.4% confidence level.
Hypothesis Testing
Results – Discussion
• Generation
  • Removing unused properties gives a better picture of the future as technologies mature.
  • Generation is a viable method
• Vivification was the dominant method
  • Generated constraints with near-equal specificity to LCS
  • Able to generalize at appropriate times
  • Avoided the over-generalization of LCNS
  • All-around best performance for generation and reasoning
Default Reasoning
• Monotonicity
  • One goal of OWL is to maintain monotonicity – the property of a reasoner that adding new facts to the knowledge base does not cause existing facts to be retracted.
• Default Knowledge / Rules
  • Default knowledge and rules about the terminology make use of closed-world semantics and give up monotonicity.
  • A default rule may conflict with future statements
  • Statements must then be retracted.
Contraction
• When a clash occurs in a knowledge base with default statements, those default facts must be removed to restore consistency. This is called a contraction.
• How to tell default from non-default?
  • Inference leads to a multi-path problem
  • Default and non-default facts can be used to infer new facts
  • Default facts may block non-default facts from being generated
Default Example
Before generation:

Class: A  SubClassOf: Thing, P some B
Class: B  SubClassOf: Thing
Class: C  SubClassOf: Thing
ObjectProperty: P
  Domain: Thing
  Range: Thing
Individual: J
Individual: I
Facts: P(I,J)

After property generation, the domain and range on P are generated / default:

Class: A  SubClassOf: Thing, P some B
Class: B  SubClassOf: Thing
Class: C  SubClassOf: Thing
ObjectProperty: P
  Domain: A
  Range: B
Individual: J
  Types: B
Individual: I
  Types: A
Facts: P(I,J)

What if the domain expert adds C SubClassOf: P some B? Now the domain of P is generated as A union C, and I can no longer be inferred to be in A!
Modifications
• Default Descriptor
  • Indicates the defaultness of a statement or assertion.
  • Does not change the meaning of the term
• Inference
  • Inference rules are modified to propagate the descriptor
  • A non-default statement must replace a default statement
Concept Strength
• Concept strength between concepts C and D
• Strength relationship:
  • If C is default and D is not, then C is weaker than D
  • If C and D have the same defaultness, then they are equal
  • If C is not default and D is, then C is stronger than D
Reasoner
• The reasoner was implemented from transformation rules such as:
• It depends on the contains and union operations.
Modified Reasoner
• The reasoner's rules are modified
• Contains Rule
  • The contains(x) predicate is modified: return true if A contains some y such that y = x and x is not stronger than y.
• Union Update Procedure
  • The union operator that updates the KB is modified to replace any equivalent weaker term with a stronger term
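The modified contains and union operations can be sketched with a per-fact default flag; this toy representation is an assumption for illustration, not the implemented reasoner:

```python
# Toy sketch: facts carry a default flag; a non-default (asserted) fact is
# "stronger" than a default one and replaces it on update.
class Fact:
    def __init__(self, stmt, default):
        self.stmt, self.default = stmt, default
    def stronger_than(self, other):
        # non-default beats default; equal defaultness means equal strength
        return (not self.default) and other.default

def contains(kb, fact):
    # modified contains: true only if the KB holds an equivalent fact that
    # `fact` is not stronger than
    return any(f.stmt == fact.stmt and not fact.stronger_than(f) for f in kb)

def union_update(kb, fact):
    # modified union: a stronger fact replaces an equivalent weaker one
    kb[:] = [f for f in kb
             if not (f.stmt == fact.stmt and fact.stronger_than(f))]
    if not contains(kb, fact):
        kb.append(fact)

kb = [Fact("B(J)", default=True)]              # generated (default) fact
union_update(kb, Fact("B(J)", default=False))  # asserted fact arrives
print([(f.stmt, f.default) for f in kb])       # [('B(J)', False)]
```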
Contraction Triggering
• An inconsistent knowledge base contains either a true clash or a default clash.
  • True Clash – a clash between two non-default statements
  • Default Clash – a clash involving at least one default statement
• A knowledge base that contains only default clashes can be contracted by removing all default facts.
  • Default facts can be rebuilt using the new state of the KB
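The contraction trigger can be sketched as a retract-then-regenerate pass; `has_default_clash` and `generate_defaults` are hypothetical stand-ins for the reasoner's clash detection and the constraint-generation step:

```python
# Toy sketch of contraction: on a default clash, retract every default fact,
# then regenerate defaults from the new state of the KB.
class Fact:
    def __init__(self, stmt, default=False):
        self.stmt, self.default = stmt, default

def contract(kb, has_default_clash, generate_defaults):
    if has_default_clash(kb):
        kb = [f for f in kb if not f.default]  # retract all default facts
        kb.extend(generate_defaults(kb))       # rebuild from the new state
    return kb

kb = [Fact("A(I)"), Fact("domain(P) = A", default=True)]
kb = contract(kb,
              has_default_clash=lambda kb: True,  # assume a clash fired
              generate_defaults=lambda kb: [])    # regeneration stubbed out
print([f.stmt for f in kb])  # ['A(I)']
```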
Reasoner Completeness
Extends Baader and Nutt’s Completeness of Tableau Reasoner
Reasoner Soundness
Extends Baader’s Soundness Theorem
Assume the transformation rules defined for a non-default description logic are truth-preserving. Assume the ABox S′ is obtained from a finite set of ABoxes S by application of a transformation rule, including the modified contains and union operations. Then S is consistent if and only if S′ is.
Reasoner Example
Reasoner
* indicates defaultness
Reasoner Conclusion
• Default rules can create clashes
  • True clashes are different from default clashes
  • Default clashes can be contracted and resolved
• Defaultness can be propagated through inference
  • Modify the inference rules, contains, and union
  • Sound and complete
Conclusion
• Constraints can be generated
  • Disjunction – most specific, but slow
  • LCNS – tends to over-generalize, slowest
  • Vivification – balanced generalization, fast
• Default Reasoning
  • Track defaultness
  • Retract default statements
  • Balanced by efficient generation and reasoning
Future Work
• Future Work
  • Investigate individual assertions
  • Extend to support OWL 1.1 domain/range pairings
  • Use external data sources (e.g. Cyc, WordNet) to improve constraint generation
  • Investigate application to improving search performance and results
  • Extend default reasoning to support SWRL terminology rules.
Final Words
• Thesis statement was supported
  • An algorithm for constraint generation was described
  • Its impact on reasoner performance was assessed
  • Default reasoning, sufficient for this work, was described
• Expected outcomes were met
  • A set of tools to generate property constraints was created
  • A qualitative assessment of generation was applied to all available ontologies
  • A default reasoner using the described rules was implemented