Post on 25-Feb-2016
Introduction to Protégé for Absolute Beginners
University at Buffalo
August 11-12, 2012
2
Goal and Content of Tutorial
• The goal of the tutorial is to explain how to translate ontologies into a language that can be processed by computers
• Three main sections by content:
– Overview of the Web Ontology Language (OWL)
– Hands-on training in Protégé, an OWL editor
– Overview of SPARQL Protocol and RDF Query Language (SPARQL), a query language for retrieving and modifying ontologically grounded information
3
IS THE GOAL WORTHWHILE?
4
The Current State of Data Integration on the Web
• Search engines return some remarkably precise results but the precision degrades as the topics become less standardized
5
A Query Containing Standardized Terms…
6
…Yields Very Good Results
7
But as the Terms Become Less Standardized…
8
…the Results Become Less Precise
9
The Current State of Data Integration in the Enterprise
• Using more than a single software application carries a risk of added cost to combine the information they create.
– Databases carry very little metadata about the content of information they contain
– Spreadsheets most often carry less
10
In the Social Network, Hashtags Cluster Information Into Categories
• But the ambiguities of language reappear in the categories
• The lack of rigor in relating one category to another is an obstacle to machine-based validation of usage.
11
The Value Added by OWL Ontologies to Data Integration
• Ontologies endow terms with machine processable definitions and disambiguate different senses of the same expression
• Ontologies place restrictions on how terms can be related to other terms so that misuse and inconsistencies can be detected.
12
The Ontologized Web, Enterprise and Social Network
• What if creators of web pages, databases, and blogs used terminology from curated ontologies to annotate their content?
• Standardized ways of describing the structures that represent data are already accepted; why not extend that acceptance to the annotation of content?
• Expected Benefits:
– The precision of search increases dramatically
– Data from different sources can be merged
– Gaps in information can be identified
– Falsehoods and incoherent expressions can be detected
13
OVERVIEW OF RESOURCE DESCRIPTION FRAMEWORK (RDF)
14
Resource Description Framework (RDF)
• Designed to be a language for making assertions about resources
• A Resource* is
– an electronic document, an image, a source of information with a consistent purpose
– not necessarily accessible via the Internet; e.g., human beings, corporations, and books in a library can also be resources.
– an abstract concept such as the operators and operands of a mathematical equation or types of a relationship (e.g., “parent” or “employee”)
*derived from RFC 3986-Uniform Resource Identifier (URI): Generic Syntax from http://tools.ietf.org/html/rfc3986
15
Expressing Information in RDF
• Statements are always expressed in the form of a triple: Subject – Predicate – Object (a.k.a. RDF Triple)
• Translating the statement “Austria’s GDP per capita is 30,500 Euros” into RDF requires breaking it into triples
Subject | Predicate | Object
Austria | has economic indicator | Austria’s GDP per capita
Austria’s GDP per capita | has value | 30,500 Euros
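As a rough illustration of the two triples above, a statement can be modeled as a (subject, predicate, object) tuple. This is only a sketch in plain Python, not an RDF library; the prefixed names are placeholders in the style of the slides rather than resolvable URIs.

```python
# Each RDF statement is modeled as a (subject, predicate, object) tuple.
# The prefixed names are illustrative placeholders, not real URIs.
triples = [
    ("dbpedia:Austria", "example:has_economic_indicator", "example:Austrian_GDPperCapita"),
    ("example:Austrian_GDPperCapita", "example:has_value", '"30,500Euros"'),
]

def objects_of(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Follow the two statements in sequence: Austria -> its indicator -> the value.
indicator = objects_of("dbpedia:Austria", "example:has_economic_indicator")[0]
value = objects_of(indicator, "example:has_value")
```

Chaining the two lookups recovers the original sentence: the object of the first triple is the subject of the second.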
16
Uniform Resource Identifiers (URIs) and Literals
• URIs are unique names of resources
– http://dbpedia.org/page/Austria
– http://en.wikipedia.org/wiki/Austria
• Literals
– can be a simple raw text value
– can be annotated with a language tag as in “Austria”@en
– can be typed with a datatype as in “30,500Euros”^^string
17
Rules for RDF Statements
• Subject and Predicate have to be URI named resources
• Object – can be either a URI named resource or a literal
18
Applying the Rules
Using “dbpedia:”, “ro:”, and “example:” as prefixes for http://dbpedia.org/page, http://www.obofoundry.org/ro, and http://www.myexample.com/resource respectively,
Which of the following are well-formed RDF statements?
Subject Predicate Object
dbpedia:Austria ro:part_of dbpedia:Europe
dbpedia:Austria ro:part_of “Europe”@en
“Europe” ro:has_part dbpedia:Austria
dbpedia:Austria “is trading partner with” dbpedia:Germany
dbpedia:Europe ro:part_of dbpedia:Austria
example:30500Euro example:is_value_of example:AustrianGDPperCapita
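One way to work through the exercise is to apply the two rules mechanically. The sketch below assumes, purely for illustration, that literals are written in double quotes and everything else is a URI-named resource. It checks only the form of a statement, not whether the statement is true.

```python
def is_literal(term):
    # Assumption for this sketch: literals start with a double quote,
    # optionally followed by a language tag such as @en.
    return term.startswith('"')

def is_well_formed(subject, predicate, obj):
    """Subject and predicate must be URI-named; the object may be either kind."""
    return not is_literal(subject) and not is_literal(predicate)

statements = [
    ("dbpedia:Austria", "ro:part_of", "dbpedia:Europe"),
    ("dbpedia:Austria", "ro:part_of", '"Europe"@en'),
    ('"Europe"', "ro:has_part", "dbpedia:Austria"),
    ("dbpedia:Austria", '"is trading partner with"', "dbpedia:Germany"),
    ("dbpedia:Europe", "ro:part_of", "dbpedia:Austria"),
    ("example:30500Euro", "example:is_value_of", "example:AustrianGDPperCapita"),
]
results = [is_well_formed(*statement) for statement in statements]
```

Under these rules the third and fourth rows fail (a literal subject and a literal predicate). The fifth row is well-formed even though it is false; RDF by itself cannot detect that.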
19
RDF Graphs
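[Figure: RDF graph. Nodes: dbpedia:Austria, example:Austrian_GDPperCapita, and the literal “30,500Euros”^^string. Edges: dbpedia:Austria → example:has_economic_indicator → example:Austrian_GDPperCapita → example:has_value → “30,500Euros”^^string]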
The direction of the edges is always away from the subject and towards the object of the statement
20
Graphing RDF
How would the following be represented in an RDF graph?
Subject Predicate Object
game1:MonopolyPlayer_1 rdf:type mnply:MonopolyPlayer
game1:MonopolyPlayer_1 mnply:has_role game1:MonopolyBanker_Game1
game1:MonopolyPlayer_1 mnply:represented_by game1:MonopolyTokenBoot_Game1
game1:MonopolyPlayer_1 mnply:competes_in game1:MonopolyGame_Game1
21
Graphing RDF
[Figure: RDF graph with game1:MonopolyPlayer_1 as the central node and four outgoing edges: rdf:type → mnply:MonopolyPlayer; mnply:has_role → game1:MonopolyBanker_Game1; mnply:represented_by → game1:MonopolyTokenBoot_Game1; mnply:competes_in → game1:MonopolyGame_Game1]
22
How far does RDF take us toward our goal?
• The value of RDF lies in the use of URIs, as it allows distinct information sources to share a common meaning for terms
– Every occurrence of the same URI is a reference to the same resource
• There is no inference with RDF, no way to validate use of URIs.
23
OVERVIEW OF RDF SCHEMA (RDFS)
24
RDF Schema (RDFS)
“RDF Schema defines classes and properties that may be used to describe classes, properties and other resources”*
RDFS defines terms that can describe classes of things and the relationships that hold between these classes
*RDF Vocabulary Description Language 1.0: RDF Schema from http://www.w3.org/TR/rdf-schema/
25
The Need for RDFS
• RDF can name, but not define, resources or the relationships that hold between them
• But what about…
Apples are a kind of fruit
Subject Predicate Object
dbpedia:Apple ex:is_kind_of dbpedia:Fruit
26
The Need for RDFS
• Machines cannot process elements of an expression that lie outside of RDF. To a machine our example looks like:
• We need language elements that enable a machine to process relationships between entities
Apples are a kind of fruit
Subject Predicate Object
tuvwxyz:Abcde ef:ij_klmn_op tuvwxyz:Fghij
27
RDFS Types
• Allows a resource to be typed as a class (i.e. a collection of individuals)
• Allows a class to be defined as a subclass of another class (i.e. all individuals that it contains are contained in the other)
• Allows a property to be defined as a subproperty of another property
28
RDFS Taxonomies
• Enables the creation of taxonomies of both classes and properties
[Figure: two trees. Class Taxonomy: Fruit > Apple > Cortland Apple, Gala Apple. Property Taxonomy: is related to > is sibling of > is brother of, is sister of]
29
RDFS Vocabulary
rdfs:Resource, rdfs:Class, rdfs:Literal, rdfs:Datatype, rdfs:range, rdfs:domain, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:label, rdfs:comment, rdfs:ContainerMembershipProperty, rdfs:member, rdfs:seeAlso, rdfs:isDefinedBy
30
RDFS Vocabulary in Action
• rdfs:subClassOf is used to assert that every instance of a class is an instance of another.
• If a resource is rdf:type dbpedia:Apple, a reasoner will assert that the resource is also rdf:type dbpedia:Fruit
[Figure: example:NewtonsApple → rdf:type → dbpedia:Apple → rdfs:subClassOf → dbpedia:Fruit; inferred edge: example:NewtonsApple → rdf:type → dbpedia:Fruit]
Apples are a kind of fruit
Subject Predicate Object
dbpedia:Apple rdfs:subClassOf dbpedia:Fruit
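The subclass inference can be sketched in a few lines of plain Python. This is not how a real reasoner is implemented; it assumes, for simplicity, that each class has at most one direct superclass, and the dictionaries stand in for a triple store.

```python
# Child class -> parent class (simplifying assumption: one parent per class).
sub_class_of = {"dbpedia:Apple": "dbpedia:Fruit"}
# Resource -> set of asserted rdf:type values.
types = {"example:NewtonsApple": {"dbpedia:Apple"}}

def infer_types(resource):
    """Return asserted types plus every superclass reachable via rdfs:subClassOf."""
    inferred = set(types.get(resource, ()))
    frontier = list(inferred)
    while frontier:
        parent = sub_class_of.get(frontier.pop())
        if parent is not None and parent not in inferred:
            inferred.add(parent)
            frontier.append(parent)
    return inferred

newton_types = infer_types("example:NewtonsApple")
```

Only dbpedia:Apple was asserted; dbpedia:Fruit is added by walking up the subclass chain.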
31
RDFS Vocabulary in Action
• rdfs:subPropertyOf is used to assert that every pair of resources that are related by a property are also related by another.
• If Ann is the sister of Ben and is sister of is a subproperty of is sibling of, then a reasoner will assert that Ann is a sibling of Ben
Every sister of a person is a sibling of that person
Subject Predicate Object
ex:is_sister_of rdfs:subPropertyOf ex:is_sibling_of
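The same idea applies to properties. In this sketch, ex:Ann and ex:Ben are illustrative individuals, and each property is assumed to have at most one direct superproperty.

```python
# Subproperty -> superproperty (simplifying assumption: one parent per property).
sub_property_of = {"ex:is_sister_of": "ex:is_sibling_of"}
asserted = {("ex:Ann", "ex:is_sister_of", "ex:Ben")}

def with_super_properties(triples):
    """Add a triple for every superproperty of each asserted property."""
    inferred = set(triples)
    frontier = list(triples)
    while frontier:
        s, p, o = frontier.pop()
        parent = sub_property_of.get(p)
        if parent is not None and (s, parent, o) not in inferred:
            inferred.add((s, parent, o))
            frontier.append((s, parent, o))
    return inferred

all_triples = with_super_properties(asserted)
```

The result contains the asserted sister statement and the inferred sibling statement.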
32
RDFS Vocabulary in Action
• rdfs:domain is used to assert that a property is always applied to instances of one or more classes.
• If Ann is related to Ben via the ex:is_sister_of property, a reasoner will assert that Ann is rdf:type ex:Female
[Figure: example:Ann → example:is_sister_of → example:Ben; inferred edge: example:Ann → rdf:type → example:Female]
Only females can be sisters of others
Subject Predicate Object
ex:is_sister_of rdfs:domain ex:Female
33
RDFS Vocabulary in Action
• rdfs:range is used to assert that the instances of the object of a property are always of one or more classes or datatypes
• If Newton’s apple is related to Newton’s apple tree via the ex:is_borne_by property, a reasoner will assert that Newton’s apple tree is rdf:type dbpedia:Plant
[Figure: example:Newton’sApple → example:is_borne_by → example:Newton’sAppleTree; inferred edge: example:Newton’sAppleTree → rdf:type → dbpedia:Plant]
Only plants can bear fruit
Subject Predicate Object
ex:is_borne_by rdfs:range dbpedia:Plant
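Domain and range work as inference rules rather than as input checks: a domain types the subject of a statement, and a range types its object. A minimal sketch, using the sister and apple tree examples with illustrative individual names:

```python
domain = {"ex:is_sister_of": "ex:Female"}      # property -> class of its subjects
range_ = {"ex:is_borne_by": "dbpedia:Plant"}   # property -> class of its objects

asserted = [
    ("example:Ann", "ex:is_sister_of", "example:Ben"),
    ("example:NewtonsApple", "ex:is_borne_by", "example:NewtonsAppleTree"),
]

def infer_types_from_properties(triples):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domain:
            inferred.add((s, "rdf:type", domain[p]))
        if p in range_:
            inferred.add((o, "rdf:type", range_[p]))
    return inferred

inferred_types = infer_types_from_properties(asserted)
```

Note that nothing is rejected here. If Ben were asserted to be the sister of someone, this rule would simply type Ben as ex:Female.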
34
RDFS Vocabulary in Action
• rdfs:label is used to provide a human readable version of a resource’s name.
• If a GUID is used as the identifier for the class of Apple, then use rdfs:label to assign as many human readable versions as desired.
Subject Predicate Object
ex:EXO_0002032 rdfs:label “Apple”@en
ex:EXO_0002032 rdfs:label “Manzana”@es
ex:EXO_0002032 rdfs:label “Mela”@it
35
RDFS Vocabulary in Action
• rdfs:comment is used to provide a human-readable description of a resource
Both comments are reused from http://dbpedia.org/page/Apple
Subject Predicate Object
dbpedia:Apple rdfs:comment “The apple is the pomaceous fruit of the apple tree, species Malus domestica in the rose family”@en
dbpedia:Apple rdfs:comment “La mela è il frutto (più precisamente si tratta di un falso frutto a pomo) del melo.”@it
36
RDFS Vocabulary in Action
• rdfs:seeAlso is used to assert that a resource provides additional information about the subject resource.
Subject Predicate Object
dbpedia:Apple rdfs:seeAlso wiki:Apple
dbpedia:Apple rdfs:seeAlso ex:Apple
37
RDFS Vocabulary in Action
• rdfs:isDefinedBy is used to assert that a resource defines the subject resource.
Subject Predicate Object
dbpedia:Apple rdfs:isDefinedBy wiktionary:apple
dbpedia:Apple rdfs:isDefinedBy wordnet:apple
38
How far does RDFS take us toward our goal?
• Contains elements that enable machine inferencing on necessary conditions (e.g. Apples are the fruit of the apple tree)
• Doesn’t allow restrictions on classes that would enable inferencing on sufficient conditions (e.g. every fruit of the apple tree is an apple)
• Doesn’t provide a way to exclude resources from class membership, can’t validate assertions.
39
OVERVIEW OF THE WEB ONTOLOGY LANGUAGE (OWL)
40
Web Ontology Language (OWL*)
• OWL descends from knowledge representation languages of the 1990s, such as Simple HTML Ontology Extensions (SHOE) and Ontology Inference Layer (OIL), and from the DARPA Agent Markup Language (DAML)
• The initial version of OWL became a formal W3C Recommendation on February 10, 2004
• OWL 2 became a W3C Recommendation on October 27, 2009
* why “OWL” instead of “WOL” http://lists.w3.org/Archives/Public/www-webont-wg/2001Dec/0169.html
41
The Need for OWL
• RDFS lacks the expressive power to allow inferences about individuals beyond their class membership.
• Based on the equivalence asserted in the statements below, a machine can infer only that the two classes have the same instances.
• We want to enable a machine to infer the attributes of an individual based upon the definition of the class of which it is a member
Subject Predicate Object
dbpedia:Apple rdf:type rdfs:Class
dbpedia:Apple rdfs:subClassOf ex:FruitOfAppleTree
ex:FruitOfAppleTree rdf:type rdfs:Class
ex:FruitOfAppleTree rdfs:subClassOf dbpedia:Apple
42
OWL Usage
“The W3C OWL 2 Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be reasoned with by computer programs either to verify the consistency of that knowledge or to make implicit knowledge explicit.”*
* http://www.w3.org/TR/owl2-primer/
43
Defining Classes - Enumeration
Use owl:oneOf to enumerate the members of a class
In Manchester Syntax
Class: MonopolyToken
    EquivalentTo: {Battleship, Boot, Car, Dog, Thimble, Top_Hat, Wheelbarrow, Iron}
    SubClassOf: Thing
44
Defining Classes - Restrictions
• owl:Restriction creates a class defined using an object property and either:
– a value constraint which places a constraint on the range of the property when applied to this particular class
  • e.g. the rdfs:range of the is_borne_by property might be plant, but when defining apple we would constrain the range to the class of apple trees
– a cardinality constraint which places a constraint on the number of values a property can take in the context of a particular class
  • e.g. there can be no more than 8 players in a game of Monopoly
45
Additional Inferences Gained Through Restrictions
Without a restriction all that can be inferred about an improved property is that it must also be a property
Class: MonopolyImprovedProperty
    SubClassOf: MonopolyProperty
Adding a restriction adds the information that an improved property must be a property and that it must be the location of some building
Class: MonopolyImprovedProperty
    EquivalentTo: location_of some MonopolyBuilding
    SubClassOf: MonopolyProperty
46
rdfs:subClassOf vs. owl:equivalentClass
[Figure: two panels, each relating the classes “improved property” and “property that is the location of a building” and asking (“?”) what follows from “Virginia Place is the location of House 1”. Panel 1: “improved property” is a subclass of “property that is the location of a building”. Panel 2: “improved property” is an equivalent class of “property that is the location of a building”.]
47
owl:allValuesFrom vs. owl:someValuesFrom
• owl:allValuesFrom constrains the object property so that its value must come from the specified class or data range
– Example: A mortgaged property is one such that it is owned only by the bank
• owl:someValuesFrom constrains the object property so that at least one of its values must come from the specified class or data range
– Example: An improved property is the location of some building
48
owl:hasValue
• The owl:hasValue constraint limits an object property to a given value, which can be either an individual or a data value. For example, we could use this constraint to assert that all Monopoly railroads have a price of 200.
Class: MonopolyRailroad
    SubClassOf: has_price value 200, MonopolyProperty
• Given a resource that is a Monopoly Railroad, a reasoner will infer that its price is 200.
[Figure: game1:ReadingRailroad → rdf:type → mnply:MonopolyRailroad → rdfs:subClassOf → (mnply:has_price = 200); inferred edge: game1:ReadingRailroad → mnply:has_price → 200]
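The railroad inference can be approximated as a lookup from class membership to a property value. This is a sketch under the assumption that each class carries at most one owl:hasValue restriction; the dictionaries are stand-ins for an ontology, not an OWL API.

```python
# Class -> (property, value) taken from its owl:hasValue restriction.
has_value = {"mnply:MonopolyRailroad": ("mnply:has_price", 200)}
# Resource -> set of asserted rdf:type values.
types = {"game1:ReadingRailroad": {"mnply:MonopolyRailroad"}}

def infer_property_values(resource):
    """Return the (property, value) pairs implied by the resource's classes."""
    return {has_value[c] for c in types.get(resource, ()) if c in has_value}

reading_railroad_values = infer_property_values("game1:ReadingRailroad")
```

No price was asserted for game1:ReadingRailroad directly; it follows from the class definition.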
49
owl:hasValue
• To define the class of New York City building we can use owl:hasValue on the property of located_in and the individual NewYorkCity
Class: NewYorkCityBuilding
    SubClassOf: located_in value NewYorkCity, Building
• Given a resource that is a New York City building, a reasoner will infer that its location is New York City.
[Figure: example:EmpireStateBuilding → rdf:type → example:NewYorkCityBuilding → rdfs:subClassOf → (example:located_in NYC); inferred edge: example:EmpireStateBuilding → example:located_in → example:NewYorkCity]
50
Cardinality Constraints
• Useful in expressing that a class has an exact number of relationships to another class or data range.
Example: A turn has exactly one player as a participant and exactly one integer as its ordinal value
Class: MonopolyTurn
    Annotations: rdfs:label "Monopoly turn"^^xsd:string
    SubClassOf:
        has_ordinal_value exactly 1 xsd:integer,
        has_participant exactly 1 MonopolyPlayer,
        occurs_containing some MonopolyRollOfDice,
        occurs_during some MonopolyRound,
        MonopolyEvent
51
Cardinality Constraints
• Can also express that the number of instances of a given relationship between a class and another class or data range can span a range of values
Example: A color group can have between 2 and 3 properties as members.
Class: MonopolyColorGroup
    SubClassOf:
        owl:Thing,
        (has_member min 2 MonopolyProperty) and (has_member max 3 MonopolyProperty)
52
Set Operators
• owl:intersectionOf - a class is formed from the individuals that are common to two or more classes
• owl:unionOf – a class is formed from the individuals that are in any of two or more classes
• owl:complementOf – a class is formed from the individuals that are not members of a class
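The three operators correspond closely to ordinary set operations. The sketch below uses Python sets with hypothetical individuals and classes. One caveat: the complement here is taken relative to a fixed list of known individuals, whereas an OWL complement ranges over everything that exists.

```python
# Hypothetical individuals and classes, chosen only for illustration.
known_individuals = {"Boardwalk", "ParkPlace", "ReadingRailroad", "WaterWorks"}
street = {"Boardwalk", "ParkPlace"}
mortgaged = {"ParkPlace", "ReadingRailroad"}

mortgaged_street = street & mortgaged       # owl:intersectionOf
street_or_mortgaged = street | mortgaged    # owl:unionOf
# owl:complementOf, approximated over the known individuals only
not_street = known_individuals - street
```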
53
owl:equivalentClass and owl:disjointWith
• owl:equivalentClass establishes that two classes have the same instances
– this is similar to the owl:sameAs that establishes that two classes have the same intension
• owl:disjointWith establishes that two classes have no members in common
54
Defining Properties - Subtypes
• Object Property – used to link individuals to individuals
• Datatype Property – used to link individuals to data values
• Annotation Property – used to link ontology elements to metadata
55
Defining Properties – Relations to Other Properties
• owl:equivalentProperty – behaves similarly to owl:equivalentClass, two properties are equivalent if and only if they have the same members (i.e. they have the same extension)
• owl:inverseOf – if x is related to y by property A and property A is the inverse of property B, then y is related to x by property B
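A small sketch of the inverse rule follows. It assumes, for illustration only, that ro:part_of and ro:has_part have been declared inverses of each other.

```python
# Assumption for this sketch: the two properties are declared mutual inverses.
inverse_of = {"ro:part_of": "ro:has_part", "ro:has_part": "ro:part_of"}
asserted = {("dbpedia:Austria", "ro:part_of", "dbpedia:Europe")}

def with_inverses(triples):
    """Add the reversed statement for every property that has a declared inverse."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in inverse_of:
            inferred.add((o, inverse_of[p], s))
    return inferred

all_triples = with_inverses(asserted)
```

From the single asserted statement, the reversed statement about Europe having Austria as a part is derived.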
56
Defining Properties – Cardinality Constraints
• owl:FunctionalProperty is used to place a uniqueness constraint on the value of the range of a property for each value in the domain of that property.
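[Figure: game1:MonopolyPlayer_1 → mnply:is_represented_by → game1:MonopolyTokenRailroad_1 and game1:MonopolyPlayer_1 → mnply:is_represented_by → game1:MonopolyTokenRailroad_2; inferred: game1:MonopolyTokenRailroad_1 owl:sameAs game1:MonopolyTokenRailroad_2]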
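When a functional property appears with two different values for one subject, a reasoner does not report a violation; it concludes that the two values name the same thing. The following sketch approximates that behavior in plain Python. The function name is made up for this example.

```python
def same_as_from_functional(triples, functional_property):
    """Infer owl:sameAs between objects that share a subject via a functional property."""
    values = {}
    for s, p, o in triples:
        if p == functional_property:
            values.setdefault(s, set()).add(o)
    same = set()
    for objects in values.values():
        ordered = sorted(objects)
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                same.add((first, "owl:sameAs", second))
    return same

asserted = [
    ("game1:MonopolyPlayer_1", "mnply:is_represented_by", "game1:MonopolyTokenRailroad_1"),
    ("game1:MonopolyPlayer_1", "mnply:is_represented_by", "game1:MonopolyTokenRailroad_2"),
]
same_tokens = same_as_from_functional(asserted, "mnply:is_represented_by")
```

For an inverse functional property the same idea applies with subject and object swapped.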
57
Defining Properties – Cardinality Constraints
• owl:InverseFunctionalProperty is used to place a uniqueness constraint on the value of the domain of a property for each value in the range of that property
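[Figure: game1:MonopolyPlayer_1 → mnply:is_represented_by → game1:MonopolyTokenRailroad_1 and game1:MonopolyPlayer_2 → mnply:is_represented_by → game1:MonopolyTokenRailroad_1; inferred: game1:MonopolyPlayer_1 owl:sameAs game1:MonopolyPlayer_2]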
58
Defining Properties – Logical Characteristics
• Symmetric Property – P is a symmetric property if aPb implies bPa
• Asymmetric Property – P is an asymmetric property if aPb implies not bPa
• Reflexive Property – P is a reflexive property if aPa holds for every a
• Irreflexive Property – P is an irreflexive property if aPa holds for no a
• Transitive Property – P is a transitive property if aPb and bPc imply aPc
59
A Few Examples
• The relationship of being adjacent to is symmetric
– If Mediterranean Avenue is adjacent to Go, then Go is adjacent to Mediterranean Avenue
• The relationship of having a role is asymmetric
– If Player_1 has the role of banker, then the role of banker does not have the role of Player_1
• The relationship of occurring prior to is transitive
– If Round_1 occurs prior to Round_2 and Round_2 occurs prior to Round_3, then Round_1 occurs prior to Round_3
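The symmetric and transitive cases can be sketched as closure operations over pairs of individuals. This is an illustrative fixed-point loop, not an efficient algorithm, and the individual names are written without prefixes for brevity.

```python
def symmetric_closure(pairs):
    """For every (a, b), also include (b, a)."""
    return set(pairs) | {(b, a) for a, b in pairs}

def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

adjacent_to = symmetric_closure({("MediterraneanAvenue", "Go")})
occurs_prior_to = transitive_closure({("Round_1", "Round_2"), ("Round_2", "Round_3")})
```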
60
Multi-typed Properties
Properties can be typed by more than one of the logical characteristics
ObjectProperty: adjacent_to
    Annotations: rdfs:label "adjacent to"^^xsd:string
    Characteristics: Irreflexive, Symmetric
    Domain: MonopolyBoardSpace
    Range: MonopolyBoardSpace
61
A COUPLE OF IMPORTANT ASSUMPTIONS
62
No Assumption of Unique Names
• There is no assumption that “two” resources with distinct names represent distinct entities
• This holds for any type of resource: class, property, datatype or instance
63
Open World Assumption
• Some data management systems use the Closed World assumption, meaning that if a fact is not found among the data in the system, it is assumed to be false.
– In a sales database, if the name “Steve Wozniak” does not appear in the customer table, then Mr. Wozniak is not a customer of that company
• In Semantic Web applications, the Open World assumption is used, meaning that if a fact is not found among a set of data it is not assumed to be false.
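The difference between the two assumptions shows up in what a query returns for a missing fact. The sketch below uses a hypothetical customer table; the three-valued Answer type is an illustration device, not part of any Semantic Web API.

```python
from enum import Enum

class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"

customers = {"Ada Lovelace"}  # hypothetical customer table

def is_customer_closed_world(name):
    """Closed world: absence from the data means the fact is false."""
    return Answer.TRUE if name in customers else Answer.FALSE

def is_customer_open_world(name):
    """Open world: absence from the data leaves the fact undetermined."""
    return Answer.TRUE if name in customers else Answer.UNKNOWN
```

Asked about “Steve Wozniak”, the closed-world function answers FALSE and the open-world function answers UNKNOWN; both answer TRUE for a name that is present.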