modeling with rdf rdf and tabular data consider the product database relation idmodel...
TRANSCRIPT
Modeling with RDF RDF and Tabular Data Consider the Product database relation
ID Model No. Division Product Line SKU
1 ZX-3 Manufacturing support Paper machine FB3524
2 ZX-3P Manufacturing support Paper machine KD5243
3 B-1430 Control Engineering Feedback line KS4520
4 B-1430X Control Engineering Feedback line CL5934
Each row describes a single entity
The type of each entity is the table name (Product)
Give each row a distinct URIref
The table gives us a locally unique identifier: the primary key (ID)
To get a globally unique identifier, use the URI for the database itself
Say, http://www.acme.mfg.com
Use the prefix mfg:
For an identifier for a row, concatenate the table name (“Product”) with the unique key
Express it in the mfg: namespace
mfg:Product1, mfg:Product2, …
The column labels correspond to properties
For globally unique identifiers, use same namespace as for individuals
For local names, concatenate table name and column label
mfg:Product_ModelNo, mfg:Product_Division, …
There’s 1 triple per table cell
Plus 1 triple per row for type info
In N3 (1st and 3rd rows only)
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix mfg: <http://www.acme.mfg.com/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
mfg:Product1 rdf:type mfg:Product .
mfg:Product1 mfg:Product_ID "1"^^xsd:integer .
mfg:Product1 mfg:Product_ModelNo "ZX-3" .
mfg:Product1 mfg:Product_Division "Manufacturing support" .
mfg:Product1 mfg:ProductLine "Paper machine" .
mfg:Product1 mfg:Product_SKU "FB3524" .
mfg:Product3 rdf:type mfg:Product .
mfg:Product3 mfg:Product_ID "3"^^xsd:integer .
mfg:Product3 mfg:Product_ModelNo "B-1430" .
mfg:Product3 mfg:Product_Division "Control engineering" .
mfg:Product3 mfg:ProductLine "Feedback line" .
mfg:Product3 mfg:Product_SKU "KS4520" .
In RDF/XML (1st and 3rd rows)<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:mfg="http://www.acme.mfg.com/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
<mfg:Product rdf:about="http://www.acme.mfg.com/Product1">
<mfg:Product_ID rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">1</mfg:Product_ID>
<mfg:Product_ModelNo>ZX-3</mfg:Product_ModelNo>
<mfg:Product_Division>Manufacturing support</mfg:Product_Division>
<mfg:Product_ProductLine>Paper machine</mfg:ProductLine>
<mfg:Product_SKU>FB3524</mfg:Product_SKU>
</mfg:Product>
<mfg:Product rdf:about="http://www.acme.mfg.com/Product3">
<mfg:Product_ID rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">3</mfg:Product_ID>
<mfg:Product_ModelNo>B-1430</mfg:Product_ModelNo>
<mfg:Product_Division>Control engineering</mfg:Product_Division>
<mfg:Product_ProductLine>Feedback line</mfg:Product_ProductLine>
<mfg:Product_SKU>KS4520</mfg:Product_SKU>
</mfg:Product>
</rdf:RDF>
The RDF Graph (1st and 3rd rows)
The triples could be spread across the Web
Use a common base
Some of the property values (objects) could be resources, identified by URIrefs, instead of literals
Especially Product_Division and Product_ProductLine
Foreign keys (unique identifiers in other relations unique URIrefs)
RDF and Inferencing In the Semantic Web, consistency constraints on the data can be
expressed in the data itself
Lets us model smart data: can describe how they should be used
The Semantic Web stack includes a series of layers on top the RDF layer to describe consistency constraints on the data
The key is the notion of inferencing
Given some stated info, we can determine related info that can be considered as if stated
E.g., given X is married to Y, conclude that Y is married to X
Or conclude that the following is an inconsistent set of statements:
X is disjoint from Y, z X, z Y
This may seem trivial, but it’s exactly this that’s missing in dumb data
Type Propagation Rule (an Inference Pattern)
If A rdfs:subClassOf B and x rdf:type A then x rdf:type B
This statement is the entire definition of subClassOf in RDFS
Consistent with informal notion of broader term in a thesaurus or taxonomy
Our data lives in the Web
To make them more useful, they should behave in a consistent way when combined with data from multiple sources
Basing the meaning of constraint terms on inferencing provides a robust solution to understanding the meaning of novel combinations of terms
E.g., to determine how multiple inheritance works in RDFS, apply the Type Propagation Rule twice:
If C is a subclass of A and a subclass of B, then,
if x is of type C, then x is of type A and x is of type B
Asserted triples are those asserted in the original RDF store (possibly merged from multiple sources)
Inferred triples are the additional triples inferred by one of the inference rules governing a given inference engine
An inference engine could infer a triple already asserted
We consider it asserted
No mater how a triple is available, an inference engine treats it in the same way
Example Given the asserted triples
ex:PassengerVehicle rdfs:subClassOf ex:MotorVehicle .
ex:Van rdfs:subClassOf ex:MotorVehicle .
ex:MiniVan rdfs:subClassOf ex:Van .
ex:MiniVan rdfs:subClassOf ex:PassengerVehicle .
ex:myVehicle rdf:type exMiniVan
ex:yourVehicle rdf:type ex:Van
The following are some of the inferred triples
ex:myVehicle rdf:type ex:Van
ex:myVehicle rdf:type ex:PassengerVehicle
ex:yourVehicle rdf:type ex:MotorVehicle
But, e.g., the following triple can’t be inferred
ex:yourVehicle rdf:type ex:PassengerVehicle
The situation is more subtle when info is merged from multiple sources each with its own inference engine
Most RDF implementations allow new triples to be asserted into the triple store
So an application can assert inferred triples
If those triples are serialized and shared on the Web,
another application could merge them with other sources and draw further inferences
Here the distinction between asserted and inferred might be too course to be useful
Suppose 1 info source provides a list of members of the class PassengerVehicle and another provides a list of members of class Van To find a list of all MotorVehicles , we use the inference rule for
subClassOf
We have a merged graph
The inferencing determines that both myVehicle and yourVehicle are members of the class MotorVehicle
Two fundamental components in this data integration example
1. A model that expresses the relationship between the 2 data sources Here the model consists of a single class having both the
classes to be integrated as subclasses This is represented by the single concept of subClassOf
2. The notion of inference Inferencing applies the model to the 2 data sources to
produce a single, integrated answer
When data seem disconnected, it’s often because some apparently simple consistency is conspicuously absent
RDFS and Meaning Modeling in RDF is about graphs—it creates a graph structure to
represent data
Modeling in RDFS is about sets
RDFS provides a way to talk about the vocabulary used in an RDF graph
It lets an info modeler express, relative to particular data modeling and integration needs, how individuals are related how properties used to define individuals are related to sets of
individuals etc.
RDFS is like other schema languages in providing info about how we describe data
But it differs from other schema languages in important ways
Consider document modeling systems
A schema language here (XML Schema) lets us express the set of allowed formats for a document
Can then determine (automatically) whether a given document confirms to the schema
A database schema provides type and key info for relations
A relation itself has nothing indicating the meaning of info in a given column or which column is used to index the relation
For OO programming systems, the class structure again helps organize the data
It not only describes the data but also determines, according to the inheritance policy of the language, what methods are available for a given instance and how they are implemented
This contrasts with databases and XML:
It doesn’t interpret the data but instead gives a systematic way to describe info and transformations for that info
In all cases, the schema is info about the data
The schema in RDF should help provide meaning to the data
The mechanism by which it does this is inference
Inference lets us know more about a set of data than what’s explicitly expressed in the data
It thus explicates the meaning of the original data
The additional info is based in a systematic way on patterns in the original data
RDFS provides detailed axioms that express exactly what inferences can be drawn from given data patterns
Most modeling systems have a clear distinction between the data and its schema
But many modern systems model the schema in the same forma as the data—e.g.,
The introspection API of Java represents the object models as objects
An XML Schema document is a well-formed XML document
For RDF, the schema language was defined in RDF from the start
All schema info in RDFS is defined with RDF triples
It’s thus easy to give a formal description of the semantics of RDFS:
Give inferences rules that work over patterns of triples
In RDF, everything is expressed in triples
The meaning of asserted triples is expressed in inferred triples
The structures that drive these inferences are also triples
So this process can continue as far as it needs to:
The schema info that provides context for info on the Semantic Web can itself be distributed on the Semantic Web
Can explicate the meaning of each RDFS resource by indicating what triples can be inferred in certain circumstances (defined by some pattern of triples)
Illustrate this with one of the most fundamental RDFS terms: rdfs:subClassOf
In RDFS, class membership is given meaning only when there are several classes and a known relation between them
That something’s a member of a class means it’s a member of any superclass Cf. the Type Propagation Rule
The various rules taken together give a way to express a variety of relations between classes and properties Such a collection of classes and properties gives a rudimentary
definition of a vocabulary for RDF data I.e., it defines a set of elements and the relationships of those sets to
the properties describing the elements
Note that we distinguish two senses of “is” with
:Kaneda rdf:type :AllStarPlayer .
:AllStarPlayer rdfs:subClassOf :MajorLeaguePlayer .
RDFS provides a meaning for rdfs:subClassOf here:
The type info for Kaneda propagates in the expected way, giving
:Kaneda rdf:type :MajorLeaguePlayer .
This very simple interpretation of the subclass relationship makes it a workhorse of RDFS modeling
It’s like IF/THEN of programming languages:
IF something is a member of the subclass THEN it’s a member of the superclass
But nothing in RDFS corresponds to the ELSE clause
You can’t infer things from the lack of asserted membership
The RDFS Inference Patterns RDFS consists of a small number of inference patterns
Each provides type info on individuals in a variety of circumstances
Relationship Propagation through rdfs:subPropertyOf
The mechanism RDFS provides for talking about the properties that link individuals is based on an inference pattern defined using resource (keyword) rdfs:subPropertyOf
Terminology includes verbs as well as nouns
Many of the same requirements for mapping nouns from one source to another apply to relationships
Relationship brother is more specific than sibling
If someone is my brother, then he is also my sibling
The rule: For properties P and R, we have
If P rdfs:subPropertyOf R, then, if A P B, then infer A R B
Example A large firm engages a number of people in various capacities
and has a variety of ways to administer these relationships
Some are directly employed by the firm while others are contractors
Among the contractors, some are directly contracted to the company on a freelance basis, others on a long-term retainer, and still others contract through an intermediate firm
All these people can be said to work for the firm
To see how to model this in RDFS, first consider the inferences we should be able to draw and under what circumstances
Name the relationships between a person and the firm :contractsTo, :freeLancesTo, :indirectlyContractsTo, :isEmployedBy, :WorksFor
If we assert any of these properties about a person, we’d like to infer that he :worksFor the firm
And there are intermediate conclusions we can draw—e.g., Both a freelancer and an indirect contractor contract to the firm and
indeed work for it
All these relationships are expressed using refs:subPropertyOf
First introduce the properties:
:freeLancesTo a rdf:Property .
:contractsTo a rdf:Property .
:indirectlyContractsTo a rdf:Property .
:isEmployedBy a rdf:Property .
:worksFor a rdf:Property .
Now the subproperty relationships:
:freeLancesTo rdfs:subPropertyOf :contractsTo . (1)
:indirectlyContractsTo rdfs:subPropertyOf :contractsTo .(2)
:isEmployedBy rdfs:subPropertyOf :worksFor . (3)
:contractsTo rdfs:subPropertyOf :worksFor . (4)
To see what inferences can be drawn, we need some instance data
First, some classes:
:Firm a rdfs:Class .
:Employee a rdfs:Class .
Then some individuals:
:TheFirm a :Firm .
:Goldman a :Employee .
:Spence a :Employee .
:Long a :Employee .
Suppose we have the following triples:
:Goldman :isEmployedBy :TheFirm . (a)
:Spence :freeLancesTo :TheFirm . (b)
:Long :indirectlyContractsTo :TheFirm . (c)
(1)
:worksFor
:contractsTo :isEmployedBy
:freeLancesTo :indirectlyContractsTo
(2)
(4) (3)
We can draw the following inferences:
:Goldman :worksFor :TheFirm . From (a) and (3)
:Spence :contractsTo :TheFirm . From (b) and (1)
:Long :contractsTo :TheFirm . From (c) and (2)
:Spence :worksFor :TheFirm . From (b) and (1) & (4)
:Long :worksFor :TheFirm . From (c) and (2) & (4)
rdfs:subPropertyOf lets us describe a hierarchy of properties
Whenever a property in the tree holds between 2 entities, so does every property above it
Typing Data by Usage—rdfs:domain and rdfs:range Besides saying how 2 properties relate to each other, we’d also like
to represent how a property is used relative to the defined classes
Recall the forms (where P is a property and D and R classes)
P rdfs:domain D .
P rdfs:range R .
Informally: Relation P relates values from class D to values from class R
D is the class of the subject of a triple using P, R is the class of the object
The rules: For property P and classes D and R, we have
If P rdfs:domain D and x P y, then x rdf:type D .
If P rdfs:range R and x P y, then y rdf:type R .
Using P in a way apparently inconsistent with this declaration isn’t an error
Rather, RDFS infers the type info needed to bring P into compliance with its domain and range declarations
In RDFS, can’t assert that a given individual isn’t a member of a given class
(Contrast with OWL)
In fact, no notion of an incorrect or inconsistent inference
So, unlike with XML Schema, an RDF Schema never proclaims an input invalid
It just infers appropriate type info
In this way, RDFS behaves more like a database schema:
Declares what joins are possible but makes no statement about the validity of the joined data
Combination of Domain and Range with
rdfs:subClassOf The inference patterns can interact in interesting ways
Can see this with the 3 patterns seen so far
Show the interaction between rdfs:subClassOf and rdfs:domain, starting with an example
Given the following class info
:Woman a rdfs:Class .
:MarriedWoman a rdfs:Class .
:MarriedWoman rdfs:subClassOf :Woman . (1)
And the property info
:maidenName a rdf:Property .
:maidenName rdfs:domain :MarriedWoman . (2)
And the triple
:Karen :maidenName "Stephens" .
We infer by (2)
:Karen a :MarriedWoman .
But also, by (1) and (2),
:Karen a :Woman .
In general, for any resource X,
Given X :maidenName Y, infer X a :Woman
But this is exactly the definition of domain, i.e., we just saw that
:maidenName rdfs:domain :Woman .
Until now, we’ve applied the inference pattern when a triple using rdfs:domain was asserted or inferred
Now we’re inferring an rdfs:domain triple when we prove that the inference pattern holds
I.e., we view the inference patter as the definition of what it means for rdfs:domain to hold
Generalizing, we have the new inference pattern: Where P is a property and D and C classes,
If P rdfs:domain D and D rdfs:subClassOf C, then P rdfs:domain C
The same conclusion holds for rdfs:range, by the same argument
So, whenever we have domain or range info about a property and a
triple involving that property,
we can draw conclusions about the type of any element based just on that triple
The definitions of rdfs:domain and rdfs:range are the most common problem areas in RDFS for modelers experienced in another data modeling paradigm
Relation to OOP
There’s an “obvious” correspondence
between an rdfs:Class and a class in OOP and
between an RDFS property and an OOP variable,
whence the assertion
P rdfs:domain C .
corresponds to the definition of the variable corresponding to P being defined at class C
From this “obvious” mapping comes an expectation that these definitions inherit in the same way as variable definitions do in OOP
But in RDFS there’s no notion of inheritance per se
The only mechanism is inference
The RDFS inference rule closest to the OO notion of inheritance is the Subclass Propagation Rule
But, using the “obvious” interpretation, we assert that the variable maidenName is defined in MariedWoman
Then infer that it is in the class higher in the tree, Woman
From the OOP point of view, this is like inheritance up the tree
The fallacy in the conclusion comes from the “obvious” mapping of rdfs:domain as defining a variable relative to a class
It’s never accurate in the Semantic Web to say that a property is “defined for a class”
A property is defined independently of any class
The RDFS relations specify which inferences can be correctly made about it in particular contexts
RDFS Modeling Combinations and Patterns
The effect of the few, simple RDFS inference rules can be quite subtle in the context of shared info in the Semantic Web
Set Intersection No explicit RDFS modeling construct for set intersection (or union)
But, when we want to model intersection (or union), we needn’t always have to model it explicitly
Often need only certain inferences supported by these notions
Sometimes these inferences are available through design patterns that combine the RDFS primitives in specific ways
Regarding intersection, an inference we might want to draw is that
If x is in C, then it is in both A and B
The relationship being expressed is C A B
This inference is supported by making C a common subclass of A & B:
C rdfs:subClassOf A .
C rdfs:subClassOf B .
Given the triple
x a C .
by the Type Propagation Rule, we infer (as desired) the triples
x a A .
x a B .
Can’t draw the inference in the other direction
From membership in A and B, we can’t infer membership in C
We can’t express A B C
Example: Hospital Skills Suppose we’re describing hospital staff
A surgeon is a member of the hospital staff and a qualified physician Surgeon Staff Physician
But not every staff physician is a surgeon—the inclusion goes one way
From our assertions, we want to be able to infer, given Kildare is a surgeon, he’s also a staff member and a physician
Assert
:Surgeon rdfs:subClassOf :Staff .
:Surgeon rdfs:subClassOf :Physician .
:Kildare a :Surgeon .
Infer by Type Propagation Rule
:Kildare a :Staff .
:Kildare a :Physician .
Can’t make the inference the other way
A psychiatrist is also both a staff member and a physician
Property Intersection Even though it’s unfamiliar to think of a property as a set, we can
still use intersection and union to describe the functionality supported for properties
RDFS again can’t support these exactly
But we can approximate these notions with proper use of refs:subPropertyOf
One inference is based on one property being an intersection of 2 others
P R S
Given x P y, we want to infer both x R y and x S y
Example: Patients in Hospital Rooms When a patient is logged into a room, we know that
He is on the duty roster for that room
His insurance is billed for that room
Assert
:loggedIn rdfs:subPropertyOf :billedFor .
:loggedIn rdfs:subPropertyOf :assignedTo .
:Marcus :loggedIn :Room101 .
Infer
:Marcus :billedFor :Room101 .
:Marcus :assignedTo :Room101 .
Can’t make the inference in the other direction
If Marcus is billed for Room 101 and assigned to Room 101, no RDFS rules apply
Set Union Express A B C by making C a common superclass of A and
B:A rdfs:subClassOf C .
B rdf:subClassOf C .
Then x a A or x a B implies x a C
Summary C A B by making C a subclass of A and of B
C A B by making both A and B subclasses of C
Example: All-Stars In determining candidates for the All Stars, a league’s rules could state that
MVPs are candidates
Top scorers are candidates
Assert
:MVP rdfs:subClassOf :AllStarCandidate .
:TopScorer rdfs:subClassOf :AllStarCandidate .
Given
:Reilly a :MVP .
:Kaneda a :TopScorer .
we may infer
:Reilly a :AllStarCandidate .
:Kaneda a :AllStarCandidate .
We can’t draw the inference the other way
Given someone is a candidate, can’t infer he’s an MVP or top scorer
Property Union Use rdfs:subPropertyOf to combine properties from diverse
sources as we use rdfs:subClassOf to combine classes as a union
If 2 sources use properties P and Q similarly, a single amalgamated property R can be defined as
P rdfs:subPropertyOf R .
Q rdfs:subPropertyOf R .
Then (for resources x and y), from x P y or x Q y infer x R y
Example: Merging Library Records One library uses property borrows to indicate that a patron has
borrowed a book
Another library uses checkedOut for the same relationship
If we’re sure the 2 mean exactly the same, we make them equivalent:
Library1:borrows rdfs:subPropertyOf Library2:checkedOut .
Library2:checkedOut rdfs:subPropertyOf Library1:borrows .
Any relationship expressed by one library is inferred to hold for the other
If we are not sure they’re exactly the same but have an application that wants to treat them the same,
use the Union pattern to create a common super-property:
Library1:borrows rdfs:subPropertyOf :hasPossession .
Library2:checkedOut rdfs:subPropertyOf :hasPossession .
All patrons & books from both libraries are related by :hasProperty Info from both sources is merged
Property Transfer In modeling info from multiple sources, a common requirement is:
If 2 entities are related by some relationship in one source,
the same entities should be related by a corresponding relationship in the other source
Given property P in one source and property Q in another, say that all uses of P are to be considered uses of Q with
P rdfs:subPropertyOf Q .
Now, from x P y infer x Q y
Perhaps strange to have a design pattern consisting of a single triple
But this use of rdfs:subPropertyOf is so pervasive it merits being distinguished
Example: Terminology Reconciliation A growing number of standard info representation schemes are being
published in RDFS
Info developed in advance of a standard must be retargeted to be compliant
E.g., suppose a given legacy bibliography system uses the term author for the creator of a book
This is dc:creator
Make the system’s data conform to Dublin Core with a single triple
:author rdfs:subPropertyOf dc:creator .
Now anyone for whom the author property has been defined has the same value defined for dc:creator
The work is done by the RDFS inference engine
Not by an off-line editing process
So legacy applications using the author property can carry on as before
Newer, DC-compliant applications can use the inferred data in a standard way
Challenges Look at modeling scenarios that can be addressed with the patterns
we’ve seen
Term Reconciliation Resolve terms used by different agents who want to use their
descriptions together in a federated application
Challenge 2 Enforce the assertion that any member of one class is automatically
treated as a member of another
Solution Take the case where a given term in one vocabulary is fully
subsumed by a term in another
E.g., a researcher is a special case of an analyst
We’d like RDFS to infer from the fact that x is a researcher to that x is an analyst
Researcher rdfs:subClassOf :Analyst .
Now any resource that’s a Researcher
:Wenger a :Researcher .
is inferred to be an Analyst as well
:Wenger a :Analyst
Challenge 3 Suppose there’s considerable semantic overlap between the
concepts analyst and researcher
But neither concept is defined sharply: some analysts might not be researchers and vice versa
Still, for the federated application we want to treat these concepts as the same
Solution Use the Union pattern
Define a new term, investigator, for the federated domain that’s not defined in either of the sources
Effectively define investigator as the union of researcher and analyst using the super-property idiom:
:Analyst rdfs:subClassOf :Investigator .
:Researcher rdfs:subClassOf :Investigator .
No commitment to a direct relationship between analyst and researcher
But have a federated handle for speaking of the general class
Challenge 4 We’ve determined that the 2 classes really are identical
In terms of inference, we want any member of the one class to be a member of the other and vice versa
Solution RDFS doesn’t provide a primitive statement of class equivalence
But achieve it with rdfs: subClassOf
:Analyst rdfs:subClassOf :Researcher .
:Researcher rdfs:subClassOf :Analyst .
If we have
:Reilly a :Researcher .
:Kaneda a :Analyst .
we can infer
:Reilly a :Analyst .
:Kaneda a :Researcher .
The 2 rdfs:subClassOf triples (or any cycle of rdfs:subClassOf triples) assert the equivalence of the 2 classes
Instance-Level Data Integration Suppose we have answers to a single question coming from
multiple sources
E.g., a command-and-control mission planner wants to know where ordinance can’t be targeted
Several sources of info contribute One source provides a list of facilities and their types, some of
which—civilian facilities—must never be targeted Another provides descriptions of airspaces, some of which are
off-limits
A target is off-limits if excluded by either of these sources
Challenge 5 Define a single class whose contents include all individuals from all
these sources (and any ones later added)
Solution Use the Union construction to join the 2 sources into a single,
federated class
fc:CivilianFacility rdfs:subClassOf cc:OffLimits .
space:NoFlyZone rdfs:subClassOf cc:OffLimits .
Any instance of the facility or airspace descriptions identified as restricted are inferred to be cc:OffLimits
Readable Labels with rdfs:label Resources on the Semantic Web are specified with URIs
Not particularly meaningful to people
rdfs:label gives a standard way for presentation engines (e.g., web browsers) to display a printable name of a resource
There might be another source for human-readable names for any resource
Use a combination of the property union and property transfer patterns
E.g., we’ve imported RDF info from an external form (e.g., database or spreadsheet) defining 2 classes of individuals: Person and Movie
Person has property personName
Movie has property movieTitle
E.g.,
:Person1 :personName “James Dean” .
:Movie1 :movieTitle “Rebel without a Cause” .
Challenge 6 Want to use a generic display mechanism, using rdfs:label, to
display info about these people and movies
Solution Define each property as a subproperty of rdfs:label
:personName rdfs:subPropertyOf rdfs:label .
:movieTitle rdfs:subPropertyOf rdfs:label .
When presentation engine queries for rdfs:label of any resource,
inference provides the value of personName or movieTitle as appropriate
No need for code that understands the (domain-specific) distinction between Person and Movie
Data Typing Based on Use A shipping company manages a fleet of vessels, including vessels
under construction being repaired currently in service retired from service
Table 1 shows some info the company might keep
In RDF, each row corresponds to a resource of type ship:Vessel
Some triples:
ship:Berengaria ship:maidenVoyage “Dec. 16, 1946” .
ship:QEII ship:nextDeparture “Mar. 4, 2010” .
Subclasses of Vessel
DeployedVessel—has been deployed sometime in its lifetime
InServiceVessel—currently in service
OutOfServiceVessel—currently out of service (for any reason, possibly retired or never deployed)
RDFS triples:
ship:DeployedVessel rdfs:subClassOf ship:Vessel .
ship:InServiceVessel rdfs:subClassOf ship:Vessel .
ship:OutOfServiceVessel rdfs:subClassOf ship:Vessel .
Challenge 7 Automatically classify a vessel into a specific subclass from info in
Table 1:
Has had a maiden voyage ship:DeployedVessel
Next departure set ship:InServiceVessel
Has a decommission or destruction date
ship:OutOfServiceVessel
Solution Enforce these inferences using rdfs:domain
ship:maidenVoyage rdfs:domain ship:DeployedVessel .
ship:nextDeparture rdfs:domain ship:InServiceVessel .
ship:decommissionedDate rdfs:domain ship:OutOfServiceVessel .
ship:destructionDate rdfs:domain ship:OutOfServiceVessel .
Each of the 3 subclasses is in the domain of 1 or more of the 4 properties
All 4 instances have ship:maidenVoyage ship:DeployedVessel ship:QEII & ship:Constitution have ship:nextDeparture ship:InServiceVessel
ship:Titanic has ship:destructionDate and ship:Berengia has ship:decommissionedDate ship:OutofServiceVessel
Challenge 8 All these inferences concern the subject of the rows—the vessels
Can also draw inferences about what’s in the other table cells
Express that the commander of a ship has the rank of captain
Solution Express ranks as classes:
ship:Captain rdfs:subClassOf ship:Officer .
ship:Commander rdfs:subClassOf ship:Officer .
ship:LieutenantCommander rdfs:subClassOf ship:Officer .
ship:Lieutenant rdfs:subClassOf ship:Officer .
ship:Ensign rdfs:subClassOf ship:Officer .
Express that a ship’s commander has rank captain:
ship:hasCommander rdfs:range ship:Captain .
From Table 1, can infer that all of Johnson, Warwick, Black, Montgomery are captains
Filtering Undefined Data Select out for further processing individuals based on info defined
for them
Challenge 9 Those vessels with nextDeparture dates are selected as input to,
e.g., a system scheduling group tours
Solution Define the class that’s the domain of nextDeparture:
ship:DepartingVessel a rdfs:Class .
ship:nextDeparture rdfs:domain ship:DepartingVessel .
Only ship:Constitution and ship:QEII are members of ship:DepartingVessel—see Table 1
RDFS and Knowledge Discovery Because of the inference-based semantics of RDFS, domains and
ranges aren’t used to validate info
They’re used to determine new info based on old
domain and range are best understood as tools for knowledge discovery rather than knowledge description
On the Semantic Web, we don’t know in advance how info from somewhere else should be interpreted in a new context
domain and range let us things discover about our data based on its use
Use domain and range in one of the above (or similar) patterns,
where a particular knowledge filtering or discovery pattern is intended
Modeling with Domains and Ranges In the shipping example, 2 definitions for the nextDeparture domain:
ship:nextDeparture rdfs:domain ship:InServiceVessel . (1)
ship:nextDeparture rdfs:domain ship:DepartingVessel . (2)
Consider the case of the QEII, for which we asserted
ship:QEII ship:maidenVoyage “May 2, 1969” .
ship:QEII ship:nextDeparture “Mar. 4, 2010” .
ship:QEII ship:hasCommander ship:Warwick .
The rules of inference for rdfs:domain let us infer
ship:QEII a ship:InServiceVessle . by (1)
ship:QEII a ship:DepartingVessel . by (2)
Each is from the definition of rdfs:domain as applied to the indicated domain declaration
Any vessel for which a nextDeparture is specified is inferred to be in the intersection of the 2 classes in the domain statements
This is counterintuitive from the point of view of OOP
If a “property” is associated with 2 classes, the intuition is that we can use it with members of either class
I.e., it may be used with their union
To see the ramifications of this intersection behavior,
suppose a company manages a team of traveling salespersons, each with schedule of trips
sales:SalesPerson rdfs:subClassOf foaf:Person .
sales:sells rdfs:domain sales:SalesPerson .
sales:sells rdfs:range sales:ProductLine .
sales:nextDeparture rdfs:domain sales:SalesPerson .
Now merge the info for the sales force management with the schedules of the ocean liners
A simple connection to make between the 2 models is to link their nextDeparture properties:
sales:nextDeparture rdfs:subPropertyOf ship:nextDeparture .
ship:nextDeparture rdfs:subPropertyOf sales:nextDeparture .
The intuition is that the 2 uses of nextDeparture are in fact the same (Both refer to dates)
To see what inferences are supported, suppose we assert
sales:Johannes sales:nextDeparture “May 31, 2008” .
We already have
ship:QEII ship:nextDeparture “Mar. 4, 2010” .
We can infer the following
sales:Johannes a ship:DepartingVessel .
by rdfs:subPropertyOf then rdfs:domain
ship:QEII a sales:SalesPerson .
by rdfs:subPropertyOf then rdfs:domain
ship:QEII a foaf:Person .
by rdfs:subClassOf
Might want to blame rdfs:domain, but the real issue is a modeling error
Jumped to the conclusion that 2 properties should be mapped so closely
The domain statements are quite strong statements
It would be surprising if 2 properties defined so specifically didn’t have extreme ramifications when merged
Lesson: Avoid merging things whose meanings have important differences
If we just want to merge the 2 notions of nextDeparture for a calendar application,
map both properties to a 3rd, domain-neutral property:
ship:nextDeparture rdfs:subPropertyOf cal:nextDeparture .
sales:nextDeparture rdfs:subPropertyOf cal:nextDeparture .
The amalgamating property, cal:nextDeparture, doesn’t need any domain info
We can (as desired) infer
sales:Johannes cal:nextDeparture “May 31, 2008” .
ship:QEII cal:nextDeparture Mar. 4, 2010” .
With just a few primitives, RDFS provides considerable subtlety for modeling relationships between data sources
But also the chance for subtle errors
Understand the modeling connections by tracing the inferences