Don’t Like RDF Reification? Making Statements about Statements Using Singleton Property
DESCRIPTION
Statements about RDF statements, or meta triples, provide additional information about individual triples, such as the source, the time or place of occurrence, or the certainty. Integrating such meta triples into semantic knowledge bases would enable querying and reasoning mechanisms to be aware of the provenance, time, location, or certainty of triples. However, an efficient RDF representation for such meta knowledge of triples remains challenging. The existing reification approach allows such meta knowledge of RDF triples to be expressed in RDF in two steps. The first step represents the triple by a Statement instance whose subject, predicate, and object are indicated separately in three different triples. The second step creates assertions about that instance as if it were a statement. While reification is simple and intuitive, this approach does not have formal semantics and is not commonly used in practice, as described in the RDF Primer. In this paper, we propose a novel approach called Singleton Property for representing meta triples and provide a formal semantics for it. We explain how this singleton property approach fits well with the existing syntax and formal semantics of RDF, and the syntax of the SPARQL query language. We also demonstrate the use of singleton property in the representation and querying of meta knowledge in two examples of Semantic Web knowledge bases: YAGO2 and BKR. This approach, which is also simple and intuitive, can be easily adopted for representing and querying statements about statements in other knowledge bases.
TRANSCRIPT
Don’t like RDF Reification? Making Statements about Statements Using Singleton Property
Vinh Nguyen, Kno.e.sis, Wright State University
Olivier Bodenreider, National Library of Medicine, National Institutes of Health
Amit Sheth, Kno.e.sis, Wright State University
WWW 2014, Seoul
Linked Open Data
• > 70% metadata
• Relation extraction from unstructured text (PubMed, Wiki)
• Evidences
• Judgement
Motivation Scenario
Facts:
Subject    Predicate  Object          Starts      Ends
Bob Dylan  marriedTo  Sara Lownds     1965-11-22  1977-06-29
Bob Dylan  marriedTo  Carolyn Dennis  1986-06-##  1992-10-##
Meta Queries:
Query type  Sample query
Provenance  P1. Where is this fact from? P2. When was it created? P3. Who created this fact?
Time        T1. When did this fact occur? T2. What is the time span of this fact? T3. Which events happened in the same year?
Location    L1. What is the location associated with this fact? L2. Which events happened at the same place?
Certainty   C1. What is the author's confidence in this fact?
Form of Triples: Standard RDF Reification
Facts:
Subject    Predicate  Object
Bob Dylan  marriedTo  Sara Lownds
Bob Dylan  marriedTo  Carolyn Dennis
Time-aware Facts:
Subject    Predicate  Object       Starts      Ends
Bob Dylan  marriedTo  Sara Lownds  1965-11-22  1977-06-29
Standard RDF Reification:
Subject    Predicate    Object
#stmt1     type         Statement
#stmt1     hasSubject   BobDylan
#stmt1     hasProperty  marriedTo
#stmt1     hasObject    Sara Lownds
Bob Dylan  marriedTo    Sara Lownds
#stmt1     starts       1965-11-22
#stmt1     ends         1977-06-29
Pros:
1. Intuitive, easy to understand
Cons:
1. Takes 3N triples (4N if including Statement typing) to represent N statements => Not scalable
2. No formal semantics defined => Semantics is unclear
3. Discouraged in LOD!
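The reification pattern on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `reify` is hypothetical, triples are modeled as plain tuples rather than with any RDF library, and the standard RDF vocabulary terms (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) stand in for the abbreviated `type`, `hasSubject`, `hasProperty`, and `hasObject` labels used in the table.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def reify(stmt_id: str, triple: Triple, meta: List[Tuple[str, str]]) -> List[Triple]:
    """Hypothetical helper: reification triples for `triple`, then its meta triples."""
    s, p, o = triple
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),  # the optional typing triple
        (stmt_id, "rdf:subject", s),
        (stmt_id, "rdf:predicate", p),
        (stmt_id, "rdf:object", o),
    ]
    # Metadata is attached to the statement identifier, not to the fact itself.
    triples += [(stmt_id, meta_pred, meta_val) for meta_pred, meta_val in meta]
    return triples


fact = ("BobDylan", "marriedTo", "SaraLownds")
reified = reify("#stmt1", fact, [("starts", "1965-11-22"), ("ends", "1977-06-29")])
for t in reified:
    print(t)
```

In this sketch each fact costs four structural triples before any metadata is attached, which is the overhead the first "Cons" item refers to.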
RDF Reification vs. Singleton Property
Time-aware Facts:
Subject    Predicate  Object       Starts      Ends
Bob Dylan  marriedTo  Sara Lownds  1965-11-22  1977-06-29
Standard RDF Reification:
Subject    Predicate    Object
#stmt1     type         Statement
#stmt1     hasSubject   BobDylan
#stmt1     hasProperty  marriedTo
#stmt1     hasObject    Sara Lownds
Bob Dylan  marriedTo    Sara Lownds
#stmt1     starts       1965-11-22
#stmt1     ends         1977-06-29
Singleton Property:
Subject      Predicate    Object
marriedTo#1  rdf:sp       marriedTo
BobDylan     marriedTo#1  Sara Lownds
marriedTo#1  starts       1965-11-22
marriedTo#1  ends         1977-06-29
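As a rough illustration of the singleton property pattern, the sketch below mints a fresh property for each statement and attaches the metadata to that property. The helper name `singleton_triples` and the `#n` numbering scheme are assumptions made for this example; only the `rdf:sp` link and the triple shapes come from the slides.

```python
from itertools import count
from typing import Iterator, List, Tuple

Triple = Tuple[str, str, str]


def singleton_triples(triple: Triple, meta: List[Tuple[str, str]],
                      ids: Iterator[int]) -> List[Triple]:
    """Hypothetical helper: represent one fact and its metadata with a singleton property."""
    s, p, o = triple
    sp = f"{p}#{next(ids)}"  # fresh property used by exactly one statement
    triples = [
        (sp, "rdf:sp", p),   # link the singleton property to its generic property
        (s, sp, o),          # the singleton triple carrying the fact
    ]
    triples += [(sp, meta_pred, meta_val) for meta_pred, meta_val in meta]
    return triples


ids = count(1)
first = singleton_triples(("BobDylan", "marriedTo", "SaraLownds"),
                          [("starts", "1965-11-22"), ("ends", "1977-06-29")], ids)
second = singleton_triples(("BobDylan", "marriedTo", "CarolynDennis"),
                           [("starts", "1986-06-##"), ("ends", "1992-10-##")], ids)
for t in first + second:
    print(t)
```

Under these assumptions each fact needs two structural triples plus its meta triples, compared with four structural triples in the reification table.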
Form of Triples: PaCE
Provenance-aware Facts:
Subject    Predicate  Object       Source              DateExtracted
Bob Dylan  marriedTo  Sara Lownds  wikipage:Bob_Dylan  2009-06-07
Provenance-aware Context Entity:
Subject        Predicate   Object
BobDylan_wp    rdf:type    Bob Dylan
SaraLownds_wp  rdf:type    Sara Lownds
BobDylan_wp    marriedTo   SaraLownds_wp
BobDylan_wp    hasSource   wiki:Bob_Dylan
BobDylan_wp    hasDateExt  2009-06-07
Pros:
1. Saves ~50% of the triples compared to reification, thanks to the repeated subject, predicate, and object.
Cons:
1. Not intuitive, hard to understand
2. Limited expressiveness
* Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth, and Krishnaprasad Thirunarayan. 2010. Provenance context entity (PaCE): scalable provenance tracking for scientific RDF data. In Proceedings of the 22nd international conference on Scientific and statistical database management (SSDBM'10).
Facts and Provenance: PaCE vs. Singleton Property
Subject    Predicate  Object       Source              DateExtracted
Bob Dylan  marriedTo  Sara Lownds  wikipage:Bob_Dylan  2009-06-07
Provenance-aware Context Entity:
Subject        Predicate   Object
BobDylan_wp    rdf:type    Bob Dylan
SaraLownds_wp  rdf:type    Sara Lownds
BobDylan_wp    marriedTo   SaraLownds_wp
BobDylan_wp    hasSource   wiki:Bob_Dylan
BobDylan_wp    hasDateExt  2009-06-07
Singleton Property:
Subject      Predicate    Object
marriedTo#1  rdf:sp       marriedTo
BobDylan     marriedTo#1  Sara Lownds
marriedTo#1  hasSource    wp:Bob_Dylan
marriedTo#1  hasDateExt   2009-06-07
Form of Quadruples: Named Graph
Time-aware Facts:
Subject    Predicate  Object       Starts      Ends
Bob Dylan  marriedTo  Sara Lownds  1965-11-22  1977-06-29
Named Graph:
Subject    Predicate  Object       NG
Bob Dylan  marriedTo  Sara Lownds  ng_1
ng_1       starts     1965-11-22   Prov_graph
ng_2       ends       1977-06-29   Prov_graph
Pros:
1. Intuitive: creating # named graphs for # sources
2. Attach metadata for a set of triples
3. SPARQL supported
Cons:
1. Defined for provenance only
2. Ambiguous semantics while associating different types of metadata at triple level
* Carroll, Jeremy J., et al. "Named graphs, provenance and trust." Proceedings of the 14th international conference on World Wide Web. ACM, 2005.
Named Graph vs. Singleton Property
Time-aware Facts:
Subject    Predicate  Object       Starts      Ends
Bob Dylan  marriedTo  Sara Lownds  1965-11-22  1977-06-29
Named Graph:
Subject    Predicate  Object       NG
Bob Dylan  marriedTo  Sara Lownds  ng_1
ng_1       starts     1965-11-22   Prov_graph
ng_2       ends       1977-06-29   Prov_graph
Singleton Property:
Subject      Predicate    Object
marriedTo#1  rdf:sp       marriedTo
Bob Dylan    marriedTo#1  Sara Lownds
marriedTo#1  starts       1965-11-22
marriedTo#1  ends         1977-06-29
Form of Quintuples: RDF+
Facts and Temporal Information:
Subject    Predicate  Object       Starts      Ends
Bob Dylan  marriedTo  Sara Lownds  1965-11-22  1977-06-29
RDF+:
Subject    Predicate  Object       Meta Property  Meta value
Bob Dylan  marriedTo  Sara Lownds  starts         1965-11-22
Bob Dylan  marriedTo  Sara Lownds  ends           1977-06-29
Cons:
1. The representation is not in the form of RDF. Statement identifiers are used internally. Requires mappings from RDF to RDF+ and vice versa.
2. The SPARQL query syntax and semantics need to be extended to support RDF+
* Dividino, Renata, et al. "Querying for provenance, trust, uncertainty and other meta knowledge in RDF." Web Semantics: Science, Services and Agents on the World Wide Web 7.3 (2009): 204-219.
Overall Goal
A mechanism to make statements about statements should meet these requirements:
1. Intuitive, easy to understand
2. Formal semantics defined
3. Scalable, e.g., to LOD
4. Compatible with existing standards
5. Multiple types of metadata
Generic Property vs. Singleton Property
Facts and Provenance:
Subject      Predicate  Object         Source                 MarriageDate
Bob Dylan    marriedTo  Sara Lownds    wikipage:Bob_Dylan     1965-11-22
BarackObama  marriedTo  MichelleObama  wikipage:Barack_Obama  1992-10-03
Generic Property:
1. marriedTo is an RDF property
2. marriedTo => {(Bob Dylan, Sara Lownds), (Barack Obama, Michelle Obama), ...}
3. Any assertion about marriedTo applies to all pairs of entities!
Singleton Property:
1. marriedTo#1, marriedTo#2 are RDF properties
2. Different property instances: marriedTo#1, marriedTo#2, ..., marriedTo#n
3. Any assertion about marriedTo#1/marriedTo#2/.../marriedTo#n applies to only ONE pair <= KEY
Diagram label: instanceOf
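The contrast between a generic and a singleton property can be made concrete with a small sketch. Triples are plain tuples here, and the function names `extension`, `is_singleton`, and `generic_extension` are hypothetical helpers introduced only for this illustration.

```python
triples = [
    ("marriedTo#1", "rdf:sp", "marriedTo"),
    ("BobDylan", "marriedTo#1", "SaraLownds"),
    ("marriedTo#1", "hasSource", "wikipage:Bob_Dylan"),
    ("marriedTo#2", "rdf:sp", "marriedTo"),
    ("BarackObama", "marriedTo#2", "MichelleObama"),
    ("marriedTo#2", "hasSource", "wikipage:Barack_Obama"),
]


def extension(prop, triples):
    """All (subject, object) pairs asserted with `prop` as the predicate."""
    return {(s, o) for s, p, o in triples if p == prop}


def is_singleton(prop, triples):
    """True if `prop` is declared via rdf:sp and connects exactly one pair."""
    declared = any(s == prop and p == "rdf:sp" for s, p, _ in triples)
    return declared and len(extension(prop, triples)) == 1


def generic_extension(prop, triples):
    """Pairs reachable through the singleton properties of a generic property."""
    singles = {s for s, p, o in triples if p == "rdf:sp" and o == prop}
    return set().union(*(extension(sp, triples) for sp in singles))


print(extension("marriedTo#1", triples))
print(generic_extension("marriedTo", triples))
```

In this sketch `hasSource` on `marriedTo#1` describes only the Dylan pair, whereas a statement made about `marriedTo` itself would apply to every pair returned by `generic_extension`.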
Model-Theoretic Semantics
Original Simple Interpretation I:
• Given a vocabulary V,
• IR: a non-empty set of resources, alternatively called domain or universe of discourse of I.
• IP: the set of generic properties of I
• IEXT: a function assigning to each property a set of pairs from IR, where IEXT(p) is called the extension of property p
• IEXT : IP → 2^(IR × IR)
• IS: a function, mapping URIs from V into the union set of IR and IP,
• IL: a function from the typed literals from V into the set of resources IR,
• LV: a subset of IR, called the set of literal values.
New Simple Interpretation I satisfies additional criteria as follows:
• IPS: a subset of IR, called the set of singleton properties of I,
• IS_EXT(ps): a function assigning to each singleton property a pair of entities from IR.
• IS_EXT : IPS → IR × IR.
New RDF Interpretation I satisfies additional criteria as follows:
• xs ∈ IPS if ⟨xs, rdf:SingletonProperty^I⟩ ∈ IEXT(rdf:type^I)
• xs ∈ IPS if ⟨xs, x^I⟩ ∈ IEXT(rdf:singletonPropertyOf^I), and x ∈ IP, IS_EXT(xs) = ⟨s1, s2⟩
Model-Theoretic Semantics: Example
Example of vocabulary VEX:
Subject        Predicate      Object
BobDylan       isMarriedTo    SaraLownds
BobDylan       isMarriedTo#1  SaraLownds
isMarriedTo#1  rdf:sp         isMarriedTo
isMarriedTo#1  hasStart       1965-11-22
isMarriedTo#1  hasEnd         1977-06-29
BobDylan       isMarriedTo    CarolynDennis
BobDylan       isMarriedTo#2  CarolynDennis
isMarriedTo#2  rdf:sp         isMarriedTo
isMarriedTo#2  hasStart       1986-06-##
isMarriedTo#2  hasEnd         1992-10-##
RDF Interpretation of VEX:
IR = {α, β, γ, δ, θ, λ, σ, φ}
IP = {δ, θ, λ, σ, φ}
LV = {1965-11-22, 1977-06-29, 1986-06-##, 1992-10-##}
IS:
BobDylan → α
SaraLownds → β
CarolynDennis → γ
isMarriedTo → δ
isMarriedTo#1 → θ
isMarriedTo#2 → λ
hasStart → σ
hasEnd → φ
IEXT:
θ → {⟨α, β⟩}
λ → {⟨α, γ⟩}
σ → {⟨θ, 1965-11-22⟩, ⟨λ, 1986-06-##⟩}
φ → {⟨θ, 1977-06-29⟩, ⟨λ, 1992-10-##⟩}
rdf:sp → {⟨θ, δ⟩, ⟨λ, δ⟩}
δ → {⟨α, β⟩, ⟨α, γ⟩}
IPS = {θ, λ}
IS_EXT:
θ → ⟨α, β⟩
λ → ⟨α, γ⟩
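The example interpretation can be written down as plain Python data and checked mechanically. The checker below is a sanity-check sketch, not the formal definition: the function name `check_singletons` is hypothetical, and the third condition (the singleton pair also appears in the extension of the generic property) is an assumption read off the example, where δ contains both pairs.

```python
# Interpretation from the example slide, as dictionaries and sets.
I_EXT = {
    "θ": {("α", "β")},
    "λ": {("α", "γ")},
    "σ": {("θ", "1965-11-22"), ("λ", "1986-06-##")},
    "φ": {("θ", "1977-06-29"), ("λ", "1992-10-##")},
    "rdf:sp": {("θ", "δ"), ("λ", "δ")},
    "δ": {("α", "β"), ("α", "γ")},
}
I_PS = {"θ", "λ"}
I_S_EXT = {"θ": ("α", "β"), "λ": ("α", "γ")}


def check_singletons(i_ext, i_ps, i_s_ext):
    """Return a list of violated conditions; an empty list means all checks pass."""
    problems = []
    for ps in sorted(i_ps):
        pair = i_s_ext.get(ps)
        # 1. The extension of a singleton property is exactly its one pair.
        if pair is None or i_ext.get(ps) != {pair}:
            problems.append(f"{ps}: extension is not exactly its singleton pair")
            continue
        # 2. The singleton property is linked to a generic property via rdf:sp.
        generics = {g for x, g in i_ext.get("rdf:sp", set()) if x == ps}
        if not generics:
            problems.append(f"{ps}: no generic property declared via rdf:sp")
        # 3. Assumed check: the pair also belongs to the generic extension.
        for g in sorted(generics):
            if pair not in i_ext.get(g, set()):
                problems.append(f"{ps}: pair missing from extension of {g}")
    return problems


print(check_singletons(I_EXT, I_PS, I_S_EXT))
```

Changing any extension, for example giving θ two pairs, makes the checker report a violation.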
Querying Meta Triples Using SPARQL
Singleton Graph Pattern:
Triple Type                       Subject      Predicate         Object
Instantiating singleton property  predicate_i  rdf:sp            predicate
Singleton triple                  subject      predicate_i       object
Meta triple                       predicate_i  meta-predicate_j  meta-value_j
Data Query:
1. Who married whom?
2. SPARQL query
  SELECT ?person1 ?person2
  WHERE {
    ?person1 ?married_sp ?person2 .
    ?married_sp rdf:sp :marriedTo .
  }
Meta Query:
1. Who married whom and when?
2. SPARQL query
  SELECT ?person1 ?person2 ?date
  WHERE {
    ?person1 ?married_sp ?person2 .
    ?married_sp rdf:sp :marriedTo .
    ?married_sp :happenedOn ?date .
  }
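To show what the singleton graph pattern matches, the sketch below evaluates the meta query as a join over an in-memory list of triples. It does not use a SPARQL engine; the function name `meta_query` and the sample data (including the use of `happenedOn` with the start dates from the earlier slides) are assumptions made for illustration.

```python
triples = [
    ("marriedTo#1", "rdf:sp", "marriedTo"),
    ("BobDylan", "marriedTo#1", "SaraLownds"),
    ("marriedTo#1", "happenedOn", "1965-11-22"),
    ("marriedTo#2", "rdf:sp", "marriedTo"),
    ("BobDylan", "marriedTo#2", "CarolynDennis"),
    ("marriedTo#2", "happenedOn", "1986-06-##"),
]


def meta_query(triples, generic, meta_predicate):
    """Join the three triple types of the singleton graph pattern."""
    # ?married_sp rdf:sp :marriedTo
    singles = {s for s, p, o in triples if p == "rdf:sp" and o == generic}
    return sorted(
        (s, o, value)
        # ?person1 ?married_sp ?person2
        for s, p, o in triples if p in singles
        # ?married_sp :happenedOn ?date
        for meta_s, meta_p, value in triples
        if meta_s == p and meta_p == meta_predicate
    )


for row in meta_query(triples, "marriedTo", "happenedOn"):
    print(row)
```

Dropping the inner loop over meta triples gives the plain data query, "Who married whom?".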
Use Case: Temporal and Spatial YAGO2S
FactID in Yago2s:
FactID  Subject       Predicate  Object
#1      GratefulDead  performed  TheClosingOfWinterLand
#2      #1            occursIn   SanFrancisco
#3      #1            occursOn   1978-12-31
Singleton Property:
Subject          Predicate                Object
performed_12345  rdf:singletonPropertyOf  performed
GratefulDead     performed_12345          TheClosingOfWinterLand
performed_12345  occursIn                 SanFrancisco
performed_12345  occursOn                 1978-12-31
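The mapping from fact identifiers to singleton properties could look roughly like the following. This is a sketch under assumptions, not the conversion actually used for YAGO2S: the function name `to_singleton` is hypothetical, and the singleton property name is built from the predicate and the fact ID (giving `performed_1` rather than the `performed_12345` shown on the slide).

```python
facts = [
    ("#1", "GratefulDead", "performed", "TheClosingOfWinterLand"),
    ("#2", "#1", "occursIn", "SanFrancisco"),
    ("#3", "#1", "occursOn", "1978-12-31"),
]


def to_singleton(facts):
    """Rewrite facts that are described by other facts using singleton properties."""
    by_id = {fid: (s, p, o) for fid, s, p, o in facts}
    # Fact IDs that appear as the subject of another fact carry metadata.
    described = {s for _, s, _, _ in facts if s in by_id}
    # Assumed naming scheme: <predicate>_<fact number>.
    sp_name = {fid: f"{by_id[fid][1]}_{fid.lstrip('#')}" for fid in described}
    out = []
    for fid, s, p, o in facts:
        subj = sp_name.get(s, s)  # a fact ID used as subject becomes its singleton property
        if fid in described:
            out.append((sp_name[fid], "rdf:singletonPropertyOf", p))
            out.append((subj, sp_name[fid], o))
        else:
            out.append((subj, p, o))
    return out


for t in to_singleton(facts):
    print(t)
```

Facts that no other fact refers to are left as ordinary triples in this sketch, so only statements that actually carry metadata get a singleton property.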
Experiment: BKR with Provenance
All datasets are available at http://wiki.knoesis.org/index.php/Singleton_Property
• Five data sets generated from the same seed BKR:
  • Singleton Property (SP)
  • Reification (R)
  • PaCE C1 (C1)
  • PaCE C2 (C2)
  • PaCE C3 (C3)
Experiment Results
(A) random-value queries vs. fixed-value queries in msec.
(B) query length and execution time in msec.
Conclusion
Does the singleton property approach meet these requirements?
1. Intuitive, easy to understand
2. Formal semantics defined
3. Scalable, e.g., to LOD
4. Compatible with existing standards
5. Multiple types of metadata
For further information, please visit http://wiki.knoesis.org/index.php/Singleton_Property