data access over large semi-structured databasespaiva lima da silva bruno 12 / 48. rule-based data...
TRANSCRIPT
![Page 1: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/1.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Data Access over LargeSemi-Structured Databases
A Generic Approach Towards Rule-Based Systems
Bruno PAIVA LIMA DA SILVA
Soutenance de theseLundi 13 Janvier 2014
RapporteursOllivier HAEMMERLE ............................................ IRIT, Universite Toulouse III
Fabien GANDON ....................................................... INRIA, Sophia-Antipolis
ExaminateursOdile PAPINI .............................................. LSIS, Universite Aix-Marseille II
Directeur de theseMarie-Laure MUGNIER ........................................ LIRMM, Universite Montpellier II
Co-Encadrants de theseJean-Francois BAGET ............................................................ INRIA, LIRMM
Madalina CROITORU .......................................... LIRMM, Universite Montpellier II
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 1 / 48
![Page 2: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/2.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Ontology-Based Data Access
In this work, we address the:
Ontology-Based Data Access (OBDA) problem.[Also known as Ontological Conjunctive Query Answering (OCQA)]
This is a cross-domain problem between the domains of:
Knowledge Representation (KR)
Artificial Intelligence (AI)
Databases
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 2 / 48
![Page 3: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/3.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Ontology-Based Data Access
Given:
Knowledge base (KB)
Factual knowledge (F)
Ontology (Universal knowledge) (O)
(Boolean) Conjunctive Query (Q)
OBDA consists in verifying if the query Q can be deduced from theknowledge base (if there is an answer to the query in the KB).
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 3 / 48
![Page 4: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/4.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes.
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 5: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/5.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes.
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 6: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/6.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes.
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 7: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/7.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 8: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/8.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes.
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 9: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/9.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Example
Example of a knowledge base:
(a) Alice lives and works in Paris.
(b) Alice is a computer analyst.
(c) Bob lives in Montpellier, and works in Nımes.
(d) Bob is a school teacher.
Ontology:
1 “If someone is a school teacher, then he teaches Latin.”
2 “If someone is a computer analyst, then he knows programming.”
3 “If someone lives in Montpellier and works in Nımes, then he drives to work.”
Given the following queries:
Q1: Is there a school teacher who lives in Montpellier?Q2: Is there anyone who drives to work?
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 4 / 48
![Page 10: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/10.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Ontology
There are two common methods for using ontological content:
Forward chaining:Hypothesis of the rules are queried in the facts and theconclusions of those are added to the facts, until the base issaturated.
Backwards chaining:Initial query is decomposed/rewritten according to the rules ofthe ontology. Those new queries are then applied to theknowledge base.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 5 / 48
![Page 11: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/11.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Approaches for OBDA
There are currently two distinct manners to represent ontologicaldata:
Rules (RBDA)
OBDA
DescriptionLogics
Figure: Approaches for the OBDA problem.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 6 / 48
![Page 12: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/12.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Table of contents
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 7 / 48
![Page 13: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/13.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 8 / 48
![Page 14: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/14.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Formalism
We use a decidable subset of First-Order Logic (FOL) to represent the problem:
Terms: Constants or variablesAlice, Bob, Paris, Montpellier, Latin, Programming, x, y, z ...
Predicates: Relations between termsman/1, woman/1, drives-to-work/1, lives/2, teaches/2, knows/2 ...
Atoms: Single piece of informationman(Bob), computerAnalyst(Alice), lives(Bob,Nımes)
Facts: A fact is a set of atoms{city(Montpellier)}, {person(x),lives(x,Montpellier),works(x,Nımes)}, ...
Rules: Composed of two facts
1 [body] {schoolTeacher(x)} → [head] {teaches(x,Latin)}2 [body] {computerAnalyst(x)} → [head] {knows(x,Programming)}3 [body] {lives(x,Montpellier),works(x,Nımes)} → [head] {drives-to-work(x)}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 9 / 48
![Page 15: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/15.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Problem definition
According to this formalism, we have:
Factual knowledge (F): A fact or the union of facts
Ontology (O): A set of rules
Conjunctive Query (Q): A query is a fact
F |= QProblem: Finding substitutions (Entailment)
{F ,O} |= QProblem: Using ontological information, Finding substitutions(Rule-Entailment)
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 10 / 48
![Page 16: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/16.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Elementary operations
The efficiency of RBDA depends on the efficiency of some elementaryoperations:
In forward chaining:
Calling the entailment algorithm for each rule hypothesis andinserting small new pieces of information to the KB.
Finding an answer to the query in a saturated knowledge base.
In backwards chaining:
[Rewriting queries according to the ontology]
Efficiency of calling the entailment algorithm for each rewriting.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 11 / 48
![Page 17: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/17.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Context
Semi-structured data [Abiteboul,1997]Knowledge bases with: “irregular, partial or implicit structure”, “schema
is ignored”, “schema evolving rapidly”, “difficult distinction between
schema and data”, etc.
Emergence of very large and semi-structured knowledge bases:i.e. the web.
Emergence of databases with different data models (seeNoSQL).
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 12 / 48
![Page 18: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/18.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Research question
Given the problem (RBDA) and the context, our research questionis the following:
“How to provide a platform providing under a unified logicalframework different implementation approaches foraddressing the Rule-Based Data Access problem?”
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 13 / 48
![Page 19: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/19.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
State of the art
We have studied different existing tools/approaches for adressingthe RBDA problem according to their ability to:
Represent a knowledge base under our formalism.
Perform conjunctive queries.
Represent rules and perform conjunctive queries with thepresence of rules.
Handle data stored in secondary memory (disk).
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 14 / 48
![Page 20: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/20.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Prolog
Representing a knowledgebase: OK
next-to(d1,d2).next-to(d2,t).reads(d1,n).reads(d2,n).reads(t,n).same-suit(d1,d2).dressed-in(d1,Black).dressed-in(t,Blue).
Conjunctive queries: Native
Conjunctive queries withrules: Native
Does not handle data insecondary memory
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 15 / 48
![Page 21: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/21.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
CoGITaNT
Representing a knowledge base:OK
> : black
dressed-in
same-suit
> : d1 next-to > : d2
reads reads
> : n
1
2
1 2
1 2
1
2
1
2
Conjunctive queries: Native
Conjunctive queries with rules:Native
Does not handle data insecondary memory
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 16 / 48
![Page 22: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/22.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Relational databases
Representing a knowledge base:OK
next-to
t1 t2
d1 d2d2 t
reads
t1 t2
d1 nd2 nt n
dressed-in
t1 t2
d1 Blackt Blue
same-suit
t1 t2
d1 d2
Conjunctive queries: Nativewith SQL
Conjunctive queries with rules:Not Native
Handles data in secondarymemory
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 17 / 48
![Page 23: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/23.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Triples stores
Representing a knowledge base:OK
d1 next-to d2.
d2 next-to t.
d1 reads n.
d2 reads n.
t reads n.
d1 same-suit d2.
d1 dressed-in Black.
t dressed-in Blue.
Conjunctive queries: Nativewith SPARQL
Conjunctive queries with rules:Not Native
Handles data in secondarymemory
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 18 / 48
![Page 24: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/24.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Graph databases
Representing a knowledge base:OK
label: d1dressed-in: black
label: d2
label: n
same-suit
next-to
readsreads
Conjunctive queries: Notnative in every GDBMS
Conjunctive queries with rules:Not Native
Handles data in secondarymemory
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 19 / 48
![Page 25: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/25.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Recap
The table below recaps the information from the previous slides:
- KB Rep. F |= Q F ,R |= Q Sec. memoryProlog Yes Native Native No
CoGITaNT Yes Native Native NoRelational Databases Yes Native (SQL) Not native Yes
Triple Stores Yes Native (SPARQL) Not native YesGraph Databases Yes Not native Not native Yes
Figure: Table comparing the features of the studied methods.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 20 / 48
![Page 26: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/26.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Research question
Back to the research question:
“How to provide a platform providing under a unified logicalframework different implementation approaches foraddressing the Rule-Based Data Access problem?”
FOL-based formalism
Relational databases Triples stores Graph databases
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 21 / 48
![Page 27: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/27.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 22 / 48
![Page 28: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/28.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
ALASKA platform
ALASKA platform
Abstract Logic-based Architecture Storage systems & Knowledge base Analysis
Its goal is to enable to integrate, in a generic manner, heterogenous storage
systems.
Such integration is made using an intermediary language, which is our logical
formalism.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 23 / 48
![Page 29: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/29.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Features & details
Multi-layered architecture: Program goes from higher level
operations down to I/O disk functions.
Classes and interfaces in the abstract layer ensure all the connected
systems use a common datatype (based on our formalism).
Written in JAVA: loss in efficiency, but database connection libraries
are often directly available.
Systems connected so far: Jena TDB, Sqlite, DEX, Neo4j -
[Non-definitive list]
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 24 / 48
![Page 30: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/30.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Class diagram
KRRoperations
IFact
< interface >
IAtom
< interface >
ITerm
< interface >
GDBConnectors
RDBConnectors
TSConnectors
Predicate TermAtom
GDB RDB TS
Applicationlayer (1)
Abstractlayer (2)
Translationlayer (3)
Datalayer (4)
Figure: Class diagram of ALASKA architecture.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 25 / 48
![Page 31: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/31.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Elementary operations
The IFact interface defines methods that need to be implementedby every storage system connected to ALASKA.
The implementation of each of those functions differs according tothe data model of the store. The following cases are the principal:
Adding an atom (inserting new information)
Listing the neighbourhood of a term (enumerating)
Verifying the existence of an atom (checking)
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 26 / 48
![Page 32: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/32.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Adding an atom
F : {friend(Alice,Bob), friend(Alice,Carl), friend(Bob,Carl)}Adding the atom enemy(Bob,Diana) to F
At ALASKA level:F : {friend(Alice,Bob), friend(Alice,Carl), enemy(Bob,Carl), enemy(Bob,Diana)}
In a triples store:
Alice friend Bob.
Alice friend Carl.
Bob friend Carl.
Bob enemy Diana.
In a relational database:
friendt1 t2
Alice BobAlice CarlBob Carl
enemyt1 t2
Bob Diana
In a graph database:
Alice Bob
Carl
Diana
friend
friend friend
enemy
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 27 / 48
![Page 33: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/33.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Adding an atom
F : {friend(Alice,Bob), friend(Alice,Carl), friend(Bob,Carl)}Adding the atom enemy(Bob,Diana) to F
At ALASKA level:F : {friend(Alice,Bob), friend(Alice,Carl), enemy(Bob,Carl)}
In a triples store:
Alice friend Bob.
Alice friend Carl.
Bob friend Carl.
Bob enemy Diana.
In a relational database:
friendt1 t2
Alice BobAlice CarlBob Carl
enemyt1 t2
Bob Diana
In a graph database:
Alice Bob
Carl
Diana
friend
friend friend
enemy
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 27 / 48
![Page 34: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/34.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Adding an atom
F : {friend(Alice,Bob), friend(Alice,Carl), friend(Bob,Carl)}Adding the atom enemy(Bob,Diana) to F
At ALASKA level:F : {friend(Alice,Bob), friend(Alice,Carl), enemy(Bob,Carl), enemy(Bob,Diana)}
In a triples store:
Alice friend Bob.
Alice friend Carl.
Bob friend Carl.
Bob enemy Diana.
In a relational database:
friendt1 t2
Alice BobAlice CarlBob Carl
enemyt1 t2
Bob Diana
In a graph database:
Alice Bob
Carl
Diana
friend
friend friend
enemy
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 27 / 48
![Page 35: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/35.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Adding an atom
F : {friend(Alice,Bob), friend(Alice,Carl), friend(Bob,Carl)}Adding the atom enemy(Bob,Diana) to F
At ALASKA level:F : {friend(Alice,Bob), friend(Alice,Carl), enemy(Bob,Carl), enemy(Bob,Diana)}
In a triples store:
Alice friend Bob.
Alice friend Carl.
Bob friend Carl.
Bob enemy Diana.
In a relational database:
friendt1 t2
Alice BobAlice CarlBob Carl
enemyt1 t2
Bob Diana
In a graph database:
Alice Bob
Carl
Diana
friend
friend friend
enemy
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 27 / 48
![Page 36: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/36.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Adding an atom
F : {friend(Alice,Bob), friend(Alice,Carl), friend(Bob,Carl)}Adding the atom enemy(Bob,Diana) to F
At ALASKA level:F : {friend(Alice,Bob), friend(Alice,Carl), enemy(Bob,Carl), enemy(Bob,Diana)}
In a triples store:
Alice friend Bob.
Alice friend Carl.
Bob friend Carl.
Bob enemy Diana.
In a relational database:
friendt1 t2
Alice BobAlice CarlBob Carl
enemyt1 t2
Bob Diana
In a graph database:
Alice Bob
Carl Diana
friend
friend friend enemy
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 27 / 48
![Page 37: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/37.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Backtracking algorithm
Using the methods detailed above, we have written a genericbacktracking algorithm:
Very simple backtracking algorithm. No optimization.
Every time it needs to read an information from the knowledgebase, it uses functions defined in ALASKA abstract layer.
The use of those ensures that it works with every systemconnected to ALASKA, and its efficiency relies on theefficiency of the store for the reasoning elementary operations.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 28 / 48
![Page 38: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/38.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Use cases for ALASKA
ALASKA can also be/has also been used for:
Implementing rules algorithms
Storing/querying graphs in the Qualinca project.
A for Abstract: Any KR/AI application requiring transparentreading and writing access to disk-stored information may useALASKA.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 29 / 48
![Page 39: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/39.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 30 / 48
![Page 40: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/40.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
ALASKA for RBDA
Our current goal is to use ALASKA to evaluate the efficiency ofthe connected systems on elementary operations:
Storage tests:
Measuring the time and size when storing increasingly large
knowledge bases on disk.
Querying tests:
Measuring the time that each system takes to answer a set of
queries using different algorithms/query engines.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 31 / 48
![Page 41: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/41.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Storage workflow
RDF File Input Manager RDF Parser
IFact Manager
IFact to GraphTranslation
IFact to RDBTranslation
IFact to TriplesTranslation
Graph DBRelational DBTriples Store
Layer (1)
Layer (2)
Layer (3)
Layer (4)
Figure: Workflow for storing a knowledge base in RDF using ALASKA.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 32 / 48
![Page 42: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/42.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Buffering algorithm
1 Storing all information at once: forbidden (context).
2 Storing piece by piece: highly unefficient in transactional systems.
We have used an algorithm with a read-and-write atom buffer:
Parsing level:
Using N-TRIPLES files instead of RDF/XML: reading and filling the buffer.
Transaction management:
Manual management of transactions. Limiting write operations on disk.
Garbage collection:
Recycling JAVA objects in order to reduce GC work, thus memory overload.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 33 / 48
![Page 43: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/43.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Storage results
Using our platform, we have evaluated the storage efficiency ofdifferent storage systems:
Knowledge Base
Transformation into IFact
RelationalDatabase
GraphDatabase
TriplesStore
0 10 20 30 40 50 60 70 80 90 1000
0.5
1
1.5
·104
Size (Millions of triples)
Tim
e(s
)
Stores
Jena TDB DEX Sqlite Neo4J
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 34 / 48
![Page 44: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/44.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Results
Size of the stored knowledge basesSystem 5M 20M 40M 75M 100M
DEX 55 Mb 214.2 Mb 421.7 Mb 785.1 Mb 1.0 GbNeo4J 157.4 Mb 629.5 Mb 1.2 Gb 2.3 Gb 3.1 GbSqlite 767.4 Mb 2.9 Gb 6.0 Gb 11.6 Gb 15.5 Gb
Jena TDB 1.1 Gb 3.9 Gb 7.5 Gb 13.9 Gb 18.1 GbRDF File 533.2 Mb 2.1 Gb 4.2 Gb 7.8 Gb 10.4 Gb
Figure: Storage time and KB sizes in different systems
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 35 / 48
![Page 45: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/45.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Querying workflow
Q
AbstractArchitecture
Q → SQLQ → SPARQLGeneric Backtrack
Algorithm
F stored inGraph DB
F stored inRelational DB
F stored inTriple Store
Figure: Workflow for querying with ALASKA.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 36 / 48
![Page 46: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/46.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Querying results
1 creator(X,PaulErdoes) . creator(X,Y)
Returns the persons and the papers that were written with Paul Erdoes.
2 type(X,Article) . journal(X,Journal1-1940) . creator(X,Y)
Returns the creators of all the elements that are articles and were published inJournal 1 (1940).
0 10 20 30 40 50 60 70 80 90 1000
20
40
60
Size (Thousand of triples)
Tim
e(m
s)
Q1
Jena(BT) Jena(SPARQL)
DEX(BT) Neo4J(BT)
Sqlite(SQL)
0 10 20 30 40 50 60 70 80 90 1000
100
200
300
400
Size (Thousand of triples)
Tim
e(m
s)
Q2
Jena(BT) Jena(SPARQL)
DEX(BT) Neo4J(BT)
Sqlite(SQL)
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 37 / 48
![Page 47: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/47.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Conclusions
The results of the querying tests have revealed two different open issues:
1 Difficulty of scaling when performing less complex queries
2 Generic backtrack algorithm unefficient on more complex queries,even for smaller KBs
An optimization of our generic algorithm is needed: they are known inthe domain of CSP.
Instead of implementing them one by one, we have chosen to integrate a
CSP solver within ALASKA, to benefit from such optimizations.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 38 / 48
![Page 48: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/48.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 39 / 48
![Page 49: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/49.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
New querying workflow
Q
AbstractArchitecture
Q → SQLQ → SPARQLGeneric Backtrack
AlgorithmCSP Solver
F stored inGraph DB
F stored inRelational DB
F stored inTriple Store
Figure: Workflow for querying with ALASKA.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 40 / 48
![Page 50: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/50.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [1,2,3,4,5,6]
y [1,2,3,4,5,6] Paris [5]
likes
lives
cpuA
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 51: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/51.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [1,2,3,4,5,6]
y [1,2,3,4,5,6] Paris [5]
likes
lives
cpuA
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 52: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/52.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [1,2,3,4,5,6]
y [1,2,3,4,5,6] Paris [5]
likes
lives
cpuA
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 53: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/53.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [2,4]
y [1] Paris [5]
likes(2,1),(4,3)
lives(1,5),(3,5),(4,5)
cpuA(1,4)
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 54: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/54.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [2,4]
y [1] Paris [5]
likes(2,1),(4,3)
lives(1,5),(3,5),(4,5)
cpuA(1,4)
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 55: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/55.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #1
Example:
F :{computerAnalyst(Alice),
schoolTeacher(Bob),fireman(Carl),
computerAnalyst(Diana)
lives(Alice,Paris),lives(Bob,Montpellier),
lives(Carl,Paris),lives(Diana,Paris)
likes(Bob,Alice),likes(Diana,Carl)}
1 2 3Alice Bob Carl
4 5 6Diana Paris Montpellier
Q = {likes(x,y),computerAnalyst(y), lives(y,Paris)}
x [2,4]
y [1] Paris [5]
likes(2,1),(4,3)
lives(1,5),(3,5),(4,5)
cpuA(1,4)
Solutions to the network:{{x = 2, y = 1, Paris = 5}}
Answers to the query:{{x = Bob, y = Alice, Paris = Paris}}
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 41 / 48
![Page 56: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/56.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation
Once again, reading the whole KB prior to an operation: forbidden.
In order to adapt our problem into a CSP, a different transformation isneeded:
The network is built according to the query, but no informationfrom the KB is read at construction time.
Introduction of a new constraint type: once it is triggered, it adds anew constraint to the network and then reads the feasible tuplesfrom the KB.
Introduction of a threshold value: constraints are only triggeredwhen the domain of one of the variables goes under that value.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 42 / 48
![Page 57: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/57.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #2
Example: Q = {p(x , y), q(y ,A), r(x ,A)}
x [1:N]
y [1:N] A [1:N]
p
q
r
Figure: Example of a CSP network representing a query.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 43 / 48
![Page 58: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/58.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #2
Example: Q = {p(x , y), q(y ,A), r(x ,A)}
x [1:N]
y [1:N] A [1]
p
q
r
A=1
Figure: Example of a CSP network representing a query.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 43 / 48
![Page 59: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/59.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #2
Example: Q = {p(x , y), q(y ,A), r(x ,A)}
x [x1,x2,..., xn1]
y [y1,y2,..., yn2] A [1]
p
q
r
enumq(?,A)
enumr(?,A)
A=1
Figure: Example of a CSP network representing a query.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 43 / 48
![Page 60: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/60.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Transformation #2
Example: Q = {p(x , y), q(y ,A), r(x ,A)}
x [x ′k ,x ′k ,..., xk1]
y [y ′1,y ′2,..., yk2] A [1]
p
q
r
enumq(?,A)
enumr(?,A)
A=1
checkp(xi ,yj)
Figure: Example of a CSP network representing a query.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 43 / 48
![Page 61: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/61.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Current issues
So far, different issues have appeared for performing conjunctivequeries over large knowledge bases using a CSP solver:
Queries must contain a constant term in order to the solverstarts.
The max value of explored terms per variable (N) is very low(around 30K-35K).
The backtracking process of the solver needs to be manuallydefined.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 44 / 48
![Page 62: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/62.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
1 Rule-Based Data Access
2 ALASKA
3 Experimental work
4 CSP
5 Conclusion
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 45 / 48
![Page 63: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/63.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Contributions
The achievements of this work are:
A generic and logic-based software architecture allowing thecommunication to different storage systems through an unified language.
The ability of using such architecture to write higher-level reasoningprograms independently of how the data to be manipulated is stored.
An algorithm that enables the storage of a very large knowledge base ondisk in a single machine and avoids full memory consumption.
The use of such architecture to perform efficiency comparisons on theelementary operations of the RBDA problem.
The transformation of the problem of querying a knowledge base into aconstraint solving problem and its adaptation to large knowledge bases.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 46 / 48
![Page 64: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/64.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
Future work
Implementation and integration of rules algorithms intoALASKA.
The perspective of enabling ALASKA to integrate DescriptionLogics.
Using indexing techniques for the CSP version of our problemwhen only propagating is not enough for finding substitutions.
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 47 / 48
![Page 65: Data Access over Large Semi-Structured DatabasesPAIVA LIMA DA SILVA Bruno 12 / 48. Rule-Based Data AccessALASKAExperimental workCSPConclusion Research question Given the problem (RBDA)](https://reader034.vdocuments.site/reader034/viewer/2022050401/5f7f61e15db30b5aa4340837/html5/thumbnails/65.jpg)
Rule-Based Data Access ALASKA Experimental work CSP Conclusion
The end
Thank you for your attention!
Data Access over Large Semi-Structured Databases
PAIVA LIMA DA SILVA Bruno 48 / 48