WWW 2017 Tutorial: Semantic Data Management in Practice
Part 2: Storage and Querying
Olaf Hartig, Linköping University
@olafhartig
Olivier Curé, University of Paris-Est Marne la Vallée
@oliviercure
Goals
● Achieve an initial understanding of the RDF database management ecosystem
● Understand the differences between the seven production-ready stores we identified
Overview
● RDF storage
● Seven production-ready RDF stores
● Ontology Based Data Access
● Demo
● APIs
RDF Storage
● Although most production-ready RDF stores support ACID properties, they are best considered as
  – OLAP (online analytical processing) systems
  – not OLTP (online transaction processing) systems
● This implies that updates are performed in batches
  – Mainly due to reasoning (see Section 5)
RDF Storage
● RDF is a logical data model and thus does not impose any physical storage solution
● Existing RDF stores are either
  – based on an existing database management system
    ● relational model, e.g., PostgreSQL
    ● NoSQL, e.g., Cassandra
  – designed from scratch, e.g., as a graph store
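To illustrate the relational option, here is a minimal sketch of a single "triple table" layout. It uses Python's built-in sqlite3 as a stand-in for a system such as PostgreSQL. The table, column, and index names, as well as the example identifiers, are assumptions made for illustration; real stores typically add dictionary encoding and further index permutations.

```python
import sqlite3

# Hypothetical example data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:sensor1", "rdf:type", "ssn:SensingDevice"),
    ("ex:sensor1", "geo:lat", "48.83"),
    ("ex:sensor1", "geo:long", "2.21"),
    ("ex:sensor2", "rdf:type", "ssn:SensingDevice"),
]


def build_triple_store(triples):
    """Load triples into an in-memory relational table, one row per triple."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triples (s TEXT, p TEXT, o TEXT, PRIMARY KEY (s, p, o))"
    )
    # A second index ordered by predicate and object supports lookups by value.
    conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def subjects_of_type(conn, rdf_class):
    """Answer the triple pattern (?s rdf:type rdf_class) with plain SQL."""
    rows = conn.execute(
        "SELECT s FROM triples WHERE p = 'rdf:type' AND o = ? ORDER BY s",
        (rdf_class,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = build_triple_store(TRIPLES)
    print(subjects_of_type(store, "ssn:SensingDevice"))
```

The point of the sketch is that the logical model (triples) stays the same whichever physical layout is chosen; only the SQL behind `subjects_of_type` would change.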
RDF Stores Taxonomy
RDF Store Ecosystem
Distributed RDF data management
● RDF storage is part of Big Data
● Distribution of RDF triples over a cluster of machines
7 Production-Ready Systems
● They all guarantee
  – ACID transactions
  – Replication (mostly master-slave, some master-master)
  – Partitioning (range, hashing)
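The two partitioning schemes named on this slide can be sketched as follows. This is an illustrative assumption about how triples might be assigned to cluster nodes by subject. The function names, node count, and split points are hypothetical and do not describe any particular store.

```python
import bisect
import hashlib


def hash_partition(subject, num_nodes):
    """Assign a triple to a node by hashing its subject (stable across runs)."""
    digest = hashlib.sha1(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def range_partition(subject, boundaries):
    """Assign a triple to a node by comparing its subject to sorted split points.

    With boundaries [b0, b1], node 0 holds subjects < b0, node 1 holds
    subjects in [b0, b1), and node 2 holds subjects >= b1.
    """
    return bisect.bisect_right(boundaries, subject)


def distribute(triples, assign):
    """Group triples by the node index that `assign` returns for their subject."""
    nodes = {}
    for triple in triples:
        nodes.setdefault(assign(triple[0]), []).append(triple)
    return nodes


if __name__ == "__main__":
    # Hypothetical triples for illustration only.
    triples = [
        ("ex:a", "rdf:type", "ssn:SensingDevice"),
        ("ex:n", "rdf:type", "ssn:SensingDevice"),
        ("ex:z", "rdf:type", "ssn:SensingDevice"),
    ]
    print(distribute(triples, lambda s: hash_partition(s, 4)))
    print(distribute(triples, lambda s: range_partition(s, ["ex:m", "ex:t"])))
```

Hashing on the subject keeps all triples about one resource on the same node, which helps star-shaped queries, whereas range partitioning keeps lexically close subjects together.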
Data Models and Querying
● Some of these systems support other data models
  – XML for MarkLogic and Virtuoso
  – Property graph for GraphDB, Blazegraph and Stardog
  – Relational for Virtuoso and Oracle
  – Document for MarkLogic
● Hence query languages other than SPARQL (v1.1) can be supported
  – Gremlin for property graphs, XQuery for XML, SQL for relational, Prolog
[Figure: property graph example. Node http://njh.me/ssn#QBE01 with properties geolat 48.83, geolon 2.21, comment “Transm..”; other labels shown: observes, type, SensingDevice, DébitMètre, since 2012]
License
● Some of these systems have free editions, but with feature or usage limitations:
  – MarkLogic: the dev license is free for up to 1TB and 10 months max
  – Stardog: community (10 DBs max with 25M triples/DB, 4 users), dev (no limits but 30-day trial)
  – AllegroGraph: the free and dev editions are limited to 5M and 50M triples respectively
  – Virtuoso and GraphDB: free but no clustering and no replication
  – Blazegraph: free for a single machine
● All systems have commercial editions (Oracle is commercial only)
Summary of production-ready systems

| Triple store | Full-text search | Cloud-ready | Extra features |
|---|---|---|---|
| AllegroGraph | Integrated + Solr | AMI | |
| Blazegraph | Integrated + Solr | AMI | Reification done right |
| GraphDB | Integrated + Solr + Elasticsearch (ent.) | AMI | RDF ranking |
| MarkLogic | Integrated | AMI | With XQuery, JavaScript |
| Oracle | Integrated | | Inline in SQL |
| Stardog | Integrated + Lucene | AMI | Integrity constraints, Explanations |
| Virtuoso | Integrated | AMI | Inline in SQL |
OBDA (Ontology Based Data Access) Alternative
● Relevant when you have an existing (relational) database and want to reason over it using an ontology
● The ontology models the domain, hides the structure of the data sources and enriches incomplete data
● The ontology is connected to the data sources via mappings that relate concepts and properties to SQL views over the sources
● Queries, expressed in SPARQL, are translated into the query language of the sources (usually SQL)
● The state-of-the-art system is Ontop
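The mapping and query-translation idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not Ontop's algorithm or mapping syntax: the `sensor` table, its columns, the `ex:sensor/` identifier template, and the `rewrite` helper are all hypothetical, and only one fixed query shape is handled.

```python
import sqlite3

# Hypothetical mappings: each ontology term is related to a SQL view over the source.
# A class mapping yields one column (the instance); a property mapping yields two.
MAPPINGS = {
    "ssn:SensingDevice": "SELECT 'ex:sensor/' || id AS s FROM sensor",
    "geo:lat": "SELECT 'ex:sensor/' || id AS s, latitude AS o FROM sensor",
}


def rewrite(class_term, property_term):
    """Translate the pattern  ?s a class_term ; property_term ?o  into SQL."""
    return (
        "SELECT c.s, p.o "
        f"FROM ({MAPPINGS[class_term]}) AS c "
        f"JOIN ({MAPPINGS[property_term]}) AS p ON c.s = p.s "
        "ORDER BY c.s"
    )


def make_source():
    """Create a small relational source standing in for an existing database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sensor (id INTEGER PRIMARY KEY, latitude REAL)")
    conn.executemany("INSERT INTO sensor VALUES (?, ?)", [(1, 48.83), (2, 45.76)])
    return conn


if __name__ == "__main__":
    source = make_source()
    sql = rewrite("ssn:SensingDevice", "geo:lat")
    print(sql)
    for row in source.execute(sql):
        print(row)
```

The data never leaves the relational source: the ontology-level query is answered by running the generated SQL, so no triples are materialized.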
Demo
● With Blazegraph (v2.1.4)
  – Website: https://www.blazegraph.com/
  – Download: https://sourceforge.net/projects/bigdata/files/bigdata/2.1.4/blazegraph.jar/download
  – Start: java -server -Xmx4g -jar blazegraph.jar
  – http://localhost:9999/blazegraph
● And an extract of our sensor database instantiating the Semantic Sensor Network ontology
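Once the server is running, it can also be queried programmatically over HTTP. The following is a minimal sketch using only the Python standard library. It assumes that the SPARQL endpoint is reachable at `/blazegraph/sparql` under the address above and that the loaded data uses the `ssn:` namespace shown; both are assumptions, and the query itself is a hypothetical example rather than one from the demo.

```python
import json
import urllib.parse
import urllib.request

# Assumption: SPARQL endpoint of the local Blazegraph instance started above.
ENDPOINT = "http://localhost:9999/blazegraph/sparql"

# Hypothetical query; the ssn prefix IRI is an assumption about the loaded data.
QUERY = """
PREFIX ssn: <http://purl.oclc.org/NET/ssnx/ssn#>
SELECT ?sensor WHERE { ?sensor a ssn:SensingDevice } LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol POST request that asks for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_query(endpoint, query):
    """Send the query and return the result bindings (needs a running server)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        results = json.load(response)
    return results["results"]["bindings"]


if __name__ == "__main__":
    # Only inspect the request here; call run_query(ENDPOINT, QUERY) once the
    # server is up to actually execute it.
    request = build_request(ENDPOINT, QUERY)
    print(request.get_method(), request.full_url)
```

Because the request follows the standard SPARQL protocol, the same sketch should work against other stores by changing `ENDPOINT`.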
Available APIs
● Two popular Java APIs to process and handle RDF data and SPARQL queries are:
  – RDF4J (formerly Sesame)
  – Apache Jena
● They both provide
  – a JDBC-like API and a REST-like API
  – storing, querying and reasoning capabilities