GraphDB Connectors – Powering Complex SPARQL Queries
About Ontotext
• Provides products & solutions for content enrichment and metadata management
− 70 employees, headquarters in Sofia (Bulgaria)
− Sales presence in NYC and London
• Major clients and industries
− Media & Publishing
− Health Care & Life Sciences
− Cultural Heritage & Digital Libraries
− Financial information providers
− Government
− Education
Aug 2015 2 Smart Data Week 2015
Ontotext’s Vision for Smart Data Management
Graph Database
• Flexible RDF graph data model
• Ontology metadata layer
Semantic Search
• Semantic, exploratory search
• Metadata-driven content
Text Mining & Interlinking
• People, locations, organisations, topics
• Discover implicit relations
• Reuse open knowledge graphs
SPARQL – the Good & the Bad
• Very good for complex graph pattern matching
• Not so good for
− Full-text search
− Snippet extraction
− Faceted search
− Complex aggregations
− Range queries
What If…
• We could get the full power of SPARQL
• … and extremely fast
− Full-text search / snippet extraction
− Faceted search
− Complex aggregations
− Range queries
• … while using only SPARQL (query + update)
Ontotext GraphDB Connectors
• Provide extremely fast full-text search, range, faceted search, aggregations
• Utilise an external engine like Lucene, Solr or Elasticsearch
• Flexible schema mapping: index only what you need
• Real-time synchronization of data in GraphDB and the external engine
• Connector management via SPARQL
• Data querying & update via SPARQL
• Based on the GraphDB plug-in architecture
Workflow
[Diagram: the GraphDB engine (query processor, graph indexes, internal indexes) selectively replicates data to Lucene/Solr/Elasticsearch. Clients send SPARQL INSERT/DELETE and SPARQL SELECT (with or without an embedded Lucene/Solr/Elasticsearch query) to GraphDB; Solr/Elasticsearch can also be queried directly.]
Interface
• All interaction via SPARQL queries
− INSERT for creating connectors
− SELECT for getting connector configuration parameters
− INSERT/SELECT/DELETE for managing & querying RDF data
Sample Data
@prefix : <http://www.ontotext.com/example/wine#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
:RedWine rdfs:subClassOf :Wine .
:WhiteWine rdfs:subClassOf :Wine .
:RoseWine rdfs:subClassOf :Wine .
:Merlo rdf:type :Grape ;
    rdfs:label "Merlo" .
:CabernetSauvignon rdf:type :Grape ;
    rdfs:label "Cabernet Sauvignon" .
:CabernetFranc rdf:type :Grape ;
    rdfs:label "Cabernet Franc" .
:PinotNoir rdf:type :Grape ;
    rdfs:label "Pinot Noir" .
:Chardonnay rdf:type :Grape ;
    rdfs:label "Chardonnay" .
:Yoyowine rdf:type :RedWine ;
    :madeFromGrape :CabernetSauvignon ;
    :hasSugar "dry" ;
    :hasYear "2013"^^xsd:integer .
:Franvino rdf:type :RedWine ;
    :madeFromGrape :Merlo ;
    :madeFromGrape :CabernetFranc ;
    :hasSugar "dry" ;
    :hasYear "2012"^^xsd:integer .
:Noirette rdf:type :RedWine ;
    :madeFromGrape :PinotNoir ;
    :hasSugar "medium" ;
    :hasYear "2012"^^xsd:integer .
:Blanquito rdf:type :WhiteWine ;
    :madeFromGrape :Chardonnay ;
    :hasSugar "dry" ;
    :hasYear "2012"^^xsd:integer .
:Rozova rdf:type :RoseWine ;
    :madeFromGrape :PinotNoir ;
    :hasSugar "medium" ;
    :hasYear "2013"^^xsd:integer .
Create a Connector
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
INSERT DATA {
    inst:my_index :createConnector '''
{
  "elasticsearchNode": "localhost:9300",
  "types": [ "http://www.ontotext.com/example/wine#Wine" ],
  "fields": [
    {
      "fieldName": "grape",
      "propertyChain": [
        "http://www.ontotext.com/example/wine#madeFromGrape",
        "http://www.w3.org/2000/01/rdf-schema#label"
      ]
    }
  ]
}
''' .
}
• Connector name: inst:my_index
• ES instance: "elasticsearchNode"
• Entities to be synced: "types"
• Properties to be indexed: "fields"
Similar for Solr
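The connector definition above is a JSON document embedded in a SPARQL update, so it can be generated rather than hand-written. The following is a minimal sketch in Python; the helper name build_create_connector is hypothetical and not part of GraphDB, and the JSON keys are only those shown on the slide.

```python
import json

ES_NS = "http://www.ontotext.com/connectors/elasticsearch#"
INST_NS = "http://www.ontotext.com/connectors/elasticsearch/instance#"
WINE_NS = "http://www.ontotext.com/example/wine#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def build_create_connector(name, node, types, fields):
    """Return a SPARQL INSERT DATA string that creates a connector.

    Hypothetical helper: `fields` maps a field name to its property chain.
    """
    config = {
        "elasticsearchNode": node,
        "types": types,
        "fields": [
            {"fieldName": field_name, "propertyChain": chain}
            for field_name, chain in fields.items()
        ],
    }
    # json.dumps always emits syntactically valid JSON (no trailing commas).
    body = json.dumps(config, indent=2)
    return (
        f"PREFIX : <{ES_NS}>\n"
        f"PREFIX inst: <{INST_NS}>\n"
        "INSERT DATA {\n"
        f"  inst:{name} :createConnector '''\n{body}\n''' .\n"
        "}"
    )


update = build_create_connector(
    "my_index",
    "localhost:9300",
    [WINE_NS + "Wine"],
    {"grape": [WINE_NS + "madeFromGrape", RDFS_LABEL]},
)
print(update)
```

The resulting string would be sent to the repository as an ordinary SPARQL update. Generating the JSON from a dictionary avoids syntax slips such as stray commas.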
What’s Indexed in Elasticsearch
Wine        Grape
:Yoyowine   Cabernet Sauvignon
:Franvino   Merlo, Cabernet Franc
:Noirette   Pinot Noir
:Blanquito  Chardonnay
:Rozova     Pinot Noir
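To see how the table above follows from the connector definition, here is a small in-memory sketch in Python. It is an illustration only, not how the connector is implemented: the triples use local names instead of full IRIs, and the function names are made up for this example. It selects every instance of :Wine (including its subclasses) and follows the property chain madeFromGrape, then label.

```python
# Sample data as (subject, predicate, object) triples, local names only.
TRIPLES = [
    ("RedWine", "subClassOf", "Wine"),
    ("WhiteWine", "subClassOf", "Wine"),
    ("RoseWine", "subClassOf", "Wine"),
    ("Merlo", "label", "Merlo"),
    ("CabernetSauvignon", "label", "Cabernet Sauvignon"),
    ("CabernetFranc", "label", "Cabernet Franc"),
    ("PinotNoir", "label", "Pinot Noir"),
    ("Chardonnay", "label", "Chardonnay"),
    ("Yoyowine", "type", "RedWine"),
    ("Yoyowine", "madeFromGrape", "CabernetSauvignon"),
    ("Franvino", "type", "RedWine"),
    ("Franvino", "madeFromGrape", "Merlo"),
    ("Franvino", "madeFromGrape", "CabernetFranc"),
    ("Noirette", "type", "RedWine"),
    ("Noirette", "madeFromGrape", "PinotNoir"),
    ("Blanquito", "type", "WhiteWine"),
    ("Blanquito", "madeFromGrape", "Chardonnay"),
    ("Rozova", "type", "RoseWine"),
    ("Rozova", "madeFromGrape", "PinotNoir"),
]


def objects(subject, predicate):
    """All objects of the given subject/predicate pair, in data order."""
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def instances_of(cls):
    """Instances of cls or of its direct subclasses (one level suffices here)."""
    classes = {cls} | {s for (s, p, o) in TRIPLES if p == "subClassOf" and o == cls}
    return [s for (s, p, o) in TRIPLES if p == "type" and o in classes]


def follow_chain(entity, chain):
    """Follow a property chain from entity and return the values at its end."""
    nodes = [entity]
    for prop in chain:
        nodes = [o for node in nodes for o in objects(node, prop)]
    return nodes


# One "document" per wine, with the grape labels as the indexed field.
index = {
    wine: follow_chain(wine, ["madeFromGrape", "label"])
    for wine in instances_of("Wine")
}
for wine, grapes in index.items():
    print(wine, grapes)
```

Note that :Franvino gets two values for the grape field because it has two madeFromGrape statements.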
Full-text Search with SPARQL
General form:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
SELECT ?entity {
    ?search a inst:<connector-name> ;
        :query "<elastic-search-query>" ;
        :entities ?entity .
}
Example:
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
PREFIX wine: <http://www.ontotext.com/example/wine#>
SELECT ?entity ?grape ?year {
    ?search a inst:my_index ;
        :query "grape:cabernet" ;
        :entities ?entity .
    ?entity wine:madeFromGrape ?grape .
    ?entity wine:hasYear ?year
}
• ?entity: instances of :Wine
• The wine:madeFromGrape and wine:hasYear patterns combine Elasticsearch & SPARQL results
?entity     ?grape              ?year
:Yoyowine   :CabernetSauvignon  2013
:Franvino   :Merlo              2012
:Franvino   :CabernetFranc      2012
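The result rows above come from two steps: the external engine returns the matching entities, and the remaining graph patterns are joined against them. A rough Python sketch of that logic follows, under loud assumptions: the search function is a toy stand-in for the Elasticsearch query grape:<term> (a case-insensitive token match on grape labels), and all names are invented for illustration.

```python
# Sample wines with their grapes (local names) and vintage year.
WINES = {
    "Yoyowine": {"grapes": ["CabernetSauvignon"], "year": 2013},
    "Franvino": {"grapes": ["Merlo", "CabernetFranc"], "year": 2012},
    "Noirette": {"grapes": ["PinotNoir"], "year": 2012},
    "Blanquito": {"grapes": ["Chardonnay"], "year": 2012},
    "Rozova": {"grapes": ["PinotNoir"], "year": 2013},
}

# rdfs:label of each grape, which is what the connector indexed.
LABELS = {
    "Merlo": "Merlo",
    "CabernetSauvignon": "Cabernet Sauvignon",
    "CabernetFranc": "Cabernet Franc",
    "PinotNoir": "Pinot Noir",
    "Chardonnay": "Chardonnay",
}


def search(term):
    """Toy stand-in for the full-text query: token match on grape labels."""
    term = term.lower()
    return [
        wine
        for wine, data in WINES.items()
        if any(term in LABELS[g].lower().split() for g in data["grapes"])
    ]


def full_text_join(term):
    """Join the search hits with the madeFromGrape and hasYear patterns."""
    rows = []
    for wine in search(term):
        for grape in WINES[wine]["grapes"]:
            rows.append((wine, grape, WINES[wine]["year"]))
    return rows


rows = full_text_join("cabernet")
for row in rows:
    print(row)
```

This also explains why :Merlo appears in the results: :Franvino matches through its Cabernet Franc grape, and the graph pattern then returns all of its grapes.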
Faceted Search with SPARQL
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
INSERT DATA {
    inst:my_index2 :createConnector '''
{
  "elasticsearchNode": "localhost:9300",
  "types": [ "http://www.ontotext.com/example/wine#Wine" ],
  "fields": [
    {
      "fieldName": "sugar",
      "propertyChain": [ "http://www.ontotext.com/example/wine#hasSugar" ]
    },
    {
      "fieldName": "year",
      "propertyChain": [ "http://www.ontotext.com/example/wine#hasYear" ]
    }
  ]
}
''' .
}
• Connector name: inst:my_index2
• ES instance: "elasticsearchNode"
• Entities to be synced: "types"
• Properties to be indexed: "fields"
PREFIX : <http://www.ontotext.com/connectors/elasticsearch#>
PREFIX inst: <http://www.ontotext.com/connectors/elasticsearch/instance#>
SELECT ?facetName ?facetValue ?facetCount
WHERE {
    ?r a inst:my_index2 ;
        :facetFields "year,sugar" ;
        :facets _:f .
    _:f :facetName ?facetName .
    _:f :facetValue ?facetValue .
    _:f :facetCount ?facetCount .
}
Faceted Search with SPARQL
?facetName  ?facetValue  ?facetCount
year        2012         3
year        2013         2
sugar       dry          3
sugar       medium       2
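The facet counts above can be checked by hand against the sample data. A short Python sketch of the same computation is given below; it only illustrates what a facet is (a count of documents per distinct field value) and makes no claim about how Elasticsearch computes it. The function and variable names are assumptions for this example.

```python
from collections import Counter

# The indexed fields of my_index2 for each wine in the sample data.
DOCS = [
    {"id": "Yoyowine", "sugar": "dry", "year": 2013},
    {"id": "Franvino", "sugar": "dry", "year": 2012},
    {"id": "Noirette", "sugar": "medium", "year": 2012},
    {"id": "Blanquito", "sugar": "dry", "year": 2012},
    {"id": "Rozova", "sugar": "medium", "year": 2013},
]


def facets(docs, fields):
    """Return (facetName, facetValue, facetCount) rows, largest count first."""
    rows = []
    for field in fields:
        counts = Counter(doc[field] for doc in docs)
        for value, count in counts.most_common():
            rows.append((field, value, count))
    return rows


facet_rows = facets(DOCS, ["year", "sugar"])
for name, value, count in facet_rows:
    print(name, value, count)
```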
Summary
• High-performance full-text search, faceted search & aggregations within SPARQL are important
• Ontotext GraphDB Connectors address this by utilising external engines like Elasticsearch and Solr
• Data access is via SPARQL only; the external engine is transparent to applications and users