NDBI040: Big Data Management and NoSQL Databases
http://www.ksi.mff.cuni.cz/~svoboda/courses/171-NDBI040/

Lecture 4

RDF Stores: SPARQL

Martin Svoboda
[email protected]

24. 10. 2017

Charles University in Prague, Faculty of Mathematics and Physics
Czech Technical University in Prague, Faculty of Electrical Engineering

Lecture Outline

RDF stores
• Introduction
• Linked Data

SPARQL query language
• Graph patterns
• Filter constraints
• Solution modifiers
• Aggregation
• Query forms

NDBI040: Big Data Management and NoSQL Databases | Lecture 4: RDF Stores: SPARQL | 24. 10. 2017 2

RDF Stores

Data model
• RDF triples
    Components: subject, predicate, and object
    Each triple represents a statement about a real-world entity
• Triples can be viewed as graphs
    Vertices for subjects and objects
    Edges directly correspond to individual statements

Query language
• SPARQL: SPARQL Protocol and RDF Query Language

Representatives
• Apache Jena, rdf4j (Sesame), Algebraix
• Multi-model: MarkLogic, OpenLink Virtuoso

Linked Data

• Method of publishing structured and interlinked data in a way that allows automated processing by programs rather than browsing by human readers

Principles of Linked Open Data
• Identify resources using URIs or, even better, URLs
• Publish data about resources in standard formats via HTTP
• Mutually interlink resources to form a Web of Data
• Release the data under an open licence

Linked Open Data Cloud

[Figures: Linked Open Data cloud diagrams for May 2007, September 2011, and August 2017. Source: http://lod-cloud.net/]

Linked Data

Statistics
• October 2007
    25 datasets
    2 billion triples, 2 million links
• September 2011
    295 datasets
    31 billion triples, 504 million links
• August 2017
    1163 datasets

SPARQL Query Language

SPARQL

SPARQL Query Language
• Query language for RDF data
    Graph patterns, optional graph patterns, subqueries, negation, aggregation, value constructors, …
• Versions: 1.0 (2008), 1.1 (2013)
• W3C recommendations
    https://www.w3.org/TR/sparql11-query/
    Altogether 11 recommendations: query language, update facility, federated queries, protocol, result formats, …

Sample Data

Graph of movies <http://db.cz/movies>

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix i: <http://db.cz/terms#> .
@prefix m: <http://db.cz/movies/> .
@prefix a: <http://db.cz/actors/> .

m:vratnelahve
  rdf:type i:Movie ;
  i:title "Vratné lahve" ;
  i:year "2006" ;
  i:actor a:sverak , a:machacek .

m:samotari
  rdf:type i:Movie ;
  i:title "Samotáři" ;
  i:year "2000" ;
  i:actor a:schneiderova , a:trojan , a:machacek .

m:medvidek
  rdf:type i:Movie ;
  i:title "Medvídek" ;
  i:year "2007" ;
  i:actor a:machacek , a:trojan ;
  i:director "Jan Hřebejk" .

m:zelary
  rdf:type i:Movie .

Sample Data

Graph of actors <http://db.cz/actors>

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix i: <http://db.cz/terms#> .
@prefix a: <http://db.cz/actors/> .

a:trojan
  rdf:type i:Actor ;
  i:firstname "Ivan" ; i:lastname "Trojan" ;
  i:year "1964" .

a:machacek
  rdf:type i:Actor ;
  i:firstname "Jiří" ; i:lastname "Macháček" ;
  i:year "1966" .

a:schneiderova
  rdf:type i:Actor ;
  i:firstname "Jitka" ; i:lastname "Schneiderová" ;
  i:year "1973" .

a:sverak
  rdf:type i:Actor ;
  i:firstname "Zdeněk" ; i:lastname "Svěrák" ;
  i:year "1936" .

Sample Query

Find all movies, return their titles and the years they were filmed

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t ?y
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t ;
     i:year ?y .
}
ORDER BY ?y

Result:
  ?t             ?y
  Samotáři       2000
  Vratné lahve   2006
  Medvídek       2007

Graph Pattern Matching

Graph patterns
• Basic graph pattern
    Based on ordinary triples with variables
    – ?variable or $variable
• More complicated graph patterns
    E.g. group, optional, minus, …

Graph pattern matching
• Our goal is to find all subgraphs of the data graph that are matched by the query graph pattern
    I.e. subgraphs of the data graph that are identical to the query graph pattern with variables substituted by particular terms
• One matching subgraph = one solution = one row of a table
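
The matching of a single triple pattern can be sketched in a few lines of Python. This is only an illustrative model, not how an RDF store is implemented: the `DATA` set, the helper names `is_var` and `match_triple`, and the use of plain strings for prefixed names are all assumptions made for the example.

```python
# Toy data graph: a set of (subject, predicate, object) triples.
# Prefixed names are kept as plain strings for brevity (illustrative shortcut).
DATA = {
    ("m:samotari", "rdf:type", "i:Movie"),
    ("m:samotari", "i:title", "Samotáři"),
    ("m:samotari", "i:year", "2000"),
    ("m:zelary", "rdf:type", "i:Movie"),
}


def is_var(term):
    """Variables are written ?name, as in SPARQL."""
    return term.startswith("?")


def match_triple(pattern, data):
    """Return one solution (dict of variable bindings) per matching triple."""
    solutions = []
    for triple in data:
        binding = {}
        for p_term, d_term in zip(pattern, triple):
            if is_var(p_term):
                # A variable repeated in the pattern must bind to the same term.
                if binding.setdefault(p_term, d_term) != d_term:
                    break
            elif p_term != d_term:
                break  # a constant term in the pattern must match exactly
        else:
            solutions.append(binding)
    return solutions
```

Each returned dictionary corresponds to one row of the result table, for example `match_triple(("?m", "i:title", "?t"), DATA)` yields a single solution binding `?m` and `?t`.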

Graph Pattern Matching

Query result = solution sequence = ordered multiset of solutions

  ?t             ?y
  Samotáři       2000
  Vratné lahve   2006
  Medvídek       2007

  {
    { (?t, "Samotáři"), (?y, "2000") },
    { (?t, "Vratné lahve"), (?y, "2006") },
    { (?t, "Medvídek"), (?y, "2007") }
  }

Solution = set of variable bindings

  ?t             ?y
  Samotáři       2000

  { (?t, "Samotáři"), (?y, "2000") }

Variable binding = pair of a variable name and the value assigned to it

  ?t
  Samotáři

  (?t, "Samotáři")

Graph Pattern Matching

Compatibility of solutions
• Two solutions are mutually compatible if and only if all the variables they share are pairwise bound to identical values

Examples
• Compatible solutions
    { (?m, m:samotari), (?t, "Samotáři") }
    { (?m, m:samotari), (?y, "2000") }
• Incompatible solutions
    { (?m, m:samotari), (?t, "Samotáři") }
    { (?m, m:medvidek), (?y, "2007") }
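
The compatibility test can be written as a one-line check. The sketch below assumes solutions are modelled as Python dictionaries from variable names to values; the function name `compatible` and the variables `a`, `b`, `c` are illustrative only.

```python
def compatible(s1, s2):
    """Two solutions are compatible iff every shared variable has the same value."""
    return all(s1[v] == s2[v] for v in s1.keys() & s2.keys())


# The solutions from the examples above.
a = {"?m": "m:samotari", "?t": "Samotáři"}
b = {"?m": "m:samotari", "?y": "2000"}
c = {"?m": "m:medvidek", "?y": "2007"}
```

Note that two solutions sharing no variable at all are trivially compatible, which matters later for the MINUS graph pattern.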

Select Queries

SELECT queries

Syntax (in this order):
  prologue declarations
  SELECT clause
  FROM clause
  WHERE clause
  solution modifiers

• Prologue declarations – PREFIX, BASE
• Main clauses
    SELECT – variables to be projected
    FROM – data graphs to be queried
    WHERE – graph patterns to be matched
• Solution modifiers – ORDER BY, …

Prologue Declarations

Prologue declarations
• Allow IRI references to be simplified by declaring base IRIs

Syntax:
  BASE IRI reference
  PREFIX prefix name : IRI reference

BASE clause
• One single base IRI is defined
    All relative IRI references are then resolved against this base IRI

PREFIX clause
• Several base IRIs are defined, each associated with a name
    All prefixed names are then resolved against the respective base IRI

Prologue Declarations

Examples
• When BASE <http://db.cz/> is defined, then the relative IRI reference <terms#Movie> is interpreted as http://db.cz/terms#Movie
• When PREFIX i: <http://db.cz/terms#> is defined, then the prefixed name i:Movie is interpreted as http://db.cz/terms#Movie
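
Both mechanisms can be imitated with the Python standard library. This is a rough sketch under simplifying assumptions: relative references are resolved with `urllib.parse.urljoin`, prefixed names are expanded by plain string concatenation, and the prefix table (borrowed from the sample data) as well as the names `resolve_relative` and `expand_prefixed` are invented for the illustration.

```python
from urllib.parse import urljoin

BASE = "http://db.cz/"
PREFIXES = {
    "i": "http://db.cz/terms#",
    "m": "http://db.cz/movies/",
}


def resolve_relative(ref, base=BASE):
    """Resolve a relative IRI reference against the base IRI (BASE clause)."""
    return urljoin(base, ref)


def expand_prefixed(name, prefixes=PREFIXES):
    """Expand prefix:local by concatenating the prefix IRI and the local part."""
    prefix, local = name.split(":", 1)
    return prefixes[prefix] + local
```

For instance, `resolve_relative("terms#Movie")` and `expand_prefixed("i:Movie")` both denote the same IRI.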

Where Clause

WHERE clause
• Prescribes one group graph pattern

Syntax:
  WHERE group graph pattern

Types of graph patterns
• Basic – triple patterns to be matched
• Group – set of graph patterns to be matched
• Optional – graph pattern to be matched only if possible
• Alternative – two or more alternative graph patterns
• …

Graph patterns can be inductively combined into complex ones

Graph Patterns
Basic Graph Pattern

Basic graph pattern (triple block)

One or more triple patterns to be all matched
• Ordinary triples separated by .
• … or their abbreviated forms inspired by Turtle notation
    Object lists using ,
    Predicate-object lists using ;
    Blank nodes using []

Examples
• s p1 o1 . s p1 o2 . s p2 o3 .
• s p1 o1 , o2 ; p2 o3 .

Graph Patterns
Basic Graph Pattern

Interpretation
• All the involved triple patterns must be matched
    I.e. we combine them as if they were in conjunction
    More precisely…
    – Each triple pattern is evaluated to its solution sequence
    – All combinations of compatible solutions are then found
• Note that all the variables need to be bound
    I.e. if any of the involved variables cannot be bound at all, then the entire basic graph pattern cannot be matched!
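
The "all combinations of compatible solutions" step is essentially a join. The following Python sketch models it with nested loops over lists of dictionaries; the helper names `compatible` and `join` and the hand-written solution sequences `t1`, `t2`, `t3` (mirroring the three triple patterns of the movie query) are assumptions of this example, and real engines use far more efficient strategies.

```python
def compatible(s1, s2):
    """Shared variables must be bound to identical values."""
    return all(s1[v] == s2[v] for v in s1.keys() & s2.keys())


def join(left, right):
    """Merge every pair of compatible solutions (conjunction of two patterns)."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


# Solution sequences of the individual triple patterns (written by hand here).
t1 = [{"?m": "m:vratnelahve"}, {"?m": "m:samotari"},
      {"?m": "m:medvidek"}, {"?m": "m:zelary"}]
t2 = [{"?m": "m:vratnelahve", "?t": "Vratné lahve"},
      {"?m": "m:samotari", "?t": "Samotáři"},
      {"?m": "m:medvidek", "?t": "Medvídek"}]
t3 = [{"?m": "m:vratnelahve", "?y": "2006"},
      {"?m": "m:samotari", "?y": "2000"},
      {"?m": "m:medvidek", "?y": "2007"}]

result = join(join(t1, t2), t3)
```

The movie `m:zelary` disappears from `result` because it has no title and no year, so `?t` and `?y` cannot be bound for it.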

Graph Patterns: Example
Basic Graph Pattern

Titles and years of all movies

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t ?y
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie .   # triple 1
  ?m i:title ?t .         # triple 2
  ?m i:year ?y .          # triple 3
}

Result:
  ?t             ?y
  Vratné lahve   2006
  Samotáři       2000
  Medvídek       2007

Graph Patterns: Example
Basic Graph Pattern

⟦t1⟧ =
  ?m
  m:vratnelahve
  m:samotari
  m:medvidek
  m:zelary

⟦t2⟧ =
  ?m              ?t
  m:vratnelahve   Vratné lahve
  m:samotari      Samotáři
  m:medvidek      Medvídek

⟦t3⟧ =
  ?m              ?y
  m:vratnelahve   2006
  m:samotari      2000
  m:medvidek      2007

Result:
  ?t             ?y
  Vratné lahve   2006
  Samotáři       2000
  Medvídek       2007

Graph Patterns
Basic Graph Pattern

Equivalence of literals
• Values must be identical
• And when types / language tags are specified, these must be identical as well
    E.g.: "Medvídek"@cs != "Medvídek"

Equivalence of blank nodes
• Blank nodes in query patterns act as non-selectable variables
• Identical blank node labels in data graphs / query graph patterns / query results do not necessarily refer to the same nodes
    I.e. the scope of validity is always local only

Graph Patterns
Group Graph Pattern

Group graph pattern

Set of graph patterns to be all matched

Syntax: { … } enclosing either
  a nested SELECT query, or
  triples interleaved with any number of:
    group graph pattern
    OPTIONAL graph pattern
    UNION graph pattern
    MINUS graph pattern
    GRAPH graph pattern
    BIND
    FILTER

Graph Patterns
Group Graph Pattern

Two modes
• Nested SELECT query
    Only with SELECT and WHERE clauses and solution modifiers, i.e. without FROM clause
• Set of graph patterns interleaved by triple blocks

Interpretation
• All the involved graph patterns must be matched
    I.e. we combine them as if they were in conjunction

Notes
• Empty group patterns {} are also allowed

Graph Patterns
Optional Graph Pattern

OPTIONAL graph pattern

An attempt is made to match one group graph pattern

Syntax:
  OPTIONAL group graph pattern

Interpretation
• When the optional part does not match, it creates no bindings but does not eliminate the solution
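
This behaviour resembles a left outer join over solution sequences. The Python sketch below is a simplified model under stated assumptions: solutions are dictionaries, the helper names `compatible` and `left_join` are made up, and filter conditions inside the optional part are ignored.

```python
def compatible(s1, s2):
    """Shared variables must be bound to identical values."""
    return all(s1[v] == s2[v] for v in s1.keys() & s2.keys())


def left_join(left, right):
    """Extend each left solution with compatible right solutions.

    A left solution without any compatible partner is kept unchanged,
    i.e. the optional variables simply stay unbound.
    """
    out = []
    for l in left:
        merged = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(merged if merged else [l])
    return out


movies = [{"?m": "m:vratnelahve", "?t": "Vratné lahve"},
          {"?m": "m:medvidek", "?t": "Medvídek"}]
directors = [{"?m": "m:medvidek", "?d": "Jan Hřebejk"}]

rows = left_join(movies, directors)
```

In `rows`, the solution for Vratné lahve survives without a `?d` binding, while the one for Medvídek gains `?d`.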

Graph Patterns: Example
Optional Graph Pattern

Movies together with their directors when possible

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t ?y ?d
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t ;
     i:year ?y .
  OPTIONAL { ?m i:director ?d . }
}

Result:
  ?t             ?y     ?d
  Vratné lahve   2006
  Samotáři       2000
  Medvídek       2007   Jan Hřebejk

Graph Patterns
Alternative Graph Pattern

UNION graph pattern

Two or more alternative group graph patterns, each of which is tried to be matched

Syntax:
  group graph pattern UNION group graph pattern

Interpretation
• Union of the solution sequences of the involved patterns (a multiset union, so duplicate solutions are preserved)

Graph Patterns
Minus Graph Pattern

MINUS graph pattern

One group graph pattern removing compatible solutions

Syntax:
  MINUS group graph pattern

Interpretation
• A solution of the first pattern is preserved if and only if it is not compatible with any solution of the second pattern that shares at least one variable with it
    I.e. the minus graph pattern does not correspond to the standard set minus operation!
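
A small Python model makes the difference from set minus visible. It follows the SPARQL 1.1 definition, in which a right-hand solution removes a left-hand one only if the two are compatible and share at least one variable; the dictionaries and the names `compatible`, `minus`, `p1`, `p2` are assumptions of this sketch.

```python
def compatible(s1, s2):
    """Shared variables must be bound to identical values."""
    return all(s1[v] == s2[v] for v in s1.keys() & s2.keys())


def minus(left, right):
    """Keep left solutions that no right solution 'cancels'.

    A right solution cancels a left one only if they are compatible
    AND have at least one variable in common.
    """
    def cancels(l, r):
        return bool(l.keys() & r.keys()) and compatible(l, r)

    return [l for l in left if not any(cancels(l, r) for r in right)]


p1 = [{"?m": "m:vratnelahve", "?t": "Vratné lahve"},
      {"?m": "m:samotari", "?t": "Samotáři"},
      {"?m": "m:medvidek", "?t": "Medvídek"}]
p2 = [{"?m": "m:medvidek", "?d": "Jan Hřebejk"}]

titles = [s["?t"] for s in minus(p1, p2)]
```

Although no solution of `p2` is literally contained in `p1`, the Medvídek row is still removed because the two solutions agree on the shared variable `?m`.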

Graph Patterns: Example
Minus Graph Pattern

Titles of movies that have no director

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t .                                      # pattern 1
  MINUS { ?m rdf:type i:Movie ; i:director ?d . }      # pattern 2
}

Result:
  ?t
  Vratné lahve
  Samotáři

Graph Patterns: Example
Minus Graph Pattern

⟦p1⟧ =
  ?m              ?t
  m:vratnelahve   Vratné lahve
  m:samotari      Samotáři
  m:medvidek      Medvídek

⟦p2⟧ =
  ?m              ?d
  m:medvidek      Jan Hřebejk

Result:
  ?t
  Vratné lahve
  Samotáři

From Clause

FROM clause
• Defines data graphs to be queried

Syntax:
  FROM [NAMED] ( IRI reference | prefixed name )

Dataset = collection of graphs to be queried
• One default graph
    Merge of all the declared graphs from unnamed FROM clauses
    Empty when no unnamed FROM clause is provided
• Zero or more named graphs

Active graph = used for the evaluation of graph patterns
• The default graph unless changed using GRAPH graph pattern
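
How a dataset is assembled from FROM and FROM NAMED clauses can be sketched as follows. Everything here is a toy model: `STORE` is a hypothetical mapping from graph IRIs to sets of triples, the function names are invented, and the merge is a plain set union that ignores the blank node handling of a proper RDF merge.

```python
# Hypothetical store: graph IRI -> set of (subject, predicate, object) triples.
STORE = {
    "http://db.cz/movies": {("m:medvidek", "i:actor", "a:trojan")},
    "http://db.cz/actors": {("a:trojan", "i:lastname", "Trojan")},
}


def build_dataset(store, from_iris=(), from_named_iris=()):
    """Return (default_graph, named_graphs) for FROM / FROM NAMED clauses."""
    default_graph = set()
    for iri in from_iris:  # graphs from unnamed FROM clauses are merged
        default_graph |= store[iri]
    named_graphs = {iri: store[iri] for iri in from_named_iris}
    return default_graph, named_graphs


def active_graph(dataset, graph_iri=None):
    """Default graph unless a GRAPH pattern selects a named graph."""
    default_graph, named_graphs = dataset
    return default_graph if graph_iri is None else named_graphs[graph_iri]
```

With two unnamed FROM clauses both graphs end up in the default graph; with `FROM` plus `FROM NAMED`, the actors graph is reachable only through `active_graph(dataset, "http://db.cz/actors")`.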

From Clause: Example

Names of actors who played in the movie Medvídek

PREFIX i: <http://db.cz/terms#>
PREFIX m: <http://db.cz/movies/>
SELECT ?f ?l
FROM <http://db.cz/movies>
FROM <http://db.cz/actors>
WHERE
{
  m:medvidek i:actor ?a .
  ?a i:firstname ?f ; i:lastname ?l .
}

Result:
  ?f     ?l
  Jiří   Macháček
  Ivan   Trojan

Graph Patterns
Graph Graph Pattern

GRAPH graph pattern

Pattern evaluated with respect to a particular named graph

Syntax:
  GRAPH ( variable | IRI reference | prefixed name ) group graph pattern

• Changes the active graph for a given group graph pattern
    GRAPH <http://db.cz/actors> { … }
• We can also consider all the named graphs
    GRAPH ?g { … }

Graph Patterns: Example
Graph Graph Pattern

Names of actors who played in the movie Medvídek

PREFIX i: <http://db.cz/terms#>
PREFIX m: <http://db.cz/movies/>
SELECT ?f ?l
FROM <http://db.cz/movies>
FROM NAMED <http://db.cz/actors>
WHERE
{
  m:medvidek i:actor ?a .
  GRAPH <http://db.cz/actors> {
    ?a i:firstname ?f ; i:lastname ?l .
  }
}

Result:
  ?f     ?l
  Jiří   Macháček
  Ivan   Trojan

Variable Assignments

BIND graph pattern

Explicitly assigns a value to a given variable

Syntax:
  BIND ( expression AS variable )

• This variable must not yet be bound!

Filter Constraints

FILTER constraints

Impose constraints on variables and their values

Syntax:
  FILTER ( built-in call | ( expression ) )

• Only solutions satisfying the given condition are preserved
• Does not create any new variable bindings!
• Always applied on the entire group graph pattern
    I.e. evaluated at the very end

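Filter Constraints: Example

Movies filmed in 2005 or later in which Ivan Trojan played

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
PREFIX a: <http://db.cz/actors/>
SELECT ?t ?y
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t ;
     i:year ?y .
  # Years are stored as plain string literals in the sample data,
  # so they are compared with a string rather than with the integer 2005
  FILTER (
    (?y >= "2005") &&
    EXISTS { ?m i:actor a:trojan . }
  )
}

Result:
  ?t         ?y
  Medvídek   2007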

Filter Constraints

Relational expressions
• Comparisons
    =, !=, <, <=, >=, >
    Order of terms of different kinds (as used by ORDER BY): unbound variable < blank node < IRI < literal
• Set membership tests
    IN and NOT IN
    Syntax: expression [NOT] IN ( expression , … )

Numeric expressions
• Unary / binary arithmetic operators +, -, *, /

Filter Constraints

Primary expressions
• Literals – numeric, boolean, RDF literals
• Variables
• Built-in calls
• Parentheses

Boolean expressions
• Logical connectives
    Conjunction &&, disjunction ||, negation !
• 3-value logic because of unbound variables (NULL values)
    true, false, error
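
The three-valued connectives can be modelled in Python by representing the error value as `None`. The truth tables below follow the SPARQL 1.1 rules for `&&`, `||` and `!`; the function names and the `ERROR` constant are assumptions of this sketch.

```python
ERROR = None  # stands for an evaluation error, e.g. caused by an unbound variable


def and3(a, b):
    """Logical AND: false wins over error, error wins over true."""
    if a is False or b is False:
        return False
    if a is ERROR or b is ERROR:
        return ERROR
    return True


def or3(a, b):
    """Logical OR: true wins over error, error wins over false."""
    if a is True or b is True:
        return True
    if a is ERROR or b is ERROR:
        return ERROR
    return False


def not3(a):
    """Negation: an error stays an error."""
    return ERROR if a is ERROR else (not a)


def filter_keeps(value):
    """A FILTER preserves a solution only when the condition is true."""
    return value is True
```

So a solution for which the condition evaluates to an error is dropped by FILTER, just like one for which it evaluates to false.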

Filter Constraints

Built-in calls
• Term accessors
    STR – lexical form of an IRI or literal
    LANG – language tag of a literal
    DATATYPE – data type of a literal
• Variable tests
    BOUND – true when a variable is bound to a value
    isIRI, isBLANK, isLITERAL

Filter Constraints

Built-in calls
• Existence tests
    EXISTS
    – True when a provided group graph pattern is evaluated to at least one solution
    NOT EXISTS

Syntax:
  [NOT] EXISTS group graph pattern

Select Clause

SELECT clause
• Enumerates variables to be included in the query result

Syntax:
  SELECT [DISTINCT | REDUCED] ( ( variable | ( expression AS variable ) ) … | * )

• Asterisk * selects all the variables

Solution modifiers
• DISTINCT – duplicate solutions are removed
• REDUCED – some duplicate solutions may be removed (implementation-dependent behavior)

Solution Modifiers

Solution modifiers – modify the entire solution sequence
• Aggregation
    GROUP BY and HAVING
• Ordering
    ORDER BY
    LIMIT and OFFSET

Syntax (in this order, each part optional):
  GROUP BY clause
  HAVING clause
  ORDER BY clause
  LIMIT clause and OFFSET clause (in either order)

Solution Modifiers

ORDER BY clause
• Defines the order of solutions within the query result

Syntax:
  ORDER BY ( variable | built-in call | ( expression ) | ASC ( expression ) | DESC ( expression ) ) …

• ASC(…) = ascending (default)
• DESC(…) = descending

Solution Modifiers

LIMIT clause
• Limits the number of solutions in the query result

Syntax:
  LIMIT integer

OFFSET clause
• Skips a certain number of solutions in the query result

Syntax:
  OFFSET integer
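
The interplay of ordering, OFFSET and LIMIT can be mimicked with Python sorting and slicing. The sketch assumes solutions are dictionaries with string values, hard-codes the ordering DESC(?y) then ASC(?t) for brevity, and uses an invented function name `order_offset_limit`.

```python
def order_offset_limit(solutions, offset=0, limit=None):
    """Order by ?y descending, then ?t ascending; then skip and truncate."""
    # Python's sort is stable, so sort by the secondary key first ...
    ordered = sorted(solutions, key=lambda s: s["?t"])
    # ... and by the primary key afterwards.
    ordered = sorted(ordered, key=lambda s: s["?y"], reverse=True)
    end = None if limit is None else offset + limit
    return ordered[offset:end]  # OFFSET is applied before LIMIT


rows = [{"?t": "Vratné lahve", "?y": "2006"},
        {"?t": "Samotáři", "?y": "2000"},
        {"?t": "Medvídek", "?y": "2007"}]

page = order_offset_limit(rows, offset=1, limit=5)
```

With `offset=1` the newest movie is skipped, and `limit=5` has no further effect because only two solutions remain.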

Solution Modifiers: Example

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t ?y
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t ;
     i:year ?y .
}
ORDER BY DESC(?y) ASC(?t)
OFFSET 1
LIMIT 5

Result:
  ?t             ?y
  Vratné lahve   2006
  Samotáři       2000

Aggregation

GROUP BY + HAVING clauses
• Standard aggregation over a solution sequence

Syntax:
  GROUP BY ( variable | built-in call | ( expression [AS variable] ) ) …
  HAVING ( built-in call | ( expression ) ) …
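
Grouping followed by a HAVING condition can be imitated with a dictionary of groups. The sketch below counts actors per movie and keeps only small groups; the input rows are written by hand from the sample movie graph, and the function name `count_actors` and its `max_actors` parameter are assumptions of the example.

```python
from collections import defaultdict

# One solution per (movie, title, actor) combination, as a basic graph
# pattern over the sample movie data would produce.
rows = [
    {"?m": "m:vratnelahve", "?t": "Vratné lahve", "?a": "a:sverak"},
    {"?m": "m:vratnelahve", "?t": "Vratné lahve", "?a": "a:machacek"},
    {"?m": "m:samotari", "?t": "Samotáři", "?a": "a:schneiderova"},
    {"?m": "m:samotari", "?t": "Samotáři", "?a": "a:trojan"},
    {"?m": "m:samotari", "?t": "Samotáři", "?a": "a:machacek"},
    {"?m": "m:medvidek", "?t": "Medvídek", "?a": "a:machacek"},
    {"?m": "m:medvidek", "?t": "Medvídek", "?a": "a:trojan"},
]


def count_actors(solutions, max_actors):
    groups = defaultdict(list)
    for s in solutions:                      # GROUP BY ?m ?t
        groups[(s["?m"], s["?t"])].append(s["?a"])
    out = [{"?t": t, "?c": len(actors)}      # COUNT(?a) AS ?c
           for (_, t), actors in groups.items()
           if len(actors) <= max_actors]     # HAVING (COUNT(?a) <= max_actors)
    return sorted(out, key=lambda s: (s["?c"], s["?t"]))  # ORDER BY ?c ?t


result = count_actors(rows, max_actors=2)
```

Samotáři has three actors and is therefore filtered out by the HAVING-like condition.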

Aggregation: Example

Numbers of actors in movies with at most 2 actors

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
SELECT ?t (COUNT(?a) AS ?c)
FROM <http://db.cz/movies>
WHERE
{
  ?m rdf:type i:Movie ;
     i:title ?t ;
     i:actor ?a .
}
GROUP BY ?m ?t
HAVING (COUNT(?a) <= 2)
ORDER BY ?c ?t

Result:
  ?t             ?c
  Medvídek       2
  Vratné lahve   2

Aggregation

Aggregate functions

Syntax:
  COUNT ( [DISTINCT] ( expression | * ) )
  ( SUM | MIN | MAX | AVG ) ( [DISTINCT] expression )
  GROUP_CONCAT ( [DISTINCT] expression [ ; SEPARATOR = string ] )

Query Forms

Query forms
• SELECT
    Finds solutions matching a provided graph pattern
• ASK
    Checks whether at least one solution exists
• DESCRIBE
    Retrieves a graph with data about selected resources
• CONSTRUCT
    Creates a new graph according to a provided pattern

Query Forms
SELECT

SELECT query form

Finds solutions matching a provided graph pattern

Syntax (in this order):
  prologue declarations
  SELECT clause
  FROM clause
  WHERE clause
  solution modifiers

Result
• Solution sequence = ordered multiset of solutions

Query Forms
CONSTRUCT

CONSTRUCT query form

Creates a new graph according to a provided pattern

Syntax (in this order):
  prologue declarations
  CONSTRUCT { triples construction }
  FROM clause
  WHERE clause
  solution modifiers

Result
• RDF graph built by instantiating the triples construction template once for each solution of the group graph pattern
    Triples with unbound variables and invalid triples are not included
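
Template instantiation can be sketched in Python as a loop over solutions and template triples. The model is deliberately minimal: terms are plain strings, only the "unbound variable" case is checked (not other kinds of invalid triples), and the names `construct`, `template` and `solutions` are illustrative.

```python
def construct(template, solutions):
    """Instantiate every template triple for every solution.

    Triples that would contain an unbound variable are skipped.
    The result is a set, so duplicate triples collapse into one.
    """
    graph = set()
    for s in solutions:
        for triple in template:
            terms = tuple(s.get(t) if t.startswith("?") else t for t in triple)
            if None not in terms:  # skip triples with unbound variables
                graph.add(terms)
    return graph


template = [("?a", "i:name", "?n")]
solutions = [
    {"?a": "a:trojan", "?n": "Ivan Trojan"},
    {"?a": "a:sverak"},  # ?n is unbound here, so no triple is produced
]

graph = construct(template, solutions)
```

Only the first solution contributes a triple to `graph`, because the second one leaves `?n` unbound.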

Query Forms: Example
CONSTRUCT

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX i: <http://db.cz/terms#>
CONSTRUCT
{
  ?a i:name ?n .
}
FROM <http://db.cz/actors>
WHERE
{
  ?a rdf:type i:Actor ;
     i:firstname ?f ;
     i:lastname ?l .
  # Expressions are not allowed inside a CONSTRUCT template,
  # so the full name is computed here and bound to ?n
  BIND (CONCAT(?f, " ", ?l) AS ?n)
}

Result:
  <http://db.cz/actors/trojan> i:name "Ivan Trojan" .
  <http://db.cz/actors/machacek> i:name "Jiří Macháček" .
  <http://db.cz/actors/schneiderova> i:name "Jitka Schneiderová" .
  <http://db.cz/actors/sverak> i:name "Zdeněk Svěrák" .

Lecture Conclusion

SPARQL
• Query forms
    SELECT, ASK, DESCRIBE, CONSTRUCT
• Graph patterns
    Basic, group, optional, alternative, minus
    Variable assignments
    Filters
• Solution modifiers
    DISTINCT, REDUCED
    GROUP BY, HAVING
    ORDER BY, LIMIT, OFFSET
