gStore: Answering SPARQL Queries Via Subgraph Matching
Presented by Guan Wang
Kent State University
October 24, 2011
Outline
RDF & SPARQL
Previous Solutions for SPARQL Queries
Overview of gStore
Encoding Technique
VS*-tree & Query Algorithm
Experiments
Conclusions
What is RDF
A general-purpose framework that provides structured, machine-understandable metadata for the Web.
It is based upon the idea of making statements about resources in the form of subject-predicate-object expressions. These expressions are known as triples in RDF.
Statement: Subject --Predicate--> Object
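
As a minimal sketch of the triple model described above, a statement can be held as a (subject, predicate, object) record. The `Triple` type, the `objects_of` helper, and the sample statements are hypothetical names and data chosen for illustration only.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical sample statements, for illustration only.
statements = [
    Triple("page.html", "Creator", "Guan"),
    Triple("page.html", "Title", "Guan's Home Page"),
]


def objects_of(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


creators = objects_of(statements, "page.html", "Creator")
```

Here `creators` holds the objects of every statement whose subject is `page.html` and whose predicate is `Creator`.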
RDF Model Example
Graph: page.html --Creator--> Guan; page.html --Title--> Guan’s Home Page

| Subject | Predicate | Object |
| page.html | Creator | Guan |
| page.html | Title | Guan’s Home Page |
What is SPARQL
SPARQL is a query language for RDF. It provides a standard format for writing queries that target RDF data and a set of standard rules for processing those queries and returning the results.
The building blocks of a SPARQL query are graph patterns that include variables. The result of the query is the set of values these variables must take to match the RDF graph.
Example of SPARQL
SELECT ?name WHERE {
  ?m <hasName> ?name .
  ?m <BornOnDate> "1809-02-12" .
  ?m <DiedOnDate> "1865-04-15" .
}
Names beginning with a ? or a $ are variables.
Graph patterns are given as a list of triple patterns enclosed within braces {}.
The variables named after the SELECT keyword are the variables that will be returned as results (similar to SQL).
Each conjunction, denoted by a dot, corresponds to a join.
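
The join semantics above can be sketched with a tiny pattern matcher: each triple pattern extends the variable bindings found so far, and a shared variable such as ?m acts as the join key. This is an illustrative sketch, not a SPARQL engine; `match_pattern`, `evaluate`, the vertex names `m1`/`m2`, and the sample triples are assumptions.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def evaluate(patterns, triples):
    """Evaluate a conjunction of triple patterns; each pattern acts as a join."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Hypothetical data; m1 and m2 are made-up vertex identifiers.
triples = [
    ("m1", "hasName", "Abraham Lincoln"),
    ("m1", "BornOnDate", "1809-02-12"),
    ("m1", "DiedOnDate", "1865-04-15"),
    ("m2", "hasName", "Charles Darwin"),
    ("m2", "BornOnDate", "1809-02-12"),
    ("m2", "DiedOnDate", "1882-04-19"),
]
query = [
    ("?m", "hasName", "?name"),
    ("?m", "BornOnDate", "1809-02-12"),
    ("?m", "DiedOnDate", "1865-04-15"),
]
names = [b["?name"] for b in evaluate(query, triples)]
```

Both people share the birth date, so only the third pattern separates them.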
RDF Graph
SPARQL Queries
Query Graph
SPARQL Query:
SELECT ?name WHERE {
  ?m <hasName> ?name .
  ?m <BornOnDate> "1809-02-12" .
  ?m <DiedOnDate> "1865-04-15" .
}
Subgraph Match vs. SPARQL Queries
Existing Solutions: Three-Column Table
SPARQL Query:
SELECT ?name WHERE {
  ?m <hasName> ?name .
  ?m <BornOnDate> "1809-02-12" .
  ?m <DiedOnDate> "1865-04-15" .
}
Shortcoming:
Too many self-joins
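
To make the self-join cost concrete, here is a hedged sketch using Python's built-in `sqlite3` module. The table name `triples`, its column names, and the sample rows are assumptions. Each triple pattern in the query becomes one more alias of the same table.

```python
import sqlite3

# Hypothetical sample rows; m1 and m2 are made-up vertex identifiers.
rows = [
    ("m1", "hasName", "Abraham Lincoln"),
    ("m1", "BornOnDate", "1809-02-12"),
    ("m1", "DiedOnDate", "1865-04-15"),
    ("m2", "hasName", "Charles Darwin"),
    ("m2", "BornOnDate", "1809-02-12"),
    ("m2", "DiedOnDate", "1882-04-19"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)

# Three triple patterns -> three aliases of the same table -> two self-joins.
sql = """
SELECT t1.object
FROM triples AS t1
JOIN triples AS t2 ON t2.subject = t1.subject
JOIN triples AS t3 ON t3.subject = t1.subject
WHERE t1.predicate = 'hasName'
  AND t2.predicate = 'BornOnDate' AND t2.object = '1809-02-12'
  AND t3.predicate = 'DiedOnDate' AND t3.object = '1865-04-15'
"""
names = [row[0] for row in conn.execute(sql)]
conn.close()
```

A query with k triple patterns needs k - 1 self-joins over the full triple table in this layout.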
Existing Solutions: Property Table
Shortcoming:
A big waste of space
Existing Solutions: Vertically Partitioned
Shortcoming:
Too many merge joins
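
A rough sketch of the vertically partitioned layout: one two-column (subject, object) table per predicate, each sorted by subject so that tables can be merge-joined. The function names and sample data are assumptions, and the in-memory lists only stand in for the column-store tables a real system would use.

```python
from collections import defaultdict


def partition_by_predicate(triples):
    """Build one subject-sorted (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    for table in tables.values():
        table.sort()
    return tables


def merge_join(left, right):
    """Merge-join two subject-sorted tables; yields (subject, left_obj, right_obj)."""
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            subject = left[i][0]
            i_end, j_end = i, j
            # Find the run of rows with this subject on each side.
            while i_end < len(left) and left[i_end][0] == subject:
                i_end += 1
            while j_end < len(right) and right[j_end][0] == subject:
                j_end += 1
            for _, left_obj in left[i:i_end]:
                for _, right_obj in right[j:j_end]:
                    result.append((subject, left_obj, right_obj))
            i, j = i_end, j_end
    return result


# Hypothetical data; m1 and m2 are made-up vertex identifiers.
triples = [
    ("m1", "hasName", "Abraham Lincoln"),
    ("m1", "BornOnDate", "1809-02-12"),
    ("m1", "DiedOnDate", "1865-04-15"),
    ("m2", "hasName", "Charles Darwin"),
    ("m2", "BornOnDate", "1809-02-12"),
    ("m2", "DiedOnDate", "1882-04-19"),
]
tables = partition_by_predicate(triples)
born = [(s, o) for s, o in tables["BornOnDate"] if o == "1809-02-12"]
died = [(s, o) for s, o in tables["DiedOnDate"] if o == "1865-04-15"]
step1 = merge_join(tables["hasName"], born)            # first merge join
step2 = merge_join([(s, n) for s, n, _ in step1], died)  # second merge join
names = [name for _, name, _ in step2]
```

Even this three-pattern query needs two merge joins, one per additional predicate table.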
Existing Solutions: RDF-3X
Exploits the fact that an RDF triple has only three elements (subject, predicate, and object): constructs all six possible permutation indexes and optimizes the merge order.
Shortcoming: difficult to handle updates
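
The six indexes correspond to the six orderings of (subject, predicate, object). The sketch below enumerates them with `itertools.permutations`. Plain sorted lists stand in for the index structures of a real system, and `build_indexes` and the sample triples are hypothetical.

```python
from itertools import permutations


def build_indexes(triples):
    """Return one sorted copy of the triples per ordering of s, p, o."""
    indexes = {}
    for order in permutations("spo"):
        positions = ["spo".index(c) for c in order]
        name = "".join(order)  # e.g. "pos" = predicate, object, subject
        indexes[name] = sorted(tuple(t[i] for i in positions) for t in triples)
    return indexes


# Hypothetical data; m1 is a made-up vertex identifier.
triples = [
    ("m1", "hasName", "Abraham Lincoln"),
    ("m1", "BornOnDate", "1809-02-12"),
    ("m1", "DiedOnDate", "1865-04-15"),
]
indexes = build_indexes(triples)
index_names = sorted(indexes)
```

Every insertion or deletion has to be applied to all six orderings, which gives some intuition for why updates are costly in this design.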
Overview of gStore (Store)
Represent an RDF dataset as an RDF graph G and store it as an adjacency list table.
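
A minimal sketch of the adjacency list idea: each subject vertex maps to its outgoing (edge label, neighbor) pairs. `build_adjacency_list` and the sample triples are assumptions; the slide does not show the actual table layout.

```python
from collections import defaultdict


def build_adjacency_list(triples):
    """Map each subject vertex to its outgoing (predicate, object) pairs."""
    adjacency = defaultdict(list)
    for s, p, o in triples:
        adjacency[s].append((p, o))
    return dict(adjacency)


# Hypothetical data; m1 is a made-up vertex identifier.
triples = [
    ("m1", "hasName", "Abraham Lincoln"),
    ("m1", "BornOnDate", "1809-02-12"),
    ("m1", "DiedOnDate", "1865-04-15"),
]
adjacency = build_adjacency_list(triples)
```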
Overview of gStore (Encoding)
Encode each entity and class vertex into a bitstring, called a signature.
Link these vertex signatures to form a data signature graph G* according to the RDF graph’s structure.
Overview of gStore (VS*-tree)
Encoding Technique
Encoding Technique
VS*-tree
Each leaf node of the tree corresponds to one vertex signature in G*.
Given two leaf nodes d1 and d2 in the tree, we introduce an edge between them if and only if there is an edge between d1 and d2 in G*.
Given nodes d1 and d2 in the tree, we introduce a super edge from d1 to d2 if and only if there is at least one edge from d1’s children to d2’s children.
Assign an edge label to the super edge d1 → d2 by performing bitwise OR over the labels of all edges from d1’s children to d2’s children.
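
The super-edge rule can be sketched for one level of the tree. The `children` mapping, the `leaf_edges` dictionary, and the small bitstring labels are hypothetical inputs. The function ORs the labels of all child-level edges that run between the same pair of parents.

```python
def super_edge_labels(children, child_edges):
    """Compute super-edge labels for one tree level.

    children:    parent node -> list of its child nodes
    child_edges: (child_from, child_to) -> edge label bitstring (int)
    Returns (parent_from, parent_to) -> bitwise OR of the child edge labels.
    """
    parent_of = {c: p for p, cs in children.items() for c in cs}
    labels = {}
    for (src, dst), label in child_edges.items():
        key = (parent_of[src], parent_of[dst])
        labels[key] = labels.get(key, 0) | label
    return labels


# Hypothetical tree level: d1 has children a and b, d2 has child c.
children = {"d1": ["a", "b"], "d2": ["c"]}
leaf_edges = {
    ("a", "c"): 0b0011,
    ("b", "c"): 0b0100,
}
labels = super_edge_labels(children, leaf_edges)
```

Because the super-edge label is the OR of its children's labels, a query edge label not covered by it cannot be covered by any edge between the two subtrees, which is what makes pruning at upper levels safe.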
VS*-tree
Query Algorithm
Experiments
Datasets: Yago and DBLP, two popular semantic datasets with millions of triples.
Data size: approximately 4 GB.
Experiments (Exact Queries)
Experiments (Wildcard Queries)
Conclusions
Proposes storing and querying RDF data from a graph database perspective.
Uses the VS*-tree as an index over the vertex bitstrings, which supports SPARQL queries in a scalable manner.
Limitation: signature-based filtering can produce false positives.
References
[ICDE09] Thanh Tran, Haofen Wang, Sebastian Rudolph, Philipp Cimiano, "Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data", DOI 10.1109/ICDE.2009.119.
[VLDB07] Daniel J. Abadi, Adam Marcus, Samuel R. Madden, Kate Hollenbach, "Scalable Semantic Web Data Management Using Vertical Partitioning", VLDB '07, September 23-28, 2007, Vienna, Austria.
[PVLDB08] Cathrin Weiss, Panagiotis Karras, Abraham Bernstein, "Hexastore: Sextuple Indexing for Semantic Web Data Management", PVLDB '08, August 23-28, 2008, Auckland, New Zealand.
[PVLDB08] Thomas Neumann, Gerhard Weikum, "RDF-3X: a RISC-style Engine for RDF", PVLDB '08, August 23-28, 2008, Auckland, New Zealand.
[VLDB11] Lei Zou, Jinghui Mo, Lei Chen, M. Tamer Özsu, Dongyan Zhao, "gStore: Answering SPARQL Queries via Subgraph Matching", VLDB '11, August 29 - September 3, 2011, Seattle, Washington.
Thank you!