Review: Scalable Semantic Web Data Management Using Vertical Partitioning
Part of the Semantic Web, Ontologies and the Cloud class at The University of Texas at Austin's Computer Science department during the Spring 2010 term.
Abadi, Marcus, Madden, Hollenbach
VLDB 2007
Presented by: {Gui}llermo Cabrera
The University of Texas at Austin
Problem
Storage Goal
RDBMS use
RDF Physical Organization
Column store vs. Row Store
Materialized Path Expressions
Experiment & Results
Discussion
Performance: Self-joins
Many triples
Achieve scalability & performance in triple storage
Survey approaches in RDBMS
Benefits of vertical partition and column store
1 table with 3 indexed columns?
Multi-layer architecture
◦ Translate -> Optimize -> Execute
Mapping tables for long URI and literals
Jena, Oracle, Sesame, 3store (Hyunjun), Hexastore (Donghyuk)
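To see why one table with three indexed columns leads to self-joins, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `triples`, the column names, the sample rows, and both helper functions are illustrative assumptions, not taken from the paper or from any of the systems listed above.

```python
import sqlite3

def build_triple_store(rows):
    """Create an in-memory triples table with one index per column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subj TEXT, prop TEXT, obj TEXT)")
    for col in ("subj", "prop", "obj"):
        conn.execute(f"CREATE INDEX idx_{col} ON triples ({col})")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", rows)
    return conn

def titles_and_authors(conn):
    """Every extra property in the query adds one more self-join on triples."""
    sql = """
        SELECT t.subj, t.obj, a.obj
        FROM triples AS t
        JOIN triples AS a ON a.subj = t.subj
        WHERE t.prop = 'title' AND a.prop = 'author'
        ORDER BY t.subj
    """
    return conn.execute(sql).fetchall()

# Hypothetical sample data.
SAMPLE = [
    ("book1", "title", "XYZ"),
    ("book1", "author", "Fox, Joe"),
    ("book2", "title", "ABC"),
    ("book2", "author", "Orr, Tim"),
    ("book3", "title", "MNO"),  # no author triple, so it drops out of the join
]
```

Calling `titles_and_authors(build_triple_store(SAMPLE))` returns one row per book that has both properties. A query touching k properties would need k - 1 self-joins over the full table.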
Property tables
◦ Clustered property table
Denormalize RDF (wider tables)
Clustering algorithm
NULL values
Property tables
◦ Property-Class Tables
Exploit the type property
Properties may exist in multiple tables
Advantage:
◦ Fewer joins
Disadvantage:
◦ NULL values
◦ Multi-valued attributes are complicated
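Both disadvantages can be shown with a small sketch that pivots triples into one wide row per subject. The function names, the chosen columns, the sample triples, and the `overflow` list (standing in for a fallback triples table) are assumptions made for illustration only.

```python
def to_property_table(triples, columns):
    """Pivot (subject, property, object) triples into one wide row per subject.

    Returns (table, overflow). Cells with no matching triple stay None (NULL).
    Triples that do not fit a single-valued cell are collected in overflow.
    """
    table, overflow = {}, []
    for subj, prop, obj in triples:
        if prop not in columns:
            overflow.append((subj, prop, obj))  # property has no column
            continue
        row = table.setdefault(subj, dict.fromkeys(columns))
        if row[prop] is None:
            row[prop] = obj
        else:
            overflow.append((subj, prop, obj))  # second value has no cell
    return table, overflow

def null_fraction(table):
    """Share of cells in the wide table that are NULL."""
    cells = [value for row in table.values() for value in row.values()]
    return sum(value is None for value in cells) / len(cells) if cells else 0.0

# Hypothetical sample data: book1 has two authors.
TRIPLES = [
    ("book1", "title", "XYZ"),
    ("book1", "author", "Fox"),
    ("book1", "author", "Orr"),
    ("book2", "title", "ABC"),
    ("book3", "type", "CD"),
]
```

With the columns `("title", "author", "type")`, five of the nine cells end up NULL, and the second author of `book1` has nowhere to go in the wide row.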
Vertical Partition
◦ n two-column tables, n = # of unique properties
◦ Each table sorted by subject
Merge join
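The layout above can be sketched in a few lines: one (subject, object) table per property, each sorted by subject, joined with a linear merge. This is a minimal in-memory illustration; the function names and sample triples are assumptions, and a real system would do this on disk-resident tables.

```python
from collections import defaultdict

def vertically_partition(triples):
    """Split triples into one (subject, object) table per property, sorted by subject."""
    tables = defaultdict(list)
    for subj, prop, obj in triples:
        tables[prop].append((subj, obj))
    return {prop: sorted(rows) for prop, rows in tables.items()}

def merge_join(left, right):
    """Subject-subject merge join of two subject-sorted two-column tables."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls < rs:
            i += 1
        elif ls > rs:
            j += 1
        else:
            # Find the run of rows sharing this subject on each side;
            # runs longer than one row are multi-valued attributes.
            i_end = i
            while i_end < len(left) and left[i_end][0] == ls:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == ls:
                j_end += 1
            for _, left_obj in left[i:i_end]:
                for _, right_obj in right[j:j_end]:
                    out.append((ls, left_obj, right_obj))
            i, j = i_end, j_end
    return out
```

Because both inputs are already ordered by subject, each table is scanned once and no hashing or sorting is needed at query time.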
• Advantages
Multi-valued attributes are supported
No clustering algorithm needed (unlike property tables)
Only accessed properties are read
• Disadvantages
Queries over multiple properties require table joins
Inserts are expensive
Triple Store
Property Table
Vertical Partition (Row Store)
Vertical Partition Store (Column Store)
Why?
Projection is free
Tuple headers (per-row metadata) stored separately
◦ Effective tuple width: about 35 bytes in Postgres vs. about 8 bytes in C-Store
Column-oriented compression
◦ Run-length encoding (e.g., 1,1,1,2,2 → 1x3, 2x2)
Optimized merge join
◦ Prefetching
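The run-length encoding example from this slide can be reproduced with a short sketch. The function names are assumptions, and this is not C-Store's actual storage format, only the idea behind it.

```python
from itertools import groupby

def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]

def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]
```

For the column `[1, 1, 1, 2, 2]`, `rle_encode` yields `[(1, 3), (2, 2)]`. The scheme pays off on a sorted subject column, where repeated subjects sit next to each other.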
<BookID1, Author, http://preamble/FoxJoe>
<http://preamble/FoxJoe, wasBorn, “1860”>
Find all books whose authors were born in 1860
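A rough sketch of this query over vertically partitioned tables, first as a subject-object join and then against a precomputed path table. The function names, the path-table idea as coded here, and the second book and author are assumptions added for illustration; only the `BookID1` and `FoxJoe` triples come from the slide.

```python
def books_by_author_birth_year(author, was_born, year):
    """Subject-object join: the object of Author must equal the subject of wasBorn."""
    born_in_year = {person for person, born in was_born if born == year}
    return sorted(book for book, person in author if person in born_in_year)

def materialize_path(author, was_born):
    """Precompute a two-column (book, birth year) table for the path Author -> wasBorn."""
    birth_years = {}
    for person, born in was_born:
        birth_years.setdefault(person, []).append(born)
    return sorted(
        (book, born)
        for book, person in author
        for born in birth_years.get(person, [])
    )

def books_by_materialized_path(path_table, year):
    """With the path materialized, the query is a selection on a table keyed by book."""
    return sorted(book for book, born in path_table if born == year)

# Two-column tables; the second row of each is hypothetical.
AUTHOR = [
    ("BookID1", "http://preamble/FoxJoe"),
    ("BookID2", "http://preamble/OrrTim"),
]
WAS_BORN = [
    ("http://preamble/FoxJoe", "1860"),
    ("http://preamble/OrrTim", "1902"),
]
```

Both routes return `["BookID1"]` for the year `"1860"`. The materialized table is keyed by book, so joining it with other book properties is a subject-subject join rather than a subject-object join.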
Barton Libraries Dataset
Longwell Queries
◦ Calculating counts
◦ Filtering
◦ Inference
8.3 GB – Triple Store (Postgres)
14 GB – Property Table (Postgres)
5.2 GB – Vertically Partitioned (Postgres)
2.7 GB – Vertically Partitioned (C-store)
Including indices and mapping table
Replace subject-object joins with subject-subject joins
Add 60 integer valued columns
7 GB increase in size
Great for reads; writes are not considered
What about load times?
Use another benchmark (e.g., LUBM)?
Native XML databases for RDF/XML?
Test the triple store in Sesame