![Page 2: AgensGraph: a Multi-model Graph Database · PDF fileBig vendors Oracle – 12c spatial and network SAP – HANA graph, support Cypher, columnar storage ... •Provide technical services](https://reader034.vdocuments.site/reader034/viewer/2022052213/5a751f3f7f8b9a4b538c3acd/html5/thumbnails/2.jpg)
What Is a Graph Database?
• A change in data representation
  • Gartner says “it represents a radical change in how data is organized and processed”
• Relationships are first-class citizens in a graph database
  • In a relational database, relationships are handled implicitly
  • In a graph database, you can make your data more connected
Concept        Relational Database    Graph Database
Entity         Row                    Node (Vertex)
Relationship   Row                    Relationship (Edge)
Benefits of Graph Database
• Intuitive data modeling
  • ER diagram-like data model
• Concise queries
  • No need to specify joins and their conditions
  • Ex) Cypher (by Neo Technology), SPARQL (by W3C)
• Performance for graph pattern matching
  • Optimized for processing graph traversals
• Graph analysis
  • Provides built-in graph analysis functions
  • Ex) PageRank, shortest path, graph clustering
Intuitive Data Modeling
Cypher Query Language
• Cypher is like SQL for graph databases
  • Declarative query language for the property graph model
• Developed by Neo Technology, Inc. since 2011
• Inspired by SQL and SPARQL (the standard query language for RDF)
  • Designed to be a human-readable query language
• OpenCypher.org (http://opencypher.org)
  • Participate in developing the query language
Cypher Example
• Uses graph pattern matching and ASCII-art diagrams
Query: Find all ancestor-descendant pairs in the graph.
SQL
  with recursive descendants as (
    select parent, child as descendant, 1 as level from source
    union all
    select d.parent, s.child, d.level + 1
    from descendants as d
    join source s on d.descendant = s.parent
  )
  select * from descendants order by parent, level, descendant;
Cypher
  MATCH p=(descendant)-[:Parent*]->(ancestor)
  RETURN (ancestor), (descendant), length(p)
  ORDER BY (ancestor), (descendant), length(p)
(Figure: descendant and ancestor nodes)
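The recursive-CTE version of the ancestor-descendant query can be tried without a PostgreSQL server. The following is a minimal sketch that runs an equivalent query on SQLite through Python's standard `sqlite3` module. SQLite is only a stand-in here, and the sample rows in the `source` table are hypothetical.

```python
import sqlite3

# Hypothetical sample data: each row is one (parent, child) edge.
EDGES = [("alice", "bob"), ("bob", "carol")]

# Same shape as the slide's SQL, with the CTE named "descendants".
QUERY = """
WITH RECURSIVE descendants(parent, descendant, level) AS (
    SELECT parent, child, 1 FROM source
    UNION ALL
    SELECT d.parent, s.child, d.level + 1
    FROM descendants AS d
    JOIN source AS s ON d.descendant = s.parent
)
SELECT parent, descendant, level
FROM descendants
ORDER BY parent, level, descendant
"""


def ancestor_descendant_pairs(edges):
    """Load the edges into an in-memory table and run the recursive query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE source (parent TEXT, child TEXT)")
        conn.executemany("INSERT INTO source VALUES (?, ?)", edges)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


pairs = ancestor_descendant_pairs(EDGES)
for row in pairs:
    print(row)
```

Each output row is (ancestor, descendant, path length), which corresponds to what the Cypher query returns with `length(p)`.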
Graph Databases
• There exist many graph databases
Graph DB vendors
  Neo4j – Single node, OLTP, Cypher
  DataStax Enterprise Graph – Cassandra, Gremlin, OLTP & OLAP
  OrientDB – Cluster, SQL-like language, document storage
Big vendors
  Oracle – 12c spatial and network
  SAP – HANA graph, supports Cypher, columnar storage
  IBM – provides a cloud service based on Titan, System G
  Microsoft – Graph engine
  Teradata Aster Database – provides graph analytics
RDF DB
  Virtuoso, AllegroGraph, GraphDB (Ontotext)
Graph analysis
  Giraph – Apache project
  GraphX – Spark module
  GraphLab – acquired by Apple, now turi.com
NoSQL
  MongoDB – provides simple graph lookup since 3.4 (Dec 2016)
  Elasticsearch – provides graph visualization and modeling
Etc.
  Objectivity’s ThingSpan
  ArangoDB
  JanusGraph – Forked from Titan, supported by the Linux Foundation
  Grakn.AI – Using Titan and Spark
What We Want to Implement
• Property graph model
• OpenCypher query language
• ACID transaction
• OLTP workload and graph analytics framework
• We chose to implement it based on PostgreSQL because it already has
  • Robust storage engine
  • Transaction layer using MVCC
  • Cost-based query optimizer
AgensGraph
• Newest release: v1.1 (based on PostgreSQL v9.6.2)
  • Homepage: http://www.agensgraph.com
  • Download: http://bitnine.net/downloads/
  • GitHub: https://github.com/bitnine-oss/agensgraph
• A forked project of PostgreSQL (Apache license)
• Features
  • Multi-model: property graph data model, relational data model and JSON documents
  • Cypher query language support
  • Integrated querying using SQL and Cypher
  • Multiple graphs and hierarchical graph label organization
  • Property indexes on both vertexes and edges
  • Constraints: unique, mandatory and check constraints
AgensGraph Data Model
• Extended property graph model with JSON documents
• Supports multiple graphs in a database
• Label hierarchy
  • Vertexes and edges can be grouped into labels (e.g. person, student, teacher, …)
  • Labels are organized as a hierarchy
• Property indexes using B-tree, GIN, BRIN, … for both vertexes and edges
(Figure: two vertexes connected by an edge)
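To make the label hierarchy concrete, here is a small sketch in plain Python. It is not AgensGraph code. It assumes that matching a parent label also returns vertexes of its child labels, in the way PostgreSQL table inheritance behaves. The labels, vertexes, and helper names are all hypothetical.

```python
# Hypothetical label hierarchy: child label -> parent label (None for a root).
LABEL_PARENT = {"person": None, "student": "person", "teacher": "person"}

# Hypothetical vertexes: (label, JSON-like property document).
VERTEXES = [
    ("person", {"name": "Ann"}),
    ("student", {"name": "Bob", "year": 2}),
    ("teacher", {"name": "Cho"}),
]


def is_a(label, ancestor):
    """Return True if `label` equals `ancestor` or inherits from it."""
    while label is not None:
        if label == ancestor:
            return True
        label = LABEL_PARENT[label]
    return False


def match_label(label):
    """Return the properties of every vertex whose label is `label` or a descendant of it."""
    return [props for vertex_label, props in VERTEXES if is_a(vertex_label, label)]


person_names = [props["name"] for props in match_label("person")]
student_names = [props["name"] for props in match_label("student")]
print(person_names, student_names)
```

Under that assumption, a match on `person` sees all three vertexes, while a match on `student` sees only the student.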
Cypher Clauses
• For reading
  • MATCH: find graph patterns
  • OPTIONAL MATCH: allows incomplete matchings
• For updating
  • CREATE: create a vertex or an edge
  • MERGE: like UPSERT
  • SET: modify property values
• For filtering
  • WHERE
• For handling results
  • WITH, RETURN
• And ORDER BY, LIMIT, SKIP
https://s3.amazonaws.com/artifacts.opencypher.org/M05/railroad/Cypher.html
Example
• Create graph objects
• If you want a label hierarchy:
    CREATE VLABEL student INHERITS (person);
Example
• Create vertexes
Example
• Create property indexes
• Create relationships
AgensGraph Architecture
• Developed in the core of the PostgreSQL engine
  • Not a layered architecture (unlike, e.g., Titan)
  • Forked project of PostgreSQL
  • PostgreSQL is very reliable and robust
• Adds graph objects
• Extends the query engine to support Cypher queries and fast graph traversal
• Keeps the transaction and storage layers
(Figure: architecture layers)
  Client interface: JDBC/ODBC/Python/Node.js drivers (SQL & Cypher)
  Integrated Query Processing Engine: graph query optimizer, graph query executor
  Transaction Layer: supports MVCC and ACID transactions
  Cache Layer: supports caching graph data in memory
  Graph Storage: supports label hierarchy; optimized for fast traversal and updates
Graph Storage
• Uses PostgreSQL’s heap tables and B-tree indexes
• Uses composite indexes on edge tables to exploit index-only scans for traversals
• We found that heap tables and B-trees are fast enough to process graph workloads
• But we plan to design a new storage layer for large-scale graph processing
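The effect of a composite index on an edge table can be sketched with SQLite, whose "covering index" plays the same role as a PostgreSQL index-only scan. This is an illustration only: the `edge` table, its column names, and the index name are hypothetical and are not AgensGraph's actual storage schema.

```python
import sqlite3


def plan_details(conn, sql, params):
    """Return the textual detail column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical edge table: start vertex id, end vertex id, JSON properties.
conn.execute("CREATE TABLE edge (start_id INTEGER, end_id INTEGER, properties TEXT)")
# Composite index covering both endpoints of an edge.
conn.execute("CREATE INDEX edge_start_end ON edge (start_id, end_id)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?, ?)",
    [(1, 2, "{}"), (1, 3, "{}"), (2, 3, "{}")],
)

# A pure traversal step needs only the two indexed columns.
traversal_plan = plan_details(
    conn, "SELECT end_id FROM edge WHERE start_id = ?", (1,)
)
# Reading an edge property forces a visit to the table itself.
property_plan = plan_details(
    conn, "SELECT end_id, properties FROM edge WHERE start_id = ?", (1,)
)
conn.close()

print(traversal_plan)
print(property_plan)
```

The first plan is answered from the index alone, while the second has to fetch table rows. This is why traversals that do not touch edge properties benefit most from this index layout.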
Cypher Query Processor
• Cypher queries are processed by the same process as SQL queries
• We integrate Cypher query processing with the SQL query engine, from the parser to the executor
• So you can use any PostgreSQL expressions and functions in Cypher
• The result of a Cypher query is a relation
  • We treat a Cypher query as a subquery
  • Existing query optimizations can be applied to Cypher queries too (e.g. subquery rollup, predicate push-down, join ordering, …)
• You can write a query that combines SQL and Cypher, with Cypher as a subquery
(Figure: Cypher Query → Parser → AST → Analyze → Query Tree → Plan & Optimize → Plan Tree → Execute)
Cypher Implementation Issues
• Cypher query is a chain of Cypher clauses
• Each clause produces its results as a relation
• Chained execution
  • The results from the former clause are provided to the next clause
• Transform a Cypher query to a query tree
  • Each clause is transformed to a query structure
  • A MATCH clause is transformed to a query structure with joins
  • The chained clauses are combined as subqueries
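The chaining described above can be sketched in plain Python. This is a toy model and not AgensGraph's implementation: each clause is a function from a relation (a list of rows) to a relation, and a MATCH is modelled as a join against an edge list. The data and the helper names (`match_act_in`, `where`, `returning`, `run`) are hypothetical.

```python
from functools import reduce

# Hypothetical toy graph (illustrative data only).
ACTORS = {1: "Tom Cruise", 2: "Nicole Kidman"}
MOVIES = {10: "Eyes Wide Shut", 11: "Top Gun"}
ACT_IN = [(1, 10), (1, 11), (2, 10)]  # (actor id, movie id) edges


def match_act_in(actor_var, movie_var):
    """MATCH (actor)-[:ACT_IN]->(movie), expressed as a join with the input rows."""
    def clause(rows):
        out = []
        for row in rows:
            for actor_id, movie_id in ACT_IN:
                # Variables bound by an earlier clause act as join conditions.
                if row.get(actor_var, actor_id) != actor_id:
                    continue
                if row.get(movie_var, movie_id) != movie_id:
                    continue
                out.append({**row, actor_var: actor_id, movie_var: movie_id})
        return out
    return clause


def where(predicate):
    """WHERE: keep only the rows that satisfy the predicate."""
    return lambda rows: [row for row in rows if predicate(row)]


def returning(projection):
    """RETURN: project each row to its output value."""
    return lambda rows: [projection(row) for row in rows]


def run(clauses):
    """Start from one empty row, then feed each clause the previous result."""
    return reduce(lambda rows, clause: clause(rows), clauses, [{}])


titles = run([
    match_act_in("a", "m"),
    where(lambda row: ACTORS[row["a"]] == "Tom Cruise"),
    match_act_in("b", "m"),  # "m" is already bound, so this joins on the movie
    where(lambda row: ACTORS[row["b"]] == "Nicole Kidman"),
    returning(lambda row: MOVIES[row["m"]]),
])
print(titles)
```

Because every clause consumes and produces a relation, the whole chain behaves like nested subqueries, which is what lets an SQL optimizer flatten and reorder the joins.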
Cypher Query Processor
(Figure: subquery rollup.
 Before: Query (the first MATCH) joins Actor table {name = ‘Tom Cruise’}, ACT_IN table, and Movie table; Query (the second MATCH) joins that result with ACT_IN table and Actor {name: ‘Nicole Kidman’}.
 After: a single Query joins Actor table {name = ‘Tom Cruise’}, ACT_IN table, Movie table, ACT_IN table, and Actor {name: ‘Nicole Kidman’}.)
Variable-length Edge (VLE) Query
• Can be implemented using a recursive common table expression (CTE) in SQL
• But we found that a CTE is inefficient for VLE queries
  • A CTE uses BFS (breadth-first search)-style processing
  • BFS processing needs to buffer intermediate results
• We implemented a new execution node for VLE queries
  • DFS (depth-first search)-style processing
• It is much faster than a recursive CTE query
Cypher
  MATCH p=(descendant)-[:Parent*]->(ancestor)
  RETURN (ancestor), (descendant), length(p)
  ORDER BY (ancestor), (descendant), length(p)
(Figure: descendant and ancestor nodes)
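The difference between the two processing styles can be illustrated with a small Python sketch. This is not the AgensGraph execution node. It only contrasts a level-by-level expansion, which keeps a whole frontier of partial paths in memory, with a depth-first generator that keeps just the current path. The graph and function names are hypothetical, and Cypher's rule that a relationship may not repeat within a path is not enforced here.

```python
def vle_bfs(adj, start, min_hops, max_hops):
    """Breadth-first: every level's partial paths are buffered before the next level."""
    results = []
    frontier = [(start,)]
    for hops in range(1, max_hops + 1):
        frontier = [
            path + (nxt,) for path in frontier for nxt in adj.get(path[-1], ())
        ]
        if not frontier:
            break
        if hops >= min_hops:
            results.extend(frontier)
    return results


def vle_dfs(adj, start, min_hops, max_hops):
    """Depth-first: paths are yielded one at a time; only the current path is held."""
    def visit(path):
        hops = len(path) - 1
        if hops >= min_hops:
            yield tuple(path)
        if hops < max_hops:
            for nxt in adj.get(path[-1], ()):
                path.append(nxt)
                yield from visit(path)
                path.pop()
    yield from visit([start])


# Hypothetical (descendant)-[:Parent]->(ancestor) edges as an adjacency map.
PARENT = {"carol": ["bob"], "dave": ["bob"], "bob": ["alice"]}

bfs_paths = vle_bfs(PARENT, "carol", 1, 5)
dfs_paths = list(vle_dfs(PARENT, "carol", 1, 5))
print(bfs_paths)
print(dfs_paths)
```

Both functions find the same paths. The depth-first version can hand each path to the next operator as soon as it is found, without materialising intermediate levels.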
Example Cypher Plan
• match (a)-[*1..5]->(b) return a, b;
Considerations for Graph Query Performance
• Graph pattern matching is usually more efficient using random page reads
  • set random_page_cost = 0.005
  • For fast graph traversal, it is more efficient to cache the data in memory or use SSDs
• Index-only scans are important for graph traversals
  • They are possible only when the query does not access edge properties
• Query optimization is crucial, but it is harder than for SQL queries
  • Graph queries involve many joins
  • Size estimates become less accurate as the number of joins increases
  • PostgreSQL’s optimizer usually works well, but it needs improvement and more research
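A back-of-the-envelope sketch shows why estimates degrade with the number of joins. It assumes that each join's cardinality estimate is off by the same multiplicative factor and that the errors compound. This is a simplifying assumption for illustration, not a measurement from AgensGraph or PostgreSQL.

```python
def compounded_error(per_join_factor, num_joins):
    """Worst-case estimation error if every join is off by `per_join_factor`."""
    return per_join_factor ** num_joins


# Under this assumption, a modest 1.5x error per join grows quickly with path length.
for joins in (1, 3, 5, 8):
    print(joins, round(compounded_error(1.5, joins), 1))
```

A graph pattern that spans many hops maps to many joins, so even small per-join errors can leave the optimizer choosing plans based on badly skewed row counts.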
LDBC Benchmark
• Linked Data Benchmark Council (http://ldbcouncil.org)
• Participants (http://ldbcouncil.org/industry/members)
  • Oracle Labs, IBM, Huawei, SAP, Sparsity, OpenLink Software, Ontotext, Neo Technology
• Benchmark tools for graph workloads
  • Social Network Benchmark (SNB)
    • Simulates social network service workloads
  • Graph analytics
  • Semantic Publishing Benchmark
    • For RDF and SPARQL
• We ran the SNB interactive workloads
Performance Comparisons
• Caveat
  • We optimized both databases as much as we could
  • The benchmark results can change depending on configuration settings
• Comparisons
  • Neo4j 3.1 community edition
  • AgensGraph 1.0
Future Roadmap
• Distributed and parallel processing
  • Extend AgensGraph using Postgres-XL
• A graph analysis framework, such as a vertex-centric programming model
• Support more graph analysis algorithms
• Integration with Big data systems for large-scale graph processing
Thank You!
http://agensgraph.com
Github: https://github.com/bitnine-oss/agensgraph
Bitnine Global
• Headquartered in Seoul, Korea; founded in 2014
• R&D center in Santa Clara, USA
• Provides technical services for PostgreSQL and big data
• Partners with IBM and Cloudera