TinkerPop: a story of graphs, DBs, and graph DBs Joshua Shinavier and James Thornton Texas Linux Festival June 13th, 2014

Upload: joshua-shinavier

Post on 10-May-2015

DESCRIPTION

intro to TinkerPop and the Aurelius Graph Cluster for the Graph DB Workshop, Texas Linux Festival 2014

TRANSCRIPT

Page 1: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop: a story of graphs, DBs, and graph DBs

Joshua Shinavier and James Thornton

Texas Linux Festival

June 13th, 2014

Page 2: TinkerPop: a story of graphs, DBs, and graph DBs

Once, there was a thing

Page 3: TinkerPop: a story of graphs, DBs, and graph DBs

v(1)

Let’s call it a vertex

Page 4: TinkerPop: a story of graphs, DBs, and graph DBs

The vertex had some metadata

v(1)

name: “Graph DB workshop”

Page 5: TinkerPop: a story of graphs, DBs, and graph DBs

We’ll call that a property

v(1)

name: “Graph DB workshop”

You are here.

Page 6: TinkerPop: a story of graphs, DBs, and graph DBs

In fact, the vertex had multiple properties

v(1)

name: “Graph DB workshop”
type: “Event”

Page 7: TinkerPop: a story of graphs, DBs, and graph DBs

The properties were of various types

v(1)

name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000
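The two numeric properties look like timestamps. A minimal sketch, assuming (the slide does not say so) that `starts` and `ends` are milliseconds since the Unix epoch; the helper name `from_epoch_millis` is made up for this note:

```python
from datetime import datetime, timezone


def from_epoch_millis(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Property values copied from the slide for v(1).
vertex = {
    "name": "Graph DB workshop",   # string
    "type": "Event",               # string
    "starts": 1402682400000,       # integer (ASSUMED: epoch milliseconds)
    "ends": 1402696800000,         # integer (ASSUMED: epoch milliseconds)
}

starts = from_epoch_millis(vertex["starts"])
ends = from_epoch_millis(vertex["ends"])

print(starts.isoformat())  # 2014-06-13T18:00:00+00:00
print(ends - starts)       # 4:00:00
```

Under that assumption the values fall on the afternoon of the talk date, which suggests the reading is right.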

Page 8: TinkerPop: a story of graphs, DBs, and graph DBs

Our vertex was not alone

v(1)

name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000

v(2)

name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000

Page 9: TinkerPop: a story of graphs, DBs, and graph DBs

Thus, an edge

[Figure: v(1) and v(2), with the same properties as on the previous slide, now connected by an edge]

Page 10: TinkerPop: a story of graphs, DBs, and graph DBs

The edge was directed…

[Figure: v(1) and v(2) as before; the edge between them is now drawn with a direction]

Page 11: TinkerPop: a story of graphs, DBs, and graph DBs

…and labeled

[Figure: v(1) and v(2) as before; the edge from v(1) (the workshop) to v(2) (the festival) now carries the label partOf]

Page 12: TinkerPop: a story of graphs, DBs, and graph DBs

The label types the relationship

[Figure: v(1) and v(2) as before, joined by the partOf edge]

You are here, too.

Page 13: TinkerPop: a story of graphs, DBs, and graph DBs

More vertices joined the fun…

[Figure: v(1), v(2), and their partOf edge as before, plus two new vertices, each with its own partOf edge]

v(3)

name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(4)

name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000

Page 14: TinkerPop: a story of graphs, DBs, and graph DBs

More labels, too

[Figure: the four Event vertices and three partOf edges as before, plus two new vertices attached by hasTopic edges]

v(7)

name: “TinkerPop suite”
type: “Software”

v(8)

name: “Aurelius Graph Cluster”
type: “Software”

Page 15: TinkerPop: a story of graphs, DBs, and graph DBs

Now it was a labeled multigraph

[Figure: the graph so far, plus two new vertices attached by presentedBy edges]

v(5)

name: “James Thornton”
type: “Person”
githubId: “espeed”

v(6)

name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”

Page 16: TinkerPop: a story of graphs, DBs, and graph DBs

A few more edges

[Figure: the graph so far, plus three new edges labeled contributesTo]

Page 17: TinkerPop: a story of graphs, DBs, and graph DBs

Some edges also had properties

[Figure: the graph so far; three edge properties now appear next to the contributesTo edges]

weight: 0.2
weight: 0.8
weight: 1.0

Page 18: TinkerPop: a story of graphs, DBs, and graph DBs

We call this a Property Graph

[Figure: the complete graph]

v(1)

name: “Graph DB workshop”
type: “Event”
starts: 1402682400000
ends: 1402696800000

v(2)

name: “Texas Linux Fest”
type: “Event”
starts: 1402664400000
ends: 1402808400000

v(3)

name: “Chef Workshop”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(4)

name: “Canonical Charm School”
type: “Event”
starts: 1402664400000
ends: 1402696800000

v(5)

name: “James Thornton”
type: “Person”
githubId: “espeed”

v(6)

name: “Joshua Shinavier”
type: “Person”
githubId: “joshsh”

v(7)

name: “TinkerPop suite”
type: “Software”

v(8)

name: “Aurelius Graph Cluster”
type: “Software”

Edge labels:

partOf (×3)
presentedBy (×2)
hasTopic (×2)
contributesTo (×3)

Edge properties:

weight: 0.2
weight: 0.8
weight: 1.0
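To make the definition concrete, here is a minimal in-memory sketch of a Property Graph: vertices and directed, labeled edges, each carrying key/value properties. The class and method names are invented for this note and are not the Blueprints API. The flattened figure does not preserve every edge's endpoints, so the `presentedBy` and `contributesTo` endpoints and the weight value used below are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Vertex:
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    out_v: Vertex   # tail: the vertex the edge leaves
    label: str      # the label types the relationship
    in_v: Vertex    # head: the vertex the edge points to
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: List[Edge] = []

    def add_vertex(self, vid: int, **props: Any) -> Vertex:
        vertex = Vertex(vid, dict(props))
        self.vertices[vid] = vertex
        return vertex

    def add_edge(self, out_v: Vertex, label: str, in_v: Vertex, **props: Any) -> Edge:
        edge = Edge(out_v, label, in_v, dict(props))
        self.edges.append(edge)
        return edge

    def out_edges(self, v: Vertex, label: Optional[str] = None) -> List[Edge]:
        """Edges leaving v, optionally restricted to one label."""
        return [e for e in self.edges
                if e.out_v is v and (label is None or e.label == label)]


g = PropertyGraph()
workshop = g.add_vertex(1, name="Graph DB workshop", type="Event",
                        starts=1402682400000, ends=1402696800000)
fest = g.add_vertex(2, name="Texas Linux Fest", type="Event",
                    starts=1402664400000, ends=1402808400000)
josh = g.add_vertex(6, name="Joshua Shinavier", type="Person", githubId="joshsh")
tinkerpop = g.add_vertex(7, name="TinkerPop suite", type="Software")

g.add_edge(workshop, "partOf", fest)
# ASSUMPTION: the endpoints below, and which weight belongs to which edge,
# are not recoverable from the transcript; they are chosen for illustration.
g.add_edge(workshop, "presentedBy", josh)
g.add_edge(josh, "contributesTo", tinkerpop, weight=0.8)

names = [e.in_v.properties["name"] for e in g.out_edges(workshop)]
print(names)  # ['Texas Linux Fest', 'Joshua Shinavier']
```

Two edges between the same pair of vertices are allowed as long as they are separate `Edge` objects, which is what makes this a multigraph.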

Page 19: TinkerPop: a story of graphs, DBs, and graph DBs

Many graph DB data models are variations on this theme

[Figure: the same Property Graph as on the previous slide]

Page 20: TinkerPop: a story of graphs, DBs, and graph DBs

Neo4j

Page 21: TinkerPop: a story of graphs, DBs, and graph DBs

OrientDB

Page 22: TinkerPop: a story of graphs, DBs, and graph DBs

Sparksee*

* the graph database previously known as DEX

Page 23: TinkerPop: a story of graphs, DBs, and graph DBs

etc.

Page 24: TinkerPop: a story of graphs, DBs, and graph DBs

Enter Blueprints

• single Property Graph API supported by diverse graph database backends

• choose your favorite, but avoid vendor lock-in

• Blueprints : graph DB :: JDBC : RDBMS

• implementations, “ouplementations”, test suites, and helper utilities are built on top
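The JDBC analogy can be sketched as one interface with interchangeable backends. This is an illustration of the idea only: `Graph`, `AdjacencyGraph`, `EdgeListGraph`, and `events_containing` are hypothetical names, not the actual Blueprints classes, and the `hasTopic` endpoints are assumed.

```python
from abc import ABC, abstractmethod
from typing import List


class Graph(ABC):
    """Hypothetical backend-neutral interface; NOT the real Blueprints API."""

    @abstractmethod
    def add_edge(self, out_id: int, label: str, in_id: int) -> None: ...

    @abstractmethod
    def out_neighbors(self, vertex_id: int, label: str) -> List[int]: ...


class AdjacencyGraph(Graph):
    """Backend 1: adjacency lists keyed by the outgoing vertex."""

    def __init__(self) -> None:
        self._adj = {}

    def add_edge(self, out_id, label, in_id):
        self._adj.setdefault(out_id, []).append((label, in_id))

    def out_neighbors(self, vertex_id, label):
        return [n for (l, n) in self._adj.get(vertex_id, []) if l == label]


class EdgeListGraph(Graph):
    """Backend 2: one flat list of (out, label, in) triples."""

    def __init__(self) -> None:
        self._edges = []

    def add_edge(self, out_id, label, in_id):
        self._edges.append((out_id, label, in_id))

    def out_neighbors(self, vertex_id, label):
        return [i for (o, l, i) in self._edges if o == vertex_id and l == label]


def events_containing(graph: Graph, vertex_id: int) -> List[int]:
    """Application code: written once, against the interface only."""
    return graph.out_neighbors(vertex_id, "partOf")


results = []
for backend in (AdjacencyGraph(), EdgeListGraph()):
    backend.add_edge(1, "partOf", 2)
    backend.add_edge(1, "hasTopic", 7)  # ASSUMED endpoints, for illustration
    results.append(events_containing(backend, 1))

print(results)  # [[2], [2]]
```

Swapping the backend changes nothing in `events_containing`, which is the point of the "avoid vendor lock-in" bullet.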

Page 25: TinkerPop: a story of graphs, DBs, and graph DBs

Blueprints implementations

Page 26: TinkerPop: a story of graphs, DBs, and graph DBs

Now we need a query language…

• build it on the Blueprints API

• query over any Blueprints-compatible DB

• make it path-like, with side-effects

• match abstract traversals through the graph, filtering, ranking, and mutating as you go

• make it interactive. How about a REPL?

Page 27: TinkerPop: a story of graphs, DBs, and graph DBs

Enter Gremlin

• a domain-specific language for traversing graphs

• Turing-complete, permits access to the full JDK

• has been adapted to various JVM languages

• Gremlin : graph DB :: SQL : RDBMS… sort of

Page 28: TinkerPop: a story of graphs, DBs, and graph DBs

Think “pipes and filters”
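The pipes-and-filters idea can be imitated with Python generators: each step consumes a stream of vertices and lazily emits another. This is not Gremlin or the Pipes library, only a sketch of the style; the step names `out`, `in_`, and `where` are chosen for this note, and the edge endpoints for v(3) and v(4) are assumed from the figure.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# (tail, label, head) triples. ASSUMPTION: endpoints inferred from the figure.
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (4, "partOf", 2),
]
NAMES = {1: "Graph DB workshop", 2: "Texas Linux Fest",
         3: "Chef Workshop", 4: "Canonical Charm School"}

Pipe = Callable[[Iterable[int]], Iterator[int]]


def out(label: str) -> Pipe:
    """Pipe: follow outgoing edges with the given label."""
    def pipe(vertices: Iterable[int]) -> Iterator[int]:
        for v in vertices:
            for tail, lbl, head in EDGES:
                if tail == v and lbl == label:
                    yield head
    return pipe


def in_(label: str) -> Pipe:
    """Pipe: follow incoming edges with the given label, backwards."""
    def pipe(vertices: Iterable[int]) -> Iterator[int]:
        for v in vertices:
            for tail, lbl, head in EDGES:
                if head == v and lbl == label:
                    yield tail
    return pipe


def where(predicate: Callable[[int], bool]) -> Pipe:
    """Filter: keep only vertices that satisfy the predicate."""
    def pipe(vertices: Iterable[int]) -> Iterator[int]:
        return (v for v in vertices if predicate(v))
    return pipe


def pipeline(start: Iterable[int], *pipes: Pipe) -> Iterator[int]:
    """Chain the pipes; nothing runs until the result is consumed."""
    stream: Iterator[int] = iter(start)
    for p in pipes:
        stream = p(stream)
    return stream


# "Which other events share a parent event with v(1)?"
siblings = list(pipeline([1], out("partOf"), in_("partOf"), where(lambda v: v != 1)))
print([NAMES[v] for v in siblings])  # ['Chef Workshop', 'Canonical Charm School']
```

Read left to right, the query is a path through the graph: out along partOf, back in along partOf, then filter out the starting vertex.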

Page 29: TinkerPop: a story of graphs, DBs, and graph DBs

The rest of the TinkerPop family

• Pipes: dataflow framework. The basis of Gremlin

• Frames: Java bean framework for graphs

• Furnace: Property Graph algorithms

• Rexster: high-performance graph database server

Page 30: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop is…

• a developer group creating an open-source graph DB stack

• a community of users and third-party implementors

• a foundation for building high-performance graph applications of any size

• model some data on your laptop

• build massive clustered applications

• open source, BSD licensed

Page 31: TinkerPop: a story of graphs, DBs, and graph DBs

A detailed guide to the rest of this workshop

• intro to the Aurelius Graph Cluster

• demos of graph tools and concepts

• guided installation of tools

• preview of TinkerPop3

Page 32: TinkerPop: a story of graphs, DBs, and graph DBs

Thanks!

Page 33: TinkerPop: a story of graphs, DBs, and graph DBs

The Aurelius Graph Cluster

Page 34: TinkerPop: a story of graphs, DBs, and graph DBs

In TinkerPop…

• we adapt various graph DBs to a unified API

• they become Property Graph databases

Page 35: TinkerPop: a story of graphs, DBs, and graph DBs

With AGC…

• we adapt various high-performance databases to the Titan API

• they become graph databases

Page 36: TinkerPop: a story of graphs, DBs, and graph DBs

Take your pick of CAP

Page 37: TinkerPop: a story of graphs, DBs, and graph DBs

Titan highlights

• graphs, transactions scale with the number of machines in a cluster

• geo, numeric range, and full text search for vertices and edges

• support for either of two indexing backends

• ElasticSearch, Lucene

• native support for Blueprints, Rexster

Page 38: TinkerPop: a story of graphs, DBs, and graph DBs

Dealing with supernodes

• Titan’s vertex-centric indices permit ordered querying from a vertex

• e.g. retrieve “knows” edges… in order of “since” timestamp

• iterates efficiently, even if there are thousands of edges
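The idea behind a vertex-centric index can be sketched as a per-vertex structure that keeps each label's edges sorted by a chosen key, so that an ordered or range query never scans the vertex's other edges. This is a toy in-memory illustration under that assumption, not Titan's storage layout or API; `VertexCentricIndex` and its methods are hypothetical names, and the sample data is invented.

```python
import bisect
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class VertexCentricIndex:
    """Edges of ONE vertex, kept sorted by a sort key, separately per label."""

    def __init__(self) -> None:
        # label -> list of (since, neighbor), always in ascending order
        self._by_label: Dict[str, List[Tuple[int, str]]] = defaultdict(list)

    def add_edge(self, label: str, since: int, neighbor: str) -> None:
        # insort keeps the list ordered as edges arrive
        bisect.insort(self._by_label[label], (since, neighbor))

    def edges(self, label: str, since_from: Optional[int] = None,
              limit: Optional[int] = None) -> List[Tuple[int, str]]:
        """Edges with this label in 'since' order, optionally from a lower bound."""
        entries = self._by_label.get(label, [])
        # (since_from,) sorts before every (since_from, neighbor) entry,
        # so bisect_left lands on the first edge with since >= since_from.
        start = 0 if since_from is None else bisect.bisect_left(entries, (since_from,))
        stop = len(entries) if limit is None else min(len(entries), start + limit)
        return entries[start:stop]


# Invented sample data for a single supernode.
index = VertexCentricIndex()
index.add_edge("knows", 2012, "carol")
index.add_edge("knows", 2009, "alice")
index.add_edge("knows", 2011, "bob")
index.add_edge("likes", 2010, "graphs")

print(index.edges("knows"))
# [(2009, 'alice'), (2011, 'bob'), (2012, 'carol')]
print(index.edges("knows", since_from=2010, limit=1))
# [(2011, 'bob')]
```

The lookup is a binary search followed by a slice, so its cost depends on the number of edges returned rather than on the total degree of the vertex. A Python list makes insertion linear; a real store would use a sorted on-disk structure instead.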

Page 39: TinkerPop: a story of graphs, DBs, and graph DBs

What about Faunus?

Page 40: TinkerPop: a story of graphs, DBs, and graph DBs

Faunus…

• is a Hadoop-based graph analytics engine

• in Titan 0.5 will simply be called Titan/Hadoop

• adds support for global distributed graph operations

• applies (a subset of) Gremlin in a breadth-first fashion
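"Breadth-first" here means that each traversal step is applied to the whole current frontier before the next step starts, which is what lets a step run as one pass over all edges. A minimal single-machine sketch of that evaluation order follows; it is not Faunus code, the function names are made up, and the partOf endpoints for v(3) and v(4) are assumed.

```python
from collections import Counter
from typing import List, Tuple

# (tail, label, head) triples. ASSUMPTION: endpoints inferred from the figure.
EDGES: List[Tuple[int, str, int]] = [
    (1, "partOf", 2),
    (3, "partOf", 2),
    (4, "partOf", 2),
]
VERTICES = [1, 2, 3, 4]


def step_out(frontier: Counter, label: str) -> Counter:
    """One pass over all edges: push every traverser along matching out-edges."""
    nxt: Counter = Counter()
    for tail, lbl, head in EDGES:
        if lbl == label and frontier[tail]:
            nxt[head] += frontier[tail]
    return nxt


def step_in(frontier: Counter, label: str) -> Counter:
    """One pass over all edges: push every traverser backwards along in-edges."""
    nxt: Counter = Counter()
    for tail, lbl, head in EDGES:
        if lbl == label and frontier[head]:
            nxt[tail] += frontier[head]
    return nxt


# Global operation: start one traverser at EVERY vertex, not at a single one.
frontier = Counter({v: 1 for v in VERTICES})
after_out = step_out(frontier, "partOf")
after_in = step_in(after_out, "partOf")

print(dict(after_out))  # {2: 3}
print(dict(after_in))   # {1: 3, 3: 3, 4: 3}
```

The frontier stores a count per vertex instead of individual paths, so its size is bounded by the number of vertices even when the number of paths grows quickly.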

Page 41: TinkerPop: a story of graphs, DBs, and graph DBs

Faunus inputs and outputs

• Hadoop SequenceFile format (in/out)

• Titan graph DB (in/out)

• GraphSON format (in/out)

• Rexster (in)

• RDF (in)

• Gremlin scripts (in/out)

Page 42: TinkerPop: a story of graphs, DBs, and graph DBs

Demo time

Page 43: TinkerPop: a story of graphs, DBs, and graph DBs

TinkerPop3

Page 44: TinkerPop: a story of graphs, DBs, and graph DBs

What’s new in TP3

• new Gremlin implementation which makes good use of Java 8 closures, enables introspection and optimization of traversals

• new OLAP API with support for message passing systems like Giraph, Hama, Faunus, etc.

• revamped I/O utilities with support for GraphSON, GraphML, and GremlinKryo

• new server model, incl. remote execution of scripts via WebSocket API, server plugin support, customizable serialization formats
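For readers new to the message-passing model mentioned in the OLAP bullet, here is a generic sketch of the style: computation proceeds in supersteps, and in each one a vertex reads its inbox, updates its state, and sends messages to its neighbors. This is not the TinkerPop3 OLAP API; the function name and the sample graph are invented. The example labels connected components by propagating the smallest vertex id.

```python
from typing import Dict, Iterable, List, Tuple


def connected_components(vertices: Iterable[int],
                         edges: Iterable[Tuple[int, int]],
                         max_supersteps: int = 50) -> Dict[int, int]:
    """Label each vertex with the smallest vertex id in its component."""
    vertices = list(vertices)
    neighbors: Dict[int, set] = {v: set() for v in vertices}
    for a, b in edges:                      # treat edges as undirected
        neighbors[a].add(b)
        neighbors[b].add(a)

    label = {v: v for v in vertices}        # per-vertex state

    # Superstep 0: every vertex announces its own label.
    inbox: Dict[int, List[int]] = {v: [] for v in vertices}
    for v in vertices:
        for n in neighbors[v]:
            inbox[n].append(label[v])

    for _ in range(max_supersteps):
        outbox: Dict[int, List[int]] = {v: [] for v in vertices}
        changed = False
        for v in vertices:                  # each vertex sees only its inbox
            if inbox[v] and min(inbox[v]) < label[v]:
                label[v] = min(inbox[v])
                changed = True
                for n in neighbors[v]:      # tell neighbors about the update
                    outbox[n].append(label[v])
        if not changed:                     # no vertex changed: converged
            break
        inbox = outbox                      # messages arrive next superstep
    return label


# Invented sample graph: {1, 2, 3} connected, 4 isolated, {5, 6} connected.
labels = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (5, 6)])
print(labels)  # {1: 1, 2: 1, 3: 1, 4: 4, 5: 5, 6: 5}
```

Messages sent in one superstep are only visible in the next, so the vertices within a superstep can be processed in any order or in parallel.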

Page 45: TinkerPop: a story of graphs, DBs, and graph DBs

Gremlitron

• Blueprints, Pipes, and Gremlin are all integrated in TinkerPop3

• Frames obsoleted by Gremlin DSLs

• Furnace is Gremlin OLAP

• Rexster is Gremlin Server

Page 46: TinkerPop: a story of graphs, DBs, and graph DBs

Try it out

• at:

• https://github.com/tinkerpop/tinkerpop3

• mailing list:

• https://groups.google.com/forum/gremlin-users

• we welcome your feedback and/or PRs