A Graph Database on RAMCloud

Jonathan Ellithorpe, Mendel Rosenblum

git clone [email protected]:PlatformLab/TorcDB.git

TorcDB

• Torc: TinkerPop on RAMCloud

• Implementation of the Apache TinkerPop API

• API gaining in popularity

• DataStax Enterprise Graph

• IBM Graph

Data Model

API Example

// transaction opened on first read/write
Vertex tom = graph.vertices(42);                      // (1)
Vertex forum = graph.vertices(79);

Vertex post = graph.addVertex("Post", {               // (2)
    "content": "Just bought an ACME mouse trap, super excited!",
    "created": "Jan 24, 2014"});

graph.addEdge(forum, post, "containerOf");            // (3)
graph.addEdge(post, tom, "hasCreator");

graph.tx().commit(); // or abort()                    // (4)

// new transaction opened here
Vertex wylie_coyote = graph.vertices(43);
graph.addEdge(wylie_coyote, post, "likes");
graph.tx().commit();
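The transaction behavior in the example (a transaction opens implicitly on the first read or write and ends at commit() or abort()) can be illustrated with a minimal in-memory sketch. This is not TorcDB or TinkerPop code: the class ToyTxGraph and all of its methods are hypothetical, and buffering writes until commit is only one assumed way to get these semantics. The slides do not describe how TorcDB implements transactions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy graph with implicit transactions; not TorcDB's implementation. */
public class ToyTxGraph {
    // Committed state: vertex id -> label, and edges encoded as "src-label->dst".
    private final Map<Long, String> vertices = new HashMap<>();
    private final List<String> edges = new ArrayList<>();

    // Buffered writes of the open transaction; null when no transaction is open.
    private Map<Long, String> pendingVertices;
    private List<String> pendingEdges;
    private long nextId = 1;

    // Opens a transaction lazily, mirroring "opened on first read/write".
    private void ensureOpen() {
        if (pendingVertices == null) {
            pendingVertices = new HashMap<>();
            pendingEdges = new ArrayList<>();
        }
    }

    public boolean isOpen() {
        return pendingVertices != null;
    }

    public long addVertex(String label) {
        ensureOpen();
        long id = nextId++;
        pendingVertices.put(id, label);
        return id;
    }

    public void addEdge(long src, long dst, String label) {
        ensureOpen();
        pendingEdges.add(src + "-" + label + "->" + dst);
    }

    // Makes all buffered writes visible at once and closes the transaction.
    public void commit() {
        if (!isOpen()) {
            return;
        }
        vertices.putAll(pendingVertices);
        edges.addAll(pendingEdges);
        pendingVertices = null;
        pendingEdges = null;
    }

    // Discards all buffered writes and closes the transaction.
    public void abort() {
        pendingVertices = null;
        pendingEdges = null;
    }

    public int committedVertexCount() {
        return vertices.size();
    }

    public int committedEdgeCount() {
        return edges.size();
    }

    public static void main(String[] args) {
        ToyTxGraph g = new ToyTxGraph();
        long forum = g.addVertex("Forum");   // first write opens a transaction
        long post = g.addVertex("Post");
        g.addEdge(forum, post, "containerOf");
        g.commit();                          // both vertices and the edge become visible

        long fan = g.addVertex("Person");    // a new transaction opens here
        g.addEdge(fan, post, "likes");
        g.abort();                           // nothing from this transaction survives

        System.out.println(g.committedVertexCount() + " vertices, "
                + g.committedEdgeCount() + " edges committed");
    }
}
```

The point of the sketch is only that a caller never opens a transaction explicitly: the first operation after a commit() or abort() starts the next one.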

Query Language: Gremlin

GraphTraversalSource t = graph.traversal();

// Fetch Tom from the graph

Vertex tom = t.V(42).next();

// What do Tom's friends, who are 28 years old, like?

t.V(tom).out("knows")
    .has("age", eq(28))
    .out("likes");

// Who liked Tom's posts since January 1st, 2015?

t.V(tom).in("hasCreator")
    .has("created", gt(20150101))
    .in("likes");
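To make the first traversal concrete, here is a rough sketch that unrolls out("knows").has("age", eq(28)).out("likes") into plain loops over an in-memory adjacency map. The class ToyTraversal, its methods, and the vertex ids in main are all hypothetical illustrations. This is not how TinkerPop or TorcDB evaluates traversals, only the same three steps written out by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical adjacency-list graph used to unroll a Gremlin-style traversal by hand. */
public class ToyTraversal {
    // vertex id -> (edge label -> ids reached over outgoing edges with that label)
    private final Map<Integer, Map<String, List<Integer>>> out = new HashMap<>();
    // vertex id -> value of the "age" property (absent for vertices without one)
    private final Map<Integer, Integer> age = new HashMap<>();

    public void addEdge(int src, String label, int dst) {
        out.computeIfAbsent(src, k -> new HashMap<>())
           .computeIfAbsent(label, k -> new ArrayList<>())
           .add(dst);
    }

    public void setAge(int vertex, int years) {
        age.put(vertex, years);
    }

    // One out(label) step from a single vertex.
    private List<Integer> outNeighbors(int vertex, String label) {
        return out.getOrDefault(vertex, Collections.emptyMap())
                  .getOrDefault(label, Collections.emptyList());
    }

    /** Hand-written analogue of V(start).out("knows").has("age", eq(wantedAge)).out("likes"). */
    public List<Integer> likedByFriendsOfAge(int start, int wantedAge) {
        List<Integer> result = new ArrayList<>();
        for (int friend : outNeighbors(start, "knows")) {      // out("knows")
            Integer friendAge = age.get(friend);
            if (friendAge != null && friendAge == wantedAge) { // has("age", eq(...))
                result.addAll(outNeighbors(friend, "likes"));  // out("likes")
            }
        }
        return result; // duplicates are kept, as in a Gremlin traversal without dedup()
    }

    public static void main(String[] args) {
        ToyTraversal g = new ToyTraversal();
        g.addEdge(42, "knows", 1);
        g.addEdge(42, "knows", 2);
        g.setAge(1, 28);
        g.setAge(2, 30);
        g.addEdge(1, "likes", 100);
        g.addEdge(1, "likes", 101);
        g.addEdge(2, "likes", 102);
        System.out.println(g.likedByFriendsOfAge(42, 28));
    }
}
```

Each Gremlin step maps to one stage of the loop, which is why a traversal of this shape costs roughly one neighbor lookup per friend plus one per matching friend.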

TorcDB Client

[Diagram: two applications, each shown as an App box together with a TorcDB Library box]

Evaluation

• Graph database benchmarks available:

• LinkBench (Facebook)

• LDBC (Linked Data Benchmark Council)

Linked Data Benchmark Council: Social Network Benchmark

Defines:

• Property graph of a social network

    • Persons, Posts, Comments, Tags, etc.

• Fixed set of queries on that graph:

    • 8 Update Queries: add person, add post, add post like, add forum…

    • 7 Short Read Queries: read person's profile, read person's friends, read message content…

    • 14 Complex Read Queries: trend analysis, shortest paths, friends with common interests…

• A range of graph sizes:

    • 1GB: 11K people, ~3.5M posts & comments

    • 100GB: 500K people, ~300M posts & comments

Neo4j

• Most popular graph database in the world according to db-engines.com in May 2016

Neo4j Architecture

[Diagram: Neo4j architecture, showing these components: Traversal API, Core API, Cypher, Object Cache, Transaction Management, Transaction Log, File System Cache, Record Files (nodes, relationships, properties), Disk, Operating System]

Neo4j High Availability

[Diagram: one Neo4j Master and two Neo4j Slaves, with two "replication" links, three "reads" arrows, and one "writes" arrow]

MATCH (n:Person {id:42})-[r:KNOWS]-(friend)

WHERE r.creationDate < 2006-01-01

RETURN

friend.id AS personId,

friend.firstName AS firstName,

friend.lastName AS lastName,

r.creationDate AS friendshipCreationDate

ORDER BY friendshipCreationDate ASC

Setup

[Diagram: a Neo4j Benchmark Client, a REST API, and a Neo4j Server; a TorcDB Benchmark Client, the Infrc transport, and a RAMCloud Cluster]

• All servers have 24GB memory

• RAMCloud cluster:

• 25 servers

• replicas=3

Results

TorcDB 1GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 1,400us; series: TorcDB 50th]

TorcDB vs. Neo4j 1GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 30,000us; series: Neo4j 50th, TorcDB 50th]

TorcDB vs. Neo4j 1GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 30,000us; series: Neo4j 50th, TorcDB 50th, TorcDB 99th]

TorcDB 1GB vs. 100GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 4,000us; series: TorcDB 1GB, TorcDB 100GB]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 110us; series: TorcDB 50th]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 700us; series: TorcDB 50th]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 1,600us; series: TorcDB 50th]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 16,000us; series: TorcDB 50th]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 16,000us; series: TorcDB 50th, Neo4j 50th]

TorcDB 1GB vs. 100GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 40,000us; series: TorcDB 1GB, TorcDB 100GB]

1GB Dataset Complex Query 1, Friends with Certain Name

[Bar chart: latency for Neo4j and TorcDB; y-axis 0ms to 700ms; series: 50th, 90th, 99th]

1GB vs. 100GB Dataset Complex Query 1, Friends with Certain Name

[Bar chart: latency for TorcDB 1GB and TorcDB 100GB; y-axis 0ms to 6,000ms; series: 50th, 90th, 99th]

Live Demo!

Open Source

Extra Slides

1GB Dataset Complex Query 1, Friends with Certain Name

[Bar chart: latency for Neo4j and TorcDB; y-axis 0us to 700us; series: 50th, 90th, 99th]

Neo4j 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 140,000us; series: 50th, 90th, 99th]

Neo4j 1GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 60,000us; series: 50th, 90th, 99th]

TorcDB 1GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 500,000us; series: 50th, 90th, 99th]

TorcDB 1GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 7,000us; series: 50th, 90th, 99th]

TorcDB 100GB Dataset Short Read Query Latencies

[Bar chart: latency per short read query (PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies); y-axis 0us to 1,000,000us; series: 50th, 90th, 99th]

TorcDB 100GB Dataset Update Query Latencies

[Bar chart: latency per update query (AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship); y-axis 0us to 40,000us; series: 50th, 90th, 99th]

TinkerPop Database API

Graph:
    addVertex(label, keyValueMap) -> Vertex
    addEdge(outV, inV, label, keyValueMap) -> Edge
    vertices(vertexIds[]) -> Iterator<Vertex>
    edges(edgeIds[]) -> Iterator<Edge>
    tx() -> TransactionContext

Vertex:
    remove()
    addEdge(label, inV, keyValueMap) -> Edge
    edges(dir, edgeLabels[]) -> Iterator<Edge>
    vertices(dir, edgeLabels[]) -> Iterator<Vertex>
    properties(propKeys[]) -> Iterator<keyValueMap>
    property(key, value)

Edge:
    remove()
    vertices(direction) -> Iterator<Vertex>
    properties(propKeys[]) -> Iterator<keyValueMap>
    property(key, value) -> Property