A Graph Database on RAMCloud
Jonathan Ellithorpe, Mendel Rosenblum
git clone git@github.com:PlatformLab/TorcDB.git
TorcDB
• Torc: TinkerPop on RAMCloud
• Implementation of the Apache TinkerPop API
• API gaining in popularity
  • DataStax Enterprise Graph
  • IBM Graph
API Example
// transaction opened on first read/write
Vertex tom = graph.vertices(42);                 // (1)
Vertex forum = graph.vertices(79);

Vertex post = graph.addVertex("Post", {          // (2)
    "content": "Just bought an ACME mouse trap, super excited!",
    "created": "Jan 24, 2014"});

graph.addEdge(forum, post, "containerOf");       // (3)
graph.addEdge(post, tom, "hasCreator");

graph.tx().commit();                             // (4) or abort()

// new transaction opened here
Vertex wylie_coyote = graph.vertices(43);
graph.addEdge(wylie_coyote, post, "likes");
graph.tx().commit();
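The transaction behavior in the example (opened implicitly, closed by commit or abort) can be sketched with a toy in-memory class. This is only an illustration: `ToyTxGraph` and its methods are hypothetical names, it models writes only, and it says nothing about how TorcDB or RAMCloud implement transactions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a graph with implicitly opened transactions (hypothetical, not TorcDB code). */
final class ToyTxGraph {
    private final List<String> committedEdges = new ArrayList<>(); // state visible after commit
    private List<String> pending = null;                           // null means no open transaction

    /** Buffers an edge, opening a transaction first if none is open. */
    void addEdge(String from, String to, String label) {
        if (pending == null) {
            pending = new ArrayList<>(); // transaction opened on first write
        }
        pending.add(from + " -" + label + "-> " + to);
    }

    boolean isOpen() {
        return pending != null;
    }

    /** Applies all buffered writes and closes the transaction. */
    void commit() {
        if (pending != null) {
            committedEdges.addAll(pending);
            pending = null;
        }
    }

    /** Discards all buffered writes and closes the transaction. */
    void abort() {
        pending = null;
    }

    /** Returns a copy of the committed edges. */
    List<String> edges() {
        return new ArrayList<>(committedEdges);
    }

    public static void main(String[] args) {
        ToyTxGraph graph = new ToyTxGraph();
        graph.addEdge("forum", "post", "containerOf");
        graph.addEdge("post", "tom", "hasCreator");
        graph.commit();

        graph.addEdge("wylie_coyote", "post", "likes"); // new transaction opened here
        graph.abort();                                   // the "likes" edge is discarded

        // prints [forum -containerOf-> post, post -hasCreator-> tom]
        System.out.println(graph.edges());
    }
}
```

Buffering writes until commit is one simple way to get all-or-nothing behavior in a sketch like this; a real store would also have to handle reads and concurrent transactions.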
Query Language: Gremlin
GraphTraversalSource t = graph.traversal();

// Fetch Tom from the graph
Vertex tom = t.V(42).next();

// What do Tom's friends, who are 28 years old, like?
t.V(tom).out("knows")
    .has("age", eq(28))
    .out("likes");

// Who liked Tom's posts since January 1st, 2015?
t.V(tom).in("hasCreator")
    .has("created", gt(20150101))
    .in("likes");
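To make the first traversal concrete, here is a rough sketch of what `out("knows").has("age", eq(28)).out("likes")` computes, written as plain loops over an in-memory adjacency map. `ToyTraversal` and its fields are hypothetical names for illustration; this is not how a Gremlin engine or TorcDB evaluates traversals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy adjacency-map graph used to spell out one Gremlin traversal (hypothetical). */
final class ToyTraversal {
    // vertex id -> edge label -> ids of vertices reached by outgoing edges
    private final Map<Integer, Map<String, List<Integer>>> outEdges = new HashMap<>();
    private final Map<Integer, Integer> age = new HashMap<>();

    void addEdge(int from, String label, int to) {
        outEdges.computeIfAbsent(from, k -> new HashMap<>())
                .computeIfAbsent(label, k -> new ArrayList<>())
                .add(to);
    }

    void setAge(int vertex, int years) {
        age.put(vertex, years);
    }

    /** One out(label) step from a single vertex. */
    List<Integer> out(int vertex, String label) {
        Map<String, List<Integer>> byLabel =
                outEdges.getOrDefault(vertex, Collections.emptyMap());
        return byLabel.getOrDefault(label, Collections.emptyList());
    }

    /** Mirrors V(start).out("knows").has("age", eq(wantedAge)).out("likes"). */
    List<Integer> likedByFriendsOfAge(int start, int wantedAge) {
        List<Integer> result = new ArrayList<>();
        for (int friend : out(start, "knows")) {
            Integer a = age.get(friend);
            if (a != null && a == wantedAge) {       // the has("age", eq(...)) filter
                result.addAll(out(friend, "likes")); // duplicates are kept, as in Gremlin
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyTraversal g = new ToyTraversal();
        g.addEdge(42, "knows", 43);
        g.addEdge(42, "knows", 44);
        g.setAge(43, 28);
        g.setAge(44, 30);
        g.addEdge(43, "likes", 100);
        g.addEdge(44, "likes", 101);
        System.out.println(g.likedByFriendsOfAge(42, 28)); // prints [100]
    }
}
```

Each traversal step maps to one level of lookup, which is why the latency of such queries depends heavily on how fast individual adjacency reads are.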
Evaluation
• Graph database benchmarks available:
• LinkBench (Facebook)
• LDBC (Linked Data Benchmark Council)
Linked Data Benchmark Council: Social Network Benchmark
Defines:
• Property graph of a social network
  • Persons, Posts, Comments, Tags, etc.
• Fixed set of queries on that graph:
  • 8 Update Queries
    • Add person, add post, add post like, add forum…
  • 7 Short Read Queries
    • Read person's profile, read person's friends, read message content…
  • 14 Complex Read Queries
    • Trend analysis, shortest paths, friends with common interests…
• A range of graph sizes
  • 1GB
    • 11K people
    • ~3.5M posts & comments
  • 100GB
    • 500K people
    • ~300M posts & comments
Neo4j
• Most popular graph database in the world according to db-engines.com in May 2016

Neo4j Architecture
[Figure: layered architecture diagram. Components shown: Traversal API, Core API, Cypher, Object Cache, Transaction Management, Transaction Log, File System Cache, Record Files (nodes, relationships, properties), Operating System, Disk.]
Neo4j High Availability
[Figure: one Neo4j Master and two Neo4j Slaves. Arrow labels: replication (two arrows), reads (three arrows), writes (one arrow).]
MATCH (n:Person {id:42})-[r:KNOWS]-(friend)
WHERE r.creationDate < 2006-01-01
RETURN
friend.id AS personId,
friend.firstName AS firstName,
friend.lastName AS lastName,
r.creationDate AS friendshipCreationDate
ORDER BY friendshipCreationDate ASC
TorcDB 1GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 1,400us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: TorcDB 50th.]
TorcDB vs. Neo4j 1GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 30,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: Neo4j 50th, TorcDB 50th.]
TorcDB vs. Neo4j 1GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 30,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: Neo4j 50th, TorcDB 50th, TorcDB 99th.]
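The chart legends report 50th, 90th, and 99th percentile latencies. The slides do not say how those percentiles were computed, so the following is a minimal sketch that assumes the nearest-rank definition; `LatencyPercentiles` and the sample values are made up for illustration.

```java
import java.util.Arrays;

/** Nearest-rank percentiles over latency samples (assumed method, hypothetical data). */
final class LatencyPercentiles {

    /**
     * Returns the smallest sample such that at least p percent of all
     * samples are less than or equal to it (nearest-rank definition).
     */
    static long percentile(long[] samplesMicros, double p) {
        if (samplesMicros.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samplesMicros.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] samples = {5, 1, 9, 3, 7, 2, 8, 4, 10, 6}; // made-up latencies in microseconds
        System.out.println("50th: " + percentile(samples, 50)); // prints 50th: 5
        System.out.println("90th: " + percentile(samples, 90)); // prints 90th: 9
        System.out.println("99th: " + percentile(samples, 99)); // prints 99th: 10
    }
}
```

With few samples the 99th percentile is simply the maximum, which is one reason tail-latency numbers need many measurements to be meaningful.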
TorcDB 1GB vs. 100GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 4,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: TorcDB 1GB, TorcDB 100GB.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 110us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 50th.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 700us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 50th.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 1,600us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 50th.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 16,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 50th.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 16,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 50th, Neo4j 50th.]
TorcDB 1GB vs. 100GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 40,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: TorcDB 1GB, TorcDB 100GB.]
1GB Dataset Complex Query 1, Friends with Certain Name
[Bar chart. Y-axis: latency, 0ms to 700ms. Systems: Neo4j, TorcDB. Series: 50th, 90th, 99th.]
1GB vs. 100GB Dataset Complex Query 1, Friends with Certain Name
[Bar chart. Y-axis: latency, 0ms to 6,000ms. Systems: TorcDB 1GB, TorcDB 100GB. Series: 50th, 90th, 99th.]
1GB Dataset Complex Query 1, Friends with Certain Name
[Bar chart. Y-axis: latency, 0us to 700us. Systems: Neo4j, TorcDB. Series: 50th, 90th, 99th.]
Neo4j 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 140,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: 50th, 90th, 99th.]
Neo4j 1GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 60,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: 50th, 90th, 99th.]
TorcDB 1GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 500,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: 50th, 90th, 99th.]
TorcDB 1GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 7,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: 50th, 90th, 99th.]
TorcDB 100GB Dataset Short Read Query Latencies
[Bar chart. Y-axis: latency, 0us to 1,000,000us. Queries: PersonProfile, PersonPosts, PersonFriends, MessageContent, MessageCreator, MessageForum, MessageReplies. Series: 50th, 90th, 99th.]
TorcDB 100GB Dataset Update Query Latencies
[Bar chart. Y-axis: latency, 0us to 40,000us. Queries: AddPerson, AddPostLike, AddCommentLike, AddForum, AddForumMembership, AddPost, AddComment, AddFriendship. Series: 50th, 90th, 99th.]
TinkerPop Database API
Graph:
  addVertex(label, keyValueMap) -> Vertex
  addEdge(outV, inV, label, keyValueMap) -> Edge
  vertices(vertexIds[]) -> Iterator<Vertex>
  edges(edgeIds[]) -> Iterator<Edge>
  tx() -> TransactionContext
Vertex:
  remove()
  addEdge(label, inV, keyValueMap) -> Edge
  edges(dir, edgeLabels[]) -> Iterator<Edge>
  vertices(dir, edgeLabels[]) -> Iterator<Vertex>
  properties(propKeys[]) -> Iterator<keyValueMap>
  property(key, value)
Edge:
  remove()
  vertices(direction) -> Iterator<Vertex>
  properties(propKeys[]) -> Iterator<keyValueMap>
  property(key, value) -> Property
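As a rough illustration of how a few of the listed operations fit together, the sketch below models `addVertex`, `addEdge`, and `vertices(dir, edgeLabels[])` with plain in-memory objects. All class names (`ToyGraph`, `ToyVertex`, `ToyEdge`, `ToyDirection`) are hypothetical, it returns lists rather than iterators, and it omits transactions, removal, and ids. It is not the TinkerPop or TorcDB implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory graph mirroring part of the API listing (hypothetical). */
final class ToyGraph {
    ToyVertex addVertex(String label, Map<String, Object> keyValueMap) {
        return new ToyVertex(label, keyValueMap);
    }

    /** Adds a directed edge from outV to inV and indexes it on both endpoints. */
    ToyEdge addEdge(ToyVertex outV, ToyVertex inV, String label) {
        ToyEdge e = new ToyEdge(outV, inV, label);
        outV.outEdges.add(e);
        inV.inEdges.add(e);
        return e;
    }

    public static void main(String[] args) {
        ToyGraph graph = new ToyGraph();
        Map<String, Object> postProps = new HashMap<>();
        postProps.put("content", "Just bought an ACME mouse trap, super excited!");

        ToyVertex tom = graph.addVertex("Person", new HashMap<>());
        ToyVertex wylie = graph.addVertex("Person", new HashMap<>());
        ToyVertex post = graph.addVertex("Post", postProps);
        graph.addEdge(post, tom, "hasCreator");
        graph.addEdge(wylie, post, "likes");

        // Who liked the post? Follow incoming "likes" edges.
        System.out.println(post.vertices(ToyDirection.IN, "likes").size()); // prints 1
    }
}

enum ToyDirection { OUT, IN }

final class ToyEdge {
    final ToyVertex outV; // vertex the edge leaves
    final ToyVertex inV;  // vertex the edge enters
    final String label;

    ToyEdge(ToyVertex outV, ToyVertex inV, String label) {
        this.outV = outV;
        this.inV = inV;
        this.label = label;
    }
}

final class ToyVertex {
    final String label;
    final Map<String, Object> properties;
    final List<ToyEdge> outEdges = new ArrayList<>();
    final List<ToyEdge> inEdges = new ArrayList<>();

    ToyVertex(String label, Map<String, Object> properties) {
        this.label = label;
        this.properties = new HashMap<>(properties);
    }

    /** Neighbors in the given direction; no labels means all labels. */
    List<ToyVertex> vertices(ToyDirection dir, String... edgeLabels) {
        List<String> wanted = Arrays.asList(edgeLabels);
        List<ToyVertex> result = new ArrayList<>();
        for (ToyEdge e : (dir == ToyDirection.OUT ? outEdges : inEdges)) {
            if (wanted.isEmpty() || wanted.contains(e.label)) {
                result.add(dir == ToyDirection.OUT ? e.inV : e.outV);
            }
        }
        return result;
    }
}
```

Indexing each edge on both endpoints is what lets `vertices` answer both the `out(...)` and `in(...)` directions with a local lookup.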