allegograph

41
AllegroGraph as a Graph Database Jans Aasman, Ph.D. CEO - Franz Inc [email protected]

Upload: university-of-new-south-wales

Post on 11-May-2015


TRANSCRIPT

Page 1: Allegograph

AllegroGraph as a Graph Database

Jans Aasman, Ph.D. CEO - Franz Inc [email protected]

Page 2: Allegograph
Page 3: Allegograph

Contents

• AllegroGraph as a
  – QuintupleStore (well, OctupleStore in 2011)
  – RDF store
  – Graph Database
• Agraph architecture
• Extreme use cases
  – AMDOCS … CRM on top of a trillion triples
  – Pharmaceutical … explore connections in graph space
  – Demo

Page 4: Allegograph

Agraph as a quintuple store

• S, P, O, G + unique ID + transaction #
• SPOG can be any data type

1 2.0 3 4
2001-12-12 after 010-12-12 +19258781444
Jans loves pizza file1 12
NoOne believes 12

• And include very efficient geospatial and temporal representations and indices
• 6 default indices, 24 user-controlled indices
• Range indexing, free-text indexing
• Neighborhood matrices & UPI maps (for 1 ms access)
• 2011: time, security
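The quintuple idea above can be sketched in a few lines of Python. This is only an illustration, not AllegroGraph's storage format: the class name `TinyQuintStore`, its two indices, and the helper methods are assumptions made for the example, and the transaction number is left out.

```python
from collections import defaultdict
from itertools import count
from typing import NamedTuple


class Quint(NamedTuple):
    """One statement: subject, predicate, object, graph, plus a unique ID."""
    s: object
    p: object
    o: object
    g: object
    id: int


class TinyQuintStore:
    """Hypothetical in-memory store with two of the possible index orders."""

    def __init__(self):
        self._ids = count(1)
        self.by_id = {}
        self.sp_index = defaultdict(list)  # (s, p) -> statement IDs
        self.po_index = defaultdict(list)  # (p, o) -> statement IDs

    def add(self, s, p, o, g=None):
        quint = Quint(s, p, o, g, next(self._ids))
        self.by_id[quint.id] = quint
        self.sp_index[(s, p)].append(quint.id)
        self.po_index[(p, o)].append(quint.id)
        return quint.id

    def objects(self, s, p):
        return [self.by_id[i].o for i in self.sp_index.get((s, p), [])]

    def subjects(self, p, o):
        return [self.by_id[i].s for i in self.po_index.get((p, o), [])]


store = TinyQuintStore()
pizza_id = store.add("Jans", "loves", "pizza", g="file1")
# A statement about a statement: the object is the ID of the first quint.
store.add("NoOne", "believes", pizza_id)

print(store.objects("Jans", "loves"))        # ['pizza']
print(store.subjects("believes", pizza_id))  # ['NoOne']
```

Because every statement carries its own ID, a second statement can point at it directly, which is what the "NoOne believes 12" row on the slide appears to show.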

Page 5: Allegograph

Agraph as an RDF store

• RDF store when you adhere to the RDF conventions.
• Full SPARQL 1.0, most of SPARQL 1.1
• RDFS++ reasoner
• GeoSpatial and Temporal representations.
• Prolog for Rules
• Soon Common Logic (CLIF+)
  – As a usability layer on top of Prolog
  – Easier to combine Rules and Queries

Page 6: Allegograph

Agraph as a Graph Database

• If you want a Property Graph:
  – use the graph argument

Jans loves pizza gr1
gr1 weight 90
gr1 author Sophia
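A minimal Python sketch of the same trick: the fourth (graph) slot names the edge, and further statements about that name act as edge attributes. The function `edge_attributes` and the tuple layout are assumptions for illustration, not an AllegroGraph API.

```python
# Each statement is (subject, predicate, object, graph); graph may be None.
quads = [
    ("Jans", "loves", "pizza", "gr1"),
    ("gr1", "weight", 90, None),
    ("gr1", "author", "Sophia", None),
]


def edge_attributes(quads, s, p, o):
    """Collect attributes attached to the edge (s, p, o) via its graph name."""
    graph_names = {
        g for (qs, qp, qo, g) in quads
        if (qs, qp, qo) == (s, p, o) and g is not None
    }
    # Any statement whose subject is one of those graph names is an attribute.
    return {qp: qo for (qs, qp, qo, _g) in quads if qs in graph_names}


attrs = edge_attributes(quads, "Jans", "loves", "pizza")
print(attrs)  # {'weight': 90, 'author': 'Sophia'}
```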

Page 7: Allegograph
Page 8: Allegograph

Schema

• Node typing: Yes
• Edge typing: Yes
• Attributes (nodes): Yes
• Attributes (edges): Yes: A trusts B gr1, gr1 certainty 80.
• Directed edges: Yes: A trusts B
• Undirected edges: Yes: if using RDFS symmetric property or generators
• Restricted edges: Yes, if it means there can be islands.
• Loop edges: Yes, A loves A
• Attribute indexing: Yes
• Starting node: No, although, is that a DB property?
• Schema: Yes and No: On demand you can use Ontology and validation is straightforward

Page 9: Allegograph

Querying

• Language: Lisp, Prolog, JavaScript and toy version of Gremlin
• Traversals: Yes, through adjacency lists and special indices. This seems to be an implementation point and not a fundamental property

Page 10: Allegograph

Database

• Transactional: Yes
• ACID: Yes
• Fully Indexed: Yes
• Distributed: Federation (in-machine, between machines), AG5
• Cache: Yes, adjacency vectors (neighbourhood matrices)
• Embeddable: Yes: 3.3, No: 4.2.x
• Store-engine: Custom
• Migration framework: From RDB to Graph DB? Various
• Object mapping: Only in Lisp, not in clients.

Page 11: Allegograph

Utilities

• Shell: All from Lisp shell, some from cshell, wget/curl
• Algorithms: Yes, JavaScript, Prolog and Lisp
• Benchmark: Yes, but only for RDF stores and reasoning
• Protocols: REST/JSON
• RDF Store: Yes
• OWL Store: Yes
• IDE Integration: Yes
• Admin tool: Yes, AGWebview
• Importer: Yes, from various input formats
• Exporter: Yes, clients let you dump triples
• Loader: AGLoad, Gruff, AGWebview
• Scripting Language: Lisp and JavaScript.

Page 12: Allegograph

Languages

• Java
• Python
• Ruby
• C#
• Scala
• Clojure
• Perl
• PHP

Page 13: Allegograph

Many graph algorithms using generator model

• Because of Social Network Analysis requirements we implement many graph algorithms.
  – Using generators
  – A first-class function that takes
    • One node as input
    • Returns all children
• And neighbourhood matrices (or adjacency hash-tables) for speed.
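As a rough Python illustration of the generator model: a generator is just a function from one node to its children, and a precomputed adjacency hash-table keeps that lookup cheap. The names `make_generator` and `reachable`, and the sample triples, are assumptions made for this sketch.

```python
from collections import defaultdict

# Hypothetical sample data: (subject, predicate, object)
triples = [
    ("a", "knows", "b"),
    ("a", "knows", "c"),
    ("c", "knows", "d"),
    ("d", "likes", "e"),
]


def make_generator(triples, predicate):
    """Return a first-class function: one node in, all children out."""
    adjacency = defaultdict(list)  # adjacency hash-table, built once
    for s, p, o in triples:
        if p == predicate:
            adjacency[s].append(o)

    def generator(node):
        return list(adjacency.get(node, []))

    return generator


def reachable(start, generator):
    """Any graph algorithm can be written against the generator alone."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for child in generator(node):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


knows = make_generator(triples, "knows")
print(knows("a"))                      # ['b', 'c']
print(sorted(reachable("a", knows)))   # ['a', 'b', 'c', 'd']
```

The algorithm never looks at triples directly, so swapping in a different generator changes what "connected" means without touching the search code.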

Page 14: Allegograph

How far is Actor1 from Actor2?

• Degrees of separation
  – How far is P1 from P2
• Connection strength
  – How many shortest paths from P1 to P2 through a series of predicates and rules
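Both questions can be answered with one breadth-first search. The sketch below is plain Python over a hypothetical neighbour table, not AllegroGraph code: it returns the degrees of separation and the number of distinct shortest paths.

```python
from collections import deque


def shortest_path_stats(start, goal, generator):
    """Return (distance, number_of_shortest_paths), or (None, 0) if unreachable."""
    dist = {start: 0}
    paths = {start: 1}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in generator(node):
            if nxt not in dist:
                # First time we see nxt: this is its shortest distance.
                dist[nxt] = dist[node] + 1
                paths[nxt] = paths[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short route into nxt.
                paths[nxt] += paths[node]
    if goal not in dist:
        return None, 0
    return dist[goal], paths[goal]


# Hypothetical undirected graph: P1 reaches P2 through A or through B.
neighbours = {
    "P1": ["A", "B"],
    "A": ["P1", "P2"],
    "B": ["P1", "P2"],
    "P2": ["A", "B"],
    "X": [],
}


def generator(node):
    return neighbours.get(node, [])


print(shortest_path_stats("P1", "P2", generator))  # (2, 2)
print(shortest_path_stats("P1", "X", generator))   # (None, 0)
```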

Page 15: Allegograph

In what groups is this actor?

• Find the ego-network around a person or thing
  – Friends, friends of friends, etc.

• Find all the fully connected graphs (cliques) around a person or thing
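A small Python sketch of both ideas, assuming an undirected friendship table. `ego_group` expands outward a fixed number of hops; `cliques_around` uses the basic Bron-Kerbosch recursion (no pivoting) to list maximal cliques that contain the actor. All names and data here are illustrative.

```python
def ego_group(actor, neighbours, depth):
    """Nodes within `depth` hops of actor: friends, friends of friends, ..."""
    group = {actor}
    frontier = {actor}
    for _ in range(depth):
        frontier = {n for node in frontier
                    for n in neighbours.get(node, ())} - group
        group |= frontier
    return group


def cliques_around(actor, neighbours):
    """Maximal cliques containing actor (Bron-Kerbosch without pivoting)."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(clique)
            return
        for node in list(candidates):
            adj = set(neighbours.get(node, ()))
            expand(clique | {node}, candidates & adj, excluded & adj)
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand({actor}, set(neighbours.get(actor, ())), set())
    return found


# Hypothetical undirected friendship graph.
friends = {
    "jans": {"ann", "bob"},
    "ann": {"jans", "bob"},
    "bob": {"jans", "ann", "carl"},
    "carl": {"bob", "dee"},
    "dee": {"carl"},
}

print(sorted(ego_group("jans", friends, 1)))  # ['ann', 'bob', 'jans']
print(sorted(ego_group("jans", friends, 2)))  # ['ann', 'bob', 'carl', 'jans']
print([sorted(c) for c in cliques_around("jans", friends)])
# [['ann', 'bob', 'jans']]
```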

Page 16: Allegograph

Questions in SNA: How Important is an actor?

• In-degree, out-degree

• Actor degree centrality
  – I have the most connections in a group so I am more important

• Actor closeness centrality
  – I have the shortest paths to everyone else in the group so I am more important

• Actor betweenness centrality
  – I am more often on the shortest path between other people in the group so I am more important. I can control the flow of information better than other people
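The three actor measures can be sketched in Python for a small unweighted, undirected group. These are the common textbook definitions (degree divided by n-1, reachable nodes divided by total distance, and Brandes' algorithm for betweenness); whether AllegroGraph normalises the values the same way is not stated in the slides, so treat the scaling as an assumption.

```python
from collections import deque


def bfs(graph, source):
    """Distances, shortest-path counts, predecessors and visit order."""
    dist, sigma, preds, order = {source: 0}, {source: 1}, {source: []}, []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order


def degree_centrality(graph):
    n = len(graph)
    return {v: len(graph[v]) / (n - 1) for v in graph}


def closeness_centrality(graph):
    result = {}
    for v in graph:
        dist = bfs(graph, v)[0]
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Brandes' algorithm, unnormalised, for an undirected graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        _, sigma, preds, order = bfs(graph, s)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: c / 2 for v, c in score.items()}  # each pair was counted twice


# Hypothetical star-shaped group: everyone talks through "hub".
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}

print(degree_centrality(star)["hub"])       # 1.0
print(closeness_centrality(star)["a"])      # 0.6
print(betweenness_centrality(star)["hub"])  # 3.0
```

In the star, the hub lies on the only path between each of the three leaf pairs, which is why its betweenness is 3 and every leaf scores 0.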

Page 17: Allegograph

Does the group have a leader? Is the group cohesive?

• Group centralization
  – How centralized is this group?
  – Does this group have a leader?
  – Is there someone controlling the information flow?

• Group cohesiveness
  – How strong and well connected is this group?
  – Are most people connected?
  – What is the density?
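Two simple group-level numbers, sketched in Python under common textbook definitions: density (actual edges divided by possible edges) and Freeman's degree centralization (how far the group is from everyone having the top degree, scaled so a star scores 1). The slides do not say which formulas AllegroGraph uses, so these are assumptions.

```python
def density(graph):
    """Fraction of possible undirected edges that are present."""
    n = len(graph)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in graph.values()) / 2
    return 2 * edges / (n * (n - 1))


def degree_centralization(graph):
    """Freeman degree centralization: 1.0 for a star, 0.0 if all degrees match."""
    n = len(graph)
    if n < 3:
        return 0.0
    degrees = [len(nbrs) for nbrs in graph.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical groups: a star has a clear leader, a triangle has none.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}

print(density(star), degree_centralization(star))          # 0.5 1.0
print(density(triangle), degree_centralization(triangle))  # 1.0 0.0
```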

Page 18: Allegograph

All search and SNA functions use Generators

• Generator
  – Input: one node
  – Output: list of nodes
  – Fully functional, can be complex SPARQL or Prolog queries
  – Or just predicates and indication of direction

Page 19: Allegograph

How to get from A to E??

subj  pred         obj
a     dinner-with  b
a     kissed-with  c
c     movie-with   e
b     kissed-with  d
d     movie-with   e
e     dinner-with  a

(defgenerator knows (node)
  (objects-of :p dinner-with))

(defgenerator knows (node)
  (objects-of :p dinner-with)
  (subjects-of :p dinner-with))

Page 20: Allegograph

How to get from A to E??

(defgenerator knows ()
  (object-of :p dinner-with)
  (subject-of :p dinner-with)
  (object-of :p movie-with)
  (subject-of :p movie-with)
  (object-of :p kissed-with)
  (subject-of :p kissed-with))

(defgenerator knows ()
  (undirected (dinner-with movie-with kissed-with)))
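To see why the choice of generator matters for getting from A to E, here is a Python re-creation of the six example triples with a breadth-first path search. The helper names `objects_of`, `undirected`, and `find_path` mirror the slide's vocabulary but are written from scratch for this sketch; they are not the AllegroGraph functions.

```python
from collections import deque

TRIPLES = [
    ("a", "dinner-with", "b"),
    ("a", "kissed-with", "c"),
    ("c", "movie-with", "e"),
    ("b", "kissed-with", "d"),
    ("d", "movie-with", "e"),
    ("e", "dinner-with", "a"),
]


def objects_of(predicates):
    """Generator that follows the given predicates from subject to object."""
    def generator(node):
        return [o for s, p, o in TRIPLES if s == node and p in predicates]
    return generator


def undirected(predicates):
    """Generator that follows the given predicates in both directions."""
    def generator(node):
        out = [o for s, p, o in TRIPLES if s == node and p in predicates]
        back = [s for s, p, o in TRIPLES if o == node and p in predicates]
        return out + back
    return generator


def find_path(start, goal, generator):
    """Breadth-first search; returns one shortest path, or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in generator(node):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


ALL = {"dinner-with", "movie-with", "kissed-with"}
print(find_path("a", "e", objects_of({"dinner-with"})))  # None
print(find_path("a", "e", objects_of(ALL)))              # ['a', 'c', 'e']
print(find_path("a", "e", undirected({"dinner-with"})))  # ['a', 'e']
```

Following dinner-with only from subject to object dead-ends at b; allowing both directions finds the direct link, because the data records "e dinner-with a".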

Page 21: Allegograph

Declaratively specify

(generator knows (node)
  (select (?x)
    (q ??node movie-with ?x)
    (q ??node dinner-with ?x)
    (not (q ??node kissed-with ?x)))
  (select (?x)
    (q ?x movie-with ??node)
    (q- ?x dinner-with ??node)
    (not (q- ?x kissed-with ??node))))
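Read as plain logic, the declarative generator says: x is connected to the node if they went to a movie and had dinner, but did not kiss, in either direction. A Python sketch of that reading, over hypothetical data (the names and triples are invented for the example):

```python
# Hypothetical facts: (subject, predicate, object)
TRIPLES = {
    ("jans", "movie-with", "ann"),
    ("jans", "dinner-with", "ann"),
    ("jans", "movie-with", "bob"),
    ("jans", "dinner-with", "bob"),
    ("jans", "kissed-with", "bob"),
    ("carl", "movie-with", "jans"),
    ("carl", "dinner-with", "jans"),
}


def matches(s, o):
    """movie-with AND dinner-with AND NOT kissed-with, for one direction."""
    return ((s, "movie-with", o) in TRIPLES
            and (s, "dinner-with", o) in TRIPLES
            and (s, "kissed-with", o) not in TRIPLES)


def knows(node):
    """Union of the forward and backward selects."""
    others = ({o for s, _, o in TRIPLES if s == node}
              | {s for s, _, o in TRIPLES if o == node})
    forward = {x for x in others if matches(node, x)}
    backward = {x for x in others if matches(x, node)}
    return forward | backward


print(sorted(knows("jans")))  # ['ann', 'carl']
print(sorted(knows("bob")))   # []
```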

Page 22: Allegograph

Sample SNA functions

(Ego-group actor generator depth ?group)
- binds ?group to the group of nodes

(Ego-group-members actor generator depth ?a)
- binds ?a to every member of the group

(Cliques actor generator min-depth ?cl)
- binds ?cl to all cliques

(Clique-members actor generator min-depth ?cl ?a)
- binds ?cl to each clique and then iterates over every member ?a in ?cl

(Actor-centrality actor group generator ?num)
- binds ?num to the actor centrality

(Actor-centrality-members group ?actor ?num)
- binds ?actor to every actor in group and ?num to the centrality of that actor, starting with the actor with the highest centrality

(Group-centrality group generator ?num)

Actor = single node
Group = list of nodes
Depth = number
Generator = generator

Page 23: Allegograph

Integrated in Prolog and Common Logic (CLIF)

(defgenerator knows (node)
  (undirected :p (!fr:dinner-with !fr:kissed-with)))

(select (?x)
  (ego-group-members !person:jans knows ?x 2)
  (q ?x !geo:place ?y)
  (geo-box-around !geoname:Berkeley ?y 5 miles))

(select (?x)
  (ego-group !person:jans knows ?group 2)
  (actor-centrality-members ?group knows ?x ?num)
  (q ?x !geo:place ?y)
  (geo-box-around !geoname:Berkeley ?y 5 miles))
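The first query combines a social filter with a geographic one. A rough Python equivalent is sketched below; the people, coordinates, the 69-miles-per-degree approximation, and the choice to include the actor in their own ego group are all assumptions for illustration and say nothing about how AllegroGraph's geospatial index works.

```python
import math

# Hypothetical undirected "knows" graph and (latitude, longitude) places.
FRIENDS = {
    "jans": {"ann", "bob"},
    "ann": {"jans", "carl"},
    "bob": {"jans"},
    "carl": {"ann"},
}
PLACE = {
    "jans": (37.87, -122.27),
    "ann": (37.85, -122.27),
    "bob": (37.34, -121.89),
    "carl": (37.88, -122.30),
}
BERKELEY = (37.87, -122.27)
MILES_PER_DEGREE_LAT = 69.0  # rough approximation


def ego_group_members(actor, neighbours, depth):
    """Everyone within `depth` hops of actor, including the actor."""
    group, frontier = {actor}, {actor}
    for _ in range(depth):
        frontier = {n for node in frontier
                    for n in neighbours.get(node, ())} - group
        group |= frontier
    return group


def in_box_around(center, point, miles):
    """True if point lies in a box extending `miles` from center each way."""
    dlat = miles / MILES_PER_DEGREE_LAT
    dlon = miles / (MILES_PER_DEGREE_LAT * math.cos(math.radians(center[0])))
    return (abs(point[0] - center[0]) <= dlat
            and abs(point[1] - center[1]) <= dlon)


nearby = {
    x for x in ego_group_members("jans", FRIENDS, 2)
    if x in PLACE and in_box_around(BERKELEY, PLACE[x], 5)
}
print(sorted(nearby))  # ['ann', 'carl', 'jans']
```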

Page 24: Allegograph

Where do we use this?

• Amdocs: Know everything about every customer
  – Partitioned on customer
  – Most graph search centered in client

• Pfizer: help me find connections between drugs, diseases, genes, side effects in a sea of clinical trials
  – Just a mess of data
  – All graph search in server

Page 25: Allegograph

Traditional Business Intelligence

Can tell you ALL about the average customer

but NOTHING about the individual.

Page 26: Allegograph

Can you in < 1 second with one push of a button

• Predict the three most likely reasons why Joe Smith from Kansas is calling the call center? Bill unexpectedly high, losing connection too often, doesn’t know how to use new subscription service?

• Show the last ten events that happened for Joe Smith? Phone calls, SMS, downloads of movies, device stopped working, payment of bill, looking at map, search for local store.

• What is the likelihood that he will change from T-Mobile to Sprint or AT&T?

• Who are his ten most important friends and what devices do they have? Who is the first to change and who follows?

Page 27: Allegograph

Can you in < 1 second with one push of a button

• What are the usual daily locations for this person? What kind of shops?

• What kind of services does he download, what kind of movies/music/games does he like, what products does he buy?

• Is his plan the right plan for him?
• Is he in a good mood?
• Is he a valuable customer, is he a good payer, what is your margin on him, how many times per month does he call a call center, does he look up help for mail on the internet? Can you predict if he is going to pay the bill?

Page 28: Allegograph
Page 29: Allegograph

Architecture

[Diagram labels: Events Decision Engine; Container; Actions; SBA Application Server; “Sesame”; AllegroGraph Triple Store DB; Event Ingestion; Scheduled Events; Inference Engine (Business Rules); Bayesian Belief Network; Events; Operational Systems; Event Data Sources; Amdocs Event Collector; CRM; Amdocs Integration Framework; OMS; NW Web 2.0]

Page 30: Allegograph
Page 31: Allegograph

Work for Pharma

Page 32: Allegograph

sider

Page 33: Allegograph
Page 34: Allegograph
Page 35: Allegograph

Gruff Demo

Page 36: Allegograph

What about Scalability

Page 37: Allegograph
Page 38: Allegograph
Page 39: Allegograph
Page 40: Allegograph

Architecture overview

Storage layer (compression, indexing, freetext, transactions)

Session Management, Query Engine, Federation

REST

Backup/Restore, Replication, Warm Failover, Security, Management

SPARQL, Prolog, Rules, Clif++, Geo, SNA, Time, RDFS+, JavaScript

Java: Sesame, Jena; Python; Ruby; C#; Clojure; Scala; Perl

Page 41: Allegograph

• Thanks…