KOJAPH: Visual definition and exploration of patterns in graph databases

Walter Didimo, Francesco Giacchè, Fabrizio Montecchiani

University of Perugia, ITALY



Objective

Providing software tools to easily define and explore patterns in networked data sets (networks)

Requirements:

• simplicity: defining desired patterns through a visual language

• scalability: handling big (networked) data through graph databases

• flexibility: relying on different graph database management systems (GDBMSes)

Motivation

• Graph databases are of growing interest in many application domains that handle big data (social sciences, homeland security, finance, biology, computer networks, etc.)

• Many users need to analyze data without learning the native query language of a GDBMS

• Several efficient GDBMSes are available

Relational and Graph Databases

• Relational databases (SQL): objects and their relations are represented by tables and referential integrity constraints among them

• Graph databases (NoSQL): objects and their relations are directly represented as a graph

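[Figure: the same movie data in the two models. Relational model: tables Movie (movie_ID, title, year), Actor (actor_ID, name, gender), and MovieActor (movie_ID, actor_ID), which encodes the acts-in relation. Graph model: Movie nodes (movie_ID, title, year) and Actor nodes (actor_ID, name, gender) connected directly by acts-in edges.]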
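The difference between the two models can be illustrated with a minimal Python sketch. The attribute names (movie_ID, title, year, actor_ID, name, gender, acts-in) come from the slide; the sample rows, the node keys, and both helper functions are hypothetical and only meant to show the join versus the direct edge lookup.

```python
# Relational view: three tables, with MovieActor acting as a join table.
# All rows below are made-up sample data.
movies = [{"movie_ID": 1, "title": "Example Movie", "year": 2000}]
actors = [{"actor_ID": 10, "name": "Example Actor", "gender": "F"}]
movie_actor = [{"movie_ID": 1, "actor_ID": 10}]


def actors_of_movie_relational(movie_id):
    """Resolve acts-in by joining MovieActor with Actor on actor_ID."""
    ids = {row["actor_ID"] for row in movie_actor if row["movie_ID"] == movie_id}
    return [a["name"] for a in actors if a["actor_ID"] in ids]


# Graph view: nodes carry the attributes, and acts-in is a direct edge.
nodes = {
    "m1": {"label": "Movie", "movie_ID": 1, "title": "Example Movie", "year": 2000},
    "a10": {"label": "Actor", "actor_ID": 10, "name": "Example Actor", "gender": "F"},
}
edges = [("a10", "acts-in", "m1")]


def actors_of_movie_graph(movie_node):
    """Resolve acts-in by following the edges that point at the movie node."""
    return [nodes[src]["name"] for src, rel, dst in edges
            if rel == "acts-in" and dst == movie_node]
```

In the relational version the relation is recovered through the join table, whereas in the graph version it is stored as an edge and read directly.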

Contribution

Kojaph: prototype system • visual language integrated in a simple user interface to define patterns

• tools to explore the results (… and neighborhoods)

• flexible architecture to use the system on top of different GDBMSes


But why this name?

Kojak → Kojaph

Related Work

• Graphite [Chau et al., IEEE ICDM 2008]

− it has its own pattern matching algorithms and cannot be used with other existing GDBMSes

− only simple patterns can be defined

− limited interaction to explore the results

• QGraph [Blau et al., Tech. Rep., Univ. of Massachusetts, 2002]

− powerful visual language

− partially implemented in a system (Proximity), an "old" project not conceived to rely on current popular GDBMSes (e.g., Neo4J)

Graph pattern matching in Kojaph

[Figure: a pattern, consisting of a graph GP (graph topology) and a set RP of rules on the nodes and edges of GP, is submitted as a query against the graph G. An edge of GP can also correspond to a path.]
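As a rough illustration of what such a query asks for, the following Python sketch matches a tiny pattern against a tiny graph by brute force. It is not the algorithm used by Kojaph, which delegates matching to the underlying GDBMS; the function names, the sample graph, and the choice of mapping distinct pattern nodes to distinct graph nodes are all assumptions made for the example.

```python
from itertools import permutations


def has_path(adj, start, goal, lo, hi):
    """True if some simple path from start to goal has between lo and hi edges."""
    stack = [(start, (start,))]
    while stack:
        node, path = stack.pop()
        length = len(path) - 1
        if node == goal and lo <= length <= hi:
            return True
        if length < hi:
            for nxt in adj[node]:
                if nxt not in path:
                    stack.append((nxt, path + (nxt,)))
    return False


def match(adj, attrs, pattern_nodes, pattern_edges, rules):
    """Yield assignments {pattern node: graph node} satisfying topology and rules.

    pattern_edges: (u, v, lo, hi) tuples; lo = hi = 1 is an ordinary edge,
    other bounds make the pattern edge stand for a path.
    rules: {pattern node: predicate over the attribute dict of a graph node}.
    """
    for combo in permutations(adj, len(pattern_nodes)):
        m = dict(zip(pattern_nodes, combo))
        if not all(rules.get(p, lambda a: True)(attrs[g]) for p, g in m.items()):
            continue
        if all(has_path(adj, m[u], m[v], lo, hi) for u, v, lo, hi in pattern_edges):
            yield m


# Hypothetical undirected graph G: actors connected to the movies they act in.
adj = {
    "alice": ["m1"], "bob": ["m1", "m2"], "carol": ["m2"],
    "m1": ["alice", "bob"], "m2": ["bob", "carol"],
}
attrs = {
    "alice": {"type": "Actor"}, "bob": {"type": "Actor"}, "carol": {"type": "Actor"},
    "m1": {"type": "Movie"}, "m2": {"type": "Movie"},
}
# Pattern: actor x is "alice"; y is any actor joined to x by a path of length 2.
results = list(match(
    adj, attrs,
    pattern_nodes=["x", "y"],
    pattern_edges=[("x", "y", 2, 2)],
    rules={"x": lambda a: a is attrs["alice"], "y": lambda a: a["type"] == "Actor"},
))
```

Here the single pattern edge corresponds to a path of exactly two edges, so the only match pairs "alice" with "bob", who share the movie "m1".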

Rules in a pattern

• Properties on node/edge attributes

− constant values (node/edge types), user input values and collections

− comparison operators (=, >, <, …, ~=, in, collects)

− math operators (+, -, *, /)

− other symbols (parentheses, ..)

• Path constraints (lower and upper bounds on path lengths)

− properties can also be defined on the generic nodes and edges of a path

• Properties must be combined in a boolean formula, consisting of AND, OR, and NOT operators

• The formula is visually represented as a (binary) tree
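A boolean formula of this kind can be modeled as a small tree and flattened into a textual condition. The sketch below is only illustrative: the class names Prop, Not, and BinOp and the render function are hypothetical, and the property strings are written in Cypher-like syntax for concreteness.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Prop:
    """Leaf: a single property on a node/edge attribute, kept as text."""
    text: str


@dataclass
class Not:
    """Unary node negating its child formula."""
    child: "Formula"


@dataclass
class BinOp:
    """Binary node combining two sub-formulas with AND or OR."""
    op: str
    left: "Formula"
    right: "Formula"


Formula = Union[Prop, Not, BinOp]


def render(f: Formula) -> str:
    """Flatten the tree into a fully parenthesized boolean expression."""
    if isinstance(f, Prop):
        return f.text
    if isinstance(f, Not):
        return f"NOT ({render(f.child)})"
    if f.op not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {f.op}")
    return f"({render(f.left)} {f.op} {render(f.right)})"


tree = BinOp(
    "AND",
    Prop('n1.genre = "Comedy"'),
    BinOp("OR", Prop('n3.genre = "Action"'), Prop('n3.genre = "Horror"')),
)
condition = render(tree)
```

Parenthesizing every binary node keeps the rendered string faithful to the tree regardless of operator precedence in the target query language.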

Interface of Kojaph: Overview

graph editor and attribute selector

properties tree (rules/constraints)

property editor

Graph editor and attribute selector

Property editor

Tree of properties

[Figure: example pattern with annotations "Monica Bellucci", "Tom Cruise", "Comedy", "Horror / Action", and "path of length 2".]

Presentation of the results

The layouts are computed with the force-directed algorithm in D3.js

Local exploration

System architecture and implementation

We implemented the interface for the Neo4J GDBMS, using Cypher as the query language

From the Kojaph visual language to Cypher

• A graph pattern expressed in Kojaph is translated into a corresponding Cypher query in O(n) time

− n = size of the pattern (size of the graph + size of the properties tree + number of path constraints)
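A single pass over the pattern elements is enough to see why a linear-time translation is plausible. The sketch below is an assumption about how such a translation could look, not the actual Kojaph code: the tuple layout for pattern edges and the functions to_match_clause and translate are hypothetical, and the boolean condition is taken as an already rendered string.

```python
def to_match_clause(edges):
    """Build a Cypher MATCH clause from pattern edges in one pass.

    edges: list of (source, edge_name, target, bounds), where bounds is None
    for a plain edge or a (lower, upper) pair for an edge standing for a path.
    """
    parts = []
    for src, name, dst, bounds in edges:
        if bounds is None:
            rel = f"[{name}]"
        else:
            lower, upper = bounds
            rel = f"[{name}*{lower}..{upper}]"  # variable-length relationship
        parts.append(f"({src})-{rel}-({dst})")
    return "MATCH " + ", ".join(parts)


def translate(edges, where, returned):
    """Assemble MATCH, WHERE, and RETURN; each pattern element is visited once."""
    return "\n".join([
        to_match_clause(edges),
        "WHERE " + where,
        "RETURN " + ", ".join(returned),
    ])


# Topology of the example pattern: one path of length 2 and three plain edges.
pattern_edges = [
    ("n2", "e1", "n4", (2, 2)),
    ("n4", "e2", "n3", None),
    ("n4", "e3", "n1", None),
    ("n3", "e4", "n5", None),
]
match_clause = to_match_clause(pattern_edges)
```

Each edge contributes one fragment and the fragments are joined once at the end, so the work grows with the number of pattern elements.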

Example of translation

MATCH (n2)-[e1*2..2]-(n4), (n4)-[e2]-(n3), (n4)-[e3]-(n1), (n3)-[e4]-(n5)

WHERE ( (n2.name = "Monica Bellucci" AND labels(n2) = ["Person", "Actor"]) AND (n5.name = "Tom Cruise" AND labels(n5) = ["Person", "Actor"]) AND labels(n4) = ["Person", "Actor"] ) AND ( (n1.genre = "Comedy" AND labels(n1) = ["Movie"]) AND (labels(n3) = ["Movie"] AND (n3.genre = "Action" OR n3.genre = "Horror")) )

RETURN n1,n2,n3,n4,n5,e1,e2,e3,e4, …

Short video (example 1)

Future work

Enhancing Kojaph

• Increasing the expressiveness of the visual language

− e.g., adding numerical annotations to nodes

• Adding more layout functionalities to explore the results

− different drawing conventions for different types of patterns


• Testing Kojaph on GDBMSes other than Neo4J

• Evaluating the usability of Kojaph vs other similar systems (Graphite, Proximity, etc.)

Thanks for your attention!

http://mozart.diei.unipg.it:8080/Kojaph/