Dr. Ying Zhou, School of Information Technologies

COMP5338 – Advanced Data Models

Week 7: Neo4j Storage, Query Execution, Data Modelling and Programming API

COMP5338 Advanced Data Models - 2015 (A. Fekete & Y. Zhou)

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming APIs

Materials adapted by permission from Graph Databases (2nd Edition) by Ian Robinson et al. (O'Reilly Media, Inc.). Copyright 2015 Neo Technology, Inc.

A Review Question

https://goo.gl/RkoRu6

Index-free Adjacency

• Native storage of relationships between nodes
  • Effectively a pre-computed bidirectional join
• Traversal is like pointer dereferencing
  • Almost as fast as well
• Index-free adjacency
  • Each node maintains a direct link to its adjacent nodes
  • Each node is effectively a micro-index to the adjacent nodes
  • Cheaper than global indexes
  • Queries are faster, and their cost does not depend on the total size of the graph

Slides 3-10 are based on Sections 6.1 and 6.2 of Graph Databases.

Neo4j Architecture

• Graph data is stored in store files on disk
  • Nodes, relationships, properties and labels all have their own store files
  • Separating graph and property data promotes fast traversal
• The user's view of their graph and the actual records on disk are structurally dissimilar

[Figure: Neo4j architecture, from page 163 of Graph Databases]

Node store file

• All node data is stored in one node store file
  • Physically stored in a file named neostore.nodestore.db
  • Each record is of a fixed size: 15 bytes (was 9 bytes in earlier versions)
  • Offset of stored node = node id × 15 (e.g. node id = 100, offset = 1500)
  • Deleted IDs are kept in an .id file and can be reused
• Record fields labelled in the figure
  • 4 bytes for the first relationship ID
  • 4 bytes for the first property ID
  • 5 bytes for labels

Relationship store file

• All relationship data is stored in one relationship store file
  • Physically stored in a file named neostore.relationshipstore.db
  • Each record is of a fixed size: 34 bytes
  • Offset of stored relationship = relationship id × 34
  • So for relationship id = 10, offset = 340
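The offset arithmetic for both store files can be sketched in a few lines of Java. This is only an illustration of the fixed-size-record idea using the record sizes quoted on the slides (15 and 34 bytes); the class and method names are made up for this example and are not part of Neo4j.

```java
public class StoreOffsets {
    // Record sizes in bytes, as quoted on the slides.
    static final int NODE_RECORD_SIZE = 15;
    static final int RELATIONSHIP_RECORD_SIZE = 34;

    // With fixed-size records, a record's byte offset is simply id * record size.
    static long nodeOffset(long nodeId) {
        return nodeId * NODE_RECORD_SIZE;
    }

    static long relationshipOffset(long relationshipId) {
        return relationshipId * RELATIONSHIP_RECORD_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(nodeOffset(100));        // 1500, the node example from the slide
        System.out.println(relationshipOffset(10)); // 340, the relationship example from the slide
    }
}
```

Because the offset is computed rather than looked up, locating a record by ID takes constant time regardless of how large the store file is.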

Other Files

• The property store contains fixed-size records to store properties for nodes and relationships
  • Simple properties are stored inline
  • Complex ones, such as long strings or array properties, are stored elsewhere
• The node label in a node record references data in the label store
• The relationship type in a relationship record references data in the relationship type store
• Both node ID and property ID are 4 bytes
  • There is a maximum ID value: 2^32 - 1
  • IDs are assigned and managed by the system
  • The corresponding record is stored at the computed offset
  • The IDs of deleted nodes/relationships will be reused

Question Time

https://goo.gl/SqW4Iu

Node and Relationship structure

[Figure: node and relationship records for the example Bob LIKES Alice]

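Doubly linked list

[Figure: node file and relationship file for three person nodes]

• Node file: Name:Alice (id 0), Name:Bob (id 1), Name:Charlie (id 2), all labelled person and in use; their first relationship IDs are 10, 10 and 20
• Relationship file
  • Relationship 10 at offset 340: Type:FOLLOWS, first node 0, second node 1, next relationship 20, some property
  • Relationship 20 at offset 680: Type:FOLLOWS, first node 0, second node 2, previous relationship 10, some property
• Creation order: node Alice, node Bob, node Charlie, relationship Alice --> Bob, relationship Alice --> Charlie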
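The way a traversal follows these chains can be sketched in Java. This is a simplified model under stated assumptions, not Neo4j's actual record layout: the class, field and method names are invented for illustration, in-memory maps stand in for the store files, and the pointer values (Alice's chain running 10 then 20) are read off the figure residue above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RelationshipChainSketch {
    static final long NONE = -1L; // marks the end of a chain

    /** Simplified relationship record: only the fields needed to walk the chains. */
    static final class RelRecord {
        final long firstNode;
        final long secondNode;
        long firstPrev = NONE, firstNext = NONE;   // neighbours in the first node's chain
        long secondPrev = NONE, secondNext = NONE; // neighbours in the second node's chain

        RelRecord(long firstNode, long secondNode) {
            this.firstNode = firstNode;
            this.secondNode = secondNode;
        }
    }

    final Map<Long, Long> firstRelOfNode = new HashMap<>(); // node id -> first relationship id
    final Map<Long, RelRecord> relStore = new HashMap<>();  // relationship id -> record

    /** Walks the chain of one node and returns the relationship ids in chain order. */
    List<Long> relationshipsOf(long nodeId) {
        List<Long> ids = new ArrayList<>();
        long relId = firstRelOfNode.getOrDefault(nodeId, NONE);
        while (relId != NONE) {
            RelRecord rel = relStore.get(relId);
            ids.add(relId);
            // Follow the "next" pointer that belongs to this node's side of the record.
            relId = (rel.firstNode == nodeId) ? rel.firstNext : rel.secondNext;
        }
        return ids;
    }

    /** Alice (0), Bob (1), Charlie (2); relationship 10: Alice->Bob, relationship 20: Alice->Charlie. */
    static RelationshipChainSketch example() {
        RelationshipChainSketch g = new RelationshipChainSketch();
        RelRecord aliceBob = new RelRecord(0, 1);
        RelRecord aliceCharlie = new RelRecord(0, 2);
        aliceBob.firstNext = 20;     // Alice's chain: 10 -> 20 (assumed from the figure)
        aliceCharlie.firstPrev = 10; // and back: 20 -> 10
        g.relStore.put(10L, aliceBob);
        g.relStore.put(20L, aliceCharlie);
        g.firstRelOfNode.put(0L, 10L);
        g.firstRelOfNode.put(1L, 10L);
        g.firstRelOfNode.put(2L, 20L);
        return g;
    }

    public static void main(String[] args) {
        RelationshipChainSketch g = example();
        System.out.println(g.relationshipsOf(0)); // [10, 20]
        System.out.println(g.relationshipsOf(1)); // [10]
        System.out.println(g.relationshipsOf(2)); // [20]
    }
}
```

Each step is a lookup by ID followed by a pointer hop, which is why the cost of listing a node's relationships depends on that node's degree and not on the size of the graph.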

Question Time

https://goo.gl/ZxIIBN

Question Time

https://goo.gl/kPYIIE

"The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record's location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j's commitment to high-performance traversals."

-- Chapter 6, Graph Databases


Neo4j Query Execution

• Each Neo4j query is turned into an execution plan by an execution planner
  • Rule strategy planner
    • Considers available indexes but does not use statistical information
  • Cost strategy planner (default and in development)
    • Uses statistical information to evaluate a few alternative plans
    • E.g. if there are fewer Movie nodes than People nodes, a query involving both may perform better if it starts from the collection of Movie nodes
      • See example in lab
• Query plan stages
  • Starting point
  • Expansion by matching the given path in the query statement
  • Row filtering, skipping, sorting, projection, etc.
  • Updating
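The Movie versus People example can be illustrated with a toy Java sketch of the one decision it describes: picking the starting label with the fewest nodes. This is not Neo4j's planner, which weighs many more factors; the class name, method name and node counts are all hypothetical.

```java
import java.util.Map;

public class StartingPointSketch {
    /**
     * Returns the label with the smallest node count, i.e. the cheapest
     * place to start a label scan. Returns null for an empty map; if two
     * labels tie, which one is returned depends on map iteration order.
     */
    static String cheapestStart(Map<String, Long> nodeCountByLabel) {
        String best = null;
        long bestCount = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : nodeCountByLabel.entrySet()) {
            if (entry.getValue() < bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up statistics: fewer Movie nodes than Person nodes.
        Map<String, Long> counts = Map.of("Movie", 40L, "Person", 130L);
        System.out.println(cheapestStart(counts)); // Movie
    }
}
```

A rule-based planner has no such counts available, so it cannot make this choice and falls back on fixed rules such as "use an index if one exists".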

Query Plan: an example

Query

    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors)
    RETURN directors.name

[Figures: plans shown for this query by EXPLAIN and by PROFILE]

Query Starting Points

• Most queries start with one or a set of nodes, except if a relationship ID is specified

    MATCH (n1)-[r]->() WHERE id(r) = 0 RETURN r, n1

  • This query starts by locating the first record in the relationship file
• A query may start by scanning all nodes

    MATCH (n) RETURN n
    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name

• A query may start by scanning all nodes belonging to a given label

    MATCH (p:Person {name: "Tom Hanks"}) RETURN p

  • Labels are implicitly indexed
• A query may start by using an index

Query Plan With Index

• Neo4j supports indexes on properties of labelled nodes
• Indexes behave similarly to those in relational systems
• Create index

    CREATE INDEX ON :Person(name)

• Drop index

    DROP INDEX ON :Person(name)

Query

    MATCH (bacon:Person {name: "Kevin Bacon"})-[*1..4]-(hollywood)
    RETURN DISTINCT hollywood

Transactions

• Neo4j supports full ACID transactions
  • Similar to those in an RDBMS
• Uses locking to ensure consistency
  • The Lock Manager manages locks held by a transaction
• Logging
  • Write-Ahead Logging (WAL)
• Transaction commit protocol
  • Acquire locks (Atomicity, Consistency, Isolation)
  • Write undo and redo records to the WAL
    • A record is written to the log for each node, relationship or property changed
  • Write the commit record to the log and flush to disk (Durability)
  • Release locks
• Recovery, if the database server/machine crashes
  • Apply log records to replay changes made by the transactions
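The commit protocol and recovery steps can be sketched as a toy write-ahead log in Java. This is a deliberately reduced model and not how Neo4j implements logging: a single coarse lock stands in for the Lock Manager, an in-memory list stands in for the log file (so nothing is really flushed), only redo records are kept, and every name in it is invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

public class WalSketch {
    /** One log entry: a redo record (key and new value) or a commit marker (key == null). */
    static final class LogRecord {
        final int txId;
        final String key;
        final String value;

        LogRecord(int txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }

        boolean isCommit() {
            return key == null;
        }
    }

    final List<LogRecord> log = new ArrayList<>();          // stands in for the on-disk log
    final Map<String, String> store = new HashMap<>();      // stands in for the store files
    private final ReentrantLock lock = new ReentrantLock(); // one coarse lock, not a lock manager

    void commit(int txId, Map<String, String> changes) {
        lock.lock();                                        // 1. acquire locks
        try {
            for (Map.Entry<String, String> change : changes.entrySet()) {
                // 2. log every change before touching the store
                log.add(new LogRecord(txId, change.getKey(), change.getValue()));
            }
            log.add(new LogRecord(txId, null, null));       // 3. commit record (a real system flushes here)
            store.putAll(changes);                          // the store is updated only after logging
        } finally {
            lock.unlock();                                  // 4. release locks
        }
    }

    /** Recovery: replay, in log order, the changes of transactions that have a commit record. */
    static Map<String, String> recover(List<LogRecord> log) {
        Set<Integer> committed = new HashSet<>();
        for (LogRecord record : log) {
            if (record.isCommit()) {
                committed.add(record.txId);
            }
        }
        Map<String, String> rebuilt = new HashMap<>();
        for (LogRecord record : log) {
            if (!record.isCommit() && committed.contains(record.txId)) {
                rebuilt.put(record.key, record.value);
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        WalSketch db = new WalSketch();
        db.commit(1, Map.of("node:0:name", "Alice"));
        // Simulate a crash in the middle of transaction 2: a change is logged but never committed.
        db.log.add(new LogRecord(2, "node:1:name", "Bob"));
        System.out.println(recover(db.log)); // {node:0:name=Alice}
    }
}
```

The point of the ordering is that a change reaches the log before it reaches the store, so after a crash the log alone is enough to rebuild every committed change and to ignore uncommitted ones.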

Question Time

https://goo.gl/m8cOcz


Graph Data Modelling

• Graph data modelling is very closely related to domain modelling
• You need to decide
  • Node or relationship
  • Node or property
  • Label/type or property
• Decisions are based on
  • Features of entities in the application domain
  • Your typical queries
  • Features and constraints of the underlying storage system

Slides 18-22 are based on Chapter 4 of the Graph Databases book.

Node vs Relationship

• Nodes for things, relationships for structure
• AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read

    MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})
          <-[:LIKES]-(:Reader)-[:LIKES]->(books:Book)
    RETURN books.title

Node vs Relationship

• Model facts as nodes

Node or Property

• Represent complex value types as nodes

Relationship Property or Relationship Type

• E.g. the relationship between a user node and an address node can be
  • typed as HOME_ADDRESS, BILLING_ADDRESS, or
  • typed as a generic ADDRESS and differentiated using a type property: {type: 'home'}, {type: 'billing'}
• We use fine-grained relationships whenever we have a closed set of relationship types
  • E.g. there is only a finite set of address types
  • If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships

    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)


REST API

• Programming Neo4j can be done in any language using the REST API
• The default representation for requests and responses is JSON
• Service root: http://hostname:port/db/data
• A node is represented by its id
  • http://hostname:port/db/data/node/0
• A relationship can be represented by its id or traversed from a node
  • http://hostname:port/db/data/node/0/relationships/all
  • http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP endpoint

• This is the default way to interact with Neo4j using REST
  • Allows you to send Cypher queries in JSON as the body of a POST request within the scope of transactions
  • Both read and write queries
• Transactions can be open across a number of requests or just for one request
• A single-request transaction is sent to
  • http://hostname:port/db/data/transaction/commit
• Multiple-request transaction endpoints
  • Start: http://hostname:port/db/data/transaction
  • Sending queries in this transaction: http://hostname:port/db/data/transaction/{transaction_id}
  • Commit: http://hostname:port/db/data/transaction/{transaction_id}/commit

Single Request Transaction Example

REQUEST

    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements" : [ {
        "statement" : "CREATE (n) RETURN id(n)"
      } ]
    }

RESPONSE

    200 OK
    Content-Type: application/json

    {
      "results" : [ {
        "columns" : [ "id(n)" ],
        "data" : [ {
          "row" : [ 18 ]
        } ]
      } ],
      "errors" : [ ]
    }
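The same single-request transaction can be prepared from Java with the standard java.net.http classes (Java 11 or later). This is a minimal sketch: the class and method names are made up, the JSON is assembled by hand and escapes only backslashes and double quotes, and the request is built but not sent, since sending it would need a running server and a call such as HttpClient.send.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class TransactionCommitRequest {
    /** Wraps one Cypher statement in the JSON body expected by the endpoint. */
    static String payload(String cypher) {
        // Escape backslashes and double quotes so the statement stays a valid JSON string.
        // Control characters such as newlines are not handled in this sketch.
        String escaped = cypher.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statements\":[{\"statement\":\"" + escaped + "\"}]}";
    }

    /** Builds (but does not send) the POST request for a single-request transaction. */
    static HttpRequest build(String host, int port, String cypher) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://" + host + ":" + port + "/db/data/transaction/commit"))
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(cypher)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("localhost", 7474, "CREATE (n) RETURN id(n)");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(payload("CREATE (n) RETURN id(n)"));
    }
}
```

In practice a JSON library would be a safer way to build the body, and statement parameters are preferable to concatenating values into the Cypher text.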

Java APIs

[Figure: from page 158 of the Graph Databases book]

Core API

• For embedded Neo4j
• Instantiate an embedded graph database

    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

• Shut down the service before exiting the program

    graphDB.shutdown();

Node

• Node abstraction
  • All nodes are of the type Node, which is an interface extending another interface, PropertyContainer
• Create a new node
  • Nodes are created by invoking the GraphDatabaseService.createNode() method

    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    // Signature of the overload that also sets labels:
    // Node createNode(Label... labels)

• Set node properties

    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

• Relationship abstraction
  • All relationships are of type Relationship, which is an interface extending another interface, PropertyContainer
• Relationship types can be set using Java enums

    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }

• Create a relationship

    Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

• Set relationship properties

    knows.setProperty("since", 1965);

Find Node/Relationships

• An individual node or relationship can be retrieved by its ID

    Node getNodeById(long id)
    Relationship getRelationshipById(long id)

  • Not recommended, because the IDs may be reused by other nodes or relationships
• Nodes can be retrieved by label and property values

    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

• Relationships of a node can be retrieved by type and/or direction

    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API – Deleting

• Delete the relationships of a node before deleting the node
• Delete a relationship

    Relationship rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

• Delete a node

    firstNode.delete();

Java API - Transactions

• All operations are performed in the context of a transaction

    try ( Transaction tx = graphDB.beginTx() ) {
        // operations on the graph
        tx.success();
    }

• If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException

References

• Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media, Inc., June 5
  • You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  • Chapter 4, Chapter 6
• Neo4j – Reference Manual: http://neo4j.com/docs/stable/

Page 2: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-2

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

Materials adapted by permission from Graph Databases (2nd Edition) by Ian Robinson et al (OrsquoReilly Media Inc) Copyright 2015 Neo Technology Inc

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

A Review Question

07-3

httpsgooglRkoRu6

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Index-free Adjacency

Native storage of relationships between nodes Effectively a pre-computed bidirectional join

Traversal is like pointer dereferencing Almost as fast as well

Index-free Adjacency Each node maintains a direct link to its adjacent nodes Each node is effectively a micro-index to the adjacent nodes

Cheaper than global indexes Query are faster do not depends on the total size of the graph

07-4

Slides 3-10 are based on Graph Database chapter 61 and 62

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Architecture

07-5

Graph data is stored in store files on disk Nodes relationships properties and labels all have their own store

files Separating graph and property data promotes fast traversal

userrsquos view of their graph and the actual records on disk are structurally dissimilar

Page 163 of Graph Database

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node store file

07-6

All node data is stored in one node store file Physically stored in file named noestorenodestoredb Each record is of a fixed size ndash 15 bytes (was 9 bytes in earlier

version ) Offset of stored node = node id 15 (node id = 100 offset = 1500) Deleted IDs in id file and can be reused

4 bytes for the first relationship ID

4 bytes for the first property ID

5 bytes for labels

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship store file

07-7

All relationship data is stored in one relationship store file Physically stored in file named neostorerelationshipstoredb Each record is of a fixed size ndash 34 bytes Offset of stored relationship = relationship id 34

So relationship id = 10 offset = 340

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Other Files

07-8

Property store contains fixed size records to store properties for nodes and relationships Simple properties are stored inline Complex ones such as long string or array property are stored else

where

Node label in node records references data in label store Relationship type in relationship record references data in

relationship type store Both Node ID and Property ID are 4 bytes

There is a maximum ID value 2~32 -1 ID is assigned and managed by the system

The corresponding record will be stored in the computed offset The IDs of deleted nodesrelationships will be reused

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-9

httpsgooglSqW4Iu

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node and Relationship structure

Bob LIKES Alice

07-10

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Doubly linked list

07-11

0

1 2

10 20

1 1 1

Node file NameAlice NameBob NameCharlie

10 10 20

Relationship file

1

Offset 340

0 1 20

TypeFOLLOWS Some property

1

Offset 680

0 2 10

Some propertyTypeFOLLOWS

person personperson

Creation order node Alice node Bob node Charlie relationship Alice --gt Bob relationship Alice --gt Charlie

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-12

httpsgooglZxIIBN

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-13

httpsgooglkPYIIE

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-14

ldquoThe node and relationship stores are concerned only with the structure of the graph not its property data Both stores use fixed-sized records so that any individual recordrsquos location within a store file can be rapidly computed given its ID These are critical design decisions that underline Neo4jrsquos commitment to high-performance traversalsrdquo

-- Chapter 6 Graph Databases

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

07-15

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

07-16

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan an example

07-17

Query

MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors)RETURN directorsname

explain

profile

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Starting Points

Most queries start with one or a set of nodes except if a relationship ID is specified MATCH (n1)-[r]-gt() WHERE id(r)= 0 RETURN r n1 This query will start from locating the first record in the relationship

file

Query may start by scanning all nodes MATCH(n) RETURN (n) MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors) RETURN

directorsname

Query may start by scanning all nodes belonging to a given label MATCH (pPersonnamerdquoTom Hanksrdquo) return p Labels are implicitly indexed

Query may start by using index

07-18

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactions Neo4j supports full ACID transactions

Similar to those in RDBMS Uses locking to ensure consistency

Lock Manager manages locks held by a transaction Logging

Write Ahead Logging (WAL) Transaction Commit Protocol

Acquire locks (Atomicity Consistency Isolation) Write Undo and Redo records to the WAL

for each node relationship property changed is written to the log Write commit record to the log and flush to disk (Durability) Release locks

Recovery ndash if the database servermachine crashes Apply log records to replay changes made by the transactions

07-20

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-21

httpsgooglm8cOcz

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-22

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Graph Data Modelling

Graph data modelling is very closely related with domain modelling

You need to decide Node or Relationship Node or Property LabelType or Property

Decisions are based on Features of entities in application domain Your typical queries Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Nodes for Things Relationship for Structures AS A reader who likes a book I WANT to know which books other

readers who like the same book have liked SO THAT I can find other books to read

07-24

MATCH (Reader nameAlice)-[LIKES]-gt(Book titleDune)lt-[LIKES]-(Reader)-[LIKES]-gt(booksBook)RETURN bookstitle

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Model Facts as Nodes

07-25

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node or Property

Represent Complex Value Types as Nodes

07-26

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship Property or Relationship Type

Eg The relationship between user node and address node can be

typed as HOME_ADDRESS BILLING_ADDRESS or typed as generic ADDRESS and differentiated using a type property

typersquohomersquo typersquobillingrsquo

We use fine-grained relationships whenever we have a closed set of relationship types Eg there are only a finite set of address types If traversal would like to follow generic type ADDRESS we may

have to use redundant relationships MATCH (user)-[HOME_ADDRESS|WORK_ADDRESS|

DELIVERY_ADDRESS]-gt(address) MATCH (user)-[ADDRESS]-gt(address) MATCH (userUser)-[r]-gt(addressAddress)

07-27

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-28

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

REST API

Programming Neo4j can be done using any language using REST API

Default representation for request and response is JSON format

Service root httphostnameportdbdata Node is represented by its id

httphostnameportdbdatanode0 Relationship can be represented by its id or traversed from

a node httphostnameportdbdatanode0relationshipsall httplocalhost7474dbdatarelationship5

07-29

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactional Cypher HTTP end point

This is the default way to interact with Neo4j using REST Allows you to send Cyhper queries in JSON as the body of a POST

request within the scope of transactions Both read and write queries

Transactions can be open across a number of requests or just for one request

Single request transaction is sent to httphostnameportdbdatatransactioncommit

Multiple requests transaction end points Start

httphostnameportdbdatatransaction Sending queries in this transaction

httphostnameportdbdatatransactiontransaction_id Commit

httphostnameportdbdatatransactiontransaction_idcommit

07-30

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Single Request Transaction Example

REQUEST POST httplocalhost7474dbdatatransactioncommit Accept applicationjson charset=UTF-8 Content-Type applicationjson

statements [ statement CREATE (n) RETURN id(p) ]

RESPONSE 200 OK Content-Type applicationjson

results [

columns [ id(n) ]

data [ row [ 18 ]

] ]

errors [ ]

07-31

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java APIs

07-32

Page 158 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Core API

For embedded Neo4j Instantiate an EmbeddedGraphDatabase

GraphDatabaseService graphDB = new GraphDatabaseFactory()newEmbeddedGraphDatabase(DB_PATH)

Shutdown the service before exiting the program

graphDBshutdown()

07-33

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node

Node abstraction All nodes are of the type Node which is an interface extending

another interface PropertyContainer

Create new Node Nodes are created by invoking the

GraphDatabaseServicecreateNode() methodNode firstNode = graphDBcreateNode()Node secondNode = graphDBcreateNode()Node nodeWithLabel = graphDBcreateNode(Labelhellip labels)

Set Node propertiesfirstNodesetProperty(rdquonamerdquo rdquoBobrdquo)secondNodesetProperty(rdquonamerdquo rdquoAlicerdquo)

07-34

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationships

Relationship abstraction All relationships are of type Relationship which is an interface

extending another interface PropertyContainer

Relationships types can be set using Java enumsprivate static enum RelType implements RelationshipType

KNOWS LIKES HATES

Create a relationshipknows = firstNodecreateRelationshipTo(secondNode RelTypeKNOWS)

Set relationship propertiesknowssetProperty(rdquosincerdquo 1965)

07-35

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Find Noderelationships

Individual node and relationship can be retrieved by its ID Node getNodeById(long id) Relationship getRelationshipById(long id) Not recommended because the ids maybe reused by other node

Nodes can be retrieved by label and property values ResourceIterableltNodegt findNodesByLabelAndProperty(

Label label String key Object value) Relationships of a node can be retrieved by type andor

directions IterableltRelationshipgt getRelationships(RelationshipType type

Direction dir)

07-36

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java API ndash Deleting

Delete the relationships of a node before deleting the node

Deleting a relationshipRelationship relrel = firstNodegetSingleRelationship(KNOWS DirectionOUTGOING)reldelete()

Delete a nodefirstNodedelete()

07-37

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java API - Transactions

All operations are performed in the context of a transactiontry ( Transaction tx = graphDbbeginTx() )

operations on the graph txsuccess()

If you attempt to access the graph outside of a transaction those operations will throw NotInTransactionException

07-38

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

References

Ian Robinson Jim Webber and Emil Eifrem Graph Databases Second Edition OrsquoReilly Media Inc June 5 You can download this book from the Neo4j site http

wwwneo4jorglearn will redirect you to httpgraphdatabasescom

Chapter 4 Chapter 6

Neo4j ndash Reference Manual httpneo4jcomdocsstable

07-39

  • COMP5338 ndash Advanced Data Models
  • Outline
  • A Review Question
  • Index-free Adjacency
  • Neo4j Architecture
  • Node store file
  • Relationship store file
  • Other Files
  • Question Time
  • Node and Relationship structure
  • Doubly linked list
  • Question Time (2)
  • Question Time (3)
  • Slide 14
  • Outline (2)
  • Neo4j Query Execution
  • Query Plan an example
  • Query Starting Points
  • Query Plan With Index
  • Transactions
  • Question Time (4)
  • Outline (3)
  • Graph Data Modelling
  • Node vs Relationship
  • Node vs Relationship (2)
  • Node or Property
  • Relationship Property or Relationship Type
  • Outline (4)
  • REST API
  • Transactional Cypher HTTP end point
  • Single Request Transaction Example
  • Java APIs
  • Core API
  • Node
  • Relationships
  • Find Noderelationships
  • Java API ndash Deleting
  • Java API - Transactions
  • References
Page 3: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

A Review Question

07-3

httpsgooglRkoRu6

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Index-free Adjacency

Native storage of relationships between nodes Effectively a pre-computed bidirectional join

Traversal is like pointer dereferencing Almost as fast as well

Index-free Adjacency Each node maintains a direct link to its adjacent nodes Each node is effectively a micro-index to the adjacent nodes

Cheaper than global indexes Query are faster do not depends on the total size of the graph

07-4

Slides 3-10 are based on Graph Database chapter 61 and 62

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Architecture

07-5

Graph data is stored in store files on disk Nodes relationships properties and labels all have their own store

files Separating graph and property data promotes fast traversal

userrsquos view of their graph and the actual records on disk are structurally dissimilar

Page 163 of Graph Database

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node store file

07-6

All node data is stored in one node store file Physically stored in file named noestorenodestoredb Each record is of a fixed size ndash 15 bytes (was 9 bytes in earlier

version ) Offset of stored node = node id 15 (node id = 100 offset = 1500) Deleted IDs in id file and can be reused

4 bytes for the first relationship ID

4 bytes for the first property ID

5 bytes for labels

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship store file

07-7

All relationship data is stored in one relationship store file Physically stored in file named neostorerelationshipstoredb Each record is of a fixed size ndash 34 bytes Offset of stored relationship = relationship id 34

So relationship id = 10 offset = 340

Other Files

The property store contains fixed-size records that store properties for nodes and relationships
  Simple properties are stored inline
  Complex ones, such as long strings or arrays, are stored elsewhere

The node label in a node record references data in the label store
The relationship type in a relationship record references data in the relationship type store

Both node ID and property ID are 4 bytes
  There is a maximum ID value of 2^32 - 1

IDs are assigned and managed by the system
  The corresponding record is stored at the computed offset
  The IDs of deleted nodes and relationships will be reused
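ID reuse is easy to picture with a small allocator that prefers freed IDs over new ones. This is only a sketch of the idea: `IdAllocator` is a hypothetical name, and the real ID files are not modelled here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of system-managed ID assignment with reuse of freed IDs.
// IdAllocator is a hypothetical name; Neo4j's actual ID files are not modelled.
final class IdAllocator {
    private final Deque<Long> freedIds = new ArrayDeque<>(); // IDs of deleted records
    private long nextUnusedId = 0;                           // high-water mark

    // Hand out a freed ID if one exists, otherwise the next unused one.
    long allocate() {
        return freedIds.isEmpty() ? nextUnusedId++ : freedIds.pop();
    }

    // Called when a record is deleted; its ID becomes available again.
    void free(long id) {
        freedIds.push(id);
    }

    public static void main(String[] args) {
        IdAllocator ids = new IdAllocator();
        long a = ids.allocate();
        long b = ids.allocate();
        ids.free(a);
        long c = ids.allocate(); // the freed ID is handed out again
        System.out.println(a + " " + b + " " + c);
    }
}
```

The third allocation returns the ID that was just freed, so an ID remembered by an application may later point at a different record.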

Question Time

https://goo.gl/SqW4Iu

Node and Relationship structure

Bob LIKES Alice

Doubly linked list

[Figure: records in the node file and the relationship file]
  Node file: three in-use records with IDs 0, 1 and 2 (Name: Alice, Name: Bob, Name: Charlie), each labelled person, with first relationship IDs 10, 10 and 20
  Relationship file: two in-use records of type FOLLOWS, each with some property
    Offset 340: first node 0, second node 1, linked to relationship 20
    Offset 680: first node 0, second node 2, linked to relationship 10

Creation order: node Alice, node Bob, node Charlie, relationship Alice --> Bob, relationship Alice --> Charlie
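The way a traversal follows these records can be sketched with an in-memory model. The example below is a simplification under stated assumptions: records are kept in maps keyed by ID rather than at byte offsets, only the "next" pointers of each chain are modelled (the "previous" pointers that make the list doubly linked are left out), and the pointer values are one plausible reading of the figure. All class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of following a node's relationship chain without any index.
// Records live in maps keyed by ID here; in the store files the ID would
// be turned into a byte offset instead. All names are hypothetical.
final class RelationshipChain {
    static final long NONE = -1; // marks the end of a chain

    // Simplified relationship record: both end nodes plus, for each end
    // node, the ID of the next relationship in that node's chain.
    static final class RelRecord {
        final long firstNode, secondNode, firstNext, secondNext;

        RelRecord(long firstNode, long secondNode, long firstNext, long secondNext) {
            this.firstNode = firstNode;
            this.secondNode = secondNode;
            this.firstNext = firstNext;
            this.secondNext = secondNext;
        }
    }

    final Map<Long, Long> firstRelOfNode = new HashMap<>();      // node ID -> first relationship ID
    final Map<Long, RelRecord> relationships = new HashMap<>();  // relationship ID -> record

    // Collect the neighbours of a node by walking its relationship chain.
    List<Long> neighbours(long nodeId) {
        List<Long> result = new ArrayList<>();
        long relId = firstRelOfNode.getOrDefault(nodeId, NONE);
        while (relId != NONE) {
            RelRecord rel = relationships.get(relId);
            boolean isFirst = rel.firstNode == nodeId;
            result.add(isFirst ? rel.secondNode : rel.firstNode);
            relId = isFirst ? rel.firstNext : rel.secondNext;
        }
        return result;
    }

    // Builds the three-node example: Alice (0) follows Bob (1) and Charlie (2).
    static RelationshipChain example() {
        RelationshipChain g = new RelationshipChain();
        g.relationships.put(10L, new RelRecord(0, 1, 20, NONE));
        g.relationships.put(20L, new RelRecord(0, 2, NONE, NONE));
        g.firstRelOfNode.put(0L, 10L);
        g.firstRelOfNode.put(1L, 10L);
        g.firstRelOfNode.put(2L, 20L);
        return g;
    }
}
```

Calling `RelationshipChain.example().neighbours(0)` visits relationship 10 and then relationship 20, touching only Alice's own records. The work is proportional to the number of relationships on that node, not to the size of the graph.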

Question Time

https://goo.gl/ZxIIBN

Question Time

https://goo.gl/kPYIIE

"The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record's location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j's commitment to high-performance traversals."

-- Chapter 6, Graph Databases

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j - Data Modeling

Neo4j - Programming APIs

Neo4j Query Execution

Each Neo4j query is turned into an execution plan by an execution planner
  Rule strategy planner
    Considers available indexes but does not use statistical information
  Cost strategy planner (default and in development)
    Uses statistical information to evaluate a few alternative plans
    E.g. if there are fewer Movie nodes than People nodes, a query involving both may perform better if it starts from the collection of Movie nodes
      See example in lab

Query plan stages
  Starting point
  Expansion by matching the given path in the query statement
  Row filtering, skipping, sorting, projection, etc.
  Updating
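The Movie versus People example boils down to comparing node counts and starting from the smaller set. A minimal sketch of that single decision follows; it is not the Neo4j planner, the class and method names are hypothetical, and the counts in the usage note are illustrative only.

```java
import java.util.Map;

// Sketch of one cost-based decision: start from the label with the fewest nodes.
// StartingPoint is a hypothetical name; the real planner weighs far more than this.
final class StartingPoint {
    // Return the label whose node count is smallest, or null for an empty map.
    static String cheapestLabel(Map<String, Long> nodeCountByLabel) {
        String best = null;
        long bestCount = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : nodeCountByLabel.entrySet()) {
            if (entry.getValue() < bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```

With counts such as `Map.of("Movie", 40L, "Person", 130L)`, the method picks `Movie`, mirroring the slide's reasoning that fewer starting nodes mean fewer expansions.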

Query Plan: an example

Query
  MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name

[Figures: plans shown for this query with EXPLAIN and with PROFILE]

Query Starting Points

Most queries start with one node or a set of nodes, except when a relationship ID is specified
  MATCH (n1)-[r]->() WHERE id(r) = 0 RETURN r, n1
  This query starts by locating the first record in the relationship file

A query may start by scanning all nodes
  MATCH (n) RETURN n
  MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name

A query may start by scanning all nodes belonging to a given label
  MATCH (p:Person {name: "Tom Hanks"}) RETURN p
  Labels are implicitly indexed

A query may start by using an index

Query Plan With Index

Neo4j supports indexes on properties of labelled nodes

Indexes behave similarly to those in relational systems

Create index
  CREATE INDEX ON :Person(name)

Drop index
  DROP INDEX ON :Person(name)

Query
  MATCH (bacon:Person {name: "Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood

Transactions

Neo4j supports full ACID transactions
  Similar to those in an RDBMS

Uses locking to ensure consistency
  The lock manager manages the locks held by a transaction

Logging
  Write-ahead logging (WAL)

Transaction commit protocol
  Acquire locks (Atomicity, Consistency, Isolation)
  Write undo and redo records to the WAL
    A record is written to the log for each node, relationship or property changed
  Write the commit record to the log and flush it to disk (Durability)
  Release locks

Recovery, if the database server or machine crashes
  Apply log records to replay the changes made by the transactions
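The logging and recovery steps can be illustrated with a very small redo-only log. This is a sketch under simplifying assumptions: an in-memory list stands in for the log file, locking and undo records are left out, and recovery replays only the changes of transactions whose commit record reached the log. `TinyWal` and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of write-ahead logging with redo-only recovery.
// An in-memory list stands in for the log file; locking and undo records
// are left out. All names are hypothetical.
final class TinyWal {
    // One log entry: either a change (key, value) or a commit marker (key == null).
    static final class LogRecord {
        final long txId;
        final String key;
        final String value;

        LogRecord(long txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }

        boolean isCommit() {
            return key == null;
        }
    }

    final List<LogRecord> log = new ArrayList<>();

    // Append a change record for a transaction; the store itself is not touched yet.
    void logChange(long txId, String key, String value) {
        log.add(new LogRecord(txId, key, value));
    }

    // Append the commit marker; a real system would flush the log to disk here.
    void logCommit(long txId) {
        log.add(new LogRecord(txId, null, null));
    }

    // Recovery: replay, in log order, the changes of committed transactions only.
    Map<String, String> recover() {
        Set<Long> committed = new HashSet<>();
        for (LogRecord record : log) {
            if (record.isCommit()) {
                committed.add(record.txId);
            }
        }
        Map<String, String> store = new HashMap<>();
        for (LogRecord record : log) {
            if (!record.isCommit() && committed.contains(record.txId)) {
                store.put(record.key, record.value);
            }
        }
        return store;
    }
}
```

If transaction 1 logs a change and commits while transaction 2 logs a change but crashes before its commit record is written, `recover()` rebuilds only the change from transaction 1.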

Question Time

https://goo.gl/m8cOcz

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j - Data Modeling

Neo4j - Programming API

Graph Data Modelling

Graph data modelling is very closely related to domain modelling

You need to decide
  Node or Relationship
  Node or Property
  Label/Type or Property

Decisions are based on
  Features of entities in the application domain
  Your typical queries
  Features and constraints of the underlying storage system

Slides 18-22 are based on Chapter 4 of the Graph Databases book

Node vs Relationship

Nodes for Things, Relationships for Structure
  AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read

  MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})<-[:LIKES]-(:Reader)-[:LIKES]->(books:Book) RETURN books.title

Node vs Relationship

Model Facts as Nodes

Node or Property

Represent Complex Value Types as Nodes

Relationship Property or Relationship Type

E.g. the relationship between a user node and an address node can be
  typed as HOME_ADDRESS or BILLING_ADDRESS, or
  typed as a generic ADDRESS and differentiated using a type property
    {type: 'home'}, {type: 'billing'}

We use fine-grained relationships whenever we have a closed set of relationship types
  E.g. there is only a finite set of address types
  If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships
    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j - Data Modeling

Neo4j - Programming API

REST API

Neo4j can be programmed from any language through the REST API

The default representation for requests and responses is JSON

Service root
  http://hostname:port/db/data/

A node is represented by its id
  http://hostname:port/db/data/node/0

A relationship can be represented by its id or traversed from a node
  http://hostname:port/db/data/node/0/relationships/all
  http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP endpoint

This is the default way to interact with Neo4j using REST
  Allows you to send Cypher queries in JSON as the body of a POST request, within the scope of transactions
  Both read and write queries

Transactions can be kept open across a number of requests or used for just one request

A single-request transaction is sent to
  http://hostname:port/db/data/transaction/commit

Multiple-request transaction endpoints
  Start
    http://hostname:port/db/data/transaction
  Sending queries in this transaction
    http://hostname:port/db/data/transaction/{transaction_id}
  Commit
    http://hostname:port/db/data/transaction/{transaction_id}/commit

Single Request Transaction Example

REQUEST
  POST http://localhost:7474/db/data/transaction/commit
  Accept: application/json; charset=UTF-8
  Content-Type: application/json

  {
    "statements" : [ {
      "statement" : "CREATE (n) RETURN id(n)"
    } ]
  }

RESPONSE
  200 OK
  Content-Type: application/json

  {
    "results" : [ {
      "columns" : [ "id(n)" ],
      "data" : [ {
        "row" : [ 18 ]
      } ]
    } ],
    "errors" : [ ]
  }
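The same request can be assembled from Java with the JDK's own HTTP types (Java 11 or later). The sketch below builds the request without sending it. The class and method names are hypothetical, the JSON is assembled by hand with only minimal escaping, and the endpoint path is the one shown on the slides, which may differ in other Neo4j versions.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: building the single-request transaction call with the JDK's
// HTTP client types (Java 11+). CypherRequest is a hypothetical name.
final class CypherRequest {
    // Wrap one Cypher statement in the JSON envelope used by the endpoint.
    // Only backslashes and double quotes are escaped, which is enough for
    // simple statements but not a general JSON encoder.
    static String payload(String cypher) {
        String escaped = cypher.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statements\":[{\"statement\":\"" + escaped + "\"}]}";
    }

    // Build (but do not send) the POST request for the commit endpoint.
    // serviceRoot is expected to end with a slash, e.g. http://localhost:7474/db/data/
    static HttpRequest build(String serviceRoot, String cypher) {
        return HttpRequest.newBuilder(URI.create(serviceRoot + "transaction/commit"))
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(cypher)))
                .build();
    }
}
```

To execute it against a running server, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; the response body would then carry the `results` and `errors` fields shown in the example.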

Java APIs

Page 158 of the Graph Databases book

Core API

For embedded Neo4j

Instantiate an embedded graph database
  GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program
  graphDB.shutdown();

Node

Node abstraction
  All nodes are of the type Node, which is an interface extending another interface, PropertyContainer

Create a new node
  Nodes are created by invoking the GraphDatabaseService.createNode() method
    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    Node nodeWithLabel = graphDB.createNode(Label... labels)

Set node properties
    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

Relationship abstraction
  All relationships are of type Relationship, which is an interface extending another interface, PropertyContainer

Relationship types can be defined using Java enums
    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }

Create a relationship
    knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties
    knows.setProperty("since", 1965);

Find Nodes/Relationships

An individual node or relationship can be retrieved by its ID
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
  Not recommended, because IDs may be reused by other nodes or relationships

Nodes can be retrieved by label and property values
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

Relationships of a node can be retrieved by type and/or direction
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API - Deleting

Delete the relationships of a node before deleting the node

Deleting a relationship
    Relationship rel;
    rel = firstNode.getSingleRelationship(KNOWS, Direction.OUTGOING);
    rel.delete();

Deleting a node
    firstNode.delete();

Java API - Transactions

All operations are performed in the context of a transaction
    try ( Transaction tx = graphDb.beginTx() ) {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException

References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media, Inc., June 5
  You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6

Neo4j Reference Manual: http://neo4j.com/docs/stable/

Page 5: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Architecture

07-5

Graph data is stored in store files on disk Nodes relationships properties and labels all have their own store

files Separating graph and property data promotes fast traversal

userrsquos view of their graph and the actual records on disk are structurally dissimilar

Page 163 of Graph Database

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node store file

07-6

All node data is stored in one node store file Physically stored in file named noestorenodestoredb Each record is of a fixed size ndash 15 bytes (was 9 bytes in earlier

version ) Offset of stored node = node id 15 (node id = 100 offset = 1500) Deleted IDs in id file and can be reused

4 bytes for the first relationship ID

4 bytes for the first property ID

5 bytes for labels

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship store file

07-7

All relationship data is stored in one relationship store file Physically stored in file named neostorerelationshipstoredb Each record is of a fixed size ndash 34 bytes Offset of stored relationship = relationship id 34

So relationship id = 10 offset = 340

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Other Files

07-8

Property store contains fixed size records to store properties for nodes and relationships Simple properties are stored inline Complex ones such as long string or array property are stored else

where

Node label in node records references data in label store Relationship type in relationship record references data in

relationship type store Both Node ID and Property ID are 4 bytes

There is a maximum ID value 2~32 -1 ID is assigned and managed by the system

The corresponding record will be stored in the computed offset The IDs of deleted nodesrelationships will be reused

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-9

httpsgooglSqW4Iu

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node and Relationship structure

Bob LIKES Alice

07-10

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Doubly linked list

07-11

0

1 2

10 20

1 1 1

Node file NameAlice NameBob NameCharlie

10 10 20

Relationship file

1

Offset 340

0 1 20

TypeFOLLOWS Some property

1

Offset 680

0 2 10

Some propertyTypeFOLLOWS

person personperson

Creation order node Alice node Bob node Charlie relationship Alice --gt Bob relationship Alice --gt Charlie

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-12

httpsgooglZxIIBN

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-13

httpsgooglkPYIIE

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-14

ldquoThe node and relationship stores are concerned only with the structure of the graph not its property data Both stores use fixed-sized records so that any individual recordrsquos location within a store file can be rapidly computed given its ID These are critical design decisions that underline Neo4jrsquos commitment to high-performance traversalsrdquo

-- Chapter 6 Graph Databases

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

07-15

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

07-16

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan an example

07-17

Query

MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors)RETURN directorsname

explain

profile

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Starting Points

Most queries start with one or a set of nodes except if a relationship ID is specified MATCH (n1)-[r]-gt() WHERE id(r)= 0 RETURN r n1 This query will start from locating the first record in the relationship

file

Query may start by scanning all nodes MATCH(n) RETURN (n) MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors) RETURN

directorsname

Query may start by scanning all nodes belonging to a given label MATCH (pPersonnamerdquoTom Hanksrdquo) return p Labels are implicitly indexed

Query may start by using index

07-18

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

Transactions

Neo4j supports full ACID transactions, similar to those in an RDBMS.
• Uses locking to ensure consistency; the Lock Manager manages the locks held by a transaction
• Logging: Write-Ahead Logging (WAL)

Transaction commit protocol:
• Acquire locks (Atomicity, Consistency, Isolation)
• Write undo and redo records to the WAL for each node, relationship and property changed
• Write the commit record to the log and flush it to disk (Durability)
• Release locks

Recovery, if the database server or machine crashes:
• Apply log records to replay the changes made by the transactions

07-20
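The commit protocol and recovery step above can be sketched with an in-memory log. This is a deliberately simplified model under stated assumptions: locking and undo records are omitted, the "log" is a Java list rather than a file flushed to disk, and every name in the class is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy write-ahead log: redo records only, no locking, no real disk I/O. */
final class WalSketch {
    /** The "durable" log: assumed to survive a crash. */
    final List<String> log = new ArrayList<>();
    /** The store: assumed to be lost or stale after a crash. */
    Map<String, String> store = new HashMap<>();

    /** Commit: log every change, then the commit record, then apply to the store. */
    void commit(int txId, Map<String, String> changes) {
        for (Map.Entry<String, String> c : changes.entrySet()) {
            log.add("REDO " + txId + " " + c.getKey() + " " + c.getValue());
        }
        log.add("COMMIT " + txId); // a real system flushes the log to disk here
        store.putAll(changes);
    }

    /** Simulates a crash after a change was logged but before its commit record. */
    void logWithoutCommit(int txId, String key, String value) {
        log.add("REDO " + txId + " " + key + " " + value);
    }

    /** Recovery: rebuild the store by replaying changes of committed transactions only. */
    void recover() {
        List<String> committed = new ArrayList<>();
        for (String rec : log) {
            if (rec.startsWith("COMMIT ")) committed.add(rec.substring(7));
        }
        store = new HashMap<>();
        for (String rec : log) {
            String[] f = rec.split(" ", 4); // keys must not contain spaces
            if (f[0].equals("REDO") && committed.contains(f[1])) store.put(f[2], f[3]);
        }
    }
}
```

Because the commit record is written before the store is touched, a crash at any point leaves the log with enough information to redo committed work and ignore uncommitted work.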

Question Time

07-21

https://goo.gl/m8cOcz

07-22

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j – Data Modeling

Neo4j – Programming API

Graph Data Modelling

Graph data modelling is closely related to domain modelling.

You need to decide:
• Node or relationship
• Node or property
• Label/type or property

Decisions are based on:
• Features of entities in the application domain
• Your typical queries
• Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of the Graph Databases book

Node vs Relationship

Nodes for things, relationships for structure.

User story: AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read.

07-24

MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})<-[:LIKES]-(:Reader)-[:LIKES]->(books:Book) RETURN books.title


Node vs Relationship

Model Facts as Nodes

07-25


Node or Property

Represent Complex Value Types as Nodes

07-26

Relationship Property or Relationship Type

E.g., the relationship between a user node and an address node can be:
• typed as HOME_ADDRESS, BILLING_ADDRESS, or
• typed as a generic ADDRESS and differentiated using a type property: {type: 'home'}, {type: 'billing'}

We use fine-grained relationships whenever we have a closed set of relationship types.
• E.g., there is only a finite set of address types.
• If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships:
    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)

07-27

07-28

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j – Data Modeling

Neo4j – Programming API

REST API

Neo4j can be programmed from any language through its REST API.

The default representation for requests and responses is JSON.

Service root: http://hostname:port/db/data

A node is represented by its ID:
    http://hostname:port/db/data/node/0

A relationship can be addressed by its ID or traversed from a node:
    http://hostname:port/db/data/node/0/relationships/all
    http://localhost:7474/db/data/relationship/5

07-29

Transactional Cypher HTTP endpoint

This is the default way to interact with Neo4j using REST.
• Allows you to send Cypher queries in JSON as the body of a POST request, within the scope of transactions
• Supports both read and write queries

Transactions can be kept open across a number of requests or used for just one request.

A single-request transaction is sent to:
    http://hostname:port/db/data/transaction/commit

Multiple-request transaction endpoints:
• Start: http://hostname:port/db/data/transaction
• Send queries in this transaction: http://hostname:port/db/data/transaction/transaction_id
• Commit: http://hostname:port/db/data/transaction/transaction_id/commit

07-30

Single Request Transaction Example

REQUEST
    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements": [ { "statement": "CREATE (n) RETURN id(n)" } ]
    }

RESPONSE
    200 OK
    Content-Type: application/json

    {
      "results": [ {
        "columns": [ "id(n)" ],
        "data": [ { "row": [ 18 ] } ]
      } ],
      "errors": [ ]
    }

07-31
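The request on this slide could be assembled from Java roughly as follows. This is a sketch under assumptions: it uses the JDK's java.net.http classes (Java 11 or later), only builds the request without sending it, and escapes just quotes and backslashes in the query. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) a single-request transaction for one Cypher statement. */
final class CypherRequestSketch {

    /** Escapes backslashes and double quotes so the query can sit inside a JSON string. */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Produces the JSON body {"statements":[{"statement":"..."}]}. */
    static String body(String cypher) {
        return "{\"statements\":[{\"statement\":\"" + jsonEscape(cypher) + "\"}]}";
    }

    /** serviceRoot is e.g. http://localhost:7474/db/data, as on the REST API slide. */
    static HttpRequest commitRequest(String serviceRoot, String cypher) {
        return HttpRequest.newBuilder(URI.create(serviceRoot + "/transaction/commit"))
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(cypher)))
                .build();
    }
}
```

Sending the request would need an HttpClient and a running server; a real client should also use a JSON library rather than hand-built strings.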


Java APIs

07-32

Page 158 of Graph Databases book

Core API

For embedded Neo4j, instantiate an embedded graph database:

    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program:

    graphDB.shutdown();

07-33

Node

Node abstraction: all nodes are of the type Node, an interface that extends another interface, PropertyContainer.

Create a new node: nodes are created by invoking the GraphDatabaseService.createNode() method.

    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    Node nodeWithLabel = graphDB.createNode(Label... labels);

Set node properties:

    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

07-34

Relationships

Relationship abstraction: all relationships are of type Relationship, an interface that extends another interface, PropertyContainer.

Relationship types can be defined using Java enums:

    private static enum RelType implements RelationshipType
    {
        KNOWS, LIKES, HATES
    }

Create a relationship:

    knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties:

    knows.setProperty("since", 1965);

07-35

Find Node/relationships

An individual node or relationship can be retrieved by its ID:
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
    Not recommended, because IDs may be reused by other nodes.

Nodes can be retrieved by label and property value:
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

Relationships of a node can be retrieved by type and/or direction:
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

07-36

Java API – Deleting

Delete the relationships of a node before deleting the node.

Delete a relationship:

    Relationship rel;
    rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

Delete a node:

    firstNode.delete();

07-37

Java API - Transactions

All operations are performed in the context of a transaction:

    try ( Transaction tx = graphDB.beginTx() )
    {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException.

07-38
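The try-with-resources pattern on this slide relies on the transaction being closed automatically, with the outcome decided by whether success() was called. The sketch below imitates that behaviour with a stand-in class so it runs without Neo4j; TxSketch and its inner Tx are hypothetical and are not the Neo4j Transaction interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy "graph" whose transactions commit only if success() was called before close(). */
final class TxSketch {
    /** Committed state: just a list of node names. */
    final List<String> committed = new ArrayList<>();

    /** Stand-in for a transaction; NOT the Neo4j Transaction interface. */
    final class Tx implements AutoCloseable {
        private final List<String> pending = new ArrayList<>();
        private boolean success;

        void createNode(String name) { pending.add(name); }

        /** Marks the transaction as successful; the commit itself happens in close(). */
        void success() { success = true; }

        @Override
        public void close() {
            if (success) committed.addAll(pending); // commit
            // otherwise the pending changes are dropped (rollback)
        }
    }

    Tx beginTx() { return new Tx(); }
}
```

A block that exits without calling success(), for example because an exception was thrown, leaves the committed state untouched.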

References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
• You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
• Chapter 4, Chapter 6

Neo4j – Reference Manual: http://neo4j.com/docs/stable

07-39


Node store file

07-6

All node data is stored in one node store file.
• Physically stored in a file named neostore.nodestore.db
• Each record has a fixed size: 15 bytes (was 9 bytes in earlier versions)
• Offset of stored node = node ID × 15 (e.g., node ID = 100, offset = 1500)
• Deleted IDs are kept in an .id file and can be reused

[Figure: node record layout, including 4 bytes for the first relationship ID, 4 bytes for the first property ID and 5 bytes for labels]

Relationship store file

07-7

All relationship data is stored in one relationship store file.
• Physically stored in a file named neostore.relationshipstore.db
• Each record has a fixed size: 34 bytes
• Offset of stored relationship = relationship ID × 34
• So for relationship ID = 10, offset = 340
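The offset arithmetic for both store files is a single multiplication, which is what makes record lookup by ID so cheap. A minimal Java sketch, using the record sizes quoted on the slides (the class and method names are made up for the example):

```java
/** Offset arithmetic for fixed-size records, using the sizes quoted on the slides. */
final class StoreOffsets {
    static final int NODE_RECORD_BYTES = 15;         // node store record size
    static final int RELATIONSHIP_RECORD_BYTES = 34; // relationship store record size

    /** Byte offset of a node record within the node store file. */
    static long nodeOffset(long nodeId) {
        return nodeId * NODE_RECORD_BYTES;
    }

    /** Byte offset of a relationship record within the relationship store file. */
    static long relationshipOffset(long relationshipId) {
        return relationshipId * RELATIONSHIP_RECORD_BYTES;
    }
}
```

For example, nodeOffset(100) gives 1500 and relationshipOffset(10) gives 340, matching the figures on the slides.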

Other Files

07-8

The property store contains fixed-size records to store properties for nodes and relationships.
• Simple properties are stored inline
• Complex ones, such as long strings or array properties, are stored elsewhere

The node label in a node record references data in the label store.
The relationship type in a relationship record references data in the relationship type store.

Both node IDs and property IDs are 4 bytes.
• There is a maximum ID value: 2^32 - 1
• IDs are assigned and managed by the system
• The corresponding record is stored at the computed offset
• The IDs of deleted nodes/relationships will be reused
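ID reuse can be pictured as a free list sitting in front of a counter. The Java sketch below is an assumption-laden illustration of that idea, not how Neo4j's .id files actually work; the class name and the last-freed-first-reused policy are choices made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy ID allocator: reuses freed IDs before handing out fresh ones. */
final class IdAllocatorSketch {
    private long nextUnused = 0;                          // next never-used ID
    private final Deque<Long> freed = new ArrayDeque<>(); // IDs of deleted records

    /** Returns a freed ID if one is available, otherwise the next unused ID. */
    long allocate() {
        return freed.isEmpty() ? nextUnused++ : freed.pop();
    }

    /** Records that the given ID belongs to a deleted record and may be reused. */
    void free(long id) {
        freed.push(id);
    }
}
```

This is why an ID held by an application is not a stable reference: after a delete, the same ID (and the same file offset) may belong to a different node or relationship.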

Question Time

07-9

https://goo.gl/SqW4Iu


Node and Relationship structure

Bob LIKES Alice

07-10

Doubly linked list

07-11

[Figure: node file and relationship file records chained as doubly linked lists. The node file holds three person nodes (Name: Alice, Name: Bob, Name: Charlie); the relationship file holds two FOLLOWS relationship records, each with some property, at offsets 340 and 680.]

Creation order: node Alice, node Bob, node Charlie, relationship Alice --> Bob, relationship Alice --> Charlie
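How a traversal follows the relationship chain of a node can be sketched in Java. The model below is simplified and partly assumed: it keeps only "next" pointers (the real records are doubly linked), appends new relationships at the end of each chain, and stores records in maps instead of fixed-size file records. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy relationship chains: each relationship sits in the chain of both of its end nodes. */
final class RelChainSketch {
    static final long NONE = -1;

    /** Simplified relationship record: two end nodes plus one "next" pointer per end node. */
    static final class Rel {
        final long id, firstNode, secondNode;
        long firstNext = NONE;  // next relationship in firstNode's chain
        long secondNext = NONE; // next relationship in secondNode's chain
        Rel(long id, long firstNode, long secondNode) {
            this.id = id; this.firstNode = firstNode; this.secondNode = secondNode;
        }
    }

    final Map<Long, Long> firstRelOfNode = new HashMap<>(); // node ID -> first relationship ID
    final Map<Long, Rel> rels = new HashMap<>();            // relationship ID -> record

    /** Adds a relationship and links it into the chains of both end nodes. */
    void addRel(long id, long from, long to) {
        rels.put(id, new Rel(id, from, to));
        append(from, id);
        if (to != from) append(to, id); // a self-loop is linked only once
    }

    private void append(long node, long relId) {
        Long cur = firstRelOfNode.get(node);
        if (cur == null) { firstRelOfNode.put(node, relId); return; }
        while (true) {
            Rel r = rels.get(cur);
            long next = (r.firstNode == node) ? r.firstNext : r.secondNext;
            if (next == NONE) {
                if (r.firstNode == node) r.firstNext = relId; else r.secondNext = relId;
                return;
            }
            cur = next;
        }
    }

    /** Walks the chain of one node and returns the IDs of its neighbours. */
    List<Long> neighbours(long node) {
        List<Long> out = new ArrayList<>();
        long cur = firstRelOfNode.getOrDefault(node, NONE);
        while (cur != NONE) {
            Rel r = rels.get(cur);
            out.add(r.firstNode == node ? r.secondNode : r.firstNode);
            cur = (r.firstNode == node) ? r.firstNext : r.secondNext;
        }
        return out;
    }
}
```

Using node IDs 0, 1 and 2 for Alice, Bob and Charlie and relationship IDs 10 and 20, calling addRel(10, 0, 1) and addRel(20, 0, 2) lets neighbours(0) reach Bob and Charlie by pointer chasing alone, without any global index.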

Question Time

07-12

https://goo.gl/ZxIIBN

Question Time

07-13

https://goo.gl/kPYIIE

07-14

"The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record's location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j's commitment to high-performance traversals."

-- Chapter 6, Graph Databases

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j – Data Modeling

Neo4j – Programming APIs

07-15

Neo4j Query Execution

Each Neo4j query is turned into an execution plan by an execution planner.
• Rule strategy planner: considers available indexes but does not use statistical information
• Cost strategy planner (default, and in development): uses statistical information to evaluate a few alternative plans
  E.g., if there are fewer Movie nodes than Person nodes, a query involving both may perform better if it starts from the collection of Movie nodes (see the example in the lab).

Query plan stages:
• Starting point
• Expansion by matching the given path in the query statement
• Row filtering, skipping, sorting, projection, etc.
• Updating

07-16

Page 7: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship store file

07-7

All relationship data is stored in one relationship store file Physically stored in file named neostorerelationshipstoredb Each record is of a fixed size ndash 34 bytes Offset of stored relationship = relationship id 34

So relationship id = 10 offset = 340

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Other Files

07-8

Property store contains fixed size records to store properties for nodes and relationships Simple properties are stored inline Complex ones such as long string or array property are stored else

where

Node label in node records references data in label store Relationship type in relationship record references data in

relationship type store Both Node ID and Property ID are 4 bytes

There is a maximum ID value 2~32 -1 ID is assigned and managed by the system

The corresponding record will be stored in the computed offset The IDs of deleted nodesrelationships will be reused

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-9

httpsgooglSqW4Iu

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node and Relationship structure

Bob LIKES Alice

07-10

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Doubly linked list

07-11

0

1 2

10 20

1 1 1

Node file NameAlice NameBob NameCharlie

10 10 20

Relationship file

1

Offset 340

0 1 20

TypeFOLLOWS Some property

1

Offset 680

0 2 10

Some propertyTypeFOLLOWS

person personperson

Creation order node Alice node Bob node Charlie relationship Alice --gt Bob relationship Alice --gt Charlie

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-12

httpsgooglZxIIBN

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-13

httpsgooglkPYIIE

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-14

ldquoThe node and relationship stores are concerned only with the structure of the graph not its property data Both stores use fixed-sized records so that any individual recordrsquos location within a store file can be rapidly computed given its ID These are critical design decisions that underline Neo4jrsquos commitment to high-performance traversalsrdquo

-- Chapter 6 Graph Databases

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

07-15

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

07-16

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan an example

07-17

Query

MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors)RETURN directorsname

explain

profile

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Starting Points

Most queries start with one or a set of nodes except if a relationship ID is specified MATCH (n1)-[r]-gt() WHERE id(r)= 0 RETURN r n1 This query will start from locating the first record in the relationship

file

Query may start by scanning all nodes MATCH(n) RETURN (n) MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors) RETURN

directorsname

Query may start by scanning all nodes belonging to a given label MATCH (pPersonnamerdquoTom Hanksrdquo) return p Labels are implicitly indexed

Query may start by using index

07-18

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactions Neo4j supports full ACID transactions

Similar to those in RDBMS Uses locking to ensure consistency

Lock Manager manages locks held by a transaction Logging

Write Ahead Logging (WAL) Transaction Commit Protocol

Acquire locks (Atomicity Consistency Isolation) Write Undo and Redo records to the WAL

for each node relationship property changed is written to the log Write commit record to the log and flush to disk (Durability) Release locks

Recovery ndash if the database servermachine crashes Apply log records to replay changes made by the transactions

07-20

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-21

httpsgooglm8cOcz

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-22

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Graph Data Modelling

Graph data modelling is very closely related with domain modelling

You need to decide Node or Relationship Node or Property LabelType or Property

Decisions are based on Features of entities in application domain Your typical queries Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Nodes for Things Relationship for Structures AS A reader who likes a book I WANT to know which books other

readers who like the same book have liked SO THAT I can find other books to read

07-24

MATCH (Reader nameAlice)-[LIKES]-gt(Book titleDune)lt-[LIKES]-(Reader)-[LIKES]-gt(booksBook)RETURN bookstitle

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Model Facts as Nodes

07-25

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node or Property

Represent Complex Value Types as Nodes

07-26

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship Property or Relationship Type

Eg The relationship between user node and address node can be

typed as HOME_ADDRESS BILLING_ADDRESS or typed as generic ADDRESS and differentiated using a type property

typersquohomersquo typersquobillingrsquo

We use fine-grained relationships whenever we have a closed set of relationship types Eg there are only a finite set of address types If traversal would like to follow generic type ADDRESS we may

have to use redundant relationships MATCH (user)-[HOME_ADDRESS|WORK_ADDRESS|

DELIVERY_ADDRESS]-gt(address) MATCH (user)-[ADDRESS]-gt(address) MATCH (userUser)-[r]-gt(addressAddress)

07-27

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-28

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

REST API

Programming Neo4j can be done using any language using REST API

Default representation for request and response is JSON format

Service root httphostnameportdbdata Node is represented by its id

httphostnameportdbdatanode0 Relationship can be represented by its id or traversed from

a node httphostnameportdbdatanode0relationshipsall httplocalhost7474dbdatarelationship5

07-29

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactional Cypher HTTP end point

This is the default way to interact with Neo4j using REST Allows you to send Cyhper queries in JSON as the body of a POST

request within the scope of transactions Both read and write queries

Transactions can be open across a number of requests or just for one request

Single request transaction is sent to httphostnameportdbdatatransactioncommit

Multiple requests transaction end points Start

httphostnameportdbdatatransaction Sending queries in this transaction

httphostnameportdbdatatransactiontransaction_id Commit

httphostnameportdbdatatransactiontransaction_idcommit

07-30

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Single Request Transaction Example

REQUEST POST httplocalhost7474dbdatatransactioncommit Accept applicationjson charset=UTF-8 Content-Type applicationjson

statements [ statement CREATE (n) RETURN id(p) ]

RESPONSE 200 OK Content-Type applicationjson

results [

columns [ id(n) ]

data [ row [ 18 ]

] ]

errors [ ]

07-31

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java APIs

07-32

Page 158 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Core API

For embedded Neo4j Instantiate an EmbeddedGraphDatabase

GraphDatabaseService graphDB = new GraphDatabaseFactory()newEmbeddedGraphDatabase(DB_PATH)

Shutdown the service before exiting the program

graphDBshutdown()

07-33

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node

Node abstraction All nodes are of the type Node which is an interface extending

another interface PropertyContainer

Create new Node Nodes are created by invoking the

GraphDatabaseServicecreateNode() methodNode firstNode = graphDBcreateNode()Node secondNode = graphDBcreateNode()Node nodeWithLabel = graphDBcreateNode(Labelhellip labels)

Set Node propertiesfirstNodesetProperty(rdquonamerdquo rdquoBobrdquo)secondNodesetProperty(rdquonamerdquo rdquoAlicerdquo)

07-34

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationships

Relationship abstraction All relationships are of type Relationship which is an interface

extending another interface PropertyContainer

Relationships types can be set using Java enumsprivate static enum RelType implements RelationshipType

KNOWS LIKES HATES

Create a relationshipknows = firstNodecreateRelationshipTo(secondNode RelTypeKNOWS)

Set relationship propertiesknowssetProperty(rdquosincerdquo 1965)

07-35

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Find Noderelationships

Individual node and relationship can be retrieved by its ID Node getNodeById(long id) Relationship getRelationshipById(long id) Not recommended because the ids maybe reused by other node

Nodes can be retrieved by label and property values ResourceIterableltNodegt findNodesByLabelAndProperty(

Label label String key Object value) Relationships of a node can be retrieved by type andor

directions IterableltRelationshipgt getRelationships(RelationshipType type

Direction dir)

07-36

Java API – Deleting

Delete the relationships of a node before deleting the node
Deleting a relationship

    Relationship rel;
    rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

Delete a node

    firstNode.delete();

07-37

Java API - Transactions

All operations are performed in the context of a transaction

    try ( Transaction tx = graphDB.beginTx() )
    {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException

07-38
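The commit-or-rollback behaviour of this try-with-resources pattern can be illustrated without a database. The following is a minimal sketch, not Neo4j code: FakeTx is a hypothetical stand-in for Transaction, and it assumes that closing a transaction commits only if success() was called and rolls back otherwise.

```java
final class TxPatternSketch {
    /** Hypothetical stand-in for a transaction (not the Neo4j class). */
    static final class FakeTx implements AutoCloseable {
        private final StringBuilder outcome;
        private boolean markedSuccessful = false;

        FakeTx(StringBuilder outcome) {
            this.outcome = outcome;
        }

        /** Marks the transaction as successful; nothing is committed yet. */
        void success() {
            markedSuccessful = true;
        }

        /** Called automatically at the end of the try block. */
        @Override
        public void close() {
            outcome.append(markedSuccessful ? "commit" : "rollback");
        }
    }

    /** Runs one unit of work and reports how the transaction ended. */
    static String run(boolean failHalfway) {
        StringBuilder outcome = new StringBuilder();
        try (FakeTx tx = new FakeTx(outcome)) {
            if (failHalfway) {
                throw new IllegalStateException("simulated failure");
            }
            tx.success();
        } catch (IllegalStateException e) {
            // close() has already run by the time this handler executes
        }
        return outcome.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // commit
        System.out.println(run(true));  // rollback
    }
}
```

The point of the pattern is that the exception path never reaches success(), so the work is discarded without any explicit rollback call.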

References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 2015
  You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6
Neo4j – Reference Manual: http://neo4j.com/docs/stable

07-39

  • COMP5338 – Advanced Data Models
  • Outline
  • A Review Question
  • Index-free Adjacency
  • Neo4j Architecture
  • Node store file
  • Relationship store file
  • Other Files
  • Question Time
  • Node and Relationship structure
  • Doubly linked list
  • Question Time (2)
  • Question Time (3)
  • Slide 14
  • Outline (2)
  • Neo4j Query Execution
  • Query Plan an example
  • Query Starting Points
  • Query Plan With Index
  • Transactions
  • Question Time (4)
  • Outline (3)
  • Graph Data Modelling
  • Node vs Relationship
  • Node vs Relationship (2)
  • Node or Property
  • Relationship Property or Relationship Type
  • Outline (4)
  • REST API
  • Transactional Cypher HTTP end point
  • Single Request Transaction Example
  • Java APIs
  • Core API
  • Node
  • Relationships
  • Find Node/relationships
  • Java API – Deleting
  • Java API - Transactions
  • References
Other Files

07-8

Property store contains fixed-size records to store properties for nodes and relationships
  Simple properties are stored inline
  Complex ones, such as long string or array properties, are stored elsewhere
Node label in a node record references data in the label store
Relationship type in a relationship record references data in the relationship type store
Both Node ID and Property ID are 4 bytes
  There is a maximum ID value: 2^32 - 1
ID is assigned and managed by the system
  The corresponding record will be stored at the computed offset
  The IDs of deleted nodes/relationships will be reused
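Because records have a fixed size, the offset of a record is simply its ID multiplied by the record size. A minimal sketch of that arithmetic follows; the class and method names are made up for illustration, and the 34-byte record size is an assumption chosen because it reproduces the offsets 340 and 680 shown for the relationship records in the "Doubly linked list" figure.

```java
final class RecordOffsetSketch {
    /** Largest value that fits in a 4-byte unsigned ID: 2^32 - 1. */
    static final long MAX_ID_4_BYTES = (1L << 32) - 1;

    /** Byte offset of a fixed-size record in its store file. */
    static long offsetOf(long id, int recordSizeBytes) {
        if (id < 0 || recordSizeBytes <= 0) {
            throw new IllegalArgumentException("id must be >= 0 and record size > 0");
        }
        return Math.multiplyExact(id, (long) recordSizeBytes);
    }

    public static void main(String[] args) {
        int relationshipRecordSize = 34; // assumed record size, see the note above
        System.out.println(offsetOf(10, relationshipRecordSize)); // 340
        System.out.println(offsetOf(20, relationshipRecordSize)); // 680
        System.out.println(MAX_ID_4_BYTES);                       // 4294967295
    }
}
```

No lookup structure is involved: locating a record costs one multiplication, which is why the ID can be reused as soon as the slot is free.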

Question Time

07-9

https://goo.gl/SqW4Iu

Node and Relationship structure

[Figure: node and relationship record structure for the example "Bob LIKES Alice"]

07-10

Doubly linked list

07-11

[Figure: node file and relationship file for a three-node example]
Node file: three person nodes with Name: Alice, Name: Bob, Name: Charlie (numbered 0, 1, 2 in the figure)
Relationship file: two records with Type: FOLLOWS and some property, at Offset 340 and Offset 680 (numbered 10 and 20 in the figure)
Creation order: node Alice, node Bob, node Charlie, relationship Alice --> Bob, relationship Alice --> Charlie
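The idea of the figure, that each node points to one relationship record and the remaining relationships are reached by following pointers stored in the relationship records, can be sketched as below. This is a simplified model under stated assumptions, not Neo4j's record layout: only the "next" pointers are modelled (the real chain also has "previous" pointers), new relationships are assumed to be inserted at the head of each chain, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RelationshipChainSketch {
    static final long NONE = -1;

    /** Simplified relationship record: two node IDs plus one next pointer per chain. */
    static final class RelRecord {
        final long startNode;
        final long endNode;
        long startNext = NONE; // next relationship in the start node's chain
        long endNext = NONE;   // next relationship in the end node's chain

        RelRecord(long startNode, long endNode) {
            this.startNode = startNode;
            this.endNode = endNode;
        }
    }

    // node ID -> ID of the first relationship in that node's chain
    private final Map<Long, Long> firstRel = new HashMap<>();
    // relationship ID -> record
    private final Map<Long, RelRecord> rels = new HashMap<>();

    void addNode(long nodeId) {
        firstRel.put(nodeId, NONE);
    }

    /** Links the new record in at the head of both nodes' chains (an assumption of this sketch). */
    void addRelationship(long relId, long start, long end) {
        RelRecord r = new RelRecord(start, end);
        r.startNext = firstRel.get(start);
        r.endNext = firstRel.get(end);
        firstRel.put(start, relId);
        firstRel.put(end, relId);
        rels.put(relId, r);
    }

    /** Walks the chain of one node without consulting any global index. */
    List<Long> relationshipsOf(long nodeId) {
        List<Long> result = new ArrayList<>();
        long current = firstRel.get(nodeId);
        while (current != NONE) {
            RelRecord r = rels.get(current);
            result.add(current);
            // follow the pointer that belongs to this node's chain
            current = (r.startNode == nodeId) ? r.startNext : r.endNext;
        }
        return result;
    }
}
```

With nodes 0 (Alice), 1 (Bob), 2 (Charlie) and relationships 10 (Alice to Bob) and 20 (Alice to Charlie) added in that order, relationshipsOf(0) visits 20 and then 10, while Bob's and Charlie's chains each contain a single record.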

Question Time

07-12

https://goo.gl/ZxIIBN

Question Time

07-13

https://goo.gl/kPYIIE

07-14

"The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record's location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j's commitment to high-performance traversals."

-- Chapter 6, Graph Databases

Outline

Neo4j Storage
Neo4j Query Plan and Indexing
Neo4j – Data Modeling
Neo4j – Programming APIs

07-15

Neo4j Query Execution

Each Neo4j query is turned into an execution plan by an execution planner
  Rule Strategy Planner
    Considers available indexes but does not use statistical information
  Cost Strategy Planner (default and in development)
    Uses statistical information to evaluate a few alternative plans
    E.g., if there are fewer Movie nodes than People nodes, a query involving both may get better performance if starting from a collection of Movie nodes
      • See example in lab
Query plan stages
  Starting point
  Expansion by matching the given path in the query statement
  Row filtering, skipping, sorting, projection, etc.
  Updating

07-16
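The Movie versus People example boils down to preferring the starting point with the smaller cardinality. A minimal sketch of that single decision follows; it is not the Neo4j planner, the class and method names are hypothetical, and the node counts used in the usage note are made up.

```java
import java.util.Map;

final class StartingPointSketch {
    /** Returns the label with the fewest nodes; on a tie the label seen first wins. */
    static String cheapestStartLabel(Map<String, Long> nodeCountByLabel) {
        String best = null;
        long bestCount = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : nodeCountByLabel.entrySet()) {
            if (best == null || entry.getValue() < bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no labels given");
        }
        return best;
    }
}
```

Given hypothetical statistics of 133 Person nodes and 38 Movie nodes, cheapestStartLabel picks Movie. A real cost-based planner weighs many more factors; this only shows why statistics matter at all.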

Query Plan: an example

07-17

Query

    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name

[Figures: the plan for this query as shown by EXPLAIN and by PROFILE]

Query Starting Points

Most queries start with one or a set of nodes, except if a relationship ID is specified
    MATCH (n1)-[r]->() WHERE id(r) = 0 RETURN r, n1
  This query will start by locating the first record in the relationship file
Query may start by scanning all nodes
    MATCH (n) RETURN (n)
    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name
Query may start by scanning all nodes belonging to a given label
    MATCH (p:Person {name: "Tom Hanks"}) RETURN p
  Labels are implicitly indexed
Query may start by using an index

07-18

Query Plan With Index

Neo4j supports indexes on properties of labelled nodes
Indexes behave similarly to those in relational systems
Create index
    CREATE INDEX ON :Person(name)
Drop index
    DROP INDEX ON :Person(name)

07-19

Query

    MATCH (bacon:Person {name: "Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood
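The difference between starting from a label scan and starting from an index on Person(name) can be sketched with plain collections. This is an in-memory illustration only, with hypothetical names; the position of a name in the list plays the role of the node ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IndexLookupSketch {
    /** Label scan: inspects the name of every node carrying the label. */
    static List<Integer> scan(List<String> names, String wanted) {
        List<Integer> hits = new ArrayList<Integer>();
        for (int id = 0; id < names.size(); id++) {
            if (names.get(id).equals(wanted)) {
                hits.add(id);
            }
        }
        return hits;
    }

    /** Builds a property index once: name -> IDs of the nodes with that name. */
    static Map<String, List<Integer>> buildIndex(List<String> names) {
        Map<String, List<Integer>> index = new HashMap<String, List<Integer>>();
        for (int id = 0; id < names.size(); id++) {
            index.computeIfAbsent(names.get(id), k -> new ArrayList<Integer>()).add(id);
        }
        return index;
    }

    /** Index seek: a single hash lookup instead of a pass over all nodes. */
    static List<Integer> lookup(Map<String, List<Integer>> index, String wanted) {
        return index.getOrDefault(wanted, new ArrayList<Integer>());
    }
}
```

Both paths return the same node IDs; the index trades extra storage and maintenance on writes for a starting point whose cost does not grow with the number of Person nodes.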

Transactions

Neo4j supports full ACID transactions
  Similar to those in RDBMS
Uses locking to ensure consistency
  Lock Manager manages locks held by a transaction
Logging
  Write Ahead Logging (WAL)
Transaction Commit Protocol
  Acquire locks (Atomicity, Consistency, Isolation)
  Write Undo and Redo records to the WAL
    A record is written to the log for each node, relationship and property changed
  Write commit record to the log and flush to disk (Durability)
  Release locks
Recovery – if the database server/machine crashes
  Apply log records to replay changes made by the transactions

07-20
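The logging and recovery steps can be sketched as follows. This is a deliberately reduced model and not Neo4j's implementation: it keeps only redo information (the new value of each change), omits locks and undo records, uses an in-memory list in place of the log file, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class WalSketch {
    /** One log entry: either a property change or the commit marker of a transaction. */
    static final class LogRecord {
        final long txId;
        final String key;      // null for commit markers
        final String newValue; // null for commit markers
        final boolean isCommit;

        LogRecord(long txId, String key, String newValue, boolean isCommit) {
            this.txId = txId;
            this.key = key;
            this.newValue = newValue;
            this.isCommit = isCommit;
        }
    }

    // Stands in for the log file; records are appended in write order.
    private final List<LogRecord> log = new ArrayList<LogRecord>();

    /** Step 1: describe the change in the log before touching the store. */
    void logChange(long txId, String key, String newValue) {
        log.add(new LogRecord(txId, key, newValue, false));
    }

    /** Step 2: the commit marker makes the transaction durable. */
    void logCommit(long txId) {
        log.add(new LogRecord(txId, null, null, true));
    }

    /** Recovery: replay, in log order, only changes of transactions that have a commit marker. */
    Map<String, String> recover() {
        Set<Long> committed = new HashSet<Long>();
        for (LogRecord r : log) {
            if (r.isCommit) {
                committed.add(r.txId);
            }
        }
        Map<String, String> store = new LinkedHashMap<String, String>();
        for (LogRecord r : log) {
            if (!r.isCommit && committed.contains(r.txId)) {
                store.put(r.key, r.newValue);
            }
        }
        return store;
    }
}
```

If transaction 1 logs a change and commits, and transaction 2 logs a change but crashes before its commit record, recover() rebuilds a store that contains only the change from transaction 1.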

Question Time

07-21

https://goo.gl/m8cOcz

07-22

Outline

Neo4j Storage
Neo4j Query Plan and Indexing
Neo4j – Data Modeling
Neo4j – Programming API

Graph Data Modelling

Graph data modelling is very closely related to domain modelling
You need to decide
  Node or Relationship
  Node or Property
  Label/Type or Property
Decisions are based on
  Features of entities in the application domain
  Your typical queries
  Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of Graph Databases book

Node vs Relationship

Nodes for Things, Relationships for Structure
  AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read

07-24

    MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})<-[:LIKES]-(:Reader)-[:LIKES]->(books:Book) RETURN books.title

Node vs Relationship

Model Facts as Nodes

07-25

Node or Property

Represent Complex Value Types as Nodes

07-26

Relationship Property or Relationship Type

E.g., the relationship between a user node and an address node can be
  typed as HOME_ADDRESS, BILLING_ADDRESS, or
  typed as generic ADDRESS and differentiated using a type property: {type: 'home'}, {type: 'billing'}
We use fine-grained relationships whenever we have a closed set of relationship types
  E.g., there is only a finite set of address types
  If a traversal would like to follow the generic type ADDRESS, we may have to use redundant relationships
    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)

07-27

07-28

Outline

Neo4j Storage
Neo4j Query Plan and Indexing
Neo4j – Data Modeling
Neo4j – Programming API

REST API

Programming Neo4j can be done in any language using the REST API
Default representation for request and response is JSON format
Service root: http://hostname:port/db/data
Node is represented by its id
    http://hostname:port/db/data/node/0
Relationship can be represented by its id or traversed from a node
    http://hostname:port/db/data/node/0/relationships/all
    http://localhost:7474/db/data/relationship/5

07-29

Transactional Cypher HTTP end point

This is the default way to interact with Neo4j using REST
Allows you to send Cypher queries in JSON as the body of a POST request within the scope of transactions
  Both read and write queries
Transactions can be open across a number of requests or just for one request
Single-request transaction is sent to
    http://hostname:port/db/data/transaction/commit
Multiple-request transaction end points
  Start
    http://hostname:port/db/data/transaction
  Sending queries in this transaction
    http://hostname:port/db/data/transaction/transaction_id
  Commit
    http://hostname:port/db/data/transaction/transaction_id/commit

07-30
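The end points listed on this slide follow one pattern under the service root, which a small helper can make explicit. The sketch below only builds URL strings and a request body; it sends nothing over the network. The class and method names are hypothetical, and the JSON escaping is deliberately minimal (backslash and double quote only).

```java
final class TxEndpointSketch {
    private final String serviceRoot; // e.g. http://localhost:7474/db/data

    TxEndpointSketch(String serviceRoot) {
        // tolerate a trailing slash so that paths are joined with exactly one "/"
        this.serviceRoot = serviceRoot.endsWith("/")
                ? serviceRoot.substring(0, serviceRoot.length() - 1)
                : serviceRoot;
    }

    /** POST here to run statements and commit in a single request. */
    String singleRequest() {
        return serviceRoot + "/transaction/commit";
    }

    /** POST here to open a transaction that spans several requests. */
    String begin() {
        return serviceRoot + "/transaction";
    }

    /** POST further statements of an open transaction here. */
    String execute(long transactionId) {
        return serviceRoot + "/transaction/" + transactionId;
    }

    /** POST here to commit an open transaction. */
    String commit(long transactionId) {
        return serviceRoot + "/transaction/" + transactionId + "/commit";
    }

    /** Wraps one Cypher statement in the "statements" JSON body. */
    static String statementsBody(String cypher) {
        String escaped = cypher.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statements\":[{\"statement\":\"" + escaped + "\"}]}";
    }
}
```

For the service root http://localhost:7474/db/data and a transaction numbered 7, commit(7) yields http://localhost:7474/db/data/transaction/7/commit. In practice the server returns the commit URL in its response, so a client would normally read it from there rather than build it.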

Single Request Transaction Example

REQUEST
    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements" : [ {
        "statement" : "CREATE (n) RETURN id(n)"
      } ]
    }

RESPONSE
    200 OK
    Content-Type: application/json

    {
      "results" : [ {
        "columns" : [ "id(n)" ],
        "data" : [ {
          "row" : [ 18 ]
        } ]
      } ],
      "errors" : [ ]
    }

07-31

Page 10: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node and Relationship structure

Bob LIKES Alice

07-10

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Doubly linked list

07-11

0

1 2

10 20

1 1 1

Node file NameAlice NameBob NameCharlie

10 10 20

Relationship file

1

Offset 340

0 1 20

TypeFOLLOWS Some property

1

Offset 680

0 2 10

Some propertyTypeFOLLOWS

person personperson

Creation order node Alice node Bob node Charlie relationship Alice --gt Bob relationship Alice --gt Charlie

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-12

httpsgooglZxIIBN

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-13

httpsgooglkPYIIE

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-14

ldquoThe node and relationship stores are concerned only with the structure of the graph not its property data Both stores use fixed-sized records so that any individual recordrsquos location within a store file can be rapidly computed given its ID These are critical design decisions that underline Neo4jrsquos commitment to high-performance traversalsrdquo

-- Chapter 6 Graph Databases

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

07-15

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

07-16

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan an example

07-17

Query

MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors)RETURN directorsname

explain

profile

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Starting Points

Most queries start with one or a set of nodes except if a relationship ID is specified MATCH (n1)-[r]-gt() WHERE id(r)= 0 RETURN r n1 This query will start from locating the first record in the relationship

file

Query may start by scanning all nodes MATCH(n) RETURN (n) MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors) RETURN

directorsname

Query may start by scanning all nodes belonging to a given label MATCH (pPersonnamerdquoTom Hanksrdquo) return p Labels are implicitly indexed

Query may start by using index

07-18

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactions Neo4j supports full ACID transactions

Similar to those in RDBMS Uses locking to ensure consistency

Lock Manager manages locks held by a transaction Logging

Write Ahead Logging (WAL) Transaction Commit Protocol

Acquire locks (Atomicity Consistency Isolation) Write Undo and Redo records to the WAL

for each node relationship property changed is written to the log Write commit record to the log and flush to disk (Durability) Release locks

Recovery ndash if the database servermachine crashes Apply log records to replay changes made by the transactions

07-20

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-21

httpsgooglm8cOcz

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-22

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Graph Data Modelling

Graph data modelling is very closely related with domain modelling

You need to decide Node or Relationship Node or Property LabelType or Property

Decisions are based on Features of entities in application domain Your typical queries Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Nodes for Things Relationship for Structures AS A reader who likes a book I WANT to know which books other

readers who like the same book have liked SO THAT I can find other books to read

07-24

MATCH (Reader nameAlice)-[LIKES]-gt(Book titleDune)lt-[LIKES]-(Reader)-[LIKES]-gt(booksBook)RETURN bookstitle

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Model Facts as Nodes

07-25

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node or Property

Represent Complex Value Types as Nodes

07-26

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship Property or Relationship Type

• E.g. the relationship between a user node and an address node can be
  • typed as HOME_ADDRESS, BILLING_ADDRESS, or
  • typed as a generic ADDRESS and differentiated using a type property: {type:'home'}, {type:'billing'}
• We use fine-grained relationships whenever we have a closed set of relationship types
  • E.g. there is only a finite set of address types
• If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships
  • MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
  • MATCH (user)-[:ADDRESS]->(address)
  • MATCH (user:User)-[r]->(address:Address)

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming API

REST API

• Neo4j can be programmed from any language through its REST API
• The default representation for requests and responses is JSON
• Service root: http://hostname:port/db/data
• A node is represented by its id
  • http://hostname:port/db/data/node/0
• A relationship can be represented by its id or traversed from a node
  • http://hostname:port/db/data/node/0/relationships/all
  • http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP end point

• This is the default way to interact with Neo4j using REST
• Allows you to send Cypher queries in JSON as the body of a POST request within the scope of transactions
  • Both read and write queries
• Transactions can be kept open across a number of requests or used for just one request
• A single-request transaction is sent to
  • http://hostname:port/db/data/transaction/commit
• Multiple-request transaction end points
  • Start: http://hostname:port/db/data/transaction
  • Send queries in this transaction: http://hostname:port/db/data/transaction/transaction_id
  • Commit: http://hostname:port/db/data/transaction/transaction_id/commit
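The end points above follow one pattern, which a small helper can capture. This is only a sketch: the class and method names are hypothetical, and it assumes the transaction id appears in the URL as a number.

```java
// Hypothetical helper that assembles the end point URLs listed on the slide.
class TransactionEndpointsSketch {
    private final String base; // service root, e.g. http://localhost:7474/db/data

    TransactionEndpointsSketch(String host, int port) {
        this.base = "http://" + host + ":" + port + "/db/data";
    }

    // One-shot transaction: begin, run and commit in a single POST.
    String singleRequest() {
        return base + "/transaction/commit";
    }

    // Opens a transaction that stays open across requests.
    String begin() {
        return base + "/transaction";
    }

    // Runs further statements inside an open transaction (txId is assumed numeric).
    String run(long txId) {
        return base + "/transaction/" + txId;
    }

    // Commits an open transaction.
    String commit(long txId) {
        return base + "/transaction/" + txId + "/commit";
    }

    public static void main(String[] args) {
        TransactionEndpointsSketch urls = new TransactionEndpointsSketch("localhost", 7474);
        System.out.println(urls.begin());
        System.out.println(urls.run(7));
        System.out.println(urls.commit(7));
    }
}
```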

Single Request Transaction Example

REQUEST
POST http://localhost:7474/db/data/transaction/commit
Accept: application/json; charset=UTF-8
Content-Type: application/json

{
  "statements" : [ {
    "statement" : "CREATE (n) RETURN id(n)"
  } ]
}

RESPONSE
200 OK
Content-Type: application/json

{
  "results" : [ {
    "columns" : [ "id(n)" ],
    "data" : [ {
      "row" : [ 18 ]
    } ]
  } ],
  "errors" : [ ]
}
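A request like the one in this example could be assembled from Java with the JDK's `java.net.http` classes (Java 11 or later). The sketch below only builds the request and does not send it; the class name, the method names and the hand-rolled JSON escaping are assumptions for illustration, and a real client would normally use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds the POST for a single-request transaction without sending it.
class SingleRequestSketch {

    // Wraps one Cypher statement in the "statements" envelope shown on the slide.
    static String payload(String cypher) {
        // Minimal escaping of backslashes and double quotes; not a full JSON encoder.
        String escaped = cypher.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statements\":[{\"statement\":\"" + escaped + "\"}]}";
    }

    // Builds the HTTP request with the headers used in the example.
    static HttpRequest build(String endpoint, String cypher) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(cypher)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                "http://localhost:7474/db/data/transaction/commit",
                "CREATE (n) RETURN id(n)");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(payload("CREATE (n) RETURN id(n)"));
    }
}
```

Sending the request would additionally need an `HttpClient` and a running server at the assumed address.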

Java APIs

Page 158 of Graph Databases book

Core API

• For embedded Neo4j, instantiate an embedded graph database

    GraphDatabaseService graphDB =
        new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

• Shut down the service before exiting the program

    graphDB.shutdown();

Node

• Node abstraction
  • All nodes are of the type Node, which is an interface extending another interface, PropertyContainer
• Create a new node
  • Nodes are created by invoking the GraphDatabaseService.createNode() method

    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    Node nodeWithLabel = graphDB.createNode(Label... labels);   // varargs signature: pass one or more labels

• Set node properties

    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

• Relationship abstraction
  • All relationships are of type Relationship, which is an interface extending another interface, PropertyContainer
• Relationship types can be defined using Java enums

    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }

• Create a relationship

    Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

• Set relationship properties

    knows.setProperty("since", 1965);

Find Node/Relationships

• An individual node or relationship can be retrieved by its ID
  • Node getNodeById(long id)
  • Relationship getRelationshipById(long id)
  • Not recommended, because an ID may be reused by another node or relationship
• Nodes can be retrieved by label and property value
  • ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)
• Relationships of a node can be retrieved by type and/or direction
  • Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API – Deleting

• Delete the relationships of a node before deleting the node
• Delete a relationship

    Relationship rel;
    rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

• Delete a node

    firstNode.delete();

Java API - Transactions

• All operations are performed in the context of a transaction

    try ( Transaction tx = graphDB.beginTx() ) {
        // operations on the graph
        tx.success();
    }

• If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException

References

• Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
  • You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  • Chapter 4, Chapter 6
• Neo4j – Reference Manual
  • http://neo4j.com/docs/stable


Doubly linked list

[Figure: node file records for three person nodes (IDs 0, 1, 2; Name:Alice, Name:Bob, Name:Charlie) and relationship file records for two FOLLOWS relationships (IDs 10 and 20, at offsets 340 and 680), each with some property]

Creation order: node Alice, node Bob, node Charlie, relationship Alice --> Bob, relationship Alice --> Charlie
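The idea behind the relationship chain can be sketched with in-memory stand-ins for the store records. Everything here is an assumption made for illustration: the class and field names are hypothetical, the pointer values are one reading of the figure (nodes 0, 1, 2; relationships 10 and 20), and only the "next" pointers are modelled, whereas the real records also carry "previous" pointers, which is what makes the list doubly linked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-ins; real Neo4j records are packed bytes in store files.
class RelationshipChainSketch {
    static final long NONE = -1L; // assumed marker for "no further record"

    // Simplified node record: only the id of the first relationship in its chain.
    static final class NodeRecord {
        final long firstRel;
        NodeRecord(long firstRel) { this.firstRel = firstRel; }
    }

    // Simplified relationship record: both end nodes and the next relationship for each end.
    static final class RelRecord {
        final long firstNode;
        final long secondNode;
        final long firstNext;
        final long secondNext;
        RelRecord(long firstNode, long secondNode, long firstNext, long secondNext) {
            this.firstNode = firstNode;
            this.secondNode = secondNode;
            this.firstNext = firstNext;
            this.secondNext = secondNext;
        }
    }

    // Walks a node's relationship chain by following pointers, with no global index lookup.
    static List<Long> relationshipsOf(long nodeId, Map<Long, NodeRecord> nodes, Map<Long, RelRecord> rels) {
        List<Long> chain = new ArrayList<>();
        long relId = nodes.get(nodeId).firstRel;
        while (relId != NONE) {
            chain.add(relId);
            RelRecord rel = rels.get(relId);
            // Pick the "next" pointer that belongs to the node being traversed.
            relId = (rel.firstNode == nodeId) ? rel.firstNext : rel.secondNext;
        }
        return chain;
    }

    // Alice = 0, Bob = 1, Charlie = 2 (assumed pointer values).
    static Map<Long, NodeRecord> exampleNodes() {
        Map<Long, NodeRecord> nodes = new HashMap<>();
        nodes.put(0L, new NodeRecord(10L));
        nodes.put(1L, new NodeRecord(10L));
        nodes.put(2L, new NodeRecord(20L));
        return nodes;
    }

    // Relationship 10: Alice --> Bob, relationship 20: Alice --> Charlie.
    static Map<Long, RelRecord> exampleRels() {
        Map<Long, RelRecord> rels = new HashMap<>();
        rels.put(10L, new RelRecord(0L, 1L, 20L, NONE));
        rels.put(20L, new RelRecord(0L, 2L, NONE, NONE));
        return rels;
    }

    public static void main(String[] args) {
        System.out.println(relationshipsOf(0L, exampleNodes(), exampleRels()));
    }
}
```

Each hop costs one record lookup, which is why traversal time depends on the size of the neighbourhood rather than on the total size of the graph.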

Question Time

https://goo.gl/ZxIIBN

Question Time

https://goo.gl/kPYIIE

"The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record's location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j's commitment to high-performance traversals."

-- Chapter 6, Graph Databases
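The "rapidly computed" location is just a multiplication. The doubly-linked-list example places relationship 10 at offset 340 and relationship 20 at offset 680, which is consistent with a 34-byte relationship record. The sketch below assumes that record size; the class and method names are hypothetical.

```java
// Sketch of locating a fixed-size record from its id; names are illustrative only.
class RecordOffsetSketch {
    // Assumed size, inferred from the slide's example (id 10 at offset 340).
    static final int RELATIONSHIP_RECORD_BYTES = 34;

    // Byte offset of a record within its store file: id multiplied by the record size.
    static long offsetOf(long id, int recordBytes) {
        if (id < 0 || recordBytes <= 0) {
            throw new IllegalArgumentException("id must be >= 0 and record size must be > 0");
        }
        return id * recordBytes;
    }

    public static void main(String[] args) {
        System.out.println(offsetOf(10, RELATIONSHIP_RECORD_BYTES));
        System.out.println(offsetOf(20, RELATIONSHIP_RECORD_BYTES));
    }
}
```

No search or index is involved, so the cost of finding a record does not grow with the size of the store file.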

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming APIs

Neo4j Query Execution

• Each Neo4j query is turned into an execution plan by an execution planner
  • Rule Strategy Planner
    • Considers available indexes but does not use statistical information
  • Cost Strategy Planner (default and in development)
    • Uses statistical information to evaluate a few alternative plans
    • E.g. if there are fewer Movie nodes than People nodes, a query involving both may perform better if it starts from the collection of Movie nodes
      • See example in lab
• Query plan stages
  • Starting point
  • Expansion by matching the given path in the query statement
  • Row filtering, skipping, sorting, projection, etc.
  • Updating
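The Movie/People example boils down to choosing the cheapest starting point from statistics. The toy sketch below shows only that one decision; it is not Neo4j's planner, and the class name, method name and node counts are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of a cost-based choice of starting label; not Neo4j's planner.
class StartingPointSketch {

    // Returns the label with the fewest nodes; ties are broken alphabetically.
    static String cheapestLabel(Map<String, Long> nodeCounts) {
        String best = null;
        long bestCount = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : nodeCounts.entrySet()) {
            long count = entry.getValue();
            boolean better = count < bestCount
                    || (count == bestCount && best != null && entry.getKey().compareTo(best) < 0);
            if (best == null || better) {
                best = entry.getKey();
                bestCount = count;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no labels given");
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("Movie", 38L);   // hypothetical statistics
        counts.put("Person", 133L); // hypothetical statistics
        System.out.println(cheapestLabel(counts));
    }
}
```

A rule-based planner would make the same choice regardless of the counts, which is the difference the slide draws between the two strategies.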

Query Plan: an example

Query

MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name

[Figures: plans shown by explain and profile for this query]

Query Starting Points

• Most queries start with one node or a set of nodes, except when a relationship ID is specified
  • MATCH (n1)-[r]->() WHERE id(r) = 0 RETURN r, n1
  • This query starts by locating the first record in the relationship file
• A query may start by scanning all nodes
  • MATCH (n) RETURN (n)
  • MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name
• A query may start by scanning all nodes belonging to a given label
  • MATCH (p:Person {name: "Tom Hanks"}) RETURN p
  • Labels are implicitly indexed
• A query may start by using an index

Query Plan With Index

• Neo4j supports indexes on properties of labelled nodes
• Indexes behave similarly to those in relational systems
• Create index
  • CREATE INDEX ON :Person(name)
• Drop index
  • DROP INDEX ON :Person(name)

Query

MATCH (bacon:Person {name: "Kevin Bacon"})-[*1..4]-(hollywood) RETURN DISTINCT hollywood

Transactions

• Neo4j supports full ACID transactions
  • Similar to those in RDBMS
• Uses locking to ensure consistency
  • The Lock Manager manages locks held by a transaction
• Logging
  • Write Ahead Logging (WAL)
• Transaction commit protocol
  • Acquire locks (Atomicity, Consistency, Isolation)
  • Write Undo and Redo records to the WAL
    • A record is written to the log for each node, relationship or property changed
  • Write the commit record to the log and flush to disk (Durability)
  • Release locks
• Recovery – if the database server/machine crashes
  • Apply log records to replay changes made by the transactions
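The recovery step can be illustrated with a toy redo-only log. This is a deliberately simplified sketch, not Neo4j's log format: the class, record layout and keys are hypothetical, undo records are left out, and recovery rebuilds the state from an empty store by replaying only the changes of transactions that reached their commit record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy write-ahead log with redo-only recovery; illustrative names throughout.
class WalSketch {

    // One log entry: either a property change or the commit marker of a transaction.
    static final class LogRecord {
        final long txId;
        final String key;
        final String newValue;
        final boolean commit;

        private LogRecord(long txId, String key, String newValue, boolean commit) {
            this.txId = txId;
            this.key = key;
            this.newValue = newValue;
            this.commit = commit;
        }

        static LogRecord change(long txId, String key, String newValue) {
            return new LogRecord(txId, key, newValue, false);
        }

        static LogRecord commit(long txId) {
            return new LogRecord(txId, null, null, true);
        }
    }

    // Replays, in log order, the changes of every transaction that has a commit record.
    static Map<String, String> recover(List<LogRecord> log) {
        Set<Long> committed = new HashSet<>();
        for (LogRecord record : log) {
            if (record.commit) {
                committed.add(record.txId);
            }
        }
        Map<String, String> store = new HashMap<>();
        for (LogRecord record : log) {
            if (!record.commit && committed.contains(record.txId)) {
                store.put(record.key, record.newValue);
            }
        }
        return store;
    }

    // Transaction 1 committed before the crash; transaction 2 did not.
    static List<LogRecord> exampleLog() {
        List<LogRecord> log = new ArrayList<>();
        log.add(LogRecord.change(1, "node0.name", "Alice"));
        log.add(LogRecord.commit(1));
        log.add(LogRecord.change(2, "node1.name", "Bob"));
        return log;
    }

    public static void main(String[] args) {
        System.out.println(recover(exampleLog()));
    }
}
```

Because the commit record is flushed before locks are released, any transaction whose effects were visible to others is guaranteed to be replayable after a crash.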



COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-13

httpsgooglkPYIIE

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-14

ldquoThe node and relationship stores are concerned only with the structure of the graph not its property data Both stores use fixed-sized records so that any individual recordrsquos location within a store file can be rapidly computed given its ID These are critical design decisions that underline Neo4jrsquos commitment to high-performance traversalsrdquo

-- Chapter 6 Graph Databases

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming APIs

07-15

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

Page 14: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

“The node and relationship stores are concerned only with the structure of the graph, not its property data. Both stores use fixed-sized records so that any individual record’s location within a store file can be rapidly computed given its ID. These are critical design decisions that underline Neo4j’s commitment to high-performance traversals.”

-- Chapter 6, Graph Databases
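
The offset computation that the quote alludes to can be sketched as follows. This is only an illustration: `RecordLocator` is a hypothetical name rather than a Neo4j class, and the record size passed in is an assumption, since the real sizes depend on the Neo4j version.

```java
import java.util.Arrays;

// Hypothetical sketch: locating a fixed-size record in a store file by its ID.
// The record size is an assumed parameter; it is not taken from Neo4j itself.
final class RecordLocator {
    private final int recordSize;

    RecordLocator(int recordSize) {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("recordSize must be positive");
        }
        this.recordSize = recordSize;
    }

    // Byte offset of the record with the given ID: one multiplication, no search.
    long offsetOf(long id) {
        if (id < 0) {
            throw new IllegalArgumentException("id must be non-negative");
        }
        return id * recordSize;
    }

    // Copies the raw bytes of one record out of an in-memory image of a store file.
    byte[] read(byte[] storeFile, long id) {
        int from = Math.toIntExact(offsetOf(id));
        return Arrays.copyOfRange(storeFile, from, from + recordSize);
    }
}
```

Because every record has the same size, the cost of finding a record does not depend on how many records the file holds.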

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming APIs

Neo4j Query Execution

• Each Neo4j query is turned into an execution plan by an execution planner
  • Rule Strategy Planner
    • Considers available indexes but does not use statistical information
  • Cost Strategy Planner (default, and in development)
    • Uses statistical information to evaluate a few alternative plans
    • E.g. if there are fewer Movie nodes than People nodes, a query involving both may perform better if it starts from the collection of Movie nodes
      • See the example in the lab
• Query plan stages
  • Starting point
  • Expansion by matching the given path in the query statement
  • Row filtering, skipping, sorting, projection, etc.
  • Updating
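
The Movie/People example can be illustrated with a very small sketch of one idea behind cost-based planning: given node counts per label, start from the label with the fewest nodes. `StartingPointChooser` and its input map are hypothetical; this is not Neo4j's planner, which weighs far more than label counts.

```java
import java.util.Map;

// Hypothetical sketch: pick the starting label with the fewest nodes.
// The label counts stand in for the statistics a cost-based planner would keep.
final class StartingPointChooser {

    // labelCounts maps a label name to the (assumed) number of nodes carrying it.
    static String cheapestLabel(Map<String, Long> labelCounts) {
        String best = null;
        long bestCount = 0;
        for (Map.Entry<String, Long> entry : labelCounts.entrySet()) {
            if (best == null || entry.getValue() < bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no labels to choose from");
        }
        return best;
    }
}
```

A rule-based planner, by contrast, would make the same choice regardless of how many nodes each label has.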

Query Plan: an example

Query

    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors)
    RETURN directors.name

[Figures: the execution plans shown by EXPLAIN and by PROFILE for this query]

Query Starting Points

• Most queries start with one node or a set of nodes, except when a relationship ID is specified
    MATCH (n1)-[r]->() WHERE id(r) = 0 RETURN r, n1
  • This query starts by locating the first record in the relationship file
• A query may start by scanning all nodes
    MATCH (n) RETURN n
    MATCH (cloudAtlas {title: "Cloud Atlas"})<-[:DIRECTED]-(directors) RETURN directors.name
• A query may start by scanning all nodes belonging to a given label
    MATCH (p:Person {name: "Tom Hanks"}) RETURN p
  • Labels are implicitly indexed
• A query may start by using an index

Query Plan With Index

• Neo4j supports indexes on properties of labelled nodes
• Indexes behave similarly to those in relational systems
• Create index
    CREATE INDEX ON :Person(name)
• Drop index
    DROP INDEX ON :Person(name)

Query

    MATCH (bacon:Person {name: "Kevin Bacon"})-[*1..4]-(hollywood)
    RETURN DISTINCT hollywood
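
Why an index changes the starting point can be sketched with an in-memory model. Everything here is an assumption made for illustration: `NameIndexDemo` is a hypothetical class, and a hash map is used as the index only because it is simple, not because Neo4j's indexes are built that way.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of Person nodes with a "name" property.
final class NameIndexDemo {
    private final List<String> namesById = new ArrayList<>();          // node ID -> name
    private final Map<String, List<Integer>> index = new HashMap<>();  // name -> node IDs

    // Adds a Person node and keeps the index in step with the node list.
    int addPerson(String name) {
        int id = namesById.size();
        namesById.add(name);
        index.computeIfAbsent(name, k -> new ArrayList<>()).add(id);
        return id;
    }

    // Label scan: inspects every Person node, so the cost grows with the node count.
    List<Integer> findByScan(String name) {
        List<Integer> hits = new ArrayList<>();
        for (int id = 0; id < namesById.size(); id++) {
            if (namesById.get(id).equals(name)) {
                hits.add(id);
            }
        }
        return hits;
    }

    // Index seek: a single lookup that does not visit the other nodes.
    List<Integer> findByIndex(String name) {
        return index.getOrDefault(name, Collections.emptyList());
    }
}
```

Both methods return the same node IDs; the difference is how many nodes are touched before the traversal from Kevin Bacon can begin.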

Transactions

• Neo4j supports full ACID transactions
  • Similar to those in an RDBMS
• Uses locking to ensure consistency
  • The Lock Manager manages locks held by a transaction
• Logging
  • Write-Ahead Logging (WAL)
• Transaction commit protocol
  • Acquire locks (Atomicity, Consistency, Isolation)
  • Write Undo and Redo records to the WAL
    • A record is written to the log for each node, relationship, and property changed
  • Write the commit record to the log and flush to disk (Durability)
  • Release locks
• Recovery – if the database server/machine crashes
  • Apply log records to replay changes made by the transactions
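
The commit protocol and recovery step above can be sketched in a few lines. This is a simplified, hypothetical model (`WalDemo` and all of its members are invented for illustration): a single lock stands in for the lock manager, an in-memory list stands in for the log file, and no real flush to disk takes place.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of write-ahead logging; not Neo4j's implementation.
final class WalDemo {
    // One log record: a change (key, undo value, redo value) or a commit marker (key == null).
    private static final class LogRecord {
        final int txId; final String key; final String undo; final String redo;
        LogRecord(int txId, String key, String undo, String redo) {
            this.txId = txId; this.key = key; this.undo = undo; this.redo = redo;
        }
        boolean isCommit() { return key == null; }
    }

    private final Map<String, String> store = new HashMap<>(); // stand-in for the store files
    private final List<LogRecord> log = new ArrayList<>();     // stand-in for the log file
    private final ReentrantLock lock = new ReentrantLock();    // stand-in for the lock manager

    // Writes one undo/redo record per changed item, before the store is touched.
    private void appendChanges(int txId, Map<String, String> changes) {
        for (Map.Entry<String, String> c : changes.entrySet()) {
            log.add(new LogRecord(txId, c.getKey(), store.get(c.getKey()), c.getValue()));
        }
    }

    void commit(int txId, Map<String, String> changes) {
        lock.lock();                                      // acquire locks
        try {
            appendChanges(txId, changes);                 // undo/redo records go to the log
            log.add(new LogRecord(txId, null, null, null)); // commit record (a real system flushes here)
            store.putAll(changes);                        // only now apply to the store
        } finally {
            lock.unlock();                                // release locks
        }
    }

    // Simulates a crash after the change records were logged but before the commit record.
    void crashBeforeCommit(int txId, Map<String, String> changes) {
        appendChanges(txId, changes);
    }

    // Recovery: replay, in log order, the redo values of committed transactions only.
    Map<String, String> recover() {
        Set<Integer> committed = new HashSet<>();
        for (LogRecord r : log) {
            if (r.isCommit()) committed.add(r.txId);
        }
        Map<String, String> rebuilt = new HashMap<>();
        for (LogRecord r : log) {
            if (!r.isCommit() && committed.contains(r.txId)) rebuilt.put(r.key, r.redo);
        }
        return rebuilt;
    }

    Map<String, String> store() { return store; }
}
```

Because the commit record is written only after all change records, recovery can tell which transactions to replay and which to ignore.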

Question Time

https://goo.gl/m8cOcz

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming API

Graph Data Modelling

• Graph data modelling is very closely related to domain modelling
• You need to decide
  • Node or Relationship
  • Node or Property
  • Label/Type or Property
• Decisions are based on
  • Features of entities in the application domain
  • Your typical queries
  • Features and constraints of the underlying storage system

Slides 18-22 are based on Chapter 4 of Graph Databases book

Node vs Relationship

• Nodes for Things, Relationships for Structure
• AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read

    MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})
          <-[:LIKES]-(:Reader)-[:LIKES]->(books:Book)
    RETURN books.title
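
To see what path this query walks, the same three hops can be followed by hand over an in-memory model. `LikesGraph` is a hypothetical class written for this illustration, it assumes at most one LIKES relationship per reader and book, and it returns distinct titles, whereas the Cypher query as written can return a title once per matching path.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of (:Reader)-[:LIKES]->(:Book).
final class LikesGraph {
    private final Map<String, Set<String>> likesByReader = new HashMap<>(); // reader -> books
    private final Map<String, Set<String>> likedByBook = new HashMap<>();   // book -> readers

    // Records one LIKES relationship in both directions, like a traversable edge.
    void like(String reader, String book) {
        likesByReader.computeIfAbsent(reader, k -> new TreeSet<>()).add(book);
        likedByBook.computeIfAbsent(book, k -> new TreeSet<>()).add(reader);
    }

    // Follows reader -> book <- other readers -> their books.
    Set<String> alsoLiked(String reader, String book) {
        Set<String> result = new TreeSet<>();
        Set<String> fans = likedByBook.getOrDefault(book, Collections.emptySet());
        if (!fans.contains(reader)) {
            return result; // the first hop (reader)-[:LIKES]->(book) does not match
        }
        for (String other : fans) {
            if (other.equals(reader)) {
                continue; // the same relationship cannot be used twice in one path
            }
            for (String title : likesByReader.get(other)) {
                if (!title.equals(book)) {
                    result.add(title); // the hop back to the starting book is excluded too
                }
            }
        }
        return result;
    }
}
```

Only relationships are followed at each step, which is why modelling LIKES as a relationship makes this question cheap to answer.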

Node vs Relationship

• Model Facts as Nodes

Node or Property

• Represent Complex Value Types as Nodes

Relationship Property or Relationship Type

• E.g. the relationship between a user node and an address node can be
  • typed as HOME_ADDRESS, BILLING_ADDRESS, or
  • typed as a generic ADDRESS and differentiated using a type property
    • {type: 'home'}, {type: 'billing'}
• We use fine-grained relationships whenever we have a closed set of relationship types
  • E.g. there is only a finite set of address types
  • If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships
      MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
      MATCH (user)-[:ADDRESS]->(address)
      MATCH (user:User)-[r]->(address:Address)

Outline

• Neo4j Storage
• Neo4j Query Plan and Indexing
• Neo4j – Data Modeling
• Neo4j – Programming API

REST API

• Neo4j can be programmed from any language using the REST API
• The default representation for requests and responses is JSON
• Service root
    http://hostname:port/db/data/
• A node is represented by its id
    http://hostname:port/db/data/node/0
• A relationship can be represented by its id or traversed from a node
    http://hostname:port/db/data/node/0/relationships/all
    http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP endpoint

• This is the default way to interact with Neo4j using REST
• Allows you to send Cypher queries in JSON as the body of a POST request, within the scope of transactions
  • Both read and write queries
• Transactions can be kept open across a number of requests or used for just one request
• A single-request transaction is sent to
    http://hostname:port/db/data/transaction/commit
• Multiple-request transaction endpoints
  • Start
      http://hostname:port/db/data/transaction
  • Send queries in this transaction
      http://hostname:port/db/data/transaction/{transaction_id}
  • Commit
      http://hostname:port/db/data/transaction/{transaction_id}/commit

Single Request Transaction Example

REQUEST
    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements" : [ {
        "statement" : "CREATE (n) RETURN id(n)"
      } ]
    }

RESPONSE
    200 OK
    Content-Type: application/json

    {
      "results" : [ {
        "columns" : [ "id(n)" ],
        "data" : [ {
          "row" : [ 18 ]
        } ]
      } ],
      "errors" : [ ]
    }
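
The endpoint paths and the request body shown on these slides can be assembled with a small helper. `CypherTx` is a hypothetical class written for this illustration; it only builds strings and sends nothing. Its escaping covers backslashes and double quotes only, so a real client would be better served by a JSON library. In practice the server's response to the start request normally carries the transaction URL, so building it from an ID is shown only to make the path layout concrete.

```java
// Hypothetical helper that builds URLs and JSON bodies for the
// transactional Cypher HTTP endpoint. It performs no network I/O.
final class CypherTx {
    private final String root; // service root, e.g. http://localhost:7474/db/data

    CypherTx(String serviceRoot) {
        // Drop a trailing slash so that paths can be appended uniformly.
        this.root = serviceRoot.endsWith("/")
                ? serviceRoot.substring(0, serviceRoot.length() - 1)
                : serviceRoot;
    }

    // Begin and commit in a single request.
    String singleRequestUrl() { return root + "/transaction/commit"; }

    // Start a transaction that stays open across requests.
    String beginUrl() { return root + "/transaction"; }

    // Send further statements inside an open transaction.
    String statementUrl(long txId) { return root + "/transaction/" + txId; }

    // Commit an open transaction.
    String commitUrl(long txId) { return root + "/transaction/" + txId + "/commit"; }

    // JSON body of the form {"statements":[{"statement":"..."}, ...]}.
    static String payload(String... statements) {
        StringBuilder sb = new StringBuilder("{\"statements\":[");
        for (int i = 0; i < statements.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"statement\":\"").append(escape(statements[i])).append("\"}");
        }
        return sb.append("]}").toString();
    }

    // Minimal escaping (backslash and double quote only); an assumption of this sketch.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

For the request on the slide, the body would be `CypherTx.payload("CREATE (n) RETURN id(n)")`, posted to `singleRequestUrl()`.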

Java APIs

Page 158 of Graph Databases book

Core API

• For embedded Neo4j
• Instantiate an embedded graph database
    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);
• Shut down the service before exiting the program
    graphDB.shutdown();

Node

• Node abstraction
  • All nodes are of the type Node, which is an interface extending another interface, PropertyContainer
• Create a new node
  • Nodes are created by invoking the GraphDatabaseService.createNode() method
    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
  • An overload also assigns labels to the new node
    Node createNode(Label... labels)
• Set node properties
    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

• Relationship abstraction
  • All relationships are of type Relationship, which is an interface extending another interface, PropertyContainer
• Relationship types can be defined using Java enums
    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }
• Create a relationship
    Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);
• Set relationship properties
    knows.setProperty("since", 1965);

Find Node/Relationships

• An individual node or relationship can be retrieved by its ID
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
  • Not recommended, because IDs may be reused by other nodes
• Nodes can be retrieved by label and property values
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)
• Relationships of a node can be retrieved by type and/or direction
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API – Deleting

• Delete the relationships of a node before deleting the node
• Delete a relationship
    Relationship rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();
• Delete a node
    firstNode.delete();

Java API - Transactions

• All operations are performed in the context of a transaction
    try ( Transaction tx = graphDB.beginTx() ) {
        // operations on the graph
        tx.success();
    }
• If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException

References

• Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O’Reilly Media Inc., June 5
  • You can download this book from the Neo4j site; http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  • Chapter 4, Chapter 6
• Neo4j – Reference Manual
  • http://neo4j.com/docs/stable/

Page 16: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Neo4j Query Execution

Each Neo4j Query is turned into an execution plan by a execution planner Rule Strategy Planner

Consider available indexes but does not use statistical information Cost Strategy Planner (default and in development)

Use statistic information to evaluate a few alternative plans Eg If there are less Movie nodes than People nodes a query involving

both may get better performance if starting from a collection of Movie nodes bull See example in lab

Query plan stages Starting point Expansion by matching given path in the query statement Row filtering skipping sorting projection etchellip Updating

07-16

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan an example

07-17

Query

MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors)RETURN directorsname

explain

profile

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Starting Points

Most queries start with one or a set of nodes except if a relationship ID is specified MATCH (n1)-[r]-gt() WHERE id(r)= 0 RETURN r n1 This query will start from locating the first record in the relationship

file

Query may start by scanning all nodes MATCH(n) RETURN (n) MATCH (cloudAtlas title Cloud Atlas)lt-[DIRECTED]-(directors) RETURN

directorsname

Query may start by scanning all nodes belonging to a given label MATCH (pPersonnamerdquoTom Hanksrdquo) return p Labels are implicitly indexed

Query may start by using index

07-18

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactions Neo4j supports full ACID transactions

Similar to those in RDBMS Uses locking to ensure consistency

Lock Manager manages locks held by a transaction Logging

Write Ahead Logging (WAL) Transaction Commit Protocol

Acquire locks (Atomicity Consistency Isolation) Write Undo and Redo records to the WAL

for each node relationship property changed is written to the log Write commit record to the log and flush to disk (Durability) Release locks

Recovery ndash if the database servermachine crashes Apply log records to replay changes made by the transactions

07-20

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

https://goo.gl/m8cOcz

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j - Data Modeling

Neo4j - Programming API

Graph Data Modelling

Graph data modelling is very closely related to domain modelling.

You need to decide
  Node or Relationship
  Node or Property
  Label/Type or Property

Decisions are based on
  Features of the entities in the application domain
  Your typical queries
  Features and constraints of the underlying storage system

Slides 18-22 are based on Chapter 4 of the Graph Databases book

Node vs Relationship

Nodes for Things, Relationships for Structure

AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read.

    MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})
          <-[:LIKES]-(:Reader)-[:LIKES]->(books:Book)
    RETURN books.title

Node vs Relationship

Model Facts as Nodes


Node or Property

Represent Complex Value Types as Nodes


Relationship Property or Relationship Type

E.g. the relationship between a user node and an address node can be
  typed as HOME_ADDRESS, BILLING_ADDRESS, or
  typed as a generic ADDRESS and differentiated using a type property:
    {type: 'home'}, {type: 'billing'}

We use fine-grained relationships whenever we have a closed set of relationship types.
  E.g. there is only a finite set of address types.
  If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships.

    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)
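The trade-off between the two modelling choices can be sketched with a toy in-memory model. This is an illustration only, with hypothetical names (`AddressModels`, `Rel`, `followTypes`, `followGeneric`) and made-up sample addresses; it does not use the Neo4j API. In the fine-grained model the relationship type alone selects the addresses, while in the generic model every ADDRESS relationship has to be followed and its type property inspected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy comparison of two ways to model user-to-address relationships. */
final class AddressModels {

    /** Hypothetical relationship record: a type, an optional "type" property and a target. */
    static final class Rel {
        final String relType;
        final String typeProperty; // null in the fine-grained model
        final String address;

        Rel(String relType, String typeProperty, String address) {
            this.relType = relType;
            this.typeProperty = typeProperty;
            this.address = address;
        }
    }

    /** Fine-grained model: select relationships purely by relationship type. */
    static List<String> followTypes(List<Rel> rels, String... wantedTypes) {
        List<String> wanted = Arrays.asList(wantedTypes);
        List<String> out = new ArrayList<>();
        for (Rel r : rels) {
            if (wanted.contains(r.relType)) {
                out.add(r.address);
            }
        }
        return out;
    }

    /** Generic model: follow ADDRESS, then inspect the property (null means "any"). */
    static List<String> followGeneric(List<Rel> rels, String wantedProperty) {
        List<String> out = new ArrayList<>();
        for (Rel r : rels) {
            if (r.relType.equals("ADDRESS")
                    && (wantedProperty == null || wantedProperty.equals(r.typeProperty))) {
                out.add(r.address);
            }
        }
        return out;
    }

    /** Sample user modelled with fine-grained relationship types. */
    static List<Rel> fineGrained() {
        return Arrays.asList(
                new Rel("HOME_ADDRESS", null, "1 Home St"),
                new Rel("BILLING_ADDRESS", null, "2 Bill Rd"));
    }

    /** The same user modelled with a generic ADDRESS type and a type property. */
    static List<Rel> generic() {
        return Arrays.asList(
                new Rel("ADDRESS", "home", "1 Home St"),
                new Rel("ADDRESS", "billing", "2 Bill Rd"));
    }

    public static void main(String[] args) {
        System.out.println(followTypes(fineGrained(), "HOME_ADDRESS"));
        System.out.println(followTypes(fineGrained(), "HOME_ADDRESS", "BILLING_ADDRESS"));
        System.out.println(followGeneric(generic(), "home"));
        System.out.println(followGeneric(generic(), null));
    }
}
```

Asking for "any address" in the fine-grained model means listing every type, which mirrors the `HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS` pattern on the slide; the generic model answers that question directly but pays a property check for every narrower question.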


Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j - Data Modeling

Neo4j - Programming API

REST API

Neo4j can be programmed from any language using the REST API.

The default representation for requests and responses is JSON.

Service root: http://hostname:port/db/data/

A node is represented by its id:
    http://hostname:port/db/data/node/0

A relationship can be represented by its id or traversed from a node:
    http://hostname:port/db/data/node/0/relationships/all
    http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP endpoint

This is the default way to interact with Neo4j using REST.
  It allows you to send Cypher queries in JSON as the body of a POST request, within the scope of transactions.
  Both read and write queries are supported.

Transactions can be kept open across a number of requests or used for just one request.

A single-request transaction is sent to
    http://hostname:port/db/data/transaction/commit

Multiple-request transaction endpoints
  Start:
    http://hostname:port/db/data/transaction
  Send queries in this transaction:
    http://hostname:port/db/data/transaction/transaction_id
  Commit:
    http://hostname:port/db/data/transaction/transaction_id/commit

Single Request Transaction Example

REQUEST
    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements" : [ {
        "statement" : "CREATE (n) RETURN id(n)"
      } ]
    }

RESPONSE
    200 OK
    Content-Type: application/json

    {
      "results" : [ {
        "columns" : [ "id(n)" ],
        "data" : [ {
          "row" : [ 18 ]
        } ]
      } ],
      "errors" : [ ]
    }
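The request above can be assembled from Java with the standard `java.net.http` package (Java 11 or later). The following is a minimal sketch under stated assumptions: the class `CypherRequestSketch` and its helper methods are hypothetical, the JSON is built by hand with only basic escaping, and the request is constructed but not sent, so no running server is needed to try it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds (but does not send) a single-request transaction; a sketch, not a client library. */
final class CypherRequestSketch {

    /**
     * Escapes backslashes and double quotes so the statement can sit inside a JSON string.
     * Control characters such as newlines are not handled in this sketch.
     */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** JSON body with a single statement, in the shape shown on the slide. */
    static String payload(String cypher) {
        return "{\"statements\":[{\"statement\":\"" + jsonEscape(cypher) + "\"}]}";
    }

    /** POST request to the transaction/commit endpoint of the given host and port. */
    static HttpRequest singleRequestTransaction(String host, int port, String cypher) {
        URI uri = URI.create("http://" + host + ":" + port + "/db/data/transaction/commit");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(cypher)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = singleRequestTransaction("localhost", 7474, "CREATE (n) RETURN id(n)");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(payload("CREATE (n) RETURN id(n)"));
    }
}
```

To actually execute the statement, the request would be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a server that exposes this endpoint; a real client should also use a JSON library rather than string concatenation.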


Java APIs

Page 158 of Graph Databases book

Core API

For embedded Neo4j, instantiate an embedded graph database:

    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program:

    graphDB.shutdown();

Node

Node abstraction
  All nodes are of type Node, an interface that extends another interface, PropertyContainer.

Create a new node
  Nodes are created by invoking the GraphDatabaseService.createNode() method:

    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    Node nodeWithLabel = graphDB.createNode(Label... labels);

Set node properties:

    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

Relationship abstraction
  All relationships are of type Relationship, an interface that extends another interface, PropertyContainer.

Relationship types can be defined using Java enums:

    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }

Create a relationship:

    Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties:

    knows.setProperty("since", 1965);

Find Node/Relationships

An individual node or relationship can be retrieved by its ID:
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
  Not recommended, because IDs may be reused by other nodes or relationships.

Nodes can be retrieved by label and property value:
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

Relationships of a node can be retrieved by type and/or direction:
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API - Deleting

Delete the relationships of a node before deleting the node.

Delete a relationship:

    Relationship rel;
    rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

Delete a node:

    firstNode.delete();

Java API - Transactions

All operations are performed in the context of a transaction:

    try ( Transaction tx = graphDB.beginTx() )
    {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException.
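The try-with-resources pattern above relies on the transaction being closed automatically, with the outcome decided by whether `success()` was called first. The sketch below imitates that behaviour with a hypothetical `ToyTransaction` class; it is an assumption-laden stand-in written only to show the control flow and is not the Neo4j `Transaction` interface.

```java
/** Toy illustration of the success-then-close transaction pattern; not the Neo4j API. */
final class TxSketch {

    /** Hypothetical transaction: close() commits only if success() was called first. */
    static final class ToyTransaction implements AutoCloseable {
        private boolean markedSuccessful;
        String outcome = "open";

        void success() {
            markedSuccessful = true;
        }

        @Override
        public void close() {
            outcome = markedSuccessful ? "committed" : "rolled back";
        }
    }

    /** Runs one unit of work and reports how the transaction ended. */
    static String run(boolean failMidway) {
        ToyTransaction tx = new ToyTransaction();
        try (ToyTransaction t = tx) {
            if (failMidway) {
                throw new IllegalStateException("simulated failure before success()");
            }
            t.success(); // reached only when every operation succeeded
        } catch (IllegalStateException e) {
            // close() has already run by the time this handler executes
        }
        return tx.outcome;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // committed
        System.out.println(run(true));  // rolled back
    }
}
```

Placing `success()` as the last statement in the block means that an exception thrown by any earlier operation skips it, so the automatic close ends the transaction without committing.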


References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
  You can download this book from the Neo4j site; http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6

Neo4j - Reference Manual
  http://neo4j.com/docs/stable/

Page 19: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Query Plan With Index

Neo4j supports index on properties of labelled node

Index has similar behaviour as those in relational systems

Create Index CREATE INDEX ON Person(name)

Drop Index DROP INDEX ON Person(name)

07-19

Query

MATCH (baconPerson nameKevin Bacon)-[14]-(hollywood) RETURN DISTINCT hollywood

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactions Neo4j supports full ACID transactions

Similar to those in RDBMS Uses locking to ensure consistency

Lock Manager manages locks held by a transaction Logging

Write Ahead Logging (WAL) Transaction Commit Protocol

Acquire locks (Atomicity Consistency Isolation) Write Undo and Redo records to the WAL

for each node relationship property changed is written to the log Write commit record to the log and flush to disk (Durability) Release locks

Recovery ndash if the database servermachine crashes Apply log records to replay changes made by the transactions

07-20

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Question Time

07-21

httpsgooglm8cOcz

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-22

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Graph Data Modelling

Graph data modelling is very closely related with domain modelling

You need to decide Node or Relationship Node or Property LabelType or Property

Decisions are based on Features of entities in application domain Your typical queries Features and constraints of the underlying storage system

07-23

Slides 18-22 are based on Chapter 4 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Nodes for Things Relationship for Structures AS A reader who likes a book I WANT to know which books other

readers who like the same book have liked SO THAT I can find other books to read

07-24

MATCH (Reader nameAlice)-[LIKES]-gt(Book titleDune)lt-[LIKES]-(Reader)-[LIKES]-gt(booksBook)RETURN bookstitle

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node vs Relationship

Model Facts as Nodes

07-25

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Node or Property

Represent Complex Value Types as Nodes

07-26

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationship Property or Relationship Type

Eg The relationship between user node and address node can be

typed as HOME_ADDRESS BILLING_ADDRESS or typed as generic ADDRESS and differentiated using a type property

typersquohomersquo typersquobillingrsquo

We use fine-grained relationships whenever we have a closed set of relationship types Eg there are only a finite set of address types If traversal would like to follow generic type ADDRESS we may

have to use redundant relationships MATCH (user)-[HOME_ADDRESS|WORK_ADDRESS|

DELIVERY_ADDRESS]-gt(address) MATCH (user)-[ADDRESS]-gt(address) MATCH (userUser)-[r]-gt(addressAddress)

07-27

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-28

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

REST API

Programming Neo4j can be done using any language using REST API

Default representation for request and response is JSON format

Service root httphostnameportdbdata Node is represented by its id

httphostnameportdbdatanode0 Relationship can be represented by its id or traversed from

a node httphostnameportdbdatanode0relationshipsall httplocalhost7474dbdatarelationship5

07-29

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactional Cypher HTTP end point

This is the default way to interact with Neo4j using REST Allows you to send Cyhper queries in JSON as the body of a POST

request within the scope of transactions Both read and write queries

Transactions can be open across a number of requests or just for one request

Single request transaction is sent to httphostnameportdbdatatransactioncommit

Multiple requests transaction end points Start

httphostnameportdbdatatransaction Sending queries in this transaction

httphostnameportdbdatatransactiontransaction_id Commit

httphostnameportdbdatatransactiontransaction_idcommit

07-30

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Single Request Transaction Example

REQUEST POST httplocalhost7474dbdatatransactioncommit Accept applicationjson charset=UTF-8 Content-Type applicationjson

statements [ statement CREATE (n) RETURN id(p) ]

RESPONSE 200 OK Content-Type applicationjson

results [

columns [ id(n) ]

data [ row [ 18 ]

] ]

errors [ ]

07-31

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java APIs

07-32

Page 158 of Graph Databases book

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Core API

For embedded Neo4j, instantiate an embedded GraphDatabaseService

  GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program

  graphDB.shutdown();

Node

Node abstraction
  All nodes are of type Node, an interface that extends another interface, PropertyContainer

Create a new node
  Nodes are created by invoking the GraphDatabaseService.createNode() method

  Node firstNode = graphDB.createNode();
  Node secondNode = graphDB.createNode();
  Node createNode(Label... labels)   // overload that creates a node with labels

Set node properties

  firstNode.setProperty("name", "Bob");
  secondNode.setProperty("name", "Alice");

Relationships

Relationship abstraction
  All relationships are of type Relationship, an interface that extends another interface, PropertyContainer

Relationship types can be defined using a Java enum

  private static enum RelType implements RelationshipType {
      KNOWS, LIKES, HATES
  }

Create a relationship

  Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties

  knows.setProperty("since", 1965);

Find Node/Relationships

An individual node or relationship can be retrieved by its id
  Node getNodeById(long id)
  Relationship getRelationshipById(long id)
  Not recommended, because ids may be reused by other nodes or relationships

Nodes can be retrieved by label and property value
  ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

The relationships of a node can be retrieved by type and/or direction
  Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API – Deleting

Delete the relationships of a node before deleting the node

Deleting a relationship

  Relationship rel;
  rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
  rel.delete();

Deleting a node

  firstNode.delete();

Java API – Transactions

All operations are performed in the context of a transaction

  try ( Transaction tx = graphDB.beginTx() ) {
      // operations on the graph
      tx.success();
  }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException
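The try-with-resources pattern above relies on the transaction being closed automatically, with the outcome decided by whether success() was called first. The toy model below imitates that behaviour in plain Java. It is not Neo4j code: `ToyStore` and `ToyTx` are invented names, and the "graph" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model (not Neo4j code): changes are kept only if success() was called before close().
final class ToyStore {
    final List<String> committed = new ArrayList<>();

    ToyTx beginTx() { return new ToyTx(this); }

    static final class ToyTx implements AutoCloseable {
        private final ToyStore store;
        private final List<String> pending = new ArrayList<>();
        private boolean success = false;

        ToyTx(ToyStore store) { this.store = store; }

        void write(String value) { pending.add(value); }

        // Only marks the transaction as successful; the commit happens in close().
        void success() { success = true; }

        @Override
        public void close() {
            if (success) {
                store.committed.addAll(pending);   // commit
            }
            pending.clear();                       // otherwise the changes are discarded
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        try (ToyTx tx = store.beginTx()) {
            tx.write("Bob");
            tx.success();
        }
        try (ToyTx tx = store.beginTx()) {
            tx.write("Alice");                     // success() is never called: rolled back
        }
        System.out.println(store.committed);       // prints [Bob]
    }
}
```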

References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
  You can download this book from the Neo4j site; http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6

Neo4j – Reference Manual
  http://neo4j.com/docs/stable

Transactions

Neo4j supports full ACID transactions
  Similar to those in an RDBMS

Uses locking to ensure consistency
  The Lock Manager manages the locks held by a transaction

Logging
  Write Ahead Logging (WAL)

Transaction commit protocol
  Acquire locks (Atomicity, Consistency, Isolation)
  Write undo and redo records to the WAL
    A record is written to the log for each node, relationship and property changed
  Write the commit record to the log and flush to disk (Durability)
  Release locks

Recovery – if the database server/machine crashes
  Apply log records to replay the changes made by the transactions
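The commit protocol and recovery step can be imitated with a small in-memory model. This is a simplified sketch and not Neo4j's implementation: `ToyWal` and all its members are invented, a single lock stands in for the lock manager, a list stands in for the log file, and only redo records are kept (no undo records, no real flush to disk).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Toy write-ahead log (not Neo4j code): log the changes, then the commit record; replay on recovery.
final class ToyWal {
    // One log entry: a redo record (key and value) or a commit marker (key == null).
    static final class Entry {
        final int txId;
        final String key;
        final String value;

        Entry(int txId, String key, String value) {
            this.txId = txId;
            this.key = key;
            this.value = value;
        }

        boolean isCommit() { return key == null; }
    }

    private final List<Entry> log = new ArrayList<>();        // stands in for the log file
    private final ReentrantLock lock = new ReentrantLock();   // stands in for the lock manager

    // Commit protocol for one transaction.
    void commit(int txId, Map<String, String> changes) {
        lock.lock();                                           // 1. acquire locks
        try {
            changes.forEach((k, v) -> log.add(new Entry(txId, k, v)));  // 2. redo records
            log.add(new Entry(txId, null, null));              // 3. commit record (a real system flushes here)
        } finally {
            lock.unlock();                                     // 4. release locks
        }
    }

    // Simulates a crash mid-transaction: records are logged, but no commit record follows.
    void crashBeforeCommit(int txId, Map<String, String> changes) {
        changes.forEach((k, v) -> log.add(new Entry(txId, k, v)));
    }

    // Recovery: replay, in log order, the changes of transactions that have a commit record.
    Map<String, String> recover() {
        Set<Integer> committed = new HashSet<>();
        for (Entry e : log) {
            if (e.isCommit()) committed.add(e.txId);
        }
        Map<String, String> state = new HashMap<>();
        for (Entry e : log) {
            if (!e.isCommit() && committed.contains(e.txId)) state.put(e.key, e.value);
        }
        return state;
    }

    public static void main(String[] args) {
        ToyWal wal = new ToyWal();
        wal.commit(1, Map.of("bob.name", "Bob"));
        wal.crashBeforeCommit(2, Map.of("alice.name", "Alice"));
        System.out.println(wal.recover());                     // prints {bob.name=Bob}
    }
}
```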

Question Time

https://goo.gl/m8cOcz

Graph Data Modelling

Graph data modelling is closely related to domain modelling

You need to decide
  Node or Relationship
  Node or Property
  Label/Type or Property

Decisions are based on
  Features of the entities in the application domain
  Your typical queries
  Features and constraints of the underlying storage system

Slides 18-22 are based on Chapter 4 of Graph Databases book

Node vs Relationship

Nodes for things, relationships for structure

  AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read

  MATCH (:Reader {name:'Alice'})-[:LIKES]->(:Book {title:'Dune'})
        <-[:LIKES]-(:Reader)-[:LIKES]->(books:Book)
  RETURN books.title
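To make the shape of that traversal concrete, here is a small in-memory imitation in Java. It is only a sketch: the class `LikesTraversal`, the map-based "graph" and the sample readers other than Alice are invented for this example. Because Cypher does not reuse a relationship within one pattern, the other reader is never Alice herself and the returned books never include Dune; the sketch mirrors that, and it uses a set, so each title appears once.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// In-memory stand-in for the LIKES relationships (sample data is invented).
final class LikesTraversal {
    // Follows the path reader -LIKES-> book <-LIKES- other reader -LIKES-> other books.
    static Set<String> booksLikedByFansOf(Map<String, List<String>> likes, String reader, String book) {
        Set<String> result = new TreeSet<>();
        if (!likes.getOrDefault(reader, List.of()).contains(book)) {
            return result;                                   // first hop of the pattern does not match
        }
        for (Map.Entry<String, List<String>> other : likes.entrySet()) {
            if (other.getKey().equals(reader)) continue;     // the other reader is someone else
            if (!other.getValue().contains(book)) continue;  // must like the same book
            for (String liked : other.getValue()) {
                if (!liked.equals(book)) result.add(liked);  // a different book they also like
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> likes = Map.of(
                "Alice", List.of("Dune"),
                "Bob", List.of("Dune", "Neuromancer"),
                "Carol", List.of("Emma"));
        System.out.println(booksLikedByFansOf(likes, "Alice", "Dune"));   // prints [Neuromancer]
    }
}
```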

Node vs Relationship

Model Facts as Nodes

Node or Property

Represent Complex Value Types as Nodes

Relationship Property or Relationship Type

E.g. the relationship between a user node and an address node can be
  typed as HOME_ADDRESS or BILLING_ADDRESS, or
  typed as a generic ADDRESS and differentiated using a type property
    {type:'home'}, {type:'billing'}

We use fine-grained relationships whenever we have a closed set of relationship types
  E.g. there is only a finite set of address types
  If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships
    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)
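The two options can be contrasted with a toy model in Java. Everything here is invented for illustration (the class `AddressModel`, the sample addresses, and the field name `kind`, which plays the role of the `type` property on the slide). With fine-grained types the selection looks only at the relationship type; with a generic ADDRESS type each relationship's property has to be inspected as well.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Toy comparison of the two modelling options (all names and data are invented).
final class AddressModel {
    enum RelType { HOME_ADDRESS, WORK_ADDRESS, DELIVERY_ADDRESS, ADDRESS }

    // A user-to-address relationship; kind is only set on generic ADDRESS relationships.
    static final class Rel {
        final RelType type;
        final String kind;
        final String address;

        Rel(RelType type, String kind, String address) {
            this.type = type;
            this.kind = kind;
            this.address = address;
        }
    }

    // Fine-grained types: select on the relationship type alone.
    static List<String> byType(List<Rel> rels, Set<RelType> wanted) {
        List<String> out = new ArrayList<>();
        for (Rel r : rels) {
            if (wanted.contains(r.type)) out.add(r.address);
        }
        return out;
    }

    // Generic type: follow ADDRESS, then inspect a property on every relationship.
    static List<String> byProperty(List<Rel> rels, String kind) {
        List<String> out = new ArrayList<>();
        for (Rel r : rels) {
            if (r.type == RelType.ADDRESS && kind.equals(r.kind)) out.add(r.address);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rel> fine = List.of(
                new Rel(RelType.HOME_ADDRESS, null, "1 Home St"),
                new Rel(RelType.WORK_ADDRESS, null, "2 Work Rd"));
        List<Rel> generic = List.of(
                new Rel(RelType.ADDRESS, "home", "1 Home St"),
                new Rel(RelType.ADDRESS, "billing", "3 Bill Ave"));
        System.out.println(byType(fine, EnumSet.of(RelType.HOME_ADDRESS)));   // prints [1 Home St]
        System.out.println(byProperty(generic, "billing"));                   // prints [3 Bill Ave]
    }
}
```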

References

Ian Robinson Jim Webber and Emil Eifrem Graph Databases Second Edition OrsquoReilly Media Inc June 5 You can download this book from the Neo4j site http

wwwneo4jorglearn will redirect you to httpgraphdatabasescom

Chapter 4 Chapter 6

Neo4j ndash Reference Manual httpneo4jcomdocsstable

07-39

Node vs Relationship

Nodes for things, relationships for structure.

User story: AS A reader who likes a book, I WANT to know which books other readers who like the same book have liked, SO THAT I can find other books to read.

    MATCH (:Reader {name: 'Alice'})-[:LIKES]->(:Book {title: 'Dune'})
          <-[:LIKES]-(:Reader)-[:LIKES]->(books:Book)
    RETURN books.title
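To make the shape of that pattern concrete, here is a minimal sketch in plain Java of the same three-hop walk over an in-memory map. It is a toy, not Neo4j code: the class `LikesGraph` and its methods are hypothetical names introduced only for illustration, and the sample readers and titles other than Alice and Dune are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory stand-in for the LIKES relationships (hypothetical, not the Neo4j API).
final class LikesGraph {
    // reader name -> titles of the books that reader LIKES
    private final Map<String, Set<String>> likes = new LinkedHashMap<>();

    void like(String reader, String title) {
        likes.computeIfAbsent(reader, r -> new LinkedHashSet<>()).add(title);
    }

    // Follows (reader)-[:LIKES]->(book)<-[:LIKES]-(other)-[:LIKES]->(books)
    // and returns one title per matching path, so a title can appear more than once.
    List<String> alsoLiked(String reader, String title) {
        List<String> titles = new ArrayList<>();
        if (!likes.getOrDefault(reader, Set.of()).contains(title)) {
            return titles; // the starting reader does not like this book
        }
        for (Map.Entry<String, Set<String>> other : likes.entrySet()) {
            if (other.getKey().equals(reader)) continue;      // must be another reader
            if (!other.getValue().contains(title)) continue;  // who likes the same book
            for (String t : other.getValue()) {
                if (!t.equals(title)) titles.add(t);          // their other liked books
            }
        }
        return titles;
    }

    public static void main(String[] args) {
        LikesGraph g = new LikesGraph();
        g.like("Alice", "Dune");
        g.like("Bob", "Dune");
        g.like("Bob", "Neuromancer");
        g.like("Carol", "Emma");
        System.out.println(g.alsoLiked("Alice", "Dune"));
    }
}
```

The toy scans every reader, which a graph database avoids by following relationships directly from the matched book node; the sketch only shows which hops the query describes.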

Node vs Relationship

Model Facts as Nodes

Node or Property

Represent Complex Value Types as Nodes

Relationship Property or Relationship Type

E.g. the relationship between a user node and an address node can be
    typed as HOME_ADDRESS, BILLING_ADDRESS, ...
    or typed as a generic ADDRESS and differentiated using a type property
        {type: 'home'}, {type: 'billing'}

We use fine-grained relationships whenever we have a closed set of relationship types.
    E.g. there is only a finite set of address types.
    If a traversal needs to follow the generic type ADDRESS, we may have to use redundant relationships.

    MATCH (user)-[:HOME_ADDRESS|WORK_ADDRESS|DELIVERY_ADDRESS]->(address)
    MATCH (user)-[:ADDRESS]->(address)
    MATCH (user:User)-[r]->(address:Address)
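The trade-off can be sketched in plain Java. This is a toy model under stated assumptions, not Neo4j code: `AddressRels`, `Rel`, `byTypes` and `byProperty` are hypothetical names, and a relationship is reduced to a type name, an optional `type` property (called `kind` here) and the address it points to.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy comparison of fine-grained relationship types vs. a generic type with a property.
final class AddressRels {
    // Hypothetical relationship record: type name, optional "type" property, target address.
    static final class Rel {
        final String type;
        final String kind;     // value of the "type" property; null when not used
        final String address;

        Rel(String type, String kind, String address) {
            this.type = type;
            this.kind = kind;
            this.address = address;
        }
    }

    // Fine-grained types: following all addresses means enumerating the closed set of types.
    static List<String> byTypes(List<Rel> rels, Set<String> types) {
        return rels.stream()
                .filter(r -> types.contains(r.type))
                .map(r -> r.address)
                .collect(Collectors.toList());
    }

    // Generic ADDRESS type: one type to follow, plus a property check to narrow it down.
    static List<String> byProperty(List<Rel> rels, String kind) {
        return rels.stream()
                .filter(r -> r.type.equals("ADDRESS") && kind.equals(r.kind))
                .map(r -> r.address)
                .collect(Collectors.toList());
    }
}
```

With fine-grained types the selection needs only the type name; with the generic type every candidate relationship has to be opened to read its property.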

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j – Data Modeling

Neo4j – Programming API

REST API

Neo4j can be programmed from any language through the REST API.

The default representation for requests and responses is JSON.

Service root:
    http://hostname:port/db/data/

A node is represented by its id:
    http://hostname:port/db/data/node/0

A relationship can be represented by its id or traversed from a node:
    http://hostname:port/db/data/node/0/relationships/all
    http://localhost:7474/db/data/relationship/5

Transactional Cypher HTTP endpoint

This is the default way to interact with Neo4j using REST.
    It allows you to send Cypher queries, as JSON in the body of a POST request, within the scope of transactions.
    Both read and write queries are supported.

A transaction can stay open across a number of requests or last for just one request.

A single-request transaction is sent to:
    http://hostname:port/db/data/transaction/commit

Multiple-request transaction endpoints:
    Start:
        http://hostname:port/db/data/transaction
    Send queries in this transaction:
        http://hostname:port/db/data/transaction/{transaction_id}
    Commit:
        http://hostname:port/db/data/transaction/{transaction_id}/commit
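The endpoint layout above can be captured in a small helper. This is a sketch only: the class `TxEndpoints` and its method names are hypothetical, and it assumes the transaction id is a number that the server hands back when the transaction is started.

```java
import java.net.URI;

// Builds the transactional endpoint URIs listed above (hypothetical helper, no network I/O).
final class TxEndpoints {
    private final String root; // service root, ending in "/db/data/"

    TxEndpoints(String host, int port) {
        this.root = "http://" + host + ":" + port + "/db/data/";
    }

    // Single-request transaction: begin, run and commit in one POST.
    URI commitSingle() { return URI.create(root + "transaction/commit"); }

    // Multi-request transaction: start a new transaction.
    URI begin() { return URI.create(root + "transaction"); }

    // Multi-request transaction: send more statements to an open transaction.
    // Assumption: txId is the numeric id assigned by the server.
    URI execute(long txId) { return URI.create(root + "transaction/" + txId); }

    // Multi-request transaction: commit the open transaction.
    URI commit(long txId) { return URI.create(root + "transaction/" + txId + "/commit"); }

    public static void main(String[] args) {
        TxEndpoints e = new TxEndpoints("localhost", 7474);
        System.out.println(e.begin());
        System.out.println(e.execute(7));
        System.out.println(e.commit(7));
    }
}
```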

Single Request Transaction Example

REQUEST
    POST http://localhost:7474/db/data/transaction/commit
    Accept: application/json; charset=UTF-8
    Content-Type: application/json

    {
      "statements": [ { "statement": "CREATE (n) RETURN id(n)" } ]
    }

RESPONSE
    200 OK
    Content-Type: application/json

    {
      "results": [ {
        "columns": [ "id(n)" ],
        "data": [ { "row": [ 18 ] } ]
      } ],
      "errors": [ ]
    }
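As a rough illustration, the request above could be assembled in Java with the standard `java.net.http` classes (Java 11 or later). The class `SingleRequestTx` and its methods are hypothetical names; the sketch only builds the request and does not send it, and the JSON escaping is deliberately minimal.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the single-request transaction shown above.
final class SingleRequestTx {
    // Wraps one Cypher statement in the {"statements":[{"statement":...}]} envelope.
    // Only backslashes and double quotes are escaped; a real client should use a JSON library.
    static String payload(String cypher) {
        String escaped = cypher.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"statements\":[{\"statement\":\"" + escaped + "\"}]}";
    }

    static HttpRequest request(String host, int port, String cypher) {
        URI uri = URI.create("http://" + host + ":" + port + "/db/data/transaction/commit");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json; charset=UTF-8")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(cypher)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest r = request("localhost", 7474, "CREATE (n) RETURN id(n)");
        System.out.println(r.method() + " " + r.uri());
        System.out.println(payload("CREATE (n) RETURN id(n)"));
    }
}
```

Sending the request would additionally need an `HttpClient` and a running server, both left out here so the sketch stays self-contained.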

Java APIs

Page 158 of the Graph Databases book

Core API

For embedded Neo4j, instantiate an embedded graph database:

    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program:

    graphDB.shutdown();

Node

Node abstraction
    All nodes are of type Node, an interface that extends another interface, PropertyContainer.

Create a new node
    Nodes are created by invoking the GraphDatabaseService.createNode() method:

    Node firstNode = graphDB.createNode();
    Node secondNode = graphDB.createNode();
    Node nodeWithLabel = graphDB.createNode(Label... labels);

Set node properties

    firstNode.setProperty("name", "Bob");
    secondNode.setProperty("name", "Alice");

Relationships

Relationship abstraction
    All relationships are of type Relationship, an interface that extends another interface, PropertyContainer.

Relationship types can be defined using Java enums:

    private static enum RelType implements RelationshipType {
        KNOWS, LIKES, HATES
    }

Create a relationship:

    Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties:

    knows.setProperty("since", 1965);

Find Node/Relationships

An individual node or relationship can be retrieved by its ID:
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
    Not recommended, because an id may be reused by another node or relationship.

Nodes can be retrieved by label and property value:
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

Relationships of a node can be retrieved by type and/or direction:
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API – Deleting

Delete the relationships of a node before deleting the node.

Deleting a relationship:

    Relationship rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

Deleting a node:

    firstNode.delete();
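The ordering rule can be illustrated with a toy in-memory graph in plain Java. `ToyGraph` and its methods are hypothetical names, not the Neo4j API, and the toy rejects an out-of-order delete immediately, which is an assumption made for simplicity and says nothing about when or how Neo4j itself reports the problem.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy graph that refuses to delete a node while relationships are still attached to it.
final class ToyGraph {
    // node id -> ids of the relationships attached to that node
    private final Map<Integer, Set<Integer>> attached = new HashMap<>();
    // relationship id -> {start node id, end node id}
    private final Map<Integer, int[]> rels = new HashMap<>();
    private int nextId = 0;

    int createNode() {
        int id = nextId++;
        attached.put(id, new HashSet<>());
        return id;
    }

    int createRelationship(int from, int to) {
        int id = nextId++;
        rels.put(id, new int[] {from, to});
        attached.get(from).add(id);
        attached.get(to).add(id);
        return id;
    }

    void deleteRelationship(int rel) {
        int[] ends = rels.remove(rel);
        attached.get(ends[0]).remove(rel); // detach from the start node
        attached.get(ends[1]).remove(rel); // detach from the end node
    }

    void deleteNode(int node) {
        if (!attached.get(node).isEmpty()) {
            throw new IllegalStateException("node " + node + " still has relationships");
        }
        attached.remove(node);
    }

    boolean hasNode(int node) {
        return attached.containsKey(node);
    }
}
```

Deleting the relationship first and the node second succeeds; reversing the order raises the exception.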

Java API - Transactions

All operations are performed in the context of a transaction:

    try ( Transaction tx = graphDB.beginTx() ) {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException.
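The try-with-resources pattern works because the transaction object is closed automatically at the end of the block, and closing it commits only if success() was called. The following plain-Java sketch imitates that behaviour with a hypothetical `ToyTx` class that buffers writes against a list; it is an illustration of the pattern, not Neo4j's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy transaction: writes are buffered and applied on close() only if success() was called.
final class ToyTx implements AutoCloseable {
    private final List<String> store;                       // committed state
    private final List<String> pending = new ArrayList<>(); // uncommitted writes
    private boolean success = false;

    ToyTx(List<String> store) {
        this.store = store;
    }

    void write(String value) {
        pending.add(value);
    }

    // Marks the transaction as successful; nothing is committed until close().
    void success() {
        success = true;
    }

    @Override
    public void close() {
        if (success) {
            store.addAll(pending); // commit
        }
        pending.clear();           // otherwise the buffered writes are discarded (rollback)
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        try (ToyTx tx = new ToyTx(store)) {
            tx.write("kept");
            tx.success();
        }
        try (ToyTx tx = new ToyTx(store)) {
            tx.write("discarded"); // success() is never called
        }
        System.out.println(store);
    }
}
```

If an exception is thrown before success() is reached, close() still runs and the buffered writes are dropped, which is why the call to success() is placed last in the block.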

References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
    You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
    Chapter 4, Chapter 6

Neo4j – Reference Manual
    http://neo4j.com/docs/stable/

Page 28: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou) 07-28

Outline

Neo4j Storage

Neo4j Query Plan and Indexing

Neo4j ndash Data Modeling

Neo4j ndash Programming API

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

REST API

Programming Neo4j can be done using any language using REST API

Default representation for request and response is JSON format

Service root httphostnameportdbdata Node is represented by its id

httphostnameportdbdatanode0 Relationship can be represented by its id or traversed from

a node httphostnameportdbdatanode0relationshipsall httplocalhost7474dbdatarelationship5

07-29

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Transactional Cypher HTTP end point

This is the default way to interact with Neo4j using REST Allows you to send Cyhper queries in JSON as the body of a POST

request within the scope of transactions Both read and write queries

Transactions can be open across a number of requests or just for one request

Single request transaction is sent to httphostnameportdbdatatransactioncommit

Multiple requests transaction end points Start

httphostnameportdbdatatransaction Sending queries in this transaction

httphostnameportdbdatatransactiontransaction_id Commit

httphostnameportdbdatatransactiontransaction_idcommit

07-30

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Single Request Transaction Example

REQUEST POST httplocalhost7474dbdatatransactioncommit Accept applicationjson charset=UTF-8 Content-Type applicationjson

statements [ statement CREATE (n) RETURN id(p) ]

RESPONSE 200 OK Content-Type applicationjson

results [

columns [ id(n) ]

data [ row [ 18 ]

] ]

errors [ ]

07-31

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Java APIs

07-32

Page 158 of Graph Databases book

Core API

For embedded Neo4j, instantiate an embedded graph database:

  GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

Shut down the service before exiting the program:

  graphDB.shutdown();

Node

Node abstraction
  All nodes are of type Node, an interface that extends another interface, PropertyContainer.

Create a new node
  Nodes are created by invoking the GraphDatabaseService.createNode() method:

  Node firstNode = graphDB.createNode();
  Node secondNode = graphDB.createNode();
  Node nodeWithLabel = graphDB.createNode(labels);   // overload: createNode(Label... labels)

Set node properties

  firstNode.setProperty("name", "Bob");
  secondNode.setProperty("name", "Alice");

Relationships

Relationship abstraction
  All relationships are of type Relationship, an interface that extends another interface, PropertyContainer.

Relationship types can be defined using a Java enum:

  private static enum RelType implements RelationshipType {
      KNOWS, LIKES, HATES
  }

Create a relationship:

  Relationship knows = firstNode.createRelationshipTo(secondNode, RelType.KNOWS);

Set relationship properties:

  knows.setProperty("since", 1965);

Find Node/Relationships

An individual node or relationship can be retrieved by its ID:
  Node getNodeById(long id)
  Relationship getRelationshipById(long id)
  This is not recommended, because an id may be reused by another node or relationship.

Nodes can be retrieved by label and property value:
  ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

The relationships of a node can be retrieved by type and/or direction:
  Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)

Java API - Deleting

Delete the relationships of a node before deleting the node.

Deleting a relationship:

  Relationship rel;
  rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
  rel.delete();

Deleting a node:

  firstNode.delete();

Java API - Transactions

All operations are performed in the context of a transaction:

  try ( Transaction tx = graphDB.beginTx() )
  {
      // operations on the graph
      tx.success();
  }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException.
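The try-with-resources pattern above relies on a simple contract: success() only marks the transaction, and the work is committed or discarded when the try block closes it. The sketch below imitates that contract in plain Java. `ToyTransaction` is a stand-in written for this note and is not the Neo4j Transaction class; the in-memory list plays the role of the database.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in written for this note; it is NOT the Neo4j Transaction class.
// It only imitates the success()/close() contract used on the slide.
final class ToyTransaction implements AutoCloseable {
    private final List<String> committed;            // shared "database"
    private final List<String> pending = new ArrayList<>();
    private boolean success = false;

    ToyTransaction(List<String> committed) {
        this.committed = committed;
    }

    // Buffers a write; nothing is visible in the store yet.
    void write(String value) {
        pending.add(value);
    }

    // Marks the transaction as successful; the commit happens in close().
    void success() {
        success = true;
    }

    // Runs automatically at the end of the try block: commit or roll back.
    @Override
    public void close() {
        if (success) {
            committed.addAll(pending);
        }
        pending.clear();
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();

        try (ToyTransaction tx = new ToyTransaction(store)) {
            tx.write("Bob");
            tx.success();
        }

        try (ToyTransaction tx = new ToyTransaction(store)) {
            tx.write("Alice");
            // success() is never reached, so this write is discarded.
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException expected) {
            // close() has already run and rolled the second transaction back.
        }

        System.out.println(store); // prints [Bob]
    }
}
```

Placing tx.success() as the last statement of the block is what makes an exception anywhere earlier result in a rollback.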

References

Ian Robinson, Jim Webber, and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media, Inc., June 5.
  You can download this book from the Neo4j site; http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6

Neo4j Reference Manual: http://neo4j.com/docs/stable

Page 35: Dr. Ying Zhou School of Information Technologies COMP5338 – Advanced Data Models Week 7: Neo4j Storage, Query execution, Data Modelling and Programming

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Relationships

Relationship abstraction All relationships are of type Relationship which is an interface

extending another interface PropertyContainer

Relationships types can be set using Java enumsprivate static enum RelType implements RelationshipType

KNOWS LIKES HATES

Create a relationshipknows = firstNodecreateRelationshipTo(secondNode RelTypeKNOWS)

Set relationship propertiesknowssetProperty(rdquosincerdquo 1965)

07-35

COMP5338 Advanced Data Models - 2015 (A Fekete amp Y Zhou)

Find Node/Relationships

An individual node or relationship can be retrieved by its ID:
    Node getNodeById(long id)
    Relationship getRelationshipById(long id)
  Not recommended, because IDs may be reused by other nodes.

Nodes can be retrieved by label and property value:
    ResourceIterable<Node> findNodesByLabelAndProperty(Label label, String key, Object value)

Relationships of a node can be retrieved by type and/or direction:
    Iterable<Relationship> getRelationships(RelationshipType type, Direction dir)
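The warning about ID lookups is easier to see with a small model. The sketch below assumes a store that hands freed IDs back out to new nodes; it is a simplified illustration with hypothetical names (IdReuseSketch, createNode, deleteNode), not Neo4j's actual ID allocator.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class IdReuseSketch {
    private final Map<Long, String> nodesById = new HashMap<>();
    private final Deque<Long> freedIds = new ArrayDeque<>();
    private long nextId = 0;

    // Assumption: a freed ID is handed out again before a fresh one is allocated.
    long createNode(String name) {
        long id = freedIds.isEmpty() ? nextId++ : freedIds.pop();
        nodesById.put(id, name);
        return id;
    }

    // Removes the node and makes its ID available for reuse.
    void deleteNode(long id) {
        if (nodesById.remove(id) != null) {
            freedIds.push(id);
        }
    }

    String getNodeById(long id) {
        return nodesById.get(id);
    }

    public static void main(String[] args) {
        IdReuseSketch store = new IdReuseSketch();
        long aliceId = store.createNode("Alice");
        store.deleteNode(aliceId);
        long bobId = store.createNode("Bob");

        // An application that remembered aliceId now silently reads a different node.
        System.out.println(aliceId == bobId);           // prints "true"
        System.out.println(store.getNodeById(aliceId)); // prints "Bob"
    }
}
```

Looking nodes up by label and property value avoids this problem, because the lookup key is data the application controls rather than an internal identifier.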


Java API - Deleting

Delete the relationships of a node before deleting the node.

Deleting a relationship:
    Relationship rel;
    rel = firstNode.getSingleRelationship(RelType.KNOWS, Direction.OUTGOING);
    rel.delete();

Delete a node:
    firstNode.delete();
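The ordering rule above can be sketched with a toy model in which a node refuses to be deleted while it still has relationships. All names here (DeleteOrderSketch, detachAndDelete) are hypothetical, and the model rejects the delete immediately as a simplification; Neo4j itself may report the problem at a different point, such as when the transaction commits.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteOrderSketch {
    static class Node {
        final List<Relationship> relationships = new ArrayList<>();
        boolean deleted = false;

        // Registers the new relationship on both of its end nodes.
        Relationship createRelationshipTo(Node other) {
            Relationship rel = new Relationship(this, other);
            relationships.add(rel);
            other.relationships.add(rel);
            return rel;
        }

        // Toy rule: a node with remaining relationships cannot be deleted.
        void delete() {
            if (!relationships.isEmpty()) {
                throw new IllegalStateException("node still has relationships");
            }
            deleted = true;
        }
    }

    static class Relationship {
        final Node start;
        final Node end;

        Relationship(Node start, Node end) {
            this.start = start;
            this.end = end;
        }

        // Detaches the relationship from both end nodes.
        void delete() {
            start.relationships.remove(this);
            end.relationships.remove(this);
        }
    }

    // Deletes every relationship of the node first, then the node itself.
    static void detachAndDelete(Node node) {
        // Iterate over a copy, because rel.delete() modifies node.relationships.
        for (Relationship rel : new ArrayList<>(node.relationships)) {
            rel.delete();
        }
        node.delete();
    }

    public static void main(String[] args) {
        Node firstNode = new Node();
        Node secondNode = new Node();
        firstNode.createRelationshipTo(secondNode);

        try {
            firstNode.delete();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        detachAndDelete(firstNode);
        System.out.println(firstNode.deleted); // prints "true"
    }
}
```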


Java API - Transactions

All operations are performed in the context of a transaction:
    try ( Transaction tx = graphDb.beginTx() )
    {
        // operations on the graph
        tx.success();
    }

If you attempt to access the graph outside of a transaction, those operations will throw NotInTransactionException.
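The try-with-resources pattern works because closing the transaction is what decides its outcome: success() only marks the transaction, and close() then commits or rolls back. The sketch below models that behaviour with hypothetical classes (TransactionSketch, Tx) and uses IllegalStateException in place of NotInTransactionException; it is an illustration of the pattern, not the Neo4j implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class TransactionSketch {
    final List<String> committed = new ArrayList<>();
    private Tx current;

    // Toy transaction: buffers writes and applies them on close() only if marked successful.
    class Tx implements AutoCloseable {
        private final List<String> pending = new ArrayList<>();
        private boolean success = false;

        void success() {
            success = true;
        }

        @Override
        public void close() {
            if (success) {
                committed.addAll(pending); // commit
            }
            // Without success(), the pending writes are simply dropped (rollback).
            current = null;
        }
    }

    Tx beginTx() {
        current = new Tx();
        return current;
    }

    void createNode(String name) {
        if (current == null) {
            throw new IllegalStateException("not in a transaction");
        }
        current.pending.add(name);
    }

    public static void main(String[] args) {
        TransactionSketch graphDb = new TransactionSketch();

        try (Tx tx = graphDb.beginTx()) {
            graphDb.createNode("Alice");
            tx.success();
        }

        try (Tx tx = graphDb.beginTx()) {
            graphDb.createNode("Bob"); // success() is never called, so this is rolled back
        }

        System.out.println(graphDb.committed); // prints "[Alice]"

        try {
            graphDb.createNode("Carol");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because close() runs even when the block exits with an exception, a transaction that fails before reaching success() is rolled back without extra cleanup code.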


References

Ian Robinson, Jim Webber and Emil Eifrem, Graph Databases, Second Edition, O'Reilly Media Inc., June 5
  You can download this book from the Neo4j site: http://www.neo4j.org/learn will redirect you to http://graphdatabases.com
  Chapter 4, Chapter 6

Neo4j Reference Manual: http://neo4j.com/docs/stable/
