Slides from GraphDay Santa Clara


Upload: neo4j-the-fastest-and-most-scalable-native-graph-database

Post on 16-Feb-2017


TRANSCRIPT

Page 1: Slides from GraphDay Santa Clara

SANTA CLARA APRIL 14, 2016

Agenda

09:00-09:30 Breakfast and Registration
09:30-10:15 Graphs in Action: Driving Digital Transformation with Neo4j
10:15-11:00 Under the Hood: What’s a Graph and Where Do They Fit
11:00-11:30 Break
11:30-12:30 Transform Your Data: A worked example
12:30-13:30 Lunch
13:30-17:00 Training Session

Page 2: Slides from GraphDay Santa Clara

Speakers

Lars Nordwall Emil Eifrem Kevin Van Gundy Nicole White

Page 3: Slides from GraphDay Santa Clara

Driving Digital Transformation With Neo4j

GRAPHS IN ACTION

Santa Clara, April 14, 2016

Lars Nordwall Chief Operating Officer

@lnordwall

Page 4: Slides from GraphDay Santa Clara

2016 Reality

174

Page 5: Slides from GraphDay Santa Clara

Corporate Threat

Source: Accenture Strategy research, summer 2015

In a survey of 700 business leaders in the European Union, United States, China and Japan, a majority identified large digital players or start-ups as the greatest competitive threat to profitable growth.

Page 6: Slides from GraphDay Santa Clara

Corporate Life Span

The average corporate life span has been falling for more than half a century.

Standard & Poor’s data show it was:
- 61 years in 1958
- 25 years in 1980
- 18 years in 2011

Digitization is placing unprecedented pressure on organizations to evolve.

At the present rate, 75 percent of S&P 500 incumbents will be gone by 2027.

Source: McKinsey, 2015

Page 7: Slides from GraphDay Santa Clara

Dilemma

Everyone collects data today. The more data, the better.

“Store first, ask questions later”

Everyone seems to be hiring data scientists today (or at least trying to).

Page 8: Slides from GraphDay Santa Clara

There is another dimension

beyond data volume:

Data Relationships

Why a graph database?

Page 9: Slides from GraphDay Santa Clara

Today we see graph projects in virtually every industry

Social networks, Retail, HR & Recruiting, Manufacturing & Logistics, Health Care, Telco, Finance

Page 10: Slides from GraphDay Santa Clara

Retail

Page 11: Slides from GraphDay Santa Clara

Neo4j solves retail-related challenges for some of the largest companies in the world

Adidas uses Neo4j to combine content and product data into a single, searchable graph database which is used to create a personalized customer experience

“We have many different silos, many different data domains, and in order to make sense out of our data, we needed to bring those together and make them useful for us,” – Sokratis Kartelias, Adidas

eBay Now Tackles eCommerce Delivery Service Routing with Neo4j

“We needed to rebuild when growth and new features made our slowest query longer than our fastest delivery - 15 minutes! Neo4j gave us best solution” – Volker Pacher, eBay

Walmart uses Neo4j to give customers the best web experience through relevant and personal recommendations

“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” – Marcos Wada, Walmart

Page 12: Slides from GraphDay Santa Clara

Traditional Retail Value Chain

Component Manufacturers, Assembly Plants, Wholesalers, Retailers, End Consumers, Logistics

Page 13: Slides from GraphDay Santa Clara

The Online Retail Value Chain

SALES CHANNELS, SUPPLY CHAIN, PRODUCTS, MARKETING, CRM, CUSTOMER EXPERIENCE, PAYMENTS

Page 19: Slides from GraphDay Santa Clara

SALES CHANNELS: Store, Mobile, Webstore
SUPPLY CHAIN: Shipping, Inventory, Express goods, Home delivery
PRODUCTS: Ratings, Price-range, Category
MARKETING: Content, Promotions, Online advertising
CRM and CUSTOMER EXPERIENCE: Loyalty Programs, Returns, Feedback, Reviews, Tweets, Emails, Customer support
PAYMENTS: Credit Card, Cash, Mobile Pay, Purchase History

Page 20: Slides from GraphDay Santa Clara

Digital transformation in retail today requires putting all this data to good use

Page 21: Slides from GraphDay Santa Clara

SHOPPING EXPERIENCE

Page 22: Slides from GraphDay Santa Clara
Page 23: Slides from GraphDay Santa Clara

Related products

People who bought X also bought Y

Recommendations (In Real-Time)

The main product

Page 24: Slides from GraphDay Santa Clara

LOOKS_AT

KITCHEN AID SERIES

Page 30: Slides from GraphDay Santa Clara

[Diagram: a LOOKS_AT relationship pointing to KITCHEN AID SERIES, surrounded by: Returns, Purchase History, Price-range, Home delivery, Inventory, Express goods, Complaints, Reviews, Tweets, Emails, Category, Promotions, Bundling, Location]

Page 31: Slides from GraphDay Santa Clara
Page 32: Slides from GraphDay Santa Clara
Page 33: Slides from GraphDay Santa Clara

To get results, in real time, from a dataset that is highly interconnected

– you need a graph database!

Page 34: Slides from GraphDay Santa Clara

THANK YOU!

Lars Nordwall Chief Operating Officer

@lnordwall

Page 35: Slides from GraphDay Santa Clara

Under the Hood: What’s a Graph, and Where Do They Fit

Santa Clara, April 14, 2016

Emil Eifrem CEO, Neo Technology

Founder, Neo4j

Page 36: Slides from GraphDay Santa Clara

What is the most powerful database in the world?

Page 37: Slides from GraphDay Santa Clara
Page 38: Slides from GraphDay Santa Clara
Page 39: Slides from GraphDay Santa Clara

The internet

Page 40: Slides from GraphDay Santa Clara

Genetic Ancestry of One Single Corn Variety

Page 41: Slides from GraphDay Santa Clara

Philip’s LinkedIn Graph

Page 42: Slides from GraphDay Santa Clara

GOT IT. GRAPHS.

BUT WHAT IS A GRAPH?

Page 43: Slides from GraphDay Santa Clara

A Graph Is

NODE

NODE

NODE

RELATIONSHIP

RELATIONSHIP

RELATIONSHIP

Page 44: Slides from GraphDay Santa Clara

A Graph Is

[Diagram labels: PERSON, CHECKING ACCOUNT, BANK; relationships: HAS, WITH]

Page 45: Slides from GraphDay Santa Clara

A Graph Is

[Diagram labels: HOTEL, ROOM, BOOKING; relationships: HAS, HAS]

Page 46: Slides from GraphDay Santa Clara

A Graph Is

[Diagram labels: PAUL McCARTNEY, BEATLES, HEY JUDE, SINGER, COMPOSER; relationships: BELONGS_TO, PERFORMED]

Page 47: Slides from GraphDay Santa Clara

A Graph

[Diagram labels: COMPANY, NEO, STANFORD, COLUMBIA; relationships: KNOWS, WORKS_AT, STUDIED_AT; properties (marked PROPERTY): NAME:ANNE, SINCE:2012]

Page 48: Slides from GraphDay Santa Clara

NAME:ANNE

SINCE:2012

A Graph
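The building blocks named on these slides (nodes, relationships, and properties on both) can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Neo4j stores data. The class names, the `outgoing` helper, and the choice to attach `since` to a WORKS_AT relationship are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical data loosely based on the labels shown on the slide.
anne = Node("Person", {"name": "Anne"})
neo = Node("Company", {"name": "Neo"})
stanford = Node("University", {"name": "Stanford"})

relationships = [
    # Relationships can carry properties too (assumed placement of "since").
    Relationship(anne, "WORKS_AT", neo, {"since": 2012}),
    Relationship(anne, "STUDIED_AT", stanford),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` via relationships of `rel_type`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


employers = [n.properties["name"] for n in outgoing(anne, "WORKS_AT")]
print(employers)
```

Here a relationship is a first-class record with its own type, direction and properties, not a foreign key hidden in a row.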

Page 49: Slides from GraphDay Santa Clara
Page 50: Slides from GraphDay Santa Clara

Use of Graphs has created some of the most successful companies in the world

[Figure: nodes labeled with percentages: A 3,3%; B 38,4%; C 34,3%; D 3,8%; E 8,1%; F 3,9%; five unlabeled nodes at 1,8% each]

Page 51: Slides from GraphDay Santa Clara

NEO4j USE CASES

Real Time Recommendations

Master Data Management

Fraud Detection

Identity & Access Management

Graph Based Search

Network & IT-Operations

Page 52: Slides from GraphDay Santa Clara

GRAPH THINKING: Real Time Recommendations

[Diagram: nodes connected by VIEWED and BOUGHT relationships]
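The "people who bought X also bought Y" idea behind this slide can be sketched as a two-hop walk over BOUGHT relationships: from a product to its buyers, then from those buyers to their other products. The sketch below uses plain Python and made-up data; the `bought` mapping and the `also_bought` function are hypothetical and stand in for what a graph query would do.

```python
from collections import Counter

# Hypothetical BOUGHT relationships: customer -> set of products.
bought = {
    "ann": {"mixer", "bowl", "whisk"},
    "bob": {"mixer", "bowl"},
    "cat": {"mixer", "toaster"},
    "dan": {"kettle"},
}


def also_bought(product, limit=3):
    """Rank other products by how many buyers of `product` also bought them."""
    counts = Counter()
    for basket in bought.values():
        if product in basket:
            # Second hop: everything else this buyer bought.
            counts.update(basket - {product})
    return counts.most_common(limit)


recommendations = also_bought("mixer")
print(recommendations)
```

With this data, `bowl` ranks first because two of the three mixer buyers also bought it.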

Page 53: Slides from GraphDay Santa Clara

“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” Marcos Wada

Software Developer, Walmart


Page 54: Slides from GraphDay Santa Clara

GRAPH THINKING: Master Data Management

[Diagram: nodes connected by MANAGES, LEADS, REGION and COLLABORATES relationships]

Page 55: Slides from GraphDay Santa Clara

Neo4j is the heart of Cisco HMP: used for governance and single source of truth and a one-stop shop for all of Cisco’s hierarchies.


Page 56: Slides from GraphDay Santa Clara

GRAPH THINKING: Fraud Detection

[Diagram: nodes connected by OPENED_ACCOUNT, HAS, IS_ISSUED and LIVES relationships]
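One common way to read a fraud graph like this is to look for account holders who are linked to each other through shared identifiers such as an address or phone number. The sketch below is an assumption about how such a check might look in plain Python, not the method used by any customer named in the deck. All data and the `find_rings` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical HAS relationships: account holder -> identifiers they registered.
holders = {
    "holder_1": {"address:123 Main St", "phone:555-0100"},
    "holder_2": {"address:123 Main St", "ssn:000-00-0001"},
    "holder_3": {"phone:555-0100", "ssn:000-00-0001"},
    "holder_4": {"address:9 Elm Rd", "phone:555-0199"},
}


def find_rings(holders):
    """Group holders that are connected through any shared identifier."""
    by_identifier = defaultdict(set)
    for holder, identifiers in holders.items():
        for identifier in identifiers:
            by_identifier[identifier].add(holder)

    seen, rings = set(), []
    for start in holders:
        if start in seen:
            continue
        # Walk holder -> identifier -> holder until no new holders appear.
        ring, frontier = set(), [start]
        while frontier:
            holder = frontier.pop()
            if holder in ring:
                continue
            ring.add(holder)
            for identifier in holders[holder]:
                frontier.extend(by_identifier[identifier] - ring)
        seen |= ring
        if len(ring) > 1:  # a lone holder is not a ring
            rings.append(ring)
    return rings


rings = find_rings(holders)
print(rings)
```

Holders 1, 2 and 3 never share all identifiers, but each pair shares one, so the traversal groups them into one candidate ring while holder 4 stays separate.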

Page 57: Slides from GraphDay Santa Clara

“Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high-level of accuracy, and are capable of stopping advanced fraud scenarios in real-time.”

Gorka Sadowski, Cyber Security Expert

Page 58: Slides from GraphDay Santa Clara

GRAPH THINKING: Graph Based Search

[Diagram: nodes connected by PUBLISH, INCLUDE, CREATE, CAPTURE, IN, SOURCE and USES relationships]

Page 59: Slides from GraphDay Santa Clara

Uses Neo4j to manage the digital assets inside of its next generation in-flight entertainment system.


Page 60: Slides from GraphDay Santa Clara

GRAPH THINKING: Network & IT-Operations

[Diagram: nodes connected by BROWSES, CONNECTS, BRIDGES, ROUTES, POWERS, HOSTS and QUERIES relationships]

Page 61: Slides from GraphDay Santa Clara

Uses Neo4j for network topology analysis for big telco service providers


Page 62: Slides from GraphDay Santa Clara

GRAPH THINKING: Identity And Access Management

[Diagram: nodes connected by TRUSTS, ID, AUTHENTICATES, OWNS and CAN_READ relationships]

Page 63: Slides from GraphDay Santa Clara

UBS was the recipient of the 2014 Graphie Award for “Best Identity And Access Management App”

Page 64: Slides from GraphDay Santa Clara

Neo4j Adoption by Selected Verticals

SOFTWARE, FINANCIAL SERVICES, RETAIL, MEDIA & BROADCASTING, SOCIAL NETWORKS, TELECOM, HEALTHCARE

Page 65: Slides from GraphDay Santa Clara

TECHNICAL BENEFITS OF GRAPH DATABASES

Page 66: Slides from GraphDay Santa Clara

Intuitiveness, Speed, Agility

Page 67: Slides from GraphDay Santa Clara

Intuitiveness, Speed, Agility

Page 68: Slides from GraphDay Santa Clara

Intuitiveness

Page 69: Slides from GraphDay Santa Clara

Intuitiveness, Speed, Agility

Page 70: Slides from GraphDay Santa Clara

Real-Time Query Performance

[Chart: response time plotted against connectedness and size of data set]

Relational and other NoSQL databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections

Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections

1000x Advantage: “Minutes to milliseconds”

Page 71: Slides from GraphDay Santa Clara

Speed

“We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”

- Volker Pacher, Senior Developer

• “Minutes to milliseconds” performance
• Queries up to 1000x faster than RDBMS or other NoSQL

Page 72: Slides from GraphDay Santa Clara

Intuitiveness, Speed, Agility

Page 73: Slides from GraphDay Santa Clara

A Naturally Adaptive Model + A Query Language Designed for Connectedness = Agility

Page 74: Slides from GraphDay Santa Clara

Cypher

Typical Complex SQL Join

The Same Query using Cypher

MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total

Project Impact

Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question

Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base

Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
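To make the variable-length patterns in the Cypher query on this slide concrete, here is a rough Python sketch of what `*0..3` and `*1..3` express: follow MANAGES between a minimum and maximum number of hops. It assumes the hierarchy is a tree, so each person is reached by exactly one path. The `manages` data and the `reachable` helper are hypothetical.

```python
# Hypothetical MANAGES relationships: manager -> direct reports (assumed to form a tree).
manages = {
    "John Doe": ["Amy", "Raj"],
    "Amy": ["Lee"],
    "Lee": ["Kim"],
}


def reachable(start, min_hops, max_hops):
    """People reached from `start` by following min_hops..max_hops MANAGES edges."""
    found = [start] if min_hops == 0 else []
    frontier = [start]
    for depth in range(1, max_hops + 1):
        # Advance one hop from everyone on the current frontier.
        frontier = [child for node in frontier for child in manages.get(node, [])]
        if depth >= min_hops:
            found.extend(frontier)
    return found


# (boss)-[:MANAGES*0..3]->(sub), then (sub)-[:MANAGES*1..3]->(report)
totals = {}
for sub in reachable("John Doe", 0, 3):
    reports = reachable(sub, 1, 3)
    if reports:  # like MATCH, subordinates without any report produce no row
        totals[sub] = len(reports)

print(totals)
```

The Cypher version states the same intent declaratively in a single pattern, which is the readability point the slide makes.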

Page 75: Slides from GraphDay Santa Clara

CYPHER

Page 76: Slides from GraphDay Santa Clara

Users Love Cypher

Page 77: Slides from GraphDay Santa Clara

openCypher

Page 78: Slides from GraphDay Santa Clara

Ann Loves Dan

(Ann)-[:LOVES]->(Dan)

Page 79: Slides from GraphDay Santa Clara

Impact on the Business

Increase revenue

• Do new & impossible things
• Faster time-to-market

How?
• Value from data relationships
• Batch to real time
• 1000x faster

Reduce cost

• Lower infrastructure costs

How?
• Neo4j is ultra efficient & normally needs far less hardware than any alternative

Page 80: Slides from GraphDay Santa Clara

THANK YOU!

Page 81: Slides from GraphDay Santa Clara

Coffee Break Next session: Transform Your Data: A Worked Example

Page 82: Slides from GraphDay Santa Clara

TRANSFORM YOUR DATA

Santa Clara, April 14, 2016

Neo4j @ GraphDay

Page 83: Slides from GraphDay Santa Clara
Page 84: Slides from GraphDay Santa Clara
Page 85: Slides from GraphDay Santa Clara

"The future is now…"

Page 86: Slides from GraphDay Santa Clara

[Diagram labels: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2, ACCOUNT HOLDER 3, CREDIT CARD (x2), BANK ACCOUNT (x3), ADDRESS, PHONE NUMBER (x2), SSN 2 (x2), UNSECURED LOAN (x2)]

Page 87: Slides from GraphDay Santa Clara

ABOUT ME

@kevinvangundy

• Basically Gandalf

Page 88: Slides from GraphDay Santa Clara

AGENDA

• SQL Pains
• Building a Neo4j Application
• Moving from RDBMS -> Graph Models
• Walk through an Example
• Creating Data in Graphs
• Querying Data

Page 89: Slides from GraphDay Santa Clara

SQL

Day in the Life of an RDBMS Developer

Page 90: Slides from GraphDay Santa Clara
Page 91: Slides from GraphDay Santa Clara
Page 92: Slides from GraphDay Santa Clara
Page 93: Slides from GraphDay Santa Clara
Page 94: Slides from GraphDay Santa Clara
Page 95: Slides from GraphDay Santa Clara
Page 96: Slides from GraphDay Santa Clara
Page 97: Slides from GraphDay Santa Clara

SELECT p.name, c.country, c.leader, p.hair, u.name, u.pres, u.state
FROM people p
LEFT JOIN country c ON c.ID = p.country
LEFT JOIN uni u ON p.uni = u.id
WHERE u.state = 'CT'

Page 98: Slides from GraphDay Santa Clara

[Slide filled with the word JOIN, repeated 22 times]

Page 99: Slides from GraphDay Santa Clara

[Slide filled with the word JOIN, repeated 22 times]

Page 100: Slides from GraphDay Santa Clara

[Slide filled with the word JOIN, repeated 22 times]

Page 101: Slides from GraphDay Santa Clara
Page 102: Slides from GraphDay Santa Clara

Have you seen Ted's UUID?

Page 103: Slides from GraphDay Santa Clara
Page 104: Slides from GraphDay Santa Clara

SQL Pains

• Complex to model and store relationships
• Performance degrades with increases in data
• Queries get long and complex
• Maintenance is painful

Page 105: Slides from GraphDay Santa Clara

Graph Gains

• Easy to model and store relationships
• Performance of relationship traversal remains constant with growth in data size
• Queries are shortened and more readable
• Adding additional properties and relationships can be done on the fly - no migrations

Page 106: Slides from GraphDay Santa Clara

SQL Pains

Page 107: Slides from GraphDay Santa Clara

Graph Gains

Page 108: Slides from GraphDay Santa Clara

SQL Pains

Page 109: Slides from GraphDay Santa Clara

Graph Gains

Page 110: Slides from GraphDay Santa Clara
Page 111: Slides from GraphDay Santa Clara
Page 112: Slides from GraphDay Santa Clara
Page 113: Slides from GraphDay Santa Clara

How do you use Neo4j?

CREATE MODEL

+

LOAD DATA QUERY DATA

Page 114: Slides from GraphDay Santa Clara

How do you use Neo4j?

Page 115: Slides from GraphDay Santa Clara
Page 116: Slides from GraphDay Santa Clara

How do you use Neo4j?

Page 117: Slides from GraphDay Santa Clara

Language Drivers

Page 118: Slides from GraphDay Santa Clara

Language Drivers

Page 119: Slides from GraphDay Santa Clara

Native Server-Side Extensions

Page 120: Slides from GraphDay Santa Clara

Architectural Options

Data Storage and Business Rules Execution

Data Mining and Aggregation

Application

Graph Database Cluster: Neo4j, Neo4j, Neo4j

Ad Hoc Analysis

Bulk Analytic Infrastructure: Hadoop, EDW…

Data Scientist

End User

Databases: Relational, NoSQL, Hadoop

Page 121: Slides from GraphDay Santa Clara

RDBMS to Graph Options

MIGRATE ALL DATA, MIGRATE GRAPH DATA, DUPLICATE GRAPH DATA

[Diagram labels: Application (x3), Relational Database, Graph Database, Non-graph data, Graph data, All data]

Page 122: Slides from GraphDay Santa Clara

FROM RDBMS TO GRAPHS

Page 123: Slides from GraphDay Santa Clara
Page 124: Slides from GraphDay Santa Clara
Page 125: Slides from GraphDay Santa Clara

Northwind

Page 126: Slides from GraphDay Santa Clara

Northwind - the canonical RDBMS Example

Page 127: Slides from GraphDay Santa Clara

( )-[:TO]->(Graph)

Page 128: Slides from GraphDay Santa Clara
Page 129: Slides from GraphDay Santa Clara

( )-[:IS_BETTER_AS]->(Graph)

Page 130: Slides from GraphDay Santa Clara

Starting with the ER Diagram

Page 131: Slides from GraphDay Santa Clara

Locate the Foreign Keys

Page 132: Slides from GraphDay Santa Clara

Drop the Foreign Keys

Page 133: Slides from GraphDay Santa Clara

Find the JOIN Tables

Page 134: Slides from GraphDay Santa Clara

(Simple) JOIN Tables Become Relationships

Page 135: Slides from GraphDay Santa Clara

Attributed JOIN Tables -> Relationships with Properties
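The mapping rules on the preceding slides (drop the foreign keys, turn simple JOIN tables into relationships, turn attributed JOIN tables into relationships with properties) can be sketched on in-memory rows. This is a hedged illustration in plain Python. The rows, column names and the tuple layout used for relationships are assumptions modeled on Northwind, not an actual import tool.

```python
# Hypothetical rows in the style of Northwind; column names are assumptions.
categories = [{"CategoryID": 10, "CategoryName": "Beverages"}]
products = [{"ProductID": 1, "ProductName": "Chai", "CategoryID": 10}]
order_details = [{"OrderID": 100, "ProductID": 1, "Quantity": 5, "UnitPrice": 18.0}]

nodes, relationships = {}, []

# Each row of an entity table becomes a node, keyed by (label, id).
for row in categories:
    nodes[("Category", row["CategoryID"])] = {"categoryName": row["CategoryName"]}

for row in products:
    # The foreign key column is not copied onto the node ...
    nodes[("Product", row["ProductID"])] = {"productName": row["ProductName"]}
    # ... it becomes a relationship instead.
    relationships.append(
        (("Product", row["ProductID"]), "PART_OF", ("Category", row["CategoryID"]), {})
    )

# An attributed JOIN table row becomes a relationship with properties.
for row in order_details:
    nodes.setdefault(("Order", row["OrderID"]), {})
    relationships.append(
        (
            ("Order", row["OrderID"]),
            "INCLUDES",
            ("Product", row["ProductID"]),
            {"quantity": row["Quantity"], "unitPrice": row["UnitPrice"]},
        )
    )

print(len(nodes), len(relationships))
```

The order-details row disappears as an entity of its own; its columns survive as properties on the INCLUDES relationship.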

Page 136: Slides from GraphDay Santa Clara

Querying a Subset Today

Page 137: Slides from GraphDay Santa Clara

As a Graph

Page 138: Slides from GraphDay Santa Clara

QUERYING THE GRAPH

Page 139: Slides from GraphDay Santa Clara

using openCypher

Page 140: Slides from GraphDay Santa Clara

Who do people report to?

MATCH (sub:Employee)-[:REPORTS_TO]->(e:Employee)
RETURN *

Page 141: Slides from GraphDay Santa Clara

Who do people report to?

Page 142: Slides from GraphDay Santa Clara

Who do people report to?

MATCH (sub:Employee)-[:REPORTS_TO]->(e:Employee)
RETURN e.employeeID AS managerID, e.firstName AS managerName,
       sub.employeeID AS employeeID, sub.firstName AS employeeName;

Page 143: Slides from GraphDay Santa Clara

Who do people report to?

Page 144: Slides from GraphDay Santa Clara

Who does Robert report to?

MATCH p=(sub:Employee)-[:REPORTS_TO]->(e:Employee)
WHERE sub.firstName = 'Robert'
RETURN p

Page 145: Slides from GraphDay Santa Clara

Who does Robert report to?

Page 146: Slides from GraphDay Santa Clara

What is Robert’s reporting chain?

MATCH p=(sub:Employee)-[:REPORTS_TO*]->(e:Employee)
WHERE sub.firstName = 'Robert'
RETURN p
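The `*` in `[:REPORTS_TO*]` means "follow this relationship any number of times". As a rough illustration of that idea, the Python sketch below walks a hypothetical employee-to-manager mapping upward until it runs out of managers. The Cypher query returns every matching path, including the shorter ones, whereas this sketch returns only the full chain.

```python
# Hypothetical REPORTS_TO relationships: employee -> manager.
reports_to = {"Robert": "Steven", "Steven": "Andrew", "Nancy": "Andrew"}


def reporting_chain(employee):
    """Follow REPORTS_TO from `employee` up to the top of the hierarchy."""
    chain, seen = [], {employee}
    while employee in reports_to:
        employee = reports_to[employee]
        if employee in seen:  # guard against accidental cycles in the data
            break
        seen.add(employee)
        chain.append(employee)
    return chain


chain = reporting_chain("Robert")
print(chain)
```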

Page 147: Slides from GraphDay Santa Clara

What is Robert’s reporting chain?

Page 148: Slides from GraphDay Santa Clara

Report: Product Cross-Selling

MATCH (o:Order)-[:INCLUDES]->(:Product {productName: 'Chocolade'}),
      (employee)-[:SOLD]->(o),
      (employee)-[:SOLD]->(otherOrder)-[:INCLUDES]->(other:Product)
RETURN employee.firstName, other.productName, COUNT(DISTINCT otherOrder) AS count
ORDER BY count DESC;
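Read step by step, the cross-selling query finds orders that include a given product, the employee who sold each of them, that employee's other orders, and the products in those orders, then counts distinct orders per employee and product. A minimal Python sketch of the same logic follows. The `orders` data and the `cross_selling` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: order id -> (employee who SOLD it, products it INCLUDES).
orders = {
    1: ("Nancy", {"Chocolade", "Chai"}),
    2: ("Nancy", {"Chai", "Tofu"}),
    3: ("Nancy", {"Chai"}),
    4: ("Robert", {"Tofu"}),
}


def cross_selling(product):
    """Count, per employee and product, the other orders sold by sellers of `product`."""
    distinct_orders = defaultdict(set)
    for order_id, (employee, items) in orders.items():
        if product not in items:
            continue
        # Other orders sold by the same employee.
        for other_id, (other_employee, other_items) in orders.items():
            if other_id != order_id and other_employee == employee:
                for item in other_items:
                    distinct_orders[(employee, item)].add(other_id)
    ranked = [(emp, item, len(ids)) for (emp, item), ids in distinct_orders.items()]
    return sorted(ranked, key=lambda row: row[2], reverse=True)


report = cross_selling("Chocolade")
print(report)
```

Robert never sold Chocolade, so his order does not contribute to the report.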

Page 149: Slides from GraphDay Santa Clara

Product Cross-Selling

Page 150: Slides from GraphDay Santa Clara

POWERING AN APP

Page 151: Slides from GraphDay Santa Clara

Simple App

Page 152: Slides from GraphDay Santa Clara

Simple Python Code

Page 153: Slides from GraphDay Santa Clara

Simple Python Code

Page 154: Slides from GraphDay Santa Clara

Simple Python Code

Page 155: Slides from GraphDay Santa Clara

Simple Python Code

Page 156: Slides from GraphDay Santa Clara

But how do I liberate my RDBMS data?

Page 157: Slides from GraphDay Santa Clara

CSV

Page 158: Slides from GraphDay Santa Clara

CSV files for Northwind

Page 159: Slides from GraphDay Santa Clara

3 Steps to Creating the Graph

IMPORT NODES CREATE INDEXES IMPORT RELATIONSHIPS

Page 160: Slides from GraphDay Santa Clara

Importing Nodes

// Create categories
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/categories.csv" AS row
CREATE (:Category {categoryID: row.CategoryID, categoryName: row.CategoryName, description: row.Description});

// Create orders
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MERGE (order:Order {orderID: row.OrderID})
  ON CREATE SET order.shipName = row.ShipName;

Page 161: Slides from GraphDay Santa Clara

Importing Nodes

// Create customers
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/customers.csv" AS row
CREATE (:Customer {companyName: row.CompanyName, customerID: row.CustomerID, fax: row.Fax, phone: row.Phone});

// Create products
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
CREATE (:Product {productName: row.ProductName, productID: row.ProductID, unitPrice: toFloat(row.UnitPrice)});

Page 162: Slides from GraphDay Santa Clara

Creating Indexes

CREATE CONSTRAINT ON (p:Product) ASSERT p.productID IS UNIQUE;
CREATE CONSTRAINT ON (e:Employee) ASSERT e.employeeID IS UNIQUE;
CREATE CONSTRAINT ON (c:Customer) ASSERT c.customerID IS UNIQUE;
CREATE INDEX ON :Product(productName);
CREATE INDEX ON :Category(categoryID);
CREATE INDEX ON :Supplier(supplierID);
// Customer nodes are created with companyName, so index that property
CREATE INDEX ON :Customer(companyName);

Page 163: Slides from GraphDay Santa Clara

Sew it together…

Page 164: Slides from GraphDay Santa Clara

Creating Relationships

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (customer:Customer {customerID: row.CustomerID})
MERGE (customer)-[:PURCHASED]->(order);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (supplier:Supplier {supplierID: row.SupplierID})
MERGE (supplier)-[:SUPPLIES]->(product);

Page 165: Slides from GraphDay Santa Clara

Creating Relationships

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (product:Product {productID: row.ProductID})
MERGE (order)-[pu:INCLUDES]->(product)
  ON CREATE SET pu.unitPrice = toFloat(row.UnitPrice), pu.quantity = toFloat(row.Quantity);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/orders.csv" AS row
MATCH (order:Order {orderID: row.OrderID})
MATCH (employee:Employee {employeeID: row.EmployeeID})
MERGE (employee)-[:SOLD]->(order);

Page 166: Slides from GraphDay Santa Clara

Creating Relationships

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/products.csv" AS row
MATCH (product:Product {productID: row.ProductID})
MATCH (category:Category {categoryID: row.CategoryID})
MERGE (product)-[:PART_OF]->(category);

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "https://raw.githubusercontent.com/neo4j-contrib/developer-resources/gh-pages/data/northwind/employees.csv" AS row
MATCH (employee:Employee {employeeID: row.EmployeeID})
MATCH (manager:Employee {employeeID: row.ReportsTo})
MERGE (employee)-[:REPORTS_TO]->(manager);
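The reason for the three-step order (import nodes, create indexes, import relationships) is that each relationship row has to look up both of its end nodes by key, and that lookup is cheap only once an index exists. The sketch below mimics the sequence in plain Python with the standard `csv` module. The CSV content is made up, and a dictionary stands in for the database index.

```python
import csv
import io

# Hypothetical CSV content standing in for an employees file.
employees_csv = io.StringIO(
    "EmployeeID,FirstName,ReportsTo\n"
    "2,Andrew,\n"
    "5,Steven,2\n"
    "7,Robert,5\n"
)
rows = list(csv.DictReader(employees_csv))

# Step 1: import nodes.
nodes = [{"employeeID": r["EmployeeID"], "firstName": r["FirstName"]} for r in rows]

# Step 2: create an index (here, a dict keyed by employeeID).
index = {node["employeeID"]: node for node in nodes}

# Step 3: import relationships by looking up both ends through the index.
# Rows whose manager cannot be found produce no relationship, much like a
# MATCH that finds nothing.
reports_to = [
    (index[r["EmployeeID"]]["firstName"], index[r["ReportsTo"]]["firstName"])
    for r in rows
    if r["ReportsTo"] in index
]

print(reports_to)
```

Andrew has an empty `ReportsTo` value, so no REPORTS_TO relationship is created for him.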

Page 167: Slides from GraphDay Santa Clara

High Performance LOADing

neo4j-import

4.58 million things and their relationships…

Loads in 100 seconds!

Page 168: Slides from GraphDay Santa Clara

WRAPPING UP

Page 169: Slides from GraphDay Santa Clara

“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”

Page 170: Slides from GraphDay Santa Clara

THANK YOU!

Kevin Van Gundy @kevinvangundy