Intro to Graphs for HR Analytics

Post on 02-Dec-2014

Category:

Data & Analytics

DESCRIPTION

This is the talk I gave at the Belgian Data Science meetup on how graph databases can help with HR analytics.

TRANSCRIPT

An Intro to Graphs for HR Analytics

rik@neotechnology.com

Agenda

• About Graphs
• About Graph Databases
• Why Graph Databases matter for HR Analytics
  – Short demonstration
• Case Studies
• Q&A

Introduction: about Graphs

Meet Leonhard Euler

• Swiss mathematician
• Inventor of Graph Theory (1736)

Königsberg (Prussia) - 1736

[Figure: Königsberg and its seven bridges, with land masses labelled A to D and bridges numbered 1 to 7]
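Euler's argument about the Königsberg bridges can be checked in a few lines. The sketch below is illustrative only: which land mass carries which label, and which bridge joins which pair, are assumptions based on the classic description of the puzzle, not a reading of the slide's figure. In a connected graph, a walk that crosses every bridge exactly once exists only if zero or two land masses touch an odd number of bridges.

```python
from collections import Counter

# Assumed layout (not taken from the slide): A is the central island,
# B and C are the two river banks, D is the second island.
# Each tuple is one bridge.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"),
    ("B", "D"),
    ("C", "D"),
]


def degrees(edges):
    """Count how many bridges touch each land mass."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return dict(counts)


def has_euler_walk(edges):
    """True if a walk using every edge exactly once can exist.

    Assumes the graph is connected, which holds for this layout.
    """
    odd = [node for node, degree in degrees(edges).items() if degree % 2 == 1]
    return len(odd) in (0, 2)


print(degrees(BRIDGES))
print(has_euler_walk(BRIDGES))
```

With this layout every land mass has an odd number of bridges, so no such walk exists.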

About Graph Databases

So what is a graph database?

• OLTP database
  – “end-user” transactions
• Model, store, manage data as a graph

What is a graph?

• Node
• Relationship

Contrast with Relational

Graphs are often referred to as “whiteboard friendly”: the data model reflects the way a domain expert would naturally draw their data on a whiteboard. “The schema is the data.” Schema flexibility allows the system to change in response to a changing environment.
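As a rough illustration of “the schema is the data”, the sketch below models a tiny property graph in plain Python. The `Node` and `Relationship` classes and the employee data are hypothetical and are not Neo4j's API. The point is only that two nodes with the same label can carry different properties without any schema change.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical HR data: both are "Person" nodes, but only one has a role.
alice = Node("Person", {"name": "Alice", "role": "Data Scientist"})
bob = Node("Person", {"name": "Bob"})
python_skill = Node("Skill", {"name": "Python"})

relationships = [
    Relationship(alice, "WORKS_WITH", bob, {"since": 2013}),
    Relationship(alice, "HAS_SKILL", python_skill),
]


def outgoing(node, rel_type):
    """Return the nodes reached from `node` over relationships of `rel_type`."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(alice, "HAS_SKILL")])
```

Adding a new kind of node or relationship later means appending data, not altering a table definition.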

What are graphs good for?

Complex Querying

Examples of complex queries: 1. Semi-structure in datasets

– Normalization introduces complexity
– Forces developers to write all kinds of application logic to deal with this variability

Examples of complex queries: 2. Connectedness in data

Lots of normalized relationships between the different entities force developers to do
• Deep joins
• Recursive joins
• Pathfinding operations
• “Open-ended” queries
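To make “pathfinding” concrete: in a relational model, finding how two employees are connected takes one self-join per hop, and the number of hops is not known in advance. On a graph it is a plain breadth-first search. The sketch below uses a hypothetical “works with” network held in a Python dict; the names are made up.

```python
from collections import deque

# Hypothetical undirected "works with" network, stored as adjacency lists.
WORKS_WITH = {
    "Ann": ["Bob", "Cem"],
    "Bob": ["Ann", "Dee"],
    "Cem": ["Ann", "Dee"],
    "Dee": ["Bob", "Cem", "Eva"],
    "Eva": ["Dee"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


print(shortest_path(WORKS_WITH, "Ann", "Eva"))
```

The search depth is decided by the data rather than by the number of joins written into the query.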

Examples of Connectedness

Graph Querying

Querying a Graph

• “Graph local” vs “Graph global”
  – Contextualized “ego-centric” queries
• “Parachute” into graph
  – Start node(s)
    • Found through index lookups
• Crawl the surrounding graph
  – 2 million+ joins per second
    • No more index lookups: index-free adjacency
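The “parachute, then crawl” idea can be sketched as follows. This is a simplified in-memory model, not a description of how Neo4j is implemented: a dict plays the role of the index that is used once to find the start node, and each node holds direct references to its neighbours, so the traversal itself needs no further lookups. All class, function, and variable names are hypothetical.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.colleagues = []  # direct references to neighbouring nodes


def connect(a, b):
    """Create an undirected link between two people."""
    a.colleagues.append(b)
    b.colleagues.append(a)


# The "index": used only to find the start node by name.
people = {name: Person(name) for name in ["Ann", "Bob", "Cem", "Dee", "Eva"]}
connect(people["Ann"], people["Bob"])
connect(people["Bob"], people["Cem"])
connect(people["Cem"], people["Dee"])
connect(people["Dee"], people["Eva"])


def neighbourhood(index, start_name, max_hops):
    """Names of everyone within `max_hops` of the start node."""
    start = index[start_name]  # the only index lookup ("parachute")
    seen = {start.name}
    frontier = [start]
    for _ in range(max_hops):  # "crawl" by following references
        next_frontier = []
        for person in frontier:
            for other in person.colleagues:
                if other.name not in seen:
                    seen.add(other.name)
                    next_frontier.append(other)
        frontier = next_frontier
    seen.discard(start_name)
    return seen


print(sorted(neighbourhood(people, "Ann", 2)))
```

The cost of such an ego-centric query depends on the size of the neighbourhood visited, not on the total number of people stored.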

Queries: Pattern Matching

Pattern

Short demo: HR Analytics

Domains that jump out

• The REAL Enterprise Social Network
  – Better understanding of the “coffee-room” network
• Recruitment
  – Micro-targeting
  – Social integration
• Competency management
  – Smart matching
  – Taxonomies
  – Optimization algorithms

It always starts with a MODEL

Then for some Queries

• Network Analytics
  – Degree Centrality
  – Betweenness Centrality
  – PageRank
• Recommendations
  – Triadic closures
  – Complex pattern matching
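Two of the queries listed above are small enough to sketch directly. The example below computes degree centrality (the share of other people each person is directly connected to) and triadic-closure recommendations (pairs who are not yet connected but share at least one contact) on a hypothetical undirected network. Betweenness centrality and PageRank are more involved and are left out of this sketch.

```python
from itertools import combinations

# Hypothetical undirected "coffee-room" network: person -> set of contacts.
NETWORK = {
    "Ann": {"Bob", "Cem"},
    "Bob": {"Ann", "Cem", "Dee"},
    "Cem": {"Ann", "Bob"},
    "Dee": {"Bob", "Eva"},
    "Eva": {"Dee"},
}


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    return {node: len(contacts) / (n - 1) for node, contacts in graph.items()}


def triadic_closures(graph):
    """Unconnected pairs with shared contacts, mapped to those contacts."""
    suggestions = {}
    for a, b in combinations(sorted(graph), 2):
        if b in graph[a]:
            continue  # already connected, nothing to recommend
        shared = graph[a] & graph[b]
        if shared:
            suggestions[(a, b)] = sorted(shared)
    return suggestions


print(degree_centrality(NETWORK))
print(triadic_closures(NETWORK))
```

In an HR setting the second function is the basis of “you may want to meet” suggestions: Ann and Dee both know Bob, so introducing them closes the triangle.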

Use Cases (neo4j.com/use-cases)

Customers (neo4j.com/customers)

Graph Gists (http://gist.neo4j.org/)

Neo Technology, Inc Confidential

Neo4j License Overview

Developer Seats ($6K*/Developer/Year)
Includes access by programmers to licensed test instances, and private instances on the programmer’s personal machine for the sole purpose of writing, debugging, or testing software designed to access Neo4j.

Test Instances ($6K/Instance/Year)
Instances whose purpose is to ensure that the software accessing Neo4j is meeting specification (e.g. System Test, Integration Test, UAT, Performance Test, Staging).

Production Instances (Bundle / Core Pricing)
Instances that store and process data in a way that benefits and advances an organization’s goals. May be accessed by applications and/or end users.

*Or otherwise, depending on the Bundle, and negotiation

Neo4j versions / licenses

Personal < Startup / Departmental < Enterprise deployment models

Open source & Commercial license terms available

Specific OEM models

Future trainings & events!

Neo Technology: www.neotechnology.com
Neo4j: www.neo4j.org
rik@neotechnology.com or +32 478 686800

Q&A, Conclusion, Next Steps
