Graphs in the Real World March 2015

Upload: neo4j-the-fastest-and-most-scalable-native-graph-database

Post on 14-Jul-2015

Graphs in the Real World

March 2015

Value from Data Relationships

Common Graph Database Use Cases

Internal Applications

Master Data Management

Network and IT Operations

Fraud Detection

Customer-Facing Applications

Real-Time Recommendations

Graph-Based Search

Identity and Access Management

Graphs for Master Data Management

MDM Solutions with Graph Databases

Figure: four example master data graphs

Organizational Hierarchy: VP, Director, Manager, Staff

Product Subscriptions: User, Customers, Accounts, Subscriptions, linked by USER_ACCESS, CONTROLLED_BY and SUBSCRIBED_BY

CMDB Network Inventory: Service, Router, Switch, Fiber Link, Ocean Cable

Social Networks

MDM Isn’t Hierarchical

Typical MDM system structure…

…but MDM is really a network

Figure: the same six entities (Patient, Agent, G.P., Surgeon, Partner, Insurance) shown in both structures

Challenges with Current MDM Systems

Lack of support for non-hierarchical or matrix data relationships

• Master data is never strictly hierarchical

• Systems are designed for fixed top-down hierarchy

• Non-hierarchical data is not supported

Inability to unlock value from data relationships

• Systems store only very simple data relationships

• Complex relationships and links not stored

Inflexible and expensive to maintain

• Changes to the model are expensive and time-consuming

die Bayerische – Master Data Management

• Field sales unit needed easy access to policies and customer data in a variety of ways

• Growing business needed growing support

• Existing IBM DB2 system unable to meet performance requirements as it scaled

• Needed 24/7 system for sales unit outside the company

Mid-size German insurer

Founded in 1858

More than 500 employees

Project executed by Delvin GmbH, subsidiary of die Bayerische Versicherung

die Bayerische SOLUTION

• Enables field sales unit to flexibly search for insurance policies and personal data

• Raises the bar for insurance industry practices

• Supports the business as it scales, with great performance

• Ported metadata into Neo4j easily

Classmates – Social network

Online yearbook connecting friends from school, work and military in US and Canada

Founded as Memory Lane in Seattle

Develop new social networking capabilities to monetize yearbook-related offerings

• Show all the people I know in a yearbook

• Show yearbooks my friends appear in most often

• Show sections of a yearbook that my friends appear in most

• Show me other schools my friends attended
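
The second query in the list above ("yearbooks my friends appear in most often") is a two-hop traversal: person to friends, friends to yearbooks. The following is a minimal Python sketch over in-memory dictionaries, not Classmates' actual Neo4j implementation; the names `friends`, `appearances` and `top_yearbooks_for`, and all sample data, are hypothetical.

```python
from collections import Counter

# Hypothetical sample data: who is friends with whom, and which
# yearbooks each person appears in.
friends = {
    "ann": {"bob", "cara", "dev"},
    "bob": {"ann"},
}
appearances = {
    "bob": ["lincoln-1990", "lincoln-1991"],
    "cara": ["lincoln-1990"],
    "dev": ["roosevelt-1989", "lincoln-1990"],
}


def top_yearbooks_for(person, limit=3):
    """Rank yearbooks by how many of `person`'s friends appear in them."""
    counts = Counter()
    for friend in friends.get(person, ()):
        # set() so a friend listed twice in one yearbook is counted once.
        counts.update(set(appearances.get(friend, ())))
    # Sort by count descending, then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
```

In a graph database the same question is a pattern match over friend and appearance relationships, so no join tables are involved.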

Classmates SOLUTION

Neo4j provides a robust and scalable graph database solution

• 3-instance cluster with cache sharding and disaster-recovery

• 18ms response time for top 4 queries

• 100M nodes and 600M relationships in initial graph—including people, images, schools, yearbooks and pages

• Projected to grow to 1B nodes and 6B relationships

Graphs for Network and IT Operations Management

Network Graphs – Telco Example

PROBLEM

Need: Instantly diagnose problems in networks of 1B+ elements

But: Basing diagnosis solely on streaming machine data severely limits accuracy and effectiveness

SOLUTION

Real-time graph analytics provide actionable insight for the largest complex connected networks in the world

• The entire network lives in a graph

• Analyzes dependencies in real time

• Highly scalable with carrier-grade uptime requirements
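
"Analyzes dependencies" can be illustrated as an impact query: given a failed element, walk the dependency edges in reverse to find everything that relies on it. This is a small Python sketch under assumed data, not the telco's system; the topology and the names `depends_on` and `impacted_by` are hypothetical, and it treats any failed dependency as an impact (no redundancy modeling).

```python
from collections import deque

# Hypothetical topology: each element maps to the elements it depends on.
depends_on = {
    "service-a": ["router-1"],
    "service-b": ["router-2"],
    "router-1": ["switch-1"],
    "router-2": ["switch-1", "switch-2"],
    "switch-1": ["fiber-1"],
    "switch-2": ["fiber-2"],
}


def impacted_by(failed):
    """Return every element that directly or transitively depends on `failed`."""
    # Invert the edges so we can walk from a failure up to its dependents.
    dependents = {}
    for element, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(element)

    seen, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```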

Graphs for Fraud Detection

Fraud Scenarios

Retail First Party Fraud

• Opening many lines of credit with no intention of paying back

• Accounts for $10B+ in annual losses at US banks (1)

Synthetic Identities and Fraud Rings

• Rings of synthetic identities committing fraud

Insurance – Whiplash for Cash

• Insurance scams using fake drivers, passengers and witnesses

• Increase network efficiency

eCommerce Fraud

• Online payment fraud

(1) Business Insider: http://www.businessinsider.com/how-to-use-social-networks-in-the-fight-against-first-party-fraud-2011-3

Discrete Data Analysis

Figure: Revolving Debt plotted against Number of accounts, with regions marked INVESTIGATE

Pros

• Simple

• Stops rookies

Cons

• False positives

• False negatives

Connected Analysis

Figure: Revolving Debt plotted against Number of accounts

Pros

• Detect fraud rings

• Fewer false negatives

Doing Connected Analysis is Challenging

• Large amounts of data and relationships must be processed

• New data and relationships are continually being added

• Fraud rings must be uncovered in real-time to prevent fraud

Value: Effective in detecting some of the most impactful attacks, even from organized rings

Challenge: Extremely difficult with traditional technologies

For example, a ten-person fraud bust-out is $1.5M, assuming 100 false identities and 3 financial instruments per identity, each with a $5K credit limit
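
The $1.5M figure is the product of the three stated assumptions. A short Python check of the arithmetic, using only the numbers from the slide (the variable names are illustrative):

```python
# Figures taken from the slide text above.
false_identities = 100
instruments_per_identity = 3
credit_limit_usd = 5_000

# Exposure if every instrument is drawn to its limit.
potential_loss_usd = false_identities * instruments_per_identity * credit_limit_usd
print(f"${potential_loss_usd:,}")  # prints $1,500,000
```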

Connected Analysis with Neo4j

Modeling a Fraud Ring as a Graph

Figure: view of fraud ring in a graph database. Nodes: Account Holder 1, Account Holder 2, Account Holder 3; identifiers SSN 2, Phone Number 2 and Address 1; financial instruments Credit Card, Bank Account and Unsecured Loan
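
One way to surface a candidate ring from this kind of model is to group account holders who are connected, directly or indirectly, through a shared identifier. The sketch below does this in plain Python with a connected-components walk. It is an illustration under assumed data, not Neo4j's method; the records and the name `candidate_rings` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: each account holder and the identifiers they supplied.
holders = {
    "holder-1": {"ssn": "ssn-1", "phone": "phone-1", "address": "address-1"},
    "holder-2": {"ssn": "ssn-2", "phone": "phone-2", "address": "address-1"},
    "holder-3": {"ssn": "ssn-2", "phone": "phone-2", "address": "address-3"},
    "holder-4": {"ssn": "ssn-4", "phone": "phone-4", "address": "address-4"},
}


def candidate_rings(records, min_size=2):
    """Group holders linked, directly or indirectly, by a shared identifier."""
    # Index: (kind, value) -> holders using that identifier.
    by_identifier = defaultdict(set)
    for holder, identifiers in records.items():
        for kind, value in identifiers.items():
            by_identifier[(kind, value)].add(holder)

    # Holder-to-holder adjacency through any shared identifier.
    neighbours = defaultdict(set)
    for group in by_identifier.values():
        for holder in group:
            neighbours[holder] |= group - {holder}

    rings, seen = [], set()
    for start in records:
        if start in seen:
            continue
        # Depth-first walk collects one connected component.
        component, stack = set(), [start]
        while stack:
            holder = stack.pop()
            if holder in component:
                continue
            component.add(holder)
            stack.extend(neighbours[holder] - component)
        seen |= component
        if len(component) >= min_size:
            rings.append(component)
    return rings
```

A shared identifier is only a signal for investigation: family members legitimately share addresses and phone numbers.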

Modeling Insurance Fraud as a Graph

Figure: Accident 1 and Accident 2 connected to Person 1 through Person 6 and Car 1 through Car 4. Relationship types: INVOLVES, DRIVES, REPRESENTS, WITNESSES, ADJUSTS, HEALS

Gartner’s Layered Fraud Prevention Approach (4)

(4) http://www.gartner.com/newsroom/id/1695014

Layer 1, Endpoint-Centric: Analysis of users and their endpoints

Layer 2, Navigation-Centric: Analysis of navigation behavior and suspect patterns

Layer 3, Account-Centric: Analysis of anomaly behavior by channel

Layer 4, Cross-Channel: Analysis of anomaly behavior correlated across channels

Layer 5, Entity Linking: Analysis of relationships to detect organized crime and collusion

Figure labels across the layers: Traditional Fraud Prevention, Discrete Data Analysis, Connected Analysis

Graphs for Real-time Recommendations

Real-Time Recommendations - Benefits

Online Retail

• Suggest related products and services

• Increase revenue and engagement

Media and Broadcasting

• Create an engaging experience

• Produce personalized content and offers

Logistics

• Recommend optimal routes

• Increase network efficiency

Real-Time Recommendations - Challenges

Make effective real-time recommendations

• Timing is everything in point-of-touch applications

• Base recommendations on current data, not last night’s batch load

Process large amounts of data and relationships for context

• Relevance is king: Make the right connections

• Drive traffic: Get users to do more with your application

Accommodate new data and relationships continuously

• Systems get richer with new data and relationships

• Recommendations become more relevant

Using Data Relationships for Recommendations

Collaborative filtering

Predict what users like based on the similarity of their behaviors, activities and preferences to others

Content-based filtering

Recommend items based on what users have liked in the past

Figure labels: Movie, Person, Person
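
Collaborative filtering as described above can be sketched in a few lines: weight each other person by how many likes they share with the target user, then score the movies the target has not yet liked. This is a deliberately simple Python illustration with hypothetical data and names (`likes`, `recommend`), not a production recommender or the approach of any company in this deck.

```python
from collections import Counter

# Hypothetical data: the movies each person has liked.
likes = {
    "alice": {"Alien", "Blade Runner", "Heat"},
    "bob": {"Alien", "Blade Runner", "Solaris"},
    "carol": {"Heat", "Solaris", "Ronin"},
    "dave": {"Amelie"},
}


def recommend(person, limit=3):
    """Suggest movies liked by people whose tastes overlap with `person`'s."""
    mine = likes[person]
    scores = Counter()
    for other, theirs in likes.items():
        if other == person:
            continue
        overlap = len(mine & theirs)  # shared likes act as the similarity weight
        if overlap == 0:
            continue
        for movie in theirs - mine:
            scores[movie] += overlap
    # Highest score first; ties broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
```

Because scores are computed from the current `likes` data at call time, a newly recorded like affects the next recommendation immediately, which is the point the slides make about not relying on last night's batch load.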

Walmart – Retail Recommendations

World’s largest company by revenue

World’s largest retailer and private employer

SF-based global e-commerce division manages several websites

Founded in 1969, Bentonville, Arkansas

• Needed online customer recommendations to keep pace with competition

• Data connections provided predictive context, but were not in a usable format

• Solution had to serve many millions of customers and products while maintaining superior scalability and performance

Walmart SOLUTION

• Brings customers, preferences, purchases, products and locations into a graph model

• Uses data relationships to make product recommendations

• Solution deployed across Walmart divisions and websites

GRAPHS ARE EATING RETAIL

Figure labels: CUSTOMERS, ORDERS, PRODUCT, CATEGORY

THE PROBLEM

COMPETITIVE PRESSURE DEMANDS ONLINE RECOMMENDATIONS.

CONNECTIONS HOLD PREDICTIVE CONTEXT

CONNECTIONS IN THE DATA NOT IN A USABLE FORMAT

THE SOLUTION

BRING THE DATA INTO A GRAPH SO THAT THE CONNECTIONS CAN BE USED TO MAKE PRODUCT RECOMMENDATIONS.

OTHER EXAMPLES

eBay – Real-time routing recommendations

C2C and B2C retail network

Full e-commerce functionality for individuals and businesses

Integrated with logistics vendors for product deliveries

• Needed an offering to compete with Amazon Prime and Google Express

• Enable customer-selected delivery within 90 minutes

• Calculate best route option in real-time

• Scale to enable a variety of services

• Offer more predictable delivery times
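
"Calculate best route option" is, at its core, a shortest-path problem over a weighted graph. The slides do not say which algorithm eBay used; the sketch below uses Dijkstra's algorithm purely as an illustration. The road graph, the travel times and the names `travel_minutes` and `fastest_route` are hypothetical.

```python
import heapq

# Hypothetical road graph: travel times in minutes between locations.
travel_minutes = {
    "store": {"depot": 10, "bridge": 25},
    "depot": {"bridge": 10, "customer": 40},
    "bridge": {"customer": 15},
    "customer": {},
}


def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path), or None if unreachable."""
    queue = [(0, start, [start])]
    best = {}  # cheapest known cost at which each node was expanded
    while queue:
        minutes, node, path = heapq.heappop(queue)
        if node == goal:
            return minutes, path
        if node in best and best[node] <= minutes:
            continue  # already expanded via a route at least as fast
        best[node] = minutes
        for neighbour, cost in graph.get(node, {}).items():
            heapq.heappush(queue, (minutes + cost, neighbour, path + [neighbour]))
    return None
```

Here the direct store-to-bridge leg (25 minutes) loses to the detour through the depot (10 + 10 minutes).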

eBay Now SOLUTION

• Acquired UK-based Shutl, a leader in same-day delivery

• Used Neo4j to create eBay Now

• 1000 times faster than the prior MySQL-based solution

• Faster time-to-market

• Improved code quality with 10 to 100 times less query code

Graphs for Graph-Based Search

Curaspan – Graph-based Search

Leader in patient management for discharges and referrals

Manages patient referrals for 4600+ health care facilities

Connects providers, payers via web-based patient management platform

Founded in 1999 in Newton, Massachusetts

• Improve poor performance of Oracle solution

• Support more complexity including granular, role-based access control

• Satisfy complex Graph Search queries by discharge nurses and intake coordinators

Example: Find a skilled nursing facility within n miles of a given location, belonging to health care group XYZ, offering speech therapy and cardiac care, and optionally Italian language services
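
The example query above combines mandatory criteria (distance, group, services) with an optional preference (language). The following Python sketch shows that logic over a flat list of records so the criteria are easy to read. It is not Curaspan's implementation and not a graph query; the `Facility` type and the names `miles_between` and `search` are hypothetical.

```python
from dataclasses import dataclass, field
from math import asin, cos, radians, sin, sqrt


@dataclass
class Facility:
    name: str
    group: str
    lat: float
    lon: float
    services: set = field(default_factory=set)
    languages: set = field(default_factory=set)


def miles_between(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(a))  # 3958.8 = mean Earth radius in miles


def search(facilities, lat, lon, max_miles, group, required_services,
           preferred_language=None):
    """Apply the mandatory filters, then list language matches first."""
    matches = [
        f for f in facilities
        if f.group == group
        and required_services <= f.services
        and miles_between(lat, lon, f.lat, f.lon) <= max_miles
    ]
    # sorted() is stable and False sorts before True, so facilities
    # offering the preferred language come first.
    return sorted(matches, key=lambda f: preferred_language not in f.languages)
```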

Curaspan SOLUTION

• Met fast, real-time performance demands

• Supported queries that span multiple hierarchies, including provider and employee-permissions graphs

• Improved data model to handle adding more dimensions to the data such as insurance networks, service areas and care organizations

• Greatly simplified queries, reducing multi-page SQL statements to one Neo4j function

Graphs for Identity and Access Management

Telenor – Identity & Access Management

Oslo-based telco

#1 in Nordic countries, #10 in world

Mission-critical system

Availability and responsiveness critical to customer satisfaction

Millions of plans, customers, admins, groups

• Highly interconnected data set with massive joins

Degrading relational performance

• Login took minutes to retrieve access rights

Nightly batch workaround

• Solved performance problem, but meant data was not current

Replace slow Sybase system

• Batch workaround reached 9 hours in 2014, longer than the nightly batch window

Telenor SOLUTION

• Modeling resource graph was straightforward, as the domain is a graph

• Moved authorization from Sybase to Neo4j

• Retired faulty nightly batch process

• Moved real-time response to milliseconds

• Showed fresh data, not yesterday’s snapshot

• Addressed customer retention risks

• Kept business running through aggressive data growth
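
Resolving access rights over a resource graph amounts to a traversal: start at the user, follow group memberships, and collect whatever those groups are granted. The Python sketch below illustrates that idea with hypothetical data and names (`member_of`, `grants`, `accessible_plans`); Telenor's actual model and queries are not shown in the slides.

```python
# Hypothetical resource graph: users belong to groups, groups nest inside
# other groups, and groups are granted access to subscription plans.
member_of = {
    "admin-kari": ["group-sales"],
    "group-sales": ["group-nordic"],
    "group-nordic": [],
}
grants = {
    "group-sales": {"plan-100"},
    "group-nordic": {"plan-200", "plan-300"},
}


def accessible_plans(principal):
    """Collect every plan reachable from `principal` through group membership."""
    plans, seen, stack = set(), set(), [principal]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # guards against cycles in group nesting
        seen.add(node)
        plans |= grants.get(node, set())
        stack.extend(member_of.get(node, []))
    return plans
```

The traversal only touches the nodes reachable from the user, which is why the work does not grow with the total number of plans and customers in the way a multi-way relational join does.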
