The Five Graphs of Government: How Federal Agencies Can Utilize Graph Technology
TRANSCRIPT
WASHINGTON D.C.
FEBRUARY 28, 2017
Agenda
09:00-09:30 Breakfast and Registration
09:30-10:15 How Government Agencies use Neo4j to Build the Next Generation of Applications and Services
10:15-11:00 Intelligence Analysis with Neo4j
11:00-11:30 Break
11:30-12:15 Leveraging the Graph for Knowledge Architecture at NASA
12:30-13:30 Lunch
13:30-17:00 Training Session
Today’s Journey….
Today’s Guest Speakers
Fred Kagan: Director, Critical Threats Project, American Enterprise Institute
Kimberly Kagan: President, Institute for the Study of War
David Mesa: Chief Knowledge Architect for NASA
“Life can only be understood backwards; but it must be lived forwards.”
-Søren Kierkegaard
Yellowstone National Park Ecosystem
Known Influences Entered One-at-a-Time
(Willow)-[:HABITAT_FOR]->(Lincoln’s Sparrow)
(Aspen)-[:FOOD_FOR]->(Beaver)
(Beaver Ponds)-[:HABITAT_FOR]->(Beaver)
(Deer)-[:BROWSE_ON]->(Cottonwood)
(Berry Shrubs)-[:FOOD_FOR]->(Bears)
…
Yellowstone National Park Ecosystem
Known Influences Revealed as a Graph
Yellowstone National Park Ecosystem
Query for Trophic Cascades
MATCH path = (:Animal {Entity:"Wolves"})-[*]->(:Landscape {Entity:"Rivers"})
WITH extract(node IN nodes(path) | node.Yellowstone) AS factor, rand() AS number
RETURN factor AS How_Wolves_Affect_RiverStability
ORDER BY number
LIMIT 5
Conclusion:
Takeaways from this Session:
1. Where do graph databases fit into the overall data landscape?
2. What is a graph database & when is it useful?
3. Be inspired to find your next graph in government
Source: http://dataconomy.com/2014/06/understanding-big-data-ecosystem/
Big Data Landscape
“Big Data Landscape 3.0”
All You Really Need to Know (at least for today)
Other NoSQL and Relational DBMS: discrete data, minimally connected data
Neo4j Graph DB: connected data, focused on data relationships
Use of Graphs has created some of the most successful companies in the world
“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions after the design of data capture.”
By the end of 2018, 70% of leading organizations will have one or more pilot or proof-of-concept efforts underway utilizing graph databases.
Analyst Perspective
“Forrester estimates that over 25% of enterprises will be using graph databases by 2017”
IT Market Clock for Database Management Systems, 2014
https://www.gartner.com/doc/2852717/it-market-clock-database-management
TechRadar™: Enterprise DBMS, Q1 2014
http://www.forrester.com/TechRadar+Enterprise+DBMS+Q1+2014/fulltext/-/E-RES106801
Making Big Data Normal with Graph Analysis for the Masses, 2015
http://www.gartner.com/document/3100219
Source: https://www.forrester.com/report/Market+Overview+Graph+Databases/-/E-RES121473
2016: A Year in Graphs
Empowering Journalists To Make Sense of Data
Taking Mankind to Mars
Helping Cure Cancer
SOFTWARE
FINANCIAL SERVICES
RETAIL
MEDIA & OTHER
SOCIAL NETWORKS
TELECOM
HEALTHCARE
Real-Time Recommendations
Dynamic Pricing
Artificial Intelligence & IoT Applications
Fraud Detection
Network Management
Customer Engagement
Supply Chain Efficiency
Identity and Access Management
Graphs in Government
1. Security
2. Resource Management
3. Oversight
4. Science & Education
5. Planning
Some Perspective
We are still here
Journeying to here
THE PROPERTY GRAPH DATA MODEL
A way of representing data

Relational Database
Good for:
• Well-understood data structures that don’t change too frequently
• Known problems involving discrete parts of the data, or minimal connectivity

Graph Database
Good for:
• Dynamic systems: where the data topology is difficult to predict
• Dynamic requirements: those that evolve with the business
• Problems where the relationships in data contribute meaning & value
#1 Benefit: Project Agility
The Whiteboard Model Is the Physical Model
A unified view for ultimate agility
• Easily understood
• Easily evolved
• Easy collaboration between business and IT
#2 Benefit: “Minutes to Milliseconds” Real-Time Query Performance
[Chart: Response Time versus Connectedness and Size of Data Set]
Relational and Other NoSQL Databases: 0 to 2 hops, 0 to 3 degrees, thousands of connections
Neo4j: tens to hundreds of hops, thousands of degrees, billions of connections
1000x Advantage: “Minutes to milliseconds”
#3 Benefit: “Minutes to Milliseconds” Real-Time Query Performance
“We found Neo4j to be literally thousands of times faster than our prior MySQL solution, with queries that require 10-100 times less code. Today, Neo4j provides eBay with functionality that was previously impossible.”
- Volker Pacher, Senior Developer
“Minutes to milliseconds” performance: queries up to 1000x faster than RDBMS or other NoSQL
Where’s the Magic?
Magic Ingredient #1 of 3: Graph Optimized Memory & Storage
At Write Time: data is connected as it is stored
At Read Time: lightning-fast retrieval of data and relationships via pointer chasing
Index-free adjacency
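The idea behind index-free adjacency can be illustrated with a minimal Python sketch. This is not Neo4j's storage engine; the `Node` class, its `connect` and `hop` methods, and the sample data are hypothetical. Each node keeps direct references to its neighbours, so following a relationship is a pointer dereference rather than a lookup in a global index.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node holding direct references to its neighbours (hypothetical model)."""
    name: str
    out: dict = field(default_factory=dict)  # relationship type -> list of Node

    def connect(self, rel_type: str, other: "Node") -> None:
        # At write time: the relationship is stored as a direct reference.
        self.out.setdefault(rel_type, []).append(other)

    def hop(self, rel_type: str) -> list:
        # At read time: follow the stored references; no global index is consulted.
        return self.out.get(rel_type, [])


dan, ann = Node("Dan"), Node("Ann")
dan.connect("MARRIED_TO", ann)
spouses = [n.name for n in dan.hop("MARRIED_TO")]
print(spouses)
```

The cost of a hop here depends only on how many relationships the current node has, not on the total size of the graph, which is the property the slide attributes to graph-native storage.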
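Magic Ingredient #2 of 3: A Productive and Powerful Graph Query Language
MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse)
[Diagram: (Dan)-[:MARRIED_TO]->(Ann)]
Callouts: NODE: (:Person {name: "Dan"}) and (spouse); LABEL: Person; PROPERTY: name: "Dan"; RELATIONSHIP TYPE: MARRIED_TO; VARIABLE: spouse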
Example HR Query in SQL / The Same Query using Cypher
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate, count(report) AS Total
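To make the variable-length MANAGES pattern concrete, here is a rough Python approximation of what the Cypher query computes. It is a sketch under assumptions: the org chart is invented (only "John Doe" comes from the slide), the helper names are hypothetical, and the chart is a tree, so counting people and counting paths give the same totals.

```python
from collections import deque

# Hypothetical org chart: manager -> direct reports (one MANAGES relationship each).
MANAGES = {
    "John Doe": ["Ana", "Raj"],
    "Ana": ["Lee", "Kim"],
    "Raj": ["Sam"],
    "Lee": ["Pat"],
}


def within_hops(start, min_hops, max_hops):
    """Return people reachable from start in min_hops..max_hops MANAGES steps."""
    found, queue = [], deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth >= min_hops:
            found.append(person)
        if depth < max_hops:
            queue.extend((r, depth + 1) for r in MANAGES.get(person, []))
    return found


def report_counts(boss):
    """For each sub within 0..3 hops of boss, count reports 1..3 hops below."""
    counts = {}
    for sub in within_hops(boss, 0, 3):
        total = len(within_hops(sub, 1, 3))
        if total:  # as in the Cypher pattern, subs with no reports drop out
            counts[sub] = total
    return counts


totals = report_counts("John Doe")
print(totals)
```

The two traversal helpers and the nested loop stand in for what Cypher expresses in a single pattern, which is the conciseness argument the slide is making.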
Magic Ingredient #2 of 3: A Productive and Powerful Graph Query Language
Project Impact
Less time writing queries:
• More time understanding the answers
• Leaving time to ask the next question
Less time debugging queries:
• More time writing the next piece of code
• Improved quality of overall code base
Code that’s easier to read:
• Faster ramp-up for new project members
• Improved maintainability & troubleshooting
Magic Ingredient #3 of 3: ACID Graph Writes
Graph Transactions Over ACID Consistency: Maintains Integrity Over Time
Graph Transactions Over Non-ACID DBMSs: Becomes Corrupt Over Time
GRAPHS IN GOVERNMENT
Graphs in Government
1. Security: Law Enforcement, Cyber Security, Intelligence
2. Resource Management
3. Oversight
4. Science & Education
5. Planning
“Don’t consider traditional technology adequate to keep up with criminal trends”
Market Guide for Online Fraud Detection, April 27, 2015
Law Enforcement
Use Case: Information and Data Synchronization in Law Enforcement
Law enforcement agencies use Neo4j to model their information as graphs, improving efficiency and making direct and implicit patterns readily apparent in real time.
A suspect often appears in several different databases.
[Diagram: a SUSPECT node linked by Has, Owns, Registered, and Appears_in relationships to records held in Agency Records, Public Records, and Traffic Records: Financial Records, Convictions, Addresses, Vehicles, Traffic Cameras, Arrests, Police Reports]
The Graphs In Government
Law Enforcement
Use Case: Modeling Graphs in Investigations
Neo4j is used by law enforcement to track all parts of criminal investigations, including witnesses, suspects, forensic evidence, and locations, all related directly and indirectly.
[Diagram callout: Bystander investigated due to deep connection found]
Fraud Detection With Discrete Analysis
[Chart: Revolving Debt and Number of Accounts; points outside Normal behavior are marked INVESTIGATE]
Fraud Detection With Connected Analysis
[Chart: Revolving Debt and Number of Accounts; Normal behavior and a Fraudulent pattern are marked]
[Diagram: ACCOUNT HOLDER 1, ACCOUNT HOLDER 2, and ACCOUNT HOLDER 3 connected to CREDIT CARD, BANK ACCOUNT, and UNSECURED LOAN nodes and to shared ADDRESS, PHONE NUMBER, and SSN nodes]
Law Enforcement
Use Case: Modeling Fraud Rings as Graphs
Organizing a fraud ring in the real world is relatively simple. A group of people share their personal information to create synthetic identities. For example, just 2 individuals sharing names and social security numbers can create 4 different identities. This can be discovered with connected analysis.
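The arithmetic and the connected analysis described above can be sketched in a few lines of Python. Everything in the data is invented for illustration (names, SSN placeholders, phone numbers, addresses), and `find_rings` is a hypothetical helper, not a Neo4j feature. It groups account holders that are linked, directly or indirectly, through shared identifiers.

```python
from itertools import product
from collections import defaultdict

# Two people pooling names and SSNs (all values are made up for illustration).
names = ["Alice Smith", "Bob Jones"]
ssns = ["SSN-1", "SSN-2"]
identities = list(product(names, ssns))  # 2 names x 2 SSNs = 4 combinations

# Hypothetical applications: holder -> identifiers used on the application.
applications = {
    "ACCOUNT HOLDER 1": {"ssn:SSN-2", "phone:555-0100"},
    "ACCOUNT HOLDER 2": {"ssn:SSN-2", "addr:12 Elm St"},
    "ACCOUNT HOLDER 3": {"addr:12 Elm St", "phone:555-0199"},
    "ACCOUNT HOLDER 4": {"ssn:SSN-9", "phone:555-0142"},
}


def find_rings(apps):
    """Group holders linked, directly or indirectly, by shared identifiers."""
    by_identifier = defaultdict(set)
    for holder, idents in apps.items():
        for ident in idents:
            by_identifier[ident].add(holder)

    seen, rings = set(), []
    for holder in apps:
        if holder in seen:
            continue
        ring, stack = set(), [holder]
        while stack:  # depth-first walk over holder -> identifier -> holder links
            current = stack.pop()
            if current in ring:
                continue
            ring.add(current)
            for ident in apps[current]:
                stack.extend(by_identifier[ident] - ring)
        seen |= ring
        if len(ring) > 1:  # a lone holder sharing nothing is not a ring
            rings.append(ring)
    return rings


rings = find_rings(applications)
print(len(identities), rings)
```

Holders 1 and 3 share no identifier with each other; they land in the same ring only through holder 2, which is the kind of indirect link that record-by-record (discrete) checks tend to miss.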
Cyber Security
Attack Analysis
Source: http://neo4j.com/graphgist/40caddf1d7537bce962e/
Source: https://linkurio.us/graph-data-visualisation-cyber-security-threats-analysis/
UDP storm attack
Domain Model
Connected Domains
Graphs in Government
1. Security
2. Resource Management: Asset & Inventory Management, Supply Chain, Network & IT Operations
3. Oversight
4. Science & Education
5. Planning
Network & IT Operations
• Impact & Dependency Analysis
• Root Cause Analysis
• Network Design
• Network Security Analysis
• Network Asset Management (CMDB)
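Impact and dependency analysis, the first item in the list above, amounts to a reachability question on a dependency graph. The following Python sketch assumes a small invented topology; the component names and the `impacted_by` helper are hypothetical. It walks the dependency edges in reverse to find everything affected when one component fails.

```python
from collections import deque

# Hypothetical dependency graph: component -> components it depends on.
DEPENDS_ON = {
    "web-portal": ["app-server"],
    "app-server": ["database", "auth-service"],
    "auth-service": ["database"],
    "database": ["san-storage"],
    "reporting": ["database"],
}


def impacted_by(failed):
    """Return every component that depends, directly or transitively, on failed."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(component)

    impacted, queue = set(), deque([failed])
    while queue:
        current = queue.popleft()
        for upstream in dependents.get(current, []):
            if upstream not in impacted:
                impacted.add(upstream)
                queue.append(upstream)
    return impacted


impact = impacted_by("database")
print(sorted(impact))
```

Root cause analysis runs the same walk in the other direction: start from the component showing symptoms and follow `DEPENDS_ON` downward to the candidates that could explain them.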
Supply Chain Example from Industry
Graphs in Government
1. Security
2. Resource Management
3. Oversight: Anti-Money Laundering/Fraud, Risk, Ownership
4. Science & Education
5. Planning
Asset Graph: Financial Ownership
• Impact & Dependency Analysis
• Risk Assessment
• Compliance Enforcement
Anti-Money Laundering
Use Case: Combating Money Laundering With Graphs
Neo4j is used to combat advanced money laundering schemes. Money laundering is all about how funds travel across a network of parties. Without graph analysis capabilities, some of these patterns can be impossible to detect.
[Diagram: Deposit, then funds washed in a complex series of transfers, then Withdraw]
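The deposit-to-withdrawal pattern can be framed as a path search over a transfer graph. The sketch below is illustrative only: the account names, the transfers, and the `transfer_chains` helper are all hypothetical, and a real investigation would also weigh amounts and timestamps. It enumerates every chain of transfers that connects a deposit account to a withdrawal account within a hop limit.

```python
# Hypothetical transfers: source account -> list of destination accounts.
TRANSFERS = {
    "deposit-acct": ["shell-A", "shell-B"],
    "shell-A": ["shell-C"],
    "shell-B": ["shell-C", "offshore-1"],
    "shell-C": ["withdraw-acct"],
    "offshore-1": ["withdraw-acct"],
}


def transfer_chains(source, target, max_hops=5):
    """List every simple transfer chain from source to target within max_hops."""
    chains = []

    def walk(account, path):
        if account == target:
            chains.append(path)
            return
        if len(path) - 1 >= max_hops:  # path holds hops + 1 accounts
            return
        for nxt in TRANSFERS.get(account, []):
            if nxt not in path:  # avoid revisiting accounts in a cycle
                walk(nxt, path + [nxt])

    walk(source, [source])
    return chains


chains = transfer_chains("deposit-acct", "withdraw-acct")
for chain in chains:
    print(" -> ".join(chain))
```

No single transfer in this data links the deposit account to the withdrawal account; the connection only appears when the hops are followed end to end.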
The Cali Cartel Money Laundering Scheme
Money Laundering
Source: http://neo4j.com/blog/analyzing-panama-papers-neo4j/
Case Study: “The Panama Papers”
• The International Consortium of Investigative Journalists (ICIJ) exposed highly connected networks of offshore tax structures used by the world’s richest elites.
• With 11.5 million documents, it’s the largest financial leak of all time.
• The connections uncovered in “The Panama Papers” were a major news story in 2016.
Money Laundering
Graphs in Government
1. Security
2. Resource Management
3. Oversight
4. Science & Education: Research, Exploration, Environment
5. Planning
Coming soon…
Patents, Papers, and Legislation
Graph-Based Search
1) Patentula Search Demo: https://www.youtube.com/watch?v=GpHSO5j8nvQ
2) Visualizing and searching relationships between academic papers using Neo4j Graph database: http://dspace.thapar.edu:8080/jspui/bitstream/10266/4014/1/801432008_Karan_cse_16-final-.pdf
Graphs in Government
1. Security
2. Resource Management
3. Oversight
4. Science & Education
5. Planning: Lessons Learned, Dependency Management, Consolidation
[Diagram: Product, CRM, Payment, Marketing, and Logistics data, each held in a separate RDBMS]
Predicting WWI [Easley and Kleinberg]
THANK YOU!