Silicon Valley NOSQL Meetup, April 2012
DESCRIPTION
Join Objectivity, Inc.’s VP of Product Management, Brian Clark, for a discussion of the latest trends in Big Data analytics: defining what Big Data is and understanding how to get more from your existing architectures by using NOSQL technologies to improve functionality and provide real-time results. There will be a focus on relationship analytics, as well as an introduction to NOSQL data stores and to object and graph databases, including the architecture behind Objectivity/DB and InfiniteGraph.
TRANSCRIPT
Maximize your Data with Real-time Big Data Analytics using NOSQL Technologies.
Silicon Valley NOSQL Meetup Group
Thursday, April 26, 2012 – Brian Clark
04/10/2023 © Objectivity Inc 2012 1
Agenda
• About me!
• Objectivity, Inc.
• NOSQL
• Big Data
• Use Cases
• InfiniteGraph and Objectivity/DB Overview
• Demo
• Q & A
School - The 3 R’s
• Reading
• wRiting
• aRithmetic
• I knew I was in trouble!
University - The 3 B’s
• Bands (Friday night Hop)
• Booze
• Birds
• I knew I was in trouble!
• = a job as a mainframe computer operator
A Brief History of Computing
A Brief History of Databases
• 1960’s – Hierarchical Model: physical pointers
• 1960’s – Network Model: many-to-many relationships, but still too rigid
• 1970’s – Relational Model: physical independence, SQL
• 1990’s – Object-Oriented: performance with complexity and scalability
Objectivity, Inc.
• The world today is about big data, distributed objects and connections between them.
• Objectivity/DB™ Distributed big data and object management.
• InfiniteGraph™ Connects the dots on a global scale.
NOSQL
InfiniteGraph in the “NOSQL” Market
The Right Tool for the Right Job (1 of 2)
First, a truism:
• The closer the data model matches the data store structure, the faster queries can be executed, the higher the scalability, and the easier it is to write applications.
• One size doesn’t fit all, and multiple tools might join forces to fully solve a problem.
Relational Databases
• Data represented by rows (records) and columns (attributes); a schema defines the columns and their distribution amongst tables.
• Versatile, can solve most data storage and access problems; can solve all if scale is limited.
• Good for producing lists of data based on a value in that data, such as a list of customers with unfilled orders.
Hadoop/MapReduce
• General purpose parallel processing and storing facility for massive amounts of data.
• Data store is a file system, not a database.
• Good for problems that can be broken into many small parts and processed independently, and done so offline, such as the ETL (extract, transform, load) process for preparing and moving captured data into a data warehouse.
Object Databases
• Data represented by objects, which are groups of attributes; schema defines the attributes, which may include pointers (relationships) to other objects.
• Ability to store and retrieve whole objects makes access to a set of data very fast; tighter connection to the object-oriented programming application reduces complexity.
• Good for accessing massive amounts of data about related items, such as a user’s account history.
The Right Tool for the Right Job (2 of 2)
Key-Value Databases
• Rows and columns like a relational database, but only 2 columns, making it an indexing system (find a value based on the key).
• No schema required, so the value could be anything, such as an object or a pointer to data in another data store.
• Very fast for indexing, such as looking up a user’s shopping cart on an ecommerce site.
Column Family Databases
• Rows and columns like a relational database, but storage on disk is organized so as to make attributes (columns) highly accessible without accessing the whole of the associated record (row).
• Results in very fast actions regarding attributes, such as calculating average age.
Document Databases
• Similar to object database, but without the need to predefine an object’s attributes (i.e., no schema required).
• Provides flexibility to store new types or unanticipated sizes of data/objects during operation, on the fly, such as event logging where the data format is unpredictable and not just simple text (e.g., video).
Graph Databases
• Similar to object database, but the objects and relationships between them are all objects with their own respective sets of attributes.
• Enables very fast queries when the value of the data is in the relationships, i.e. relationships between people/items:
– Are two people/items related (even if separated by several levels of relationship)?
– Where the relationships represent costs, what is the optimal combination of groups of people/items?
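To make the first graph question concrete ("are two people/items related, even if separated by several levels of relationship?"), here is a minimal sketch in plain Python. It is not InfiniteGraph or Objectivity/DB code; the function name `degrees_of_separation`, the `knows` sample data, and the default depth limit are all assumptions made for illustration. It uses a breadth-first search over an in-memory adjacency map.

```python
from collections import deque


def degrees_of_separation(graph, start, goal, max_depth=6):
    """Return the number of hops from start to goal, or None if the two
    are not connected within max_depth hops (breadth-first search)."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed number of levels
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return depth + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return None


# Hypothetical undirected "knows" relationships stored as adjacency sets.
knows = {
    "Alice": {"Bob"},
    "Bob": {"Alice", "Carlos", "Charlie"},
    "Carlos": {"Bob", "Charlie"},
    "Charlie": {"Bob", "Carlos"},
    "Dana": set(),
}

print(degrees_of_separation(knows, "Alice", "Charlie"))
print(degrees_of_separation(knows, "Alice", "Dana"))
```

In a relational store the same question typically needs one self-join per level of separation, which is the contrast this slide is drawing.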
Big Data
Big Data
• Volume
• Velocity
• Variety
= VALUE!
Requires new ways of thinking – distributed data and processing
Parallel Processing and Storage
Apache HADOOP
• Map/Reduce – Distributed processing.
• HDFS – Distributed file system.
• HBase – Distributed storage for large tables.
• Cassandra – Multi-master database with no single point of failure.
InfiniteGraph
• Distributed processing – Peer-to-peer servers and clients anywhere in the network.
• Distributed data – Federation of databases anywhere in the network.
• Standard filesystem – Random I/O for fast navigational queries.
• Single logical view of all data in the federation – Any client anywhere can access any server anywhere.
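As a rough illustration of the Map/Reduce style of distributed processing listed above, the sketch below runs the map, shuffle, and reduce phases in a single Python process. It does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would spread the map and reduce calls across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (key, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    # Each record is mapped independently, which is what lets a cluster
    # process the records in parallel.
    mapped = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


print(word_count(["big data", "Big graph"]))
```

The point of the pattern is that no map call depends on any other, which is also why it suits offline batch work better than navigational queries where each step depends on the last.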
Common Big Data Architecture
[Diagram labels: Data Aggregation & Application Analytics; data stores: RDBMS, GraphDB, DocumentDB, ObjectDB, Hadoop, BigTable, Key-Value Stores, Column Stores, Data Warehouse; data types: Structured, Semi-structured, Un-structured; platform: Commodity Linux Clusters or High Performance Compute platforms]
Common Big Data Architecture
[Diagram labels: Visualization and Analytics tools; Hadoop; RDBMS; Other stores; Front End Processing; Raw Data]
The strategic competitors are all moving in this direction for Big Data.
Observe, Orient, Decide, Act
Big Data Analytics Solutions
• EMC: Data Analytics Applications; Greenplum Hadoop; Greenplum; Greenplum Data Integration Accelerator; Raw Data
• IBM: Infosphere BigInsights; IBM Hadoop; DB2; Infosphere Warehouse; Front End Processing; Raw Data
• Oracle: Oracle In-Database Analytics; Cloudera Hadoop; Oracle 11g; Oracle NoSQL; Oracle Data Integrator; Raw Data
• HP: Autonomy; Vertica Database; Front End Processing; Raw Data
Big Data Landscape
• All current solutions have the same basic architecture model.
• None of the current solutions have a way to store connections between entities in the different silos.
– Analytics today focuses on the nodes of data (quantifiable occurrences) rather than the relevant connections or edges between the nodes (qualitative occurrences).
• Objectivity has a proven way to efficiently store, manage and query the relationships and connections between data.
Disruptive Big Data New Architecture
[Diagram labels: Visualization and Analytics tools; Hadoop; RDBMS; Other stores; Front End Processing; Raw Data; The Proven Connection Store (Objectivity/DB and/or InfiniteGraph)]
[Legend: one symbol represents data nodes; the other represents bidirectional relationships/connections between data.]
Why We’re Different
• Relational databases are not optimized to understand objects or connections.
• Objectivity/DB™ is all about objects and relationships.
• InfiniteGraph™ is all about the connections as first class citizens.
Use Cases & Challenges
Relationships are everywhere
• CRM, Sales & Marketing
• Network Mgmt, Telecom
• Intelligence (Government & Business)
• Finance
• Healthcare
• Research: Genomics
• Social Networks
• Logistics
• Master Data Management
• PLM (Product Lifecycle Mgmt)
Financial Services
Fraud Detection
– Problem: Detect patterns of fraudulent activities before damage is done
– Solution: Real-time identification of inconsistencies enables instantaneous notification to security systems
– Results:
• Improved banking security and client confidence
• Reduction of lost revenues
• Improved efficiency allows fraud-detection teams to develop and deploy additional services
Application Development
The “Facebook” For Education
– Problem: Develop a system capable of handling exponential user-base growth
– Solution: Leverage InfiniteGraph’s scalability and performance to support real-time relationship information between all members and to act as primary DB for all topics and users
– Results: Complete social networking site allowing global users to access courses from leading institutions & to collaborate effectively with other students and teachers
Use Case – Confidential Ad Placement Network
• Ad placement on smart phone based on user profile and location data generated by opt-in application (e.g., a free game).
• Location data captured and distilled by Cassandra (key-value/column family hybrid database).
• Locations matched with geospatial data to refine user interests.
• As ad placement orders arrive, InfiniteGraph matches groups of users with ads, maximizing relevance for the user, value for the advertiser and revenue for the ad placement company.
Government
Broad Area Maritime Surveillance UAS
– Problem: Monitor potential threats across open oceans and remote areas on a 24/7 basis
– Solution: Use Objectivity/DB to develop a system for unmanned aircraft to capture and transmit real-time data of any type for analysis and sharing
– Results: A federated view of maritime surveillance and continuous reconnaissance capability for mission, reconnaissance, and communications assessments
Healthcare
Bring together doctors, patients, and their records
– Problem: As patients move between doctors, manage their records globally to better capture and understand symptoms, causes, and interdependencies and to improve diagnoses
– Solution: Create a database using Objectivity/DB and InfiniteGraph capable of managing real-time entries of patient visits, symptoms, diagnoses, reactions to medications, and progress
– Results:
• Improved times to more accurate diagnoses
• Creation of a knowledge base of similar medical cases
• Increased success rates of initial prescriptions based on historical recommendations
Network Centric Collaborative Targeting
• Team: Objectivity, L-3, and Lockheed
• U.S. Air Force’s Network Centric Collaborative Targeting (NCCT)
• U.S. Navy’s Cooperative Engagement Capability (CEC) system
NCCT - Customer Challenge
• Time-sensitive targets were hard to find
• Sensors operated as independent systems
• The performance of each individual sensor is very good (great ears and eyes), but collectively they lack a central nervous system
• Mountains of data are coming from sensors
• Existing sensors alone cannot reliably find highly mobile, moving and/or spoofing targets
• Silo’d systems with individual reports did not provide solutions
NCCT - Technical Solution Architecture
Company Confidential
1. Build a distributed system that could support multi-agency platform requirements
2. Collect data from any number of high volume sources
3. Provide a data architecture that supported the need to correlate and fuse data collection for a single view of the targets
4. Support a near real-time data reporting C4ISR system
Intelligence - Customer Need
• Collect 400,000,000 phone calls, plus addresses, emails, meetings…
• Find the links between callers
• Deliver all the possible connections between them in seconds
Intelligence Problem - Performance
With a relational product:
• Initial attempts to traverse links across the database literally shut down the server.
• After much server and database optimization, a process could be run on a single query and would produce a result over a 48-hour period.
• Results were unacceptable.
With Objectivity:
• The many-to-many data application was an excellent fit for Objectivity.
• We then developed a proof of concept that showed 5-6 degrees of separation within about 1 minute, running on a laptop computer.
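The kind of query behind this use case, "all the possible connections" between two callers within a bounded number of hops, can be sketched as a depth-limited path enumeration. This is a toy, in-memory illustration in plain Python, not the proof of concept described above and not Objectivity's API; `all_connections` and the sample call records are assumptions.

```python
def all_connections(calls, source, target, max_hops):
    """Enumerate every simple path from source to target that uses at most
    max_hops call links. Calls are treated as undirected links."""
    adjacency = {}
    for caller, callee in calls:
        adjacency.setdefault(caller, set()).add(callee)
        adjacency.setdefault(callee, set()).add(caller)

    paths = []

    def walk(path):
        node = path[-1]
        if node == target:
            paths.append(list(path))
            return
        if len(path) - 1 == max_hops:
            return  # hop budget used up without reaching the target
        for neighbor in sorted(adjacency.get(node, ())):
            if neighbor not in path:  # keep paths simple (no revisits)
                path.append(neighbor)
                walk(path)
                path.pop()

    walk([source])
    return paths


# Hypothetical call records: (caller, callee).
calls = [
    ("Alice", "Bob"),
    ("Bob", "Carlos"),
    ("Bob", "Charlie"),
    ("Carlos", "Charlie"),
]

for path in all_connections(calls, "Alice", "Charlie", max_hops=3):
    print(" -> ".join(path))
```

Each step of the walk depends on the previous one, which is why this style of query maps naturally onto stored relationships and poorly onto repeated relational joins.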
InfiniteGraph & Objectivity/DB Technical Overview
What is a graph database?
• Optimized around data relationships
– Relationships as first class citizens
– Super fast traversal between entities
– Rich/flexible annotation of connections
• Small focused API (typically not SQL)
– Natively work with concepts of Vertex/Edge
– SQL has no concept of “navigation”
• Graphs grow quickly, e.g.
– Billions of phone calls / day in US
– Emails, social media events, IP traffic
– Financial transactions
• Some analytics require navigation of large sections of the graph
• Each step (often) depends on the last
• Must distribute data and go parallel
Database Data Representation
• Traditional databases are good at recording things, not events or relationships
Rows/Columns/Tables

Meetings
P1      P2       Place    Time
Alice   Bob      Denver   5-27-10

Calls
From    To       Time     Duration
Bob     Carlos   13:20    25
Bob     Charlie  17:10    15

Payments
From    To       Date     Amount
Carlos  Charlie  5-12-10  100000

Relationship/Graph Optimized
Alice --Met (5-27-10)-- Bob
Bob --Called (13:20)--> Carlos
Bob --Called (17:10)--> Charlie
Carlos --Paid (100000)--> Charlie
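The same Meetings, Calls, and Payments records can be held either as rows or as labeled edges. The sketch below, in plain Python, turns the slide's sample rows into edges and then asks a relationship question ("who is Bob connected to, and how?"). The dictionary keys and the helper names `to_edges` and `neighbors` are assumptions chosen for the example, not part of any product API.

```python
# Row-oriented view: one list of records per table, using the slide's sample data.
meetings = [{"p1": "Alice", "p2": "Bob", "place": "Denver", "time": "5-27-10"}]
calls = [
    {"from": "Bob", "to": "Carlos", "time": "13:20", "duration": 25},
    {"from": "Bob", "to": "Charlie", "time": "17:10", "duration": 15},
]
payments = [{"from": "Carlos", "to": "Charlie", "date": "5-12-10", "amount": 100000}]


def to_edges(meetings, calls, payments):
    """Convert the three tables into (source, label, destination, attributes) edges."""
    edges = []
    for m in meetings:
        edges.append((m["p1"], "Met", m["p2"], {"place": m["place"], "time": m["time"]}))
    for c in calls:
        edges.append((c["from"], "Called", c["to"], {"time": c["time"], "duration": c["duration"]}))
    for p in payments:
        edges.append((p["from"], "Paid", p["to"], {"date": p["date"], "amount": p["amount"]}))
    return edges


def neighbors(edges, person):
    """Return (label, other person) for every edge touching person, in either direction."""
    result = []
    for src, label, dst, _attrs in edges:
        if src == person:
            result.append((label, dst))
        elif dst == person:
            result.append((label, src))
    return result


edges = to_edges(meetings, calls, payments)
print(neighbors(edges, "Bob"))
```

With the row layout, answering the same question means scanning three tables and checking two columns in each; with edges it is a single pass over one structure, and a graph store would index that lookup by vertex.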
Viewing the Data
The InfiniteGraph Visualizer will need this name to display the contents of the graph database.
InfiniteGraph™
• Connects the dots on a global scale.
• InfiniteGraph™ finds connections in big data.
Find Answers Faster with InfiniteGraph™ Distributed Graph Database
• Supports large scale and distributed systems.
• Proven technology and deployments.
• Flexible and easy: distributed and cloud ready, Java on interoperable platforms, integrates with most other data stores, supports ACID to flexible modes.
InfiniteGraph’s Unique Advantages
InfiniteGraph Basic Architecture
[Diagram labels: User Apps; Blueprints; InfiniteGraph - Core/API (Configuration, Navigation, Execution, Management, Extensions); Session / TX Management; Placement; Distributed Object and Relationship Persistence Layer]
InfiniteGraph Features
• Distributed parallel ingest.
• Flexible distributed storage management.
• Node naming and indexing for fast lookup.
• User controlled navigational queries – using node and edge filters.
• Navigator plug-in architecture for sharing plug-ins with the visualizer.
• InfiniteGraph Visualizer.
• Gremlin support via Blueprints.
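The idea of a user-controlled navigational query with node and edge filters can be sketched generically as a traversal that takes two predicates. The code below is plain Python and deliberately does not imitate the InfiniteGraph API; `navigate`, the edge dictionary layout, and the sample data are hypothetical.

```python
from collections import deque


def navigate(edges, start, edge_filter, node_filter, max_depth):
    """Breadth-first navigation from start that follows only edges accepted
    by edge_filter and reports only vertices accepted by node_filter."""
    outgoing = {}
    for edge in edges:
        outgoing.setdefault(edge["src"], []).append(edge)

    visited = {start}
    results = []
    queue = deque([(start, 0)])
    while queue:
        vertex, depth = queue.popleft()
        if depth > 0 and node_filter(vertex):
            results.append(vertex)
        if depth == max_depth:
            continue
        for edge in outgoing.get(vertex, []):
            if edge_filter(edge) and edge["dst"] not in visited:
                visited.add(edge["dst"])
                queue.append((edge["dst"], depth + 1))
    return results


# Hypothetical directed edges with per-edge attributes.
edges = [
    {"src": "Bob", "dst": "Carlos", "kind": "Called", "duration": 25},
    {"src": "Bob", "dst": "Charlie", "kind": "Called", "duration": 15},
    {"src": "Carlos", "dst": "Charlie", "kind": "Paid", "amount": 100000},
]

# Follow only calls of at least 20 minutes; accept every vertex reached.
long_calls = navigate(
    edges,
    "Bob",
    edge_filter=lambda e: e["kind"] == "Called" and e["duration"] >= 20,
    node_filter=lambda v: True,
    max_depth=3,
)
print(long_calls)
```

Keeping the filters as separate callables means the traversal logic stays fixed while each query supplies its own rules, which is one plausible reading of what a navigator plug-in does.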
Objectivity/DB Basic Architecture
[Diagram labels: User Application; C++ Public API; Java API; Python API; C#/.NET; ULB; Objy Kernel; I/O Manager; Lock Server; Page Server (AMS); Query Server]
Distributed Data / Processing
Distributed Federated Persistent Store
• Federated Data Management: Single Logical View. All clients and servers see all data.
• Distributed Data Management: Scale Out
[Diagram labels: Network; SAN]
Distributed Data Architecture
64 Bit OID (Object ID)
Example OID: #21538-1874-9638-164 (Database - Container - Page - Slot)
Storage hierarchy: Federation (schema & catalog), Database, Container, Page, Slot
• 1,000’s trillions of unique objects
• 1,000’s petabytes storage
• Logical/physical indirection at every segment
• Resolving ID fast regardless of number of objects
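One way to picture a four-part identifier fitting in 64 bits is to give each part 16 bits. That split, and the Database, Container, Page, Slot field order, are assumptions made for this sketch (the slide states only that the OID is 64 bits and shows four components); the helper names are hypothetical and this is not Objectivity/DB code.

```python
FIELD_BITS = 16  # assumption: four equal 16-bit fields make up the 64-bit OID
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_oid(database, container, page, slot):
    """Pack four field values into a single 64-bit integer."""
    for value in (database, container, page, slot):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError(f"field value {value} does not fit in {FIELD_BITS} bits")
    return (database << 48) | (container << 32) | (page << 16) | slot


def unpack_oid(oid):
    """Split a packed OID back into (database, container, page, slot)."""
    return (
        (oid >> 48) & FIELD_MASK,
        (oid >> 32) & FIELD_MASK,
        (oid >> 16) & FIELD_MASK,
        oid & FIELD_MASK,
    )


def format_oid(oid):
    """Render a packed OID in the '#db-container-page-slot' style shown on the slide."""
    return "#" + "-".join(str(part) for part in unpack_oid(oid))


oid = pack_oid(21538, 1874, 9638, 164)
print(format_oid(oid))
```

Because each field is read with a shift and a mask, resolving an identifier to its location costs the same no matter how many objects exist, which matches the slide's point about fast ID resolution.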
[Diagram labels: Database; Container (repeated); 64K]
Distributed Processing Architecture
[Diagram labels: Client (Application, Objectivity/DB, Cache); Simple, Distributed Servers (Lock Servers, Data Servers, Query Agents)]
Put the data and processing where it’s needed
Flexibility – language interoperability
[Diagram labels: objects A, B, C, D, E, F; Java App, C++ App, C# App, Python App, each running on Objectivity/DB]
Flexibility – heterogeneous platforms
[Diagram labels: Linux; Unix (Sun, HP); Wintel; Mac OS X; Network Storage]
InfiniteGraph™ - Link Hunter demonstration
Comprehensive Online Resources
• InfiniteGraph.com (main site, content and messaging)
• Download InfiniteGraph
• Product Documentation
• InfiniteGraph Developer Wiki
• Google Group for Developers
• Our Blog
Company Snapshot
Corporate
• Established in 1988
• Headquartered in Sunnyvale, California
Products
• NOSQL platform for managing and discovering relationships between complex data
• Objectivity/DB™: Object-oriented data management system that manages localized, centralized or distributed databases
• InfiniteGraph™: New massively scalable graph database that enables organizations to find, store, and exploit the relationships hidden in their data
Customers
• Deeply embedded in nearly 90 enterprises and government organizations
• Competitive advantages in Big Data with strong IP and patent position
• Growing pipeline of near-term opportunities across expanding use cases
Financials & Ownership
• Generating increased revenues in last twelve months
• Profitable and cash flow positive; no debt
• Ownership: Privately held by employees and venture investors
Market Opportunity
• Big Data Market forecasted to be $11.6B in 2012, with CAGR of 28.0% over the next 5 years
• 40% per year data growth, cloud adoption, mobile usage and improved real-time, predictive analytics underpin Objectivity’s growth opportunities
• Strategically positioned as key Big Data enabler that pulls through servers, DBs and file stores
Brian Clark
VP Product Marketing, Objectivity Inc.
http://www.infinitegraph.com
http://www.objectivity.com