NoSQL Databases and Analytic Use Cases Aaron Cordova INFORMS

Upload: koverse-inc

Post on 11-May-2015

DESCRIPTION

Koverse CTO Aaron Cordova's (@aaroncordova) talk from the 2014 INFORMS conference - "The Business of Big Data"

TRANSCRIPT

Page 1: NoSQL Databases and Analytic Use Cases

NoSQL Databases and Analytic Use Cases

Aaron Cordova INFORMS

Page 2: NoSQL Databases and Analytic Use Cases

NoSQL

• Perhaps better is “Non-Relational”

• Departure from conventional relational db

• Trade traditional features for simplicity, scalability, flexibility

Page 3: NoSQL Databases and Analytic Use Cases

Types of NoSQL DBs

• Columnar: BigTable, HBase, Accumulo, Cassandra

• Graph: Neo4j, OrientDB

• Key-Value: Dynamo, Riak, Voldemort, BerkeleyDB

• Document: MongoDB, CouchDB, MarkLogic (XML)

Page 4: NoSQL Databases and Analytic Use Cases

Trades

• Give up: cross-row transactions, relational JOINs, type checking, SQL

• Gain: simplicity, scalability (distributed), schema flexibility, geographic distribution, programmatic APIs

Page 5: NoSQL Databases and Analytic Use Cases

NoSQL Distributed

Name   Age  Phone
Bob    43   555-1212
Jenny  32   555-1213
Sally  28   555-1214
Joe    45   555-1215

Up to Petabytes

Page 6: NoSQL Databases and Analytic Use Cases

Consistency

[Figure: three replicas of the Name / Age / Phone table (Bob, Jenny, Sally, Joe), shown under the labels "Multiple Data Centers" and "Single Data Center". Two replicas list Jenny's phone as 555-1213; one lists it as 867-5309. The figure also carries an "X" mark.]
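
The replica picture on this slide can be sketched as a toy model. This is only an illustration under stated assumptions, not how any of the databases named in this talk is implemented: the `Replica` class, its `put`/`get`/`merge` methods, the data center names, and the last-write-wins rule are all hypothetical and chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica holding key -> (timestamp, value). Hypothetical, for illustration."""
    name: str
    data: dict = field(default_factory=dict)

    def put(self, key, value, ts):
        # Accept the write locally without coordinating with other replicas.
        self.data[key] = (ts, value)

    def get(self, key):
        return self.data[key][1]

    def merge(self, other):
        # Synchronization step: keep the newer write per key (last-write-wins).
        for key, (ts, value) in other.data.items():
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)


replicas = [Replica("dc1"), Replica("dc2"), Replica("dc3")]
for r in replicas:
    r.put("Jenny.phone", "555-1213", ts=1)

# An update lands on one data center only.
replicas[1].put("Jenny.phone", "867-5309", ts=2)
before_sync = [r.get("Jenny.phone") for r in replicas]

# Later the replicas exchange state and converge on the newest value.
for r in replicas:
    for other in replicas:
        if other is not r:
            r.merge(other)
after_sync = [r.get("Jenny.phone") for r in replicas]
```

Between the update and the exchange, a reader can see either phone number depending on which replica answers. A highly consistent system would instead coordinate the write before acknowledging it.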

Page 7: NoSQL Databases and Analytic Use Cases

Consistency

• Geographically distributed, eventually consistent: Dynamo, Riak, Voldemort, Cassandra, MongoDB, CouchDB

• Single data center, highly consistent: BigTable, HBase, Accumulo, Cassandra, Neo4j, OrientDB, MongoDB, MarkLogic (XML)

Page 8: NoSQL Databases and Analytic Use Cases

Programmability

Objects -> SQL -> DB

VS

Objects -> DB

Page 9: NoSQL Databases and Analytic Use Cases

Programmability

Web client (JavaScript) <-> JSON <-> Node.js server (JavaScript) <-> JSON <-> MongoDB
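
The contrast on the two Programmability slides can be sketched with the Python standard library. This is a minimal illustration, not the stack shown on the slide: `sqlite3` stands in for a relational database, and `document_store` is a hypothetical in-memory dictionary standing in for a document database, not a real client.

```python
import json
import sqlite3

customer = {"name": "Bob", "age": 43, "phone": "555-1212"}

# Objects -> SQL -> DB: the object is flattened into a fixed table schema
# and rebuilt from a result row on the way back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, age INTEGER, phone TEXT)")
conn.execute(
    "INSERT INTO customers (name, age, phone) VALUES (?, ?, ?)",
    (customer["name"], customer["age"], customer["phone"]),
)
row = conn.execute("SELECT name, age, phone FROM customers").fetchone()
from_sql = dict(zip(("name", "age", "phone"), row))

# Objects -> DB: the serialized object is stored and returned as-is.
document_store = {}  # hypothetical stand-in for a document database
document_store["bob"] = json.dumps(customer)
from_docs = json.loads(document_store["bob"])
```

Both paths return the same object. The difference is that the first needs a table definition and a column mapping, while the second passes JSON through unchanged.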

Page 10: NoSQL Databases and Analytic Use Cases

Analytics

Page 11: NoSQL Databases and Analytic Use Cases

Analytics

[Diagram: Business Activity -> Operational DB (x3; updates, transactions) -> Analytical DB (denormalized, aggregations) -> Business Intelligence]

Page 12: NoSQL Databases and Analytic Use Cases

Analytics

[Diagram: Business Activity -> OLTP (x3) -> ETL -> OLAP -> Business Intelligence. Annotations: "Schema knowledge", "Joins happen here".]

Page 13: NoSQL Databases and Analytic Use Cases

Analytics

[Diagram: Business Activity -> OLTP (x3) -> ? -> NoSQL DB -> Business Intelligence]

Page 14: NoSQL Databases and Analytic Use Cases

NoSQL and Analytics

• Importing operational data can create a scale problem

• Combining operational data can create sparse data

• Operational schemas may change

Page 15: NoSQL Databases and Analytic Use Cases

NoSQL and Analytics

Scalability, Schema Flexibility

Page 16: NoSQL Databases and Analytic Use Cases

Full Outer Join

Cust.name  Cust.age  Orders.shoes  Facebook.likes  …
Bob        43        $50           -               …
Sarah      32        $25           5/5/14          …
Sally      28        -             4/3/12          …
-          -         $35           11/1/13         …
-          -         -             9/24/12         …
Joe        45        $45           -               …
…          …         …             …               …

Billions of rows

Thousands of columns

Sparse
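
A small sketch shows how combining operational sources produces a table like the one on this slide. The values come from the slide; the join keys (`c1` to `c6`) and the `full_outer_join` helper are assumptions made for the example, since the slide does not say what the sources are joined on.

```python
# Three operational sources keyed by a hypothetical customer ID.
sources = {
    "Cust": {
        "c1": {"name": "Bob", "age": 43},
        "c2": {"name": "Sarah", "age": 32},
        "c3": {"name": "Sally", "age": 28},
        "c6": {"name": "Joe", "age": 45},
    },
    "Orders": {
        "c1": {"shoes": "$50"},
        "c2": {"shoes": "$25"},
        "c4": {"shoes": "$35"},
        "c6": {"shoes": "$45"},
    },
    "Facebook": {
        "c2": {"likes": "5/5/14"},
        "c3": {"likes": "4/3/12"},
        "c4": {"likes": "11/1/13"},
        "c5": {"likes": "9/24/12"},
    },
}


def full_outer_join(sources):
    """Merge {prefix: {key: record}} into one sparse row per key."""
    rows = {}
    for prefix, table in sources.items():
        for key, record in table.items():
            row = rows.setdefault(key, {})
            for field_name, value in record.items():
                row[f"{prefix}.{field_name}"] = value
    return rows


rows = full_outer_join(sources)
columns = sorted({col for row in rows.values() for col in row})
filled = sum(len(row) for row in rows.values())
total = len(rows) * len(columns)
```

Each row stores only the columns it actually has, so the missing cells (the dashes in the table) cost nothing. With more sources, the column count grows while most rows stay short.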

Page 17: NoSQL Databases and Analytic Use Cases

BigTable Data Model

Row ID  Column          Value
R000    Cust.name       Bob
R000    Cust.age        43
R000    Orders.shoes    $50
R002    Cust.name       Sally
R002    Cust.age        32
R002    Facebook.likes  4/3/12
…       …               …
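
The table above can be read as a sorted list of (row ID, column, value) entries. The sketch below is a simplification under that assumption: real BigTable-style stores add details such as timestamps that are left out here, and `scan_row` is a hypothetical helper, not an API of any listed database.

```python
from bisect import bisect_left

# One entry per populated cell; absent cells are simply not stored.
cells = sorted([
    ("R000", "Cust.name", "Bob"),
    ("R000", "Cust.age", "43"),
    ("R000", "Orders.shoes", "$50"),
    ("R002", "Cust.name", "Sally"),
    ("R002", "Cust.age", "32"),
    ("R002", "Facebook.likes", "4/3/12"),
])


def scan_row(cells, row_id):
    """Return the (column, value) pairs of one row via a range scan on sorted keys."""
    start = bisect_left(cells, (row_id,))  # first entry at or after row_id
    result = []
    for row, column, value in cells[start:]:
        if row != row_id:
            break
        result.append((column, value))
    return result


sally = scan_row(cells, "R002")
missing = scan_row(cells, "R001")
```

Because the entries are kept sorted by row ID, reading a whole row is one seek followed by a short sequential scan, and a row with no data returns nothing.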

Page 18: NoSQL Databases and Analytic Use Cases

MongoDB Data Model

{
  Cust.name: “Bob”,
  Cust.age: 43,
  Orders.shoes: $50
},
{
  Cust.name: “Sally”,
  Cust.age: 32,
  Facebook.likes: 4/3/12
},
…

Page 19: NoSQL Databases and Analytic Use Cases

NoSQL Data Loading Shift

• NoSQL Analytics: composite, sparse schemas; scale out; aggressive indexing; data discovery

• Conventional BI: data cleaning; regularization; denormalization; star schema; known operational schemas

Page 20: NoSQL Databases and Analytic Use Cases

Analytics

[Diagram: Business Activity -> OLTP (x3) -> NoSQL DB -> Business Intelligence. Annotations: "Schema Discovery", "Joins happen here".]
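
One way to picture schema discovery is to derive the field list from the loaded records instead of declaring it up front. The sketch below assumes that reading; `discover_schema` is a hypothetical helper, and the two documents reuse the values from the MongoDB Data Model slide.

```python
from collections import Counter

documents = [
    {"Cust.name": "Bob", "Cust.age": 43, "Orders.shoes": "$50"},
    {"Cust.name": "Sally", "Cust.age": 32, "Facebook.likes": "4/3/12"},
]


def discover_schema(docs):
    """Count how many documents carry each field and which value types appear."""
    field_counts = Counter()
    field_types = {}
    for doc in docs:
        for name, value in doc.items():
            field_counts[name] += 1
            field_types.setdefault(name, set()).add(type(value).__name__)
    return field_counts, field_types


field_counts, field_types = discover_schema(documents)
```

The counts show which fields are common and which are sparse, which is the kind of information a known operational schema would otherwise supply in advance.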

Page 21: NoSQL Databases and Analytic Use Cases

NoSQL Analytics Shift

• Transformations: MapReduce; pre-computed; large answers; simple lookups

• Queries: SQL; computed on the fly; small answers; roll up; drill down

Page 22: NoSQL Databases and Analytic Use Cases

Analytics

[Diagram: Business Activity -> OLTP (x3) -> NoSQL DB -> Business Intelligence. Annotations: "MapReduce", "Transformations", "Fast Lookups".]
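
The "pre-compute with MapReduce, then answer with a lookup" pattern can be sketched in a single process. This is a toy version under stated assumptions: `map_reduce` is a hypothetical in-memory helper rather than Hadoop's API, and the "socks" order is invented for the example so that there is more than one key.

```python
from collections import defaultdict

# (customer, product, amount) records; the socks row is hypothetical.
orders = [
    ("Bob", "shoes", 50),
    ("Sarah", "shoes", 25),
    ("Joe", "shoes", 45),
    ("Bob", "socks", 5),
]


def map_reduce(records, mapper, reducer):
    """Group mapper output by key, then reduce each group to one value."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def emit_product_amount(record):
    _customer, product, amount = record
    yield product, amount


def sum_amounts(_key, values):
    return sum(values)


# Transformation: computed once, ahead of time, over all records.
totals_by_product = map_reduce(orders, emit_product_amount, sum_amounts)

# Query time: a simple key lookup against the pre-computed result.
shoe_total = totals_by_product["shoes"]
```

The expensive pass over all records happens in the transformation, so the query itself stays a cheap key lookup regardless of how many orders there are.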

Page 23: NoSQL Databases and Analytic Use Cases

MapReduce Analytics

• Supported: SQL (Hive), statistical modeling, machine learning, text analytics, feature extraction, image processing, graph analysis

Page 24: NoSQL Databases and Analytic Use Cases

MapReduce Analytic Workflow

• Reusable Transforms

• Searchable Collections

Page 25: NoSQL Databases and Analytic Use Cases

Combined-Data Security

• Requirements: physically co-located data; strong logical access control; role-based
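
A minimal sketch of role-based logical access control over a combined row follows. Everything in it is an assumption made for illustration: the role names, the `ROLE_GRANTS` mapping, and the choice to grant access per source prefix are hypothetical and do not describe any specific product's security model.

```python
# Hypothetical mapping from role to the source prefixes it may read.
ROLE_GRANTS = {
    "sales": {"Cust", "Orders"},
    "marketing": {"Facebook"},
}


def visible_row(row, roles):
    """Return only the columns whose source prefix is granted to one of the roles."""
    allowed = set()
    for role in roles:
        allowed |= ROLE_GRANTS.get(role, set())
    return {
        column: value
        for column, value in row.items()
        if column.split(".", 1)[0] in allowed
    }


# One physically co-located row that combines three sources.
row = {
    "Cust.name": "Sarah",
    "Cust.age": 32,
    "Orders.shoes": "$25",
    "Facebook.likes": "5/5/14",
}

marketing_view = visible_row(row, ["marketing"])
combined_view = visible_row(row, ["sales", "marketing"])
no_role_view = visible_row(row, [])
```

The data sits in one place, but each reader sees only the columns their roles allow, and a reader with no roles sees nothing.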

Page 26: NoSQL Databases and Analytic Use Cases

Questions?

Page 27: NoSQL Databases and Analytic Use Cases

Contact Info

Aaron Cordova, 1-855-403-1399, www.koverse.com, [email protected]