MongoDB: Operational Big Data Database

Post on 06-May-2015

Category:

Technology

DESCRIPTION

MongoDB is the leading NoSQL database for many reasons: it is an open-source, general-purpose, document-oriented database supported by a large community and an educational platform. Its horizontal scalability features make it a fit for operational big data scenarios, where business needs point to real-time analytics and ever-increasing data sets. This talk focuses on using MongoDB for operational big data and on why it is well suited to such scenarios. It also covers integration with other notable big data technologies, such as Hadoop and BI tools. Norberto Leite - Senior Solutions Architect, @MongoDB. MongoDB presentation during the Pentaho & Big Data Ecosystem - Live Seminar 2013

TRANSCRIPT

MongoDB: Operational Big Data

Senior Solutions Architect, MongoDB

norberto@mongodb.com

Norberto Leite

@nleite

Agenda

•  MongoDB Intro

•  Big Data

•  MongoDB Operational Big Data(base)

•  Use Cases

•  QA

Ola!

•  Norberto Leite

•  Solutions Architect –  wingman

•  Barcelona/Brussels

MongoDB

MongoDB

The leading NoSQL database

Document Database

Open-Source

General Purpose

5,000,000+ MongoDB Downloads

100,000+ Online Education Registrants

20,000+ MongoDB User Group Members

20,000+ MongoDB Days Attendees

20,000+ MongoDB Management Service (MMS) Users

Global Community

MongoDB Overview

300+ employees

600+ customers

Offices in New York, Palo Alto, Washington DC, London, Dublin, Barcelona and Sydney

Over $231 million in funding

MongoDB Overview

Agile Scalable

MongoDB Vision

To provide the best database for how we build and run apps today

Build

•  New and complex data
•  Flexible
•  New languages
•  Faster development

Run

•  Big Data scalability
•  Real-time
•  Commodity hardware
•  Cloud

Operational Database Landscape

Document Data Model

Relational vs. MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123,47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
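To make the contrast on this slide concrete, here is a minimal sketch of the same record in both shapes. It assumes a hypothetical relational schema of one parent row plus one child row per car; the names `to_relational_rows`, `person_id`, `loc_0` and `loc_1` are illustrative and not from the talk. The fields elided with "…" on the slide are left out.

```python
# The document from the slide as a nested Python dict.
person = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": [45.123, 47.232],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}


def to_relational_rows(doc, person_id):
    """Split one document into a parent row and child rows (hypothetical schema)."""
    person_row = {
        "id": person_id,
        "first_name": doc["first_name"],
        "surname": doc["surname"],
        "city": doc["city"],
        # The coordinate pair becomes two scalar columns.
        "loc_0": doc["location"][0],
        "loc_1": doc["location"][1],
    }
    # Each embedded car becomes its own row with a foreign key to the person.
    car_rows = [{"person_id": person_id, **car} for car in doc["cars"]]
    return person_row, car_rows


person_row, car_rows = to_relational_rows(person, person_id=1)
print(person_row)
print(car_rows)
```

In the document form the cars travel with their owner in a single record, while the relational form needs a join on `person_id` to reassemble them.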

MongoDB is full featured

Rich Queries

•  Find Paul's cars
•  Find everybody in London with a car built between 1970 and 1980

Geospatial

•  Find all of the car owners within 5km of Trafalgar Sq.

Text Search

•  Find all the cars described as having leather seats

Aggregation

•  Calculate the average value of Paul's car collection

Map Reduce

•  What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
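The example questions above can be sketched as filters in MongoDB's query language, written here as plain Python dicts. This is an illustration, not code from the talk: no driver is imported and no database is contacted, the Trafalgar Square coordinates are approximate, the geospatial and text filters assume that suitable indexes exist, and the two helper functions only re-check two of the filters in pure Python against the sample document. The Map Reduce question is not covered.

```python
# Sample document from the slide (elided fields omitted).
person = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": [45.123, 47.232],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

# Rich query: Paul's cars (filter plus a projection that keeps only "cars").
pauls_cars_filter = {"first_name": "Paul"}
pauls_cars_projection = {"cars": 1, "_id": 0}

# Rich query: everybody in London with a car built between 1970 and 1980.
# $elemMatch makes a single array element satisfy both bounds.
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# Geospatial: owners within 5 km of Trafalgar Sq.
# Assumes a 2dsphere index on "location"; $maxDistance is in meters.
TRAFALGAR_SQ = [-0.1281, 51.5080]  # approximate [longitude, latitude]
near_trafalgar_filter = {
    "location": {
        "$near": {
            "$geometry": {"type": "Point", "coordinates": TRAFALGAR_SQ},
            "$maxDistance": 5000,
        }
    }
}

# Text search: cars described as having leather seats.
# Assumes a text index on the (elided) description field.
leather_seats_filter = {"$text": {"$search": "\"leather seats\""}}

# Aggregation: average value of Paul's car collection.
avg_value_pipeline = [
    {"$match": {"first_name": "Paul"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$first_name", "avg_value": {"$avg": "$cars.value"}}},
]


def matches_london_1970s(doc):
    """Pure-Python equivalent of london_1970s_filter for one document."""
    return doc["city"] == "London" and any(
        1970 <= car["year"] <= 1980 for car in doc["cars"]
    )


def average_car_value(doc):
    """Pure-Python equivalent of the $avg stage for one document."""
    values = [car["value"] for car in doc["cars"]]
    return sum(values) / len(values)


print(matches_london_1970s(person))  # True (the 1973 Bentley)
print(average_car_value(person))     # 215000.0
```

With a driver such as PyMongo, the filter dicts would typically be passed to `find` and the pipeline to `aggregate` on a collection object.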

MongoDB

Developers are more productive

Big Data

Best definition so far!

RDBMS Scale = Bigger Computers

“Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.”

IBM Press Release 28 Aug, 2012

Vertical Scalability

This Was a Problem for Google

Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html

250,000+ MBP's == 4.1 miles

2010 Search Index Size: 100,000,000 GB

New data added per day: 100,000+ GB

Databases they could use: 0

And for Facebook

2010: 13,000,000 queries per second

TPC Top Results

TPC #1 DB: 504,161 tps

Top 10 combined: 1,370,368 tps
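The gap between these figures is easier to see as ratios. The short calculation below uses only the numbers on the slides; note that queries per second and TPC transactions per second are not the same unit, so this is an order-of-magnitude comparison rather than a benchmark.

```python
# Figures quoted on the slides.
facebook_qps = 13_000_000   # Facebook, 2010: queries per second
tpc_top_tps = 504_161       # TPC #1 DB: transactions per second
tpc_top10_tps = 1_370_368   # TPC top 10 combined

ratio_vs_top = facebook_qps / tpc_top_tps
ratio_vs_top10 = facebook_qps / tpc_top10_tps

print(f"vs. TPC #1:     {ratio_vs_top:.1f}x")    # 25.8x
print(f"vs. TPC top 10: {ratio_vs_top10:.1f}x")  # 9.5x
```

Even the ten best TPC results added together fall roughly an order of magnitude short of the quoted Facebook load.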

Living in the Post-transactional Future

Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL)

Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS)

Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL)

Shift in What We’re Computing

How IT/Data Scientists Define Big Data

Source: Silicon Angle, 2012

MongoDB Operational Big Data(base)

Consideration – Online vs. Offline

Online vs. Offline

Online

•  Real-time
•  Low-latency
•  High availability

Offline

•  Long-running
•  High-latency
•  Availability is lower priority

MongoDB/NoSQL Is Good for…

•  360° View of the Customer
•  Mobile & Social Apps
•  Fraud Detection
•  User Data Management
•  Content Management & Delivery
•  Reference Data
•  Product Catalogs
•  Machine to Machine Apps
•  Data Hub

MongoDB and Enterprise IT Stack

[Diagram: layers of the enterprise IT stack]

Applications: CRM, ERP, Collaboration, Mobile, BI

Data Management: Online Data (RDBMS), Offline Data (EDW, Hadoop)

Infrastructure: OS & Virtualization, Compute, Storage, Network

Across all layers: Management & Monitoring, Security & Auditing

Horizontal Scalability

MongoDB Architecture

Use Cases

Leading Organizations Rely on MongoDB

Fortune 500 & Global 500

•  10 of the Top Financial Services Institutions

•  10 of the Top Electronics Companies

•  10 of the Top Media and Entertainment Companies

•  8 of the Top Retailers

•  6 of the Top Telcos

•  5 of the Top Technology Companies

•  4 of the Top Healthcare Companies

MongoDB Solutions

•  Big Data
•  Content Mgmt & Delivery
•  Mobile & Social
•  Data Hub
•  User Data Management

Customer example: Online Travel

Travel

•  Flights, hotels and cars
•  Real-time offers
•  User profiles, reviews
•  User metadata (previous purchases, clicks, views)

Algorithms

•  User segmentation
•  Offer recommendation engine
•  Ad serving engine
•  Bundling engine

MongoDB Connector for Hadoop

Machine Learning

Ad-Serving

•  Catalogs and products
•  User profiles
•  Clicks
•  Views
•  Transactions

Algorithms

•  User segmentation
•  Recommendation engine
•  Prediction engine

MongoDB Connector for Hadoop

Data Hub

Insurance

•  Insurance policies
•  Demographic data
•  Customer web data
•  Call center data
•  Real-time churn detection

Churn Analysis

•  Customer action analysis
•  Churn prediction algorithms

MongoDB Connector for Hadoop

Obrigado!

Norberto Leite

@nleite

Senior Solutions Architect, MongoDB

norberto@mongodb.com

QA ?
