Pentaho and MongoDB Partner to Solve Government Big Data Challenges


Upload: pentaho

Post on 10-May-2015


DESCRIPTION

With growing volumes and varieties of data flowing at increasing speed, government agencies need a fast and easy way to harness and gain insight from their big data sources. The expanded level of native integration between Pentaho Business Analytics 5.0 and MongoDB provides the first analytics capability with full support for this popular NoSQL data store. The combination of Pentaho Business Analytics and MongoDB helps government agency developers to:

• Increase Data Value: With Pentaho, MongoDB data can be accessed, blended, visualized and reported in combination with any other data source for increased insight and operational analytics.

• Reduce Complexity: Reporting on data stored in MongoDB is simplified, increasing developer productivity with Pentaho’s automatic document sampling, drag and drop interface and schema generation.

• Accelerate Data Access and Querying: With no impact on throughput, this integration builds on the features and capabilities in MongoDB, such as the Aggregation Framework, Replication and Tag Sets.

These are the slides from the live webinar that took place on Tuesday, December 3, 2013 at 10 am PT.

TRANSCRIPT

Page 1: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555 1

Dave Henry SVP Enterprise Solutions, Pentaho

December 2013

Pentaho & MongoDB Partner to Solve Government Big Data Challenges

Bob Gourley Publisher, CTOvision.com

Will LaForest Director of Federal, MongoDB

Page 2: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Best Practices for Federal Big Data Projects

Big Data Management

Bob Gourley Publisher, CTOvision.com

Page 3: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

A focus on a new discipline of “Big Data Management”

Intro to top 5 “Best Practices” of Federal Data activities

Invitation to collaborate and refine approaches

A perpetual draft - your input is requested

Brief Purpose Research & Reports

Contribute your thoughts at CTOvision.com

Page 4: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Update Sources

• Big Data Government Newsletter - reader survey: 2,600 readers, 2% response rate, across Federal agencies

• Review of openly published research by Wikibon, TDWI, IDC, Gartner, Forrester and of course our own CTOvision

• Review of best practices and use cases from the best vendors in Enterprise Big Data

• Engagement of the community at events like Strata and Hadoop World

Planning Assumption: The ability to collect, parse, analyze machine data in real time, whether on premise or in the cloud, will continue to grow

Page 5: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Big Data Management

• Agencies are thinking through the right changes to concepts and technologies

• Old approaches still important, but cannot solve emerging problems

• Big Data Management is an evolved discipline which builds on existing data management approaches to leverage new concepts, technologies and best practices to optimize mission support

Page 6: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Solutions That Require Big Data Management

• Open Source Information: analysis and integration
• Situational Awareness across disparate data sets
• Two use cases: “Connect the Dots” and “Needle in Haystack”
• Cyber Security: rapid real time analysis of all relevant data
• Asset catalog across extensive/dynamic enterprises
• Rapid return of geospatial data
• Location based push of data
• Real time return of relevant search
• Real time suggestion of topics
• Bioinformatics:
  • Human Genome
  • Patient location, treatment, outcomes
• Law Enforcement: Predictive Policing
• Data Hub: Unified storage, governance, security, functionality

Page 7: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Best Practices in Big Data Management

VISION Start with a mission-focused vision. This will vary by organization. Support to mission will drive everything else. Consider that analytics and Big Data go together.

STRATEGY Should prioritize and tackle challenges like: Changes to governance processes, right mix of skills for workforce, learning new technology, prioritizing which workload types will be handled by which part of the architecture.

KNOW Know existing infrastructure and processes, with a focus on: understanding of legal/policy dynamics relevant to your agency, understanding of new capabilities available, current and required throughputs/capacities, types of workloads supported by each component in the architecture, and available tech choices.

DESIGN Document and continuously improve. Architect to manage data in its original form. Include right mix of traditional and new in your design. Don’t assume any one platform will be a solution. Architect to insulate applications and users from a variety of disparate big data platforms.

EXECUTE Avoid custom coding wherever possible. Don’t let new Big Data Platforms become proprietary silos. ETL remains important. Ensure training for all based on job function. Don’t neglect your own training. Serve the analyst.

Page 8: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Continue your market surveys, stay aware of what new technologies can do for you.

Revisit your vision. As you do, ponder this: How can you leverage data to support your mission?

Continue to study use-cases and exchange best practices. Dialog with others in and out of your sector. Great lessons are coming from other industries.

Continue to engage with the broader community. Sign-up for our Government Big Data Weekly.

Share your lessons learned.

Next Steps

Page 9: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Provide Your Thoughts, Input, Questions

E-mail: [email protected]

Blog: http://ctovision.com

Twitter: http://www.twitter.com/bobgourley

Facebook, LinkedIn, etc: See the blog

Page 10: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

The Modern Operational Database for Government

Will LaForest Director of Federal, MongoDB

Page 11: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

The Evolution of Databases

[Figure: timeline of database technologies by decade. 1990: RDBMS. 2000: RDBMS, OLAP/BI. 2010: RDBMS, NoSQL, OLAP/BI, Hadoop. Other labels: Operational & Real-time, Datawarehouse, Online, Offline.]

Page 12: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Relational Database Challenges

Variety

• Unstructured data

• Semi-structured data

• Polymorphic data

Volume & Velocity

• Petabytes of data

• Trillions of records

• Millions of queries per second

Agile Development

• Iterative

• Short development cycles

• New workloads

New Architectures

• Horizontal scaling

• Commodity servers

• Cloud computing

Page 13: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

MongoDB The Modern Operational Database

Document Oriented

Open-Source

General Purpose

Page 14: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Fully Featured

MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123,47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}

Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul’s car collection

Native Indexes
• Secondary • Compound • Geospatial • Full Text • Hash • Covering
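
The query types on this slide can be illustrated without a running server. The following is a minimal Python sketch, not MongoDB's query engine: it evaluates two of the rich queries and the aggregation over in-memory dictionaries. It assumes the Paul Miller document from the slide (with the elided "…" fields left out) plus one hypothetical second owner, added only so the filters have something to exclude. The function names are illustrative.

```python
# In-memory stand-ins for documents in a collection. The second owner is
# hypothetical and exists only to give the filters something to exclude.
people = [
    {
        "first_name": "Paul", "surname": "Miller", "city": "London",
        "cars": [
            {"model": "Bentley", "year": 1973, "value": 100000},
            {"model": "Rolls Royce", "year": 1965, "value": 330000},
        ],
    },
    {
        "first_name": "Anna", "surname": "Jones", "city": "London",
        "cars": [{"model": "Mini", "year": 1995, "value": 8000}],
    },
]


def cars_of(first_name):
    """Rich query: models of every car owned by people named `first_name`."""
    return [car["model"]
            for person in people if person["first_name"] == first_name
            for car in person["cars"]]


def owners_in_city_with_car_between(city, first_year, last_year):
    """Rich query: owners in `city` with at least one car built in the range."""
    return [person["first_name"]
            for person in people
            if person["city"] == city
            and any(first_year <= car["year"] <= last_year
                    for car in person["cars"])]


def average_car_value(first_name):
    """Aggregation: mean value of all cars owned by `first_name`."""
    values = [car["value"]
              for person in people if person["first_name"] == first_name
              for car in person["cars"]]
    return sum(values) / len(values)


print(cars_of("Paul"))
print(owners_in_city_with_car_between("London", 1970, 1980))
print(average_car_value("Paul"))
```

In a real deployment these would be expressed as query documents and an aggregation pipeline executed by the server, typically backed by the indexes listed on the slide; the sketch only shows what each query is asking for.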

Page 15: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

MongoDB and Enterprise IT Stack

[Figure: enterprise IT stack diagram. Layers: Applications (CRM, ERP, Collaboration, Mobile, BI); Data Management (RDBMS, EDW, Hadoop, divided into Online Data and Offline Data); Infrastructure (OS & Virtualization, Compute, Storage, Network). Side labels: Management & Monitoring, Security & Auditing.]

Page 16: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Variety – Modern Data

Page 17: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Document Data Model

[Figure: side-by-side comparison labeled Relational and MongoDB]

MongoDB

{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123,47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
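
The relational half of this comparison did not survive in the transcript, so the following Python sketch is an illustration under stated assumptions rather than a reconstruction of the slide. It assumes a conventional two-table layout (one person row, one row per car, linked by a hypothetical `person_id` key) and shows how the single nested document splits into those rows.

```python
def to_relational_rows(person_id, doc):
    """Split one nested document into a parent row and its child rows.

    The two-table layout and the `person_id` key are assumptions made for
    illustration; they are not taken from the slide.
    """
    person_row = {
        "person_id": person_id,
        "first_name": doc["first_name"],
        "surname": doc["surname"],
        "city": doc["city"],
    }
    car_rows = [
        {"person_id": person_id, "model": car["model"],
         "year": car["year"], "value": car["value"]}
        for car in doc["cars"]
    ]
    return person_row, car_rows


# The document from the slide, without the elided fields.
paul = {
    "first_name": "Paul", "surname": "Miller", "city": "London",
    "location": [45.123, 47.232],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

person_row, car_rows = to_relational_rows(1, paul)
print(person_row)
for row in car_rows:
    print(row)
```

Reading Paul and his cars back from the two-table form requires a join on `person_id`, whereas the document form returns everything in one lookup.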

Page 18: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Dynamic Schema

MongoDB does not require a predefined data schema. Every document can have different fields.

{name: "jeff", eyes: "blue", height: 72, boss: "ben"}

{name: "brendan", aliases: ["el diablo"]}

{name: "ben", hat: "yes"}

{name: "matt", pizza: "DiGiorno", height: 74, boss: "555.555.1212"}

{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "la ciacco"], gender: "???", boss: "ben"}
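
Application code that reads such documents has to tolerate missing fields. The following is a small Python sketch of that idea, using the five documents from the slide as plain dictionaries; the phone-number value for `boss` is written as a string here so that it is a valid Python value.

```python
from collections import Counter

# The five documents from the slide, as Python dictionaries.
people = [
    {"name": "jeff", "eyes": "blue", "height": 72, "boss": "ben"},
    {"name": "brendan", "aliases": ["el diablo"]},
    {"name": "ben", "hat": "yes"},
    {"name": "matt", "pizza": "DiGiorno", "height": 74, "boss": "555.555.1212"},
    {"name": "will", "eyes": "blue", "birthplace": "NY",
     "aliases": ["bill", "la ciacco"], "gender": "???", "boss": "ben"},
]

# Count how many documents carry each field.
field_counts = Counter(field for doc in people for field in doc)

# Read an optional field with a default instead of assuming it exists.
heights = {doc["name"]: doc.get("height") for doc in people}

# Filter on a field that only some documents have.
reports_to_ben = [doc["name"] for doc in people if doc.get("boss") == "ben"]

print(field_counts.most_common(3))
print(heights)
print(reports_to_ben)
```

Only `name` appears in every document; every other field is optional, which is why the lookups use `dict.get` rather than direct indexing.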

Page 19: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Volume, Velocity, and New Architectures

Page 20: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Automatic Sharding

• Increase or decrease capacity as you go

• Automatic balancing

• Optimized for commodity servers and cloud infrastructure
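
The idea behind automatic balancing can be sketched with a toy model. The Python below is not MongoDB's balancer; it assumes data is split into equally sized chunks and simply moves chunks from the fullest shard to the emptiest one until the counts differ by at most one. The shard names and chunk ids are hypothetical.

```python
def rebalance(shards):
    """Move chunks from the fullest shard to the emptiest until the chunk
    counts differ by at most one. `shards` maps shard name -> chunk list.
    Returns the number of chunk migrations performed."""
    migrations = 0
    while True:
        fullest = max(shards, key=lambda name: len(shards[name]))
        emptiest = min(shards, key=lambda name: len(shards[name]))
        if len(shards[fullest]) - len(shards[emptiest]) <= 1:
            return migrations
        shards[emptiest].append(shards[fullest].pop())
        migrations += 1


# Two shards holding six chunks each (chunk ids are hypothetical).
cluster = {
    "shard-a": list(range(0, 6)),
    "shard-b": list(range(6, 12)),
}

# Increase capacity: add an empty shard, then let the balancer even things out.
cluster["shard-c"] = []
moved = rebalance(cluster)
print(moved, {name: len(chunks) for name, chunks in cluster.items()})
```

Adding a third shard to two full ones leaves each with four chunks after four migrations, and no chunk is lost or duplicated along the way.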

Page 21: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

High Availability

• Automated replication and failover

• Zero downtime during hardware failures and upgrades

• Multi-data center support

• Improved operational simplicity (e.g., HW swaps)

• Data durability and consistency

Page 22: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

MongoDB Performance*

Top 5 Marketing Firm
• Data: Key/value
• Queries: Key-based; 1 – 100 docs/query; 80/20 read/write
• Servers: ~250
• Ops/sec: 1,200,000

Government Agency
• Data: 10+ fields, arrays, nested documents
• Queries: Compound queries; Range queries; MapReduce; 20/80 read/write
• Servers: ~50
• Ops/sec: 500,000

Top 5 Investment Bank
• Data: 20+ fields, arrays, nested documents
• Queries: Compound queries; Range queries; 50/50 read/write
• Servers: ~40
• Ops/sec: 30,000

* These figures are provided as examples. Your application governs your performance.
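
The figures are easier to compare once normalized per server. The following Python sketch does that arithmetic, assuming the three Servers values and the three Ops/sec values line up with the three deployments in the order listed, and treating the approximate server counts ("~250" and so on) as exact.

```python
# Figures from the slide; server counts are approximate in the source.
deployments = {
    "Top 5 Marketing Firm": {"servers": 250, "ops_per_sec": 1_200_000},
    "Government Agency": {"servers": 50, "ops_per_sec": 500_000},
    "Top 5 Investment Bank": {"servers": 40, "ops_per_sec": 30_000},
}

# Throughput divided by server count for each deployment.
per_server = {
    name: d["ops_per_sec"] / d["servers"]
    for name, d in deployments.items()
}

for name, rate in per_server.items():
    print(f"{name}: ~{rate:,.0f} ops/sec per server")
```

The per-server rates differ by more than an order of magnitude, which fits the slide's own caveat that data shape and query mix, rather than server count alone, govern performance.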

Page 23: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Replication Benefits

Page 24: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Operational and Analytical Workloads

• Application interacts with primaries

• Analytical workloads on secondaries

• Workloads are isolated from one another

• Working set appropriate for each application
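
The workload isolation described above can be sketched as a routing rule. The Python below is a toy model, not a MongoDB driver: it assumes one primary and two secondaries with hypothetical node names, sends operational traffic to the primary, and rotates analytical reads across the secondaries.

```python
import itertools


class ReplicaSetRouter:
    """Toy router: operational traffic goes to the primary, analytical
    reads rotate across the secondaries."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self._secondaries = itertools.cycle(secondaries)

    def route(self, workload):
        if workload == "operational":
            return self.primary
        if workload == "analytical":
            return next(self._secondaries)
        raise ValueError(f"unknown workload: {workload}")


# Node names are hypothetical.
router = ReplicaSetRouter("node-1", ["node-2", "node-3"])
requests = ["operational", "analytical", "analytical",
            "operational", "analytical"]
targets = [router.route(workload) for workload in requests]
print(targets)
```

In MongoDB this kind of routing is configured rather than hand-written, through read preferences and tag sets; the sketch only shows why analytical queries never compete with the application for the primary.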

Page 25: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Global Data Distribution

[Figure: diagram in which the label “Real-time” appears seven times.]

Page 26: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Read Global / Write Local

[Figure: nodes labeled Primary:NYC, Primary:LON and Primary:SYD, each with two matching secondaries (Secondary:NYC, Secondary:LON, Secondary:SYD).]

Page 27: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Solving Big Data Challenges in the Federal Government

Dave Diegtel Head of Federal Sales, Pentaho

Page 28: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Why Pentaho for Federal Government

• Company and Product Maturity: Pentaho has been around for over 9 years, has thousands of paying customers, and has reached its 5.0 release. Pentaho is proven and less risky.

• Business Model and Subscription: Pentaho’s subscription model and server-based pricing allow for lower upfront investment and risk compared to legacy BI vendors, which traditionally cost an average of 4X as much for similar-size deployments.

• Government Certifications: Pentaho has made significant investments in government certifications and compliance, such as Section 508 and security.

• Open APIs and an extensible architecture enable ease of integration and reduce the potential for vendor lock-in.

• Existing Government Customers and Cleared Personnel

Page 29: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

A Comprehensive Big Data Platform

Dave Henry Senior VP Enterprise Solutions, Pentaho

Page 30: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Pentaho 5.0: Architected for the Future

Simplified analytics experience for all users

ANY Analytics
• Reports • Dashboards • Visualizations • Discovery • Predictive Analytics

ANY Environment
• Data warehouses • Data marts • Stack vendors • Cloud • Embedded

Existing & New Data Infrastructure & Processes

ANY Data
• Relational • Operational • Big Data • Data sources not yet anticipated…

[Figure labels: Billing, Location, Social Media, Customer, Web, Network]

Page 31: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

The New Reality

Simplified analysis for all users

• Simplified Analytics Experience

• Enterprise Big Data Integration

• Blended Big Data

Page 32: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Pentaho & MongoDB Enable Key Use Cases

Customer 360 and Device Data Analytics enable comprehensive insight

[Figure labels: Pentaho Data Integration (shown twice); Mission Scope; Pentaho Analytics: Reporting, Dashboards, Visualization, Discovery]

• MongoDB delivers Scalable, Low-Latency Enterprise Data Store

• Visual ETL development with Pentaho Data Integration (PDI)

• Reporting, Dashboards, Visualization and Discovery with Pentaho Analytics

Page 33: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Enterprise Customer Data Store

Powerful data integration for MongoDB

[Figure labels: Web Event Data, POS Data, Customer Master, PDI ETL, mongoDB cluster, $push to data arrays]
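
The "$push to data arrays" label refers to appending incoming records to arrays inside a customer document. The Python sketch below imitates that behaviour on an in-memory dictionary so it runs without a server; it is not Pentaho's or MongoDB's implementation. The helper `apply_push`, the customer record, and the array names `web_events` and `pos_transactions` are all hypothetical.

```python
import copy


def apply_push(document, update):
    """Return a copy of `document` with each {"$push": {field: value}}
    entry appended to the named array, creating the array if missing."""
    result = copy.deepcopy(document)
    for field, value in update.get("$push", {}).items():
        result.setdefault(field, []).append(value)
    return result


# Customer master record (hypothetical values).
customer = {"_id": 42, "name": "Example Customer"}

# Web event and point-of-sale records arriving from separate sources.
customer = apply_push(customer, {"$push": {"web_events": {"page": "/home"}}})
customer = apply_push(customer, {"$push": {"pos_transactions": {"amount": 19.99}}})
customer = apply_push(customer, {"$push": {"web_events": {"page": "/cart"}}})

print(customer)
```

Each source appends to its own array, so one customer document accumulates web and POS history side by side without the sources overwriting each other.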

Page 34: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Data Integration Exploits MongoDB’s native APIs and query language

Page 35: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Operational Reports Multi-page, highly formatted reports – real-time, scheduled or burst to email

Page 36: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Operational Dashboards Highly tailored, pixel-perfect dashboards on MongoDB

Page 37: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Analyzer Explore and visualize data

Page 38: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

James Dixon Founder and CTO, Pentaho

As CTO at Pentaho, James Dixon is responsible for Pentaho's architecture and technology roadmap. James has over 15 years of professional experience in software architecture, development and systems consulting. Prior to Pentaho, James held key technical roles at AppSource Corporation (acquired by Arbor Software which later merged into Hyperion Solutions) and Keyola (acquired by Lawson Software). Earlier in his career, James was a technology consultant working with large and small firms to deliver the benefits of innovative technology in real-world environments.

Page 39: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

• Pentaho is the best platform to connect, integrate, and analyze both traditional sources and MongoDB

• Pentaho embraces and extends the MongoDB environment with rich visualization and exploration of data

• Pentaho’s Subscription-based business model lowers upfront investments, enabling faster ROI

• Pentaho has dozens of Federal Government customers and has made significant investments in government certifications and cleared personnel

• Pentaho and MongoDB are established partners – Pentaho carefully engineers its products to use the latest MongoDB APIs to provide the best possible performance

Why Pentaho?

Page 40: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Next Steps and Q&A

• Needs Assessment with Pentaho and MongoDB
  • Dave Diegtel - [email protected]
  • Will LaForest - [email protected]
• Try Pentaho (30-Day Free Trial): pentaho.com/download
• Learn More about Big Data and Government Solutions
  • Pentaho
    • Big Data Website: pentahobigdata.com/
    • Government Solutions: pentaho.com/solutions/government
  • MongoDB
    • Government Solutions: mongodb.com/industries/government
    • Big Data: Examples and Guidelines for the Enterprise Decision Maker: mongodb.com/lp/whitepaper/big-data-nosql
    • MongoDB Top 5 Considerations When Evaluating NoSQL Databases: mongodb.com/lp/whitepaper/nosql-considerations
• Sign-up for the Big Data Government Newsletter at CTOvision.com & take reader survey

Page 41: Pentaho and MongoDB Partner to Solve Government Big Data Challenges

Thank You