Big Data: Myths and Realities

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.

Upload: toronto-oracle-users-group

Post on 26-Jan-2015


DESCRIPTION

Presented at TOUG on May 21, 2014

TRANSCRIPT

Page 1: Big Data: Myths and Realities


Big Data: Myths & Realities

Oleksiy Razborshchuk

Distinguished Solution Architect

Oracle Canada ULC

May 21st, 2014

People. Process. Portfolio.

Page 2: Big Data: Myths and Realities


Agenda

Big Data

Oracle’s Big Data Solution and Differentiators

Use Cases and Implementation Examples

Page 3: Big Data: Myths and Realities


True or False?

Page 4: Big Data: Myths and Realities


What Makes Big Data BIG DATA?

• Volume: Very large quantities of data
• Velocity: Extremely fast streams of data
• Variety: Wide range of datatype characteristics
• Value: High potential business value if harnessed

[Figure labels: Blog, Telematics, Social]

Page 5: Big Data: Myths and Realities

Challenge: Exploiting Synergies

Big Data. Big Architecture.

[Diagram: Acquire – Organize – Analyze – Decide]

Page 6: Big Data: Myths and Realities


Basics Of Hadoop

[Diagram: a Name Node (in memory: File 1, Pieces 1, 2, 3) and four Data Nodes holding numbered pieces; a Job Tracker and four Task Trackers, each running Map and Reduce tasks from a JAR]

Page 7: Big Data: Myths and Realities


MapReduce Example

Input: Hello World Goodbye World

Map output <K,V>: <Hello,1> <World,1> <Goodbye,1> <World,1>

Grouped by key <K,V,V,V,V>: <World,1,1> <Hello,1> <Goodbye,1>

Reduce output: <Goodbye,1> <Hello,1> <World,2>
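The word count on this slide can be reproduced outside Hadoop. The following is a minimal single-process sketch in Python, not Hadoop code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen for illustration, and the grouping step stands in for what the framework does between the map and reduce stages.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a <word, 1> pair for every word in the input line."""
    return [(word, 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key: <K,V> pairs become <K,V,V,...>."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped):
    """Sum the values of each key and return the results sorted by key."""
    return sorted((key, sum(values)) for key, values in grouped.items())


def word_count(line):
    """Run the three illustrative phases in order on one line of text."""
    return reduce_phase(shuffle_phase(map_phase(line)))


if __name__ == "__main__":
    # Same input as the slide.
    print(word_count("Hello World Goodbye World"))
    # prints [('Goodbye', 1), ('Hello', 1), ('World', 2)]
```

On a real cluster the map and reduce functions run in parallel on different nodes; here they run sequentially so that the intermediate pairs can be inspected.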

Page 8: Big Data: Myths and Realities


Wrap Up

Page 9: Big Data: Myths and Realities

Hadoop Architecture


Page 10: Big Data: Myths and Realities


Cloudera Stack

Page 11: Big Data: Myths and Realities


Active Archive

Transformation and Processing

Self-Service Exploratory BI

Advanced Analytics

Enterprise Data Hub (EDH)

Page 12: Big Data: Myths and Realities


What Is a Big Data Environment?


Page 13: Big Data: Myths and Realities


Unified Data Analytics Environment

Unified Analytics API

SQL R MR

Unified Analytics Processing Platform

Hadoop RDBMS

Management Framework and Tools

Page 14: Big Data: Myths and Realities


Big Data in the Enterprise Information Architecture Strategy


Page 15: Big Data: Myths and Realities


Agenda

Big Data

Oracle’s Big Data Solution and Differentiators

Use Cases and Implementation Examples

Page 16: Big Data: Myths and Realities


Oracle Big Data Appliance

• Better TCO
• Faster Time to Value
• Optimized

Lower risk. Engineered to perform.

Page 17: Big Data: Myths and Realities


What Do We Mean by Commodity DIY?

Commodity DIY: Different Platform Every Time
• Applications
• Hadoop Distribution
• OS: Red Hat / CentOS
• Compute & Storage: CPU, RAM, Blade, Rack
• Networking: Cisco
• 120+ separate parts
• Months from start to production

Oracle Big Data Appliance: Integrated, Tuned, Optimized, Identical
• 1 Big Data Appliance
• Unpack to production in days

Page 18: Big Data: Myths and Realities

© 2014 Oracle Corporation and CIBC – Proprietary and Confidential

Why Oracle Big Data Appliance vs. Commodity

With proof points on the following slides

• Designed and Engineered by Cloudera & Oracle (OEM)

• Big Data Best Practices already implemented

• Pre-Integrated, pre-optimized, and pre-tuned before arrival

• Comprehensive (all h/w, s/w, tools, integration labour)

• Manageability top-to-bottom

• Secure and hardened

• Shorter deployment and time to market

• Faster Performance

• Lower TCO

Page 19: Big Data: Myths and Realities


BDA TCO Beats Build Your Own Hadoop Cluster

[Chart: TCO (axis $0 to $1,400,000) by year (Year 1 to Year 5) for Oracle BDA, HP+Cloudera, Cisco+Cloudera, Dell+Cloudera, IBM+Cloudera]

Page 20: Big Data: Myths and Realities


Engineered by Cloudera and Oracle

Managed Distribution

– Components certified to work together and on Oracle Big Data Appliance in regular updates, on the same hardware/software stack as all our customers

Cloudera’s Hadoop Knowledge Engineered into the system

– Master service layout, settings for Hadoop parameters

– Optimized data block size, number of MapReduce slots

– InfiniBand fabric optimized

Enterprise Hadoop Features jointly developed

– Multi-Homing for Hadoop

– Highly Available NameNode Solution

– Tight integration between Oracle Enterprise Manager and Cloudera Manager

– Sentry security (invented by Cloudera and Oracle)

Page 21: Big Data: Myths and Realities


Engineered for Quicker Time and Lower Cost

Compared with a DIY Cluster

ESG believes that a "buy" versus "do-it-yourself" approach will yield roughly one-third faster time-to-market benefit improvement...

[Chart: Time to Market (Weeks), axis 0 to 30: Oracle Big Data Appliance vs. Build it yourself]

[…] nearly 40% cost savings versus IT architecting, designing, procuring, configuring, and implementing its own big data infrastructure.

[Chart: Cost: Initial Infrastructure/Tasks, axis 0 to 800,000: Oracle Big Data Appliance vs. Build it yourself]

http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf

Page 22: Big Data: Myths and Realities


Engineered for Performance

Compared with a DIY Cluster

[Chart: Time (hours), axis 0 to 10: Big Data Appliance vs. DIY Hadoop Cluster, annotated "6x"]

• Configured for exceptional performance on delivery
• 6x faster than custom 20-node Hadoop cluster for large batch transformation jobs
• Engineering done by Oracle and Cloudera:
– OS and File System Tuning
– Java Virtual Machine Tuning
– Hadoop Configuration and Setup

Page 23: Big Data: Myths and Realities


Enterprise-Grade Big Data

Comparison: BDA 2.5 vs. DIY CDH 4.6

• Integrated Management Console
• Single Command, Full Stack Patching and Upgrade
• Automatic Cluster Re-Configuration
• Encryption and Auditing out-of-box
• Authentication, Access Control
• HA / DR
• Engineered by Cloudera for EDH
• Tuned and Optimized Performance (OS, Java, Hadoop, InfiniBand)

Page 24: Big Data: Myths and Realities


Oracle Unified Information Reference Architecture

Native integration between BDA and Exadata (like iPhone and iPad)

Stages: Stream; Acquire – Organize – Analyze; Decide

[Diagram components: Oracle BI Foundation Suite; Oracle Real-Time Decisions; Endeca Information Discovery; Oracle Event Processing; Oracle Big Data Connectors; Oracle Data Integrator; Oracle Advanced Analytics; Oracle Database; Oracle OLAP, Spatial, Graph; Apache Flume; Oracle GoldenGate; Oracle NoSQL Database; Cloudera Hadoop; Oracle R Distribution; Oracle Coherence. Platforms: Oracle Big Data Appliance, Oracle Exadata]

Page 25: Big Data: Myths and Realities


Big Data Connectors and Data Integrator

Big Data Appliance + Hadoop

Exadata + Oracle Data Warehouse

15 TB / hour

10x Faster

Page 26: Big Data: Myths and Realities


Agenda

Big Data

Oracle’s Big Data Solution and Differentiators

Use Cases and Implementation Examples

Page 27: Big Data: Myths and Realities


Big Data Solutions for Financial Services

IT Optimization
• ETL and batch processing
• Extended Data Warehouse
• Mainframe offloading
• Active Archiving

Big Data Analytics
• Customer 360
• Omni-channel CX
• Cross-selling / Geo-fencing
• Payment Analytics

Business Process Transformation
• AML / Anti-Fraud
• Risk Management
• Pricing Management
• Compute Offload (VAR)

Page 28: Big Data: Myths and Realities


Customer 360 with NGData Lily

Oracle’s Big Data Value Added Partner

Individual Customer Behaviour Translated into Industry Specific KPIs

Customers include:

Socio-demo

Life Time Events

Mobility

Affluence

Social

Affinity

Lifestyle

Competitor

Segment

Communication Preferences

Communication History

Customer Status

Products

Usage

Customer Engagement

CLTV

Loyalty

Customer Experience

Customer DNA

Page 29: Big Data: Myths and Realities


Case Study: Lowering Costs by Simplifying IT Infrastructure

Objectives
• Comply with regulations requiring more data to support stress testing
• Reduce IT costs & streamline processing by eliminating duplicate data stores

Solution
• Single, reliable BDA/Exadata-based ODS supporting all downstream systems
• Landing zone & archival repository for both structured & unstructured data
• Use Exadata as “19th” BDA node

- Toyota Global Vision

[Diagram: Operational Data Store. Sources: Mainframe, RDBMS, more. BDA / Exadata: Agile business model; All data; De-normalized & Partial-normalized; Normalized; Aggregate data; EDW. Oracle Enterprise Manager; Oracle Data Integrator. Data Delivery: Master S1, Master S2, Master Sn; SOA/API; CRMS; Other]

Benefits
• Fast access to 85% more data
• Lower costs, simplified architecture and fast time to value

Page 30: Big Data: Myths and Realities


3 Key Takeaways from this presentation

• Big Data is not just Hadoop

• Key BD use cases: Active Archive, Data Processing, BI Analytics

• Oracle+Cloudera = most complete & integrated solution in the industry