2012 SNIA Analytics and Big Data Summit. © DataDirect Networks, Inc. (DDN). All Rights Reserved.

Creating an Enterprise-class Hadoop Platform

Joey Jablonski, Practice Director, Analytic Services

DataDirect Networks, Inc. (DDN)

Who am I?

Practice Director, Analytic Services at DataDirect Networks, Inc.

3+ years with Hadoop, 12+ with HPC

Contact Details
• @jrjablo
• [email protected]/[email protected]
• www.linkedin.com/in/joeyjablonski

Why Hadoop?

• Scalable – Performance & Capacity
• Growing Ecosystem (Flexibility)
• Established APIs & Interfaces
• Location on the adoption curve
• Proven base to create Analytical Platforms

What is Enterprise Class?

• Scalable – OPEX & CAPEX
• Manageable
• Integration with existing tools
• Flexible
• Workflow – Process Integration
• No Rip & Replace
• Metrics to manage towards
• Business Driven, Technological Capabilities

The Big Data Challenge

The Big Data Equation:

Volume + Velocity + Variety

• Volume: Petabytes of Data, Trillions of Objects
• Velocity: GB/s, TB/s, Millions of IO/s, Object Operations
• Variety: Structured, Unstructured, Streams & Batches

Analytics | Looking for Actionable Information

Billions of Data Points to Consider

• Consumer purchasing trends
• Product perception
• Drug Discovery
• Genomics
• Surveillance
• Financial Analysis

Data Gravity

[Diagram labels: Data, Services, Applications]

Why is Data Analytics so hard?

[Figure: Venn diagram with labels Hacking Skills, Substantive Expertise, Math & Statistics Knowledge, Traditional Research, Data Science]

[Figure: diagram with labels Business Acumen, Curiosity, Communications, Analytics, Poor Decisioning, Technical, Business]

What is Hadoop missing today?

• Active-Active high-availability
• Established management tools
• Enterprise integration mindset
• Enterprise class hardware
• Consistent version-compatibility & deployment
• Efficient CAPEX & OPEX scaling
• Resource management/SLAs/QoS
• Security

Hadoop Operational Considerations

[Diagram labels: Deploy, Manage, Monitor, Respond, Upgrade; Software Platform, Hardware Platform]

Today's Enterprise Picture

[Diagram label: The Cloud]

Getting there…

[Diagram labels: Improved Results, Modify Behavior, Insight]

Hadoop Architectural Considerations

Planning for Growth

[Chart labels: Adoption; Higher is Better; Capacity; Goal for Human Costs; Performance; Scalability; User Growth]

Shared v. Commodity

Shared Component Approach
• Lower Operational Costs
• Efficient operational resource scaling
• Shared resources with other IT platforms
• Efficiency in computing, connectivity & service placement

Commodity Server Approach
• Lower Entry Costs
• Shorter MTBF
• Inefficient scaling of tools and processes
• Mismatch with traditional IT operations models
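
The "Shorter MTBF" point can be made concrete with a rough calculation. The sketch below is not from the talk. It assumes independent nodes with constant (exponential) failure rates, and the per-node MTBF and node counts are hypothetical values chosen only for illustration. Under those assumptions, failure rates add, so the expected time to the first node failure in a cluster is the per-node MTBF divided by the node count.

```python
HOURS_PER_YEAR = 8760.0


def cluster_mtbf_hours(node_mtbf_hours: float, node_count: int) -> float:
    """Expected hours until the first node failure in the cluster.

    Assumes independent nodes with constant (exponential) failure rates,
    so the cluster MTBF is node_mtbf_hours / node_count.
    """
    if node_mtbf_hours <= 0 or node_count <= 0:
        raise ValueError("MTBF and node count must be positive")
    return node_mtbf_hours / node_count


def expected_failures_per_year(node_mtbf_hours: float, node_count: int) -> float:
    """Expected node failures per year, assuming failed nodes are replaced."""
    return HOURS_PER_YEAR / cluster_mtbf_hours(node_mtbf_hours, node_count)


if __name__ == "__main__":
    # Hypothetical per-node MTBF of 50,000 hours; not a figure from the talk.
    node_mtbf = 50_000.0
    for nodes in (10, 100, 1000):
        mtbf = cluster_mtbf_hours(node_mtbf, nodes)
        per_year = expected_failures_per_year(node_mtbf, nodes)
        print(f"{nodes} nodes: cluster MTBF {mtbf:.0f} h, "
              f"about {per_year:.1f} failures/year")
```

With these illustrative numbers, a 1,000-node cluster would expect a node failure roughly every 50 hours, which is one way to read the slide's concern about tools and processes scaling poorly on commodity servers.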

Ethernet v. InfiniBand

InfiniBand
• 100% Storage Management Offload
• End-to-End InfiniBand Networking with RDMA Acceleration
• Real-Time Data Delivery to Provide MapReduce Process Consistency
• Smaller Compute, Compact Storage to Minimize Data Center Impact

Ethernet
• Compatibility, ensured connectivity
• Limitations in traffic types and bandwidth availability
• High CPU/Overhead cost
• Minimal options for offloading with Linux environments

Analytic User Types

• Empowered Users
• Aware Users
• Enabled Users

Hadoop Enterprise Integration

[Diagram labels: Extract, Transform, Load; Data, Information, Insight, Results; APIs; Integration; Monitoring & Response]

And finally, Hadoop is…

…more than just hardware; it is about an ecosystem of hardware & software.
…about integrating with existing systems.
…a toolkit to build Analytical Platforms.
…a component of the larger corporate processes and mandates.
…a component of the wider business KPIs.

Q&A
