Webinar: Turbocharging Data Science with HAWQ on HDP


Upload: hortonworks

Post on 28-Jul-2015


Category:

Data & Analytics



TRANSCRIPT

Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Turbocharging Your Data Science with HAWQ on the Hortonworks Data Platform

We Do Hadoop

Page 2 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Your Hosts

Michael Cucchi (@mikecucchi)
•  Sr. Director of Outbound Product for Pivotal's Data, Mobile, and IoT solutions
•  20 years of engineering, management, and marketing experience in the high-tech industry

Matt Morgan (@forwardtension)
•  Vice President, Global Product Marketing
•  20-year history as a marketing and product executive in cloud, SaaS, and big data businesses

Page 3 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Establish Hadoop as the Foundational Technology of the Modern Enterprise Data Architecture

Hortonworks Quick Facts

Year Founded: In 2011, 24 engineers from the original Hadoop team at Yahoo! spun out to form Hortonworks.
Ticker Symbol: NASDAQ: HDP
Headquarters: Santa Clara, CA
Business Model: Open source software support subscriptions, training, and consulting services
Non-GAAP Billings: Grew from zero to over $120 million on an annualized basis in 11 quarters
Subscription Customers: 437 in 11 quarters, with 105 added in Q1 2015 alone
Support: 24×7 global web and telephone support
Partners: 1100 joint engineering, strategic reseller, technology, and system integrator partners
Employees: 650+
Global Operations: 17 countries

#1: 28 out of 86 Apache Hadoop committers. Hortonworks employs the largest group of Hadoop committers under one roof; more than twice any other company.

#1: 165 Apache committer seats for projects in HDP. Our committers work in 20+ projects on the data access, management, security, operations, and governance needs of the enterprise; more than twice any other company.

The Forrester Wave™: Big Data Hadoop Solutions. We are recognized as a leader in Hadoop by Forrester Research based on the strengths of our offerings and strategy.

Page 4 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Traditional Systems Under Pressure

Challenges
•  Constrains data to app
•  Can’t manage new data
•  Costly to scale

[Figure: business value for laggards versus industry leaders as data grows from 2.8 zettabytes (2012) to 40 zettabytes (2020). Traditional sources: ERP, CRM, SCM. New data: clickstream, geolocation, web data, Internet of Things, docs and emails, server logs.]

Page 5 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Early Hadoop: The Start of a Modern Data Architecture

Apache Hadoop is an open source data platform for managing large volumes of high-velocity, high-variety data
•  Built by Yahoo! to be the heartbeat of its ad & search business
•  Donated to Apache Software Foundation in 2005 with rapid adoption by large web properties & early adopter enterprises
•  Incredibly disruptive to current platform economics

Traditional Hadoop Advantages
✓  Manages new data paradigm
✓  Handles data at scale
✓  Cost effective
✓  Open source

Traditional Hadoop Had Limitations
•  Batch-only architecture with limited analytic options
•  Single purpose clusters, specific data sets
•  Difficult to integrate with existing investments
•  Not enterprise-grade

[Figure: early Hadoop stack: Application; Batch Processing (MapReduce); Storage (HDFS)]

Page 6 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Today: Modern Data Architecture Unifies Data & Processing

Modern Data Architecture
•  Enable applications to have access to all your enterprise data through an efficient centralized platform
•  Supported with a centralized approach to governance, security and operations
•  Versatile to handle any applications and datasets no matter the size or type

[Figure: Modern Data Architecture. Sources: existing systems (ERP, CRM, SCM), clickstream, web & social, geolocation, sensor & machine, server logs, unstructured. Data system: HDFS (Hadoop Distributed File System) and YARN: Data Operating System, running batch, interactive, real-time, and partner ISV workloads alongside EDW and MPP. Analytics and applications: data marts, business analytics, visualization & dashboards.]

Page 7 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Partnerships Enrich the Hadoop Ecosystem

Deep Partnerships
Hortonworks engages in deep engineered relationships with the leaders in the data center, such as EMC, Microsoft, Teradata, Red Hat, HP, SAS & SAP

Broad Partnerships
Over 1100 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users

[Figure: the Modern Data Architecture diagram (sources, HDFS, YARN: Data Operating System, EDW, analytics, and applications) surrounded by partner categories: operational tools, dev & data tools, and infrastructure.]

Page 8 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hadoop Adoption Follows a Predictable Journey
Cost optimization, new analytic apps, and ultimately a data lake

Page 9 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hadoop Driver: Cost optimization

•  Archive data off EDW: Move rarely used data to Hadoop as active archive, store more data longer
•  Offload costly ETL process: Free your EDW to perform high-value functions like analytics & operations, not ETL
•  Enrich the value of your EDW: Use Hadoop to refine new data sources, such as web and machine data, for new analytical context

HDP helps you reduce costs and optimize the value associated with your EDW

[Figure: HDP 2.2 holds cold data, deeper archive, and new sources and performs ELT next to the enterprise data warehouse (hot data, MPP, in-memory). Sources: existing systems (ERP, CRM, SCM), clickstream, web & social, geolocation, sensor & machine, server logs, unstructured. Analytics: data marts, business analytics, visualization & dashboards.]

Page 10 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hadoop Driver: Advanced analytic applications

Single View: Improve acquisition & retention
•  HDP enables a single view of each customer, allowing organizations to provide targeted, personalized customer experiences.
•  Single view reduces attrition, improves cross-sell and improves customer satisfaction.

Predictive Analytics: Identify next best action
•  HDP captures, stores and processes large volumes of data streaming from connected devices
•  Stream processing and data science help introduce new analytics for real-time and batch analysis

Data Discovery: Uncover new findings
•  HDP allows exploration of new data types and large data sets that were previously too big to capture, store & process.
•  Unlock insights from data such as clickstream, geo-location, sensor, server log, social, text and video data.

Page 11 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

360° Customer View Boosts Sales at Home Supply Retailer

Problem: Lack of unified customer record across all channels clouded targeting for marketing campaigns
•  No “golden record” for analytics on customer buying behavior across all channels
•  Data repositories on web traffic, POS transactions and in-home services existed in isolation of each other
•  Data storage costs were increasing, without a corresponding increase in value

Solution: HDP data lake drives golden customer record, targeted marketing, and reduction in data storage expenses
•  Golden record enables targeted, personalized marketing with higher success rates
•  Data warehouse offload saved millions of dollars in recurring expense
•  Price optimization versus competitors → several millions in top-line revenue growth

Retail: Major home improvement retailer
Why Hadoop? Single View
New Analytic Applications: Clickstream, Unstructured and Structured Data

Page 12 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Responsive Patient Treatment with Real-time Monitoring of Vitals

Problem: Inability to store and access sufficient data for medical decision support in real time
•  9 million patient records on a legacy system were neither searchable nor retrievable
•  Cohort selection for research projects was slow, despite abundance of data
•  Clinicians had minimal access to historical data gathered across all patients

Solution: Unified data lake improves patient health, speeds research
•  Legacy system retired immediately, saving $500K in annual recurring expense
•  Records stored with patient identification for clinical use; same data presented anonymously to researchers for cohort selection
•  Wireless patches transmit vital signs; algorithms notify doctors of high-risk patterns
•  Heart patients weigh themselves from home; algorithms notify doctors about unsafe weight changes and recommend a visit to the clinic

Healthcare: Public university teaching hospital
Why Hadoop? Predictive Analytics
New Analytic Applications: Sensor, Social Data & ETL Offload

Page 13 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hadoop Driver: Enabling the Data Lake

Data Lake Definition
•  Centralized Architecture: Multiple applications on a shared data set with consistent levels of service
•  Any App, Any Data: Multiple applications accessing all data, affording new insights and opportunities
•  Unlocks ‘Systems of Insight’: Advanced algorithms and applications used to derive new value and optimize existing value

Journey to the Data Lake with Hadoop

Drivers:
1.  Cost Optimization
2.  Advanced Analytic Apps

Goal:
•  Centralized Architecture
•  Data-driven Business

[Figure: journey plotted on scale and scope axes, ending at the data lake and systems of insight]

Page 14 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Case Study: 12-Month Hadoop Evolution at TrueCar

12-month execution plan (data platform capabilities over time)
•  June 2013: Begin Hadoop execution
•  July 2013: Hortonworks partnership
•  Aug 2013: Training & dev begins
•  Nov 2013: Production cluster, 60 nodes, 2 PB
•  Dec 2013: Three production apps (3 total)
•  Jan 2014: 40% of dev staff proficient
•  Feb 2014: Three more production apps (6 total)
•  May ‘14: IPO

12-Month Results at TrueCar
•  Six production Hadoop applications
•  Sixty nodes / 2 PB data
•  Storage costs / compute costs from $19/GB to $0.12/GB

“We addressed our data platform capabilities strategically as a precursor to IPO.”
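The per-gigabyte figures on this slide imply a large reduction, and the arithmetic is easy to check. The sketch below uses only the two prices quoted on the slide; the variable names are illustrative, and it says nothing about how those prices were measured.

```python
# Per-GB storage/compute cost figures quoted on the TrueCar slide.
cost_before_per_gb = 19.00   # dollars per GB before the Hadoop rollout
cost_after_per_gb = 0.12     # dollars per GB after 12 months

# How many times cheaper each gigabyte became.
reduction_factor = cost_before_per_gb / cost_after_per_gb

# The same change expressed as a percentage saving per gigabyte.
percent_saved = (cost_before_per_gb - cost_after_per_gb) / cost_before_per_gb * 100

print(f"{reduction_factor:.0f}x cheaper per GB")   # prints "158x cheaper per GB"
print(f"{percent_saved:.1f}% saved per GB")        # prints "99.4% saved per GB"
```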

Page 15 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hortonworks Data Platform Hadoop for the Enterprise

Page 16 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

HDP Makes Hadoop Enterprise-Ready

Hortonworks Data Platform
Multi-tenant data platform built on a centralized architecture of shared enterprise services

Key benefits
•  Consolidates all data sets
•  Delivers real-time insights
•  Integrates with data center
•  Scalable and affordable

[Figure: HDP stack. Existing applications, new analytics, and partner applications; data access: batch, interactive, real-time; YARN: data operating system; governance, security, operations, resource management; storage.]

Page 17 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

HDP Makes Hadoop Pervasive

•  Any application: Batch, interactive, and real-time
•  Any data: Existing and new datasets
•  Anywhere: Complete range of deployment options (commodity, appliance, cloud)

[Figure: existing applications, new analytics, and partner applications over data access (batch, interactive, real-time) on YARN: data operating system]

Page 18 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

An “Any Application” Example: Spark in HDP

Delivering a production-ready experience for Spark applications

•  Centralized Resource Management: Integrated with YARN
•  Consistent Operations: Provisioned and managed by Ambari
•  Comprehensive Security: Runs within secure clusters
•  Deployable Anywhere: Windows, Linux, on-premises or cloud; consistent Cloudbreak launch experience

[Figure: HDP stack. YARN: data operating system; governance, security, operations, resource management; storage.]

Page 19 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

An “Anywhere” Example: Cloudbreak and HDP

Cloudbreak
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!

Example Ambari Blueprints:
•  IoT Apps (Storm, HBase, Hive)
•  BI / Analytics (Hive)
•  Data Science (Spark)
•  Dev / Test (all HDP services)
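To make "pick a blueprint" concrete, here is a minimal sketch of what a Data Science blueprint could look like, built as a Python dict in the general shape of an Ambari blueprint (a `Blueprints` header plus `host_groups` with components). The helper `make_blueprint`, the blueprint name, the host group layout, and the component selection are all illustrative assumptions; this is not a complete, deployable cluster definition, and whether a given stack version defines each component would need to be checked.

```python
import json

def make_blueprint(name, stack_version, host_groups):
    """Build an Ambari-blueprint-shaped dict from {group_name: [component, ...]}."""
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": "1",  # assumed: one host per group in this sketch
                "components": [{"name": component} for component in components],
            }
            for group, components in host_groups.items()
        ],
    }

# Hypothetical "Data Science" layout: core Hadoop services plus a Spark client.
data_science = make_blueprint(
    "data-science",
    "2.2",
    {
        "master": ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"],
        "worker": ["DATANODE", "NODEMANAGER", "SPARK_CLIENT"],
    },
)

print(json.dumps(data_science, indent=2))
```

Keeping the blueprint as data is what lets the same definition be launched on different clouds: the cloud choice changes where the hosts come from, not which services each host group runs.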

Page 20 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

A Leader in Hadoop

The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014

“Hortonworks loves and lives open source innovation”

World Class Support and Services: Hortonworks' Customer Support received a maximum score and was significantly higher than both Cloudera and MapR

Page 21 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Pivotal in the Modern Data Architecture

[Figure: modern data architecture with Pivotal components. Sources: OLTP, ERP, CRM systems; documents & emails; web logs, click streams; social networks; machine generated; sensor data; geo-location data. Data systems: repositories (EDW, MPP, RDBMS) and a Hadoop platform with data access, data management, governance & integration, security, and operations on YARN, alongside Greenplum, GemFire, and HAWQ. Applications: statistical analysis; BI / reporting, ad hoc analysis; interactive web & mobile applications; enterprise applications. Surrounding layers: operations tools (provision, manage & monitor), dev & data tools (build & test), infrastructure (on premise, cloud, appliance).]

22 © Copyright 2014 Pivotal. All rights reserved.

Turbo Charging Data Science with HAWQ

23 © 2015 Pivotal Software, Inc. All rights reserved.

Pivotal By the Numbers

•  Founded April 2013
•  1700+ employees
•  Funded by EMC, VMware, and GE
•  Hundreds of customers
•  Pivotal Data: >$100M in data software bookings in 2014
•  Pivotal Cloud Foundry: Fastest revenue growth in an open source project in history; >$40M in first year for Pivotal Cloud Foundry in 2014 (subscription)

Big Data, Cloud Platform, Agile

24 © 2015 Pivotal Software, Inc. All rights reserved.

Software is Eating the World

Data Is Fueling Software

25 © 2015 Pivotal Software, Inc. All rights reserved.

The Data Divide

BIG DATA CHASM
•  70% of data generated by customers
•  80% of data stored
•  3% prepared for analysis
•  0.5% being analyzed
•  <0.5% being operationalized

26 © Copyright 2014 Pivotal. All rights reserved.

Pivotal Business Data Lake Architecture

[Figure: Sources feed an Ingestion Tier (real-time, micro batch, and batch ingestion). The Distillation Tier and Processing Tier cover HDFS storage (unstructured and structured data), in-memory, and MPP database, with real-time, micro batch, and mega batch processing. The Insights Tier offers query interfaces (SQL, NoSQL, MapReduce) and real-time, interactive, and batch insights, leading to the Action Tier. System monitoring, system management, and workflow management appear alongside the tiers.]

27 © 2015 Pivotal Software, Inc. All rights reserved.

The Data Driven Enterprise Journey

STORE
•  Structured
•  Unstructured
•  High Volume
•  High Velocity

ANALYZE
•  Predictive Analytics
•  Machine Learning
•  Advanced Data Science
•  Realtime Analytics

DEVELOP
•  Advanced Analytic Pipelines
•  Realtime Analytical Applications
•  Global Scale Data-Driven Applications
•  Enterprise, Consumer, IoT, and Mobile

INNOVATE
•  Agile Dev Expertise
•  DevOps
•  Hybrid Cloud
•  Continuous Delivery
•  Closed Loop Applications

Agile Development, Big Data Predictive Analytics, Enterprise PaaS

28 © 2015 Pivotal Software, Inc. All rights reserved.

Technical Observations
•  SQL is today and will remain the most valuable workload on Hadoop
•  While Hadoop continues to mature, focused MPP SQL will remain important
•  Scale out in-memory processing will have significant enterprise adoption and impact into the future
•  Streaming and Machine Learning will continue to gain value
•  Open Source is becoming critical to enterprise investment decisions

29 © Copyright 2015 Pivotal. All rights reserved.


Pivotal BDS + Hortonworks HDP = The Complete Solution

Pivotal Data Engineering Pivotal Labs Pivotal Data Science

HDP

30 © 2015 Pivotal Software, Inc. All rights reserved.

SQL on Hadoop Ecosystem

Challenges  |  Requirements
Complex joins not supported  |  Complex joins at performance
Advanced analytics support  |  Advanced analytics at scale within SQL
Interactive query latency issues  |  Fast interactive queries on large data
Ad-hoc query performance issues  |  Strong ad-hoc query support in optimizer
SQL analytic query coverage issues  |  Full analytic SQL compliance
Concurrent query throughput issues  |  High query throughput for mixed workloads

31 © 2015 Pivotal Software, Inc. All rights reserved.

HAWQ: Enterprise Class SQL on Hadoop
•  Leverages market leading Greenplum technology
•  100% ANSI SQL compliant for analytic workloads
•  Advanced cost-based query optimizer
•  Highest performing SQL on Hadoop
•  Polymorphic storage with advanced compression
•  Industry differentiating data federation with PXF*
•  Built-in advanced analytics for data science (MADlib)
•  Supports all major HDFS file formats (Avro, Parquet, HDFS)
•  Integrated with leading analytical tools out-of-the-box

*PXF = Pivotal eXtension Framework
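As a rough illustration of "built-in advanced analytics" inside SQL, the sketch below builds a statement that calls MADlib's linear regression training function. It assumes a MADlib release that provides `madlib.linregr_train(source_table, out_table, dependent_varname, independent_varname)`; the helper `linregr_train_sql`, the table names, and the column names are hypothetical, and the helper does no identifier escaping, so it should only be fed trusted names. The returned string would be run through any PostgreSQL-compatible client connected to HAWQ.

```python
def linregr_train_sql(source_table, out_table, dependent, independents):
    """Return a SQL statement that trains a MADlib linear regression model."""
    # The leading 1 adds an intercept term to the feature array.
    features = "ARRAY[1, " + ", ".join(independents) + "]"
    return (
        "SELECT madlib.linregr_train("
        f"'{source_table}', '{out_table}', '{dependent}', '{features}');"
    )

# Hypothetical table and columns, for illustration only.
sql = linregr_train_sql("houses", "houses_model", "price", ["tax", "bath", "size"])
print(sql)
```

The point of the example is that model training is expressed as an ordinary SQL call, so the data stays in the cluster instead of being exported to a separate analytics tool.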

32 © 2015 Pivotal Software, Inc. All rights reserved.

Business Benefits

Feature  |  Benefit
Rich and compliant SQL dialect  |  Powerful and portable SQL apps; leverage large SQL-based ecosystems
TPC-DS compliance  |  Enable a wide range of use cases; avoid surprises in production
Flexible/efficient joins at linear scale  |  Off-load EDW workloads at a much lower cost
Deep analytics + machine learning  |  Predictive/advanced learning use cases at scale
Data federation capabilities  |  Build use cases with diverse/external data assets without data movement
High availability and fault tolerance  |  Off-load business critical workloads from EDW
Native Hadoop file format support  |  Reduce ETL and data movement = lower costs

33 © 2015 Pivotal Software, Inc. All rights reserved.

Pivotal Query Optimizer (PQO)
For HAWQ and Greenplum Database

Turns a SQL query into an execution plan

•  Leading cost-based optimizer for big data
•  Applies all possible optimizations at the same time
   –  Considers many more plan alternatives
   –  Optimizes a wider range of queries
   –  Optimizes memory usage
•  New extensible code base
   –  Rapid adoption of emerging technologies

PIVOTAL VALUE-ADDED FUNCTIONALITY
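The idea of a cost-based optimizer "considering plan alternatives" can be shown with a toy example. The sketch below enumerates every left-deep join order for three tables and keeps the one with the smallest total intermediate result size. It is not PQO's algorithm: the table names, row counts, single join selectivity, and cost formula are all made up for illustration.

```python
from itertools import permutations

# Hypothetical row counts and one made-up selectivity shared by every join.
ROWS = {"sales": 1_000_000, "customers": 50_000, "stores": 200}
SELECTIVITY = 0.00001

def plan_cost(order):
    """Cost of a left-deep join order: total rows across intermediate results."""
    rows = ROWS[order[0]]
    cost = 0
    for table in order[1:]:
        rows = rows * ROWS[table] * SELECTIVITY  # estimated rows after this join
        cost += rows
    return cost

def cheapest_plan(tables):
    """Enumerate all join orders and return the one with the lowest cost."""
    return min(permutations(tables), key=plan_cost)

best = cheapest_plan(["sales", "customers", "stores"])
print(best, plan_cost(best))
```

With these numbers the cheapest orders join the two small tables first and bring in the large `sales` table last, which keeps the intermediate result small. A real optimizer works from collected statistics and a much richer cost model, but the search-and-compare structure is the same.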

34 © 2015 Pivotal Software, Inc. All rights reserved.

Configuring and Managing HAWQ with Ambari
•  Install HAWQ/PXF Ambari plugin RPM
•  Restart Ambari
•  Add HAWQ/PXF service like any other Hadoop component

35 © 2015 Pivotal Software, Inc. All rights reserved.

Pivotal eXtension Framework (PXF)
•  Enables connectivity between HAWQ and other services (Hive, HBase)
•  Provides an extensible framework to add support for custom services
•  Operates as a separate service in Hadoop

Industry differentiators
•  Low latency on large data sets
•  Extensible and customizable
•  Considers cost model of federated sources

[Figure: HAWQ reaches HDFS (Hadoop Distributed File System), Hive, HBase, and other services through PXF]
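In practice, PXF connectivity is usually expressed as an external table whose location is a `pxf://` URI. The sketch below builds such a statement for delimited text files in HDFS. The helper `pxf_external_table_sql`, the table and column names, the host, the port 51200, and the `HdfsTextSimple` profile name are assumptions for illustration; the right values depend on the HAWQ and PXF release and on how the cluster is configured.

```python
def pxf_external_table_sql(table, columns, host, port, hdfs_path, delimiter=","):
    """Return DDL for a HAWQ external table that reads HDFS text files via PXF."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    # Assumed URI form: pxf://<host>:<port>/<path>?PROFILE=<profile>
    location = f"pxf://{host}:{port}/{hdfs_path.lstrip('/')}?PROFILE=HdfsTextSimple"
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols})\n"
        f"LOCATION ('{location}')\n"
        f"FORMAT 'TEXT' (DELIMITER '{delimiter}');"
    )

# Hypothetical clickstream files under /data/clicks, for illustration only.
ddl = pxf_external_table_sql(
    "ext_clicks",
    [("user_id", "int"), ("url", "text")],
    host="namenode",
    port=51200,
    hdfs_path="/data/clicks",
)
print(ddl)
```

Once such a table exists, it can be queried and joined like any other table, which is how data in Hive, HBase, or raw HDFS files is used without first copying it into HAWQ.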

36 © 2015 Pivotal Software, Inc. All rights reserved.

Data Driven Journey with Pivotal Big Data Suite

STORE
•  Structured
•  Unstructured
•  High Volume
•  High Velocity

ANALYZE
•  Predictive Analytics
•  Machine Learning
•  Advanced Data Science
•  Realtime Analytics

DEVELOP
•  Advanced Analytic Pipelines
•  Realtime Analytical Applications
•  Global Scale Data-Driven Applications
•  Enterprise, Consumer, IoT, and Mobile

INNOVATE
•  Agile Dev Expertise
•  DevOps
•  Hybrid Cloud
•  Continuous Delivery
•  Closed Loop Applications

Agile Development, Big Data Predictive Analytics, Enterprise PaaS

[Figure: products placed along the journey: Spring XD, Spark, Pivotal HD & Open Data Platform, Pivotal Greenplum Database, Pivotal HAWQ, Pivotal GemFire, Redis, RabbitMQ, Spring IO, Groovy, Pivotal BDS on PCF, Pivotal Cloud Foundry; services: Pivotal Labs, Data Science, Data Engineering]

37 © 2015 Pivotal Software, Inc. All rights reserved.

Putting it All Together

[Figure: data feeds, transactional apps, and analytic apps connected through real-time data, a data stream pipeline, an HDFS data lake, distributed computing, advanced analytics, and expert systems & machine learning]

38 © 2015 Pivotal Software, Inc. All rights reserved.

Putting it All Together

[Figure: the same flow with products: Spring XD for the stream pipeline (ingest, filter, enrich, sink), GemFire, HAWQ, and GPDB between data feeds, transactional apps, and analytic apps]
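The ingest, filter, enrich, sink stages named on this slide can be sketched as a chain of small steps. The example below uses plain Python generators and is not Spring XD's API; the record format (`device,reading`), the alert threshold, and the function names are all hypothetical.

```python
def ingest(lines):
    """Ingest: read raw lines from a feed and strip whitespace."""
    for line in lines:
        yield line.strip()

def keep_valid(records):
    """Filter: keep only well-formed 'device,reading' records."""
    for record in records:
        parts = record.split(",")
        if len(parts) == 2 and parts[1].isdigit():
            yield {"device": parts[0], "reading": int(parts[1])}

def enrich(events, threshold=100):
    """Enrich: flag readings above an assumed alert threshold."""
    for event in events:
        yield {**event, "alert": event["reading"] > threshold}

def sink(events):
    """Sink: collect results; a real pipeline would write to a store instead."""
    return list(events)

def run_pipeline(lines):
    return sink(enrich(keep_valid(ingest(lines))))

result = run_pipeline(["a,50\n", "bad", "b,150"])
print(result)
```

Each stage only knows about the records passing through it, so a stage can be swapped (for example, a different sink) without touching the others.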

39 © Copyright 2015 Pivotal. All rights reserved.

Demo: HAWQ on HDP
bit.ly/HAWQonHDPVideo

Tutorial: HAWQ on Sandbox
bit.ly/HAWQonHDPTutorial

Page 40

© 2015 Open Data Platform initiative. All rights reserved.

THE OPEN DATA PLATFORM INITIATIVE

Page 41

© 2015 Open Data Platform initiative. All rights reserved.

Introducing The Open Data Platform Initiative

Page 42

© 2015 Open Data Platform initiative. All rights reserved.

A shared industry effort to help promote and advance the state of Apache Hadoop® and Big Data technologies for the Enterprise

43 © Copyright 2014 Pivotal. All rights reserved.

The Open Data Platform will accelerate the delivery of Big Data solutions by providing a well-defined platform called ‘The ODP Core’

Page 44

© 2015 Open Data Platform initiative. All rights reserved.

The ODP Core

▪  The ODP Core is the kernel over which the industry can build enterprise-class Apache Hadoop® solutions
   –  Simplifying development of interoperable technologies
▪  Created by the ODP Developer Community
   –  A team of cross industry technical experts
   –  Individual, or member company developers – anyone can participate
▪  Using an open and transparent planning and release process that follows the Apache Way
   –  Interoperability within and beyond the ODP Core drives a broad set of use cases and rapid market growth

Page 45

© 2015 Open Data Platform initiative. All rights reserved.

Delivering Enterprise Requirements & Real-world Experience

ODP Member Companies

•  Diverse representation of the Big Data eco-system
   –  End users, ISVs, Systems Integrators, Distribution vendors, etc.
   –  Any company can join the Open Data Platform
•  A forum for the Enterprise to define its Big Data requirements
   –  Industry groups (SIGs) to align on common industry practices and challenges
•  Direct feedback and participation in the ODP Core
   –  Real world experience determining what is Enterprise grade

Page 46

© 2015 Open Data Platform initiative. All rights reserved.

A Simple Beginning For The ODP Core

▪  The ODP Core is starting with a small number of projects
   –  Enables a rapid start for the Initiative and an industry driven definition
▪  All members decide how the ODP Core evolves
   –  All members are responsible for choosing projects to include in the ODP Core
   –  Platinum, Gold and Silver member companies = One Member / One Vote

ODP Core Initial Projects: HDFS, YARN, MapReduce, Ambari
✓  Deployable Hadoop configuration
✓  Improves interoperability
✓  Gives customers more freedom
✓  Follows the Apache Way

47 © Copyright 2014 Pivotal. All rights reserved.

Quickly Showing Value To The Industry

Hortonworks, IBM, Pivotal and Infosys Harmonize on Open Data Platform Vision to Accelerate Big Data Solutions

Key benefits
•  Improves ecosystem interoperability
•  Unlocks customer choice
•  Eliminates wasteful guesswork
•  Respects the Apache way

[Figure: a common core (Apache Hadoop 2.6, Apache Ambari) beneath HDP 2.2, Open Platform 4.0 with Apache Hadoop, Pivotal HD 3.0, and IIP]

Page 48

© 2015 Open Data Platform initiative. All rights reserved.

How You Can Participate

•  Anybody can join the ODP
   –  Company memberships start at $1k
•  Have a direct voice into the future of big data
•  Help us define priorities to solve your challenges
•  Join your peers and accelerate industry solutions
•  Contribute people, tests, and code to accelerate executing on the vision

ODP - enabling Big Data solutions to flourish atop a common core platform

Page 49 © Hortonworks Inc. 2011 – 2015. All Rights Reserved

Questions?