Level Up – How to Achieve Hadoop Acceleration


Upload: inside-analysis

Post on 07-Jul-2015


Category:

Technology



DESCRIPTION

The Briefing Room with Robin Bloor and HP Vertica. Live Webcast on August 26, 2014. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=3dd6d1b068fe395f665c75adb682ac41

Hadoop has long passed the point of being a nascent technology, but many users have found that, when left to its own devices, Hadoop can be a one-trick pony. To get the most out of Hadoop, organizations need a flexible platform that empowers analysts and data managers with a complete set of information lifecycle management and analytics tools without a performance tradeoff. Register for this episode of The Briefing Room to hear veteran analyst Dr. Robin Bloor as he outlines Hadoop’s role in a big data architecture. He’ll be briefed by Walt Maguire of HP Vertica, who will showcase his company’s big data solutions, including HAVEn and the HP Big Data Platform. He will demonstrate how HP Vertica acts as a complement to Hadoop, and how the combination of the two provides a versatile and highly performant solution. Visit InsideAnalysis.com for more information.

TRANSCRIPT

Page 1: Level Up – How to Achieve Hadoop Acceleration

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Page 2: Level Up – How to Achieve Hadoop Acceleration

The Briefing Room

Level Up – How to Achieve Hadoop Acceleration

Page 3: Level Up – How to Achieve Hadoop Acceleration

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh

[email protected] @eric_kavanagh

Page 4: Level Up – How to Achieve Hadoop Acceleration


Mission

•  Reveal the essential characteristics of enterprise software, good and bad

•  Provide a forum for detailed analysis of today’s innovative technologies

•  Give vendors a chance to explain their product to savvy analysts

•  Allow audience members to pose serious questions... and get answers!

Page 5: Level Up – How to Achieve Hadoop Acceleration


Topics

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: BIG DATA ECOSYSTEM

September: INTEGRATION & DATA FLOW

October: ANALYTIC PLATFORMS

Page 6: Level Up – How to Achieve Hadoop Acceleration


Executive Summary

•  Yes, you still need to PLAN

•  File formats and partitioning MATTER

•  Pay attention to TCO

•  Be willing to fail fast, OFTEN

LOOK BEFORE YOU LEAP INTO HADOOP!

Page 7: Level Up – How to Achieve Hadoop Acceleration


Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

[email protected] @robinbloor

Page 8: Level Up – How to Achieve Hadoop Acceleration


HP Vertica

•  HP Vertica offers a range of enterprise software and database management solutions

•  The column-oriented Vertica Analytics Platform leverages a standard SQL interface that now integrates with Hadoop

•  HP’s big data initiative includes HAVEn (Hadoop, Autonomy IDOL, Vertica, Enterprise Security and n Apps), a platform designed to analyze and manage petabytes of data

Page 9: Level Up – How to Achieve Hadoop Acceleration


Guest: Walter Maguire

Walter Maguire has twenty-seven years of experience in analytics and data technologies. He practiced data science before it had a name, worked with big data when "big" meant a megabyte, and supported the movement which brought data management and analytic technologies from back-office to the front. In October of 2010, Walt became the first hire west of Denver for Vertica, makers of the Vertica Analytics software platform for real-time analytics of structured and unstructured data. Since then, he has helped build the HP Vertica customer base and team in the western USA. Now as Chief Field Technologist with HP Vertica, Walt addresses customer needs with the continuing evolution of Vertica and HAVEn, the HP Big Data strategy that links hardware, software, services, and business transformation consulting for successful execution.

Page 10: Level Up – How to Achieve Hadoop Acceleration

© Copyright 2014 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Big Data & SQL: Hadoop Convergence

With the HP Vertica Analytics Platform

Walt Maguire, Chief Field Technologist, HP Vertica

Chris Selland, VP Business Development, HP Vertica

Page 11: Level Up – How to Achieve Hadoop Acceleration


Essential Requirements of an Analytics Platform

Page 12: Level Up – How to Achieve Hadoop Acceleration


Introducing HP Vertica Dragline

Faster answers from Big Data at a fraction of the cost of traditional data warehouses

•  Store all your data in any format cost-effectively across Vertica + Hadoop

•  Explore all your data directly in Hadoop without moving or changing it

•  Serve all of your data consumers without compromise, from individualized queries to large complex reports

Page 13: Level Up – How to Achieve Hadoop Acceleration


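Cost-Optimized Storage - ILM

[Diagram: information lifecycle management across storage tiers. Recoverable labels:]

•  Store: Any format

•  Explore: Any format

•  Serve: Convert data to Vertica storage format

•  Data temperature: Hot, Cool, Cold

•  Data classes: Interactive Data (frequently queried Vertica data cache), Batch Data, Archive Data, Dark Data

•  Other labels: Tier-off older data, Value Discovery, Location, Format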

Page 14: Level Up – How to Achieve Hadoop Acceleration


The Richest, Most Open SQL on Hadoop

Challenge: Extracting data from Hadoop requires complex and brittle ETL processes

Solution: Hadoop Navigation and Analytics

Benefits:

•  Navigate Hadoop data using its native catalog

•  Quickly and easily load native data types from Hadoop to Vertica

•  Avoid creating and maintaining time-consuming schemas

•  Use the full power of HP Vertica SQL and analytics

•  Choose your own Hadoop distribution


Page 16: Level Up – How to Achieve Hadoop Acceleration

© Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.


HP Vertica Flex Zone

Load, manage, and explore semi-structured data

•  Avoid creating and maintaining time-consuming schemas

•  Faster SQL querying on semi-structured data

•  Auto-schematization: semi-structured data loading

•  Flexible parsers for JSON and delimited data

•  One-step schema for blazing-fast performance

Page 17: Level Up – How to Achieve Hadoop Acceleration


Exploring Data with Flex Zone

1.   Create Flex Table (with or without any columns)

2.   Load Flex Tables (using parsers; data format independent)

3.   Explore Data (using map functions)

4.   Materialize Flex Table (compute keys, build views, materialize flex table columns)

5.   Manage Flex Table (alter, config, etc.)
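The create, load, explore, and materialize steps above can be illustrated outside Vertica with a small Python sketch. This is not Vertica's API or SQL: the class `FlexTable` and its methods `load_json_lines`, `compute_keys`, and `materialize` are hypothetical names chosen to mirror the slide labels, and the sample records are invented. It only shows the general schema-on-read idea of loading records first and promoting keys to columns later.

```python
import json
from collections import Counter


class FlexTable:
    """Toy schema-on-read table: rows are kept as raw key/value maps."""

    def __init__(self):
        self.rows = []      # each row is a dict parsed from the input
        self.columns = []   # keys promoted to fixed columns, if any

    def load_json_lines(self, text):
        """Load newline-delimited JSON without declaring a schema first."""
        for line in text.splitlines():
            line = line.strip()
            if line:  # skip blank lines
                self.rows.append(json.loads(line))
        return len(self.rows)

    def compute_keys(self):
        """Count how often each key appears across the loaded rows."""
        counts = Counter()
        for row in self.rows:
            counts.update(row.keys())
        return dict(counts)

    def materialize(self, keys):
        """Promote chosen keys to fixed columns; absent values become None."""
        self.columns = list(keys)
        return [tuple(row.get(k) for k in self.columns) for row in self.rows]


if __name__ == "__main__":
    table = FlexTable()
    table.load_json_lines(
        '{"user": "a", "ms": 120}\n{"user": "b", "ms": 95, "region": "west"}'
    )
    print(table.compute_keys())
    print(table.materialize(["user", "ms"]))
```

Rows with different keys load side by side, and the key counts give a way to decide which keys deserve real columns before anything is materialized.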

Page 18: Level Up – How to Achieve Hadoop Acceleration


Monitor Customer Experience by Joining Machine Logs to Tweets

•  Create Flex Table with Machine Log Data

•  Explore Data with Map Functions

•  Create Flex Table with Twitter Data

•  Sentiment Score Tweets with Pulse

•  Join Machine Log Data to Tweets with Time Series Event Join

•  Associate customer sentiment with application response times
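The slide does not spell out the join semantics. As a rough sketch, assume the time series event join pairs each tweet with the most recent machine-log reading at or before the tweet's timestamp (often called an "as-of" join). The function name `asof_join`, the numeric timestamps, the sentiment scores, and the response times below are all hypothetical, and the code is plain Python rather than Vertica SQL.

```python
from bisect import bisect_right


def asof_join(tweets, logs):
    """Pair each tweet with the latest log entry at or before its timestamp.

    tweets: list of (timestamp, sentiment) tuples
    logs:   list of (timestamp, response_ms) tuples
    Returns a list of (timestamp, sentiment, response_ms); response_ms is
    None when no log entry precedes the tweet.
    """
    logs = sorted(logs)                  # order log entries by time
    log_times = [t for t, _ in logs]
    joined = []
    for ts, sentiment in sorted(tweets):
        i = bisect_right(log_times, ts)  # number of log entries with time <= ts
        response_ms = logs[i - 1][1] if i else None
        joined.append((ts, sentiment, response_ms))
    return joined


if __name__ == "__main__":
    logs = [(10, 200), (20, 900), (30, 250)]        # (time, response time in ms)
    tweets = [(5, 0.1), (25, -0.8), (30, 0.6)]      # (time, sentiment score)
    for row in asof_join(tweets, logs):
        print(row)
```

With the rows aligned this way, sentiment can be compared against the response time that was in effect when each tweet was posted.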

Page 19: Level Up – How to Achieve Hadoop Acceleration


Ecosystem

HP Vertica

Page 20: Level Up – How to Achieve Hadoop Acceleration


Hadoop Partnership Momentum and Milestones

•  Integrated SQL-on-Hadoop solution with MapR live as of May 2014

•  Reseller relationship with Hortonworks announced July 2014

•  Significant joint customer base with Cloudera

Page 21: Level Up – How to Achieve Hadoop Acceleration


HP Vertica Marketplace

Page 22: Level Up – How to Achieve Hadoop Acceleration


Thank you!

Walt Maguire, Chief Field Technologist, HP Vertica, [email protected]

Chris Selland, VP Business Development, HP Vertica, [email protected]

Page 23: Level Up – How to Achieve Hadoop Acceleration


Perceptions & Questions

Analyst: Robin Bloor

Page 24: Level Up – How to Achieve Hadoop Acceleration

Robin Bloor, PhD

THE NEXT PHASE OF HADOOP’S EVOLUTION?

Page 25: Level Up – How to Achieve Hadoop Acceleration

Hadoop Evolution

FROM:

•  Serial batch workloads

•  MapReduce

•  Versatile data storage

•  Key-value access only

•  Island of processing

TO:

•  Multiple concurrent workloads

•  Multiple algorithms

•  “Optimized” data storage

•  SQL, JSON & even SPARQL access

•  Integrated processing

Page 26: Level Up – How to Achieve Hadoop Acceleration

The Data Warehouse: From/To

Page 27: Level Up – How to Achieve Hadoop Acceleration

Data & Data Lifecycle Management

Page 28: Level Up – How to Achieve Hadoop Acceleration

The Major Workload Will Be Analytics

The consequences are that:

1.   DATA ACCESS will need to be more versatile

2.   WORKLOAD MANAGEMENT will need to be more versatile

3.   Hadoop will need to “shake hands” with ONE OR MORE database engines

Page 29: Level Up – How to Achieve Hadoop Acceleration

•  In general, what is the DBA overhead for the combination of Hadoop, Flex Zone and Vertica?

•  How does data lifecycle management work in practice?

•  Analytics can be done on Hadoop, in Flex Zone and in Vertica. How are these choices normally handled in a Flex Zone/Vertica environment?

•  What analytics components can be used with Flex Zone and Vertica?

Page 30: Level Up – How to Achieve Hadoop Acceleration

•  What do you see as the sweet spot for this architecture (by sector & company size)? Where might it be overkill?

•  With respect to scale, what is your largest implementation of Hadoop/Flex Zone/Vertica by data volume?

•  Who do you see as your closest direct competition?

Page 31: Level Up – How to Achieve Hadoop Acceleration

Twitter Tag: #briefr

The Briefing Room

Page 32: Level Up – How to Achieve Hadoop Acceleration


Upcoming Topics

www.insideanalysis.com

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: BIG DATA ECOSYSTEM

September: INTEGRATION & DATA FLOW

October: ANALYTIC PLATFORMS

Page 33: Level Up – How to Achieve Hadoop Acceleration


THANK YOU for your ATTENTION!

Opening slide image courtesy of Wikimedia Commons