Extending Open Source Big Data to the Enterprise with SAP HANA Vora
Daniel Rutschmann, Sr. Director, Database & Data Management GTM
December 8, 2016


© 2016 SAP SE or an SAP affiliate company. All rights reserved. 2

What has SAP got to do with Big Data?

• SAP is the world’s largest provider of Enterprise Application Software

• 82,400 employees in 130+ countries

• 335,000 customers in 190 countries

• 87% of the Forbes Global 2000 companies

• 74% of the world’s transaction revenue touches an SAP system

• Dealing with complex data processing problems is our daily business.

• Founding sponsor of the UC Berkeley AMPLab


Challenges With Getting Actionable Insights

• Different data formats need different computation tools
• Business analysts struggle with highly technical tools
• Lack of a unified environment for production deployment


SAP HANA Vora 1.3

SAP HANA Vora is an enterprise-ready, easy-to-use in-memory distributed computing solution to help organizations uncover actionable insights from big data.

Builds upon Apache Spark

Seamless Integration with SAP HANA

Runs on Hadoop

* in beta

Distributed Computing for the Digital Enterprise

[Diagram: SAP HANA Vora stack. Hadoop and Spark at the base; SAP HANA Vora on top with a distributed transaction log, a disk-to-memory accelerator, the OLAP, time series, graph and doc store engines, and the Data Modeler; open consumption by data science, predictive, business intelligence and visualization apps]

Insights from one single solution

Enterprise-ready

Easier to use


Insights from One Single Solution

In-memory distributed computing engines

Sophisticated analytics for relational, time series, graph and JSON data

High performance even when dataset sizes exceed memory capacity


Time Series Data Analysis across big data

[Chart: temperature in °C for Halifax and Waterloo]

Efficiently analyze time series data in distributed environments

• Interactive access to standard time series analysis functions using the well-known SQL language
• Efficient compression allowing analysis of more data using less memory
• Build time series models visually using Vora Data Modeler

Trend | Cyclical | Seasonal | Random | Exception


Vora Time Series Functions

Sequence of data points recorded over time
• Can be equidistant or non-equidistant
• Detect and correct errors / anomalies
• Granularization
• Standard aggregation
• Analysis

Specify a series clause during table creation
• Define the period column (timestamp)
• Provide a compression definition (optional)
• Define start/end of series (optional)
• Define the series increment (optional)

Column Functions
• Trend
• Stddev
• Median
• Linear_Approx
• Const_Approx
• Cubic_Spline_Approx
• Polynomial_Approx

Table Functions
• Auto_Corr
• Cross_Corr
• Histogram
• DFT
• Granulize
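
The ideas behind some of the entries above can be illustrated outside of Vora. The following is a minimal plain-Python sketch, not Vora SQL: the function names `equidistant_timestamps`, `granulize` and `auto_corr` and the sample temperatures are assumptions made for illustration. They only mirror the concepts of a series start and increment, granularization, and autocorrelation, and say nothing about how Vora implements them.

```python
from statistics import mean


def equidistant_timestamps(start, increment, count):
    """Rebuild the timestamps of an equidistant series from start and increment."""
    return [start + i * increment for i in range(count)]


def granulize(timestamps, values, bucket, aggregate=mean):
    """Coarsen a series: group points into buckets of width `bucket`
    and aggregate the values that fall into each bucket."""
    buckets = {}
    for t, v in zip(timestamps, values):
        key = (t // bucket) * bucket  # start of the bucket containing t
        buckets.setdefault(key, []).append(v)
    return [(key, aggregate(vs)) for key, vs in sorted(buckets.items())]


def auto_corr(values, lag):
    """Sample autocorrelation of a series with itself shifted by `lag` points."""
    mu = mean(values)
    denom = sum((v - mu) ** 2 for v in values)
    num = sum(
        (values[i] - mu) * (values[i + lag] - mu)
        for i in range(len(values) - lag)
    )
    return num / denom


# Hypothetical hourly readings; only the values are stored,
# the timestamps (in hours) follow from start and increment.
temps = [-4.0, -5.0, -6.0, -5.0, -3.0, -1.0, 0.0, -2.0]
hours = equidistant_timestamps(start=0, increment=1, count=len(temps))

four_hourly = granulize(hours, temps, bucket=4)  # one mean per 4-hour bucket
lag1 = auto_corr(temps, lag=1)                   # similarity to the previous hour
```

In this sketch an equidistant series needs only its start, its increment and its values, which hints at why declaring a start and a series increment may help compression.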


Enterprise Ready

Production-ready, integrated solution

Metadata persistence

Out-of-the-box business functions including hierarchy processing and currency conversion

Seamless integration with SAP HANA
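
To make the two business functions named above more concrete, here is a small plain-Python sketch of a hierarchy roll-up combined with currency conversion. It does not use Vora or SAP HANA; the hierarchy, the exchange rates and the function names `convert` and `roll_up` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical organizational hierarchy: child -> parent (None marks the root).
PARENT = {
    "Global": None,
    "EMEA": "Global",
    "Americas": "Global",
    "Germany": "EMEA",
    "France": "EMEA",
    "Canada": "Americas",
}

# Made-up fixed rates into the reporting currency (USD), for illustration only.
RATE_TO_USD = {"USD": 1.0, "EUR": 1.25, "CAD": 0.75}


def convert(amount, currency, rates=RATE_TO_USD):
    """Convert an amount into the reporting currency."""
    return amount * rates[currency]


def roll_up(postings, parent=PARENT):
    """Add each converted amount to its node and to every ancestor of that node."""
    totals = defaultdict(float)
    for node, amount, currency in postings:
        usd = convert(amount, currency)
        while node is not None:
            totals[node] += usd
            node = parent[node]
    return dict(totals)


postings = [
    ("Germany", 100.0, "EUR"),
    ("France", 200.0, "EUR"),
    ("Canada", 400.0, "CAD"),
]
totals = roll_up(postings)
```

Having such functions available out of the box means a query can ask for a total at any level of the hierarchy in a single reporting currency without the analyst hand-coding the traversal.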

[Diagram: SAP HANA Vora stack on Spark and Hadoop, consumed by data science, predictive, business intelligence, visualization and other apps; the SAP HANA platform with its in-memory store is shown alongside as optional]

SAP HANA and SAP HANA Vora Integration

Gain business coherence with business data and big data

[Diagram: other apps use SQL against the SAP HANA in-memory platform (in-memory store); HANA Smart Data Access and the Spark Controller link it to a YARN cluster whose nodes each hold HDFS files and run Vora with Spark; Vora is labeled as a Spark data source API enhancement]

Utilities Customer: Data Tiering to Petabyte Scale

[Diagram: in SAP HANA, the Data Lifecycle Manager handles data movement between the hot store (column table), the warm store (extended table) and a Hadoop cluster; the Hadoop cluster runs YARN, and each node holds HDFS files and runs Vora with Spark]
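
The tiering idea on this slide can be sketched as a simple age-based placement rule. This is a plain-Python illustration under stated assumptions, not the Data Lifecycle Manager's actual logic or API: the thresholds, tier names and the functions `choose_tier` and `plan_movement` are hypothetical.

```python
from datetime import date

# Hypothetical age thresholds in days; real policies depend on the workload.
HOT_MAX_AGE_DAYS = 90
WARM_MAX_AGE_DAYS = 730


def choose_tier(record_date, today):
    """Pick a storage tier from the age of a record."""
    age = (today - record_date).days
    if age <= HOT_MAX_AGE_DAYS:
        return "hot"     # in-memory column table
    if age <= WARM_MAX_AGE_DAYS:
        return "warm"    # extended table
    return "hadoop"      # files on HDFS, queried through Vora and Spark


def plan_movement(records, today):
    """Return {target tier: [record ids]} for records that should change tier."""
    moves = {}
    for rec_id, record_date, current_tier in records:
        target = choose_tier(record_date, today)
        if target != current_tier:
            moves.setdefault(target, []).append(rec_id)
    return moves


today = date(2016, 12, 8)
records = [
    ("r1", date(2016, 11, 1), "hot"),   # recent, stays hot
    ("r2", date(2016, 1, 15), "hot"),   # older than 90 days, moves to warm
    ("r3", date(2013, 6, 30), "warm"),  # older than two years, moves to Hadoop
]
moves = plan_movement(records, today)
```

A rule of this kind keeps the most recent data in memory while the bulk of the history sits on cheaper storage, which is how a tiered setup can grow toward petabyte scale.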


Fashion Retail Use Case: Social Media Analysis

Check out the Vora Test Drive at http://testdrive.saphanavora.com


Easier to Use

Intuitive web interface with drag-and-drop for creating data models

One SQL entry point to interact with specialized computing engines

Connect familiar analytics tools and web notebooks


Data Modeler for creating business scenarios

Views for creating business scenarios:

• Data Browser for viewing and exporting data

• SQL Editor for writing and running SQL scripts

• Modeler to visually create data models with intuitive web interface


The Lambda Architecture

www.ymc.ch/en/lambda-architecture-part-1

Batch Layer
• High latency, high throughput
• Compute official result

Speed Layer
• Low latency
• Compute approximate update to last known result

Serving Layer
• Real-time
• Merge batch/speed results
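
The interplay of the three layers can be sketched in a few lines of plain Python. This is a toy illustration, not a production design: the event format, the counting workload and the function names `batch_view`, `speed_view` and `serve` are assumptions chosen for brevity.

```python
from collections import Counter


def batch_view(master_events):
    """Batch layer: recompute counts from the full, immutable event log."""
    return Counter(event["key"] for event in master_events)


def speed_view(recent_events):
    """Speed layer: incremental counts for events not yet in the batch view."""
    view = Counter()
    for event in recent_events:
        view[event["key"]] += 1
    return view


def serve(key, batch, speed):
    """Serving layer: merge the batch result with the speed-layer update."""
    return batch[key] + speed[key]


# Hypothetical events: `master` was covered by the last batch run,
# `recent` arrived afterwards.
master = [{"key": "sensor-a"}, {"key": "sensor-a"}, {"key": "sensor-b"}]
recent = [{"key": "sensor-a"}, {"key": "sensor-c"}]

batch, speed = batch_view(master), speed_view(recent)
current_a = serve("sensor-a", batch, speed)
```

When the next batch run absorbs the recent events, the speed view for them can be discarded, so any approximation in the speed layer is only temporary.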


Lambda Architecture: Customer Example

[Diagram: data from things arrives via LTE and MQ/Kafka; the components shown are SAP HANA Smart Data Streaming, SAP HANA, and a YARN cluster whose nodes each hold HDFS files and run Vora with Spark. Labels on the diagram: validate and aggregate messages; low latency, high throughput; modern developer tools; record/replay; spatial and predictive libraries; reporting with standard BI tools; high-speed analytics; immutable copy; all thing history; alerts]

Observations

• Intelligent distribution eliminates replication of data for analytics
• Each component provides fit-for-purpose analytics
• Each component scales independently for the use case at hand
• Reduced TCO
• Increased analytical agility
• Brings the code to the data
• Supports additional usage models


Thank You

Contact: [email protected]

sap.com/hana-vora