big data tim
© 2013 SAP AG. All rights reserved. 1Confidential
90% OF THE DATA IN THE WORLD TODAY
has been created in the last two years alone
IN 2011, THE AMOUNT OF DATA SURPASSED 1.8 ZETTABYTES
1.8 ZETTABYTES = 57.5 BILLION 32 GB iPads = $34.4 TRILLION = COMBINED GDP OF:
• US
• Japan
• China
• Germany
• France
• UK
• Italy
** IDC Digital Universe Study, Extracting Value from Chaos
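The iPad comparison can be checked with quick arithmetic. The sketch below assumes decimal (SI) units, with 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes. The slide does not say which convention IDC used, so the result should land close to the quoted 57.5 billion rather than match it exactly. Reading the $34.4 trillion figure as the total cost of those devices is also an inference, not something the slide states.

```python
# Sanity check of the slide's comparison, assuming SI (decimal) units.
ZETTABYTE = 10**21  # bytes (assumption: decimal definition)
GIGABYTE = 10**9    # bytes (assumption: decimal definition)

total_bytes = 1.8 * ZETTABYTE
ipad_bytes = 32 * GIGABYTE

# Number of 32 GB devices needed to hold 1.8 ZB
ipads = total_bytes / ipad_bytes
print(f"{ipads / 1e9:.2f} billion iPads")  # 56.25 billion iPads

# Price per device implied by the slide's own two figures
implied_price = 34.4e12 / 57.5e9
print(f"implied price per device: ${implied_price:.0f}")  # $598
```

Under these assumptions the count comes out about 2% below the slide's figure, which is consistent with a difference in rounding or unit convention.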
Today we measure available data in zettabytes
Types and Volumes of Data have grown dramatically
[Figure: data sources and types, labeled Mobile, M2M data, Things, Instant Messages, ERP Systems, CRM Systems, Legacy ERP, Legacy EDW, Planning, Sales Order, Inventory, Demand, Customer, Transactions]
Structured data grew by more than 40% per year
Traditional content types, including unstructured data, are growing by up to 80% per year
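To see what these annual rates mean when compounded, the sketch below treats them as constant year-over-year growth. That is an assumption: the slide gives 40% as a lower bound and 80% as an upper bound. The function names are illustrative only.

```python
import math

def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)

def growth_factor(annual_growth: float, years: int) -> float:
    """Multiplier after compounding the annual growth for the given years."""
    return (1 + annual_growth) ** years

for label, rate in [("structured (40%/yr)", 0.40),
                    ("unstructured (80%/yr)", 0.80)]:
    print(f"{label}: doubles in {doubling_time_years(rate):.2f} years, "
          f"x{growth_factor(rate, 5):.1f} after 5 years")
```

At 40% per year a data set doubles in roughly two years and grows about 5.4 times in five years. At 80% it doubles in a little over one year and grows almost 19 times in five.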
Where is this data?
WHAT CRITICAL “NEW SIGNALS”
MIGHT WE BE MISSING?
What can’t we see?
Is it in our ERP Systems?
Our M2M data? Social?
Big Data - Definition
Big Data Challenge: turn raw data into insights that drive business value, and manage that data in a cost-effective manner.
Most commonly, Big Data refers to terabytes or petabytes of data, stored in multiple formats, from different internal and external sources, with strict demands for speed and complexity of analysis.
“Big Data” refers to the problems of capturing, storing, managing, and analyzing massive amounts of various types of data.
The SAP you need to know
[Diagram labels:]
• mobile
• Business Suite (ERP)
• System of Record: “Foundational SAP”
• System of Engagement: “Newer SAP”
• Business Analytics
• Data Logistics/Quality, ETL
• SAP Cloud: Maintenance & Operations, 24/7, SLAs, DR & HA, Elasticity
• In-Memory Database Platform (HANA): In Memory / Columnar / MPP / Federation
Digging Deeper
[Architecture diagram of the HANA In-Memory Database Platform (In Memory / Columnar / MPP / Federation). Recoverable labels:]
• User Interface & Applications: SAP Business Suite (CRM, SCM, SRM, PLM, ERP), SAP BW, Custom Apps, HANA Native Apps
• Engines: OLTP, OLAP, Predictive, Text, HTTP, Core, BI, Geospatial
• Models: Logical
• Physical Table(s) and Virtual Tables
• Data temperature: HOT, WARM, COLD
• Storage: memory, cached, disk
• Ingest Engines and Data Logistics (Data Services, SLT, CEP): Bulk/Streaming/Real-time, Federation
• External sources: Other DB, Other ERP, Other Data …
Open Hadoop Strategy
Resell
Strategic Partnership
Accelerated BI with SAP BusinessObjects and SAP HANA
One unified and complete BI Suite addressing the full spectrum of BI on SAP HANA
Discovery and Analysis
Discover areas to optimize your business
Adapt data to business needs
Tell your story with beautiful visualizations
Discover. Predict. Create.
Dashboards and Apps
Deliver engaging information to users where they need it
Track key performance indicators and summary data
Build custom experiences so users get what they need quickly
Build Engaging Experiences
Reporting
Securely distribute information across your organization
Give users the ability to ask and answer their own questions
Build printable reports for operational efficiency
Share Information
Data Logistics
[Diagram: data provisioning into SAP HANA. Recoverable labels:]
Data Sources: SAP Business Suite, SAP BW, Non SAP Data Sources, M2M, Event Streams
Provisioning tools:
• SAP LT Replication Server (Trigger Based, Real Time)
• SAP BOBJ Data Services (ETL, Batch)
• Sybase Replication Server (Log Based)
• SAP Event Stream Processor * (Event Streams)
Connections: DB Connection, ODBC, ECDA/ODBC
SAP HANA: SAP In-Memory Database (In-Memory Models, Column Store), HANA Studio
Consumers: SAP BusinessObjects tools, Other query tools
Access interfaces: SQL, BICS, MDX
* SAP HANA Roadmap
** SAP ERP & BW Extractors
SAP Big Data Apps
• Customer Engagement Intelligence
• Predictive Analytics RDS
See overview https://community.wdf.sap.corp/docs/DOC-222087
Delivering On Your Business Imperatives
Data Science Services
Forecasting Sales and Demand
• Forecast demand and manage inventory levels in perishable CPG
• Model variant cannibalization and impact on manufacturer forecasts
• Utility load demand forecasting
Optimization
• Optimize transport and logistics; recover from unforeseen disruptions
• Optimize depth and timing of retail markdowns to boost sales
• Grow deposits without excessive interest costs
Check and Compliance
• Deliver faster response time and higher throughput of compliance checks to enable competitive advantage
• Tackle public fraud, waste, and abuse by analyzing records for tax discovery
Performance and Insights
• Maximize guest / customer experience
• Assess the impact of promotions, and improve profitability
• Directional insight on growing revenues and basket sizes
Contact “DL BigDataSalesSupport” for more information about SAP Data Science Services
HANA + Hadoop
What is Hadoop
• Open source project inspired by Google/Yahoo
• Used at Yahoo, Facebook, eBay, LinkedIn, startups, and Fortune 500 enterprises to store and process petabytes of data on thousands of servers
• Hadoop components
– Cluster of commodity servers
– Distributed storage layer (Hadoop Distributed File System, or HDFS)
– Distributed processing infrastructure (MapReduce programming model)
[Diagram: Hadoop Software Architecture]
• Cluster of commodity servers: Hadoop NameNode, 10s to 1000s of DataNode(s)
• Hadoop computation engines: Map-Reduce, Hive, HBase, Mahout, Pig, Sqoop, …
• Data storage (Hadoop Distributed File System)
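The MapReduce programming model named above can be illustrated without a cluster. The sketch below is plain Python, not the Hadoop API. It runs the three conceptual phases (map, shuffle, reduce) in one process to count words. The function names and sample input are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data", "Big Data big insight"])
print(counts)  # {'big': 3, 'data': 2, 'insight': 1}
```

In a real cluster the map and reduce calls would run in parallel on many servers, with the framework handling the shuffle between them. Here everything runs sequentially to keep the data flow visible.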
Apache Hadoop
Software framework for distributed data processing
Hadoop Distributed File System (HDFS) – reliable data storage on commodity hardware
HIVE – data warehousing solution on top of Hadoop with direct access to HDFS and HBase
MapReduce – programming model for parallel data processing and query execution
[Diagram: HDFS. A client communicates with the HDFS Name Node (stores metadata) and with Data Nodes (store actual data in blocks); blocks are replicated across Data Nodes.]
[Diagram: MapReduce data flow. Input (HDFS) → process (MapReduce) → output (HDFS)]
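The split between a Name Node holding metadata and Data Nodes holding replicated blocks can be sketched as a toy in-process simulation. This is not HDFS code. The class name `ToyCluster`, the tiny block size, the replication factor of 2, and the round-robin placement are all assumptions chosen to keep the example small; real HDFS uses far larger blocks and its own placement policy.

```python
from typing import Dict, FrozenSet, List, Tuple

class ToyCluster:
    """Toy model: a name node maps files to block locations, data nodes hold bytes."""

    def __init__(self, node_names: List[str], block_size: int = 4,
                 replication: int = 2) -> None:
        if replication > len(node_names):
            raise ValueError("replication factor exceeds number of data nodes")
        self.block_size = block_size
        self.replication = replication
        # Data nodes: node name -> {(filename, block index): block bytes}
        self.data_nodes: Dict[str, Dict[Tuple[str, int], bytes]] = {
            name: {} for name in node_names}
        # Name node metadata: filename -> per-block list of node names
        self.name_node: Dict[str, List[List[str]]] = {}

    def put(self, filename: str, data: bytes) -> None:
        names = list(self.data_nodes)
        placements: List[List[str]] = []
        for idx, start in enumerate(range(0, len(data), self.block_size)):
            block = data[start:start + self.block_size]
            # Round-robin placement (assumption, not the HDFS policy)
            targets = [names[(idx + r) % len(names)]
                       for r in range(self.replication)]
            for node in targets:
                self.data_nodes[node][(filename, idx)] = block
            placements.append(targets)
        self.name_node[filename] = placements

    def get(self, filename: str, failed: FrozenSet[str] = frozenset()) -> bytes:
        parts = []
        for idx, targets in enumerate(self.name_node[filename]):
            alive = [n for n in targets if n not in failed]
            if not alive:
                raise IOError(f"block {idx} of {filename} has no live replica")
            parts.append(self.data_nodes[alive[0]][(filename, idx)])
        return b"".join(parts)

cluster = ToyCluster(["dn1", "dn2", "dn3"])
cluster.put("log.txt", b"hello hadoop")
# The file is still readable with one data node down
print(cluster.get("log.txt", failed=frozenset({"dn1"})))
```

With two replicas per block, losing any single node leaves every block readable. Losing two nodes that hold both copies of the same block makes the read fail, which is why the replication factor is a tuning decision.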
Why Hadoop?
Pros
Free software
Cheap hardware - commodity servers
Scalable to thousands of nodes and petabytes of data
Highly fault-tolerant storage and processing
Flexible – write Java MapReduce programs to do any kind of processing on any data; no fixed schema needed
Open source libraries & tools
Cons
Specialized skillset to administer and develop – Hadoop is not free!
Requires more development (programming MapReduce & other NoSQL tools) than relational technologies (SQL, stored procedures)
HIVE/PIG/Impala are neither as performant nor as mature as relational technologies
Batch-oriented jobs, not real-time
Less mature in enterprise readiness – security, ETL, management, monitoring, etc.
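The development-effort point can be made concrete by writing the same aggregation twice. The sketch below uses Python's built-in sqlite3 module as a stand-in for a relational engine and a hand-written map/group/reduce for the MapReduce style. The `orders` data, table name, and column names are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, order amount)
orders = [("EMEA", 120.0), ("APJ", 80.0), ("EMEA", 30.0), ("AMER", 50.0)]

# Relational style: one declarative statement does the grouping and summing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
sql_totals = dict(conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"))
conn.close()

# MapReduce style: the developer writes each phase by hand.
def mapper(record):
    region, amount = record
    yield region, amount

def reducer(region, amounts):
    return region, sum(amounts)

grouped = defaultdict(list)
for record in orders:
    for key, value in mapper(record):
        grouped[key].append(value)
mr_totals = dict(reducer(k, v) for k, v in grouped.items())

print(sql_totals == mr_totals)  # True
```

Both paths give the same totals. The difference is how much code the developer owns: a single query versus explicit map, grouping, and reduce logic.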
SAP HANA + Hadoop Provides Real-Time on BIG DATA
Combine INSTANT Results with INFINITE Storage
81.0sec
SAP HANA: Instant Results
• Modern in-memory platform
• Transact/analyze in real-time
• Native predictive, text, and spatial algorithms
HADOOP: Infinite storage
• Distributed disk platform
• Store infinite amounts of unstructured data
• NoSQL access