Massively Parallel with PDW!

Upload: microsoft-technet-france

Post on 15-Jun-2015

Category: Technology


DESCRIPTION

How do you move toward Big Data with Microsoft SQL Server Parallel Data Warehouse and Hadoop? Turnkey solutions based on SQL Server are now available to meet all your needs, in the form of reference architectures or appliances. These ready-to-use appliances let you speed up deployments, reduce risk, and optimize your TCO. To bridge the world of data warehouses and that of Big Data, PolyBase offers familiar access to all your data, relational or not, through standard T-SQL commands. Come and discover what is under the hood of these curious machines, and the benefits you can gain from them! Speakers: Gilbert Breton (Microsoft), Lionel Pénuchot (Microsoft)

TRANSCRIPT

Page 1: Massively Parallel with PDW!
Page 2: Massively Parallel with PDW!

#mstechdays Business Intelligence

Give us your feedback!

From your smartphone at: http://notes.mstechdays.fr

Many prizes to be won every hour! Keyboards, mice, and Microsoft games…

Thank you for helping us improve TechDays!

Page 3: Massively Parallel with PDW!

Business Intelligence

Massively Parallel with PDW!

Lionel Pénuchot, Gilbert Breton

Page 4: Massively Parallel with PDW!

The traditional approach to a data warehouse

“… data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing.”

– Gartner, “The State of Data Warehousing in 2012”

Data sources (OLTP, ERP, CRM, LOB) -> ETL -> Data warehouse -> BI and analytics

Page 5: Massively Parallel with PDW!

The traditional approach to a data warehouse

Data sources (OLTP, ERP, CRM, LOB) -> ETL -> Data warehouse -> BI and analytics

1. Increasing data volumes
2. Real-time data
3. New data sources & types: non-relational data (devices, web, sensors, social)
4. Cloud-born data

Page 6: Massively Parallel with PDW!

Modernizing a business intelligence project

BI & ANALYTICS: Self-service, Collaboration, Corporate, Predictive, Mobile

DATA ENRICHMENT AND FEDERATED QUERY: Extract, transform, load; Single query model; Data quality; Master data management

DATA MANAGEMENT & PROCESSING: Non-relational, Relational, Analytical, Streaming, Internal & External

INFRASTRUCTURE

Data sources: OLTP, ERP, CRM, LOB

Non-relational data: Devices, Web, Sensors, Social

Page 7: Massively Parallel with PDW!

The Microsoft Data Platform

BI & ANALYTICS: Self-service, Collaboration, Corporate, Predictive, Mobile

DATA ENRICHMENT AND FEDERATED QUERY: Extract, transform, load; Single query model; Data quality; Master data management

DATA MANAGEMENT & PROCESSING: Non-relational, Relational, Analytical, Streaming, Internal & External

INFRASTRUCTURE: Virtualization, Scalability, Security and Identity, Quality of service

Page 8: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 9: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 10: Massively Parallel with PDW!

Scale-out for relational data

Massively Parallel Processing (MPP) parallelizes queries

Multiple nodes with dedicated CPU, memory, storage

Incrementally add HW for near linear scale to multi-PB

Handles query complexity and concurrency at scale

No fork-lift of prior warehouse to increase capacity

Scale OUT: From Terabytes to Multi-Petabytes

Scale out technologies in SQL Server Parallel Data Warehouse

Relational
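To make the scale-out idea concrete, here is a minimal Python sketch of how an MPP system can spread rows across nodes by hashing a distribution key, aggregate on each node independently, and then merge the partial results. The node count, the `customer_id` key, the sample rows, and all function names are illustrative assumptions; this is not PDW's actual implementation.

```python
from collections import defaultdict
from zlib import crc32


def node_for(key: str, node_count: int) -> int:
    """Pick a node by hashing the distribution key (stable across runs)."""
    return crc32(key.encode("utf-8")) % node_count


def distribute(rows, key_field, node_count):
    """Assign every row to exactly one node based on its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        nodes[node_for(row[key_field], node_count)].append(row)
    return nodes


def local_sum(rows, group_field, value_field):
    """Partial aggregate computed independently on one node."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return totals


def parallel_sum(nodes, group_field, value_field):
    """Merge the per-node partial aggregates into the final result."""
    merged = defaultdict(float)
    for node_rows in nodes:  # each iteration could run on a separate node
        for group, total in local_sum(node_rows, group_field, value_field).items():
            merged[group] += total
    return dict(merged)


# Hypothetical sample data.
sales = [
    {"customer_id": "c1", "country": "FR", "amount": 10.0},
    {"customer_id": "c2", "country": "US", "amount": 5.0},
    {"customer_id": "c3", "country": "FR", "amount": 2.5},
    {"customer_id": "c1", "country": "FR", "amount": 4.0},
]
nodes = distribute(sales, "customer_id", node_count=4)
totals_by_country = parallel_sum(nodes, "country", "amount")
```

Because each node only sums its own rows, adding nodes shrinks the per-node work, which is the intuition behind the near linear scale claimed on the slide.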

Page 11: Massively Parallel with PDW!

Scale-out for non-relational data

Create Hadoop cluster for your data requirements

Seamlessly add more compute to fit your demand

Scales out linearly

Shut down clusters when you are done

Scale Out “Big Data”

Scale OUT

Scale out non-relational data in HDInsight (for Azure or PDW)

Non-relational

Page 12: Massively Parallel with PDW!

Financial Risk Management System use case

• Basel III implementation on PDW
• RDBMS -> Windows HPC computation farm -> PDW store and aggregate
• Run the aggregation daily

| Contracts (x 1 million) | SMP (extrapolated) | 4-node PDW | 10-node PDW | Improvement, 4-node MPP vs SMP | Improvement, 10-node MPP vs SMP | Scale factor, 4 vs 10 nodes |
|---|---|---|---|---|---|---|
| 1 | 55 min | 8 min 18 sec | 3 min 30 sec | 6.6x | 15.7x | 2.4x |
| 3 | 2 hr 45 min | 25 min 50 sec | 9 min 52 sec | 6.4x | 16.7x | 2.6x |
| 10 | 9 hr 9 min | 1 hr 40 min | 32 min 26 sec | 5.5x | 16.9x | 3.1x |
| 25 | 22 hr 52 min | 4 hr 31 min | 1 hr 29 min | 5.1x | 15.4x | 3.0x |
| 50 | 1 day 21 hr 45 min | 9 hr 35 min | 3 hr 7 min | 4.8x | 14.7x | 3.1x |
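The improvement columns can be recomputed from the timings. The short Python sketch below does so for the 1-million-contract row, reading the slide's figures as 55 min on SMP, 8 min 18 sec on 4 nodes, and 3 min 30 sec on 10 nodes. The helper names are illustrative.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline_s, improved_s):
    """Ratio of baseline time to improved time, rounded to one decimal."""
    return round(baseline_s / improved_s, 1)


smp = to_seconds(minutes=55)
pdw_4 = to_seconds(minutes=8, seconds=18)
pdw_10 = to_seconds(minutes=3, seconds=30)

improvement_4_vs_smp = speedup(smp, pdw_4)
improvement_10_vs_smp = speedup(smp, pdw_10)
scale_4_to_10 = speedup(pdw_4, pdw_10)
```

Going from 4 to 10 nodes is 2.5 times the hardware, so a measured scale factor of 2.4x on this row is close to linear.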

Page 13: Massively Parallel with PDW!

Demo

SCALE-OUT WITH MPP TECHNOLOGIES IN PDW

Page 14: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 15: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 16: Massively Parallel with PDW!

In-memory processing

Stores data in columnar format for massive compression

Loads data into or out of memory for next-gen performance

Leverage your existing hardware regardless of specs

Updateable and clustered for real-time trickle loading

Diagram labels: Customer, Sales, Country, Supplier, Products

Up to 100x faster and 10x compression

In-memory columnstore for next-generation performance

Relational
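As a rough illustration of why columnar storage compresses well, the Python sketch below pivots rows into columns and run-length encodes a column with repeated values. It is a teaching sketch under simplifying assumptions, not SQL Server's actual columnstore format; the sample rows and function names are hypothetical.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical sample data.
rows = [
    {"country": "FR", "product": "bike", "sales": 120},
    {"country": "FR", "product": "bike", "sales": 80},
    {"country": "FR", "product": "helmet", "sales": 30},
    {"country": "US", "product": "helmet", "sales": 45},
]
columns = to_columns(rows)
encoded_country = run_length_encode(columns["country"])
total_sales = sum(columns["sales"])  # reads only the one column it needs
```

Values within a column tend to repeat, so a per-column encoding stores far fewer entries than the row layout, and an aggregate over one column never has to touch the others.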

Page 17: Massively Parallel with PDW!

Near-real-time insight

Low latency with sub-zero processing of large event streams

Continuous insight through historical data mining

Management simplicity and flexible deployment

EVENT SOURCES

EVENT TARGETS

Real-time Insights

Real-time with complex event processing

Relational
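One basic building block of complex event processing is a windowed aggregate that is updated as each event arrives. The Python sketch below keeps a running average over a sliding time window. The class name, the 10-second window, and the sample events are assumptions made for illustration, and the code does not use any Microsoft product's API.

```python
from collections import deque


class SlidingWindowAverage:
    """Running average over the events of the last `window_s` seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.events = deque()  # (timestamp, value), oldest first
        self.total = 0.0

    def push(self, timestamp, value):
        """Add one event and return the average over the current window."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        while self.events[0][0] <= timestamp - self.window_s:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


# Hypothetical event stream: (timestamp in seconds, reading).
window = SlidingWindowAverage(window_s=10)
averages = [window.push(t, v) for t, v in [(0, 10.0), (5, 20.0), (12, 30.0)]]
```

Each event is processed once on arrival and evicted once when it expires, so the result is available immediately instead of after a batch load.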

Page 18: Massively Parallel with PDW!

Retail industry use case

• 30 users, 6 TB of data (5 years of history)
• Daily volumes: 50 million rows
• Bank data for customer and fraud analysis
  – Weekly analysis of deviant and unusual transactions suspected of money laundering
• Retail data for customer purchase analysis at receipt-line level
  – For example, backtracking the last 6 months per customer, per product, and per category, resulting in marketing offers such as “you like item A; others who bought item A also purchased item B”
• Queries covering half a year of data now complete in 1.5 minutes on PDW, compared with 8 minutes for one month of data on the traditional DW.
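The "others who bought item A also purchased item B" offer boils down to counting which items share a receipt. A minimal Python sketch, assuming each receipt is simply a list of item names; the data and function names are hypothetical, and a production system would run the equivalent aggregation inside the warehouse.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(receipts):
    """Count how often each unordered pair of items shares a receipt."""
    pairs = Counter()
    for items in receipts:
        for a, b in combinations(sorted(set(items)), 2):
            pairs[(a, b)] += 1
    return pairs


def also_bought(item, pairs, top=3):
    """Items most often bought together with `item`, most frequent first."""
    related = Counter()
    for (a, b), count in pairs.items():
        if a == item:
            related[b] += count
        elif b == item:
            related[a] += count
    return [name for name, _ in related.most_common(top)]


# Hypothetical receipts, one list of items per receipt.
receipts = [
    ["A", "B", "C"],
    ["A", "B"],
    ["A", "D"],
    ["B", "C"],
]
pairs = co_purchase_counts(receipts)
suggestions = also_bought("A", pairs)
```

The pair count grows quickly with receipt-line volume, which is why this kind of analysis benefits from being spread across nodes.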

Page 19: Massively Parallel with PDW!

Demo

IN-MEMORY COLUMNSTORE FOR NEXT-GEN PERFORMANCE

Page 20: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 21: Massively Parallel with PDW!

All Volumes | Any Data | Real-time Performance

Page 22: Massively Parallel with PDW!

Exploring non-relational data

Manage non-relational data

100% Apache-based

Management simplicity of Windows

Bringing Hadoop to software, appliance, cloud

“Big Data” with simplicity

Windows Azure

Parallel Data Warehouse

Hortonworks Data Platform

Hadoop cluster in HDP for Windows and HDInsight

Non-relational

Page 23: Massively Parallel with PDW!

Integration with PolyBase

Query relational and Hadoop in parallel

Single query

No need to ETL Hadoop data into DW

Query Hadoop with existing T-SQL skills

Query relational + non-relational

SQL Result set

Relational data

PolyBase

Integrated query with PolyBase in SQL PDW
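The single-query idea can be imitated in miniature with Python's standard library: one SQL statement joins a relational table with rows that come from a delimited text file. This is only an analogy and not PolyBase's syntax or API. The sketch stages the file in a temporary table for simplicity, whereas the slide's point is that PolyBase avoids a separate ETL step. The table names, file content, and columns are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for a delimited file stored in Hadoop (hypothetical content).
CLICKSTREAM_FILE = "customer_id,page\n1,/bikes\n2,/helmets\n1,/checkout\n"

conn = sqlite3.connect(":memory:")

# Relational side: an ordinary table.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

# Non-relational side: expose the file rows to SQL under a table name.
conn.execute("CREATE TEMP TABLE clickstream (customer_id INTEGER, page TEXT)")
reader = csv.DictReader(io.StringIO(CLICKSTREAM_FILE))
conn.executemany(
    "INSERT INTO clickstream VALUES (?, ?)",
    [(int(row["customer_id"]), row["page"]) for row in reader],
)

# One SQL statement joins relational rows with the file-backed rows.
result = conn.execute(
    """
    SELECT c.name, COUNT(*) AS page_views
    FROM customer AS c
    JOIN clickstream AS k ON k.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
conn.close()
```

The person writing the final query only needs SQL and never touches the file format, which is the skill-reuse benefit the slide describes.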

Page 24: Massively Parallel with PDW!

Demo

POLYBASE TO JOIN RELATIONAL AND NON-RELATIONAL DATA

Page 25: Massively Parallel with PDW!

Deployment options

Box software, Appliances, Cloud

SQL Server

Hortonworks Data Platform

Parallel Data Warehouse

SQL Server for data warehousing in Windows Azure VMs

HDInsight for Windows Azure

SQL Server Fast Track

Page 26: Massively Parallel with PDW!

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Digital is business