microsoft technológie pre bigdata - efocus konferencie · 2014-10-16 · microsoft, windows,...

Microsoft Technologies for Big Data
Ľubomír Goryl, Solution Professional

TRANSCRIPT

Page 1: Microsoft technológie pre BigData - eFocus Konferencie · 2014-10-16 · Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks

Microsoft Technologies for Big Data

Ľubomír Goryl

Solution Professional

Page 2:
Page 3:

Traditional approach

Page 4:

Breaking points of traditional approach

Page 5:

Breaking points of traditional approach

Page 6:

Breaking points of traditional approach

Page 7:

HDInsight in Azure (Hadoop)

Page 8:

HDInsight – step 1/3

Page 9:

HDInsight – step 2/3

Page 10:

HDInsight – step 3/3

Page 11:

15 minutes … Hadoop cluster is running

Page 12:

Hadoop is a platform with a portfolio of projects

Page 13:

A Hadoop distribution is a package of projects

Page 14:

Microsoft Analytics Platform System
Appliance for the modern data warehouse

[Diagram labels: BI Tools; SSRS / SSAS; SQL Server SMP; Analytics Platform System; PDW; HDInsight; PolyBase; ETL/ELT with SSIS, DQS, MDS; ETL/ELT with DWLoader; ERP, CRM, LOB apps; Hadoop / Big Data; Ad hoc queries]

Page 15:

What is Parallel Data Warehouse?

• Shared-nothing parallel database system
  » Massively parallel processing (MPP)
  » A “Control” server that accepts user queries, generates a plan, and distributes operations in parallel to compute nodes
  » Multiple “Compute” servers running SQL Server
  » A “Management” server for administering the system
  » A “Data Movement Service” that facilitates parallel SQL operations

• Delivered as an appliance
  » Balanced and pre-configured software and industry standard hardware from HP
  » Single Call Support
  » Fastest Time to Market
  » Scales from 2 to 56 Nodes
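The control/compute split described above can be pictured with a small sketch. It is only an illustration of the shared-nothing idea, not PDW's actual engine: the node data, the function names and the use of threads in place of separate servers are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each "compute node" owns its own slice of the rows
# (shared-nothing), modelled here as lists of (region, amount) tuples.
COMPUTE_NODES = [
    [("east", 10), ("west", 5)],
    [("east", 7), ("north", 3)],
    [("west", 8), ("north", 4)],
]


def compute_node_sum(rows):
    """Runs on one compute node: aggregate only the locally stored rows."""
    partial = {}
    for region, amount in rows:
        partial[region] = partial.get(region, 0) + amount
    return partial


def control_node_query(nodes):
    """Plays the control server: fan the work out, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(compute_node_sum, nodes))
    total = {}
    for partial in partials:
        for region, amount in partial.items():
            total[region] = total.get(region, 0) + amount
    return total


result = control_node_query(COMPUTE_NODES)
print(sorted(result.items()))
```

Each node touches only its own rows, so adding nodes adds both storage and processing capacity; the control step only merges small partial results.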

Page 16:

SQL Server Parallel Data Warehouse

Page 17:

SQL Server Parallel Data Warehouse

Parallel query execution

Query

Results

Page 18:

HP ConvergedSystem 300 for Microsoft Analytics Platform: Base unit

Passive Server Block (2): virtualized control and management node; failover node for HA

• Passive server (#1)
• Passive server (#2)
• Optional passive server (#3)

FDR InfiniBand and 2 x HP 5120 EI switches: InfiniBand (data network) and Ethernet (management network) connectivity

Base unit: Active Server Block (2) and Storage Block (1)

Page 19:

Microsoft Analytics Platform System
Appliance for the modern data warehouse

[Diagram labels: BI Tools; SSRS / SSAS; SQL Server SMP; Analytics Platform System; PDW; HDInsight; PolyBase; ETL/ELT with SSIS, DQS, MDS; ETL/ELT with DWLoader; ERP, CRM, LOB apps; Hadoop / Big Data; Ad hoc queries]

Page 20:

SQL Server Parallel Data Warehouse – Appliance Update 1

Extending the distributed Data Warehouse further

Page 21:

Single T-SQL query model for PDW and Hadoop with rich features of T-SQL including joins without ETL

Leverages the power of MPP to enhance query execution performance

Supports Windows Azure HDInsight to enable new hybrid cloud scenarios

Query non-Microsoft Hadoop distributions such as Hortonworks and Cloudera

Query Hadoop data with T-SQL using PolyBase

Bringing the worlds of big data and the data warehouse together for users and IT

[Diagram: a Select… query goes to SQL Server Parallel Data Warehouse and a result set comes back; PolyBase sits between the warehouse and Cloudera, Hortonworks (Windows, Linux), Windows Azure HDInsight and Microsoft HDInsight]
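The "joins without ETL" idea can be pictured with a toy sketch. It does not use PolyBase or T-SQL: an in-memory SQLite database stands in for the warehouse, a pipe-delimited string stands in for a file in Hadoop, and all table names, column names and values are hypothetical. Unlike PolyBase, this version copies the file rows into a temporary table; it only illustrates the single-query experience from the user's side.

```python
import csv
import io
import sqlite3

# Stand-in for a delimited file stored in Hadoop (hypothetical content).
HADOOP_FILE = "customer_id|clicks\n1|40\n2|15\n3|9\n"


def query_with_external_data(hadoop_text):
    """Join relational rows with file rows in a single SQL statement."""
    db = sqlite3.connect(":memory:")
    # Relational side: a small warehouse table.
    db.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
    db.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Alfa"), (2, "Beta")]
    )
    # File side: expose the delimited rows to SQL as a temporary table.
    db.execute(
        "CREATE TEMP TABLE clickstream (customer_id INTEGER, clicks INTEGER)"
    )
    reader = csv.DictReader(io.StringIO(hadoop_text), delimiter="|")
    db.executemany(
        "INSERT INTO clickstream VALUES (?, ?)",
        [(int(r["customer_id"]), int(r["clicks"])) for r in reader],
    )
    rows = db.execute(
        "SELECT c.name, s.clicks FROM customers c "
        "JOIN clickstream s ON s.customer_id = c.customer_id "
        "ORDER BY c.name"
    ).fetchall()
    db.close()
    return rows


joined = query_with_external_data(HADOOP_FILE)
print(joined)
```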

Page 22:

Access Hadoop on a different cluster (cloud or on premises)

HDInsight is the Microsoft-branded Hortonworks Data Platform

• We made it work on Windows

• We brought System Center support to it

[Diagram: PolyBase connects to Hortonworks Data Platform (HDP) on Windows or Linux and to HDInsight on Azure (rebranded HDP)]

Page 23:

Hardware topology overview

• Uses the same PDW hardware and topology

• The key difference is introduction of 2 additional servers on rack 1 for the HDI Head Node

• 1 active server and 1 failover server.

[Diagram labels: PDW Control Node; HDI Head Node; HDI Data Nodes (1 scale unit); PDW Compute Nodes (1 scale unit)]

Page 24:

Smallest (0 TB) to Largest (6 PB)

• Start small with a few-terabyte warehouse
• From 2 compute nodes to 56 compute nodes
• 1 quarter rack up to 7 full racks
• Add capacity up to 6 petabytes

Start Small And Grow

Minimal Downtime

Seamlessly add capacity

Scale from a Quarter Rack with 2 Compute Nodes up to 56 Compute Nodes!
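The two endpoints on this slide are consistent with each other if compute nodes grow linearly with rack space, which is an assumption of the sketch below rather than something the slide states: 2 nodes per quarter rack gives 8 per full rack and 56 at 7 racks. The function name and the quarter-rack step size are hypothetical, and real scale units may differ.

```python
from fractions import Fraction

NODES_PER_QUARTER_RACK = 2  # from the slide: a quarter rack has 2 compute nodes
MAX_RACKS = 7               # from the slide: up to 7 full racks


def compute_nodes(racks):
    """Compute nodes for a given number of racks, assuming linear growth."""
    racks = Fraction(racks)
    if racks < Fraction(1, 4) or racks > MAX_RACKS:
        raise ValueError("supported range is 1/4 rack to 7 racks")
    quarter_racks = racks * 4
    if quarter_racks.denominator != 1:
        # Assumed step size: capacity is added in whole quarter racks.
        raise ValueError("rack count must be a multiple of 1/4")
    return int(quarter_racks) * NODES_PER_QUARTER_RACK


print(compute_nodes(Fraction(1, 4)), compute_nodes(1), compute_nodes(7))
```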

Page 25:

Bringing Hadoop to a billion people

Page 26:

Making advanced analytics accessible to Hadoop

Cloud

Page 27:

Power BI, Excel Power Map, Power Query, …

Page 28:

Hadoop Scenario 1: pre-process ETL

Page 29:

Hadoop Scenario 2: hot and cold storage

Hot data in DW

Cold data in Hadoop
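One way to picture the hot/cold split is a routing rule based on row age. The sketch below is a minimal illustration under stated assumptions: the 90-day window, the row layout and the function name are all hypothetical, and the slide does not say how the boundary is chosen.

```python
from datetime import date, timedelta

HOT_WINDOW = timedelta(days=90)  # hypothetical cutoff, not from the slide


def route(rows, today):
    """Split rows into hot (data warehouse) and cold (Hadoop) by age."""
    hot, cold = [], []
    for row in rows:
        target = hot if today - row["day"] <= HOT_WINDOW else cold
        target.append(row)
    return hot, cold


rows = [
    {"id": 1, "day": date(2014, 10, 1)},
    {"id": 2, "day": date(2014, 3, 1)},
    {"id": 3, "day": date(2013, 12, 24)},
]
hot, cold = route(rows, today=date(2014, 10, 16))
print([r["id"] for r in hot], [r["id"] for r in cold])
```

Recent rows stay in the warehouse for fast queries, while older rows move to cheaper Hadoop storage and remain reachable when needed.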

Page 30:

Hadoop Scenario 3: true data discovery

Page 31:

Industry Use Cases of Hadoop

• Financial services
• Retail
• Telecom
• Manufacturing
• Healthcare
• Public sector
• Utilities, oil and gas

Page 32:

For more information visit: http://azure.microsoft.com/en-us/services/hdinsight/

Page 33:

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.