© Copyright 2015 EMC Corporation. All rights reserved. Hadoop-as-a-Service (HDaaS): Flexible and Scalable Reference Architecture. Lena Frank, Systems Engineer @ EMC; Marius Lohr, Systems Engineer @ EMC

Upload: daniela-wagner

Post on 06-Apr-2016


TRANSCRIPT

Page 1:

Hadoop-as-a-Service (HDaaS)
Flexible and scalable reference architecture
Lena Frank, Systems Engineer @ EMC
Marius Lohr, Systems Engineer @ EMC

Page 2:

Case study: CIO of a DAX-listed company

Classic IT services:

New IT services:

Page 3:

The opportunities

New business areas

Risk reduction

Improvement of operational business

Revenue growth

Page 4:

The challenges

Fast deployment

Multiple tenants, requirements, and workloads

High availability and data security

Cost pressure from cloud providers

Lack of knowledge about Hadoop infrastructures

Page 5:

Classic Hadoop architecture

[Diagram: Sqoop, Mahout, Hive, HBase, and Pig on top of a cluster connected via Ethernet. Labeled services: NameNode, secondary NameNode, JobTracker, TaskTracker, DataNode. Six servers each act as a combined Data Node + Compute Node.]

Page 6:

Classic Hadoop architecture

• Dedicated server environment with local storage
  • Hardware and capacity are intended for Hadoop data only

• Efficiency
  • Poor CPU utilization because the cluster is sized for peak load
  • Triple mirroring (300% raw capacity) imposed by the Hadoop architecture

• Scalability
  • Fixed ratio of compute nodes to data nodes

• Enterprise-class services
  • No data protection concepts such as snapshots, replication, or backup
  • No logical separation of tenants
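The "300% raw capacity" figure follows directly from keeping three full copies of every block. A minimal sketch of that arithmetic, assuming a plain replication model with no compression or erasure coding; the function names are hypothetical and chosen only for illustration:

```python
def raw_capacity_tb(usable_tb: float, replication_factor: int = 3) -> float:
    """Raw disk capacity needed to hold `usable_tb` of data as N full copies.

    Assumption: simple N-way replication, as on the slide (3 copies = 300% raw).
    """
    if usable_tb < 0 or replication_factor < 1:
        raise ValueError("usable_tb must be >= 0 and replication_factor >= 1")
    return usable_tb * replication_factor


def storage_efficiency(replication_factor: int = 3) -> float:
    """Fraction of raw capacity that holds unique data."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be >= 1")
    return 1 / replication_factor


if __name__ == "__main__":
    # 100 TB of data with three copies needs three times the raw capacity.
    print(raw_capacity_tb(100))
    print(round(storage_efficiency(3), 2))
```

With three copies, only about a third of the purchased disk capacity holds unique data, which is the efficiency concern the slide raises.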

Page 7:

Hadoop architecture with consolidated HDFS storage

[Diagram: Sqoop, Pig, Mahout, Hive, and HBase on top of six compute nodes connected via Ethernet. Labeled services: NameNode, JobTracker, TaskTracker, DataNode. A consolidated storage system is labeled HDFS, name node, and data node.]

Page 8:

Project Serengeti

• Open-source project

• Fast deployment of Hadoop clusters in virtual environments

[Diagram labels: vCenter, management server, templates, VMs, vSphere + Serengeti, hosts, Hadoop nodes.]

Page 9:

Hadoop-as-a-Service reference architecture

[Diagram, split into a virtual and a physical layer. Labels: Self Service Portal, Orchestration & Chargeback, User Management, Infrastructure Mgmt, vCenter, Serengeti, Hadoop, three compute nodes, and HDFS storage with name node and data node.]

Page 10:

HDaaS Workflow

1: Request
2: Validate
3: Invoke
4a: Provision Storage
4b: Provision Compute
5: Instantiate
6: Notify
7: Access and Analyze

[Diagram components: Data Scientist, Self Service Portal, Orchestrator, User/Tenant Mgmt (AD), Serengeti, Shared HDFS Storage (HDFS/REST API), Hadoop cluster (Pivotal HD master and HD workers).]
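The numbered workflow can be read as a small orchestration pipeline. The following is a minimal in-memory sketch, not the actual portal or Serengeti API: every class, method, and path name here is hypothetical, the dictionary stands in for the AD lookup, and the assignment of steps to components is an assumption based on the diagram labels.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClusterRequest:
    user: str
    tenant: str
    workers: int
    storage_tb: int


@dataclass
class Orchestrator:
    # Hypothetical stand-in for the AD / tenant directory: user -> tenant.
    directory: dict
    log: list = field(default_factory=list)

    def validate(self, req: ClusterRequest) -> bool:
        """Step 2: check that the user belongs to the tenant and the size is sane."""
        return self.directory.get(req.user) == req.tenant and req.workers > 0

    def provision_storage(self, req: ClusterRequest) -> str:
        """Step 4a: reserve a tenant-scoped path on the shared HDFS storage."""
        path = f"/tenants/{req.tenant}/{req.user}"  # illustrative path layout
        self.log.append(f"4a: provision storage {path} ({req.storage_tb} TB)")
        return path

    def provision_compute(self, req: ClusterRequest) -> list:
        """Step 4b: ask the virtualization layer for one master and N workers."""
        nodes = ["master"] + [f"worker-{i}" for i in range(1, req.workers + 1)]
        self.log.append(f"4b: provision compute {len(nodes)} VMs")
        return nodes

    def handle(self, req: ClusterRequest) -> Optional[dict]:
        """Run steps 1 to 6; step 7 (access and analyze) is up to the user."""
        self.log.append(f"1: request from {req.user}")
        if not self.validate(req):
            self.log.append("2: validate failed")
            return None
        self.log.append("2: validate ok")
        self.log.append("3: invoke orchestration")
        path = self.provision_storage(req)
        nodes = self.provision_compute(req)
        self.log.append("5: instantiate Hadoop cluster")
        self.log.append(f"6: notify {req.user}")
        return {"hdfs_path": path, "nodes": nodes}


if __name__ == "__main__":
    orch = Orchestrator(directory={"alice": "tenant-a"})
    cluster = orch.handle(ClusterRequest("alice", "tenant-a", workers=2, storage_tb=10))
    print(cluster)
    print("\n".join(orch.log))
```

Storage and compute are provisioned by separate calls, which mirrors the decoupling of shared HDFS storage from the virtualized compute nodes in the reference architecture.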

Page 11:

Advantages of a decoupled and virtualized Hadoop infrastructure

Independent scaling of the infrastructure
• Compute and data nodes can be expanded independently of each other

Better use of the IT infrastructure
• >80% storage utilization, improved CPU utilization
• Parallel workloads from non-Hadoop applications on the same hardware

Automated provisioning and simple management
• Consolidated HDFS storage
• Compute templates as the basis for fast deployment

Tenant separation
• Logical separation of data access
• Logical separation of compute nodes

Additional data protection
• Snapshots, replication, backup

[Diagram: Hadoop-as-a-Service reference architecture with a data scientist, virtualized Hadoop clusters, and shared HDFS storage.]
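To make the independent-scaling argument concrete, here is a rough sizing sketch. It assumes that in a coupled cluster every node contributes both CPU and disk, so the node count is driven by whichever resource runs out first, and it interprets the ">80% storage utilization" figure as the ratio of usable to raw capacity on the shared storage. The function names and all input numbers are hypothetical.

```python
import math


def coupled_nodes(cores_needed: int, raw_tb_needed: float,
                  cores_per_node: int, tb_per_node: float) -> int:
    """Nodes needed when each server is both compute node and data node."""
    by_cpu = math.ceil(cores_needed / cores_per_node)
    by_disk = math.ceil(raw_tb_needed / tb_per_node)
    return max(by_cpu, by_disk)


def decoupled_sizing(cores_needed: int, usable_tb_needed: float,
                     cores_per_node: int, storage_utilization: float = 0.8):
    """Compute nodes and raw shared-storage capacity, sized separately."""
    if not 0 < storage_utilization <= 1:
        raise ValueError("storage_utilization must be in (0, 1]")
    compute_nodes = math.ceil(cores_needed / cores_per_node)
    raw_tb = usable_tb_needed / storage_utilization
    return compute_nodes, raw_tb


if __name__ == "__main__":
    usable_tb = 200
    # Coupled cluster with triple replication: raw need is 3x the usable data.
    print(coupled_nodes(64, usable_tb * 3, cores_per_node=16, tb_per_node=24))
    # Decoupled: compute sized by CPU only, storage sized by utilization only.
    print(decoupled_sizing(64, usable_tb, cores_per_node=16))
```

In this made-up example the coupled cluster is sized almost entirely by disk, leaving most of its CPU idle, while the decoupled design needs only enough compute nodes for the workload.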

Page 12:

EMC Scale-Out Data Lake Foundation

[Diagram: traditional and next-gen workloads (HPC, Backup/Archive, Analytics, Mobile, File Shares, Cloud Apps) shown with separate storage types: DAS, Cloud, Object, Tape, SAN, NAS.]

Page 13:

EMC Scale-Out Data Lake Foundation

[Diagram: the Data Lake Foundation shown together with the storage types Tape, NAS, DAS, Cloud, SAN, and Object, serving traditional and next-gen workloads (HPC, Backup/Archive, Analytics, Mobile, File Shares, Cloud Apps).]

Page 14:

Next-Gen Access Methods

[Diagram: access methods (labeled FILE) for the workloads HPC, Backup/Archive, Analytics, Mobile, File Shares, and Cloud Apps.]

Page 15:

Expanded Enterprise-Grade Features

DATA PROTECTION

DATA SECURITY PERFORMANCE MANAGEMENT

DATA MANAGEMENT

Isilon Data Lake Foundation

Page 16:

Do you have any questions?