frontier bigdata


Upload: neil-soman

Post on 03-Jul-2015

Category: Technology

DESCRIPTION

Talk presented at Frontier Big Data meetup at Elance headquarters in Mountain View

TRANSCRIPT

Page 1: Frontier bigdata
Page 2: Frontier bigdata

© 2013 EUCALYPTUS SYSTEMS, INC.

Big Data: Public, Private or Hybrid?

Neil Soman

Co-Founder

[email protected]

Page 3: Frontier bigdata

First there was cloud

• 2006: AWS launched

• 2007/2008: Eucalyptus started as a National Science Foundation project

– First private cloud computing IaaS platform; picked the AWS API

• 2009: Eucalyptus Systems launched

– Open Source

– Today: v3.4, 2 rewrites, 7+ services, compute, network, storage

• 2007+: AWS gains rapid traction and becomes industry leader

Page 4: Frontier bigdata

Cloud & Big Data

Page 5: Frontier bigdata

Cloud Trends

Page 6: Frontier bigdata

Why not just run on bare metal?

• Need level of control for end users

• Reduces burden on admins

• Better tenancy: can be flexible and run other tasks

• Easily scale out as needed

• Virtualization is getting better, overhead getting lower

• Hypervisors adding more features: VirtIO, Passthrough

Page 7: Frontier bigdata

What characteristics are essential

Page 8: Frontier bigdata

Big Data Considerations

• QoS/SLAs

• VM placement: Physical characteristics are important

• Gather performance data

– E.g. CloudWatch

• Scale out as needed based on performance data

– E.g. Autoscaling

• Ability to attach local disks

• Huge/XXL instances, SSDs

Page 9: Frontier bigdata

Big Data & Networking

• Subnet stickiness (Static IPs)

• Multiple NICs for VMs

• Equal cost links between VMs

• High I/O (e.g. bonded interface)

• Built into cloud fabric

Page 10: Frontier bigdata

Big Data & Storage

• High performance Block Storage

• Ability to use local disk

• Hypervisor I/O optimizations

• Ability to control VM locality

• Object Storage

• To store results

• Scale out & capacity

• Load balanced

Page 11: Frontier bigdata

Hybrid as IaaS future

• Private & Public clouds used together

• Collection of features, policies, control mechanisms to enable this

• API compatibility

• Semantic compatibility
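One way to picture private and public clouds being used together under a simple policy is a "private first, burst to public" placement rule. The sketch below is only an illustration under stated assumptions: the `Cloud` class, the `place_job` function, and the core counts are hypothetical and are not part of Eucalyptus or AWS.

```python
from dataclasses import dataclass


@dataclass
class Cloud:
    name: str        # hypothetical label, e.g. "private" or "public"
    free_cores: int  # cores currently available on this cloud


def place_job(cores_needed: int, private: Cloud, public: Cloud) -> str:
    """Prefer the private cloud; burst to the public one when it is full."""
    for cloud in (private, public):
        if cloud.free_cores >= cores_needed:
            cloud.free_cores -= cores_needed
            return cloud.name
    raise RuntimeError("no cloud has enough free capacity")


# Illustrative numbers only.
private = Cloud("private", free_cores=16)
public = Cloud("public", free_cores=64)
placements = [place_job(8, private, public) for _ in range(3)]
print(placements)
```

With compatible APIs and semantics on both sides, a policy like this can stay in one place while the workloads themselves remain unchanged.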

Page 12: Frontier bigdata

Eucalyptus & AWS

• API compatible with EC2, EBS, S3, IAM, CloudWatch, Elastic Load Balancing, Autoscaling

– Coming up: EMR, VPC, STS, SNS, Elastic Beanstalk, …

• Netflix Cloud Prize for the Best Portability Enhancement

Page 13: Frontier bigdata

Eucalyptus Architecture

• Reference Architectures: eucalyptus.com/eucalyptus-cloud/iaas/architecture

Page 14: Frontier bigdata

It Just Works

• Enhanced API compatibility

• Easy installation

• User console

• Ability to import AWS or VMDK images

• Ability to easily upgrade

• High Availability

Page 15: Frontier bigdata

Eucalyptus & Hybrid

• AWS fidelity

– API & semantics

– Error reporting

• Hybrid console

• SDKs and tools work “out of the box”

• Netflix Asgard and other tools run on Eucalyptus

• Can be extended

– Google Compute Engine?

• Coming up: VPC & federation.
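The claim that SDKs and tools work "out of the box" amounts to saying that only the endpoint changes, not the request. A minimal sketch of that idea follows; the profile names, URLs, and region labels are placeholders invented for illustration, and the helper only builds a request description rather than calling any real SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    url: str
    region: str


# Hypothetical profiles: the URLs and region names are placeholders.
PROFILES = {
    "public": Endpoint("https://ec2.public.example.com", "region-a"),
    "private": Endpoint("https://ec2.private.example.internal", "region-b"),
}


def describe_instances_request(profile: str) -> dict:
    """Build the same logical request for whichever cloud is selected."""
    endpoint = PROFILES[profile]
    return {
        "url": endpoint.url,
        # The action name is identical on both sides; only the URL differs.
        "params": {"Action": "DescribeInstances"},
    }


for name in PROFILES:
    print(name, describe_instances_request(name))
```

Keeping the endpoint in configuration is what lets the same tooling target either side of a hybrid deployment.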

Page 16: Frontier bigdata

Big Data & Eucalyptus Storage

• Block Storage for data

– Use local storage when possible

– Very sensitive to placement

• Object storage for input & results

– E.g. EMR

Page 17: Frontier bigdata

Block Storage

• Ephemeral/Local Disk Storage

– Ability to specify VM types

– Ability to use large instances

– QoS/SLAs

• E.g. High IOPS instances

– Ability to specify VM placement
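To make "ability to specify VM placement" concrete, here is a small sketch of a locality-aware placement rule that tries to fit a whole job onto one rack. It is an assumption-laden illustration: the `Host` fields, the rack names, and the `place_on_one_rack` function are hypothetical and do not reflect how any particular cloud scheduler is implemented.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    rack: str        # hypothetical physical location label
    free_slots: int  # how many more VMs this host can run


def place_on_one_rack(hosts: List[Host], vms_needed: int) -> Optional[Dict[str, int]]:
    """Return {host name: VM count} using a single rack, or None if no rack fits."""
    by_rack = defaultdict(list)
    for host in hosts:
        by_rack[host.rack].append(host)
    # Racks are tried in name order; the first one with enough room wins.
    for _rack, members in sorted(by_rack.items()):
        if sum(h.free_slots for h in members) >= vms_needed:
            plan, remaining = {}, vms_needed
            for h in members:
                take = min(h.free_slots, remaining)
                if take:
                    plan[h.name] = take
                    remaining -= take
            return plan
    return None


hosts = [
    Host("a1", "rack-a", 2), Host("a2", "rack-a", 1),
    Host("b1", "rack-b", 4), Host("b2", "rack-b", 2),
]
print(place_on_one_rack(hosts, 4))
```

Keeping the VMs of one job on the same rack is one simple way to honour the point that block storage performance is sensitive to placement.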

Page 18: Frontier bigdata

Object Storage

• High-performance, scalable storage

• Scale out

• Integrated with Identity & Access Management (IAM)

– Ability to specify policies

• Easy to add capacity

• Interfaces with third party object stores

– RiakCS, Ceph, Swift

• Interfaces with Amazon S3

– Local storage can be treated as a cache or for reduced redundancy
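The idea of treating local storage as a cache in front of a remote object store can be sketched as a read-through cache with least-recently-used eviction. This is an illustration only: the `ReadThroughCache` class is hypothetical, a plain dictionary stands in for the remote store, and capacity is counted in objects rather than bytes.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Serve reads locally when possible; fetch from the remote store on a miss."""

    def __init__(self, remote: dict, capacity: int):
        self.remote = remote          # stand-in for a remote object store
        self.capacity = capacity      # max number of objects kept locally
        self.local = OrderedDict()    # ordered oldest to most recently used
        self.remote_reads = 0

    def get(self, key: str) -> bytes:
        if key in self.local:
            self.local.move_to_end(key)  # mark as most recently used
            return self.local[key]
        value = self.remote[key]
        self.remote_reads += 1
        self.local[key] = value
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict the least recently used
        return value


remote_store = {"input/part-0": b"aaa", "input/part-1": b"bbb"}
cache = ReadThroughCache(remote_store, capacity=1)
cache.get("input/part-0")
cache.get("input/part-0")  # served locally, no second remote read
print(cache.remote_reads)
```

A write path and consistency handling are deliberately left out; the sketch only covers the read side.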

Page 19: Frontier bigdata

Scale out Object Storage

Page 20: Frontier bigdata

Performance Monitoring & Scaling

• CloudWatch metrics

– Ability to define custom metrics

• Using CloudWatch metrics to scale up & down

• Better ability to reason about performance

– Private cloud gives you better visibility

• Use CW metrics to control Autoscaling
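Using metrics to drive scaling can be sketched as a simple threshold rule over recent samples. The function below is a hedged illustration, not the CloudWatch or Autoscaling API: the name `desired_capacity`, the thresholds, and the group size limits are all assumptions chosen for the example.

```python
from statistics import mean
from typing import Sequence


def desired_capacity(
    current: int,
    samples: Sequence[float],
    high: float = 70.0,   # hypothetical scale-out threshold (e.g. % CPU)
    low: float = 30.0,    # hypothetical scale-in threshold
    min_size: int = 1,
    max_size: int = 10,
) -> int:
    """Add one instance when the average is high, remove one when it is low."""
    avg = mean(samples)
    if avg > high:
        target = current + 1
    elif avg < low:
        target = current - 1
    else:
        target = current
    # Never leave the allowed group size range.
    return max(min_size, min(max_size, target))


print(desired_capacity(2, [80.0, 90.0, 85.0]))
print(desired_capacity(2, [10.0, 20.0]))
```

A custom metric, such as queue depth for a batch job, could be fed into the same rule in place of CPU utilization.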

Page 21: Frontier bigdata

Customer Case Study

• Dev/Test for large enterprises: 12k cores, 20k by the end of the year, 60k next year

• Adopted AWS APIs

• Storage

– EBS & Object

– Integration with existing on-premise storage infrastructure

• User console

• Needs three Eucalyptus-trained admins

Page 22: Frontier bigdata

Where are we going?

• Hybrid = Private + Public

– Either for bursting or production deployment

– Dev/Test

– Scalable Web Services

• Big Data needs key features

• Storage

– Combination of Block & Object Storage

– Object stores becoming more prevalent

• Do not force APIs onto users, fight “interesting” battles


Page 23: Frontier bigdata

THANK YOU

Neil Soman

[email protected]