Frontier Big Data
Talk presented at the Frontier Big Data meetup at Elance headquarters in Mountain View
© 2013 EUCALYPTUS SYSTEMS, INC.
Big Data: Public, Private or Hybrid?
Neil Soman, Co-Founder
First there was cloud
• 2006: AWS launched
• 2007/2008: Eucalyptus started as a National Science Foundation project
– First private cloud computing IaaS platform; picked the AWS API
• 2009: Eucalyptus Systems launched
– Open source
– Today: v3.4, 2 rewrites, 7+ services, compute, network, storage
• 2007+: AWS gains rapid traction and becomes the industry leader
Cloud & Big Data
Cloud Trends
Why not just run on bare metal?
• End users need a level of control
• Reduces burden on admins
• Better tenancy: can be flexible and run other tasks
• Easily scale out as needed
• Virtualization is getting better, overhead getting lower
• Hypervisors adding more features: VirtIO, Passthrough
What characteristics are essential?
Big Data Considerations
• QoS/SLAs
• VM placement: Physical characteristics are important
• Gather performance data
– E.g. CloudWatch
• Scale out as needed based on performance data
– E.g. Autoscaling
• Ability to attach local disks
• Huge/XXL instances, SSDs
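The placement consideration above can be illustrated with a minimal sketch. It assumes a scheduler that filters hosts by physical characteristics (cores, local disk, SSD) and then prefers a requested rack. The `Host`, `VmRequest`, and `place` names are hypothetical and are not part of Eucalyptus or AWS.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    rack: str
    free_cores: int
    free_local_disk_gb: int
    has_ssd: bool


@dataclass
class VmRequest:
    cores: int
    local_disk_gb: int
    needs_ssd: bool = False
    preferred_rack: Optional[str] = None  # e.g. the rack that holds the data set


def place(request: VmRequest, hosts: List[Host]) -> Optional[Host]:
    """Pick a host that satisfies the request, preferring the requested rack."""
    candidates = [
        h for h in hosts
        if h.free_cores >= request.cores
        and h.free_local_disk_gb >= request.local_disk_gb
        and (h.has_ssd or not request.needs_ssd)
    ]
    if not candidates:
        return None
    # Hosts in the preferred rack sort first; ties go to the host with the most free cores.
    candidates.sort(key=lambda h: (h.rack != request.preferred_rack, -h.free_cores))
    return candidates[0]
```

A real cloud would also weigh network topology and tenancy; this sketch only shows why physical host attributes have to be visible to the placement logic.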
Big Data & Networking
• Subnet stickiness (Static IPs)
• Multiple NICs for VMs
• Equal cost links between VMs
• High I/O (e.g. bonded interface)
• Built into cloud fabric
Big Data & Storage
• High performance Block Storage
• Ability to use local disk
• Hypervisor I/O optimizations
• Ability to control VM locality
• Object Storage
– To store results
– Scale out & capacity
– Load balanced
Hybrid as IaaS future
• Private & Public clouds used together
• Collection of features, policies, control mechanisms to enable this
• API compatibility
• Semantic compatibility
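One possible hybrid policy, sketched under assumptions: fill private capacity first and send the overflow to the public cloud. The `split_burst` function and its parameters are hypothetical illustrations, not a Eucalyptus feature.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    private: int
    public: int


def split_burst(requested: int, private_free: int, public_limit: int) -> Placement:
    """Fill private capacity first and send the overflow to the public cloud."""
    if requested < 0 or private_free < 0 or public_limit < 0:
        raise ValueError("counts must be non-negative")
    on_private = min(requested, private_free)
    # Overflow is capped by a budget or quota on the public side.
    on_public = min(requested - on_private, public_limit)
    return Placement(private=on_private, public=on_public)
```

Because both sides expose the same API and semantics, the policy only has to decide where instances go, not how to launch them.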
Eucalyptus & AWS
• API compatible with EC2, EBS, S3, IAM, CloudWatch, Elastic Load Balancing, Autoscaling
– Coming up: EMR, VPC, STS, SNS, Elastic Beanstalk, …
• Netflix Cloud Prize for the Best Portability Enhancement
Eucalyptus Architecture
• Reference Architectures: eucalyptus.com/eucalyptus-cloud/iaas/architecture
It Just Works
• Enhanced API compatibility
• Easy installation
• User console
• Ability to import AWS or VMDK images
• Ability to easily upgrade
• High Availability
Eucalyptus & Hybrid
• AWS fidelity
– API & semantics
– Error reporting
• Hybrid console
• SDKs and tools work “out of the box”
• Netflix Asgard and other tools run on Eucalyptus
• Can be extended
– Google Compute Engine?
• Coming up: VPC & federation.
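A rough illustration of what API fidelity means for tooling: the request stays the same and only the endpoint changes. The sketch builds an unsigned EC2 Query API URL for `DescribeInstances`. The private endpoint is a placeholder, the API version is only an example, and request signing is left out.

```python
from urllib.parse import urlencode, urlsplit

# The private endpoint is a placeholder; substitute the real service URL.
PUBLIC_EC2 = "https://ec2.us-east-1.amazonaws.com/"
PRIVATE_EC2 = "https://cloud.example.internal:8773/services/compute"


def describe_instances_url(endpoint: str, api_version: str = "2013-02-01") -> str:
    """Build an unsigned EC2 Query API request URL for DescribeInstances."""
    query = urlencode({"Action": "DescribeInstances", "Version": api_version})
    return f"{endpoint}?{query}"


public = describe_instances_url(PUBLIC_EC2)
private = describe_instances_url(PRIVATE_EC2)
# Only the endpoint differs; the request parameters are identical.
same_query = urlsplit(public).query == urlsplit(private).query
```

SDKs typically expose this as an endpoint setting, which is why existing tools can be pointed at a compatible private cloud without code changes.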
Big Data & Eucalyptus Storage
• Block Storage for data
– Use local storage when possible
– Very sensitive to placement
• Object storage for input & results
– E.g. EMR
Block Storage
• Ephemeral/Local Disk Storage
– Ability to specify VM types
– Ability to use large instances
– QoS/SLAs
• E.g. high-IOPS instances
– Ability to specify VM placement
Object Storage
• High-performance, scalable storage
• Scale out
• Integrated with Identity & Access Management (IAM)
– Ability to specify policies
• Easy to add capacity
• Interfaces with third party object stores
– RiakCS, Ceph, Swift
• Interfaces with Amazon S3
– Local storage can be treated as a cache or for reduced redundancy
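The idea of treating local storage as a cache in front of a remote object store can be sketched as a read-through cache. This is an illustration only: `ReadThroughCache` is a hypothetical class, the remote store is a plain dictionary standing in for S3, and eviction is not modeled.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve objects from a local store, falling back to a remote store on a miss."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._fetch_remote = fetch_remote
        self._local: Dict[str, bytes] = {}
        self.remote_reads = 0

    def get(self, key: str) -> bytes:
        if key not in self._local:
            # Miss: read from the remote store once and keep a local copy.
            self.remote_reads += 1
            self._local[key] = self._fetch_remote(key)
        return self._local[key]


# Stand-in for the remote object store; a real deployment would call S3 here.
remote = {"results/part-0": b"alpha", "results/part-1": b"beta"}
cache = ReadThroughCache(remote.__getitem__)
first = cache.get("results/part-0")
second = cache.get("results/part-0")  # served locally, no second remote read
```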
Scale out Object Storage
Performance Monitoring & Scaling
• CloudWatch metrics
– Ability to define custom metrics
• Using CloudWatch metrics to scale up & down
• Better ability to reason about performance
– Private cloud gives you better visibility
• Use CW metrics to control Autoscaling
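The metric-driven scaling described above can be sketched as a simple threshold rule. This is a simplified illustration, not the CloudWatch or Autoscaling API: `ScalingPolicy` and `desired_capacity` are hypothetical names, and the samples stand in for a metric such as average CPU utilization.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class ScalingPolicy:
    scale_out_above: float  # average metric value that triggers adding capacity
    scale_in_below: float   # average metric value that triggers removing capacity
    min_instances: int
    max_instances: int
    step: int = 1


def desired_capacity(current: int, samples: Sequence[float], policy: ScalingPolicy) -> int:
    """Return the new instance count for one evaluation period."""
    if not samples:
        return current  # no data, so leave capacity unchanged
    avg = mean(samples)
    if avg > policy.scale_out_above:
        target = current + policy.step
    elif avg < policy.scale_in_below:
        target = current - policy.step
    else:
        target = current
    # Clamp to the group's configured bounds.
    return max(policy.min_instances, min(policy.max_instances, target))
```

For example, with thresholds of 75 and 25 and bounds of 2 to 10 instances, a period averaging 85 moves a group of 4 to 5, while a period averaging 50 leaves it at 4.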
Customer Case Study
• Dev/Test for large enterprises: 12k cores, 20k by the end of the year, 60k next year
• Adopted AWS APIs
• Storage
– EBS & Object
– Integration with existing on-premise storage infrastructure
• User console
• Needs three Eucalyptus-trained admins
Where are we going?
• Hybrid = Private + Public
– Either for bursting or production deployment
– Dev/Test
– Scalable Web Services
• Big Data needs key features
• Storage
– Combination of Block & Object Storage
– Object stores becoming more prevalent
• Do not force APIs onto users, fight “interesting” battles
THANK YOU
Neil Soman