2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Upload: eucalyptus-systems-inc

Post on 23-Jun-2015


Page 1: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski

© 2012 Eucalyptus Systems, Inc. -- confidential

Eucalyptus Architecture and Implementation

Rich Wolski, CTO

March 1, 2012

Page 2: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Eucalyptus Multi-tiered Service Architecture

[Diagram: tiered service architecture. Labels: User Transactions (one box), Inventory and Scheduling (three boxes), Actualization (five boxes); flows: User Requests, Service Delivery]

Page 3: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Eucalyptus Components

•  Cloud Controller (CLC)
   –  User request processing (except for Walrus), credentials management, VM (instance) state management
•  Walrus (S3)
   –  S3 user request processing; append-only, Put/Get object storage
•  Cluster Controller (CC)
   –  VM inventory, network provisioning/security group implementation
•  Storage Controller (SC)
   –  Block-level, network-attached storage (SAN and Linux)
•  Node Controller (NC)
   –  Hypervisor interface and control, VM launch/decommissioning
•  VMware Broker
   –  Gateway between CC and ESX and/or vSphere for VMware

Page 4: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Component Architecture

[Diagram labels: CLC, Walrus; CC and SC (three pairs); NC/VMWareB (five nodes); flows: User Requests, Service Delivery]

Page 5: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Eucalyptus Generations

•  Eucalyptus 1.X (June 08 through Sep. 10)
   –  University code
•  Eucalyptus 2.X (June 10 through Feb. 11)
   –  Commercial focus, early production
•  Eucalyptus 3.X (present - )
   –  Production operational improvements
   –  Full commercial feature set (almost)
•  Few, if any, features deprecated
   –  BitTorrent?

Page 6: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


New Eucalyptus 3.0 Features

•  High availability (HA) of the Eucalyptus service
   –  Hot fail-over and repair for all components except NC
•  AWS Identity and Access Management (IAM) API plus extensions for private clouds
   –  Quotas and metering
•  Eucalyptus Block Storage improvements
   –  AWS volume-backed instance API (persistent, “bootable” instances)
   –  NetApp and JBOD support added to existing Dell EqualLogic
•  Full support for Windows images
   –  Seven different versions, AWS-compatible authentication, sysprep, ephemeral disk
•  Accounting/usage reporting
   –  Charge-back interface linked to quotas

Page 7: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Eucalyptus 3.0 Platform Improvements

•  Revamped image caching in the NC
   –  Faster instance starts using copy-on-write
•  Refactored VMware broker
   –  Faster and more robust image preparation, support for vSphere 4.X, improved scale, more extensive deployment topologies
•  Extended Linux distro support
   –  RHEL 5 and RHEL 6, packages for Canonical LTS (Ubuntu 10.04)
•  Substantial improvement in automated QA
   –  Full QA sequence is 5 days (features + distros + hypervisors + deployment topologies + networking modes)
•  Re-designed administrative web UI
•  Improved command-line admin tools
•  Re-designed packaging, upgrade, and dependency management
•  Re-designed installation mechanism (package repositories)

Page 8: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Eucalyptus in The Wild

•  Eucalyptus 2.0 deployments
   –  Games, mobile infrastructure, media, telecom
•  Tons of feedback
   –  Not all of it angry
•  Top 3
   –  Platform HA -> VM connectivity and request service
   –  Quotas, accounting, reporting
   –  Windows (fast image creation and start)

Page 9: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


High Availability

•  Eliminate single points of failure
   –  Host failure
   –  Network connectivity failure (including network partitions)
•  Tolerate as many multiple-failure cases as possible
•  Avoid data loss at all costs
   –  Fail stop is better than data loss
•  Availability of the services that Eucalyptus offers
   –  Eucalyptus requests
   –  VM connectivity and storage
   –  Not VM HA -> application level

Page 10: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


HA Web Service Architecture

•  All Eucalyptus components are implemented as Web Services
   –  CLC, Walrus, SC, VMware Broker: Java
   –  CC and NC: C
•  CC and NC are each implemented in a separate Axis2/C service container
•  CLC, Walrus, SC, and VMware Broker share a web service stack and JVM when co-located

Page 11: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


PoC Configuration

[Diagram: two Linux front-end hosts, each labeled Web Service, DB management, CLC, Walrus, SC, VMWb, and CC; five Linux hosts each running an NC; two further Walrus labels]

Page 12: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Multi-component Failure

[Diagram: one Linux front-end host labeled Web Service, DB management, CLC, SC, VMWb, and CC; a second labeled Web Service, DB management, CLC, Walrus, SC, VMWb, and CC; five Linux hosts each running an NC; additional component labels: Walrus, CLC, SC, VMWb, CC]

Page 13: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Production

[Diagram labels: CLC (two); Wal and SC (two pairs); CC and VMb (two pairs); five Linux hosts each running an NC]

Page 14: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Group Membership and Heartbeat

•  HA is from the perspective of the “master” CLC
•  JGroups determines which machines are “up”
   –  The network connecting the “up” machines is unpartitioned
•  Heartbeat determines which services are available within the “up” group
•  Back-up CLC monitors the “up” group to determine if it contains a master
   –  If not, it becomes the master
•  Master and back-up DBs kept synced
   –  Resync when failed CLC is restored
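The promotion rule in the bullets above can be sketched as a small decision function. This is a minimal illustration, not Eucalyptus code: the `Service` record, the role strings, and both function names are hypothetical, and the "up" group is simply passed in as a set of host names rather than obtained from JGroups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    host: str      # machine the CLC runs on (hypothetical field)
    role: str      # "master" or "backup" (hypothetical labels)
    healthy: bool  # outcome of the heartbeat check


def up_group_has_master(up_hosts, services):
    """True if a healthy master CLC runs on a host inside the 'up' group."""
    return any(
        s.role == "master" and s.healthy and s.host in up_hosts
        for s in services
    )


def backup_should_promote(me, up_hosts, services):
    """A back-up promotes itself only if it is 'up' and sees no master."""
    if me not in up_hosts:
        return False  # a partitioned-off back-up must not take over
    return not up_group_has_master(up_hosts, services)
```

For example, with a master on `clc-a` and a back-up on `clc-b`, the back-up stays passive while `clc-a` is in the "up" group and promotes itself once `clc-a` drops out of it.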

Page 15: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Interesting Wrinkles

•  CLC and Walrus have externally visible URLs
   –  DNS remapping service is built into the CLC
•  What happens if the master loses connectivity with the user?
   –  Back-up may have an alternative path to user
   –  If DNS remaps, and the back-up becomes active, the system may experience a “split brain”
      •  Fail stop
      •  Arbitrator service
•  Multi-failure can cause split brain
   –  Master fails over, new master fails before the original is back, original then brought up => fail stop
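One way to read the multi-failure case is as a start-up rule: a restored CLC may serve only if its database is known to be current. The sketch below encodes that reading. It is an assumption, not the documented Eucalyptus behavior; the function name, its two boolean inputs, and the returned action strings are all hypothetical.

```python
def startup_action(was_last_master: bool, peer_alive: bool) -> str:
    """Decide what a CLC does when it is brought back up.

    was_last_master: this CLC was the most recent active master, so its
                     database is assumed to be current (assumption).
    peer_alive:      the other CLC is reachable and can serve as a
                     resync source.
    """
    if peer_alive:
        # A live peer holds the current state: resync and act as back-up.
        return "resync-as-backup"
    if was_last_master:
        # No peer, but this node held the latest state when it went down.
        return "serve"
    # No peer and possibly stale data: stopping is safer than data loss.
    return "fail-stop"
```

In the slide's scenario the original master comes back after the new master has also failed, so `startup_action(False, False)` yields `"fail-stop"`.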

Page 16: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


IAM, Quotas, and Reporting

•  IAM is AWS “Identity and Access Management”
   –  Accounts and users, and groups of users
   –  JSON-based policies define the calls that users and groups can execute
   –  Also possible to attach policies to resources (S3 buckets for now)
•  Eucalyptus extends the IAM predicates with inequalities
   –  Implements quotas as tests against IAM policies
•  Resource usage information exportable in a variety of formats and through the GUI

Page 17: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


For Example

[Diagram labels: eucalyptus, support, sales, dev; EC2 image permission; S3 bucket ACL; quota (twice)]

{
    "Version": "2012-02-12",
    "Statement": [{
        "Sid": "2",
        "Effect": "Limit",
        "Action": "ec2:RunInstances",
        "Resource": "*",
        "Condition": {
            "NumericLessThanEquals": {
                "ec2:quota-vminstancenumber": "256"
            }
        }
    }]
}
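A quota expressed this way can be checked with a few lines of code. The sketch below is illustrative only: `within_quota` is a hypothetical helper, and it assumes the condition value is an upper bound on the total instance count after the request is granted, which the slide does not spell out.

```python
# The policy from the slide, written as a Python dict.
POLICY = {
    "Version": "2012-02-12",
    "Statement": [{
        "Sid": "2",
        "Effect": "Limit",
        "Action": "ec2:RunInstances",
        "Resource": "*",
        "Condition": {
            "NumericLessThanEquals": {
                "ec2:quota-vminstancenumber": "256"
            }
        },
    }],
}


def within_quota(policy, action, current_count, requested=1):
    """Return False if granting the request would exceed a 'Limit' statement.

    Assumption: every NumericLessThanEquals value bounds the resulting
    total (current_count + requested) for the named action.
    """
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Limit" or stmt.get("Action") != action:
            continue
        limits = stmt.get("Condition", {}).get("NumericLessThanEquals", {})
        for limit in limits.values():
            if current_count + requested > int(limit):
                return False
    return True
```

With this policy, a user running 255 instances may start one more, while a user already at 256 may not.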

Page 18: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Evaluation Logic

[Flowchart: decision nodes, in order: “Sys admin?”, “Account-level permission satisfied?”, “Account admin or IAM user policy allowed?”, “Allocating resources?”, “Exceeding Quota?”; each node has Yes/No branches ending in Accept or Reject]
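The decision sequence can be written as a chain of early returns. The extracted slide text does not preserve which branch leads where, so the branch mapping below is an assumption (system admins are accepted outright, failed permission checks reject, and quota is checked only when resources are allocated). The function and parameter names are hypothetical.

```python
def evaluate_request(
    is_sys_admin: bool,
    account_permission_ok: bool,
    is_account_admin: bool,
    user_policy_allows: bool,
    allocates_resources: bool,
    exceeds_quota: bool,
) -> str:
    """Walk the five decision nodes in slide order; return 'Accept' or 'Reject'."""
    if is_sys_admin:
        return "Accept"  # assumed: system admins bypass all further checks
    if not account_permission_ok:
        return "Reject"
    if not (is_account_admin or user_policy_allows):
        return "Reject"
    if not allocates_resources:
        return "Accept"  # nothing to count against a quota
    return "Reject" if exceeds_quota else "Accept"
```

Under this reading, an ordinary user whose policy allows `RunInstances` is still rejected once the account's quota would be exceeded.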

Page 19: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Windows

•  Windows images are big
   –  One customer wants 200 GB images
   –  Ephemeral within the C: drive
•  Need a way to use CoW to improve Windows launch time

Page 20: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


The Blob Store

•  Blobs are (sparse) files on the file system
   –  Remember to use ‘ls -s’ to see disk space allocated
   –  Files are mounted on loopback when in use
   –  Future implementation could use LVM volumes instead of files
•  Mapping and copy-on-write snapshots are implemented using the Linux kernel’s device-mapper (same as LVM snapshots)
   –  Once snapshotted or mapped, the file access method cannot be used
   –  I.e., the backing file on disk no longer has the bits you want
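The difference between a sparse file's apparent size and the space actually allocated, which is what ‘ls -s’ reports, can be seen with a short script. This is a generic POSIX illustration, not blob store code: the helper names are made up, and `st_blocks` is counted in 512-byte units on Linux and is not available on Windows.

```python
import os
import tempfile


def make_sparse_file(path, size_bytes):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def apparent_and_allocated(path):
    """Return (apparent size, bytes actually allocated on disk)."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks is in 512-byte units


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blob = os.path.join(tmp, "blob.img")
        make_sparse_file(blob, 64 * 1024 * 1024)
        apparent, allocated = apparent_and_allocated(blob)
        print(f"apparent={apparent} bytes, allocated={allocated} bytes")
```

On a file system that supports sparse files, the allocated figure stays far below the 64 MiB apparent size until data is written.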

Page 21: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Image -> Instance in the NC

Eucalyptus Linux Image on NC

[Diagram: objects move from Walrus into the NC cache area and then into the NC work space. Object labels: EMI, ERI, EKI, EMI + KEY, PT, swap, ephemeral0, zero. Operation labels: download, mkfs.ext3, mkswap, copy, snap, map]

•  NC’s cache keeps objects from Walrus and partitions created from scratch, one per size/type

•  LRU eviction policy for non-pinned objects limits disk use

•  EKI and ERI are copied to work space due to libvirt requirement

•  Other objects are snapshotted, tuned, and then mapped to compose the disk
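The cache policy in these bullets, LRU eviction that skips pinned objects, can be sketched with an ordered dictionary. This is a minimal model under stated assumptions, not the NC implementation: `BlobCache`, its methods, and the object keys in the example are hypothetical, and sizes are plain integers rather than real files.

```python
from collections import OrderedDict


class BlobCache:
    """Size-bounded cache with LRU eviction of non-pinned entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self._items = OrderedDict()  # key -> (size, pinned), oldest first

    def keys(self):
        return list(self._items)

    def put(self, key, size, pinned=False):
        """Add an object, evicting old non-pinned ones; False if it cannot fit."""
        if key in self._items:
            self._items.move_to_end(key)  # a hit refreshes recency
            return True
        pinned_bytes = sum(s for s, p in self._items.values() if p)
        if pinned_bytes + size > self.capacity:
            return False  # would not fit even after evicting everything allowed
        for victim in list(self._items):  # iterate oldest first
            if self.used + size <= self.capacity:
                break
            vsize, vpinned = self._items[victim]
            if not vpinned:
                del self._items[victim]
                self.used -= vsize
        self._items[key] = (size, pinned)
        self.used += size
        return True
```

For instance, in a cache of capacity 10 holding a pinned `emi-a` (4) and an unpinned `eki-b` (4), adding `eri-c` (4) evicts `eki-b` and leaves `emi-a` in place.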

Page 22: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


What’s Next?

•  Eucalyptus 3.1 (Q2)
   –  Refactoring for packaged plug-ins
   –  Postgres instead of MySQL
•  Eucalyptus 3.2 (Q4)
   –  Feature release
   –  Possibilities
      •  ELB, CloudWatch, Auto Scaling
      •  Tags
•  Eucalyptus 4 in 2013 and Eucalyptus 5 in 2014
   –  Application features -> services and API
   –  Operational features -> ease of use, maintenance, performance
•  Please help!
   –  Tell us what Eucalyptus needs and when it needs it

Page 23: 2nd Eucalyptus Bay Area Meet Up with Rich Wolski


Thanks!

•  [email protected]
•  @richwolski

Questions?