
Technical Aspects of Cloud Computing

Chaitan Baru
San Diego Supercomputer Center
Competitive Advantage Through Cloud Computing


Exec Education Course on Cloud Computing, Nov 15, 2011

News flash…
•  “Now Available in the Cloud..”
•  What does this mean?
  ▫  For the developers
  ▫  For the users


Outline

•  The changing context
•  Cloud computing definitions
  ▫  Implementation considerations
•  Application case studies
•  Future directions
•  Course materials:
  ▫  http://clds.sdsc.edu, click on Education
  ▫  Files stored in the SDSC Cloud


The Changing Context
•  Rapid growth in data
  ▫  Data-driven business decisions
•  Scientific workloads as a predictor of future business workloads
  ▫  Sensor-based systems, remote sensing, genome sequencing
•  A point of inflexion
  ▫  Changing software: from RDBMS, to NoSQL, to streaming, to scientific data, …
  ▫  Changing hardware: multi-core, solid-state disk, large memory and new types of memory
  ▫  Changing platforms: wholly-owned (“on premises”) systems vs clouds
  ▫  Changing business costs / models: ultra-high productivity, energy efficiency, rent vs own, …


New sources of increasing data volumes

•  Sensor Networks Top Social Networks for Big Data, Stacey Higginbotham Sep. 13, 2010, http://gigaom.com/cloud/sensor-networks-top-social-networks-for-big-data-2/


Gene Sequencing

•  ~2TB/experiment in next-generation gene sequencing
  ▫  Sequencing of individuals
  ▫  Multiple runs per individual
  ▫  Multiple sequencing over time

•  Managing and Analyzing Next-Generation Sequence Data, Richter BG, Sexton DP (2009). PLoS Comput Biol 5(6): e1000369. doi:10.1371/journal.pcbi.1000369, June 2009


Remote Sensing

•  ~1TB of high-resolution topographic data for San Andreas Fault
  ▫  10x more for imagery
  ▫  Repeated scans for ecological applications
  ▫  OpenTopography.org
•  LaSDI Initiative: Laser Spatial Data Infrastructure
  ▫  Sub-meter to 10cm scale 3D models of earth


Data is the new oil!
•  The data ecosystem
  ▫  From acquisition, to transfer, storage, creation of derived products, and exploitation
•  Those with data are better off than those without
•  Those who can exploit data have the competitive advantage
  ▫  Walmart, FedEx, Wall Street trading, Internet companies (Google, Amazon, Facebook, Twitter, ...)
•  And you cannot find oil without data!
  ▫  Oil exploration data growing from tens of TB to PBs over the next few years


Where should all the data reside?

•  All in your private systems (private cloud)?
•  All in a public cloud?
•  Hybrid model: Private + Public?


Cloud Computing: Definition

• Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction

Another turn of the screw in our push towards productivity


Cloud Computing: NIST Definition
•  On-demand self-service
  ▫  You get the resource when you ask for it, using APIs
•  Broad network access
  ▫  Accessible from anywhere
•  Resource pooling
  ▫  Shared resource provisioning
•  Rapid elasticity
  ▫  Uniform scale-out
•  Measured service
  ▫  Monitoring of usage, reporting of usage
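The characteristics listed above can be made concrete with a small sketch. This is a toy model only, assuming a pool of identical servers; the class name ResourcePool and its methods are hypothetical and do not correspond to any provider's API.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Toy shared pool of identical servers (hypothetical, for illustration only)."""
    capacity: int                                  # total servers shared by all tenants
    allocated: dict = field(default_factory=dict)  # tenant -> servers currently held
    usage: dict = field(default_factory=dict)      # tenant -> metered server-hours

    def free(self) -> int:
        """Resource pooling: every tenant draws from the same free capacity."""
        return self.capacity - sum(self.allocated.values())

    def provision(self, tenant: str, servers: int) -> bool:
        """On-demand self-service: grant the request at once if the pool can."""
        if servers <= 0 or servers > self.free():
            return False
        self.allocated[tenant] = self.allocated.get(tenant, 0) + servers
        return True

    def release(self, tenant: str, servers: int) -> None:
        """Rapid elasticity: released capacity becomes available to others."""
        held = self.allocated.get(tenant, 0)
        self.allocated[tenant] = max(0, held - servers)

    def tick(self, hours: float = 1.0) -> None:
        """Measured service: record server-hours for whatever is held now."""
        for tenant, servers in self.allocated.items():
            self.usage[tenant] = self.usage.get(tenant, 0.0) + servers * hours
```

A tenant that scales down with release() frees capacity that another tenant can provision immediately, and the usage dictionary is what a bill or usage report would be built from.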


Delivery Models

•  Software as a Service (SaaS)
•  Platform as a Service (PaaS)
•  Infrastructure as a Service (IaaS)
•  Importance?
  ▫  Implications for what type of programming work needs to be done
  ▫  Who do you have (or want to hire) to work on this?


•  http://thoughtsoncloud.com/


Delivery Model: Software as a Service

•  Software as a Service (SaaS)
  ▫  The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure
  ▫  Accessible from various client devices through a thin client interface such as a Web browser (e.g., web-based email)
  ▫  The consumer does not manage or control the underlying cloud infrastructure
    ▫  With the possible exception of limited user-specific application configuration settings


Software as a Service: Example
•  Google Maps API
  ▫  http://code.google.com/apis/maps/
  ▫  Users are provided with simple APIs for maps
  ▫  Uses cloud resources at the back-end
•  Facebook
  ▫  http://www.facebook.com
  ▫  Social networking site using a suite of cloud-based tools at the back-end
•  Animoto
  ▫  http://www.animoto.com
  ▫  Service that makes videos from user-uploaded images
  ▫  Uses Amazon EC2 at the back-end


Salesforce.com: SaaS


Delivery Models: Platform as a Service

•  Platform as a Service (PaaS)
  ▫  The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created applications using programming languages and tools supported by the provider (e.g., Java, Python, .NET)
  ▫  The consumer does not manage or control the underlying cloud infrastructure, network, servers, operating systems, or storage
    ▫  But the consumer has control over the deployed applications and possibly application hosting environment configurations


Platform as a Service: Google AppEngine

*Credit: http://rdn-consulting.com/blog/2009/02/07/exploring-cloud-computing-development/


Delivery Model: Infrastructure as a Service
•  Infrastructure as a Service (IaaS)
  ▫  The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources
  ▫  Consumer is able to deploy and run arbitrary software, which can include operating systems and applications
  ▫  The consumer does not manage or control the underlying cloud infrastructure
    ▫  But has control over operating systems, storage, deployed applications, and possibly select networking components (e.g., firewalls, load balancers)
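The three delivery models differ mainly in which layers the consumer controls. The sketch below restates the SaaS, PaaS and IaaS definitions from these slides as a lookup table. The layer names and the function who_manages are hypothetical labels chosen for illustration, and the "possibly" qualifiers in the definitions are flattened to plain membership.

```python
# Layers named in the SaaS / PaaS / IaaS definitions (labels are illustrative).
LAYERS = [
    "user-specific application settings",
    "deployed applications",
    "hosting environment configuration",
    "operating system",
    "storage",
    "select networking components",
    "underlying cloud infrastructure",
]

# What the consumer controls under each model, per the definitions in the slides.
CONSUMER_CONTROLS = {
    "SaaS": {"user-specific application settings"},
    "PaaS": {"deployed applications", "hosting environment configuration"},
    "IaaS": {
        "deployed applications",
        "operating system",
        "storage",
        "select networking components",
    },
}


def who_manages(model: str, layer: str) -> str:
    """Return 'consumer' or 'provider' for a layer under a delivery model."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "consumer" if layer in CONSUMER_CONTROLS[model] else "provider"


if __name__ == "__main__":
    for model in CONSUMER_CONTROLS:
        managed = [layer for layer in LAYERS if who_manages(model, layer) == "consumer"]
        print(f"{model}: consumer controls {', '.join(managed)}")
```

Reading the table from SaaS to IaaS shows why the choice of model determines the kind of programming and operations staff needed: the further down the stack the consumer's control reaches, the more of the system it has to run itself.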


Infrastructure as a Service: Amazon Web Services (AWS)
•  Amazon Elastic Compute Cloud (EC2)
  ▫  A web service that provides resizable compute capacity in the cloud.
  ▫  Configure an Amazon Machine Image (AMI) and load it into the Amazon EC2 service
  ▫  Quickly scale capacity, both up and down, as your computing requirements change
•  Amazon Simple Storage Service (S3)
  ▫  A simple web services interface that can be used to store and retrieve large amounts of data, at any time, from anywhere on the web
  ▫  It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites


AWS, aws.amazon.com


Microsoft Azure www.microsoft.com/windowsazure/


OpenStack, www.openstack.org

•  Receiving attention, e.g., Cisco support for OpenStack
•  http://www.slideshare.net/CiscoSP360/velocity-2011-cisco-and-open-stack


Eucalyptus


•  Target market
  ▫  On-premises (private) IaaS
  ▫  Use existing infrastructure to create an AWS-compatible cloud
•  Products:
  ▫  Eucalyptus IaaS
  ▫  Eucalyptus OpenSource
  ▫  Eucalyptus RightScale


Eucalyptus IaaS


Nirvanix

•  CloudComplete
  ▫  Can vary among private, hybrid, public cloud implementations, using Nirvanix’s public cloud


Virtualization and Cloud Computing
•  Virtualization is the ability to run “virtual machines” on top of a “hypervisor.”
  ▫  A virtual machine (VM) is a software implementation of a machine (i.e., a computer) that executes programs like a physical machine.
  ▫  Each VM includes its own kernel, operating system, supporting libraries and applications.
  ▫  A hypervisor provides a uniform abstraction of the underlying physical machine. Multiple VMs can execute simultaneously on a single hypervisor.
  ▫  The decoupling of the VM from the underlying physical hardware allows the same VM to be started on different physical machines.
•  Virtualization is an enabler for cloud computing
  ▫  Gives the cloud computing provider the flexibility to move and allocate the computing resources requested by the user wherever the physical resources are available.
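Because a VM is decoupled from the hardware, a provider can start it on any host with room. The sketch below shows one simple way to do that, a first-fit placement by CPU cores. The policy, the function name place_vms and the host and VM names are assumptions for illustration; the slides do not say which placement policy any particular cloud uses.

```python
def place_vms(host_cores: dict, vm_requests: dict) -> dict:
    """First-fit placement: put each VM on the first host with enough free cores.

    host_cores:  host name -> total cores available on that host
    vm_requests: VM name   -> cores requested by that VM
    Returns VM name -> host name, or None when no host can take the VM.
    """
    free = dict(host_cores)      # remaining cores per host, updated as VMs land
    placement = {}
    for vm, cores in vm_requests.items():
        placement[vm] = None     # stays None if no host has room
        for host, available in free.items():
            if available >= cores:
                free[host] = available - cores
                placement[vm] = host
                break
    return placement


if __name__ == "__main__":
    hosts = {"h1": 4, "h2": 8}                  # hypothetical physical machines
    vms = {"a": 3, "b": 2, "c": 6, "d": 1}      # hypothetical VM requests
    for vm, host in place_vms(hosts, vms).items():
        print(vm, "->", host)
```

The user only asks for a VM of a given size; which physical machine it lands on is the provider's decision and can change if the VM is restarted elsewhere.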


SNIA CDMI

• Cloud Data Management Interface

•  Standardizing at the IaaS level


Some Take Home Lessons

•  Cloud providers are providing you a service, not just a product
  ▫  Product model: sell product, support product
  ▫  Service model: provide service, become intimately exposed to all aspects of the service that the customer sees
    ▫  Seeing this from a customer’s viewpoint


Cloud Computing Costs*

* Source: McKinsey & Co


Cloud Computing: The Rationale
•  Flatten out the peaks and valleys of utilization to get higher overall utilization of the entire infrastructure
•  Bring together workloads with different valley / peak behaviors
•  But… is running a high-utilization operation the same as running a low-utilization operation?
•  Velocity 2010: Datacenter Infrastructure Innovation, James Hamilton, VP & Distinguished Engineer, Amazon AWS, http://www.youtube.com/watch?v=kHW-ayt_Urk
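The effect of bringing together workloads with different peak and valley behavior can be shown with a few lines of arithmetic. The demand figures below are invented for illustration and are not data from the talk; the function names are likewise hypothetical.

```python
def combine(*loads):
    """Element-wise sum of several workloads sampled at the same times."""
    return [sum(samples) for samples in zip(*loads)]


def peak_utilization(load):
    """Average utilization when capacity is sized exactly to the workload's peak."""
    return (sum(load) / len(load)) / max(load)


# Hypothetical demand (servers needed) in four time slots.
day_heavy = [10, 80, 90, 20]
night_heavy = [85, 15, 10, 70]

if __name__ == "__main__":
    pooled = combine(day_heavy, night_heavy)
    for name, load in [("day-heavy alone", day_heavy),
                       ("night-heavy alone", night_heavy),
                       ("pooled", pooled)]:
        print(f"{name}: capacity {max(load)}, utilization {peak_utilization(load):.0%}")
```

With these made-up numbers, running the two workloads separately needs 90 + 85 = 175 servers at roughly 56% and 53% utilization, while the pooled workload needs 100 servers at 95% utilization.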


Dealing with Peaks
•  Old approach:
  ▫  Provision for peak workload. Low utilization at other times
•  HPC approach:
  ▫  Build a machine for a certain max job size. Provide job queue and “on-demand”, pre-emptible access at other times.
•  Cloud approach:
  ▫  Charge different rates for use at different times, based on usage
  ▫  E.g. Amazon Spot Instance
•  Typical server utilization: 10-15%
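A rough sketch of the rent-versus-own arithmetic behind the "old approach" follows. All numbers are hypothetical: the demand profile is invented to land in the 10-15% utilization range quoted above, and the prices (in cents per server-hour) are placeholders, not real provider rates.

```python
def cost_if_owned(hourly_demand, owned_cost_per_server_hour):
    """Own enough servers for the peak and pay for all of them every hour."""
    return max(hourly_demand) * len(hourly_demand) * owned_cost_per_server_hour


def cost_if_rented(hourly_demand, rented_price_per_server_hour):
    """Rent only the servers actually needed in each hour."""
    return sum(hourly_demand) * rented_price_per_server_hour


def utilization(hourly_demand):
    """Fraction of peak-sized capacity that is actually used."""
    return sum(hourly_demand) / (max(hourly_demand) * len(hourly_demand))


# Hypothetical day: 1 server needed for 23 hours, 10 servers for one peak hour.
spiky_day = [1] * 23 + [10]

if __name__ == "__main__":
    print(f"utilization when provisioned for peak: {utilization(spiky_day):.1%}")
    print("owned, 5 cents per server-hour:  ", cost_if_owned(spiky_day, 5), "cents")
    print("rented, 10 cents per server-hour:", cost_if_rented(spiky_day, 10), "cents")
```

In this example renting costs less even at twice the hourly price, because the owned servers sit idle most of the day. In general renting wins whenever utilization is below the ratio of owned cost to rented price, which is why low-utilization workloads are the natural candidates for the cloud.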


Some Application Case Studies from XLDB
•  XLDB11: 5th Extremely Large Databases Workshop, Oct 18-19, SLAC, Menlo Park, CA
•  “State of practice” workshop
•  Presentations on current system implementation and challenges, and needs and requirements
  ▫  E.g. presentations from: Facebook, LinkedIn, eBay, Google, Netflix, Novartis, Quora, Metamarkets, Microsoft, …
•  http://www-conf.slac.stanford.edu/xldb2011/Program.asp


Quora.com

•  Scaling up Quickly in the Cloud, Edmond Lau, Quora, XLDB Workshop


Metamarkets, Michael Driscoll, co-founder, CTO


Metamarkets: Performance at scale

•  Evaluating the online ad market
•  Billions of microtransactions per day
•  Require billion rows/second performance
•  Fast analytics over 100s of terabytes
•  Metamarkets’ Druid system
  ▫  Partial aggregates + In-memory data + Indexes
  ▫  Distributed data + Parallelizable Queries = Horizontal Scalability
  ▫  Real-time analytics
  ▫  Implemented in the cloud (AWS)
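The "partial aggregates + distributed data + parallelizable queries" idea can be sketched in a few lines. This is not Druid's implementation or API. It is a minimal illustration that assumes rows of (advertiser, spend) already split into partitions, with threads standing in for separate nodes; all names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(rows):
    """Sum spend per advertiser within a single partition."""
    totals = Counter()
    for advertiser, spend in rows:
        totals[advertiser] += spend
    return totals


def merge(partials):
    """Combine per-partition results; sums can be merged in any order."""
    total = Counter()
    for partial in partials:
        total.update(partial)   # Counter.update adds values key by key
    return dict(total)


def distributed_sum(partitions):
    """Aggregate each partition independently, then merge the small results."""
    with ThreadPoolExecutor() as pool:
        return merge(pool.map(partial_aggregate, partitions))


if __name__ == "__main__":
    partitions = [
        [("acme", 1), ("globex", 2)],   # hypothetical rows on node 1
        [("acme", 3)],                  # node 2
        [("globex", 4), ("initech", 5)],  # node 3
    ]
    print(distributed_sum(partitions))
```

Only the small per-partition totals move between nodes, so adding partitions (and nodes) scales the query horizontally. This works for aggregates that can be merged, such as sums and counts.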


Cloud Data Analytics, Roger Barga, Microsoft Azure


Netflix
•  Presentation by Eric Colson, VP, Netflix
•  All Netflix processing (DVD rentals and streaming video) is in the cloud (AWS)
•  Total cost of Netflix implementation may be higher than an in-house solution, but
  ▫  Netflix made a business strategy decision. They are not in the business of running IT infrastructure
  ▫  Cloud computing required them to build a distributed IT team, which did not match their culture of building close teams.


Application Examples: Bioinformatics

•  Crossbow
  ▫  Genotyping from short reads using cloud computing
  ▫  http://bowtie-bio.sourceforge.net/crossbow/index.shtml
•  SDSC Project to implement Hadoop-based processing for next generation sequencing on SDSC’s HPC systems as well as clouds (AWS)


Role of Big Data
•  What is the connection between cloud and big data?
•  Cloud → Scaling → Lots of data
•  Big data → Lots of data
•  Hadoop
  ▫  A software (eco)system for efficient processing of very large data
  ▫  Uses MapReduce, which has become a convenient programming model for low entry barrier, very large-scale data processing
•  Could use cloud computing resources to implement Hadoop
  ▫  $1M Question: data movement and data locality
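To show why MapReduce has a low entry barrier, here is a single-process sketch of its three phases using the classic word-count example. It illustrates the programming model only; it does not use Hadoop or its API, and the function names are hypothetical.

```python
from collections import defaultdict


def map_words(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(lines):
    """Run map, shuffle and reduce in sequence on an in-memory input."""
    pairs = [pair for line in lines for pair in map_words(line)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(map_reduce(["big data", "big cloud"]))
```

The programmer writes only the map and reduce functions. In a real deployment the framework runs many map tasks in parallel, ideally on the nodes that already hold the input blocks, which is where the data movement and data locality question comes from.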


Discussion of Survey

• Results from “Cloud Storage: Adoption, Practice and Deployment”, survey conducted for the Storage Networking Industry Association (SNIA)


Research and Markets Survey
Cloud Computing in HPC: Rationale for Adoption
•  Top reason: Access to extra resources to meet peak system load requirements
•  Cost Avoidance
  ▫  Continued demand for HPC compute cycles. Cloud computing could deliver low-cost computing cycles.
•  Capacity Management
  ▫  Deal with periodic demand peaks and better management of data center growth, power, and cooling issues.
•  Collaboration
  ▫  Integration of internet-based applications and communications may allow HPC users to better work with those both inside and outside of their organizations.
•  Evaluation of Cloud Systems
  ▫  Looking at the cloud system alternative to determine if and how they can make use of the technology and concepts.
•  Organizational Requirement
  ▫  Making sure that the competition does not “steal a march” with a new technology.


Monitoring and Benchmarking

•  Monitoring of resource usage is essential for cloud environments
•  What about SLAs and QoS?
•  “Cloud computing means get your legal teams lined up”!
•  Resource monitoring is a well-recognized need
  ▫  But need to ensure the right level of monitoring and reporting is available
•  Benchmarking is a new frontier


Amazon Cloudwatch


http://aws.amazon.com/cloudwatch/


Azurescope


See: http://azurescope.cloudapp.net/BenchmarkTestCases


Azurescope application “probe”


Azurescope write “probe”


Future Directions: Benchmarking
•  The need for Big Data and Cloud benchmarks
•  The changing software landscape
  ▫  From RDBMS and Data Warehousing to NoSQL, Hadoop, Unstructured data, Stream Processing, Graph Processing, …
  ▫  The changing hardware landscape
    ▫  Multi-core, SSD, new types of memory, large memory, different networking options, commodity vs high-end, …
  ▫  Multiple platform choices
    ▫  Dedicated data platforms
    ▫  Cloud platforms


Benchmarking Issues
•  “Reference benchmarks” for big data (TPC style)
  ▫  Define modalities of big data
  ▫  Define end-to-end flows of big data
  ▫  Identify key real-world characteristics, e.g. multi-rack, heterogeneous hardware
  ▫  Identify which existing benchmarks can be reused
•  “Probe” benchmarks for clouds
  ▫  Cloud performance can be variable
  ▫  “Application-level” performance probes
  ▫  The Cloud Weather Service™
•  Benchmarking workshop being planned for Feb 2012 timeframe
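An application-level probe can be as simple as timing one representative operation repeatedly and reporting how much the timings vary. The sketch below is an assumption about what such a probe might look like; it is not the Azurescope code and not a description of the Cloud Weather Service. The operation passed in is a placeholder for a real call such as a storage write.

```python
import math
import statistics
import time


def probe(operation, repetitions=20):
    """Time a zero-argument callable repeatedly; return the samples in seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Median, nearest-rank 95th percentile and maximum of the samples."""
    ordered = sorted(samples)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Placeholder workload; a real probe would call the cloud service under test.
    timings = probe(lambda: sum(range(100_000)))
    print(summarize(timings))
```

Because cloud performance can be variable, the gap between the median and the 95th percentile is often more informative than the average. Running the same probe at regular intervals gives a time series of that gap.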


Future Directions: “Vertical” Clouds

•  Clouds that are aimed at major markets
  ▫  Should have something unique about them, and a sizable market
•  E.g.
  ▫  Collocated clouds for performance
    ▫  e.g. Wall Street trading systems, online advertisement systems, etc.
  ▫  Collocated clouds for security / privacy


Email: baru@sdsc.edu
