my other computer is a data center (2010 v21)

Post on 09-May-2015

DESCRIPTION

This is a talk I gave recently about cloud computing from the perspective of thinking of a data center as your "other computer."

TRANSCRIPT

An Overview of Cloud Computing: My Other Computer is a Data Center

Robert Grossman, Open Cloud Consortium

January 7, 2010

Part 1. What is a Cloud?

What is a Cloud?

Software as a Service (SaaS)

What Else is a Cloud?

Platform as a Service (PaaS)

Is Anything Else a Cloud?

Infrastructure as a Service (IaaS)

Are There Other Types of Clouds?

Large Data Cloud Services

ad targeting

What is Virtualization?

Idea Dates Back to the 1960s

Virtualization first widely deployed with IBM VM/370.

[Diagram: an IBM mainframe runs IBM VM/370, which hosts guest operating systems (CMS, MVS, CMS), each running its own application.]

Native (Full) Virtualization. Examples: VMware ESX

What Do You Optimize?

Goal: Minimize latency and control heat.

Goal: Maximize data (with matching compute) and control cost.

Scale is new

Elastic, Usage Based Pricing Is New

1 computer in a rack for 120 hours

costs the same as

120 computers in three racks for 1 hour

Elastic, usage based pricing turns capex into opex. Clouds can be used to manage surges in computing needs.
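The equivalence above is simple arithmetic: under purely usage-based pricing, the bill depends only on machine-hours. A minimal sketch, assuming a hypothetical flat rate of $0.10 per machine-hour (the rate and the function name are illustrative, not figures from the talk):

```python
from fractions import Fraction

# Hypothetical flat price per machine-hour; not a figure from the talk.
RATE_PER_MACHINE_HOUR = Fraction(10, 100)

def usage_cost(machines: int, hours: int,
               rate: Fraction = RATE_PER_MACHINE_HOUR) -> Fraction:
    """Cost under purely usage-based pricing: machines x hours x rate."""
    return machines * hours * rate

# One machine for 120 hours versus 120 machines for one hour.
serial = usage_cost(machines=1, hours=120)
burst = usage_cost(machines=120, hours=1)

print(float(serial), float(burst), serial == burst)
```

Both runs consume 120 machine-hours, so the cost is identical; only the time to result differs.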

Simplicity Offered By the Cloud is New

+ .. and you have a computer ready to work.

A new programmer can develop a program to process a container full of data with less than a day of training using MapReduce.
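To give a sense of why the model is quick to learn, here is a minimal single-process sketch of the MapReduce pattern in plain Python. It is not Hadoop or Sector code: `map_reduce`, `count_words_map` and `sum_reduce` are illustrative names, and a real framework would run the map and reduce phases in parallel across many nodes.

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Run mapper over every record, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)      # the "shuffle": group values by key
    return {key: reducer(key, values) for key, values in groups.items()}

def count_words_map(line):
    """Map step: emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1

def sum_reduce(key, values):
    """Reduce step: add up the counts collected for one word."""
    return sum(values)

lines = ["my other computer", "is a data center", "a data center is a computer"]
counts = map_reduce(lines, count_words_map, sum_reduce)
print(counts["data"], counts["computer"], counts["a"])
```

The programmer writes only the two small functions; grouping, distribution and fault handling are the framework's job.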

| | Databases | Data Clouds |
| Scalability | 100’s TB | 100’s PB |
| Functionality | Full SQL-based queries, including joins | Optimized access to sorted tables (tables with single keys) |
| Optimized | Databases optimized for safe writes | Clouds optimized for efficient reads |
| Consistency model | ACID (Atomicity, Consistency, Isolation & Durability) – database always consistent | Eventual consistency – updates eventually propagate through system |
| Parallelism | Difficult because of ACID model; shared nothing is possible | Basic design incorporates parallelism over commodity components |
| Scale | Racks | Data center |

What Resource is Managed?

Supercomputer Center Model: scarce processors wait for data
– Manage cycles
– wait for an opening in the queue
– scatter the data to the processors
– and gather the results

Data Center Model: persistent data waits for queries
– Manage data
– persistent data waits for queries
– computation done locally
– results returned

Part 2. Data Centers as the Unit of Computing

“Cloud computing has become the center of investment and innovation.”

Nicholas Carr, 2009 IDC Directions

Cloud computing is at the top of the Gartner hype cycle.

experimental science

simulation science

data science

1609: 30x

1670: 250x

1976: 10x-100x

2004: 10x-100x

Requirements for Clouds

Scale to Data Centers

Scale Across Data Centers

Support Large Data Flows

Support Security, Auditing

Support Real Time Alerts

Business X X

E-science X X X

Healthcare X X

Defense X X X X X

Transition Taking Place

A handful of players are building multiple data centers a year and improving with each one. This includes Google, Microsoft, Yahoo, …

A data center today costs $200 M – $400+ M

Berkeley RAD Report points out analogy with semiconductor industry as companies stopped building their own Fabs and started leasing Fabs from others as Fabs approached $1B

Which is the Operating System?

workstation

VM 1 VM 5

VM 1 VM 50,000

Data Center Operating System

Hypervisor

data center

How Do You Program A Data Center?

Some Programming Models for Data Centers

Operations over data center of disks
– MapReduce (“string-based”)
– User-Defined Functions (UDFs) over data center
– SQL and Quasi-SQL over data center
– Data analysis / statistics over data center

Operations over data center of memory
– Grep over distributed memory
– UDFs over distributed memory
– SQL and Quasi-SQL over distributed memory
– Data analysis / statistics over distributed memory
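As one illustration of the second group, a grep over distributed memory can be sketched as a pattern search run independently against each in-memory partition, with the matches gathered at the end. The sketch below is only a stand-in: threads in a single process play the role of nodes, and `grep_partition`, `distributed_grep` and the `partitions` data are hypothetical names and values.

```python
import re
from concurrent.futures import ThreadPoolExecutor

def grep_partition(pattern, lines):
    """Search one in-memory partition and return its matching lines."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

def distributed_grep(pattern, partitions):
    """Run grep_partition on every partition concurrently, then gather matches."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves partition order in its results.
        per_partition = pool.map(lambda part: grep_partition(pattern, part), partitions)
        return [line for matches in per_partition for line in matches]

# Each inner list stands in for the memory of one node.
partitions = [
    ["error: disk full", "ok"],
    ["ok", "error: timeout"],
    ["ok"],
]
matches = distributed_grep(r"^error", partitions)
print(matches)
```

The computation moves to where the data already sits, and only the (usually small) set of matches travels back.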

Part 3. Open Cloud Consortium

U.S. 501(c)(3) not-for-profit corporation

Supports the development of standards and interoperability frameworks.

Supports reference implementations for cloud computing.

Manages testbeds: Open Cloud Testbed, Intercloud Testbed, Open Science Data Cloud

Develops benchmarks.

www.opencloudconsortium.org

OCC Members

Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open Data Group, Raytheon, Yahoo

Universities: CalIT2, Johns Hopkins, Northwestern, University of Illinois at Chicago, University of Chicago

Government agencies: NASA

Organizations: Sector Project

Open Cloud Testbed

Phase 2
– 9 racks
– 250+ Nodes
– 1000+ Cores
– 10+ Gb/s

Networks: MREN, CENIC, Dragon, C-Wave

Software: Hadoop, Sector/Sphere, Thrift, KVM VMs, Eucalyptus VMs

Intercloud Testbed

Infrastructure as a Service
– Virtual Data Centers (VDC)
– Virtual Networks (VN)
– Virtual Machines (VM)
– Physical Resources

Platform as a Service
– Cloud Compute Services
– Data & Storage as a Service

Open Virtualization Format (OVF)

Open Cloud Computing Interface (OCCI)

SNIA Cloud Data Management Interface (CDMI)

Large Data Cloud Interoperability Framework

Dynamic infrastructure service linking IaaS and DaaS

Dynamic infrastructure service naming and linking entities in the IaaS layers

Working with Infrastructure 2.0 Working Group

Open Science Data Cloud

sky cloud

biocloud

Planning to work with 5 international partners (all connected with 10 Gbps networks).

MalStone (OCC-Developed Benchmark)

| | MalStone A | MalStone B |
| Hadoop | 455m 13s | 840m 50s |
| Hadoop streaming with Python | 87m 29s | 142m 32s |
| Sector/Sphere | 33m 40s | 43m 44s |

Sector/Sphere 1.20, Hadoop 0.18.3 with no replication on Phase 1 of Open Cloud Testbed in a single rack. Data consisted of 20 nodes with 500 million 100-byte records / node.
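The footnote implies the size of the benchmark data set, and together with the MalStone A timings it gives a rough effective processing rate. A short worked calculation, assuming decimal units (1 TB = 10^12 bytes) and treating each reported time as the wall-clock time to process the whole data set once (both are assumptions, not statements from the talk):

```python
NODES = 20
RECORDS_PER_NODE = 500_000_000
BYTES_PER_RECORD = 100

# 20 nodes x 500 million records x 100 bytes per record
total_bytes = NODES * RECORDS_PER_NODE * BYTES_PER_RECORD

def seconds(minutes, secs):
    """Convert a 'Xm Ys' timing to seconds."""
    return minutes * 60 + secs

# MalStone A timings from the table.
malstone_a = {
    "Hadoop": seconds(455, 13),
    "Hadoop streaming with Python": seconds(87, 29),
    "Sector/Sphere": seconds(33, 40),
}

print(f"data set: {total_bytes / 1e12:.1f} TB")
for system, elapsed in malstone_a.items():
    mb_per_s = total_bytes / elapsed / 1e6
    print(f"{system}: about {mb_per_s:.0f} MB/s across the rack")
```

These rates include computation, not just I/O, so they are best read as a comparison between systems rather than as disk or network throughput.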

Some Lessons Learned (So Far)

Python over Hadoop Distributed File System surprisingly powerful.

Tuning Hadoop can be a large (unacknowledged) cost.

Performance of a cloud computation can be significantly impacted by just 1 or 2 nodes that are a bit slower.

Wide area clouds can be practical in some cases.
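The lesson about slow nodes follows from how a parallel job finishes: if the work is split evenly and the job waits for every node, the elapsed time is set by the slowest node, not the average. A toy model, assuming a static even split with no speculative execution or rebalancing (the node speeds are made-up numbers):

```python
def job_time(work_units, node_speeds):
    """Elapsed time when work is split evenly and the job waits for all nodes."""
    share = work_units / len(node_speeds)
    return max(share / speed for speed in node_speeds)

healthy = [1.0] * 20             # 20 nodes at full speed
one_slow = [1.0] * 19 + [0.8]    # one node running 20% slower

t_healthy = job_time(2000, healthy)
t_slow = job_time(2000, one_slow)

print(f"slowdown from one slow node: {t_slow / t_healthy:.2f}x")
```

In this model a single node out of twenty running at 80% speed stretches the whole job by about 25%, even though total capacity dropped by only 1%.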

Part 4. Sector

http://sector.sourceforge.net

Sector Overview

Sector is fast
– As measured by MalStone & Terasort

Sector is easy to program
– Supports UDFs, MapReduce & Python over streams

Sector does not require extensive tuning.

Sector is secure
– A HIPAA compliant Sector cloud is being set up

Sector is reliable
– Sector v1.24 supports multiple master node servers

Google’s Large Data Cloud

Storage Services

Data Services

Compute Services

Google’s Stack

Applications

Google File System (GFS)

Google’s MapReduce

Google’s BigTable

Hadoop’s Large Data Cloud

Storage Services

Compute Services

Hadoop’s Stack

Applications

Hadoop Distributed File System (HDFS)

Hadoop’s MapReduce

Data Services

Sector’s Large Data Cloud

Storage Services

Compute Services

Sector’s Stack

Applications

Sector’s Distributed File System (SDFS)

Sphere’s UDFs

Routing & Transport Services

UDP-based Data Transport Protocol (UDT)

Data Services

Generalization: Apply User Defined Functions (UDF) to Files in Storage Cloud

map/shuffle reduce

UDF UDF
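One way to read this generalization: a UDF is any function applied to each file in the storage cloud, and its output records are routed into buckets that the next UDF consumes. MapReduce then falls out as the special case of two UDFs with a hash-partitioned shuffle between them. The sketch below is a single-process illustration in Python, not Sphere's actual C++ API; `apply_udf`, the in-memory `files` dictionary and the bucket routing are all assumptions made for the example.

```python
from collections import defaultdict

def apply_udf(udf, files, num_buckets=4):
    """Apply udf to every file; route each (key, value) it emits to a bucket."""
    buckets = defaultdict(list)
    for content in files.values():
        for key, value in udf(content):
            # Same key always lands in the same bucket within one run.
            buckets[hash(key) % num_buckets].append((key, value))
    return dict(buckets)

def tokenize(text):
    """First UDF, playing the role of map: emit (site, 1) per visit."""
    for site in text.split():
        yield site, 1

def total(pairs):
    """Second UDF, playing the role of reduce: sum the counts per site."""
    sums = defaultdict(int)
    for site, n in pairs:
        sums[site] += n
    yield from sums.items()

# Hypothetical visit logs standing in for files in the storage cloud.
files = {"log1": "siteA siteB siteA", "log2": "siteB siteC"}
shuffled = apply_udf(tokenize, files)   # UDF 1 + bucket routing ("map/shuffle")
reduced = apply_udf(total, shuffled)    # UDF 2 over the buckets ("reduce")
visits = {site: n for pairs in reduced.values() for site, n in pairs}
print(sorted(visits.items()))
```

Because `apply_udf` places no constraint on what the function does with a file, the same driver could run a UDF that has no natural key-value shape, which is the flexibility the slide is pointing at.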

Hadoop vs Sector

| | Hadoop | Sector |
| Storage Cloud | Block-based | File-based |
| Programming Model | MapReduce | UDF & MapReduce |
| Image processing | Difficult with MapReduce | Easy with UDF |
| Protocol | TCP | UDT |
| Replication | At write | At write or periodically |
| Security | Not yet | HIPAA capable |
| Language | Java | C++ |

Source: Gu and Grossman, Sector and Sphere, Phil. Trans. Royal Society A, 2009.

Terasort - Sector vs Hadoop Performance

| | 1 Rack | 2 Racks | 3 Racks | 4 Racks |
| Nodes | 32 | 64 | 96 | 128 |
| Cores | 128 | 256 | 384 | 512 |
| Hadoop | 85m 49s | 37m 0s | 25m 14s | 17m 45s |
| Sector | 28m 25s | 15m 20s | 10m 19s | 7m 56s |
| Speed up | 3.0 | 2.4 | 2.4 | 2.2 |

Sector/Sphere 1.24a, Hadoop 0.20.1 with no replication on Phase 2 of Open Cloud Testbed with co-located racks.
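The Speed up row can be reproduced from the two timing rows as Hadoop time divided by Sector time. A quick check in Python (the helper name `to_seconds` is illustrative):

```python
def to_seconds(text):
    """Convert a timing such as '85m 49s' to seconds."""
    minutes, secs = text.split()
    return int(minutes.rstrip("m")) * 60 + int(secs.rstrip("s"))

# Timings for 1, 2, 3 and 4 racks, copied from the table.
hadoop = ["85m 49s", "37m 0s", "25m 14s", "17m 45s"]
sector = ["28m 25s", "15m 20s", "10m 19s", "7m 56s"]

speedups = [round(to_seconds(h) / to_seconds(s), 1) for h, s in zip(hadoop, sector)]
print(speedups)  # [3.0, 2.4, 2.4, 2.2]
```

The recomputed ratios match the reported row at one decimal place.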

Sector Applications

Distributing the 15 TB Sloan Digital Sky Survey to astronomers around the world (joint with JHU, 2005)

Managing and analyzing high throughput sequence data (Cistrack, University of Chicago, 2007).

Detecting emergent behavior in distributed network data (Angle, won SC 07 Analytics Challenge)

Image processing for high throughput sequencing.

Wide area clouds (won SC 09 BWC with 100 Gbps wide area computation)

New ensemble-based algorithms for trees

Graph processing

Cistrack Database

Analysis Pipelines & Re-analysis Services

Cistrack Web Portal & Widgets

Cistrack Large Data Cloud Services

Ingestion Services

Cistrack Elastic Cloud Services

Thank you

For more information, please see blog.rgrossman.com
