Computing Outside the Box
DESCRIPTION
The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think Facebook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.
TRANSCRIPT
Ian Foster
Computation Institute
Argonne National Lab & University of Chicago
1890
1953
“Computation may someday be organized as a public utility … The computing utility could become the basis for a new and important industry.”
John McCarthy (1961)
I-WAY, 1995
The grid, 1998
“Dependable, consistent, pervasive access to resources”
Dependable: Performance and functionality guarantees
Consistent: Uniform interfaces to a wide variety of resources
Pervasive: Ability to “plug in” from anywhere
Application
Infrastructure
Application
Infrastructure: service-oriented infrastructure
Layered grid architecture
Application
Fabric: “Controlling things locally”: access to, & control of, resources
Connectivity: “Talking to things”: communication (Internet protocols) & security
Resource: “Sharing single resources”: negotiating access, controlling use
Collective: “Managing multiple resources”: ubiquitous infrastructure services
User: “Specialized services”: user- or application-specific distributed services
(Compare the Internet Protocol architecture: Application, Transport, Internet, Link)
Initially custom … later Web Services
www.opensciencegrid.org
Bennett Bertenthal et al., www.sidgrid.org
Brian Tieman
Simplified example workflows
Genome sequence analysis
Physics data analysis
Sloan Digital Sky Survey
www.opensciencegrid.org
“Sine” workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node
Ioan Raicu
Same scenario, but with dynamic resource provisioning
Data diffusion, sine-wave workload: summary
GPFS: 5.70 hrs, ~8 Gb/s, 1138 CPU hrs
DD+SRP: 1.80 hrs, ~25 Gb/s, 361 CPU hrs
DD+DRP: 1.86 hrs, ~24 Gb/s, 253 CPU hrs
Application
Infrastructure: service-oriented infrastructure
Application: service-oriented applications
Infrastructure: service-oriented infrastructure
Creating services in 2008: Introduce and gRAVI
Introduce: define service, create skeleton, discover types, add operations, configure security
gRAVI (Grid Remote Application Virtualization Infrastructure): wrap executables
Service lifecycle (diagram): create an application service; transfer the GAR and deploy it into a container; advertize it to an index service and store it in a repository service; clients then discover the service, invoke it, and get results.
Ohio State University and Argonne/U.Chicago
Globus
As of Oct 19, 2008: 122 participants, 105 services (70 data, 35 analytical)
Microarray clustering using Taverna
1. Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub
2. Normalize microarray data using a GenePattern analytical service: node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService
3. Hierarchical clustering using a geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage
Legend: workflow in/output; caGrid services; “shim” services; others
Wei Tan
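The three-step pipeline above can be sketched as a simple sequential workflow with "shim" adapters between stages. This is an illustrative sketch only: the endpoints are taken from the slide, but the `call_service` helper, the `shim` adapter, and the payload shapes are hypothetical stand-ins for the actual Taverna/caGrid client machinery.

```python
# Sketch of a three-stage analysis workflow with "shim" adapters between
# services. Endpoints are from the slide; the client functions and data
# shapes are hypothetical, not the real caGrid APIs.

CAARRAY = "cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub"
GENEPATTERN = "node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService"
GEWORKBENCH = "cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage"

def call_service(endpoint, operation, payload):
    # Hypothetical stand-in for a remote service invocation.
    return {"endpoint": endpoint, "operation": operation, "result": payload}

def shim(response):
    # "Shim" services adapt one service's output format to the next's input.
    return response["result"]

def microarray_clustering(query):
    raw = call_service(CAARRAY, "query", query)                        # 1. retrieve
    normalized = call_service(GENEPATTERN, "preprocess", shim(raw))    # 2. normalize
    clusters = call_service(GEWORKBENCH, "cluster", shim(normalized))  # 3. cluster
    return clusters
```

The point of the sketch is the shape of the workflow: each stage's output must pass through an adapter before it is valid input for the next service, which is exactly the role the "shim" services play in the Taverna workflow.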
The Globus-based LIGO data grid
LIGO Gravitational Wave Observatory
Replicating >1 terabyte/day to 8 sites (map labels include Birmingham, Cardiff, AEI/Golm)
>100 million replicas so far
MTBF = 1 month
Data replication service
Goal: pull “missing” files to a storage system, given a list of required files.
Components (diagram): Data Replication Service; Reliable File Transfer Service; Local Replica Catalogs; Replica Location Indexes; GridFTP servers.
Layers: data replication, built on data movement and data location.
“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
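The pull model above reduces to a simple loop: compare the required file list against the local replica catalog, look up sources for the missing files in a replica location index, and queue transfers. The sketch below is a minimal illustration of that logic; the catalog and index are plain dictionaries standing in for the Local Replica Catalog and Replica Location Index, not the actual RLS/GridFTP APIs.

```python
def missing_files(required, local_catalog):
    """Files on the required list that have no local replica."""
    return [f for f in required if f not in local_catalog]

def plan_transfers(required, local_catalog, replica_index):
    """Map each missing file to a remote site that holds a replica."""
    plan = []
    for f in missing_files(required, local_catalog):
        sources = replica_index.get(f, [])
        if sources:
            plan.append((f, sources[0]))  # pick the first known source
    return plan

# Illustrative catalogs (stand-ins for the real replica catalogs).
local = {"a.gwf"}
index = {"b.gwf": ["cardiff"], "c.gwf": ["aei-golm", "birmingham"]}
required = ["a.gwf", "b.gwf", "c.gwf"]
print(plan_transfers(required, local, index))
# [('b.gwf', 'cardiff'), ('c.gwf', 'aei-golm')]
```

In the real service the resulting plan would be handed to the Reliable File Transfer Service, which drives GridFTP transfers and updates the local replica catalog as files arrive.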
Why not leverage dynamic deployment capabilities?
Physical machine: procure hardware
Hypervisor/OS: deploy hypervisor/OS
VM: deploy virtual machine
JVM: deploy container
VO services (DRS, GridFTP, LRC): deploy service
State exposed & accessed uniformly at all levels; provisioning, management, and monitoring at all levels
Maybe we need to specialize further …
User to service provider: “Provide access to data D at S1, S2, S3 with performance P”
Service provider to resource provider: “Provide storage with performance P1, network with P2, …”
(Diagram: data D replicated at sites S1, S2, S3; implemented with a replica catalog, user-level multicast, …)
Infrastructure
Applications
Energy
Progress of adoption
US$3
Credit: Werner Vogels
Animoto EC2 image usage (chart: 0 to 4000 instances, Day 1 to Day 8)
Software: Salesforce.com, Google, Animoto, …, caBIG, TG gateways
Platform: Amazon, GoGrid, Sun, Microsoft, …
Infrastructure: Amazon, Google, Microsoft, …
Dynamo: Amazon’s highly available key-value store (DeCandia et al., SOSP’07)
Simple query model
Weak consistency, no isolation
Stringent SLAs (e.g., 300 ms for 99.9% of requests at a peak of 500 requests/sec)
Incremental scalability
Symmetry, decentralization, heterogeneity
Technologies used in Dynamo (problem: technique; advantage):
Partitioning: consistent hashing; incremental scalability.
High availability for writes: vector clocks with reconciliation during reads; version size is decoupled from update rates.
Handling temporary failures: sloppy quorum and hinted handoff; provides high availability and durability guarantees when some of the replicas are not available.
Recovering from permanent failures: anti-entropy using Merkle trees; synchronizes divergent replicas in the background.
Membership and failure detection: gossip-based membership protocol and failure detection; preserves symmetry and avoids a centralized registry for storing membership and node liveness information.
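The consistent hashing used for partitioning is easy to illustrate. The sketch below is a minimal hash ring with virtual nodes, written under the usual textbook assumptions (MD5 as the hash, first point clockwise owns the key); it is not Dynamo's actual implementation, which adds replication and preference lists on top of this idea.

```python
import hashlib
from bisect import bisect_right

def h(key: str) -> int:
    """Hash a string to a point on the ring (MD5, as in the Dynamo paper)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each node owns the arc of hash space preceding its virtual points,
    so adding or removing a node remaps only a small fraction of keys."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) points
        for n in nodes:
            self.add(n)

    def add(self, node):
        # Place several virtual points per node to smooth the load.
        for i in range(self.vnodes):
            self.ring.append((h(f"{node}#{i}"), node))
        self.ring.sort()

    def lookup(self, key):
        # First virtual point clockwise from the key's hash (wraps around).
        i = bisect_right(self.ring, (h(key),)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("some-key")  # deterministic: same key, same owner
```

This directly gives the "incremental scalability" advantage from the table: calling `add("node-d")` moves only the keys whose closest clockwise point is now one of node-d's virtual points, rather than rehashing everything.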
Application: service-oriented applications
Infrastructure: service-oriented infrastructure
Energy Internet
The Shape of Grids to Come?
Killer apps for COTB?
Biomedical informatics/Evidence-based medicine
Human responses to global climate disruption
Using IaaS in biomedical informatics
(Diagram: my servers and an IaaS provider, with handle.net, BIRN, and Chicago sites)
“The computer revolution hasn’t happened yet.”
Alan Kay, 1997
(Chart: connectivity, on a log scale, vs. time, for science, enterprise, and consumer computing; eras labeled Grid, Cloud, ????)
“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
Computation Institute, www.ci.uchicago.edu
Thank you!