
Going Places with Distributed Computing

ESC Grid Workshop 2007

Kate Keahey

University of Chicago

Argonne National Laboratory

[email protected]


There is more to Grid applications than running simple batch computations…


11/29/07, ESAC Grid Workshop Kate Keahey

Galaxies Collide on the I-WAY

I-WAY (SC95) Colliding Galaxies

Co-scheduling components

Tightly coupled simulation distributed over NCSA, PSC, SDSC

Multiple components
  Dark matter
  Gas
  Visualization

We never managed to do this again


Functional MRI (fMRI)

Large datasets
  90,000 volumes / study
  100s of studies

Wide range of analyses
  Testing, production runs
  Data mining
  Ensemble, parameter studies


Moral Hazard


A Different Kind of Complexity

Complex experimental application codes
  Developed over more than 10 years by more than 100 scientists; comprise ~2 M lines of C++ and Fortran code

Require complex, customized environments
  Rely heavily on the right combination of compiler versions and available libraries
  Dynamically load external libraries depending on the task to be performed

Environment validation
  To ensure reproducibility and result uniformity across environments

www.star.bnl.gov


Critical Applications

The National Fusion Collaboratory: Grids for Experimental Science


Urgent Resource Requirements

Applications that require resources in response to urgent needs
  Event tracking, simulation and prediction, etc.

Unexpected


Combating Complexity


The Swift System

Clean separation of logical/physical concerns

Logical Concerns
  XDTM specification of logical data structures
  Concise specification of parallel programs (SwiftScript, with iteration, etc.)
  Rigorous provenance tracking and query: Virtual data schema & automated recording

Physical Concerns
  Efficient execution on distributed resources: Karajan threading, Falkon provisioning, Globus interfaces, pipelining, load balancing

Improved usability and productivity
  Demonstrated in numerous applications
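The separation of logical and physical concerns can be illustrated with a minimal Python sketch. This is not SwiftScript and does not use any Swift, Karajan, or Falkon API; `run_foreach`, `analyze`, and the file names are hypothetical names chosen for illustration. The logical part says "apply this task to every input"; the physical part (which executor runs it, and with how many workers) is supplied separately.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_foreach(task: Callable[[str], str],
                inputs: Iterable[str],
                executor: Executor) -> List[str]:
    """Logical concern: apply `task` to every input, preserving input order."""
    return list(executor.map(task, inputs))


def analyze(volume: str) -> str:
    # Hypothetical per-volume analysis step; stands in for a real application.
    return volume.replace(".img", ".out")


if __name__ == "__main__":
    volumes = [f"vol{i}.img" for i in range(4)]
    # Physical concern: the choice of executor and worker count lives here,
    # outside the logical description of the computation.
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(run_foreach(analyze, volumes, pool))
```

Swapping the thread pool for another `Executor` changes where the work runs without touching `run_foreach` or `analyze`, which is the kind of decoupling the slide describes.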


Swift Architecture

[Figure: Swift architecture. Specification: SwiftScript, abstract computation, SwiftScript compiler, Virtual Data Catalog. Scheduling: execution engine (Karajan with Swift runtime), Swift runtime callouts, status reporting. Execution: virtual nodes with launchers running applications (AppF1, AppF2) over files (file1, file2, file3); a provenance collector gathering provenance data. Provisioning: resource provisioners (Falkon/VWS), Amazon EC2.]

Yong Zhao, Mihael Hategan, Ioan Raicu, Mike Wilde, Ben Clifford


Provisioning Resources

Condor Glideins
  Provision resources for a Condor pool by “gliding in” Condor daemons as jobs through a GRAM interface

MyCluster
  Deploys “personal clusters” (SGE or Condor) on TeraGrid resources.
  Unlike Condor Glideins, each user gets their own cluster, instead of having resources added to an existing pool

Falkon
  Deploys a daemon that is specially optimized for scheduling lightweight tasks, such as those found in workflows.


The Underpinnings: Resource Provisioning as Central Abstraction for Grid Computing


Of Jobs and Resources

Job management vs resource provisioning
  The focus of infrastructure available today is to schedule jobs
  Resources are provisioned as a side-effect of job deployment

We need general-purpose leases
  Associated with the required environment
  With a well-defined resource quota and availability
  General-purpose: adaptable to many interactions (managed by customized schedulers, log in via ssh, etc.)

Building on the side-effect
  Request a resource quota
  Once obtained, prepare the environment and make it available via some required method


Two Things to Consider

Lease Terms
  Expressive enough so that we can define a variety of leases
    Best-effort, advance reservations (including immediate leases), periodic, preemptible/non-preemptible, etc.
  A description of the required resources: leasing CPUs, storage…
  An environment
  Making lease terms explicit
    Standards (OGF): WS-Agreement, JSDL

Enforcement
  Corresponding to lease terms
  Enforcing a large set of the desired features
  Enforcing them efficiently
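To make "explicit lease terms" concrete, here is a minimal Python sketch of a lease record covering the elements listed above: a lease type, a resource description, an environment, and a preemptibility flag. It is an illustration only and does not follow the WS-Agreement or JSDL schemas; all class, field, and function names are assumptions, and periodic leases are left out for brevity. An immediate lease is modeled as an advance reservation that starts now, as the slide suggests.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class LeaseKind(Enum):
    BEST_EFFORT = "best-effort"
    ADVANCE_RESERVATION = "advance-reservation"


@dataclass(frozen=True)
class LeaseTerms:
    kind: LeaseKind
    cpus: int                      # resource description: CPUs
    storage_gb: int                # resource description: storage
    environment: str               # hypothetical image/appliance name
    duration: timedelta
    preemptible: bool = False
    start: Optional[datetime] = None   # only meaningful for reservations

    def validate(self) -> None:
        """Reject terms that no provider could enforce."""
        if self.cpus <= 0 or self.storage_gb < 0:
            raise ValueError("need a positive CPU count and non-negative storage")
        if self.duration <= timedelta(0):
            raise ValueError("lease duration must be positive")
        if self.kind is LeaseKind.ADVANCE_RESERVATION and self.start is None:
            raise ValueError("an advance reservation needs a start time")


def immediate_lease(cpus: int, storage_gb: int, environment: str,
                    duration: timedelta, now: datetime) -> LeaseTerms:
    """An immediate lease: an advance reservation whose start time is `now`."""
    terms = LeaseTerms(LeaseKind.ADVANCE_RESERVATION, cpus, storage_gb,
                       environment, duration, preemptible=False, start=now)
    terms.validate()
    return terms
```

Keeping validation next to the terms reflects the second point on the slide: terms are only useful if they correspond to something that can be enforced.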


Virtual Workspaces

A dynamically provisioned environment
  Environment definition: we get exactly the (software) environment we need on demand.
  Resource allocation: provision the resources the workspace needs (CPUs, memory, disk, bandwidth, availability), allowing for dynamic renegotiation to reflect changing requirements and conditions.

Implementation
  Traditional means: publishing, automated configuration, coarse-grained enforcement
  Virtual Machines: encapsulated configuration and fine-grained enforcement

Paper: “Virtual Workspaces: Achieving Quality of Service and Quality of Life in the Grid”


The Virtues of Virtualization

[Figure: virtualization stack. Hardware at the bottom; a Virtual Machine Monitor (VMM) / Hypervisor above it; three VMs on top, each running applications on its own guest OS (Linux, NetBSD, Windows). Example VMMs: Xen, VMWare, UML, KVM, Parallels, etc.]

Bring your environment with you
Excellent enforcement and isolation
Fast to deploy, enables short-term leasing
Have a performance impact but it is acceptable for most modern hypervisors
Suspend/resume, migration


Deploying Workspaces Remotely

[Figure: the VWS Service and a set of pool nodes onto which a workspace is deployed.]

Workspace
  Workspace metadata
  Pointer to the image
  Logistics information

Deployment request
  CPU, memory, node count, etc.

Virtual Workspace implementation: http://workspace.globus.org
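As a rough illustration of what a deployment request carries and how a service might map it onto pool nodes, the sketch below uses a simple first-fit placement. This is an assumption made for the example, not a description of how the workspace service actually schedules; `PoolNode`, `DeploymentRequest`, `place`, and the image URL are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PoolNode:
    name: str
    free_cpus: int
    free_memory_mb: int


@dataclass(frozen=True)
class DeploymentRequest:
    image_url: str     # pointer to the image (hypothetical value in the usage below)
    cpus: int          # per VM
    memory_mb: int     # per VM
    node_count: int    # number of VMs, one per pool node


def place(request: DeploymentRequest, pool: List[PoolNode]) -> Optional[List[str]]:
    """First-fit placement: return the chosen node names, or None if the
    request cannot be satisfied. Capacity is reserved only on success."""
    fitting = [n for n in pool
               if n.free_cpus >= request.cpus
               and n.free_memory_mb >= request.memory_mb]
    chosen = fitting[:request.node_count]
    if len(chosen) < request.node_count:
        return None
    for node in chosen:
        node.free_cpus -= request.cpus
        node.free_memory_mb -= request.memory_mb
    return [node.name for node in chosen]


if __name__ == "__main__":
    pool = [PoolNode(f"node{i}", free_cpus=2, free_memory_mb=2048) for i in range(3)]
    req = DeploymentRequest("http://example.org/images/worker.img", 2, 1024, 2)
    print(place(req, pool))
```

Rejecting the whole request when too few nodes fit keeps the sketch all-or-nothing, which matches the idea of a request for a fixed node count.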


Interacting with Workspaces

[Figure: the VWS Service and a set of pool nodes, with a Trusted Computing Base (TCB) on the pool nodes.]

The workspace service publishes information on each workspace as standard WSRF Resource Properties.

Users can query those properties to find out information about their workspace (e.g. what IP the workspace was bound to)

Users can interact directly with their workspaces the same way they would with a physical machine.


Cloud Computing


Virtual Workspaces for STAR

STAR image configuration
  A virtual cluster composed of an OSG headnode and STAR worker nodes

Using the workspace service over EC2 to provision resources
  Allocations of up to 100 nodes
  Dynamically contextualized for out-of-the-box cluster


[Figure: animated display of STAR sites (PDSF, Fermi, VWS/EC2, BNL, WSU) with per-site “Running jobs” counters and “Job Completion” and “File Recovery” indicators.]

with thanks to Jerome Lauret and Doug Olson of the STAR project


[Figure: accelerated display of a workflow job state for NERSC PDSF, EC2 (via Workspace Service), and WSU. Y = job number, X = job state.]

with thanks to Jerome Lauret and Doug Olson of the STAR project


Can I Do It at Home?

Challenge: how can I provide a “cloud” using virtualization without disrupting the current operation of my cluster?

Flying Low: the Workspace Pilot
  Integrates with popular LRMs (such as PBS)
  Implements “best effort” leases
  Glidein approach: submits a “pilot” program that claims a resource slot
  Provides administrator tools
  Deployed on UC TeraPort


Workspace Pilot in Action

[Figure: VWS, the LRM (PBS), and three Xen dom0 nodes hosting VMs. Level 1: provision raw resources. Level 2: provision VMs.]


Division of Labor: Decoupling Leasing and Platform Management


A Word from the Expert

The greatest improvements in the productive powers of labour, and the greater part of the skill, dexterity, and judgment with which it is anywhere directed, or applied, seem to have been the effects of the division of labour.

(Adam Smith)


What Does It Mean to “Provide Resources”?

[Figure: layered stacks labeled Environment over Hardware, and Virtual Appliance over VMM over Hardware.]

Software management packages: e.g., COD, Bcfg, Pacman (requires privilege or can deal only with upper layers of software)

Providing isolation via e.g. Xen, KVM, VMware, Vserver (different levels of encapsulation)


Where Do Appliances Come From?

[Figure: Appliance Provider (a user, a VO, a Grid…); appliance description; Marketplaces (VMWare, EC2, Workspace …).]

Good… but: ease-of-use? maintenance? all those formats?


Where Do Appliances Come From?

[Figure: Appliance Provider (a user, a VO, a Grid…); appliance description; Appliance Management Software (OSFarm, rPath, CohesiveFT…); formats: Xen, VMware, CDROM; Marketplaces (VMWare, EC2, Workspace …).]

Better!


Deploying Appliances

Appliances are “portable”
  They can be reused and customized to many contexts

Making the appliance context-aware:
  Other appliances
  Site-specific information (e.g. a DNS server)
  User/group/VO/Grid-specific information (e.g. public keys, host certs, gridmapfiles, etc.)

Security issues
  Who do I trust to provide legitimate context information?
  How do I make sure that appliances adhere to my site policies?

[Figure: VMs deployed at a site and grouped into a Virtual Organization.]


Where Do Appliances Come From?

[Figure: Appliance Provider (a user, a VO, a Grid…); appliance description, appliance assertions, appliance contextualization; Appliance Management Software (OSFarm, rPath, CohesiveFT…); formats: Xen, VMware, CDROM; Marketplaces (VMWare, EC2, Workspace …).]


Contextualization

Challenge: Putting a VM in the deployment context of the Grid, site, and other VMs
  Assigning and sharing IP addresses, name resolution, application-level configuration, etc.

Solution: Management of Common Context
  Configuration-dependent provides & requires
  Common understanding between the image “vendor” and deployer
  Mechanisms for securely delivering the required information to images across different implementations

[Figure: a contextualization agent and the Common Context, which holds IP, hostname, and pk.]

Paper: “A Scalable Approach To Deploying And Managing Appliances”, TeraGrid conference 2007
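The provides/requires idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the contextualization protocol from the cited paper: the `Appliance` class, the `name.key` naming scheme, and the sample addresses are all hypothetical, and secure delivery of the context is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Appliance:
    name: str
    provides: Dict[str, str]                            # values this VM publishes
    requires: List[str] = field(default_factory=list)   # context keys this VM needs


def build_common_context(appliances: List[Appliance]) -> Dict[str, str]:
    """Gather everything the appliances provide into one shared context,
    keyed as '<appliance name>.<key>'."""
    context: Dict[str, str] = {}
    for appliance in appliances:
        for key, value in appliance.provides.items():
            context[f"{appliance.name}.{key}"] = value
    return context


def resolve(appliance: Appliance, context: Dict[str, str]) -> Dict[str, str]:
    """Return the context entries an appliance requires, failing loudly if
    any are missing instead of booting a half-configured VM."""
    missing = [key for key in appliance.requires if key not in context]
    if missing:
        raise KeyError(f"{appliance.name} is missing context: {missing}")
    return {key: context[key] for key in appliance.requires}


if __name__ == "__main__":
    head = Appliance("head", {"ip": "10.0.0.1", "hostname": "head.example.org"})
    worker = Appliance("worker", {"ip": "10.0.0.2"}, requires=["head.ip", "head.hostname"])
    ctx = build_common_context([head, worker])
    print(resolve(worker, ctx))
```

In this model the image vendor declares `provides` and `requires` once, and the deployer only has to fill in the shared context at deployment time.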


Workspace Ecosystem

Resource Providers: local clusters, Grid resource providers (TeraGrid, OSG), commercial providers (EC2, Sun, slicehost)
  Provisioning a resource, not a platform

Appliance Providers: Virtual Organizations, groups or individual users
  via OSFarm, rPath, CohesiveFT, bcfg2, etc.

Middleware: appliances --> resources
  Manage secure appliance deployment
  Combining networks and storage
  VWS, EC2


Making Leases Cost-Effective


What Can We Afford?

Enforceable == cost-effective
  Can we afford to provide desirable features?
    “advance reservation” semantics, preemption for urgent computing, renegotiation, etc.
  Typically utilization is a problem: draining
  Preemption could help but has issues
    Porting to use specific software required

VMs to the rescue…
  Suspend/resume, migration
  Policies, features, etc.
  Thesis work by Borja Sotomayor


The Draining Problem

[Figure: two schedules over Node 1, Node 2, and Node 3 around a short-term lease, one without pre-emption and one with pre-emption.]


Combining Leasing and Job Management

[Figure: a Lease Manager, which manages the VMs (image transfers, start, stop, suspend, resume), and an Execution Manager, which manages the applications running inside the VMs, on top of VMM-enabled worker nodes. Labeled flows: lease requests, job requests, best-effort lease requests, events.]


Interleaving “Soft” and “Hard” Leases

Injected leases are short (1h-2h), very frequent (every 4 to 8 hours), large (number of nodes between 1/3 and ½ of the cluster)

Not using VMs (even with backfilling) results in a noticeable hit on runtime. In this case, the scheduler cannot readily start large parallel jobs because of the resource leases. With VMs, these can be started, and suspended before the leases start.
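A back-of-the-envelope model shows why suspend/resume helps with draining. The Python sketch below is a deliberate simplification and not the scheduler evaluated in the talk: it assumes one large job, no smaller job available to backfill the gap, and a fixed suspend overhead; the function names and the numbers in the usage are made up for illustration.

```python
def idle_node_hours_without_preemption(nodes: int, gap_hours: float,
                                       job_hours: float) -> float:
    """A job that cannot finish before the upcoming lease is not started,
    so its nodes sit idle for the whole gap."""
    if job_hours <= gap_hours:
        return 0.0          # the job fits before the lease; nothing is wasted
    return nodes * gap_hours


def idle_node_hours_with_suspend(nodes: int, gap_hours: float,
                                 job_hours: float,
                                 suspend_overhead_hours: float) -> float:
    """With VMs the job starts right away and is suspended just before the
    lease begins; only the time spent suspending is lost."""
    if job_hours <= gap_hours:
        return 0.0
    return nodes * min(suspend_overhead_hours, gap_hours)


if __name__ == "__main__":
    # Illustrative numbers only: 16 nodes, 3 hours until the lease, a 5-hour job,
    # and 15 minutes to suspend the VMs.
    print(idle_node_hours_without_preemption(16, 3.0, 5.0))   # 48.0
    print(idle_node_hours_with_suspend(16, 3.0, 5.0, 0.25))   # 4.0
```

Under these assumptions the lost capacity drops from the full gap to the suspend overhead, which is the effect the slide attributes to running jobs in VMs.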


Conclusions

Grid applications are demanding
  Complex workflows, multi-component applications, complex environment requirements, critical execution needs, etc.

Higher-level languages to manage location-independent computing

Resource leasing as a central abstraction of Grid computing
  Enables acquisition of required resources
  SLAs: making relationships very explicit
  Short-term leasing an enabling element
  Environment management: level the playing field for many communities
  Users are voting with their feet


Conclusions (cont'd)

Towards centralized resource leasing
  Economies of scale
  Load-balancing demand

Division of labor
  Appliance providers
  Resource providers

Leveraging diversity of leases -> economy of scale


Credits

Workspace team: Tim Freeman, Borja Sotomayor

Guest appearances: Ian Foster, Frank Siebenlist

With thanks to many collaborators: Jerome Lauret (STAR, BNL), Doug Olson (STAR, LBNL), Marty Wesley (rPath), Stu Gott (rPath), Ken Van Dine (rPath), Predrag Buncic (Alice, CERN), Haavard Bjerke (CERN), Rick Bradshaw (Bcfg2, ANL), Narayan Desai (Bcfg2, ANL), Duncan Penfold-Brown (Atlas, uvic), Ian Gable (Atlas, uvic), David Grundy (Atlas, uvic), Ti Leggit (University of Chicago), Greg Cross (University of Chicago), Mike Papka (University of Chicago/ANL)