Enabling eScience: Open Software, Standards, Infrastructure


Enabling eScience: Open Software, Standards, Infrastructure

Ian Foster

Argonne National Laboratory

University of Chicago

Globus Alliance

www.mcs.anl.gov/~foster

UK eScience Meeting, Nottingham, September 2, 2004

The Grid Meets the BBC

“The Grid is an international project that looks in detail at a terrorist cell operating on a global level and a team of American and British counter-terrorists who are tasked to stop it”

Gareth Neame, BBC's head of drama

A Better Characterization?

“The Grid is an international project that looks in detail at scientific collaborations operating on a global level and a team of computer scientists who are tasked to enable them”

But perhaps not as telegenic?

eScience & Grid: 6 Theses

1. Scientific progress depends increasingly on large-scale distributed collaborative work
2. Such distributed collaborative work raises challenging problems of broad importance
3. Any effective attack on those problems must involve close engagement with applications
4. Open software & standards are key to producing & disseminating required solutions
5. Shared software & service infrastructure are essential application enablers
6. A cross-disciplinary community of technology producers & consumers is needed

Implication: A Problem-Driven, Collaborative R&D Methodology

[Diagram: a Design, Build, Deploy, Apply, Analyze cycle linking computer science, software & standards, infrastructure, and discipline advances]

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

Why Open Software Matters

eScience requires sophisticated functionality but is a small “market”; commercial software does not meet its needs
Open software can help jumpstart development by reducing barriers to entry
Encourage adoption of common approaches to key technical problems
Enable a broad Grid technology ecosystem
A basis for international cooperation
A basis for cooperation with industry

“Open Software” is Ultimately about Community

Contributors: design, development, packaging, testing, documentation, training, support; united by a common architectural perspective
Users: may be major contributors via, e.g., testing
Governance structure: to determine how the software evolves
Processes for coordinating all these activities: packaging, testing, reporting, …
An ecosystem of complementary components, enabled by an appropriately open architecture

“Ecosystem”?

Not a monoculture …
… or a Cambrian explosion …
… but a web of components

E.g., Globus Alliance & Toolkit (Argonne, USC/ISI, Edinburgh, PDC, NCSA)

An international partnership dedicated to creating & disseminating high-quality open source Grid technology: the Globus Toolkit
Design, engineering, support, governance
Academic Affiliates make major contributions
EU: CERN, MPI, Poznan, INFN, etc.
AP: AIST, TIT, Monash, etc.
US: SDSC, TACC, UCSB, UW, etc.
Significant industrial contributions & adoption
1000s of users worldwide, many contribute

Broader Ecosystem*: Example Complementary Projects

NSF Middleware Initiative: packaging, testing, additional components
Virtual Data Toolkit (GriPhyN + PPDG): GT, Condor, Virtual Data System, etc.
EGEE and “gLite”: close collaboration with Globus + Condor
TeraGrid, Earth System Grid, NEESgrid, …: consume and produce components
Open Middleware Infrastructure Institute: collaboration on components, testing, etc.

* See tutorial by Lee Liming: AHM, GGF, SC’2004.

Broader Ecosystem: E.g., NMI Distributed Test Facility (NSF Middleware Initiative’s GRIDS Center)

How Grid Software Works: NSF Network for Earthquake Engineering Simulation (NEES)

Transform our ability to carry out research vital to reducing vulnerability to catastrophic earthquakes

Building a NEES Collaboratory: What the User Wants

Secure, reliable, on-demand access to data, software, people, and other resources (ideally all via a Web Browser)

How it Really Happens (A Simplified View)

Users work with client applications
Application services organize VOs & enable access to other services
Collective services aggregate &/or virtualize resources
Resources implement standard access & management interfaces

[Diagram: a web browser and web portal connect users to application services (data viewer tool, simulation tool, chat tool, telepresence monitor) and to collective services (data catalog, registration service, credential repository, certificate authority), which in turn rely on resources such as compute servers, database services, and cameras]

How it Really Happens (without Grid Software)

[Diagram: the same collaboratory components as the previous slide, labeled A-E, with a tally of where each component comes from]

Component count by source:
Application developer: 10
Off the shelf: 12
Globus Toolkit: 0
Grid community: 0

How it Really Happens (with Grid Software)

[Diagram: the same collaboratory, now assembled largely from existing components: a CHEF portal & chat teamlet, MyProxy credential repository, Globus MCS/RLS data catalog, Globus Index Service, Globus GRAM on the compute servers, and Globus DAI in front of the database services, alongside the certificate authority, data viewer tool, simulation tool, cameras, and telepresence monitor]

Component count by source:
Application developer: 2
Off the shelf: 9
Globus Toolkit: 5
Grid community: 3

NEESgrid Multisite Online Simulation Test (July 2003)

[Chart: number of participants over the course of the day, roughly 8:00 to 18:30, at the UIUC, Colorado, and Illinois (simulation) sites]

NEESgrid Summary

A successful “turn of the crank”: s/w produced & deployed on time & on budget, and new applications enabled
A producer as well as a consumer of Grid s/w
Many sociopolitical “learning opportunities” across 4 tasks: develop s/w, engineer s/w, elicit requirements, educate the community
Experiment-driven deployment™ was key
“No victory is final”: challenges remain, including handing off s/w to a separate operations team and the politically charged sharing of facilities and data

Software: Summary

Good software arises from trying to solve real problems in real projects, & then generalizing
E.g., Globus: security, job submission/mgmt, data movement, monitoring, etc.
The result is solutions that make sense within a wide variety of applications
Solve real problems, but not every problem: the resulting software is not a “turnkey” solution for any significant application
“Turnkey” solutions require integration; factoring can extract higher-level “solutions”

Example “Solutions”

Portal-based User Registration System (PURSE): web-based authentication & authorization management (source: Earth System Grid, PDC)
Lightweight Data Replicator: data replication management (source: LIGO)
Workflow execution & management: DAGMan + Condor-G + Globus components (source: Virtual Data Toolkit); see the sketch below
Service monitoring & fault detection (source: Earth System Grid)
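To make the workflow item above concrete, here is a minimal, hypothetical sketch of how the pieces compose: Condor-G submit files route each job to a remote Globus GRAM gatekeeper, and a DAGMan file expresses the dependency between jobs. This is not the Virtual Data Toolkit's actual tooling; the gatekeeper contact string, executables, and arguments are placeholders, and the submit-file keywords shown are those of later Condor releases (older Condor-G used "universe = globus").

```python
# Hypothetical sketch: generate a two-step DAGMan workflow whose jobs are
# dispatched through Condor-G to a Globus GRAM gatekeeper.
# Hostnames, paths, and jobmanager names below are placeholders.
from pathlib import Path

GATEKEEPER = "gatekeeper.example.org/jobmanager-pbs"  # assumed GRAM contact string

CONDOR_G_TEMPLATE = """\
universe      = grid
grid_resource = gt2 {gatekeeper}
executable    = {executable}
arguments     = {arguments}
output        = {name}.out
error         = {name}.err
log           = workflow.log
queue
"""

def write_job(name: str, executable: str, arguments: str) -> str:
    """Write one Condor-G submit file and return its filename."""
    submit = f"{name}.sub"
    Path(submit).write_text(CONDOR_G_TEMPLATE.format(
        gatekeeper=GATEKEEPER, name=name,
        executable=executable, arguments=arguments))
    return submit

# Two illustrative jobs: a simulation stage followed by an analysis stage.
sim = write_job("simulate", "/usr/local/bin/simulate", "--events 1000")
ana = write_job("analyze", "/usr/local/bin/analyze", "simulate.out")

# DAGMan file: run "analyze" only after "simulate" completes successfully.
Path("workflow.dag").write_text(
    f"JOB simulate {sim}\n"
    f"JOB analyze {ana}\n"
    "PARENT simulate CHILD analyze\n"
)

print("Submit with: condor_submit_dag workflow.dag")
```

In the Virtual Data Toolkit stack described later, the Chimera Virtual Data System generates such DAGs from higher-level virtual-data descriptions rather than from hand-written scripts like this one.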

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

“Standards”: Examples of Success

Grid Security Infrastructure: broadly used, multiple implementations, WS-Security; a rich Grid security ecosystem, with linkages to MyProxy, OTP, KX509, Shibboleth, … (a proxy-credential sketch follows below)
GridFTP: broadly used, multiple implementations
WSDL/SOAP: facilitating service-oriented architectures
OGSI/WSRF: many find these encode useful patterns & behaviors
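As a rough illustration of the proxy-credential pattern behind GSI (standardized as the RFC 3820 proxy certificate cited under Standards Compliance below), here is a hypothetical sketch using the Python cryptography package: a short-lived certificate, carrying a fresh key, is signed by the user's own long-lived credential rather than by a CA, so it can be delegated to jobs and services. A real GSI proxy additionally carries the proxyCertInfo extension and follows stricter naming and path-validation rules; the names and lifetimes here are placeholders.

```python
# Hypothetical sketch of the GSI proxy-credential idea (cf. RFC 3820), using
# the Python "cryptography" package. A real GSI proxy also carries the
# proxyCertInfo extension; this sketch only shows the core pattern: a
# short-lived certificate with its own fresh key, signed by the user's
# long-lived credential rather than by a CA.
import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

def make_name(*cns):
    return x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, cn) for cn in cns])

now = datetime.datetime.utcnow()

# Stand-in for the user's long-lived credential (normally issued by a Grid CA;
# self-signed here only to keep the sketch self-contained).
user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
user_cert = (
    x509.CertificateBuilder()
    .subject_name(make_name("Jane Scientist"))
    .issuer_name(make_name("Jane Scientist"))
    .public_key(user_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(days=365))
    .sign(user_key, hashes.SHA256())
)

# The proxy: a fresh key pair, the user's name with an extra CN appended,
# a short lifetime (hours, not a year), signed by the *user's* key.
proxy_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
proxy_cert = (
    x509.CertificateBuilder()
    .subject_name(make_name("Jane Scientist", "proxy"))
    .issuer_name(user_cert.subject)
    .public_key(proxy_key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(now)
    .not_valid_after(now + datetime.timedelta(hours=12))
    .sign(user_key, hashes.SHA256())
)

print("proxy subject:", proxy_cert.subject.rfc4514_string())
print("proxy issuer: ", proxy_cert.issuer.rfc4514_string())
```

Tools such as grid-proxy-init and MyProxy automate this step (with the full RFC 3820 semantics), so that jobs and portals can act on a user's behalf using short-lived credentials.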

Standards: Status

Open Grid Services Architecture (OGSA): the lighthouse by which we steer; defines requirements & priorities, but far from complete
W3C, OASIS, GGF, DMTF, IETF: good things are happening in many areas (WS-Agreement, DAIS, SRM, …)
But for those building systems today? Problem areas: monitoring, policy, data, etc.; ad hoc approaches will cost us big later
“Experiment-driven deployment” on an international scale to drive interoperability of infrastructure & code

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

Infrastructure

Broadly deployed services in support of virtual organization formation and operation: authentication, authorization, discovery, …
Services, software, and policies enabling on-demand access to important resources: computers, databases, networks, storage, software services, …
Operational support for 24x7 availability
Integration with campus infrastructures
Distributed, heterogeneous, instrumented systems can be wonderful CS testbeds

Grid2003: An Operational Grid

28 sites (2100-2800 CPUs) & growing, with sites spanning the US & Korea
400-1300 concurrent jobs
8 substantial applications + CS experiments
Running since October 2003

http://www.ivdgl.org/grid2003

Grid2003 Software Stack (“Virtual Data Toolkit”)

Application
Chimera Virtual Data System
DAGMan and Condor-G
Globus Toolkit: GSI, GRAM, GridFTP, etc. (see the data-movement sketch below)
Site schedulers and file systems
Clusters and storage systems

Three levels of deployment:
+ Site services: GRAM, GridFTP, etc.
+ Global & virtual organization services
+ IGOC: iVDGL Grid Operations Center
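As a small, hypothetical illustration of the data-movement layer of this stack, the sketch below drives a GridFTP transfer with the standard globus-url-copy client from the Globus Toolkit. The hostnames and paths are invented, and a valid proxy credential (e.g. obtained with grid-proxy-init) is assumed to be in place.

```python
# Hypothetical sketch: replicate a file between two storage elements with
# GridFTP, by invoking the standard globus-url-copy client. The hosts and
# paths are placeholders; a valid proxy credential is assumed to exist.
import subprocess

SOURCE = "gsiftp://se1.example.org/data/run042/events.dat"   # assumed source
DEST   = "gsiftp://se2.example.org/replicas/events.dat"      # assumed destination

def gridftp_copy(src: str, dst: str, parallel_streams: int = 4) -> None:
    """Run globus-url-copy for a (possibly third-party) GridFTP transfer."""
    cmd = [
        "globus-url-copy",
        "-vb",                        # report transfer performance
        "-p", str(parallel_streams),  # parallel TCP streams
        src,
        dst,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    gridftp_copy(SOURCE, DEST)
```

Because both endpoints are gsiftp:// URLs, this is a third-party transfer: the data flows directly between the two storage elements rather than through the host that issues the command.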

Grid2003 Metrics

Metric                                  Target      Achieved
Number of CPUs                          400         2762 (28 sites)
Number of users                         > 10        102 (16)
Number of applications                  > 4         10 (+CS)
Sites running concurrent applications   > 10        17
Peak number of concurrent jobs          1000        1100
Data transfer per day                   > 2-3 TB    4.4 TB max

Grid2003 Applications To Date

CMS proton-proton collision simulation
ATLAS proton-proton collision simulation
LIGO gravitational wave search
SDSS galaxy cluster detection
ATLAS interactive analysis
BTeV proton-antiproton collision simulation
SnB biomolecular analysis
GADU/Gnare genome analysis
Various computer science experiments

www.ivdgl.org/grid2003/applications

Example Grid3 Application: NVO Mosaic Construction

NVO/NASA Montage: A small (1200 node) workflow

Construct custom mosaics on demand from multiple data sources

User specifies projection, coordinates, size, rotation, spatial sampling

Work by Ewa Deelman et al., USC/ISI and Caltech


Next Step: Open Science Grid

U.S. (international?) consortium to provide services to a broad set of sciences
Grid3 as a starting point, expanding to include many more sites
A major focus is the MOU/SLA structure required to sustain & scale operations: resource providers, resource consumers, virtual organizations
We hope to collaborate with TeraGrid, EGEE, UK NGS, etc.

Infrastructure: Summary

Encouraging progress:
A real understanding of how to operate Grid infrastructures is emerging
Production infrastructures are appearing and are being relied upon for real science

Significant areas of concern remain:
Security is going to get harder
International interoperability is still elusive
We haven’t got the right model for sustained infrastructure development & support

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

Community

Big picture is extremely positive:
The “eScience”/“Grid” community is large, enthusiastic, smart, and diverse
Significant exchange of ideas, software, personnel, experiences
Real application-CS cooperation

We can do better in various specific areas:
Not clear we’re always focusing on the real problems: often viewed as “mundane”??
The CS community could be even more engaged
Software development as a community effort

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

What’s New in GT 4.0 (January 31, 2005)

For all: additions in data, security, execution, XIO, …; improved packaging, testing, performance, usability, documentation, standards compliance (phew); WS components ready for broader use
For the end user: more complementary tools & solutions; C, Java, Python APIs; command line tools
For the developer: Java (Axis/Tomcat) hosting greatly improved; Python (pyGlobus) hosting for the first time

Apache Axis Web Services Container

Good news for Java WS developers: GT4.0 works with standard Axis* and Tomcat*
GT provides Axis-loadable libraries & handlers, including useful behaviors such as inspection, notification, and lifetime management (WSRF); others implement GRAM, etc.
Major Globus contributions to Apache: ~50% of the WS-Addressing code, ~15% of the WS-Security code, many bug fixes; the WSRF code is a possible next contribution

* Modulo Axis and Tomcat release cycle issues

[Diagram: the Axis container hosting security and addressing handlers, GT-provided libraries, and application code]
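To show the kind of message the handlers above deal with, here is a hypothetical sketch that builds a WSRF GetResourceProperty request by hand: WS-Addressing headers pick out the target service and, via a reference-property header, a particular resource behind it, and the body names the resource property to read. The namespace URIs, action URI, endpoint, and property names are illustrative only; GT 4.0 tracked draft versions of these specifications, so the exact URIs in a deployed container may differ.

```python
# Hypothetical sketch of a WSRF GetResourceProperty request with WS-Addressing
# headers. All URIs, the endpoint, and the property name are illustrative; a
# real GT 4.0 container may use different (draft) namespace versions, and the
# security handlers would normally add WS-Security headers as well.
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"         # SOAP 1.1, as in Axis 1.x
WSA  = "http://schemas.xmlsoap.org/ws/2004/08/addressing"  # assumed WS-Addressing draft
WSRP = "http://docs.oasis-open.org/wsrf/rp-2"              # assumed WS-ResourceProperties

ET.register_namespace("soapenv", SOAP)
ET.register_namespace("wsa", WSA)
ET.register_namespace("wsrp", WSRP)

envelope = ET.Element(f"{{{SOAP}}}Envelope")
header = ET.SubElement(envelope, f"{{{SOAP}}}Header")
body = ET.SubElement(envelope, f"{{{SOAP}}}Body")

# WS-Addressing: which service, which operation, and (via a reference-property
# header) which of the service's many resources we are talking to.
ET.SubElement(header, f"{{{WSA}}}To").text = (
    "https://grid.example.org:8443/wsrf/services/CounterService")  # placeholder
ET.SubElement(header, f"{{{WSA}}}Action").text = (
    WSRP + "/GetResourceProperty")                                  # illustrative action URI
ET.SubElement(header, "{http://example.org/counter}CounterKey").text = "12345"

# WSRF body: name the resource property to read. In a real message the "tns"
# prefix would be bound to the service's own namespace.
ET.SubElement(body, f"{{{WSRP}}}GetResourceProperty").text = "tns:Value"

print(ET.tostring(envelope, encoding="unicode"))
```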

Standards Compliance

Web services: WS-I compliance; all interfaces support the WS-I Basic Profile, modulo use of WS-Addressing
Security: a) WS-I Basic Security Profile (plaintext); b) IETF RFC 3820 proxy certificates
GridFTP: GGF GFD.020
Others in progress & being tracked: WSRF (OASIS), WS-Addressing (W3C), OGSA-DAI (GGF), RLS (GGF)

Globus Ecosystem (Just a Few Examples Listed Here)

Tools provide higher-level functionality: Nimrod-G, MPICH-G2, Condor-G, Ninf-G, NTCP telecontrol, GT4IDE Eclipse IDE
Packages integrate GT with other s/w: VDT, NMI, CTSS, NEESgrid, ESG
Solutions package a set of functionality: VO management, monitoring, replica management
Documentation, e.g. Borja Sotomayor’s tutorial

GT4.0 Release Schedule

Date           Stability level              Features added after?   Public interfaces changed after?
Aug 3          Alpha                        Yes                     Yes
Oct 15         Full-featured development    No                      Yes, but only if significant benefits
Dec 3          Beta-quality development     No                      No
Jan 31, 2005   Stable release (FINAL)       No                      No

We’re Getting a Lot of Help, But Could Do with a Lot More

Testing and feedback: users, developers, and deployers should plan to use the software now & provide feedback; tell us what is missing, what performance you need, what interfaces & platforms, …; ideally, also offer to help meet needs (-:
Related software, solutions, documentation: adapt your tools to use GT4; develop new GT4-based components; develop GT4-based solutions; develop documentation components

Overview

How are we doing? Software, standards, infrastructure, community
An advertorial, and a request for input: Globus Toolkit version 4
Summary

eScience & Grid: 6 Theses

1. Scientific progress depends increasingly on large-scale distributed collaborative work
2. Such distributed collaborative work raises challenging problems of broad importance
3. Any effective attack on those problems must involve close engagement with applications
4. Open software & standards are key to achieving a critical mass of contributors
5. Shared software & service infrastructure are essential application enablers
6. A cross-disciplinary community of technology producers & consumers is vital

Overall, We are Doing Well

Communities & individuals are, increasingly, using the Grid to advance their science

Broad consensus on many key architecture concepts, if not always their implementation

Significant base of open source software, widely used in applications & infrastructure

Service-oriented arch facilitates cooperation on software development & code reuse

Grid standards are making a difference on a daily basis: e.g., GSI, GridFTP


Overall, We are Doing Well (2)

A real understanding of how to operate Grid infrastructures is emerging

Production infrastructures are appearing and are being relied upon for real science

Productive international cooperation is occurring at many levels

A vibrant community has formed and shows no signs of slowing down

Real connections have been formed between computer science & applications


Problem-Driven, Collaborative R&D Methodology

[Diagram: the same Design, Build, Deploy, Apply, Analyze cycle as before, now embedded in a global community linking computer science, software & standards, infrastructure, and discipline advances]

Software Ecosystem

Not a monoculture …
… or a Cambrian explosion …
… but a web of components

We Can Certainly Do Better

Be smarter about how we work with users: not enough to point people at a manual; treat s/w as shared infrastructure, to be developed, engineered, tested, improved; be honest about costs, time scales, & expertise
Establish real collaboration on software: partition the space of what to do (it’s large); partners, not customers or competitors
Tackle process issues explicitly: standardize on packaging, testing, support; deployment, operations, security issues

We Can Certainly Do Better (2)

Aspire to code reuse & interoperability: interoperability layers are not the answer; recognize the costs of noninteroperability
Focus standards efforts on the real problems faced when sharing software & infrastructure: quit fiddling with Web services infrastructure!
Build sustained, critical-mass teams: the problems are hard and require time & expertise
Build and operate large-scale Grids with real application groups to drive all of this, with an explicit O(5)-year focus and goals

Thanks, in particular, to:

Carl Kesselman and Steve Tuecke, my long-time Globus co-conspirators

Kate Keahey, Lee Liming, Jennifer Schopf, Gregor von Laszewski, Mike Wilde @ Argonne

Globus Alliance members at Argonne, U.Chicago, USC/ISI, Edinburgh, PDC, NCSA

Miron Livny, U.Wisconsin Condor project

Other partners in Grid technology, application, & infrastructure projects

DOE, NSF (esp. NMI program), NASA, IBM, Microsoft for generous support

For More Information

Globus Alliance www.globus.org

Global Grid Forum www.ggf.org

Open Science Grid www.opensciencegrid.org

Background information www.mcs.anl.gov/~foster

GlobusWORLD 2005 Feb 7-11, Boston

2nd Edition: www.mkp.com/grid2

Extra Slides

Globus Toolkit: A Brief History

GT 1.0 (1998) to 2.0 (2002): PKI-based Grid Security Infrastructure; execution (GRAM), data (GridFTP), info (MDS); gradual introduction of support processes
GT 3.0 (June 2003) and 3.2 (Feb 2004): international collaboration; higher-level services (replica location, file transfer, registry, credential repository); refactoring of GT mechanisms into a WS framework; most production deployments recommended to use pre-WS (“GT2.4”) components

NEESgrid Software Details

Inputs: essentially all of GT3.2 (GSI, GridFTP, GRAM, MDS, …; a coherent architecture helps!); CHEF, Creare Data Turbine, OpenSEES
Custom development: NTCP telecontrol, data management, etc.
Integration: all of the above, and more
Outputs: the NEESgrid system (to the NEES consortium); NTCP components (to the Globus Toolkit); …
