Services for Science v2 (APAN26)

Ian Foster Computation Institute Argonne National Lab & University of Chicago Services for Science

Upload: ian-foster

Post on 10-May-2015


DESCRIPTION

Talk given at APAN26 in beautiful Queenstown, New Zealand. Updates my INGRID talk with new material on caBIG--including some nice slides provided by Ravi Madduri.

TRANSCRIPT

Page 1: Services for Science v2 (APAN26)

Ian Foster

Computation Institute

Argonne National Lab & University of Chicago

Services for Science

Page 2: Services for Science v2 (APAN26)

Thanks!

DOE Office of Science

NSF Office of Cyberinfrastructure

National Institutes of Health

Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere

Page 3: Services for Science v2 (APAN26)

Scientific Communication, ~1600

Brahe, Kepler

Page 4: Services for Science v2 (APAN26)

1980

Page 5: Services for Science v2 (APAN26)

Scientific Communication, ~2000

Data Archives

User

Analysis tools

Gateway

Figure: S. G. Djorgovski

Discovery tools

Service-Oriented Science

Page 6: Services for Science v2 (APAN26)

Application Scenario

Location A: Microarray, Protein, Image data

Location B: Microarray, Protein, Image data

Location C: Microarray, Protein, Image data

Location C: Image Analysis

Location D: Image Analysis

Microarray and protein databases at other institutions

Different database systems, data representations, security

Different program invocation, remote access, data transfer

Page 7: Services for Science v2 (APAN26)

caBIG: sharing of infrastructure, applications, and data.

Data Integration!

Services & Cancer Biology

Globus

Page 8: Services for Science v2 (APAN26)

Service-Oriented Science

People create services (data, code, instruments) …

which I discover (& decide whether to use) …

& compose to create a new function …

& then publish as a new service.

I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!

I hope that this “someone else” can manage security, reliability, scalability, …

“Service-Oriented Science”, Science, 2005
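
The create, discover, compose, publish cycle on this slide can be sketched in a few lines. This is only an illustration: the in-memory `registry` and the functions `publish`, `discover`, and `compose` are hypothetical names invented here, not part of Globus or caGrid, and the two toy services stand in for real remote ones.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory registry: service name -> (description, callable).
Registry = Dict[str, Tuple[str, Callable]]


def publish(registry: Registry, name: str, description: str, fn: Callable) -> None:
    """Publish a service so that others can discover it."""
    registry[name] = (description, fn)


def discover(registry: Registry, keyword: str) -> List[str]:
    """Return the names of services whose description mentions the keyword."""
    return sorted(
        name for name, (desc, _) in registry.items() if keyword.lower() in desc.lower()
    )


def compose(registry: Registry, first: str, second: str) -> Callable:
    """Build a new function that feeds the output of one service into another."""
    f, g = registry[first][1], registry[second][1]
    return lambda x: g(f(x))


registry: Registry = {}

# People create services (here: a toy data service and a toy analysis service).
publish(registry, "expression-data", "microarray data lookup", lambda gene: [1.0, 2.0, 3.0])
publish(registry, "mean", "analysis: arithmetic mean", lambda xs: sum(xs) / len(xs))

# I discover them, compose them, and publish the result as a new service.
found = discover(registry, "analysis")
mean_expression = compose(registry, "expression-data", "mean")
publish(registry, "mean-expression", "analysis: mean microarray expression", mean_expression)

result = registry["mean-expression"][1]("TP53")
```

The composite is registered exactly like the services it was built from, which is the point of the last step on the slide: a later user can discover and reuse it without knowing how it was assembled.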

Page 9: Services for Science v2 (APAN26)

Creating Services

Page 10: Services for Science v2 (APAN26)

Anatomy of a Service

Figure labels: op1 … opN; (meta)data; Implementation(s); Clients; Registry; Management; Service; Attribute Authority; Persistence
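
As a rough reading of the figure, a service bundles named operations, inspectable metadata, and some persistent state behind one interface. The sketch below models only that; the `Service` class and its field names are assumptions made for illustration and do not correspond to any Globus Toolkit class.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Service:
    """Hypothetical model of the slide's labels; not a Globus Toolkit class."""

    metadata: Dict[str, str]                   # the "(meta)data" a client can inspect
    operations: Dict[str, Callable[..., Any]]  # op1 ... opN
    state: Dict[str, Any] = field(default_factory=dict)  # stands in for "persistence"

    def invoke(self, op: str, *args: Any) -> Any:
        """Dispatch a client request to the implementation behind an operation."""
        if op not in self.operations:
            raise KeyError(f"unknown operation: {op}")
        result = self.operations[op](*args)
        self.state["last_op"] = op  # record something that outlives the call
        return result


svc = Service(
    metadata={"name": "sequence-tools", "owner": "example-lab"},
    operations={"length": len, "upper": str.upper},
)
answer = svc.invoke("length", "ACGT")
```

Registry, management, and attribute-authority components are left out here; they sit around the service rather than inside it.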

Page 11: Services for Science v2 (APAN26)

Creating Services (~2005)

“This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”

Page 12: Services for Science v2 (APAN26)

Creating Services in 2008: Introduce and gRAVI

Introduce: define service, create skeleton, discover types, add operations, configure security

gRAVI (Grid Remote Application Virtualization Infrastructure): wrap executables

Figure labels: Introduce; Create; Appln Service; Store; Repository Service; Advertise; Index service; Discover; Invoke, get results; Transfer GAR; Deploy; Container

Ohio State University and Argonne/U.Chicago
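
The idea of wrapping an executable as a service operation can be illustrated with the Python standard library. This is not gRAVI's interface or output, only a sketch of the concept: `wrap_executable` is a hypothetical helper, and the "executable" is the current Python interpreter running a one-line script so the example runs anywhere.

```python
import subprocess
import sys
from typing import Callable, List


def wrap_executable(argv_prefix: List[str]) -> Callable[..., str]:
    """Turn a command-line program into a callable 'service operation'."""

    def operation(*args: str) -> str:
        completed = subprocess.run(
            argv_prefix + list(args),
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
        return completed.stdout.strip()

    return operation


# Stand-in executable: the current interpreter running a script that prints
# its first argument reversed.
reverse = wrap_executable([sys.executable, "-c", "import sys; print(sys.argv[1][::-1])"])
output = reverse("ACGT")
```

A real wrapper would also need to stage input files, report job status, and enforce security, which is the work the slide attributes to the tooling.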

Globus

Page 13: Services for Science v2 (APAN26)

Demonstration: Creating Services

Introduce + gRAVI

Shannon Hastings, Scott Oster, David Ervin, Stephen Langella, Kyle Chard, Ravi Madduri

Page 14: Services for Science v2 (APAN26)

Center for Enabling Distributed Petascale Science

Workflow Automation at DOE Facilities

Automation, Reproducibility, Security, Reusability

Storage, Metadata, Analysis, Visualization

Advanced Photon Source

Page 15: Services for Science v2 (APAN26)

Discovering Services

People create services (data or functions) …

which I discover (& decide whether to use) …

& compose to create a new function ...

& then publish as a new service.

I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!

I hope that this “someone else” can manage security, reliability, scalability, …

“Service-Oriented Science”, Science, 2005

Page 16: Services for Science v2 (APAN26)

Discovering Services

Assume success: billions of services

Syntax, semantics: types, ontologies

Permissions: can I use it?

Reputation: the ultimate arbiter?

A B
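
One way to read the slide's three concerns is as successive filters over a large pool of candidates: keep services whose types match, drop those the caller may not use, then rank by reputation. The sketch below assumes a hypothetical `Candidate` record and `rank_services` function; the reputation score is a made-up number between 0 and 1.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Candidate:
    name: str
    input_type: str                 # syntax/semantics: what the service accepts
    allowed_groups: FrozenSet[str]  # permissions: who may call it
    reputation: float               # hypothetical score, 0.0 to 1.0


def rank_services(
    candidates: List[Candidate], wanted_type: str, my_groups: Set[str]
) -> List[Candidate]:
    """Keep type-compatible, permitted services, best reputation first."""
    usable = [
        c
        for c in candidates
        if c.input_type == wanted_type and c.allowed_groups & my_groups
    ]
    return sorted(usable, key=lambda c: c.reputation, reverse=True)


candidates = [
    Candidate("align-a", "FASTA", frozenset({"public"}), 0.6),
    Candidate("align-b", "FASTA", frozenset({"cabig"}), 0.9),
    Candidate("image-seg", "DICOM", frozenset({"public"}), 0.8),
    Candidate("align-c", "FASTA", frozenset({"private-lab"}), 0.99),
]
ranked = [c.name for c in rank_services(candidates, "FASTA", {"public", "cabig"})]
```

Note that `align-c` has the best score but is filtered out by permissions, so reputation only arbitrates among services the caller can actually use.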

Page 17: Services for Science v2 (APAN26)

Discovery (1): Registries

Globus

Page 18: Services for Science v2 (APAN26)

Discovery (2): Standardized Vocabularies

Figure labels: Core Services; Grid Service; Service Metadata; Cancer Data Standards Repository; Enterprise Vocabulary Services; Index Service; Discovery Client API

Relationship labels: Uses Terminology Described In; References Objects Defined In; Publishes; Subscribes To and Aggregates; Queries Service Metadata Aggregated In; Registers To

Globus

Page 19: Services for Science v2 (APAN26)

Page 20: Services for Science v2 (APAN26)

Discovery (3): Tagging & Social Networking

GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)

Page 21: Services for Science v2 (APAN26)

Discovery (3): Tagging & Social Networking

David de Roure, Carole Goble, et al.

Page 22: Services for Science v2 (APAN26)

Composing Services

Page 23: Services for Science v2 (APAN26)

Composing Services: E.g., BPEL Workflow System

Figure labels: Researcher or client app; BPEL workflow doc; workflow inputs; workflow results; BPEL engine; Data service @ uchicago.edu; Analytic service @ osu.edu; Analytic service @ duke.edu; link (×4)

caBIG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.

See also Kepler & Taverna
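
The pattern in the figure, where a workflow document plus inputs goes to an engine that calls services in turn and returns results, can be mimicked with a toy engine. This is not BPEL and not the caGrid workflow service: the step format, `run_workflow`, and the three local functions standing in for remote services are all assumptions for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple


# Stand-ins for remote services; in the slide these live at different sites.
def data_service(query: str) -> List[float]:
    return {"sample-1": [2.0, 4.0, 6.0]}.get(query, [])


def analytic_mean(values: List[float]) -> float:
    return sum(values) / len(values)


def analytic_scale(value: float) -> float:
    return value * 10


# A "workflow document": ordered steps of (output name, service, input name).
Step = Tuple[str, Callable[[Any], Any], str]
workflow: List[Step] = [
    ("values", data_service, "query"),
    ("mean", analytic_mean, "values"),
    ("scaled", analytic_scale, "mean"),
]


def run_workflow(steps: List[Step], inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Toy engine: run each step in order, passing named results along."""
    context = dict(inputs)
    for output_name, service, input_name in steps:
        context[output_name] = service(context[input_name])
    return context


results = run_workflow(workflow, {"query": "sample-1"})
```

Keeping the workflow as data rather than code is what lets an engine, rather than the researcher's own program, take responsibility for executing it.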

Globus

Page 24: Services for Science v2 (APAN26)

Composing Services: Taverna

caGrid Scavenger with semantic/metadata-based caGrid service query

A sample caGrid workflow

Globus

Page 25: Services for Science v2 (APAN26)

Composing Services

Globus

Page 26: Services for Science v2 (APAN26)

Demonstration: Composing Services

Taverna + GT4

Taverna team, Wei Tan, Ravi Madduri

Page 27: Services for Science v2 (APAN26)

Publishing Services

Page 28: Services for Science v2 (APAN26)

Publishing Services

Description: syntax, semantics

State: availability, load, …

Policies: who, what, when, …

Hosting: location, scalability, …

Page 29: Services for Science v2 (APAN26)

Authorization: SAML & XACML

VOMS, Shibboleth, LDAP, PERMIS, …

Figure labels: GT4 Client; GT4 Server; PIP (×3); PDP; Attributes; Authorization Decision; SAML; XACML
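
The division of labor in the figure, with policy information points (PIPs) supplying attributes and a policy decision point (PDP) turning them into a decision, can be sketched without any of the wire formats. Nothing below is SAML, XACML, or a GT4 API: the two PIP functions, `collect`, `pdp`, and the dictionary-based policy are hypothetical and exist only to show the flow.

```python
from typing import Callable, Dict, List, Set

Attributes = Dict[str, Set[str]]
PIP = Callable[[str], Attributes]  # policy information point: subject -> attributes


def vo_pip(subject: str) -> Attributes:
    """Hypothetical attribute source for virtual-organization membership."""
    return {"group": {"cabig"}} if subject == "alice" else {}


def campus_pip(subject: str) -> Attributes:
    """Hypothetical attribute source for campus roles."""
    return {"role": {"researcher"}} if subject in {"alice", "bob"} else {}


def collect(pips: List[PIP], subject: str) -> Attributes:
    """Merge the attributes reported by every PIP."""
    merged: Attributes = {}
    for pip in pips:
        for key, values in pip(subject).items():
            merged.setdefault(key, set()).update(values)
    return merged


def pdp(attributes: Attributes, policy: Attributes) -> str:
    """Policy decision point: permit only if every required attribute is held."""
    ok = all(required <= attributes.get(key, set()) for key, required in policy.items())
    return "Permit" if ok else "Deny"


policy = {"group": {"cabig"}, "role": {"researcher"}}
alice = pdp(collect([vo_pip, campus_pip], "alice"), policy)
bob = pdp(collect([vo_pip, campus_pip], "bob"), policy)
```

Because the PDP sees only merged attributes, new attribute sources can be added as further PIPs without changing the decision logic.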

Globus

Page 30: Services for Science v2 (APAN26)

Hosting Services

Page 31: Services for Science v2 (APAN26)

The Two Dimensions of Service-Oriented Science

Decompose across network

Clients integrate dynamically: select & compose services; select “best of breed” providers; publish result as new services

Decouple resource & service providers

Figure labels: Function; Resource; Data Archives; Analysis tools; Discovery tools; Users

Fig: S. G. Djorgovski

Page 32: Services for Science v2 (APAN26)

The geWorkbench/caGrid/TeraGrid Interface

Page 33: Services for Science v2 (APAN26)

Putting It Together for the Example Scenario

Location A: Microarray, Protein, Image data

Location B: Microarray, Protein, Image data

Location C: Microarray, Protein, Image data

Location C: Image Analysis

Location D: Image Analysis

caGrid Service Interfaces

caGrid Environment

Registered Object Definitions

Advertisement

Log on, Grid credentials

Query and Analysis Workflow

Discovery

Microarray & protein databases at other institutions

Page 34: Services for Science v2 (APAN26)

Lessons Learned

A convenient higher-level abstraction

Suitable for a subset of scientific use cases

Infrastructure needs to be sustainable

Integrates well with hospital/cancer center/experimental facility IT infrastructure

Workflows are attractive to users

Scalability and provenance are important

No vendor lock-in (if you are careful)

User experience remains ambiguous

Early adopters are enthusiastic (50+ services)

Cancer centers seek clear ROI

Page 35: Services for Science v2 (APAN26)

Services for Science

A new approach to communicating

A (not-so-new) approach to structuring systems

They’re real

Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc., etc.)

Substantial numbers of services out there

They’re challenging

Sociology: incentives, rewards

Infrastructure: hosting

Provenance: justifying “results”

Scaling: services, requests