Services for Science v2 (APAN26)

Post on 10-May-2015

DESCRIPTION

Talk given at APAN26 in beautiful Queenstown, New Zealand. Updates my INGRID talk with new material on caBIG--including some nice slides provided by Ravi Madduri.

TRANSCRIPT

Ian Foster

Computation Institute

Argonne National Lab & University of Chicago

Services for Science

2

Thanks!

DOE Office of Science

NSF Office of Cyberinfrastructure

National Institutes of Health

Colleagues at Argonne, U.Chicago, USC/ISI, OSU, Manchester, and elsewhere

3

Scientific Communication, ~1600

Brahe Kepler

4

1980

5

Scientific Communication, ~2000

Data Archives

User

Analysis tools

Gateway

Figure: S. G. Djorgovski

Discovery tools

Service-Oriented Science

6

Application Scenario

Location A: Microarray, Protein, Image data

Location B: Microarray, Protein, Image data

Location C: Microarray, Protein, Image data

Location C: Image Analysis

Location D: Image Analysis

Microarray and protein databases at other institutions

Different database systems, data representations, security

Different program invocation, remote access, data transfer

7

caBIG: sharing of infrastructure, applications, and data.

Data Integration!

Services & Cancer Biology

Globus

8

Service-Oriented Science

People create services (data, code, instruments) …

which I discover (& decide whether to use) …

& compose to create a new function ...

& then publish as a new service.

I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!

I hope that this “someone else” can manage security, reliability, scalability, …

“Service-Oriented Science”, Science, 2005
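
The create, discover, compose, publish loop on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the in-memory `registry` dictionary, the helper names `publish`, `discover`, and `compose`, and the two toy services are all hypothetical and are not part of Globus, caGrid, or any tool named in the talk.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical in-memory registry: service name -> (keywords, callable).
Registry = Dict[str, Tuple[Set[str], Callable]]


def publish(registry: Registry, name: str, keywords: Iterable[str], fn: Callable) -> None:
    """Make a service visible to others under a name and some keywords."""
    registry[name] = (set(keywords), fn)


def discover(registry: Registry, keyword: str) -> List[str]:
    """Return the names of all services advertised with the given keyword."""
    return sorted(name for name, (kw, _) in registry.items() if keyword in kw)


def compose(registry: Registry, first: str, second: str) -> Callable:
    """Build a new function that feeds the output of `first` into `second`."""
    f = registry[first][1]
    g = registry[second][1]
    return lambda data: g(f(data))


registry: Registry = {}

# "People create services ..." (two toy services, assumed for illustration)
publish(registry, "normalize", ["microarray", "preprocess"],
        lambda xs: [x / max(xs) for x in xs])
publish(registry, "threshold", ["microarray", "analysis"],
        lambda xs: [x for x in xs if x >= 0.5])

# "... which I discover & compose to create a new function ..."
found = discover(registry, "microarray")
pipeline = compose(registry, "normalize", "threshold")

# "... & then publish as a new service."
publish(registry, "normalize_then_threshold", ["microarray", "workflow"], pipeline)
result = registry["normalize_then_threshold"][1]([2.0, 4.0, 8.0])
```

The point of the sketch is that the composite is published through the same interface as its parts, so other users can discover and reuse it in turn.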

9

Creating Services

10

Anatomy of a Service

Figure labels: op1, opN, (meta)data, Implementation(s), Clients, Registry, Management, Service, Attribute Authority, Persistence
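
One way to read the anatomy figure is as a small data structure: named operations, descriptive metadata, and state that persists between calls. The sketch below assumes exactly that reading. The `Service` class, its `operation` decorator, and the `counter` example are hypothetical names chosen for illustration, and a plain dictionary stands in for real persistence.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Service:
    """Toy service: operations op1..opN, metadata, and persistent state."""
    name: str
    metadata: Dict[str, str]
    operations: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    state: Dict[str, Any] = field(default_factory=dict)  # stands in for persistence

    def operation(self, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Decorator that registers a function as a named operation."""
        self.operations[fn.__name__] = fn
        return fn

    def invoke(self, op: str, *args: Any) -> Any:
        """Client entry point: look up the operation and run it on the state."""
        if op not in self.operations:
            raise KeyError(f"{self.name} has no operation {op!r}")
        return self.operations[op](self.state, *args)


counter = Service("counter", {"description": "keeps a running total"})


@counter.operation
def add(state: Dict[str, Any], amount: int) -> int:
    # State survives between invocations, as the Persistence box suggests.
    state["total"] = state.get("total", 0) + amount
    return state["total"]


first = counter.invoke("add", 2)
second = counter.invoke("add", 3)
```

Registry, management, and attribute-authority interactions are left out here; they would sit around `invoke` rather than inside it.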

11

Creating Services (~2005)

“This full-day tutorial provides an introduction to programming Java services with the latest version of the Globus Toolkit version 4 (GT4). The tutorial teaches how to build a Java Service that makes use of GT4 mechanisms for state management, security, registry and related topics.”

12

Creating Services in 2008: Introduce and gRAVI

Introduce: define service, create skeleton, discover types, add operations, configure security

gRAVI (Grid Remote Application Virtualization Infrastructure): wrap executables

Figure labels: Appln Service, Create, Index service, Store, Repository Service, Advertise, Discover, Invoke; get results, Introduce, Container, Transfer GAR, Deploy

Ohio State University and Argonne/U.Chicago

Demonstration: Creating Services

Introduce + gRAVI

Shannon Hastings, Scott Oster, David Ervin, Stephen Langella, Kyle Chard, Ravi Madduri

14

Center for Enabling Distributed Petascale Science

Workflow Automation at DOE Facilities

Automation, Reproducibility, Security, Reusability, Storage, Metadata, Analysis, Visualization

Advanced Photon Source

15

Discovering Services

People create services (data or functions) …

which I discover (& decide whether to use) …

& compose to create a new function ...

& then publish as a new service.

I find “someone else” to host services, so I don’t have to become an expert in operating services & computers!

I hope that this “someone else” can manage security, reliability, scalability, …

“Service-Oriented Science”, Science, 2005

16

Discovering Services

Assume success: billions of services

Syntax, semantics: types, ontologies

Permissions: can I use it?

Reputation: the ultimate arbiter?

17

Discovery (1): Registries

18

Discovery (2): Standardized Vocabularies

Figure labels: Core Services, Grid Service, Cancer Data Standards Repository, Enterprise Vocabulary Services, Service Metadata, Index Service, Discovery Client API

Relationships: Uses Terminology Described In; References Objects Defined In; Publishes; Subscribes to and Aggregates; Queries Service Metadata Aggregated In; Registers To
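
The registry pattern in this figure (services register metadata, an index aggregates it, and a client queries by shared vocabulary terms) can be sketched as follows. This is a toy model, not the Globus Index Service or the caGrid discovery API: the `IndexService` class, the example URLs, and the vocabulary terms are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


class IndexService:
    """Toy index: aggregates service metadata and answers term queries."""

    def __init__(self) -> None:
        self._metadata: Dict[str, dict] = {}
        self._by_term: Dict[str, Set[str]] = defaultdict(set)

    def register(self, url: str, metadata: dict) -> None:
        """Record a service's metadata and index it by vocabulary term.

        Re-registration with changed terms is not handled in this sketch.
        """
        self._metadata[url] = metadata
        for term in metadata.get("terms", []):
            self._by_term[term].add(url)

    def query(self, *terms: str) -> List[str]:
        """Return the URLs of services whose metadata carries every term."""
        if not terms:
            return sorted(self._metadata)
        matches = [self._by_term.get(term, set()) for term in terms]
        return sorted(set.intersection(*matches))


index = IndexService()
# Hypothetical services described with terms from a shared vocabulary.
index.register("https://a.example.org/microarray", {"terms": ["Microarray", "Gene"]})
index.register("https://b.example.org/images", {"terms": ["Image"]})
index.register("https://c.example.org/analysis", {"terms": ["Image", "Microarray"]})

microarray_services = index.query("Microarray")
image_and_microarray = index.query("Microarray", "Image")
```

Queries only work across institutions if everyone annotates with the same terms, which is the role the standardized vocabularies play on this slide.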

19

20

Discovery (3): Tagging & Social Networking

GLOSS: Generalized Labels Over Scientific data Sources (Foster, Nestorov)

21

Discovery (3): Tagging & Social Networking

David de Roure, Carole Goble, et al.

22

Composing Services

23

Composing Services: E.g., BPEL Workflow System

Figure labels: Researcher or Client App; <BPEL Workflow Doc>; <Workflow Inputs>; <Workflow Results>; BPEL Engine; Data Service @ uchicago.edu; Analytic service @ osu.edu; Analytic service @ duke.edu

caBIG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.

See also Kepler & Taverna
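
The workflow idea in this figure (a document names the steps and the links between them, and an engine runs each step once its inputs are available) can be sketched without BPEL itself. The example below is not BPEL and does not use any BPEL engine API. It assumes a hypothetical `run_workflow` helper, local functions standing in for the remote data and analytic services, and Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set

Step = Callable[[Dict[str, Any]], Any]


def run_workflow(steps: Dict[str, Step],
                 links: Dict[str, Set[str]],
                 inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Run each step after the steps (or workflow inputs) it is linked to."""
    results: Dict[str, Any] = dict(inputs)
    graph = {name: set(links.get(name, ())) for name in steps}
    for name in TopologicalSorter(graph).static_order():
        if name in steps:  # workflow inputs are nodes too, but not steps
            upstream = {dep: results[dep] for dep in graph[name]}
            results[name] = steps[name](upstream)
    return results


SAMPLE = [1, 2, 3, 4]  # made-up data held by the "data service"

steps: Dict[str, Step] = {
    # Stand-in for a data service: filter the sample by a query parameter.
    "data": lambda up: [v for v in SAMPLE if v >= up["min_value"]],
    # Stand-ins for two analytic services at different sites.
    "mean": lambda up: sum(up["data"]) / len(up["data"]),
    "maximum": lambda up: max(up["data"]),
    # Final step that gathers the workflow results.
    "report": lambda up: {"mean": up["mean"], "max": up["maximum"]},
}
links = {
    "data": {"min_value"},
    "mean": {"data"},
    "maximum": {"data"},
    "report": {"mean", "maximum"},
}

results = run_workflow(steps, links, {"min_value": 2})
```

A real engine would invoke remote services and handle failures; the sketch only shows how the links determine execution order.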

24

Composing Services: Taverna

caGrid Scavenger with semantic/metadata-based caGrid service query

A sample caGrid workflow

25

Composing Services

Demonstration: Composing Services

Taverna + GT4

Taverna team, Wei Tan, Ravi Madduri

27

Publishing Services

28

Publishing Services

Description: syntax, semantics

State: availability, load, …

Policies: who, what, when, …

Hosting: location, scalability, …

29

Authorization: SAML & XACML

Figure labels: VOMS, Shibboleth, LDAP, PERMIS…; GT4 Client; GT4 Server; PIP, PIP, PIP; PDP; Attributes; Authorization Decision; SAML; XACML
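
The division of labor in this figure can be illustrated with a small sketch: policy information points (PIPs) supply attributes about a subject, and a policy decision point (PDP) turns those attributes into a permit or deny decision. The code below does not implement SAML or XACML and uses no GT4 API. The function names, the policy layout, and the subjects and attributes are all hypothetical.

```python
from typing import Callable, Dict, List, Set

Attributes = Dict[str, Set[str]]
PIP = Callable[[str], Attributes]


def make_pip(table: Dict[str, Attributes]) -> PIP:
    """Build a PIP backed by a lookup table (stand-in for an attribute source)."""
    def pip(subject: str) -> Attributes:
        return table.get(subject, {})
    return pip


def collect_attributes(subject: str, pips: List[PIP]) -> Attributes:
    """Merge the attributes every PIP reports for the subject."""
    merged: Attributes = {}
    for pip in pips:
        for key, values in pip(subject).items():
            merged.setdefault(key, set()).update(values)
    return merged


def decide(policy: Dict[str, Attributes], attributes: Attributes, action: str) -> str:
    """PDP: permit only if every attribute the action requires is satisfied."""
    required = policy.get(action)
    if required is None:
        return "Deny"  # unknown actions are denied by default
    for key, allowed in required.items():
        if not attributes.get(key, set()) & allowed:
            return "Deny"
    return "Permit"


# Hypothetical attribute sources and policy.
vo_pip = make_pip({"alice": {"group": {"cancer-research"}}})
campus_pip = make_pip({"alice": {"role": {"faculty"}}, "bob": {"role": {"student"}}})
policy = {"query": {"group": {"cancer-research"}, "role": {"faculty", "staff"}}}

alice = decide(policy, collect_attributes("alice", [vo_pip, campus_pip]), "query")
bob = decide(policy, collect_attributes("bob", [vo_pip, campus_pip]), "query")
```

Keeping attribute collection separate from the decision means new attribute sources can be added without touching the policy logic.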

30

Hosting Services

31

The Two Dimensions of Service-Oriented Science

Decompose across network

Clients integrate dynamically: select & compose services; select “best of breed” providers; publish result as new services

Decouple resource & service providers

Figure labels: Function, Resource, Data Archives, Analysis tools, Discovery tools, Users

Fig: S. G. Djorgovski

32

The geWorkbench/caGrid/TeraGrid Interface

33

Putting It Together for the Example Scenario

Location A: Microarray, Protein, Image data

Location B: Microarray, Protein, Image data

Location C: Microarray, Protein, Image data

Location C: Image Analysis

Location D: Image Analysis

caGrid Service Interfaces

caGrid Environment

Registered Object Definitions

Advertisement

Log on, Grid credentials

Query and Analysis Workflow

Discovery

Microarray & protein databases at other institutions

34

Lessons Learned

A convenient higher-level abstraction: suitable for a subset of scientific use cases

Infrastructure needs to be sustainable

Integrates well with hospital/cancer center/experimental facility IT infrastructure

Workflows are attractive to users

Scalability and provenance are important

No vendor lock-in (if you are careful)

User experience remains ambiguous: early adopters are enthusiastic (50+ services); cancer centers seek clear ROI

35

Services for Science

A new approach to communicating

A (not-so-new) approach to structuring systems

They’re real:

Excellent infrastructure and tools (Globus, Introduce, gRAVI, Taverna, Swift, etc., etc.)

Substantial numbers of services out there

They’re challenging:

Sociology: incentives, rewards

Infrastructure: hosting

Provenance: justifying “results”

Scaling: services, requests
