Grid Services: Overview & Introduction
Ian Foster
Argonne National Laboratory
University of Chicago
Univa Corporation
OOSTech, Baltimore, October 26, 2005
2
What’s This About “Grid Services”?
• I will describe Web service interfaces that implement useful behaviors
  - Primitives: resources, state, security
  - Services: program execution, data movement, data access, …
• I will also describe open source software that implements those interfaces
  - In particular, Globus Toolkit (GT4)
• This is all standard Web services!
  - “Grid is a use case for Web services, focused on resource management”
3
What Grid is About: Aggregation in Virtual Organizations
• Distributed resources and people
• Linked by networks, crossing admin domains
• Sharing resources, common goals
• Dynamic behaviors
[Figure: resources (R) grouped into two virtual organizations, VO-A and VO-B]
4
What Grid is About: Aggregation in Virtual Organizations
• Distributed resources and people
• Linked by networks, crossing admin domains
• Sharing resources, common goals
• Dynamic behaviors
• Fault tolerant
[Figure: resources (R) regrouped between virtual organizations VO-A and VO-B]
5
Grid Technology: Take Services Seriously
• Model the world as a collection of services
  - Computations, computers, instruments, storage, data, communities, agreements, …
• Focus on what these things have in common
  - E.g., state modeling & lifecycle: negotiation, deployment/creation, modeling, monitoring, management, termination
  - E.g., security: authentication, authorization, audit, …
• Result is Grid infrastructure
  - Using Web services as a platform
6
“Stateless” vs. “Stateful” Services
• Without state, how does the client:
  - Determine what happened (success/failure)?
  - Find out how many files completed?
  - Receive updates when interesting events arise?
  - Terminate a request?
• Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state
[Figure: Client invokes move (A to B) on FileTransferService]
7
FileTransferService (without WSRF)
• Developer reinvents the wheel for each new service
  - Custom management and identification of state: transferID
  - Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
  - Custom lifetime operation (cancel)
[Figure: Client invokes move (A to B) on FileTransferService and receives a transferID; the service holds the state and exposes whatHappen, tellMeWhen, and cancel]
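The hand-rolled approach on this slide can be sketched in a few lines of Python. This is only an illustration, not GT4 code: the class name, the in-memory dictionaries, the state strings, and the `complete` hook are all assumptions made for the example. Only `move`, `transferID`, `whatHappen`, `tellMeWhen`, and `cancel` come from the slide.

```python
import itertools
from typing import Callable, Dict, List

Listener = Callable[[int, str], None]


class AdHocFileTransferService:
    """Toy service in which every piece of state handling is custom-built."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._state: Dict[int, str] = {}
        self._listeners: Dict[int, List[Listener]] = {}

    def move(self, src: str, dst: str) -> int:
        """Start a transfer and return a service-specific transferID."""
        transfer_id = next(self._ids)
        self._state[transfer_id] = "active"
        self._listeners[transfer_id] = []
        return transfer_id

    def what_happen(self, transfer_id: int) -> str:
        """Custom synchronous inspection of state."""
        return self._state[transfer_id]

    def tell_me_when(self, transfer_id: int, callback: Listener) -> None:
        """Custom asynchronous inspection: call back on every state change."""
        self._listeners[transfer_id].append(callback)

    def cancel(self, transfer_id: int) -> None:
        """Custom lifetime operation."""
        self._set_state(transfer_id, "cancelled")

    def complete(self, transfer_id: int) -> None:
        """Hypothetical hook that simulates the transfer finishing."""
        self._set_state(transfer_id, "done")

    def _set_state(self, transfer_id: int, new_state: str) -> None:
        self._state[transfer_id] = new_state
        for callback in self._listeners[transfer_id]:
            callback(transfer_id, new_state)
```

Every method here would have to be redesigned, documented, and learned again for the next service, which is the cost the slide points at.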
8
WSRF in a Nutshell
• Service
• State representation
  - Resource
  - Resource Property
• State identification
  - Endpoint Reference
• State interfaces
  - GetRP, QueryRPs, GetMultipleRPs, SetRP
• Lifetime interfaces
  - SetTerminationTime
  - ImmediateDestruction
• Notification interfaces
  - Subscribe
  - Notify
• ServiceGroups
[Figure: a Service fronting a Resource with its RPs; clients hold EPRs and call GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy]
9
FileTransferService (w/ WSRF)
• Developer specifies a custom method to createResource and leaves the rest to WSRF standards:
  - State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
  - State inspected by standard interfaces (GetRP, QueryRPs)
  - Lifetime management by standard interfaces (Destroy)
[Figure: Client invokes createResource (A to B) on FileTransferService and receives an EPR; the Transfer resource exposes its RPs through getRP, queryRPs, and destroy]
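A rough Python sketch of the same idea follows. It is not the WSRF wire protocol or the GT4 API: `EndpointReference`, `ResourceHome`, the example URL, and the explicit `now` argument are assumptions chosen to keep the example self-contained. The point it tries to show is that only `create_resource` is service-specific, while inspection and lifetime operations are generic and keyed by the EPR.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class EndpointReference:
    """Stand-in for an EPR: service address plus a resource key."""
    service_url: str
    resource_key: str


@dataclass
class Resource:
    properties: Dict[str, Any]
    termination_time: Optional[float] = None


class ResourceHome:
    """Generic state container that any service could reuse."""

    def __init__(self, service_url: str) -> None:
        self.service_url = service_url
        self._resources: Dict[str, Resource] = {}

    def create(self, **properties: Any) -> EndpointReference:
        key = uuid.uuid4().hex
        self._resources[key] = Resource(dict(properties))
        return EndpointReference(self.service_url, key)

    def get_rp(self, epr: EndpointReference, name: str) -> Any:
        return self._resources[epr.resource_key].properties[name]

    def get_multiple_rps(self, epr: EndpointReference, names: List[str]) -> Dict[str, Any]:
        props = self._resources[epr.resource_key].properties
        return {name: props[name] for name in names}

    def set_termination_time(self, epr: EndpointReference, when: float) -> None:
        self._resources[epr.resource_key].termination_time = when

    def destroy(self, epr: EndpointReference) -> None:
        del self._resources[epr.resource_key]

    def sweep(self, now: float) -> None:
        """Drop resources whose termination time has passed."""
        expired = [key for key, res in self._resources.items()
                   if res.termination_time is not None and res.termination_time <= now]
        for key in expired:
            del self._resources[key]


class FileTransferService:
    """Only resource creation is custom; the rest is inherited behavior."""

    def __init__(self) -> None:
        # Hypothetical address, used only to make the EPR look plausible.
        self.home = ResourceHome("https://example.org/FileTransferService")

    def create_resource(self, src: str, dst: str) -> EndpointReference:
        return self.home.create(source=src, destination=dst, status="active")
```

A client would hold on to the returned EPR and pass it to `get_rp`, `set_termination_time`, or `destroy`, without needing any transfer-specific inspection or cancel operation.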
10
Grid Infrastructure: Open Standards
• Web services (WSDL, SOAP, WS-Security, WS-ReliableMessaging, …)
• WS-Resource Framework & WS-Notification* (resource identity, lifetime, inspection, subscription, …)
• WS-Agreement (agreement negotiation)
• WS Distributed Management (lifecycle, monitoring, …)
• Applications of the framework (compute, network, storage provisioning, job reservation & submission, data management, application service QoS, …)
*WS-Transfer, WS-Enumeration, WS-Eventing, WS-Management define similar functions
11
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 (www.globus.org) component diagram]
• Security: Authentication & Authorization, Community Authorization, Delegation, Credential Mgmt
• Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
• Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
• Info Services: Index, Trigger, WebMDS
• Common Runtime: Java Runtime, C Runtime, Python Runtime
Callout on this slide: tools for building WSRF services
12
GT4 WS Core in a Nutshell
[Figure: a Service fronting a Resource with its RPs; clients hold EPRs and call GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy]
• Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
• Operation Providers: pre-built implementations of WSRF operations
• Notification implementation: Topics, TopicSet, embedded Notification Consumer service
• Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)
14
GT4 WS Core in a Nutshell
[Figure: a Service Container hosting three services, each with a ResourceHome, Resources, RPs, and EPRs]
• Service Container: hosts multiple services in one container; one JVM process
• …more details: based on the AXIS service container, processes SOAP messages, ResourceContext extension
15
GT4 WS Core in a Nutshell
[Figure: the Service Container with three services and their ResourceHomes, now fronted by PIP and PDP components]
• Secure communication: Transport, Message, Conversation (Transport demonstrates best performance)
• Configurable security policies: Policy Information Points (PIPs) and Policy Decision Points (PDPs), chained
• Example authorization PDPs: GridMap, SAML implementations, XACML policies
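The chaining of PIPs and PDPs can be illustrated with a small Python sketch. It is not the GT4 authorization framework: the request dictionary, the `GRIDMAP` contents, the `read_only_pdp` policy, and the rule that every PDP in the chain must permit are assumptions made for the example (the slide does not say how decisions are combined).

```python
from typing import Callable, Dict, List

Request = Dict[str, str]
PIP = Callable[[Request], Dict[str, str]]  # contributes attributes
PDP = Callable[[Request], str]             # returns PERMIT or DENY

PERMIT = "permit"
DENY = "deny"

# Hypothetical grid-map: certificate subject -> local account name.
GRIDMAP = {"/O=Grid/CN=Alice": "alice"}


def gridmap_pip(request: Request) -> Dict[str, str]:
    """PIP: add the local account mapped to the caller, if there is one."""
    local_user = GRIDMAP.get(request["subject"])
    return {"local_user": local_user} if local_user else {}


def gridmap_pdp(request: Request) -> str:
    """PDP: permit only callers that the grid-map knows about."""
    return PERMIT if "local_user" in request else DENY


def read_only_pdp(request: Request) -> str:
    """PDP: permit only inspection operations (an invented example policy)."""
    allowed = {"GetResourceProperty", "QueryResourceProperties"}
    return PERMIT if request["operation"] in allowed else DENY


def authorize(request: Request, pips: List[PIP], pdps: List[PDP]) -> bool:
    """Run PIPs to enrich the request, then require every PDP to permit."""
    enriched = dict(request)
    for pip in pips:
        enriched.update(pip(enriched))
    # An empty chain denies, so a misconfiguration does not open the service.
    return bool(pdps) and all(pdp(enriched) == PERMIT for pdp in pdps)
```

For example, `authorize({"subject": "/O=Grid/CN=Alice", "operation": "Destroy"}, [gridmap_pip], [gridmap_pdp, read_only_pdp])` is rejected by the second PDP even though the first one permits.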
16
GT4 WS Core in a Nutshell
[Figure: the Service Container with three services, their ResourceHomes, PIP and PDP, plus WorkManager, DB Conn Pool, and JNDI Directory]
• WorkManager: “thread pool”, site-independent “work” manager
• Apache Database Connection Pool library (JDBC “DataSource” implementation)
• JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, configuration objects, …)
17
GT4 WS Core in a Nutshell
[Figure: the Service Container (services, ResourceHomes, PIP and PDP, WorkManager, DB Conn Pool, JNDI Directory) shown inside Apache Tomcat]
• Deploy the Service Container “standalone” or within Apache Tomcat
18
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component diagram, repeated from slide 11]
19
GT4 Security
• Public-key-based authentication
• Extensible authorization framework based on Web services standards
  - SAML-based authorization callout, as specified in GGF OGSA-Authz WG
  - Integrated policy decision engine: XACML policy language, per-operation policies, pluggable
• Credential management service
  - MyProxy (one-time password support)
• Community Authorization Service
• Standalone delegation service
20
GT4 Use of Security Standards
[Figure: table of security standards used by each GT4 security mechanism, annotated “Supported, but slow”, “Supported, but insecure”, and “Fastest, so default”]
21
GT-XACML Integration
• eXtensible Access Control Markup Language
  - OASIS standard, open source implementations
• XACML: sophisticated policy language
• Globus Toolkit ships with XACML runtime
  - Included in every client and server built on GT
  - Turned on through configuration
• … that can be called transparently from runtime and/or explicitly from application …
• … and we use the XACML “model” for our Authz Processing Framework
22
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component diagram, repeated from slide 11]
I. Foster, Globus Toolkit Version 4: Software for Service-Oriented Systems, LNCS 3779, 2-13, 2005
23
Managing Computers & Computation
• GRAM (Grid Resource Allocation & Management) service
  - Negotiate access
  - Stage code
  - Monitor service
  - Manage service
  - Collect accounting data
• Can negotiate access to clusters, creation of virtual machines, establishment of virtual networks, …
[Figure: Client interacting with GRAM]
24
Dynamic Provisioning of Computational Services
[Figure: Open Science Grid usage (CPUs) over 6 months, with labels CMS DC04 and ATLAS DC2]
25
Dynamic Service Deployment
[Figure: Community A … Community Z deploying services onto shared resources]
• Community scheduling logic
• Data distribution
• Community management
• Science services
• PlanetLab nodes
• ...
Requirements:
• Community control
• Persistence
• Resource guarantees
• Non-interference
26
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component diagram, repeated from slide 11]
27
Managing Storage & Data
• Service interfaces for managing storage & data movement
  - Storage management (SRM, NeST)
  - Data movement (GridFTP, RFT)
  - Replica management (RLS, DRS)
• Service interfaces for accessing data in diverse formats
  - OGSA Data Access & Integration
  - GridFTP data access & movement
28
GridFTP in GT4
• 100% Globus code
  - No licensing issues
  - Stable, extensible
• IPv6 support
• XIO for different transports
• Striping: multi-Gb/sec wide area transport
  - 27 Gbit/s on 30 Gbit/s link
• Pluggable
  - Front-end: e.g., future WS control channel
  - Back-end: e.g., HPSS, cluster file systems
  - Transfer: e.g., UDP, NetBLT transport
[Figure: “Bandwidth vs. Striping”, disk-to-disk on TeraGrid; x-axis: degree of striping (0 to 70); y-axis: bandwidth in Mbps (0 to 20000); one series each for 1, 2, 4, 8, 16, and 32 streams]
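To make the idea of striping concrete, here is a deliberately simplified Python sketch that divides one file into contiguous byte ranges, one per stripe, so that each range could be moved by a different server or stream. The function name and the contiguous layout are assumptions for illustration; the slide does not describe how GridFTP actually assigns blocks to stripes.

```python
from typing import List, Tuple


def partition(total_bytes: int, stripes: int) -> List[Tuple[int, int]]:
    """Split [0, total_bytes) into `stripes` contiguous (offset, length) ranges.

    Lengths differ by at most one byte, so the work is spread evenly.
    """
    if stripes <= 0:
        raise ValueError("stripes must be positive")
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    base, extra = divmod(total_bytes, stripes)
    ranges = []
    offset = 0
    for index in range(stripes):
        # The first `extra` stripes carry one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges
```

For instance, `partition(10, 3)` yields `[(0, 4), (4, 3), (7, 3)]`: three ranges that together cover all ten bytes exactly once.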
29
Reliable File Transfer: Third Party Transfer
• Fire-and-forget transfer
• Web services interface
• Many files & directories
• Integrated failure recovery
• Has transferred 900K files
[Figure: the RFT Client exchanges SOAP messages with the RFT Service and optionally receives notifications; the RFT Service drives two GridFTP servers, each shown with a Protocol Interpreter, Master DSI, Slave DSI, IPC Receiver, IPC Link, and Data Channel]
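The combination of “fire-and-forget” and “integrated failure recovery” can be sketched as a loop that records per-file progress in a store and retries failures. This is an illustration only, not RFT's implementation: the `copy` callable, the dictionary standing in for persistent storage, the state strings, and the retry limit are all assumptions.

```python
from typing import Callable, Dict, List, Tuple


def run_transfers(
    pairs: List[Tuple[str, str]],
    copy: Callable[[str, str], None],
    state: Dict[str, str],
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Transfer each (source, destination) pair, recording progress in `state`.

    `state` stands in for a persistent store: rerunning the function with the
    same store skips files already marked "done", which is what allows a
    request to resume after a crash instead of starting over.
    """
    for src, dst in pairs:
        key = f"{src}->{dst}"
        if state.get(key) == "done":
            continue
        for _ in range(max_attempts):
            try:
                copy(src, dst)
            except IOError:
                state[key] = "retrying"
            else:
                state[key] = "done"
                break
        else:
            # Every attempt raised, so give up on this file only.
            state[key] = "failed"
    return state
```

Because the caller hands over the whole list once and progress lives in the store, the client can disconnect and later inspect `state` (or, in a real service, resource properties or notifications) to see what completed.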
30
Replica Location Service
• Identify location of files via logical-to-physical name map
• Distributed indexing of names, fault-tolerant update protocols
• GT4 version scalable & stable
• Managing ~40 million files across ~10 sites
[Figure: Index nodes]

Local DB   Update send (secs)   Bloom filter (secs)   Bloom filter (bits)
10K        <1                   2                     1 M
1 M        2                    24                    10 M
5 M        7                    175                   50 M
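The table refers to Bloom filters, which let a site summarize the set of logical names it holds in a fixed number of bits (the two larger rows work out to 10 bits per entry). A minimal Python sketch of the data structure follows; the class, the choice of SHA-256, and the default of three hash functions are assumptions for illustration and say nothing about how RLS builds its filters.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Compact set summary: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int = 3) -> None:
        if num_bits <= 0 or num_hashes <= 0:
            raise ValueError("num_bits and num_hashes must be positive")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{name}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, name: str) -> None:
        for pos in self._positions(name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, name: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(name))
```

An index holding one such filter per site can answer “which sites might have this logical name?” without storing every name, at the price of sometimes asking a site that turns out not to have the file.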
31
Reliable Wide Area Data Replication
LIGO Gravitational Wave Observatory
• Replicating >1 Terabyte/day to 8 sites
• >30 million replicas so far
• MTBF = 1 month
[Figure: map of replication sites, including Cardiff, AEI/Golm, and Birmingham]
32
Data Replication Service: An Example of Service Composition
At requesting site, deploy:
• WSRF services
  - Data Replication Service
  - Delegation Service
  - Reliable File Transfer Service
• Pre-WSRF components
  - Replica Location Service (Local Replica Catalog and Replica Location Index)
  - GridFTP Server
[Figure: local site with a Web Service Container hosting the Data Replication Service (Replicator Resource), Reliable File Transfer Service (RFT Resource), and Delegation Service (Delegated Credential), alongside the Local Replica Catalog, Replica Location Index, and GridFTP Server]
33
Data Replication Service: WSDL (PortType)
<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions name="Replication" …>
  …
  <wsdl:portType name="ReplicatorPortType" wsrp:ResourceProperties="ReplicatorResourceProperties">
    <wsdl:operation name="createReplicator"> …
    <wsdl:operation name="start"> …
    <wsdl:operation name="stop"> …
    <wsdl:operation name="suspend"> …
    <wsdl:operation name="resume"> …
    <wsdl:operation name="findItems"> …
    <wsdl:operation name="SetTerminationTime"> …
    <wsdl:operation name="Destroy"> …
    <wsdl:operation name="QueryResourceProperties"> …
    <wsdl:operation name="GetMultipleResourceProperties"> …
    <wsdl:operation name="GetResourceProperty"> …
    <wsdl:operation name="Subscribe"> …
    <wsdl:operation name="GetCurrentMessage"> …
  </wsdl:portType>
</wsdl:definitions>
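One way to read this portType is as a small set of service-specific operations plus a block of operations that come from the WSRF and WS-Notification specifications. The Python sketch below lists the operations of a portType and splits them that way. The sample document is a cut-down stand-in written for this example (empty operation elements, no `wsrp` attribute), and the `STANDARD_OPERATIONS` grouping is an assumption based on the operation names.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Operation names assumed here to be the WSRF / WS-Notification ones.
STANDARD_OPERATIONS = {
    "SetTerminationTime", "Destroy", "QueryResourceProperties",
    "GetMultipleResourceProperties", "GetResourceProperty",
    "Subscribe", "GetCurrentMessage",
}

# Simplified stand-in for the WSDL on the slide; operation bodies are omitted.
SAMPLE_WSDL = f"""
<wsdl:definitions name="Replication" xmlns:wsdl="{WSDL_NS}">
  <wsdl:portType name="ReplicatorPortType">
    <wsdl:operation name="createReplicator"/>
    <wsdl:operation name="start"/>
    <wsdl:operation name="stop"/>
    <wsdl:operation name="Destroy"/>
    <wsdl:operation name="GetResourceProperty"/>
    <wsdl:operation name="Subscribe"/>
  </wsdl:portType>
</wsdl:definitions>
"""


def port_type_operations(wsdl_text: str, port_type: str) -> List[str]:
    """Return the operation names declared on the named portType."""
    root = ET.fromstring(wsdl_text.strip())
    for candidate in root.findall(f"{{{WSDL_NS}}}portType"):
        if candidate.get("name") == port_type:
            return [op.get("name", "")
                    for op in candidate.findall(f"{{{WSDL_NS}}}operation")]
    return []


def split_operations(names: List[str]) -> Tuple[List[str], List[str]]:
    """Separate service-specific operations from the standard ones."""
    custom = [n for n in names if n not in STANDARD_OPERATIONS]
    standard = [n for n in names if n in STANDARD_OPERATIONS]
    return custom, standard
```

Applied to the sample, the custom list contains only `createReplicator`, `start`, and `stop`; everything a client needs for inspection, lifetime, and notification arrives through operations it already knows from other WSRF services.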
34
Data Replication Service: WSDL (Resource Properties)
<?xml version="1.0" encoding="utf-8"?>
<wsdl:definitions name="Replication" …>
  …
  <wsdl:portType name="ReplicatorPortType" wsrp:ResourceProperties="ReplicatorResourceProperties">
    … (operations as on the previous slide)
  </wsdl:portType>
</wsdl:definitions>

<xsd:element name="ReplicatorResourceProperties"> …
  <xsd:element name="status" …/>
  <xsd:element name="stage" …/>
  <xsd:element name="result" …/>
  <xsd:element name="errorMessage" …/>
  <xsd:element name="count" …/>
  <xsd:element name="Topic" …/>
  <xsd:element name="TopicExprDialect" …/>
  <xsd:element name="TerminationTime" …/>
  <xsd:element name="CurrentTime" …/>
  <xsd:element name="FixedTopicSet" …/>
  …
</xsd:element>
35
Globus Toolkit: Open Source Grid Infrastructure
[Figure: Globus Toolkit v4 component diagram, repeated from slide 11]
36
GT4 Monitoring & Discovery
[Figure: GT4 containers hosting GRAM, RFT, and MDS-Index services; an adapter connects GridFTP to an index; a user and clients (e.g., WebMDS) query the indexes]
• Registration & WSRF/WSN access
• Custom protocols for non-WSRF entities
• Clients (e.g., WebMDS)
• Automated registration in container
• WS-ServiceGroup
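The registration idea can be sketched as an index that keeps one entry per registered service and drops entries that are not renewed. This is a toy model under stated assumptions, not the MDS-Index or WS-ServiceGroup interface: the class, the string EPRs, the `lifetime` parameter, and the explicit `now` argument are all invented for the example.

```python
from typing import Dict, Tuple


class IndexService:
    """Toy index: services register with a lifetime and must renew to stay listed."""

    def __init__(self) -> None:
        # Maps an EPR (a plain string here) to (content, expiry time).
        self._entries: Dict[str, Tuple[Dict[str, str], float]] = {}

    def register(self, epr: str, content: Dict[str, str],
                 now: float, lifetime: float) -> None:
        """Add or renew an entry; calling again extends the expiry."""
        self._entries[epr] = (dict(content), now + lifetime)

    def query(self, now: float) -> Dict[str, Dict[str, str]]:
        """Return live entries and forget those whose lifetime has passed."""
        self._entries = {epr: (content, expiry)
                         for epr, (content, expiry) in self._entries.items()
                         if expiry > now}
        return {epr: content for epr, (content, _) in self._entries.items()}
```

With lifetimes on registrations, a service that crashes simply disappears from the index after a while, so clients such as WebMDS do not need a separate deregistration protocol.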
37
Summary
Services are typically stateful, but WS standards did not support stateful entities
WSRF provides standards for management, identification, lifetime, inspection, & manipulation of stateful entities
GT4 WS Core provides a rich environment for developing stateful services
GT4 provides a rich set of services based on WSRF & WS-Notification
38
For More Information
Globus Alliance www.globus.org
Global Grid Forum www.ggf.org
Background information www.mcs.anl.gov/~foster
2nd Edition www.mkp.com/grid2