grid computing (3) (special topics in computer engineering)

73
1 Grid Computing (3) (Special Topics in Computer Engineering) Veera Muangsin 13 February 2004

Upload: hei

Post on 13-Jan-2016

43 views

Category:

Documents


1 download

DESCRIPTION

Grid Computing (3) (Special Topics in Computer Engineering). Veera Muangsin 13 February 200 4. Outline. High-Performance Computing Grid Computing Grid Applications Grid Architecture Grid Middleware Grid Services. Network. Before the Grid. independent sites - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Grid  Computing (3) (Special Topics in Computer Engineering)

1

Grid Computing (3)(Special Topics in Computer Engineering)

Veera Muangsin

13 February 2004

Page 2: Grid  Computing (3) (Special Topics in Computer Engineering)

2

Outline• High-Performance Computing • Grid Computing • Grid Applications• Grid Architecture

• Grid Middleware

• Grid Services

Page 3: Grid  Computing (3) (Special Topics in Computer Engineering)

3

Before the GridUser

Application

Site A Site B

Network

The User is responsible for resolving the complexities

of the environment

• independent sites

• independent hardware and software

• independent user ids

• security policy requiring local connection to the machine.

Page 4: Grid  Computing (3) (Special Topics in Computer Engineering)

4

First Step to the GridUser

Application

Site A Site B

Network

Centralized Scheduler and file staging

Metacenter • Two or more

resources connected in a controlled user environment

Constraints• common

architecture• single name

space• common

scheduler

A layer of abstraction is added that hides some of the complexities associated with running jobs in a distributed computing environment, however, limitations exist

Page 5: Grid  Computing (3) (Special Topics in Computer Engineering)

5

Grid Middleware

UserApplication

Site A Site B

Network

Infrastructure

Common Middleware

- abstracts independent, hardware, software, user ids, into a service layer with defined APIs

- comprehensive security,

- allows for site autonomy

- provides a common infrastructure based on middleware

The Grid Today

1

Request info from the grid

1

2Get response2

3Make selection and submit job

3

The underlying infrastructure is abstracted into defined APIs thereby simplifying

developer and the user access to resources, however, this layer is not intelligent

Page 6: Grid  Computing (3) (Special Topics in Computer Engineering)

6

The Near Future Grid

Grid Middleware - Infrastructure APIs (service oriented)

UserApplication

Intelligent, Customized Middleware

Site A Site B

Network

Infrastructure

Customizable Grid Services built on defined Infrastructure APIs

• automatic selection of resources

• information products tailored to users

• accountless processing

• flexible interface: web based, command line, APIs

Resources are accessed via various intelligent services that access

infrastructure APIs

The result: The Scientist and Application Developer can focus on

science and not on systems management

Page 7: Grid  Computing (3) (Special Topics in Computer Engineering)

7

Layered Grid Architecture(By Analogy to Internet Architecture)

Application

Fabric“Controlling things locally”: Access to, & control of, resources

Connectivity“Talking to things”: communication (Internet protocols) & security

Resource“Sharing single resources”: negotiating access, controlling use

Collective“Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services

InternetTransport

Application

Link

Inte

rnet P

roto

col

Arch

itectu

re

Page 8: Grid  Computing (3) (Special Topics in Computer Engineering)

Grid Components

GridFabricNetworked Resources across Organisations

Computers Clusters Data Sources Scientific InstrumentsStorage Systems

Local Resource Managers

Operating Systems Queuing Systems TCP/IP & UDP

Libraries & App Kernels …

Distributed Resources Coupling Services

Comm. Sign on & Security Information … QoSProcess Data Access

Development Environments and Tools

Languages Libraries Debuggers … Web toolsResource BrokersMonitoring

Applications and Portals

Prob. Solving Env.Scientific …CollaborationEngineering Web enabled Apps

GridApps.

GridMiddleware

GridTools

Page 9: Grid  Computing (3) (Special Topics in Computer Engineering)

9

ComputeResource

SDK

API

AccessProtocol

CheckpointRepository

SDK

API

C-pointProtocol

Example:High-Throughput Computing System

High Throughput Computing System

Dynamic checkpoint, job management, failover, staging

Brokering, certificate authorities

Access to data, access to computers, access to network performance data

Communication, service discovery (DNS), authentication, authorization, delegation

Storage systems, schedulers

Collective(App)

App

Collective(Generic)

Resource

Connect

Fabric

Page 10: Grid  Computing (3) (Special Topics in Computer Engineering)

10

Example:Data Grid Architecture

Discipline-Specific Data Grid Application

Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …

Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs,

Access to data, access to computers, access to network performance data, …

Communication, service discovery (DNS), authentication, authorization, delegation

Storage systems, clusters, networks, network caches, …

Collective(App)

App

Collective(Generic)

Resource

Connect

Fabric

Page 11: Grid  Computing (3) (Special Topics in Computer Engineering)

11

Globus Toolkit

• Grid computing middleware– Software between the hardware and high-level

services– Basic libraries, services, command-line programs

• Most common middleware used in grids

• Integrated with Web Service

Page 12: Grid  Computing (3) (Special Topics in Computer Engineering)

12

Globus Software Architecture

Grid SSH

Grid FTP

•login•execute commands•copy files

•get and put files•3rd party copy•interactive file management•parallel transfers

Monitoring and Discovery Service

(MDS)

information about resources and services

LDAP

distributed directory service

•single sign on•delegation of credentials•authorization

Grid Security Infrastructure (GSI)

SSL/TLSX.509 Certificates

•authentication•secure communication

credentials for users, services,

hosts

•execute remote applications

•stage executable, stdin, stdout, stderr

LSFPBS

Globus Resource Allocation Manager

(GRAM) fork/exec

job management

systems

Page 13: Grid  Computing (3) (Special Topics in Computer Engineering)

13

Globus server system

PBS

GRAM Server

Grid FTP

Server

Grid SSH

ServerLSF

GRAM Server

Grid FTP

Server

Grid SSH

Server

Globus server system

Globus Deployment Architecture

MDS server system

MDS GRIS

MDS GIIS

MDS GRIS

Globus client

system

Clients are programs and

libraries

GRAM Client

Grid FTP

Client

MDS Client

Grid SSH

Client

User User application/tool

Web portal

Page 14: Grid  Computing (3) (Special Topics in Computer Engineering)

14

Globus Toolkit™• A software toolkit addressing key technical

problems in the development of Grid enabled tools, services, and applications– Offer a modular “bag of technologies”– Enable incremental development of grid-enabled

tools and applications – Implement standard Grid protocols and APIs– Make available under liberal open source license

Page 15: Grid  Computing (3) (Special Topics in Computer Engineering)

15

General Approach• Define Grid protocols & APIs

– Protocol-mediated access to remote resources– Integrate and extend existing standards– “On the Grid” = speak “Intergrid” protocols

• Develop a reference implementation– Open source Globus Toolkit– Client and server SDKs, services, tools, etc.

• Grid-enable wide variety of tools– Globus Toolkit, FTP, SSH, Condor, SRB, MPI, …

Page 16: Grid  Computing (3) (Special Topics in Computer Engineering)

16

Four Key Protocols

• The Globus Toolkit™ centers around four key protocols– Connectivity layer:

• Security: Grid Security Infrastructure (GSI)

– Resource layer:• Resource Management: Grid Resource Allocation

Management (GRAM)• Information Services: Grid Resource Information Protocol

(GRIP)• Data Transfer: Grid File Transfer Protocol (GridFTP)

Page 17: Grid  Computing (3) (Special Topics in Computer Engineering)

17

The Globus Toolkit™:

Security Services

The Globus Project™Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org

Page 18: Grid  Computing (3) (Special Topics in Computer Engineering)

18

Why Grid Security is Hard• Resources are often located in distinct administrative

domains– Each resource has own policies & procedures

• Set of resources used by a single computation may be large, dynamic, and unpredictable– Not just client/server, requires delegation

• It must be broadly available & applicable– Standard, well-tested, well-understood protocols; integrated

with wide variety of tools

Page 19: Grid  Computing (3) (Special Topics in Computer Engineering)

19

Grid Security Infrastructure (GSI)• Extensions to standard protocols & APIs

– Standards: SSL/TLS, X.509 & CA, GSS-API

– Extensions for single sign-on and delegation

• Globus Toolkit reference implementation of GSI– SSLeay/OpenSSL + GSS-API + SSO/delegation

– Tools and services to interface to local security• Simple ACLs; SSLK5/PKINIT for access to K5, AFS; …

– Tools for credential management• Login, logout, etc.• Smartcards• MyProxy: Web portal login and delegation• K5cert: Automatic X.509 certificate creation

Page 20: Grid  Computing (3) (Special Topics in Computer Engineering)

20

Site A(Kerberos)

Site B (Unix)

Site C(Kerberos)

Computer

User

Single sign-on via “grid-id”& generation of proxy cred.

Or: retrieval of proxy cred.from online repository

User ProxyProxy

credential

Computer

Storagesystem

Communication*

GSI-enabledFTP server

AuthorizeMap to local idAccess file

Remote fileaccess request*

GSI-enabledGRAM server

GSI-enabledGRAM server

Remote processcreation requests*

* With mutual authentication

Process

Kerberosticket

Restrictedproxy

Process

Restrictedproxy

Local id Local id

AuthorizeMap to local idCreate processGenerate credentials

Ditto

GSI in Action“Create Processes at A and B that Communicate & Access Files at C”

Page 21: Grid  Computing (3) (Special Topics in Computer Engineering)

21

Review ofPublic Key Cryptography

• Asymmetric keys– A private key is used to encrypt data.

– A public key can decrypt data encrypted with the private key.

• An X.509 certificate includes…– Someone’s subject name (user ID)

– Their public key

– A “signature” from a Certificate Authority (CA) that:• Proves that the certificate came from the CA.

• Vouches for the subject name

• Vouches for the binding of the public key to the subject

Page 22: Grid  Computing (3) (Special Topics in Computer Engineering)

22

Public Key Based Authentication• User sends certificate over the wire.• Other end sends user a challenge string.• User encodes the challenge string with private key

– Possession of private key means you can authenticate as subject in certificate

• Public key is used to decode the challenge.– If you can decode it, you know the subject

• Treat your private key carefully!!– Private key is stored only in well-guarded places, and only in

encrypted form

Page 23: Grid  Computing (3) (Special Topics in Computer Engineering)

23

User Proxies• Minimize exposure of user’s private key

• A temporary, X.509 proxy credential for use by our computations– We call this a user proxy certificate– Allows process to act on behalf of user– User-signed user proxy cert stored in local file– Created via “grid-proxy-init” command

• Proxy’s private key is not encrypted– Rely on file system security, proxy certificate file must be

readable only by the owner

Page 24: Grid  Computing (3) (Special Topics in Computer Engineering)

24

Delegation

• Remote creation of a user proxy

• Results in a new private key and X.509 proxy certificate, signed by the original key

• Allows remote process to act on behalf of the user

• Avoids sending passwords or private keys across the network

Page 25: Grid  Computing (3) (Special Topics in Computer Engineering)

25

GSI Applications• Globus Toolkit™ uses GSI for authentication

• Many Grid tools, directly or indirectly, e.g.– Condor-G, SRB, MPICH-G2, Cactus, GDMP, …

• Commercial and open source tools, e.g.– ssh, ftp, cvs, OpenLDAP, OpenAFS

– SecureCRT (Win32 ssh client)

• And since we use standard X.509 certificates, they can also be used for– Web access, LDAP server access, etc.

Page 26: Grid  Computing (3) (Special Topics in Computer Engineering)

26

The Globus Toolkit™:

Resource Management Services

The Globus Project™Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org

Page 27: Grid  Computing (3) (Special Topics in Computer Engineering)

27

The Challenge• Enabling secure, controlled remote access to

heterogeneous computational resources and management of remote computation– Authentication and authorization

– Resource discovery & characterization

– Reservation and allocation

– Computation monitoring and control

• Addressed by new protocols & services– GRAM protocol as a basic building block

– Resource brokering & co-allocation services

– GSI for security, MDS for discovery

Page 28: Grid  Computing (3) (Special Topics in Computer Engineering)

28

Resource Management• The Grid Resource Allocation Management (GRAM)

protocol and client API allows programs to be started on remote resources, despite local heterogeneity

• Resource Specification Language (RSL) is used to communicate requirements

• A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services– Integrated with Condor, PBS, MPICH-G2, …

Page 29: Grid  Computing (3) (Special Topics in Computer Engineering)

29

GRAM GRAM GRAM

LSF Condor NQE

Application

RSL

Simple ground RSL

Information Service

Localresourcemanagers

RSLspecialization

Broker

Ground RSL

Co-allocator

Queries& Info

Resource Management Architecture

Page 30: Grid  Computing (3) (Special Topics in Computer Engineering)

30

Globus Toolkit Implementation• Gatekeeper

– Single point of entry– Authenticates user, maps to local security environment,

runs service– In essence, a “secure inetd”

• Job manager– A gatekeeper service– Layers on top of local resource management system (e.g.,

PBS, LSF, etc.)– Handles remote interaction with the job

Page 31: Grid  Computing (3) (Special Topics in Computer Engineering)

31

GRAM Components

Grid SecurityInfrastructure

Job Manager

GRAM client API calls to request resource allocation

and process creation.

MDS client API callsto locate resources

Query current statusof resource

Create

RSL Library

Parse

RequestAllocate &

create processes

Process

Process

Process

Monitor &control

Site boundary

Client MDS: Grid Index Info Server

Gatekeeper

MDS: Grid Resource Info Server

Local Resource Manager

MDS client API callsto get resource info

GRAM client API statechange callbacks

Page 32: Grid  Computing (3) (Special Topics in Computer Engineering)

32

Job Submission Interfaces• Globus Toolkit includes several command line

programs for job submission – globus-job-run: Interactive jobs

– globus-job-submit: Batch/offline jobs

– globusrun: Flexible scripting infrastructure

• Others are building better interfaces– General purpose

• Condor-G, PBS, GRD, Hotpage, etc

– Application specific• ECCE’, Cactus, Web portals

Page 33: Grid  Computing (3) (Special Topics in Computer Engineering)

33

The Globus Toolkit™:

Information Services

The Globus Project™Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org

Page 34: Grid  Computing (3) (Special Topics in Computer Engineering)

34

Grid Information Services• System information is critical to operation of the grid

and construction of applications– What resources are available?

• Resource discovery

– What is the “state” of the grid?• Resource selection

– How to optimize resource use• Application configuration and adaptation?

• We need a general information infrastructure to answer these questions

Page 35: Grid  Computing (3) (Special Topics in Computer Engineering)

35

Examples of Useful Information• Characteristics of a compute resource

– IP address, software available, system administrator, networks connected to, OS version, load

• Characteristics of a network– Bandwidth and latency, protocols, logical topology

• Characteristics of the Globus infrastructure– Hosts, resource managers

Page 36: Grid  Computing (3) (Special Topics in Computer Engineering)

36

Grid Information: Facts of Life• Information is always old

– Time of flight, changing system state– Need to provide quality metrics

• Distributed state hard to obtain– Complexity of global snapshot

• Component will fail

• Scalability and overhead

• Many different usage scenarios– Heterogeneous policy, different information organizations,

etc.

Page 37: Grid  Computing (3) (Special Topics in Computer Engineering)

37

Grid Information Service• Provide access to static and dynamic information

regarding system components• A basis for configuration and adaptation in

heterogeneous, dynamic environments• Requirements and characteristics

– Uniform, flexible access to information– Scalable, efficient access to dynamic data– Access to multiple information sources– Decentralized maintenance

Page 38: Grid  Computing (3) (Special Topics in Computer Engineering)

38

The GIS Problem: Many Information Sources, Many Views

?RR

R

RR

?

R

R

RR

R?

R

R

R

RR

?

RR

VO A

VO B

VO C

Page 39: Grid  Computing (3) (Special Topics in Computer Engineering)

39

Information Protocols

• Grid Resource Registration Protocol– Support information/resource discovery– Designed to support machine/network failure

• Grid Resource Inquiry Protocol– Query resource description server for

information– Query aggregate server for information– LDAP V3.0 in Globus 1.1.3

Page 40: Grid  Computing (3) (Special Topics in Computer Engineering)

40

GIS Architecture

A A

Customized Aggregate Directories

R RR R

Standard Resource Description Services

Registration

Protocol

Users

Enquiry

Protocol

Page 41: Grid  Computing (3) (Special Topics in Computer Engineering)

41

Metacomputing Directory Service• Use LDAP as Inquiry • Access information in a distributed directory

– Directory represented by collection of LDAP servers

– Each server optimized for particular function

• Directory can be updated by: – Information providers and tools

– Applications (i.e., users)

– Backend tools which generate info on demand

• Information dynamically available to tools and applications

Page 42: Grid  Computing (3) (Special Topics in Computer Engineering)

42

Two Classes Of MDS Servers• Grid Resource Information Service (GRIS)

– Supplies information about a specific resource

– Configurable to support multiple information providers

– LDAP as inquiry protocol

• Grid Index Information Service (GIIS)– Supplies collection of information which was gathered from

multiple GRIS servers

– Supports efficient queries against information which is spread across multiple GRIS server

– LDAP as inquiry protocol

Page 43: Grid  Computing (3) (Special Topics in Computer Engineering)

43

Grid Resource Information Service• Server which runs on each resource

– Given the resource DNS name, you can find the GRIS server (well known port = 2135)

• Provides resource specific information– Much of this information may be dynamic

• Load, process information, storage information, etc.

• GRIS gathers this information on demand

• “White pages” lookup of resource information– Ex: How much memory does machine have?

• “Yellow pages” lookup of resource options– Ex: Which queues on machine allows large jobs?

Page 44: Grid  Computing (3) (Special Topics in Computer Engineering)

44

Grid Index Information Service• GIIS describes a class of servers

– Gathers information from multiple GRIS servers– Each GIIS is optimized for particular queries

• Ex1: Which Alliance machines are >16 process SGIs?• Ex2: Which Alliance storage servers have >100Mbps bandwidth to host X?

– Akin to web search engines

• Organization GIIS– The Globus Toolkit ships with one GIIS– Caches GRIS info with long update frequency

• Useful for queries across an organization that rely on relatively static information (Ex1 above)

• Can be merged into GRIS

Page 45: Grid  Computing (3) (Special Topics in Computer Engineering)

45

Logical MDS Deployment

ISI

GRISes

GIIS

Grads Gusto

Page 46: Grid  Computing (3) (Special Topics in Computer Engineering)

46

Example: Discovering CPU Load

• Retrieve CPU load fields of compute resources% grid-info-search -L “(objectclass=GlobusComputeResource)” \

dn cpuload1 cpuload5 cpuload15 dn: hn=lemon.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=UScpuload1: 0.48cpuload5: 0.20cpuload15: 0.03 dn: hn=tuva.mcs.anl.gov, ou=MCS, o=Argonne National Laboratory, o=Globus, c=UScpuload1: 3.11cpuload5: 2.64cpuload15: 2.57

Page 47: Grid  Computing (3) (Special Topics in Computer Engineering)

47

The Globus Toolkit™:

Data Management Services

The Globus Project™Argonne National Laboratory

USC Information Sciences Institute

http://www.globus.org

Page 48: Grid  Computing (3) (Special Topics in Computer Engineering)

48

Data Intensive Issues Include …• Harness [potentially large numbers of] data, storage,

network resources located in distinct administrative domains

• Respect local and global policies governing what can be used for what

• Schedule resources efficiently, again subject to local and global constraints

• Achieve high performance, with respect to both speed and reliability

• Catalog software and virtual data

Page 49: Grid  Computing (3) (Special Topics in Computer Engineering)

49

Desired Data Grid Functionality

• High-speed, reliable access to remote data

• Automated discovery of “best” copy of data

• Manage replication to improve performance

• Co-schedule compute, storage, network

• “Transparency” wrt delivered performance

• Enforce access control on data

• Allow representation of “global” resource allocation policies

Page 50: Grid  Computing (3) (Special Topics in Computer Engineering)

50

A Model Architecture for Data Grids

Metadata Catalog

Replica Catalog

Tape Library

Disk Cache

Attribute Specification

Logical Collection and Logical File Name

Disk Array Disk Cache

Application

Replica Selection

Multiple Locations

NWS

SelectedReplica

GridFTP Control ChannelPerformanceInformation &Predictions

Replica Location 1 Replica Location 2 Replica Location 3

MDS

GridFTPDataChannel

Page 51: Grid  Computing (3) (Special Topics in Computer Engineering)

51

Globus Toolkit ComponentsTwo major Data Grid components:

1. Data Transport and Access Common protocol

Secure, efficient, flexible, extensible data movement

Family of tools supporting this protocol

2. Replica Management Architecture Simple scheme for managing:

multiple copies of files collections of files

Page 52: Grid  Computing (3) (Special Topics in Computer Engineering)

52

Access/Transport Protocol Requirements

• Suite of communication libraries and related tools that support– GSI, Kerberos security– Third-party transfers– Parameter set/negotiate– Partial file access– Reliability/restart– Large file support– Data channel reuse

• All based on a standard, widely deployed protocol

– Integrated instrumentation

– Loggin/audit trail

– Parallel transfers

– Striping (cf DPSS)

– Policy-based access control

– Server-side computation

– Proxies (firewall, load bal)

Page 53: Grid  Computing (3) (Special Topics in Computer Engineering)

53

And The Protocol Is … GridFTP• Why FTP?

– Ubiquity enables interoperation with many commodity tools

– Already supports many desired features, easily extended to support others

– Well understood and supported

• We use the term GridFTP to refer to– Transfer protocol which meets requirements

– Family of tools which implement the protocol

• Note GridFTP > FTP• Note that despite name, GridFTP is not restricted to file

transfer!

Page 54: Grid  Computing (3) (Special Topics in Computer Engineering)

54

GridFTP: Basic Approach• FTP protocol is defined by several IETF RFCs

• Start with most commonly used subset– Standard FTP: get/put etc., 3rd-party transfer

• Implement standard but often unused features– GSS binding, extended directory listing, simple restart

• Extend in various ways, while preserving interoperability with existing servers– Striped/parallel data channels, partial file, automatic & manual

TCP buffer setting, progress monitoring, extended restart

Page 55: Grid  Computing (3) (Special Topics in Computer Engineering)

55

Replica Management• Maintain a mapping between logical names for

files and collections and one or more physical locations

• Important for many applications– Example: CERN HLT data

• Multiple petabytes of data per year• Copy of everything at CERN (Tier 0)• Subsets at national centers (Tier 1)• Smaller regional centers (Tier 2)• Individual researchers will have copies

Page 56: Grid  Computing (3) (Special Topics in Computer Engineering)

56

Replica Catalog Structure: A Climate Modeling Example

Logical File Parent

Logical File Jan 1998

Logical CollectionC02 measurements 1998

Replica Catalog

Locationjupiter.isi.edu

Locationsprite.llnl.gov

Logical File Feb 1998

Size: 1468762

Filename: Jan 1998Filename: Feb 1998…

Filename: Mar 1998Filename: Jun 1998Filename: Oct 1998Protocol: gsiftpUrlConstructor: gsiftp://jupiter.isi.edu/ nfs/v6/climate

Filename: Jan 1998…Filename: Dec 1998Protocol: ftpUrlConstructor: ftp://sprite.llnl.gov/ pub/pcmdi

Logical CollectionC02 measurements 1999

Page 57: Grid  Computing (3) (Special Topics in Computer Engineering)

57

Replica Catalog Servicesas Building Blocks: Examples

• Combine with information service to build replica selection services– E.g. “find best replica” using performance info from

NWS and MDS– Use of LDAP as common protocol for info and replica

services makes this easier

• Combine with application managers to build data distribution services– E.g., build new replicas in response to frequent accesses

Page 58: Grid  Computing (3) (Special Topics in Computer Engineering)

58

Replica Catalog Directions• Many data grid applications do not require tight

consistency semantics– At any given time, you may not be able to discover all copies– When a new copy is made, it may not be immediately

recognized as available

• Allows for much more scalable design– Distributed catalogs: local catalogs which maintain their own

LFN -> PFN mapping– Soft-state updates as basis for building various

configurations of global catalogs

Page 59: Grid  Computing (3) (Special Topics in Computer Engineering)

59

Virtual Data in Action

• Data request may Access local data Compute locally Compute remotely Access remote data

• Scheduling subject to local & global policies

• Local autonomy

?

Major Archive Facilities

Network caches & regional centers

Local sites

Page 60: Grid  Computing (3) (Special Topics in Computer Engineering)

60

Evolution of Grid Technologies

• Initial exploration (1996-1999; Globus 1.0)– Extensive appln experiments; core protocols

• Data Grids (1999-??; Globus 2.0+)– Large-scale data management and analysis

• Open Grid Services Architecture (2001-??, Globus 3.0)– Integration w/ Web services, hosting environments, resource

virtualization

– Databases, higher-level services

• Radically scalable systems (2003-??)– Sensors, wireless, ubiquitous computing

Page 61: Grid  Computing (3) (Special Topics in Computer Engineering)

61

Grids and Open Standards

Incr

ease

d fu

nctio

nalit

y,st

anda

rdiz

atio

n

Time

Customsolutions

Open GridServices Arch

GGF: OGSI, …(+ OASIS, W3C)

Multiple implementations,including Globus Toolkit

Web services

Globus Toolkit

Defacto standardsGGF: GridFTP, GSI

X.509,LDAP,FTP, …

App-specificServices

Page 62: Grid  Computing (3) (Special Topics in Computer Engineering)

62

“Web Services”• Increasingly popular standards-based framework for accessing

network applications– W3C standardization; Microsoft, IBM, Sun, others

• WSDL: Web Services Description Language– Interface Definition Language for Web services

• SOAP: Simple Object Access Protocol– XML-based RPC protocol; common WSDL target

• WS-Inspection– Conventions for locating service descriptions

• UDDI: Universal Desc., Discovery, & Integration – Directory for Web services

Page 63: Grid  Computing (3) (Special Topics in Computer Engineering)

63

The Need to SupportTransient Service Instances

• “Web services” address discovery & invocation of persistent services– Interface to persistent state of entire enterprise

• In Grids, must also support transient service instances, created/destroyed dynamically– Interfaces to the states of distributed activities– E.g. workflow, video conf., dist. data analysis

• Significant implications for how services are managed, named, discovered, and used– In fact, much of our work is concerned with the management of

service instances

Page 64: Grid  Computing (3) (Special Topics in Computer Engineering)

64

Open Grid Services Architecture• Service orientation to virtualize resources• From Web services:

– Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency

• Building on Globus Toolkit:– Grid service: semantics for service interactions

– Management of transient instances (& state)

– Factory, Registry, Discovery, other services

– Reliable and secure transport

• Multiple hosting targets: J2EE, .NET, “C”, …

Page 65: Grid  Computing (3) (Special Topics in Computer Engineering)

65

Open Grid Services Architecture

Open Grid Services Infrastructure

OGSA services: registry,authorization, monitoring, data

access, management, etc., etc.

TransportProtocolHosting EnvironmentHosting Environment

Host. Env. & Protocol BindingsO

GS

A sch

emas

More specialized &domain-specific

services

Other

schemas

Web Services

Priorities: Data access and

integration Security SLA negotiation Manageability Monitoring …

Page 66: Grid  Computing (3) (Special Topics in Computer Engineering)

66

OGSA Service Model• System comprises (a typically few) persistent services &

(potentially many) transient services

• All services adhere to specified Grid service interfaces and behaviors– Reliable invocation, lifetime management, discovery,

authorization, notification, upgradeability, concurrency, manageability

• Interfaces for managing Grid service instances– Factory, registry, discovery, lifetime, etc.

=> Reliable, secure mgmt of distributed state

Page 67: Grid  Computing (3) (Special Topics in Computer Engineering)

67

The Grid Service• A (potentially transient) Web service with specified

interfaces & behaviors, including– Creation (Factory)– Global naming (GSH) & references (GSR)– Lifetime management– Registration & Discovery– Authorization– Notification– Concurrency– Manageability

Page 68: Grid  Computing (3) (Special Topics in Computer Engineering)

68

Use of Web Services (1)• A Grid service interface is a WSDL portType

• A Grid service definition is a WSDL extension (serviceType) containing:– A set of one or more portTypes supported by the

service– portType & serviceType compatibility statements, to

support upgradability• For discovery of compatible services when interfaces are

upgraded

– Implementation version information

Page 69: Grid  Computing (3) (Special Topics in Computer Engineering)

69

Use of Web Services (2)• A GSR is a WSDL document with extensions:

– Extension to service element to reference serviceType

– Service element extensions to carry the GSH, and the expiration time of the GSR

• A GSH is an URL, with the following properties:– Globally unique for all time

– http get on GSH + “.wsdl” returns GSR

– Can derive GSH to Mapper from it

• Registry returns WS-Inspection documents

Page 70: Grid  Computing (3) (Special Topics in Computer Engineering)

70

Distributed Resources

Condor poolsof workstations

tertiary storageclusters

national supercomputer facilities

scientific instruments

Internet optical networks space-based networks

Grid Communication Functions

Communications

Ba

sic

Gri

dF

un

cti

on

s

...

Resource Discovery

Scheduling and Access to Computing

Uniform Data Access

Monitoring and Events

security servicestransport services

Op

eration

al Su

pp

ort

Po

rta

ls

Resource Brokering

Fault Management

AccountingData Management:

replication and metadata

Encapsulation as Web Services

Encapsulation for Script Based Services

Encapsulation as Java Based Services

Web Portal Access to Application and Grid Services

Specialized Portal Access (high performance displays, PDAs, etc.)

. . .

Se

rvi c

es

B

uil

din

g

Blo

ck

s

Se

rvic

es

Workflow Management

Applications

Grids: An Emerging, Common Computing and Data Infrastructurefor Science and Engineering

Page 71: Grid  Computing (3) (Special Topics in Computer Engineering)

71

Application Domain Specific Portals

Application Domain Independent Portals

Grid Common Services: Uniform Access, Security, and Management of Compute, Data, and Instrument Resources

MER/CIPSTS/SLI MissionAnalysis

ES Modeling

ISS Training

Aviation Capacity

User Environment

Portals

Collaboration Portals

Domain Specific Web Services –Encapsulated Applications

Domain IndependentGrid Web Services

Vis

ua

liza

tio

n

Da

ta P

roc

es

sin

g &

A

na

lys

is

Da

ta M

an

ag

em

en

t

Co

lla

bo

rati

on

S

erv

ice

s

Mo

nit

ori

ng

Ev

en

ts

Wo

rkfl

ow

Ma

na

ge

me

nt

Pro

gra

mm

ing

S

erv

ice

s

Ex

pe

rim

en

tM

an

ag

em

en

t

Ins

tru

me

nt

&

Se

ns

or

Ga

tew

ay

s

Sy

ste

m M

od

els

Fli

gh

t S

imu

lati

on

Co

mp

uta

tio

na

l S

imu

lati

on

Arc

hiv

e G

ate

wa

ys

Zo

om

ing

Co

up

lin

g

Portals: Services Presented to the Users to Accomplish Tasks

Grid Web Services: Grid Functions and Application Functions Packaged for Building Portals

Multi-Site Compute, Data, and Instrument Resources

Grids: A Common Computing and Data Infrastructure forScience and Engineering

Page 72: Grid  Computing (3) (Special Topics in Computer Engineering)

72

Gri

d P

roto

co

ls a

nd

Gri

d S

ec

uri

ty I

nfr

as

tru

ctu

re

Combining Grid and Web ServicesC

lien

tsApplication

PortalsWeb

ServicesGrid Services:

Collective and Resource AccessResources

Compute(many)

Storage(many)

Communi-cation

Instruments(various)

GRAM

GridFTPData Replica and Metadata Catalog

GridMonitoring

Architecture

GridInformation

Service

We

b B

row

se

r

Grid Web ServiceDescription (WSDL)& Discovery (UDDI)

Grid X.509Certification

Authority

SRB/MetadataCatalogue

Condor-G

CORBA

MPI

Secure, Reliable

Group Comm.

Discipline /Application

SpecificPortals

(e.g. SDSCTeleScience)

ProblemSolving

Environments(AVS, SciRun,

Cactus)

EnvironmentManagement(LaunchPad,

HotPage)

Job Submission /Control

File Transfer

Data Management

CredentialManagement

Monitoring

Events

WorkflowManagement

other services:•visualization•interface builders•collaboration tools•numerical grid generators•etc.

Apache Tomcat&WebSphere&Cold Fusion=JVM + servlet

instantiation + routing

CoG Kits implementing Web Services in

servelets, servers, etc.Python, Java, etc.,

JSPs

compositionframeworks(e.g. XCAT)

XM

L /

SO

AP

ov

er

Gri

d S

ec

uri

ty I

nfr

as

tru

ctu

re

Gri

d P

roto

co

ls a

nd

G

rid

Sec

uri

ty I

nfr

astr

uct

ure

Apache SOAP,.NET, etc.

……

htt

p,

htt

ps

. e

tc.

X W

ind

ow

sP

DA

Grid ssh

Page 73: Grid  Computing (3) (Special Topics in Computer Engineering)

73

For More Information

• Globus Project™– www.globus.org

• Grid Forum– www.gridforum.org

• Book (Morgan Kaufman)– www.mkp.com/grids