servillabeam20041214gridintro
TRANSCRIPT
-
8/7/2019 ServillaBEAM20041214GridIntro
1/41
An Introduction to Grid Computing
BEAM Workshop
December 2004
Mark Servilla
LTER Network Office
-
SEEK-BEAM Workshop Dec 2004 2
Presentation Agenda
Definitions
Evolution of the Grid
Characteristics
Computing Model
Protocols
Examples
References
-
Definitions of a Grid
"a network of conductors for distribution of electric power; also: a network of radio or television stations" (Merriam-Webster)
"the illusion of a simple yet large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources" (IBM Redbooks)
"Grid Computing enables virtual organizations to share geographically distributed resources as they pursue common goals, assuming the absence of central location, central control, omniscience, and an existing trust relationship." (Globus Alliance)
"The Web provides us information; the grid allows us to process it." (Ahmar Abbas)
-
The Evolution of Grid Technology
High-Performance Computing
Cluster Computing
Peer-to-Peer Computing
Internet Computing
-
High-Performance Computing
Traditionally known as super-computing
Specialized for parallel processing algorithms
Shared equally among academia, research, and commercial sectors
-
Cluster Computing
Originated in 1994 with the Beowulf cluster at NASA
High-performance
Massively-parallel (2 to 1000 nodes)
Commodity hardware (Intel, AMD)
Low-cost software (Linux, FreeBSD)
Interconnected via high-speed private networks
Shared storage SAN/NAS
AMD Athlon cluster at the University of Heidelberg, Germany: 825 Gflops, 35th fastest high-performance computer in the world
-
Cluster Computing
-
Peer-to-Peer Computing
Primarily used for distributed storage and file-sharing
Early models (rcp, scp, ftp)
  Restricted to LANs, or limited to known peers
Internet-based models
  Centralized (Napster, Kazaa*)
  Decentralized (Gnutella)
*100,000,000 downloads by 2004; 2 million new downloads a week
-
Centralized Peer-to-Peer
[Diagram: peers issuing queries (?) and the .mp3 files they share]
-
Decentralized Peer-to-Peer
[Diagram: peers passing queries (?) among themselves and the .mp3 files they share]
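The difference between the two peer-to-peer models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the peer names, file names, neighbour links, and the `ttl` hop limit are all hypothetical, and neither function reproduces the actual Napster or Gnutella protocols.

```python
from collections import deque

# Hypothetical peers and the files each one shares.
SHARED = {
    "peer_a": {"song1.mp3"},
    "peer_b": {"song2.mp3"},
    "peer_c": {"song3.mp3"},
    "peer_d": {"song2.mp3", "song4.mp3"},
}

# Hypothetical overlay: each peer only knows its direct neighbours.
NEIGHBOURS = {
    "peer_a": ["peer_b"],
    "peer_b": ["peer_a", "peer_c"],
    "peer_c": ["peer_b", "peer_d"],
    "peer_d": ["peer_c"],
}


def build_central_index(shared):
    """Centralized model: one server maps every file name to its holders."""
    index = {}
    for peer, files in shared.items():
        for name in files:
            index.setdefault(name, set()).add(peer)
    return index


def central_lookup(index, filename):
    """A single query to the central index answers the search."""
    return sorted(index.get(filename, set()))


def flood_lookup(start, filename, shared, neighbours, ttl=3):
    """Decentralized model: forward the query hop by hop until ttl runs out."""
    found, seen = set(), {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if filename in shared[peer]:
            found.add(peer)
        if hops_left == 0:
            continue  # the query is not forwarded any further
        for nxt in neighbours[peer]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return sorted(found)
```

In the centralized sketch the index sees every file, so one lookup finds all holders. In the decentralized sketch the answer depends on how far the query is allowed to travel, which is the price paid for having no central server.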
-
Internet Computing
Volunteer or philanthropic computing; utilizes personal desktop computers connected to the Internet
Desktop computers are idle for approximately 95% of their lifespan
Divide-and-conquer approach
  Tasks broken into smaller subtasks
  Desktop executes subtasks during idle time
  Desktop sends data back to central server, which aggregates results
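The divide-and-conquer steps above can be sketched as follows. This is a minimal single-process illustration, assuming a hypothetical task (counting primes) and hypothetical function names; in a real volunteer-computing project each subtask would be shipped to a different idle desktop over the Internet.

```python
def split_task(numbers, chunk_size):
    """Central server: break the task into smaller subtasks."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def is_prime(n):
    """Trial division; adequate for a small illustration."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def desktop_worker(subtask):
    """Desktop: execute one subtask during idle time and return the result."""
    return sum(1 for n in subtask if is_prime(n))


def central_server(numbers, chunk_size=10):
    """Central server: hand out subtasks, then aggregate the partial results."""
    subtasks = split_task(numbers, chunk_size)
    # Here the "desktops" run one after another in the same process.
    partial_results = [desktop_worker(subtask) for subtask in subtasks]
    return sum(partial_results)
```

Because no subtask depends on another, the order in which desktops return their results does not matter; the server only needs all partial results before aggregating.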
-
Synthesis: Enter the Grid
High-performance computing
  pioneered the use of parallel algorithms
Cluster computing
  demonstrated the nature of shared computing and storage
  load balancing protocols
Peer-to-peer computing
  distributed storage resource with no central authority
Internet computing
  geographically distributed virtual organization
  fabric of the project vanishes with completion of the task
-
Grid Characteristics
Resources that
  are connected via a network
  are geographically distributed
  may consist of heterogeneous hardware and/or software
  are managed transparently for performance and fault tolerance
Creates the illusion of virtual organizations and projects
  without the presence of a central authority or a central control
Explicit trust relationships between users and resources
A system that scales in space and time
-
Types of Resources
Computation
  utilization of computing cycles found on processors of the machines on the grid
Storage
  to increase capacity, performance, sharing, and reliability of data
Communication
  to increase capacity, performance, and reliability of data communication
Collaboration tools
  to facilitate collaboration through conferencing, visualization, and data sharing
Software and Licenses
  to share site-specific software and/or licenses
Special equipment, capacities, architectures, and policies
  printers, imaging, sensors, or other local specialty resources
-
Grid Ingredients
-
Grid Topologies
Departmental Grids
  localized to a specific group of people
  generally, same hardware and software
  designed for high throughput and high performance over a dedicated network
Enterprise Grids
  service to numerous groups within a single company or campus
  resource heterogeneity increases
  company-wide local area network
Extraprise Grids
  service to multiple companies, partners, and customers within a particular domain
  domain-based private network
Global Grids
  established over the public Internet
-
Resource-based Grids
Compute Grids
  desktop nodes
  server nodes
  high-performance computing clusters
Data Grids
  performance-based distributed storage
  replication for fault-tolerance
Collaboration Grids
  support for video-conferencing, visualization, and data sharing
Utility Grids
  maintained and managed by a commercial service provider
  compute resources acquired on a per-need basis
  application resources that are purchased on a per-use or per-minute basis
-
Application Characteristics
Optimized for parallel execution
  Perfect Parallelism: computations run autonomously (Monte Carlo simulations)
  Data Parallelism: operations performed on data simultaneously (db searches)
  Functional Parallelism: multiple operations are performed simultaneously
Not capable of parallel computation
  Fibonacci Series (1, 1, 2, 3, 5, 8, 13, 21, ...)
  F(k+2) = F(k+1) + F(k)
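The contrast between a perfectly parallel workload and an inherently sequential one can be sketched as follows. The Monte Carlo estimate of pi is a hypothetical example chosen for illustration; the function names, seeds, and sample counts are assumptions. Threads stand in for grid nodes here and are not meant to show a speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, samples):
    """One autonomous chunk: count random points inside the unit quarter-circle."""
    rng = random.Random(seed)  # each chunk has its own generator, no shared state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(chunks=4, samples_per_chunk=50_000):
    """Perfect parallelism: chunks run independently and are summed at the end."""
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        hits = list(pool.map(count_hits, range(chunks), [samples_per_chunk] * chunks))
    return 4.0 * sum(hits) / (chunks * samples_per_chunk)


def fibonacci(n):
    """Sequential dependency: F(k+2) cannot start before F(k+1) and F(k) exist."""
    series = [1, 1]
    while len(series) < n:
        series.append(series[-1] + series[-2])
    return series[:n]
```

Each call to `count_hits` needs nothing from the others, so the chunks could run on any number of machines. In `fibonacci`, every new term consumes the two terms before it, so this recurrence-based loop offers no independent pieces to distribute.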
-
Questions to Ask When Thinking Grid
Identity and Authentication: Is this user who he says he is? Is this program the right program?
Authorization and Policy: What can the user do on the grid? What can the application do on the grid? What resources are the user and/or application allowed to access?
Resource Discovery: Where are the resources?
Resource Characterization: What types of resources are available?
Resource Allocation: What policy is applied when assigning the resources? What is the actual process of assigning the resources? Who gets how much?
Resource Management: Which resource can be used at what time and for what purpose?
Accounting/Billing/Service Level Agreement (SLA): How much of the resources is being used? What is the rating schedule? What is the SLA?
Security: How do I make sure that this is done securely? How do we know if we have been compromised? What steps are taken once a security breach is detected?
-
A Grid Computing Model
(the Globus view)
Software stack consisting of
  Standards
  Protocols
  APIs and SDKs
Loosely based on the Internet model
-
Grid Protocols
Grid Security Infrastructure (GSI)
Grid Resource Allocation and Management (GRAM)
Grid File Transfer Protocol (GridFTP)
Grid Information Services (GIS)
-
Grid Security Infrastructure
Extended from SSL/TLS and X.509 protocols
Utilizes PKI for Certificate Authority
Primary objective is Authorization
  Generates primary credential
  Generates temporary proxy credential
Certificate Authority
  Positively identify entities requesting certificates
  Issuing, removing, and archiving certificates
  Protecting the Certificate Authority server
  Maintaining a namespace of unique names for certificate owners
  Serve signed certificates to those needing to authenticate entities
  Logging activity
-
Public Key Infrastructure
1. User A encrypts message with his private key
2. Obtains User B's public key from CA
3. Encrypts message with B's public key
4. Sends message
1. User B decrypts message with his private key
2. Obtains User A's public key from CA
3. Decrypts A's message with public key
4. B knows message is from A
[Diagram labels: A and B, each with public and private keys; Certificate Authority; A's public key; B's public key; Authentication Credential]
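The eight steps on this slide can be traced with a toy example. This is a sketch only: it uses textbook RSA with tiny primes and no padding, which is insecure and is not how GSI or any X.509 library operates. The primes, the exponents, the dictionary standing in for the Certificate Authority, and all function names are assumptions made for illustration. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (insecure, illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (e, n), (d, n)  # (public key, private key)


# A's modulus is deliberately smaller than B's so A's output fits under B's key.
a_public, a_private = make_keypair(61, 53, 17)
b_public, b_private = make_keypair(89, 97, 5)

# Stand-in for the Certificate Authority: a lookup of published public keys.
certificate_authority = {"A": a_public, "B": b_public}


def apply_key(value, key):
    """Textbook RSA: raise the value to the key's exponent modulo its modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


def send(message, sender_private, recipient_name):
    signed = apply_key(message, sender_private)               # A, step 1
    recipient_public = certificate_authority[recipient_name]  # A, step 2
    return apply_key(signed, recipient_public)                # A, step 3


def receive(ciphertext, recipient_private, sender_name):
    signed = apply_key(ciphertext, recipient_private)         # B, step 1
    sender_public = certificate_authority[sender_name]        # B, step 2
    return apply_key(signed, sender_public)                   # B, step 3
```

If `receive` recovers the original message, B has evidence for step 4: only the holder of A's private key could have produced a value that A's public key turns back into that message.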
-
Grid Security Infrastructure
-
Grid Resource Allocation and Management
Allows programs to be started on remote resources
Resource Specification Language (RSL)
  Resource requirements: machine type, number of nodes, memory, etc.
  Job configuration: directory, executable, arguments, environment
Communication protocols
  HTTP-based RPC (early protocol)
  Web services (WSDL, SOAP)
"create 5-10 instances of myprog, each on a machine with at least 64 MB memory that is available to me for 4 hours, or 10 instances, on a machine with at least 32 MB of memory"
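The sample request above is a choice between two alternatives, each with a count and a memory requirement. A minimal sketch of how such a request might be evaluated against a pool of machines follows. It does not use RSL syntax or any Globus API; the `Machine` and `Alternative` classes, their fields, and the `satisfy` function are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    memory_mb: int
    hours_available: float


@dataclass
class Alternative:
    min_count: int        # fewest instances that are acceptable
    max_count: int        # most instances that are wanted
    min_memory_mb: int    # memory needed on each machine
    min_hours: float = 0.0  # how long each machine must be available


def satisfy(alternatives, machines):
    """Return the first alternative the pool can meet, with the machines chosen."""
    for alt in alternatives:
        eligible = [
            m for m in machines
            if m.memory_mb >= alt.min_memory_mb and m.hours_available >= alt.min_hours
        ]
        if len(eligible) >= alt.min_count:
            return alt, eligible[:alt.max_count]
    return None, []


# The request from the slide: 5-10 instances with 64 MB for 4 hours,
# or 10 instances with 32 MB.
request = [
    Alternative(min_count=5, max_count=10, min_memory_mb=64, min_hours=4.0),
    Alternative(min_count=10, max_count=10, min_memory_mb=32),
]
```

With a pool of three large machines and nine small ones, the first alternative fails (only three machines have 64 MB) and the second is chosen, which mirrors the "or" in the request.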
-
Grid File Transfer Protocol
Providing high-speed and reliable transfer of large volume data (petabytes)
Extension of standard FTP to include
striped/parallel data channels
partial files
automatic and manual TCP buffer size settings
progress monitoring
extended restart functionality
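The idea behind parallel data channels and partial files can be sketched without any network code. In the illustration below an in-memory byte string stands in for the remote file, and each worker thread stands in for one data channel that fetches a byte range. The function names are hypothetical and nothing here implements the GridFTP protocol itself.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total_size, channels):
    """Split [0, total_size) into one contiguous byte range per channel."""
    base, extra = divmod(total_size, channels)
    ranges, start = [], 0
    for i in range(channels):
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def fetch_range(source, start, end):
    """Stand-in for one data channel performing a partial-file transfer."""
    return start, source[start:end]


def parallel_transfer(source, channels=4):
    """Fetch all ranges concurrently and reassemble them by offset."""
    ranges = byte_ranges(len(source), channels)
    with ThreadPoolExecutor(max_workers=channels) as pool:
        parts = list(pool.map(lambda r: fetch_range(source, *r), ranges))
    buffer = bytearray(len(source))
    for start, chunk in parts:
        buffer[start:start + len(chunk)] = chunk
    return bytes(buffer)
```

Because every chunk carries its own offset, chunks can arrive in any order, and a failed range could be fetched again on its own, which is the intuition behind restartable transfers.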
-
Grid Information Services
Grid Resource Information Service (GRIS)
  provides resource-specific information
Grid Resource Registration (GRR)
  updates GRIS about resource status
Grid Index Information Service (GIIS)
  an aggregate directory service
  provides a collection of information that has been gathered from multiple GRIS servers
Grid Resource Inquiry (GRI)
  queries GRIS server for resource information
  queries GIIS server for information
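The relationship between the four pieces can be sketched as an in-memory model. This is an assumption-laden illustration, not the Globus implementation or its wire protocol: the class names reuse the acronyms from the slide, but every method name and attribute (`register`, `query`, `os`, `free_cpus`) is hypothetical.

```python
class GRIS:
    """Holds information about one resource."""

    def __init__(self, resource_name):
        self.resource_name = resource_name
        self.info = {}

    def register(self, **status):
        """Registration (GRR): the resource updates its own status."""
        self.info.update(status)

    def query(self):
        """Inquiry (GRI) against a single resource."""
        return dict(self.info)


class GIIS:
    """Aggregate directory built from several GRIS servers."""

    def __init__(self):
        self.servers = []

    def add(self, gris):
        self.servers.append(gris)

    def query(self, **required):
        """Inquiry (GRI) against the index: return resources matching all filters."""
        matches = {}
        for gris in self.servers:
            info = gris.query()
            if all(info.get(key) == value for key, value in required.items()):
                matches[gris.resource_name] = info
        return matches
```

A client that already knows a resource asks its GRIS directly; a client that needs to discover resources asks the GIIS, which answers from the information gathered across all registered GRIS servers.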
-
Open Grid Services Architecture
Marriage of grid protocols with web service protocols
Specifications for
  How Grid Services are created and discovered
  How Grid Service instances are named and referenced
  Interfaces that define any Grid Service
Initial release with GT 3.0 mid-2003; GT 4.0 Jan 2005
-
Grid Examples
Network for Earthquake Engineering and Simulation (NEESGrid)
Biomedical Informatics Research Network (BIRN)
EcoGrid
-
NEESGrid (Network for Earthquake Engineering and Simulation)
Linking scientists and facilities
  observation of an experiment in progress
  observation before and after an experiment
  remote operation of an experiment
Linking facilities and data
  hybrid operation of physical simulations with other simulations, both physical and numerical
  automatic archiving of raw data, calibration data, and processed data
Linking scientists and data
  collaborative views (static) of time-synchronized data visualizations
  collaborative views of time-synchronized data visualizations with video and audio recordings
Linking scientists and other scientists
  synchronous communication, such as with colleagues during an experiment
  asynchronous communication, such as with colleagues over the course of preparing a publication resulting from an experiment
-
NEESGrid (Network for Earthquake Engineering and Simulation)
-
NEESGrid (Network for Earthquake Engineering and Simulation)
Network Architecture Diagram
[Diagram labels: NEES Equipment Site (shake table with instrumentation, Data General POP, local computers & storage, edge router, Gigabit Ethernet); Wide Area Network (> Gb/s WAN); Equipment Site; User Site; Resource Site; NEESgrid Operations]
-
BIRN (Biomedical Informatics Research Network)
Testbed for a biomedical knowledge infrastructure
  Federated database of neuro-imaging data
  Fusion of diverse data sources (location; level of aggregation)
  Grid access to computational resources
  Data mining software
  Scalable and extensible
  Driven by research needs, not by technology pull or technology push
-
BIRN (Biomedical Informatics Research Network)
-
BIRN (Biomedical Informatics Research Network)
-
EcoGrid
Metadata standardization
  Ecological Metadata Language (EML)
Integrate diverse data networks from ecology, biodiversity, and environmental sciences
Standardized interfaces to data resources
  Metacat
  SRB
  DiGIR
  Xanthoria
Metadata-mediated data access (application-based)
  Supports multiple metadata standards
  EML, Darwin Core as foci
Computational services
  Pre-defined analytical services
  On-the-fly analytical services
-
EcoGrid
*EML facilitates semi-automatic data binding
-
EcoGrid
-
Grid Organizations
Globus Alliance
  Globus Toolkit(TM): reference implementation of the grid architecture and grid protocols
  http://www.globus.org
NSF Middleware Initiative (NMI)
  Supports the design, development, testing, and deployment of middleware for HPC
  http://www.nsf-middleware.org
GRIDS Center
  Grid Research Integration Deployment and Support Center, part of NMI
  http://www.grids-center.org
Global Grid Forum
  Main standards body governing the world-wide grid community
  http://www.globalgridforum.org
-
Recommended Texts
Grid Computing: A Practical Guide to Technology and Applications. Ahmar Abbas. Charles River Media, 2004.
Introduction to Grid Computing with Globus. Luis Ferreira et al. IBM Redbooks, 2004.
Enabling Applications for Grid Computing with Globus. Bart Jacob et al. IBM Redbooks, 2003.
Grid Services Programming and Application Enablement. Luis Ferreira et al. IBM Redbooks, 2004.