grid document

Upload: sumit-batra

Post on 03-Apr-2018


  • 7/28/2019 Grid Document

    1/31

    Sumitbatra203

    CONTENTS

    1. INTRODUCTION

    2. GRID CONSTRUCTION: GENERAL PRINCIPLES

    3. GRID ARCHITECTURE

    4. GRID APPLICATIONS

    5. CONCLUSIONS AND FUTURE TRENDS

    6. BIBLIOGRAPHY

    Grid Computing

    INTRODUCTION

    The popularity of the Internet as well as the availability of powerful

    computers and high-speed network technologies as low-cost commodity

    components is changing the way we use computers today. These technology

    opportunities have led to the possibility of using distributed computers as a

    single, unified computing resource, leading to what is popularly known as Grid

    computing. The term Grid is chosen as an analogy to a power Grid that provides

    consistent, pervasive, dependable, transparent access to electricity irrespective

    of its source. A detailed analysis of this analogy can be found in. This new

    approach to network computing is known by several names, such as

    metacomputing, scalable computing, global computing, Internet computing, and

    more recently peer-to-peer (P2P) computing.

    Figure 1. Towards Grid computing: a conceptual view.


    Grids enable the sharing, selection, and aggregation of a wide variety of

    resources including supercomputers, storage systems, data sources, and

    specialized devices (see Figure 1) that are geographically distributed and owned

    by different organizations for solving large-scale computational and

    data-intensive problems in science, engineering, and commerce. Grids thus create

    virtual organizations and enterprises: temporary alliances of enterprises or

    organizations that come together to share skills, core competencies, or

    resources in order to better respond to business opportunities

    or large-scale application processing requirements, and whose cooperation is

    supported by computer networks.

    The concept of Grid computing started as a project to link

    geographically dispersed supercomputers, but now it has grown far beyond its

    original intent. The Grid infrastructure can benefit many applications, including

    collaborative engineering, data exploration, high-throughput computing, and

    distributed supercomputing.

    A Grid can be viewed as a seamless, integrated computational and

    collaborative environment (see Figure 1). The users interact with the Grid

    resource broker to solve problems, which in turn performs resource discovery,

    scheduling, and the processing of application jobs on the distributed Grid

    resources. From the end-user point of view, Grids can be used to provide the

    following types of services.

    Computational services. These are concerned with providing secure services

    for executing application jobs on distributed computational resources

    individually or collectively. Resource brokers provide the services for

    collective use of distributed resources. A Grid providing computational services

    is often called a computational Grid. Some examples of computational Grids

    are: NASA IPG, the World Wide Grid, and the NSF TeraGrid.


    Data services. These are concerned with providing secure access to distributed

    datasets and their management. To provide scalable storage of and access to

    datasets, they may be replicated and catalogued, and different datasets may even

    be stored in different locations to create an illusion of mass storage. The

    processing of datasets is carried out using computational Grid services, and such

    a combination is commonly called a data Grid. Sample applications that need such

    services for management, sharing, and processing of large datasets are high-

    energy physics and accessing distributed chemical databases for drug design.

    Application services. These are concerned with application management and

    providing access to remote software and libraries transparently. The emerging

    technologies such as Web services are expected to play a leading role in

    defining application services. They build on computational and data services

    provided by the Grid. An example system that can be used to develop such

    services is NetSolve.

    Information services. These are concerned with the extraction and presentation

    of data with meaning by using the services of computational, data, and/or

    application services. The low-level details handled at this level are the way that

    information is represented, stored, accessed, shared, and maintained. Given its

    key role in many scientific endeavors, the Web is the obvious point of departure

    for this level.

    Knowledge services. These are concerned with the way that knowledge is

    acquired, used, retrieved, published, and maintained to assist users in achieving

    their particular goals and objectives. Knowledge is understood as information

    applied to achieve a goal, solve a problem, or execute a decision. An example of

    this is data mining for automatically building new knowledge.
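    The broker interaction described earlier in this section (resource discovery,
    followed by scheduling and job processing) can be illustrated with a minimal
    sketch. The Resource fields, the least-loaded selection rule, and the resource
    names are assumptions made for illustration only; real brokers apply far
    richer policies.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str        # hypothetical resource identifier
    free_cpus: int   # CPUs currently available
    load: float      # current load, 0.0 (idle) to 1.0 (saturated)


def discover(resources, cpus_needed):
    """Resource discovery: keep only resources that can run the job."""
    return [r for r in resources if r.free_cpus >= cpus_needed]


def schedule(resources, cpus_needed):
    """Scheduling: pick the least-loaded suitable resource, or None."""
    candidates = discover(resources, cpus_needed)
    if not candidates:
        return None
    chosen = min(candidates, key=lambda r: r.load)
    chosen.free_cpus -= cpus_needed  # account for the submitted job
    return chosen


pool = [
    Resource("cluster-a", free_cpus=8, load=0.7),
    Resource("cluster-b", free_cpus=16, load=0.2),
    Resource("workstation", free_cpus=2, load=0.1),
]
target = schedule(pool, cpus_needed=4)
print(target.name if target else "no resource available")  # prints "cluster-b"
```

    The workstation is the least loaded machine but is filtered out during
    discovery because it cannot satisfy the CPU requirement, so the job goes to
    the least-loaded resource that can.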


    To build a Grid, the development and deployment of a number of

    services is required. These include security, information, directory, resource

    allocation, and payment mechanisms in an open environment and high-level

    services for application development, execution management, resource

    aggregation, and scheduling.

    Grid applications (typically multidisciplinary and large-scale processing

    applications) often couple resources that cannot be replicated at a single site, or

    which may be globally located for other practical reasons. These are some of the

    driving forces behind the foundation of global Grids. In this light, the Grid

    allows users to solve larger or new problems by pooling together resources that

    could not be easily coupled before. Hence, the Grid is not only a computing

    infrastructure for large applications; it is a technology that can bond and unify

    remote and diverse distributed resources ranging from meteorological sensors to

    data vaults and from parallel supercomputers to personal digital organizers. As

    such, it will provide pervasive services to all users that need them.

    This paper aims to present the state of the art of Grid computing and

    attempts to survey the major international efforts in this area.

    Benefits of Grid Computing

    Grid computing can provide many benefits not available with traditional

    computing models:

    Better utilization of resources. Grid computing uses distributed resources

    more efficiently and delivers more usable computing power. This can decrease

    time-to-market, allow for innovation, or enable additional testing and simulation

    for improved product quality. By employing existing resources, grid computing

    helps protect IT investments, containing costs while providing more capacity.


    Increased user productivity. By providing transparent access to resources,

    work can be completed more quickly. Users gain additional productivity as they

    can focus on design and development rather than wasting valuable time hunting

    for resources and manually scheduling and managing large numbers of jobs.

    Scalability. Grids can grow seamlessly over time, allowing many thousands

    of processors to be integrated into one cluster. Components can be updated

    independently and additional resources can be added as needed, reducing large

    one-time expenses.

    Flexibility. Grid computing provides computing power where it is needed

    most, helping to better meet dynamically changing workloads. Grids can

    contain heterogeneous compute nodes, allowing resources to be added and

    removed as needs dictate.

    Levels of Deployment

    Grid computing can be divided into three logical levels of deployment:

    Cluster Grids, Enterprise Grids, and Global Grids.

    Cluster Grids

    The simplest form of a grid, a Cluster Grid consists of multiple systems

    interconnected through a network. Cluster Grids may contain distributed

    workstations and servers, as well as centralized resources in a datacenter

    environment. Typically owned and used by a single project or department,

    Cluster Grids support both high-throughput and high-performance jobs.

    Common examples of the Cluster Grid architecture include compute farms,

    groups of multi-processor HPC systems, Beowulf clusters, and networks of

    workstations (NOW).


    Enterprise Grids

    As capacity needs increase, multiple Cluster Grids can be combined into

    an Enterprise Grid. Enterprise Grids enable multiple projects or departments to

    share computing resources in a cooperative way. Enterprise Grids typically

    contain resources from multiple administrative domains, but are located in the

    same geographic location.

    Global Grids

    Global Grids are a collection of Enterprise Grids, all of which have

    agreed upon global usage policies and protocols, but not necessarily the same

    implementation. Computing resources may be geographically dispersed,

    connecting sites around the globe. Designed to support and address the needs of

    multiple sites and organizations sharing resources, Global Grids provide the

    power of distributed resources to users anywhere in the world.

    Figure 2. Three levels of grid computing: cluster, enterprise, and global grids.

    GRID CONSTRUCTION: GENERAL PRINCIPLES


    This section briefly highlights some of the general principles that

    underlie the construction of the Grid. In particular, the idealized design features

    that are required by a Grid to provide users with a seamless computing

    environment are discussed. Four main aspects characterize a Grid.

    Multiple administrative domains and autonomy. Grid resources are

    geographically distributed across multiple administrative domains and owned

    by different organizations. The autonomy of resource owners needs to be

    honored along with their local resource management and usage policies.

    Heterogeneity. A Grid involves a multiplicity of resources that are

    heterogeneous in nature and will encompass a vast range of technologies.

    Scalability. A Grid might grow from a few integrated resources to millions.

    This raises the problem of potential performance degradation as the size of

    Grids increases. Consequently, applications that require a large number of

    geographically distributed resources must be designed to be latency- and

    bandwidth-tolerant.

    Dynamicity or adaptability. In a Grid, resource failure is the rule rather than

    the exception. In fact, with so many resources in a Grid, the probability of some

    resource failing is high. Resource managers or applications must tailor their

    behavior dynamically and use the available resources and services efficiently

    and effectively.
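    The claim that failure becomes the rule as a Grid grows can be made concrete
    with a short calculation. The sketch below assumes, purely for illustration,
    that each of n resources fails independently with the same probability p over
    some period; the chance that at least one fails is then 1 - (1 - p)^n. The
    value p = 0.001 is a made-up figure, not a measurement.

```python
def prob_any_failure(n, p):
    """Probability that at least one of n independent resources fails,
    when each fails with probability p: 1 - (1 - p)**n."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-resource failure probability of 0.1% per period.
for n in (10, 1_000, 100_000):
    print(n, round(prob_any_failure(n, 0.001), 4))
```

    Even with a very reliable individual resource, the probability of some
    failure approaches one as n reaches the thousands, which is why resource
    managers and applications must adapt dynamically.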

    Design Features


    The following are the main design features required by a Grid

    environment.

    Administrative hierarchy. An administrative hierarchy is the way that each

    Grid environment divides itself up to cope with a potentially global extent. The

    administrative hierarchy determines how administrative information flows

    through the Grid.

    Communication services. The communication needs of applications using a

    Grid environment are diverse, ranging from reliable point-to-point to unreliable

    multicast communications. The communications infrastructure needs to support

    protocols that are used for bulk-data transport, streaming data, group

    communications, and those used by distributed objects. The network services

    used also provide the Grid with important QoS parameters such as latency,

    bandwidth, reliability, fault-tolerance, and jitter control.

    Information services. A Grid is a dynamic environment where the location and

    types of services available are constantly changing. A major goal is to make all

    resources accessible to any process in the system, without regard to the relative

    location of the resource user. It is necessary to provide mechanisms to enable a

    rich environment in which information is readily obtained by requesting

    services. The Grid information (registration and directory) services components

    provide the mechanisms for registering and obtaining information about the

    Grid structure, resources, services, and status.

    Naming services. In a Grid, as in any distributed system, names are used to

    refer to a wide variety of objects such as computers, services, or data objects.

    The naming service provides a uniform name space across the complete Grid

    environment. Typical naming services are provided by the international X.500

    naming scheme or DNS, the Internet's scheme.


    Distributed file systems and caching. Distributed applications, more often than

    not, require access to files distributed among many servers. A distributed file

    system is therefore a key component in a distributed system. From an

    application's point of view it is important that a distributed file system can

    provide a uniform global namespace, support a range of file I/O protocols,

    require little or no program modification, and provide means that enable

    performance optimizations to be implemented, such as the usage of caches.

    Security and authorization. Any distributed system involves all four aspects of

    security: confidentiality, integrity, authentication, and accountability. Security

    within a Grid environment is a complex issue requiring diverse resources

    autonomously administered to interact in a manner that does not impact the

    usability of the resources or introduce security holes/lapses in individual

    systems or the environment as a whole. A security infrastructure is the key to

    the success or failure of a Grid environment.

    System status and fault tolerance. To provide a reliable and robust environment

    it is important that a means of monitoring resources and applications is

    provided. To accomplish this task, tools that monitor resources and applications

    need to be deployed.

    Resource management and scheduling. The management of processor time,

    memory, network, storage, and other components in a Grid is clearly very

    important. The overall aim is to efficiently and effectively schedule the

    applications that need to utilize the available resources in the Grid computing

    environment. From a user's point of view, resource management and scheduling

    should be transparent, with their interaction confined to a mechanism for

    submitting their application. It is important in a Grid that a

    resource management and scheduling service can interact with those that may

    be installed locally.


    Computational economy and resource trading. As a Grid is constructed by

    coupling resources distributed across various organizations and administrative

    domains that may be owned by different organizations, it is essential to support

    mechanisms and policies that help regulate resource supply and demand. An

    economic approach is one means of managing resources in a complex and

    decentralized manner. This approach provides incentives for resource owners

    and users to be part of the Grid and to develop and use strategies that help

    maximize their objectives.

    Programming tools and paradigms. Grid applications (multi-disciplinary

    applications) couple resources that cannot be replicated at a single site or

    may be globally located for other practical reasons. A Grid should include

    interfaces, APIs, utilities, and tools to provide a rich development environment.

    Common scientific languages such as C, C++, and Fortran should be available,

    as should application-level interfaces such as MPI and PVM. A variety of

    programming paradigms should be supported, such as message passing or

    distributed shared memory. In addition, a suite of numerical and other

    commonly used libraries should be available.

    User and administrative GUI. The interfaces to the services and resources

    available should be intuitive and easy to use. In addition, they should work on a

    range of different platforms and operating systems. They also need to take

    advantage of Web technologies to offer a view of portal supercomputing. The

    Web-centric approach to access supercomputing resources should enable users

    to access any resource from anywhere over any platform at any time. That

    is, users should be able to submit their jobs to computational

    resources through a Web interface from any of the accessible platforms such as

    PCs, laptops, or Personal Digital Assistants, thus supporting ubiquitous

    access to the Grid. The provision of access to scientific applications through the

    Web (e.g. RWCP's parallel protein information analysis system) leads to the

    creation of science portals.


    GRID ARCHITECTURE

    Our goal in describing our Grid architecture is not to provide a complete

    enumeration of all required protocols (and services, APIs, and SDKs) but rather

    to identify requirements for general classes of component. The result is an

    extensible, open architectural structure within which can be placed solutions to

    key virtual organization (VO) requirements. Our architecture and the subsequent discussion organize

    components into layers, as shown in Figure. Components within each layer

    share common characteristics but can build on capabilities and behaviors

    provided by any lower layer.

    In specifying the various layers of the Grid architecture, we follow the

    principles of the hourglass model. The narrow neck of the hourglass defines a

    small set of core abstractions and protocols (e.g., TCP and HTTP in the

    Internet), onto which many different high-level behaviors can be mapped (the

    top of the hourglass), and which themselves can be mapped onto many different

    underlying technologies (the base of the hourglass). By definition, the number

    of protocols defined at the neck must be small. In our architecture, the neck of

    the hourglass consists of Resource and Connectivity protocols, which facilitate

    the sharing of individual resources. Protocols at these layers are designed so that

    they can be implemented on top of a diverse range of resource types, defined at

    the Fabric layer, and can in turn be used to construct a wide range of global

    services and application-specific behaviors at the Collective layer, so called

    because they involve the coordinated (collective) use of multiple resources.


    Figure 3. The layered Grid architecture and its relationship to the

    Internet protocol architecture. Because the Internet protocol

    architecture extends from network to application, there is a

    mapping from Grid layers into Internet layers.

    Fabric: Interfaces to Local Control

    The Grid Fabric layer provides the resources to which shared access is

    mediated by Grid protocols: for example, computational resources, storage

    systems, catalogs, network resources, and sensors. A resource may be a

    logical entity, such as a distributed file system, computer cluster, or distributed

    computer pool; in such cases, a resource implementation may involve internal

    protocols (e.g., the NFS storage access protocol or a cluster resource

    management system's process management protocol), but these are not the

    concern of Grid architecture.

    Fabric components implement the local, resource-specific operations

    that occur on specific resources (whether physical or logical) as a result of

    sharing operations at higher levels. There is thus a tight and subtle

    interdependence between the functions implemented at the Fabric level, on the

    one hand, and the sharing operations supported, on the other. Richer Fabric

    functionality enables more sophisticated sharing operations; at the same time, if


    we place few demands on Fabric elements, then deployment of Grid

    infrastructure is simplified. For example, resource-level support for advance

    reservations makes it possible for higher-level services to aggregate

    (coschedule) resources in interesting ways that would otherwise be impossible

    to achieve. However, as in practice few resources support advance reservation

    out of the box, a requirement for advance reservation increases the cost of

    incorporating new resources into a Grid.

    Experience suggests that at a minimum, resources should implement

    enquiry mechanisms that permit discovery of their structure, state, and

    capabilities (e.g., whether they support advance reservation) on the one hand,

    and resource management mechanisms that provide some control of delivered

    quality of service, on the other. The following brief and partial list provides a

    resource-specific characterization of capabilities.

    Computational resources: Mechanisms are required for starting programs

    and for monitoring and controlling the execution of the resulting processes.

    Management mechanisms that allow control over the resources allocated to

    processes are useful, as are advance reservation mechanisms. Enquiry

    functions are needed for determining hardware and software characteristics

    as well as relevant state information such as current load and queue state in

    the case of scheduler-managed resources.

    Storage resources: Mechanisms are required for putting and getting files.

    Third-party and high-performance (e.g., striped) transfers are useful. So are

    mechanisms for reading and writing subsets of a file and/or executing

    remote data selection or reduction functions. Management mechanisms that

    allow control over the resources allocated to data transfers (space, disk

    bandwidth, network bandwidth, CPU) are useful, as are advance reservation

    mechanisms. Enquiry functions are needed for determining hardware and


    software characteristics as well as relevant load information such as

    available space and bandwidth utilization.

    Network resources: Management mechanisms that provide control over the

    resources allocated to network transfers (e.g., prioritization, reservation) can

    be useful. Enquiry functions should be provided to determine network

    characteristics and load.

    Code repositories: This specialized form of storage resource requires

    mechanisms for managing versioned source and object code: for example, a

    control system such as CVS.

    Catalogs: This specialized form of storage resource requires mechanisms for

    implementing catalog query and update operations: for example, a relational

    database.
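    The minimum Fabric-level capabilities listed above (an enquiry mechanism plus
    a resource management mechanism) can be sketched as a small common interface.
    The class and method names below are hypothetical and chosen only for
    illustration; they are not taken from any Grid toolkit.

```python
from abc import ABC, abstractmethod


class FabricResource(ABC):
    """Hypothetical minimum interface for a Fabric-layer resource."""

    @abstractmethod
    def enquire(self) -> dict:
        """Enquiry: report structure, state, and capabilities."""

    @abstractmethod
    def allocate(self, amount: int) -> bool:
        """Management: try to claim `amount` units; True on success."""


class ComputeResource(FabricResource):
    def __init__(self, cpus: int, supports_reservation: bool = False):
        self.cpus = cpus
        self.in_use = 0
        self.supports_reservation = supports_reservation

    def enquire(self) -> dict:
        return {
            "kind": "compute",
            "cpus": self.cpus,
            "load": self.in_use / self.cpus,
            "advance_reservation": self.supports_reservation,
        }

    def allocate(self, amount: int) -> bool:
        if self.in_use + amount > self.cpus:
            return False
        self.in_use += amount
        return True


class StorageResource(FabricResource):
    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.used_gb = 0

    def enquire(self) -> dict:
        return {
            "kind": "storage",
            "free_gb": self.capacity_gb - self.used_gb,
            "advance_reservation": False,
        }

    def allocate(self, amount: int) -> bool:
        if self.used_gb + amount > self.capacity_gb:
            return False
        self.used_gb += amount
        return True


# Higher layers can treat both resource types uniformly.
for res in (ComputeResource(cpus=4), StorageResource(capacity_gb=100)):
    res.allocate(2)
    print(res.enquire())
```

    Keeping the interface this small reflects the trade-off discussed above: the
    fewer demands placed on Fabric elements, the cheaper it is to bring a new
    resource into the Grid.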

    Connectivity: Communicating Easily and Securely

    The Connectivity layer defines core communication and authentication

    protocols required for Grid-specific network transactions. Communication

    protocols enable the exchange of data between Fabric layer resources.

    Authentication protocols build on communication services to provide

    cryptographically secure mechanisms for verifying the identity of users and

    resources.

    Communication requirements include transport, routing, and naming.

    While alternatives certainly exist, we assume here that these protocols are drawn

    from the TCP/IP protocol stack: specifically, the Internet (IP and ICMP),

    transport (TCP, UDP), and application (DNS, OSPF, RSVP, etc.) layers of the

    Internet layered protocol architecture. This is not to say that in the future, Grid

    communications will not demand new protocols that take into account particular

    types of network dynamics.


    With respect to security aspects of the Connectivity layer, we observe

    that the complexity of the security problem makes it important that any

    solutions be based on existing standards whenever possible. As with

    communication, many of the security standards developed within the context of

    the Internet protocol suite are applicable.

    Authentication solutions for VO environments should have the following

    characteristics:

    Single sign on. Users must be able to log on (authenticate) just once and

    then have access to multiple Grid resources defined in the Fabric layer,

    without further user intervention.

    Delegation. A user must be able to endow a program with the ability to

    run on that user's behalf, so that the program is able to access the

    resources on which the user is authorized. The program should

    (optionally) also be able to conditionally delegate a subset of its rights to

    another program (sometimes referred to as restricted delegation).

    Integration with various local security solutions: Each site or resource

    provider may employ any of a variety of local security solutions,

    including Kerberos and Unix security. Grid security solutions must be

    able to interoperate with these various local solutions. They cannot,

    realistically, require wholesale replacement of local security solutions but

    rather must allow mapping into the local environment.

    User-based trust relationships: In order for a user to use resources from

    multiple providers together, the security system must not require each of

    the resource providers to cooperate or interact with each other in

    configuring the security environment. For example, if a user has the right

    to use sites A and B, the user should be able to use sites A and B together

    without requiring that A's and B's security administrators interact.
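    The delegation requirement listed above, including restricted delegation, can
    be modelled in a few lines. This is only a sketch of the rights bookkeeping:
    the Credential class and its fields are hypothetical, and no cryptography is
    involved, whereas a real deployment would rely on cryptographically signed
    credentials.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    owner: str         # user on whose behalf the holder acts
    holder: str        # user or program presenting the credential
    rights: frozenset  # operations the holder may perform

    def delegate(self, new_holder, rights):
        """Issue a credential carrying a subset of this one's rights."""
        requested = frozenset(rights)
        if not requested <= self.rights:
            raise PermissionError("cannot delegate rights that are not held")
        return Credential(self.owner, new_holder, requested)


# The user signs on once, then programs act on the user's behalf.
user = Credential("alice", "alice", frozenset({"read", "write", "submit"}))
job = user.delegate("job-manager", {"read", "submit"})
stager = job.delegate("file-stager", {"read"})
print(stager.owner, sorted(stager.rights))  # prints: alice ['read']
```

    Each step can only narrow the set of rights, so a program further down the
    chain never holds more authority than the program that delegated to it.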


    Resource: Sharing Single Resources

    The Resource layer builds on Connectivity layer communication and

    authentication protocols to define protocols (and APIs and SDKs) for the secure

    negotiation, initiation, monitoring, control, accounting, and payment of sharing

    operations on individual resources. Resource layer implementations of these

    protocols call Fabric layer functions to access and control local resources.

    Resource layer protocols are concerned entirely with individual resources and

    hence ignore issues of global state and atomic actions across distributed

    collections; such issues are the concern of the Collective layer discussed next.

    Two primary classes of Resource layer protocols can be distinguished:

    Information protocols are used to obtain information about the structure

    and state of a resource, for example, its configuration, current load, and

    usage policy (e.g., cost).

    Management protocols are used to negotiate access to a shared resource,

    specifying, for example, resource requirements (including advance

    reservation and quality of service) and the operation(s) to be performed,

    such as process creation or data access. Since management protocols are

    responsible for instantiating sharing relationships, they must serve as a

    policy application point, ensuring that the requested protocol operations

    are consistent with the policy under which the resource is to be shared.

    Issues that must be considered include accounting and payment. A

    protocol may also support monitoring the status of an operation and

    controlling (for example, terminating) the operation.
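    The two classes of Resource layer protocol can be pictured as two entry
    points on a single resource, with the management entry point acting as the
    policy application point. The sketch below is an assumption-laden toy: the
    class name, the policy format (a mapping from user to permitted operations),
    and the return strings are all invented for illustration.

```python
class SharedResource:
    """Hypothetical resource exposing both Resource-layer protocol classes."""

    def __init__(self, name, policy):
        self.name = name
        self.policy = policy  # maps user -> set of permitted operations
        self.load = 0

    def info(self):
        """Information protocol: describe structure and state."""
        return {"name": self.name, "load": self.load}

    def request(self, user, operation):
        """Management protocol: check policy before starting an operation."""
        allowed = self.policy.get(user, set())
        if operation not in allowed:
            return "denied"
        self.load += 1
        return "started"


res = SharedResource("store-1", {"alice": {"data_access"}})
print(res.request("alice", "data_access"))       # prints "started"
print(res.request("alice", "process_creation"))  # prints "denied"
print(res.info())
```

    Note that the resource only reasons about itself; nothing here coordinates
    state across several resources, which is left to the Collective layer.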


    Collective: Coordinating Multiple Resources

    While the Resource layer is focused on interactions with a single

    resource, the next layer in the architecture contains protocols and services (and

    APIs and SDKs) that are not associated with any one specific resource but

    rather are global in nature and capture interactions across collections of

    resources. For this reason, we refer to the next layer of the architecture as the

    Collective layer. Because Collective components build on the narrow Resource

    and Connectivity layer neck in the protocol hourglass, they can implement a

    wide variety of sharing behaviors without placing new requirements on the

    resources being shared. For example:

    Directory services allow VO participants to discover the existence and/or

    properties of VO resources. A directory service may allow its users to

    query for resources by name and/or by attributes such as type,

    availability, or load. Resource-level GRRP and GRIP protocols are used

    to construct directories.

    Co-allocation, scheduling, and brokering services allow VO participants

    to request the allocation of one or more resources for a specific purpose

    and the scheduling of tasks on the appropriate resources. Examples

    include AppLeS, Condor-G, Nimrod-G, and the DRM broker.

    Monitoring and diagnostics services support the monitoring of VO

    resources for failure, adversarial attack (intrusion detection), overload,

    and so forth.

    Data replication services support the management of VO storage (and

    perhaps also network and computing) resources to maximize data access

    performance with respect to metrics such as response time, reliability, and

    cost.


    Grid-enabled programming systems enable familiar programming models

    to be used in Grid environments, using various Grid services to address

    resource discovery, security, resource allocation, and other concerns.

    Examples include Grid-enabled implementations of the Message Passing

    Interface and manager-worker frameworks.

    Workload management systems and collaboration frameworks, also

    known as problem solving environments (PSEs), provide for the

    description, use, and management of multi-step, asynchronous,

    multi-component workflows.

    Software discovery services discover and select the best software

    implementation and execution platform based on the parameters of the

    problem being solved. Examples include NetSolve and Ninf.

    Community authorization servers enforce community policies governing

    resource access, generating capabilities that community members can use

    to access community resources. These servers provide a global policy

    enforcement service by building on resource information and resource

    management protocols (in the Resource layer) and security protocols in

    the Connectivity layer. Akenti addresses some of these issues.

    Community accounting and payment services gather resource usage

    information for the purpose of accounting, payment, and/or limiting of

    resource usage by community members.

    Collaboratory services support the coordinated exchange of information within potentially large user communities, whether synchronously or

    asynchronously. Examples are CAVERNsoft, Access Grid, and

    commodity groupware systems.
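As one illustration of the data replication service described in the list above, the following minimal sketch ranks replicas by a weighted score over response time, reliability, and cost. The `Replica` record, the weights, and the site names are hypothetical and chosen only for illustration; they are not taken from any particular Grid toolkit.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Replica:
    """One copy of a dataset held at a VO storage site (hypothetical record)."""
    site: str
    response_time_s: float  # expected time to deliver the data, in seconds
    reliability: float      # probability in [0, 1] that the transfer succeeds
    cost: float             # charge for the access, in arbitrary units


def score(r: Replica, w_time: float = 1.0, w_fail: float = 100.0,
          w_cost: float = 0.5) -> float:
    """Lower is better: penalise slow, unreliable, and expensive replicas."""
    return (w_time * r.response_time_s
            + w_fail * (1.0 - r.reliability)
            + w_cost * r.cost)


def select_replica(replicas: Sequence[Replica]) -> Replica:
    """Return the replica with the lowest weighted score."""
    if not replicas:
        raise ValueError("no replicas available")
    return min(replicas, key=score)


replicas = [
    Replica("site-a", response_time_s=12.0, reliability=0.99, cost=4.0),
    Replica("site-b", response_time_s=5.0, reliability=0.80, cost=1.0),
    Replica("site-c", response_time_s=8.0, reliability=0.95, cost=2.0),
]
best = select_replica(replicas)
print(best.site)
```

With these assumed weights the fastest site is not chosen, because its lower reliability outweighs its speed. A real service would measure these metrics rather than take them as fixed inputs.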

    These examples illustrate the wide variety of Collective layer protocols

    and services that are encountered in practice. Notice that while Resource layer

    protocols must be general in nature and are widely deployed, Collective layer

    protocols span the spectrum from general purpose to highly application or

    domain specific, with the latter existing perhaps only within specific VOs.

    Collective functions can be implemented as persistent services, with

    associated protocols, or as SDKs (with associated APIs) designed to be linked

    with applications. In both cases, their implementation can build on Resource

    layer (or other Collective layer) protocols and APIs. For example, Figure 4 shows

    a Collective co-allocation API and SDK (the middle tier) that uses a Resource

    layer management protocol to manipulate underlying resources. Above this, we

    define a co-reservation service protocol and implement a co-reservation service

    that speaks this protocol, calling the co-allocation API to implement

    co-allocation operations and perhaps providing additional functionality, such as

    authorization, fault tolerance, and logging. An application might then use the

    co-reservation service protocol to request end-to-end network reservations.

    Figure 4. Collective and Resource layer protocols, services, APIs,

    and SDKs can be combined in a variety of ways to deliver

    functionality to applications.
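The layering described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of any real Grid toolkit: `ResourceManager` stands in for a Resource layer management endpoint, `CoAllocator` for the Collective co-allocation API and SDK, and `CoReservationService` for the service that adds authorization and logging. All class and method names are hypothetical.

```python
import logging
from typing import Dict, List, Set

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("co-reservation")


class ResourceManager:
    """Stand-in for a Resource layer management endpoint (hypothetical)."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.free = capacity

    def reserve(self, units: int) -> bool:
        if units > self.free:
            return False
        self.free -= units
        return True

    def release(self, units: int) -> None:
        self.free += units


class CoAllocator:
    """Collective layer API/SDK: reserve on every resource, or on none."""

    def co_allocate(self, requests: Dict[ResourceManager, int]) -> bool:
        granted: List[ResourceManager] = []
        for manager, units in requests.items():
            if manager.reserve(units):
                granted.append(manager)
            else:
                # Roll back earlier grants so a failure leaves no partial allocation.
                for g in granted:
                    g.release(requests[g])
                return False
        return True


class CoReservationService:
    """Service adding authorization and logging on top of the co-allocation API."""

    def __init__(self, allocator: CoAllocator, authorized_users: Set[str]) -> None:
        self.allocator = allocator
        self.authorized_users = authorized_users

    def request(self, user: str, requests: Dict[ResourceManager, int]) -> bool:
        if user not in self.authorized_users:
            log.info("rejected %s: not authorized", user)
            return False
        ok = self.allocator.co_allocate(requests)
        log.info("co-reservation for %s %s", user, "granted" if ok else "failed")
        return ok


link_a = ResourceManager("link-a", capacity=10)
link_b = ResourceManager("link-b", capacity=4)
service = CoReservationService(CoAllocator(), authorized_users={"alice"})

first = service.request("alice", {link_a: 6, link_b: 3})    # both links have room
second = service.request("alice", {link_a: 3, link_b: 3})   # link-b has 1 unit left
third = service.request("mallory", {link_a: 1})             # not in the community
```

The application only talks to the service; the rollback logic and the per-resource calls stay hidden in the lower tiers, which is the point of the layering.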

    Collective components may be tailored to the requirements of a specific

    user community, VO, or application domain, for example, an SDK that

    implements an application-specific coherency protocol, or a co-reservation

    service for a specific set of network resources. Other Collective components can

    be more general-purpose, for example, a replication service that manages an

    international collection of storage systems for multiple communities, or a

    directory service designed to enable the discovery of VOs. In general, the larger

    the target user community, the more important it is that a Collective

    component's protocol(s) and API(s) be standards based.

    Applications

    The final layer in our Grid architecture comprises the user applications

    that operate within a VO environment. Figure 5 illustrates an application

    programmer's view of Grid architecture. Applications are constructed in terms

    of, and by calling upon, services defined at any layer. At each layer, we have

    well-defined protocols that provide access to some useful service: resource

    management, data access, resource discovery, and so forth. At each layer, APIs

    may also be defined whose implementations (ideally provided by third-party

    SDKs) exchange protocol messages with the appropriate service(s) to perform

    desired actions.

    Figure 5. APIs are implemented by software development kits (SDKs), which in turn use Grid protocols to interact with network services that provide capabilities to the end user. Higher level SDKs can provide functionality that is not directly mapped to a specific protocol, but may combine protocol operations with calls to additional APIs as well as implement local functionality. Solid lines represent a direct call; dashed lines,

    protocol interactions.

    We emphasize that what we label "applications" and show in a single

    layer in Figure 4 may in practice call upon sophisticated frameworks and

    libraries (e.g., the Common Component Architecture, SciRun, CORBA,

    Cactus, workflow systems) and feature much internal structure that would, if

    captured in our figure, expand it out to many times its current size. These

    frameworks may themselves define protocols, services, and/or APIs (e.g., the

    Simple Workflow Access Protocol). However, these issues are beyond the

    scope of this article, which addresses only the most fundamental protocols and

    services required in a Grid.

    GRID APPLICATIONS

    What types of applications will grids be used for? Building on experiences in gigabit testbeds, the I-WAY network, and other experimental

    systems, we have identified five major application classes for computational

    grids, which are described briefly in this section. More details about applications and

    their technical requirements are provided in the referenced chapters.

    Distributed Supercomputing

    Distributed supercomputing applications use grids to aggregate substantial

    computational resources in order to tackle problems that cannot be solved on a

    single system. Depending on the grid on which we are working, these

    aggregated resources might comprise the majority of the supercomputers in the

    country or simply all of the workstations within a company. Here are some

    contemporary examples:

    Distributed interactive simulation (DIS) is a technique used for training and planning in the military. Realistic scenarios may involve hundreds of

    thousands of entities, each with potentially complex behavior patterns. Yet

    even the largest current supercomputers can handle at most 20,000 entities.

    In recent work, researchers at the California Institute of Technology have

    shown how multiple supercomputers can be coupled to achieve record-

    breaking levels of performance.

    The accurate simulation of complex physical processes can require high spatial and temporal resolution in order to resolve fine-scale detail. Coupled

    supercomputers can be used in such situations to overcome resolution

    barriers and hence to obtain qualitatively new scientific results. Although

    high latencies can pose significant obstacles, coupled supercomputers have

    been used successfully in cosmology, high-resolution ab initio

    computational chemistry computations, and climate modeling.

    Challenging issues from a grid architecture perspective include the need to

    co-schedule what are often scarce and expensive resources, the scalability of

    protocols and algorithms to tens or hundreds of thousands of nodes, latency-

    tolerant algorithms, and achieving and maintaining high levels of performance

    across heterogeneous systems.
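Co-scheduling, the first issue listed above, amounts to finding a time window in which every required machine is free at once. The sketch below is one simple way to do this, assuming each machine publishes a sorted list of non-overlapping free intervals; the interval data and function names are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end) in hours from now, end exclusive


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted lists of non-overlapping free intervals."""
    result: List[Interval] = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def earliest_common_slot(free: List[List[Interval]],
                         duration: int) -> Optional[Interval]:
    """Return the earliest window of `duration` hours free on every machine."""
    if not free:
        return None
    common = free[0]
    for intervals in free[1:]:
        common = intersect(common, intervals)
    for start, end in common:
        if end - start >= duration:
            return (start, start + duration)
    return None


# Free periods on three hypothetical supercomputers.
free_periods = [
    [(0, 4), (6, 12)],
    [(2, 8), (9, 14)],
    [(3, 11)],
]
slot = earliest_common_slot(free_periods, duration=2)
print(slot)
```

The more machines that must be held simultaneously, the smaller the common window becomes, which is why co-scheduling scarce resources is hard in practice.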

    High-Throughput Computing

    In high-throughput computing, the grid is used to schedule large numbers

    of loosely coupled or independent tasks, with the goal of putting unused

    processor cycles (often from idle workstations) to work. The result may be, as in

    distributed supercomputing, the focusing of available resources on a single

    problem, but the quasi-independent nature of the tasks involved leads to very

    different types of problems and problem-solving methods. Here are some

    examples:

    Platform Computing Corporation reports that the microprocessor manufacturer Advanced Micro Devices used high-throughput computing

    techniques to exploit over a thousand computers during the peak design

    phases of their K6 and K7 microprocessors. These computers were located

    on the desktops of AMD engineers at a number of AMD sites and were used for design verification only when not in use by engineers.

    The Condor system from the University of Wisconsin is used to manage pools of hundreds of workstations at universities and laboratories around

    the world. These resources have been used for studies as diverse as

    molecular simulations of liquid crystals, studies of ground penetrating

    radar, and the design of diesel engines.

    More loosely organized efforts have harnessed tens of thousands of computers distributed worldwide to tackle hard cryptographic problems.
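The loosely coupled style described in this section can be sketched as a task farm: a key space is split into independent chunks, and each chunk is searched without any communication between tasks. This is a toy illustration only. Threads stand in for idle workstations, the key space is deliberately tiny, and the function names are hypothetical; it does not reflect how Condor or any named system is implemented.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Tuple


def digest(key: int) -> str:
    """SHA-256 digest of a candidate key, used as the search target."""
    return hashlib.sha256(str(key).encode()).hexdigest()


def search_chunk(bounds: Tuple[int, int], target: str) -> Optional[int]:
    """Independent task: test every key in [lo, hi) against the target digest."""
    lo, hi = bounds
    for key in range(lo, hi):
        if digest(key) == target:
            return key
    return None


def make_chunks(key_space: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split [0, key_space) into half-open chunks of at most chunk_size keys."""
    return [(lo, min(lo + chunk_size, key_space))
            for lo in range(0, key_space, chunk_size)]


def farm(target: str, key_space: int, chunk_size: int,
         workers: int) -> Optional[int]:
    """Hand chunks to a pool of workers and return the first key found."""
    chunks = make_chunks(key_space, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Tasks share no state, so they can run in any order on any worker.
        for found in pool.map(lambda c: search_chunk(c, target), chunks):
            if found is not None:
                return found
    return None


secret = 31337
recovered = farm(digest(secret), key_space=50_000, chunk_size=5_000, workers=4)
print(recovered)
```

Because no task depends on another, a worker that disappears (for example, when an engineer returns to the desktop) costs only the chunk it was processing, which can simply be reissued.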

    On-Demand Computing

    On-demand applications use grid capabilities to meet short-term

    requirements for resources that cannot be cost effectively or conveniently

    located locally. These resources may be computation, software, data

    repositories, specialized sensors, and so on. In contrast to distributed

    supercomputing applications, these applications are often driven by cost-

    performance concerns rather than absolute performance. For example:

    The NEOS and NetSolve network-enhanced numerical solver systems allow users to couple remote software and resources into desktop

    applications, dispatching to remote servers calculations that are

    computationally demanding or that require specialized software.

    A computer-enhanced MRI machine and scanning tunneling microscope (STM) developed at the National Center for Supercomputing

    Applications use supercomputers to achieve real-time image processing.

    The result is a significant enhancement in the ability to understand what

    we are seeing and, in the case of the microscope, to steer the instrument.

    A system developed at the Aerospace Corporation for processing of data from meteorological satellites uses dynamically acquired supercomputer resources to deliver the results of a cloud detection

    algorithm to remote meteorologists in quasi real time.

    The challenging issues in on-demand applications derive primarily from the

    dynamic nature of resource requirements and the potentially large populations

    of users and resources. These issues include resource location, scheduling, code

    management, configuration, fault tolerance, security, and payment mechanisms.
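Since on-demand applications are driven by cost-performance rather than absolute performance, a broker might pick the cheapest resource that still meets a deadline instead of the fastest one. The following sketch shows that selection rule under simple assumptions: each provider advertises a fixed speed and hourly price, and the job size is known in advance. The `Offer` record and all figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """A resource offer from one provider (hypothetical record)."""
    provider: str
    speed: float           # work units processed per hour
    price_per_hour: float  # charge per hour of use, in arbitrary units


def run_time(offer: Offer, work: float) -> float:
    """Hours needed to process `work` units on this offer."""
    return work / offer.speed


def total_cost(offer: Offer, work: float) -> float:
    """Total charge for processing `work` units on this offer."""
    return run_time(offer, work) * offer.price_per_hour


def cheapest_within_deadline(offers: Sequence[Offer], work: float,
                             deadline_hours: float) -> Optional[Offer]:
    """Return the lowest-cost offer that finishes in time, or None."""
    feasible = [o for o in offers if run_time(o, work) <= deadline_hours]
    if not feasible:
        return None
    return min(feasible, key=lambda o: total_cost(o, work))


offers = [
    Offer("fast", speed=100.0, price_per_hour=50.0),
    Offer("medium", speed=40.0, price_per_hour=12.0),
    Offer("slow", speed=10.0, price_per_hour=2.0),
]
relaxed = cheapest_within_deadline(offers, work=200.0, deadline_hours=6.0)
urgent = cheapest_within_deadline(offers, work=200.0, deadline_hours=3.0)
print(relaxed.provider, urgent.provider)
```

Tightening the deadline changes the answer from the mid-priced provider to the expensive one, which is the cost-performance trade-off in miniature.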

    Data-Intensive Computing

    In data-intensive applications, the focus is on synthesizing new information

    from data that is maintained in geographically distributed repositories, digital

    libraries, and databases. This synthesis process is often computationally and

    communication intensive as well.

    Future high-energy physics experiments will generate terabytes of data per day, or around a petabyte per year. The complex queries used to

    detect "interesting" events may need to access large fractions of this

    data. The scientific collaborators who will access this data are widely

    distributed, and hence the data systems in which data is placed are likely

    to be distributed as well.

    The Digital Sky Survey will, ultimately, make many terabytes of astronomical photographic data available in numerous

    network-accessible databases. This facility enables new approaches to astronomical research based on distributed analysis, assuming that

    appropriate computational grid facilities exist.

    Modern meteorological forecasting systems make extensive use of data assimilation to incorporate remote satellite observations. The complete

    process involves the movement and processing of many gigabytes of

    data.

    Challenging issues in data-intensive applications are the scheduling and

    configuration of complex, high-volume data flows through multiple levels of

    hierarchy.
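The "terabytes per day, or around a petabyte per year" figure quoted earlier in this section can be checked with a short calculation. The rate of 3 TB per day used below is an assumed value picked for illustration, not a number from the text, and decimal units (1 PB = 1000 TB) are assumed.

```python
TB_PER_PB = 1000          # decimal units: 1 PB = 1000 TB
MB_PER_TB = 1_000_000     # decimal units: 1 TB = 1,000,000 MB
DAYS_PER_YEAR = 365
SECONDS_PER_DAY = 24 * 3600


def pb_per_year(tb_per_day: float) -> float:
    """Yearly data volume in petabytes for a given daily rate in terabytes."""
    return tb_per_day * DAYS_PER_YEAR / TB_PER_PB


def sustained_rate_mb_per_s(tb_per_day: float) -> float:
    """Average transfer rate needed to move one day's data within one day."""
    return tb_per_day * MB_PER_TB / SECONDS_PER_DAY


daily_tb = 3.0  # assumed illustrative rate
print(f"{pb_per_year(daily_tb):.3f} PB/year")
print(f"{sustained_rate_mb_per_s(daily_tb):.1f} MB/s sustained")
```

Under these assumptions a few terabytes per day does add up to roughly a petabyte per year, and merely keeping pace with the data requires a sustained rate of tens of megabytes per second before any analysis traffic is counted.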

    Collaborative Computing

    Collaborative applications are concerned primarily with enabling and

    enhancing human-to-human interactions. Such applications are often structured

    in terms of a virtual shared space. Many collaborative applications are

    concerned with enabling the shared use of computational resources such as data

    archives and simulations; in this case, they also have characteristics of the other

    application classes just described. For example:

    The BoilerMaker system developed at Argonne National Laboratory allows multiple users to collaborate on the design of emission control

    systems in industrial incinerators. The different users interact with each

    other and with a simulation of the incinerator.

    The CAVE5D system supports remote, collaborative exploration of large geophysical data sets and the models that generate them, for example, a

    coupled physical/biological model of the Chesapeake Bay.

    The NICE system developed at the University of Illinois at Chicago allows children to participate in the creation and maintenance of realistic

    virtual worlds, for entertainment and education.

    Challenging aspects of collaborative applications from a grid architecture

    perspective are the real-time requirements imposed by human perceptual

    capabilities and the rich variety of interactions that can take place.

    We conclude this section with three general observations. First, we note

    that even in this brief survey we see a tremendous variety of already successful

    applications. This rich set has been developed despite the significant difficulties

    faced by programmers developing grid applications in the absence of a mature

    grid infrastructure. As grids evolve, we expect the range and sophistication of

    applications to increase dramatically. Second, we observe that almost all of the

    applications demonstrate a tremendous appetite for computational resources

    (CPU, memory, disk, etc.) that cannot be met in a timely fashion by expected

    growth in single-system performance. This emphasizes the importance of grid

    technologies as a means of sharing computation as well as a data access and

    communication medium. Third, we see that many of the applications are

    interactive, or depend on tight synchronization with computational components,

    and hence depend on the availability of a grid infrastructure able to provide

    robust performance guarantees.

    CONCLUSIONS AND FUTURE TRENDS

    There are currently a large number of projects and a diverse range of new and emerging Grid developmental approaches being pursued. These systems

    range from Grid frameworks to application testbeds, and from collaborative

    environments to batch submission mechanisms.

    It is difficult to predict the future in a field such as information

    technology where the technological advances are moving very rapidly. Hence, it

    is not an easy task to forecast what will become the dominant Grid approach.

    Windows of opportunity for ideas and products seem to open and close in the

    blink of an eye. However, some trends are evident. One of those is growing

    interest in the use of Java and Web services for network computing.

    The Java programming language successfully addresses several key

    issues that accelerate the development of Grid environments, such as

    heterogeneity and security. It also removes the need to install programs

    remotely; the minimum execution environment is a Java-enabled Web browser.

    Java, with its related technologies and growing repository of tools and utilities,

    is having a huge impact on the growth and development of Grid environments.

    From a relatively slow start, the developments in Grid computing are

    accelerating fast with the advent of these new and emerging technologies. It is

    very hard to ignore the presence of the Common Object Request Broker Architecture (CORBA) in the background. We believe that frameworks

    incorporating CORBA services will be very influential on the design of future

    Grid environments.

    The two other emerging Java technologies for Grid and P2P computing

    are Jini and JXTA. The Jini architecture exemplifies a network-centric

    service-based approach to computer systems. Jini replaces the notions of peripherals,

    devices, and applications with that of network-available services. Jini helps

    break down the conventional view of what a computer is, while including new

    classes of services that work together in a federated architecture. The ability to

    move code from the server to its client is the core difference between the Jini

    environment and other distributed systems, such as CORBA and the Distributed

    Common Object Model (DCOM).

    Whatever the technology or computing infrastructure that becomes

    predominant or most popular, it can be guaranteed that at some stage in the

    future its star will wane. Historically, in the field of computer research and development, this fact can be repeatedly observed. The lesson from this

    observation must therefore be drawn that, in the long term, backing only one

    technology can be an expensive mistake. The framework that provides a Grid

    environment must be adaptable, malleable, and extensible. As technology and

    fashions change it is crucial that Grid environments evolve with them.

    Smarr observes that Grid computing has serious social consequences and

    is going to have as revolutionary an effect as railroads did in the American

    Midwest in the early 19th century. Instead of a 30-40 year lead-time to see its

    effects, however, its impact is going to be much faster. Smarr concludes by

    noting that the effects of Grids are going to change the world so quickly that

    mankind will struggle to react and change in the face of the challenges and

    issues they present. Therefore, at some stage in the future, our computing needs

    will be satisfied in the same pervasive and ubiquitous manner that we use the

    electricity power grid. The analogies with the generation and delivery of

    electricity are hard to ignore, and the implications are enormous. In fact, the

    Grid is analogous to the electricity (power) Grid and the vision is to offer

    (almost) dependable, consistent, pervasive, and inexpensive access to resources

    irrespective of their location for physical existence and their location for access.

    BIBLIOGRAPHY

    1. I. Foster, C. Kesselman, editors. The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco, Calif. (1999).

    2. www.globus.org

    3. http://en.wikipedia.org/