GWD-I (candidate informational GFD)
Jini Working Group

I. Taylor
R. Philp
M. Shields
O. Rana
Cardiff University

B. Schutz
Max Planck Institute for Gravitational Physics

January 2002

[email protected] 1

The Consumer Grid

Copyright © Global Grid Forum (2002). All Rights Reserved.

1. Abstract

We introduce here the Consumer Grid: the individual-based counterpart to the organisation-based computational Grid. We furthermore describe a peer-to-peer software system capable of utilising the vast untapped computational power of Internet-connected computers. Current predictions indicate there will be 490 million Internet users by year-end 2002, and each user in this massive network will have a CPU capability of more than 100 times that of an early-1990s supercomputer. This unleashes a potential massively parallel machine for executing even the most demanding scientific searches and calculations. The potential of such a distributed computing resource has in some ways been demonstrated recently by the SETI@home project, which had used over 650,000 years of CPU time at the time of writing. This problem-specific piece of code is an example of what could become a commonplace usage of this exceptional resource, given a flexible and intuitive subsystem. Triana is a visual-programming environment that implements such a subsystem. It can interactively exploit the Consumer Grid with its ease of use, automatic code writing and intelligent distribution.

GWD-I (candidate informational GFD) 30-Jan-02

Table of Contents

1. Abstract
2. Introduction
3. Core Technologies
   3.1 The Grid and Globus
   3.2 Peer-to-Peer (P2P) Networking
4. Triana
   4.1 The Triana Software Environment
   4.2 Triana to Support a Consumer Grid
   4.3 Triana Availability and Recent Developments
   4.4 Use Cases
   4.5 Availability of Peers?
5. Summary of security considerations
6. Conclusions
7. Author Contact Information
8. Intellectual property statement
9. Full Copyright Notices
10. Acknowledgements
11. References

2. Introduction

The Internet is a fast-growing resource capable of delivering a vast amount of CPU power if connected in an organised way. Recent demographic studies by the Computer Industry Almanac [1] report an expected 490 million Internet users by year-end 2002 and over 765 million by year-end 2005. Current reference solutions for Grid computing [2] have addressed this problem partially by allowing organisations in widespread locations to be connected in a secure way to create computational Grids. However, this does not address what we define here as the Consumer Grid, that is, users who are potentially permanently connected to a network via cable, DSL or similar but who do not belong to an organisation. Such users make up over half of the usable Grid, and this CPU power has yet to be unleashed into both the scientific and business computing arenas. We present a construction tool for developing peer-to-peer (P2P) systems and applications, called Triana, that can graphically build distributed programs that can be easily deployed onto the Consumer Grid. It is important to emphasise at this point, however, that the Consumer Grid is intended to target resources such as DSL/cable and other privately connected individuals. Such resources do not belong to an organisation as such and are therefore not currently targeted by Grid research. The Consumer Grid is therefore a complementary paradigm to existing Grid technologies and not a competitor.

The Consumer Grid will provide the capability to bring Grid technology to the masses, and enable truly distributed computing over public networks. We assume that the user has access to the executable code (in the form of Java classes), which they can execute on their own resources and which can be transferred to the node where the execution is to be performed. P2P computing and Web services ideas have been explored by other authors, such as [3], where ideas related to computing portals and a common software component architecture (CCA) have been outlined. End users who program applications using pre-packaged software are likely to be the most prolific users of Grid infrastructure. The development of interfaces to enable such users to utilise existing software libraries therefore becomes crucial, especially if such interfaces can be accessed across a network. Such users would need to know very little about Grid infrastructure, protocols or services directly. The Web infrastructure (Web servers, client browsers) has provided a useful first step to realising such interfaces, although the complexity of services that can be supported is limited. (This is shown by the lack of semantics that can be captured in an HTML document, necessitating the development of specialised metadata standards based on XML.) Examples of Grid portals currently in use include XCAT [4], Gateway [5], and GridPort [6].

A useful abstraction to adopt in the context of a Consumer Grid is the notion of a “service”. We can consider the availability of all computational resources on the Grid as a type of service. Furthermore, within P2P systems, every entity on the network can be both a service user and a service provider. In this context, a Consumer Grid is composed of a number of peers. Each peer provides a service that is analogous to a Web server, in that it can receive and process requests and return results. For example, a Web browser (client) will make an HTTP request to a Web server, and the Web server, utilising a number of server-side programs, will return a result to the client, which will be rendered according to the Web browser’s capabilities.
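
The dual role described above, in which every peer is both a service user and a service provider, can be sketched in a few lines of Python. This is an illustrative model only: the `Peer` class, its method names and the in-process call that stands in for a network request are assumptions made for this sketch and are not part of Triana.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Peer:
    """Hypothetical peer that can both offer services and use those of others."""
    name: str
    services: Dict[str, Callable] = field(default_factory=dict)

    def provide(self, service_name, handler):
        # Provider role: register a service that other peers may call.
        self.services[service_name] = handler

    def handle(self, service_name, payload):
        # Provider role: receive a request, process it, return the result.
        if service_name not in self.services:
            raise LookupError(f"{self.name} does not offer {service_name!r}")
        return self.services[service_name](payload)

    def request(self, other, service_name, payload):
        # User role: ask another peer to process a payload.
        # A real system would send this over the network.
        return other.handle(service_name, payload)
```

With two peers `a` and `b`, `a` can register a service that `b` calls while `b` simultaneously offers a different service to `a`; neither is fixed in a client or server role.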

In our reference implementation, the Consumer Grid is composed of a number of Triana peers. Triana is a service, which utilises services from other peers to schedule applications across a network. A Triana peer receives requests in the form of Triana scripts and Triana data, processes these on the local computational resources, and returns results to the Triana client or passes them on to another peer. In the same way that an Applet has security on the client side, we provide a similar level of security on the Triana server through the Java Sandbox. The Java Sandbox comprises a number of cooperating system components, ranging from security managers that execute as part of the application, to security measures designed into the Java Virtual Machine (JVM) and the language itself. The sandbox ensures that an untrusted and possibly malicious application cannot gain access to system resources [7]. Furthermore, a Triana server could be implemented as a Servlet and run via a Web service. The Consumer Grid model differs from the Web model in that a Web server is centralised, whereas a Consumer Grid is composed of distributed services with inter-service communication.

3. Core Technologies

3.1 The Grid and Globus

Although initial interest in Grid [8] was primarily within the high performance computing and computational science communities, the recent involvement of additional communities has meant that results of research can now be exploited more widely. Viewing the Grid as an infrastructure to support “Virtual Organisations” is an important step in this direction [9].

In the currently developing climate of computational Grids and the associated tools required to make use of them, one current Grid system is Globus [2,10], which is mediated by the GGF [11]. The Globus project provides a reference implementation, and is developing the fundamental technologies needed to build Grids. The main emphasis of Globus is to securely connect organisations together by offering a single sign-on method for accessing all resources within an organisation’s network. This enables scientists to use resources from other establishments for distributed applications by a subscription mechanism. The Globus system provides a suite of software tools for supporting resource management (via GRAM), support for security (via GSI), support for defining available resources (via RSL and MDS), support for I/O (via GASS), and support for data management (via GASS and GridFTP). Although the data management support in Globus is limited, it can be extended with additional tools such as Storage Resource Broker and object databases such as Objectivity. The use of such software tools enables scientists to use resources from other establishments for distributed applications by a subscription mechanism, without having to own or maintain accounts on these remote resources.

Globus is a mechanism using public and private keys, obtained from a (hopefully trusted) third-party certificate agency (CA), that allows for secure sign-on and data transfer methods to resources. Globus at present provides access to CPU resources primarily on UNIX platforms, although some of the client-side tools do exist for PCs running a non-UNIX operating system such as Windows, an operating system adopted by the bulk of computer users. Previous experience has shown considerable difficulties with the package; the manual alone requires considerable administrative experience to understand. Installations have required multiple build/deployment cycles, particularly with previous versions: the current version at the time of writing is Globus 2.0 (β); previous versions include 1.1.3, which required considerable skill, time and effort to install. In order for Globus to become as popular as, say, the SETI project, the installation procedure would necessarily have to be simplified for the many Windows and UNIX systems that could be made available to a computational Grid. A first step in easing the installation problems would be the adoption of clearer documentation and the usage of standard package installers, e.g. Solaris’s “pkgadd” and the Windows “point and click” method.

Although Globus has a very sophisticated level of authentication, it is very similar in usage to standard telnet, ftp and job submission methods etc. After suitable certificates and keys have been established and an account requested from an administrator of a resource, one has to sign on to the Globus environment and explicitly target a resource for job execution. This normally occurs through the batch job scheduler, although access to an interactive session can be made through the usage of shell scripts and the fork process.

Among the difficulties associated with Globus, and perhaps its biggest drawbacks, are the administration cost and the difficulty of use. Administrators with resources that they are willing to make available have to create accounts explicitly for Globus users. If thousands of users wanted access to a resource it would be a daunting task indeed for any administrator. This functionality would perhaps be best served by the creation of a single Globus account, with an associated Globus shell (similar to csh or tcsh) for the resource and a daemon informing the CA of the resources available. The Globus shell would allow the creation of multiple virtual accounts within it, with suitable authentication from a centrally maintained account database held at the CA. Resource discovery would also be handled by the CA and could be kept up to date by a resource information daemon on the resource. Signing on would then be with the Globus environment via the certification agency, with Globus users then requesting a resource type and not a named resource. Information about the resource request could then determine an actual resource and activate a virtual account within the Globus shell, maintaining a billing record.

Triana, however, does not implement this level of authentication and is platform independent. Triana Services can be run on virtually any hardware and operating system platform. This implicitly means that, as with the SETI project, anybody can make their spare CPU cycles available. It installs easily with a “point-and-click” method to instantiate a service daemon. Triana does not rely on Certification Agencies: once a simple service daemon has been installed on a platform it will allow access to that resource. Program modules are automatically transported and executed on resources enrolled in the Triana environment, effectively using a virtual account. Since Triana is written in Java and makes use of the Java Sandbox, resource file systems are also automatically protected. Triana can seamlessly distribute modules and entire jobs across a network of compute resources, allowing a Triana network to perform High Performance Throughput (HPT) calculations, parameter space searches and parallel computation, and even to behave as a macroscopic pipeline processor where one machine performs one specific task and then pipes data on to another machine.

3.2 Peer-to-Peer (P2P) Networking

At present, the Grid assumes that participating users are trusted (or that they have obtained a certificate from a trusted authority). This is generally a valid assumption, as existing sites offering Grid enabled software are large research centres or institutions and can be held accountable if they refuse access to their resources once they have committed. To make Grid technologies more widely usable, however, such an assumption may not hold. Furthermore, where service providers do not belong to one of these large sites, managing and running services over large and complex servers may not be an option. Hence, if individual scientists were not allowed to offer software libraries that they had implemented, the Grid would be restricted to specialized service providers, and is unlikely to have a large impact on the scientific and business community. We therefore see the need for finding a synergy between the Grid infrastructure as it is being implemented at present, and a more generic infrastructure that can be more easily shared and deployed.

We suggest the use of a P2P network as one mechanism to achieve this, an approach that has already proven useful within the Web community, for example, through the use of various implementations of the Gnutella protocol for distributed search (see section 4.5). A P2P network is a type of network in which each connected device is able to communicate and collaborate as a peer. The concept of P2P communications is not new. In its simplest form it is usually structured as one-to-one through an exchange system. Common telephone connections use this model: currently, a usual telephone communication requires a central exchange or set of central exchanges to make and maintain the connection between dumb terminals. This is not why P2P is regarded as the new paradigm architecture for the Internet. In a P2P architecture all nodes are peers, and each peer may function as router, client, or server, according to the status of the query. A P2P architecture generates its own organization for its nodes. This self-organizing facility is an important difference from the centralized organization of client/server computing. The heavy transfers take place at the edge of the network and therefore, in P2P, congestion is minimized. With lowered congestion and automatic organization, scalability becomes less complicated and is highly facilitated by the architecture.

Such infrastructure has already demonstrated scalability to over 3 million users world-wide (in the case of Napster), although at present such tools target only a particular kind of service (e.g. the availability of MP3 files). Examples of other P2P technology are Jini [12] and JXTA [13].
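
To make concrete the idea that each peer acts as client, server and router for a query, the following is a minimal sketch of a hop-limited query flood in the general style of Gnutella-like distributed search. It is a simplification and not the actual protocol: the function name `flood_search`, the dictionary-based network description and the in-memory traversal are all assumptions made for illustration.

```python
from collections import deque


def flood_search(neighbours, holdings, start, wanted, ttl):
    """Return the set of peers holding `wanted` within `ttl` hops of `start`.

    neighbours: dict mapping a peer name to the peers it is connected to
    holdings:   dict mapping a peer name to the set of items it offers
    """
    hits = set()
    seen = {start}                      # peers that have already seen the query
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if wanted in holdings.get(peer, ()):
            hits.add(peer)              # server role: this peer can answer
        if hops_left == 0:
            continue                    # hop limit reached: do not forward
        for nxt in neighbours.get(peer, ()):
            if nxt not in seen:         # router role: forward the query once
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits
```

The hop limit bounds how far a query spreads, which is one simple way of keeping traffic local to the edge of the network; no central index is consulted at any point.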

4. Triana

4.1 The Triana Software Environment

Triana [14] is a Java-based problem-solving, data-analysis and programming environment developed at Cardiff University in the Department of Physics and Astronomy as a part of the GEO600 gravitational wave project funded by PPARC [15]. It is easy to use, documented in detail and easily portable, and it comes with many built-in functions that can be used to manipulate numeric, signal, image and textual data. There are several hundred units (i.e. programs), and networks of units can be created by graphical connections to construct new and more complex programs.

Figure 1. Triana user interface

Triana is a two-layered application consisting of an object connection language (OCL) API and a graphical user interface (GUI). The Triana GUI uses OCL to connect OCL objects together to construct data-flow networks. These OCL networks can also be written as OCL scripts that have the same functionality as a network created using the Triana GUI. Therefore, any OCL network can be run with or without using the Triana GUI. Figure 1 is a screen shot from a Triana application. Here, we show a simple network that creates a sine wave, contaminates it with Gaussian noise, takes its power spectrum and then uses a unit called AccumStat to average the spectra over successive iterations to remove the noise from the original signal.

In figure 2 we show two outputs, one taken after the first iteration (notice that the signal is buried in the noise) and the other after 20 iterations of the algorithm.
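
The processing chain in this example (sine wave, Gaussian noise, power spectrum, averaging over iterations) can be approximated outside Triana with a short Python sketch. This is not Triana code and does not reproduce the AccumStat unit; the function names, the parameter values and the use of a naive discrete Fourier transform are assumptions chosen to keep the example self-contained.

```python
import cmath
import math
import random


def power_spectrum(samples):
    """Naive DFT power spectrum for bins 0 .. N/2 - 1 (adequate for small N)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def noisy_sine(n, signal_bin, noise_sigma, rng):
    """Unit-amplitude sine with `signal_bin` cycles per window, plus Gaussian noise."""
    return [math.sin(2 * math.pi * signal_bin * t / n) + rng.gauss(0.0, noise_sigma)
            for t in range(n)]


def averaged_spectrum(iterations, n=256, signal_bin=20, noise_sigma=3.0, seed=1):
    """Average the power spectra of `iterations` independent noisy realisations."""
    rng = random.Random(seed)
    total = [0.0] * (n // 2)
    for _ in range(iterations):
        spectrum = power_spectrum(noisy_sine(n, signal_bin, noise_sigma, rng))
        total = [a + b for a, b in zip(total, spectrum)]
    return [value / iterations for value in total]
```

Strictly, averaging does not remove the noise power; it reduces the fluctuations of the noise floor from one frequency bin to the next, so that the spectral peak of the sine wave stands out more clearly as the number of iterations grows.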

Figure 2. Triana graph output component.

Each Triana invocation includes executing a client and server component: a user may connect to a Triana service using a command line or GUI interface. The client can distribute code to many servers, depending on where execution is required.

4.2 Triana implementation of a Consumer Grid

One recent important development in the Triana implementation of a Consumer Grid is the disconnection of the user interface from the OCL engine. Communication from the user interface is via a defined API to the OCL engine that can be accessed by other views of the Triana network. For example, our reference graphical user interface may well be the one chosen by software developers working on a desktop or laptop computer, but we may want to provide a different view to those using WAP mobile phones or PDA devices. Furthermore, we would certainly want users to be able to view the progress of their running network via the Internet using a standard off-the-shelf web browser. Another recent change is in the way we define the Triana units, in that they are being standardised with definitions in XML using ideas from the Common Component Architecture (CCA) [16].

Figure 3. A schematic representation of the Triana implementation.

As seen in figure 3, there are two distinct components in the Triana implementation: the Triana Service (TS) and the Triana Controller (TC). The Triana controller is a user interface to Triana service daemons. The Triana controller can be based either on a command line or a GUI user interface, and the Triana service daemon may be either local or remote. The Triana controller was previously part of the Triana service but has recently been de-coupled. The Triana controller provides access to a network of computers running Triana service daemons via a gateway Triana service daemon and allows the user to describe the kind of functionality required of Triana: the module deployment model and data stream pathways. A single Triana controller can control multiple Triana networks deployed over multiple CPU resources. The Triana Service daemons have multiple functions and are easily installed by a “point and click” method. Since Triana is implemented in Java, the Triana Controller and Triana Services can be installed on almost any platform. The Triana Service comprises three components: a client, a server and a command process server. Typically in a Triana network only one Triana service’s command process server will be active and in communication with the Triana controller. The client of the Triana service in contact with the Triana controller then pipes modules, programs and data to the other required Triana service daemons. These Triana services will simply act in server mode to execute the byte code and pipe data to others according to the prescription given by the Triana controller.

Figure 4. The actual implementation of the Triana model.

Figure 4 shows the implementation of the GUI version of the Triana Controller. It also shows the palette upon which modules may be placed and how data flow is achieved by linking modules together: in this particular case a simple distributed pipelined linear network has been created. The figure also shows how a Triana Service provides communication to the Triana Controller and how it passes module information on to the server side of other Triana Services running on remote machines.

Triana currently has the ability to distribute any of its units or any combination of collections of its units amongst a set of distributed computers. This is currently being updated to incorporate Jini and JXTA architectures for locating available peers through resource discovery. The current communication scheme used is RMI (remote method invocation). This architecture easily fits in with the current P2P Java technologies. We expect to complete this implementation by the first release of the software, which will be announced in the first quarter of 2002. The software can be downloaded from the web site [14] and results pertaining to the P2P technology will be available via the GridOneD project’s website [17].

Figure 5. Distributed Triana model with Triana units as JXTA peers.

We can look at P2P networking with respect to Triana and distributing Triana task graphs at varying levels of granularity. At the lowest level, each Triana Unit (or Group Unit) is itself a peer communicating with other Units/peers via some mechanism; using JXTA as an implementation, this would be a JXTA pipe. The units or peers would organise in peer groups, with communication to the Triana client peer, and collaboration and communication with other peer groups of units, being controlled by a Triana Server peer (see figure 5). This model may have implementation overheads due to the level of granularity: each unit communicates via the underlying P2P communication mechanism, which could be implemented on top of many different underlying layers.

Figure 6. Distributed Triana model with Triana Server as JXTA peers.

A different choice for implementation would be to have a Triana Server peer. This peer would be capable of running a Triana task graph or a sub-section of a task graph. This peer would also act as a rendezvous or relay peer, able to forward messages, other sub-sections of a task graph or control instructions on to other Triana Server peers. Communication is coarser grained than in the previous model: communication between units within a server is internal to that server. Triana servers discover each other via advertisements and communicate via pipes (see figure 6).

4.3 Triana Availability and Recent Developments

In August 2001 the core Triana development team decided, in light of the applicability of Triana to Grid applications and the Grid’s nature of common open-source components, that the software should be made fully open source within a community based distribution scheme. Both the OCL API and the Triana GUI are therefore scheduled for open source release in the early part of 2002, once extensive documentation and code modifications have been made to make them suitable for such a paradigm.

In addition to the P2P implementation, Triana will be extended to implement its distribution of units amongst virtual organisations by using the GridLab GAT (Grid Application Toolkit) API [18], which defines a high level API to the Grid via currently available toolkits, e.g. Globus.

4.4 Triana Application Scenarios

We assume that in the context of a Consumer Grid some service providers will continue to offer the same service during their lifetime, whilst others may change over time. We provide usage scenarios to demonstrate how services may be utilised:

Case 1: Database access

A Triana user constructs an application to access and manipulate a database. In order to do so, the user establishes a pipeline in Triana consisting of: (1) a data access service, (2) a data manipulation service, (3) a data visualisation service, and (4) a data verification service. After the pipeline has been composed, the user provides preferences for the kinds of services that should be utilised in each of these cases. It is now up to the Triana system to utilise its service discovery capability to find the services that match the constraints identified by the user in each case. The user must also identify the flexibility in the constraints (i.e. whether the constraints should be matched exactly, or whether a “soft” match is required). A part of these initial constraints is a time limit that the user imposes on Triana’s search for service providers. When this time limit expires, the best matches found so far are reported. As Triana usage improves, previous matches are maintained in a Triana Cache (working memory), which is searched first.

The Triana system looks on the network to discover peers which offer each of these services in turn. The pipeline is instantiated with peer references as new services are discovered on the network. The user is now given the option to select between multiple services which have been discovered on the network. For instance, if Triana finds multiple components to manipulate the data, the user may be asked to select a service based on other options that a given service provides (such as accuracy, numerical and textual data manipulation capability etc.). Once a service has been selected, and the Triana system has undertaken a service-bind to each of the stages in the pipeline, Triana initiates the execution procedure.
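
The discovery step in Case 1 (constraints that are matched exactly or softly, a time limit after which the best matches so far are reported, and a cache that is searched first) can be sketched as follows. This is a hypothetical illustration rather than Triana's discovery mechanism: the advertisement format (plain dictionaries), the scoring rule and the names `score` and `find_services` are assumptions.

```python
import itertools
import time


def score(advert, constraints):
    """Count how many of the user's constraints an advertisement satisfies."""
    return sum(1 for key, wanted in constraints.items() if advert.get(key) == wanted)


def find_services(adverts, constraints, soft=False, time_limit=1.0, cache=None):
    """Return matching adverts, best first, searching `cache` before `adverts`."""
    if cache is None:
        cache = []
    deadline = time.monotonic() + time_limit
    scored = []
    # Snapshot the cache so it is searched first and can be updated safely later.
    for advert in itertools.chain(list(cache), adverts):
        if time.monotonic() > deadline:
            break  # time limit expired: report the best matches found so far
        points = score(advert, constraints)
        acceptable = points == len(constraints) or (soft and points > 0)
        if acceptable and all(advert != seen for _, seen in scored):
            scored.append((points, advert))
    # The sort is stable, so equally good matches keep their discovery order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    matches = [advert for _, advert in scored]
    for advert in matches:
        if advert not in cache:
            cache.append(advert)  # remember matches for later searches
    return matches
```

Here `adverts` may be any iterable, including a generator that yields advertisements as they arrive from the network, so the deadline is checked between arrivals. Presenting the returned list to the user corresponds to the selection step described above.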

Case 2: Inspiral Search for Coalescing Binaries

Compact binary stars orbiting each other in a close orbit are among the most powerful sources of gravitational waves. The gravitational waves emitted in the process have twice the frequency of the binary and carry away its energy. This results in the gradual shrinking of the orbit. As the orbital radius decreases, a characteristic chirp waveform is produced whose amplitude and frequency increase with time until eventually the two bodies merge together. Laser interferometric detectors, such as GEO600, should be able to detect the waves from the last few minutes before the collision [19].

For this search, we need to have a computing resource capable of speeds in the range of 5 to 10 Gigaflops to keep up in real time with an on-line search. For example, within GEO600, the gravitational wave signal is sampled at 8 kHz in 24-bit resolution (stored in 4 bytes). However, the searchable frequency range is below 1 kHz and therefore a realistic sampled representation of the signal contains 2,000 samples per second. The real-time data set is divided into chunks of 15 minutes in duration (i.e. 900 seconds), which results in 7.2 MB of data (4 x 900 x 2000 bytes) being processed at a time. This data is transmitted to a node, where it is processed. The node initialises, i.e. generates its templates (a trivial computational step), and then performs fast correlation on the data set with each template in a library of between 5,000 and 10,000 templates. This process takes about 5 hours on a 2 GHz PC running a C program. Therefore, 20 PCs would need to be employed full-time to keep up with the data.
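
The figures in this paragraph follow from simple arithmetic, reproduced here as a small Python check. All of the numbers are taken from the text above; the constant and function names are chosen only for this illustration.

```python
BYTES_PER_SAMPLE = 4             # 24-bit samples stored in 4 bytes
SAMPLES_PER_SECOND = 2_000       # enough to represent frequencies below 1 kHz
CHUNK_SECONDS = 15 * 60          # one chunk covers 15 minutes = 900 seconds
HOURS_PER_CHUNK_ON_ONE_PC = 5    # quoted processing time on a 2 GHz PC


def chunk_size_bytes():
    """Size of one 15-minute chunk of data."""
    return BYTES_PER_SAMPLE * SAMPLES_PER_SECOND * CHUNK_SECONDS


def pcs_needed_for_real_time():
    """PCs needed so that chunks are finished as fast as new ones arrive."""
    processing_seconds = HOURS_PER_CHUNK_ON_ONE_PC * 3600
    # Ceiling division: a fraction of a PC still requires a whole machine.
    return -(-processing_seconds // CHUNK_SECONDS)
```

A new chunk arrives every 900 seconds but takes 18,000 seconds to process, so 18,000 / 900 = 20 chunks are in progress at any moment, one per PC.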

Within a Consumer Grid scenario the number of PCs would need to be increased to allow for various types of downtime, e.g. the connection is lost, the user intervenes, or the expected computational bandwidth is not reached. However, since it is a massively parallel problem, we believe it can be solved within such an environment by simply distributing the code to as many computers as are available until the results are being returned within the specified time interval. The latency of such a system is not important and it can lag behind by several hours if necessary. We could employ a check-pointing mechanism to migrate work if necessary. It may turn out that the Consumer Grid requires 10 times as many nodes as a dedicated cluster, but it could still potentially keep up without the need to invest in a massively parallel machine to do the job.
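
One simple way to tolerate the kinds of downtime listed above is to hand a data chunk to another peer whenever the peer it was given to fails to return a result. The following sketch illustrates that idea only; it is not Triana's scheduler and does not model check-pointing. The `PeerUnavailable` exception, the round-robin assignment and the representation of peers as plain callables are assumptions.

```python
from collections import deque


class PeerUnavailable(Exception):
    """Raised by a peer that cannot return a result (e.g. it went offline)."""


def dispatch(chunks, peers, max_attempts=5):
    """Process every chunk, re-queuing a chunk whenever its peer fails."""
    pending = deque((chunk, 0) for chunk in chunks)
    results = {}
    turn = 0
    while pending:
        chunk, attempts = pending.popleft()
        if attempts >= max_attempts:
            raise RuntimeError(f"chunk {chunk!r} failed {attempts} times")
        peer = peers[turn % len(peers)]   # simple round-robin choice of peer
        turn += 1
        try:
            results[chunk] = peer(chunk)
        except PeerUnavailable:
            # Put the chunk back so that a later turn can try another peer.
            pending.append((chunk, attempts + 1))
    return results
```

Because each chunk is independent, a failed chunk costs only the time already spent on it, which is why latency matters little here as long as overall throughput keeps pace with the incoming data.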

4.5 Availability of Peers?

An obvious question at this point is “Why would users make their CPU available to others?” Our belief is that users would altruistically make their computer’s CPU and RAM available if they trusted the software, if they thought its use was a worthy cause and if they could control when their resource was made available. An obvious solution is to take the approach that Condor [20] and SETI [21] take and make users’ CPUs available when their workstations are idle, for example when the screensaver turns on. Users would also have the option to specify how much RAM the applications could use and publish this on the network. Users would then run the software in the same way in which Napster or Gnutella users run theirs, but instead of sharing MP3 files or videos they would be sharing their computational power with others who could make use of it. Current examples of sharing resources are:

SETI@home [21] is a scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI). You can participate by running a free program that downloads and analyzes radio telescope data. With 3,154,517 users taking part there has been a total CPU time of 668,852.233 years (as of 19th July 2001) and this figure is growing on a daily basis.

Gnutella [22] is an open, decentralized, P2P search protocol that is mainly used to find files. By April 2000, one Gnutella implementation called FastTrack [23] matched the popularity reached by Napster.

Napster [24], the controversial music file-swapping application, had reported 6.7 million users by August 2000. In February 2001, Jupiter Media Metrix [25], a market intelligence firm, reported that Napster was used by 14.3 percent of online users at home in thirteen leading wired countries. Note that Napster is not a true P2P system, since peers are located through a central database at Napster's web site.
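The availability policy described at the start of this section (donate cycles only while the workstation is idle, and cap the RAM that guest applications may use) could be expressed as a small data structure. This is a hedged sketch in Python rather than Triana's actual mechanism: the class `DonationPolicy`, its fields, and the advertisement format are all hypothetical names introduced for illustration, and how idle time is measured is left to the host system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonationPolicy:
    """Owner-controlled rules for lending a workstation (hypothetical)."""

    idle_seconds_required: float  # idle time before the CPU is offered
    max_ram_mb: int               # RAM cap the owner is willing to lend

    def accepts(self, idle_seconds: float, job_ram_mb: int) -> bool:
        """Return True if a job may run now under the owner's rules."""
        is_idle = idle_seconds >= self.idle_seconds_required
        fits_in_ram = job_ram_mb <= self.max_ram_mb
        return is_idle and fits_in_ram

    def advertisement(self) -> dict:
        """The part of the policy a peer would publish on the network."""
        return {"max_ram_mb": self.max_ram_mb}


# Example values are assumptions: offer the CPU after five idle
# minutes and lend at most 256 MB of RAM.
policy = DonationPolicy(idle_seconds_required=300, max_ram_mb=256)
print(policy.accepts(idle_seconds=600, job_ram_mb=128))
print(policy.advertisement())
```

Keeping the decision on the owner's machine matches the point made above that users must retain control over when their resource is made available.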

5. Summary of security considerations

Naturally the security of a borrowed resource is of paramount importance to the owner of that resource. In the case of Triana, which is written in Java, security is provided via the Java Sandbox, as discussed in the introduction and Section 3.1.

6. Conclusions

This paper describes the Triana software for developing applications using Grid and P2P infrastructure. Triana supports the creation of applications by migrating code to a resource where execution is desired. This is achieved by discovering peers that offer a particular computational capability and then downloading the code to these peers for execution. The Consumer Grid idea is centred on the notion that individuals or organisations may wish to contribute computational resources, with resources possessing different computational capabilities and access patterns. Two ideas from P2P systems are utilised in this system: (1) file-sharing ideas from systems such as Napster and Gnutella, whereby executable and data files are transferred to the point of execution, with peer naming, grouping, and advertising achieved using JXTA; and (2) the sharing of (unused) computational cycles, managed by the resource manager, as in systems such as SETI@home, Entropia [26] and Parabon [27]. Triana provides a user interaction facility that makes the existence of such an infrastructure transparent to the user. We believe that the success of P2P systems (such as Napster) and the utilisation of Web servers to support both file sharing and CPU sharing will be important in supporting wider use of Grid technologies and applications.

7. Author Contact Information

Dr Ian Taylor
[email protected]

Dr Roger Philp
[email protected]

Mr Mathew Shields
[email protected]

Dept of Physics and Astronomy
Cardiff University
PO Box 913, Cardiff
UK

Dr Omer Rana
Dept of Computer Science
Cardiff University
PO Box 916
Cardiff, [email protected]

Prof. Bernard Schutz
Max Planck Institute for Gravitational Physics
Am Muehlenberg 1
D-14476 Golm bei [email protected]

8. Intellectual Property Statement

The GGF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the GGF Secretariat. The GGF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this recommendation. Please address the information to the GGF Executive Director.

9. Full Copyright Notice

Copyright (C) Global Grid Forum (30-Jan-02). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the GGF or other organizations, except as needed for the purpose of developing Grid Recommendations in which case the procedures for copyrights defined in the GGF Document process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the GGF or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE GLOBAL GRID FORUM DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

10. Acknowledgements

Thanks to Dr B. S. Sathyaprakash for providing the technical specifications concerning the sampling rates, templates and execution estimations for the binary-star user scenario.

11. References

1 Computing Industry Almanac : http://www.commerce.net/research/stats/wwstats.html
2 The Globus Project Home Page : http://www.globus.org/
3 Programming the Grid: Distributed Software Components, P2P and Grid Web Services for Scientific Applications. Dennis Gannon, Randall Bramley, Geoffrey Fox, Shava Smallen, Al Rossi, Rachana Ananthakrishnan et al. Department of Computer Science, Indiana University, USA.
4 The XCAT Science Portal, S. Krishnan, R. Bramley, D. Gannon, M. Govindaraju, R. Indurkar, A. Slominski. Proceedings of SuperComputing, Denver, November 2001.
5 Gateway Computation Portal, Geoffrey Fox et al. : http://www.computingportals.org/CPdoc/Gateway_CP.doc
6 NPACI GridPort Toolkit and HotPage – UCSD User Portal, Mary Thomas et al. : http://www.computingportals.org/CPdoc/HotPage.doc

7 Java Sandbox : http://java.sun.com/marketing/collateral/security.html
8 Foster, I. and Kesselman, C. (eds) The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1998.
9 Grid Information Services for Distributed Resource Sharing. K. Czajkowski, S. Fitzgerald, I. Foster, C. Kesselman, Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10), IEEE Press, August 2001.
10 Foster, I. and Kesselman, C. The Globus Project: A Status Report. In Proc. Heterogeneous Computing Workshop. IEEE Press, 1998, 4-18.
11 The Global Grid Forum Home Page : http://www.gridforum.org
12 The community Resource for Jini ™ Technology : http://www.jini.org/
13 Project JXTA : http://www.jxta.org/
14 The Triana Software Environment : http://www.triana.co.uk and http://www.trianacode.org
15 The GEO 600 Home Page : http://www.geo600uni-hannover.de
16 The Common Component Architecture : http://www.acl.lanl.gov/cca/
17 GridOneD : http://www.gridoned.org/
18 GridLab : http://www.gridlab.org
19 Inspiral Binary Search : http://www.astro.cf.ac.uk/groups/relativity/research/part12.html
20 The Condor Home Page : http://www.cs.wisc.edu/condor/
21 SETI : http://setiathome.ssl.berkeley.edu/
22 Gnutella : http://gnutella.wego.com/
23 FastTrack : http://www.fasttrack.nu/index_int.html
24 Napster : http://www.napster.com
25 Jupiter Media Metrix : http://www.jmm.com/
26 Entropia : http://www.entropia.com/
27 Parabon : http://www.parabon.com/