Page 1: APAC National Grid - A technology response to diverse requirements

APAC National Grid - A technology response to diverse requirements

Ian Atkinson
James Cook University (www.jcu.edu.au) and Queensland Cyberinfrastructure Foundation (www.qcif.edu.au)

Lindsay Hood
Australian Partnership for Advanced Computing (www.apac.edu.au)

Page 2

Australian Partnership for Advanced Computing

Partners:
• Australian Centre for Advanced Computing and Communications (ac3) in NSW
• CSIRO
• iVEC, The Hub of Advanced Computing in Western Australia
• Queensland Cyber Infrastructure Foundation (QCIF)
• South Australian Partnership for Advanced Computing (SAPAC)
• The Australian National University (ANU)
• The University of Tasmania (TPAC)
• Victorian Partnership for Advanced Computing (VPAC)

4500 CPUs, 3PB storage

“providing national advanced computing, data management and grid services for eResearch”


Page 4

Recent Review

APAC in the future must be regarded not just as the National Facility, but as the sum of its component parts comprising:

[…]

The National Grid

[…]

That the APAC National Grid must be the pre-eminent grid in Australia and continue extending its coverage to include capabilities wherever they exist or develop. It must also nurture and support scientific research teams, NCRIS infrastructure and international partnerships

Page 5

Concept of the APAC National Grid

APAC National Grid: a virtual system of computing, data storage and visualisation facilities

[Diagram: the APAC National Grid linking Research Teams, Data Centres, Instruments, Sensor Networks and Other Grids (Institutional, International)]

Page 6

NCRIS - National Collaborative Research Infrastructure Strategy

• National plan to invest AU$500M in medium-scale, collaborative-access research infrastructure across 5 years, 2007-2011
• 15 investment areas of interest including:
  – bioinformatics, biosecurity, geosciences, astronomy, marine and terrestrial observation systems, structural characterization
• APAC will now be funded via NCRIS
  – APAC and the National Grid must directly support the NCRIS investment areas as a high priority
  – NCRIS investments are expected to develop and execute plans to ensure e-Research (cyberinfrastructure) tools and practices are embedded into their practices and data management

Data management is now a hot topic in Australia!

Page 7

NCRIS Platforms for Collaboration Vision

[Figure not available in transcript]

Page 8

APAC National Grid Core Services – built on Globus

[Diagram labels]
Sites: QCIF, QCIF (JCU), ANU, VPAC, ac3, TPAC, CSIRO, iVEC, SAPAC, APAC National Facility
Network: AARNet; APAC Private Network? (AARNet)
Security: APAC CA (PKI), Grix, MyProxy, VOMRS
Portal Tools: GridSphere
Info Services: MDS 2/4, MIP
Systems: Gateways, Partners’ systems

<15 Staff to deliver all services!

Page 9

Some requirements

• Non-dedicated resources (at partner sites)
• Varied middleware requirements (many domains to support)
• Complex virtual organisation structure
• Distributed data, workflows
• Simplified interface

This turns out to be hard!

Page 10

Requirements Analysis circa mid-2006

[Figure not available in transcript]

Page 11

Gateways

• Rapid churn in the middleware
  – You need lots of test machines
• Different communities want different middleware
  – GT2, GT4, GridSphere, SRB …
• Minimises interaction with non-grid-dedicated production systems
• Virtualisation is well-understood technology
• Xen has a nice price

Page 12

The gateway concept

• Grid middleware evolving
• Security across firewalls and institutional policies are problems
• Using gateway virtual machines to isolate production compute/storage elements from all this change
  – ng1 - Globus2
  – ng2 - Globus4
  – ngdata - gridFTP, other data (SRB)
  – ngportal - web application portals
  – Others are easy to build and deploy
• But some parts of GT2 especially assume they are running on the cluster mgmt/head node ...

Page 13

Gateways on a National Backbone

[Diagram: a Gateway Server at each site fronting Clusters, Datastores and HPC systems, with the sites joined by a Layer 3 Private Network]

Installed Gateway servers at all grid sites, using VM technology to support multiple grid stacks

Gateways supporting GT2, GT4, LCG, grid portals, and experimental grid stacks

High bandwidth, dedicated, secure private network between grid sites

consistency, ease of implementation, performance?

Page 14

Virtual Organisations

• International use tends to be large VOs
• Australia demands small, dynamic VOs
• VOMS/VOMRS has problems
  – Admin security model
  – myproxy interaction
  – gridmapfile – a user can be in only one VO
• Adopting PRIMA/GUMS
  – Still complicated and not especially dynamic
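
The gridmapfile limitation noted above can be illustrated with a small sketch. It assumes the common grid-mapfile layout of a quoted certificate DN followed by a local account name, and it assumes each VO is represented by its own local account. The DNs, the account names and the first-match rule are hypothetical choices for illustration, not a description of how any particular gatekeeper resolves duplicates.

```python
import shlex

def parse_gridmap(text):
    """Parse grid-mapfile style lines of the form: "<certificate DN>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex handles the quoted DN, which usually contains spaces.
        dn, account = shlex.split(line)[:2]
        # Keep the first account seen for a DN; later lines for the same DN
        # are ignored, so one DN ends up with exactly one account (one VO).
        mapping.setdefault(dn, account)
    return mapping

# Hypothetical entries: Jane is listed for two VO accounts.
SAMPLE = '''
# DN                                                  local account
"/C=AU/O=ExampleGrid/OU=Example Uni/CN=Jane Citizen" vo_chem
"/C=AU/O=ExampleGrid/OU=Example Uni/CN=Jane Citizen" vo_marine
"/C=AU/O=ExampleGrid/OU=Example Uni/CN=John Smith" vo_marine
'''

accounts = parse_gridmap(SAMPLE)
for dn, account in accounts.items():
    print(f"{dn} -> {account}")
```

Under these assumptions Jane's second membership never takes effect, which is the kind of restriction that motivates attribute-based approaches such as PRIMA/GUMS.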

Page 15

“Workflows”

• Many existing HPC users have significant shell scripts and queue commands (PBS)
• WSGRAM, JSDL, BPEL may be human readable, but not human writable!
• Abstraction of HPC systems is tough
  – E.g. SGI’s profile.pl doesn’t handle cpusets correctly
• Working on a gsub that will take the majority of batch scripts and run them on the grid
  – User doesn’t have to learn JSDL, WSGRAM …
• Unicore-like client GUI would be neat
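
As a rough idea of what the first step of a gsub-style tool might involve, the sketch below reads the `#PBS` directives out of an existing batch script and collects them in a plain dictionary that a later stage could translate into a grid job description. This is not the APAC gsub implementation; the function name, the dictionary layout and the sample script are all hypothetical, and only the `-N`, `-q` and `-l` directives are handled.

```python
import re

def parse_pbs_directives(script):
    """Collect '#PBS' directives from a batch script into a plain dict."""
    job = {"name": None, "queue": None, "resources": {}}
    for line in script.splitlines():
        m = re.match(r"#PBS\s+-(\w)\s+(\S+)", line.strip())
        if not m:
            continue  # ordinary shell line or comment
        flag, value = m.groups()
        if flag == "N":
            job["name"] = value
        elif flag == "q":
            job["queue"] = value
        elif flag == "l":
            # -l takes comma-separated key=value pairs, e.g. walltime=01:00:00
            for item in value.split(","):
                key, _, val = item.partition("=")
                job["resources"][key] = val
    return job

# Hypothetical user script of the kind many HPC users already have.
SCRIPT = """#!/bin/bash
#PBS -N gnuplot_run
#PBS -q normal
#PBS -l walltime=01:00:00,ncpus=1
#PBS -l mem=2gb
gnuplot control.txt
"""

job = parse_pbs_directives(SCRIPT)
print(job)
```

The point of the design is that the user keeps writing the script they already know, and the translation to JSDL or WSGRAM happens behind the scenes.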

Page 16

JSDL (1)

<?xml version="1.0" encoding="UTF-8"?>
<jsdl:JobDefinition xmlns="http://www.example.org/"
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <jsdl:JobDescription>
    <jsdl:JobIdentification>
      <jsdl:JobName>My Gnuplot invocation</jsdl:JobName>
      <jsdl:Description>
        Simple application invocation: User wants to run the application
        'gnuplot' to produce a plotted graphical file based on some data
        shipped in from elsewhere (perhaps as part of a workflow). A front-end
        application will then build into an animation of spinning data.
        Front-end application knows URL for data file which must be staged-in.
        Front-end application wants to stage in a control file that it
        specifies directly which directs gnuplot to produce the output files.
        In case of error, messages should be produced on stderr (also to be
        staged on completion) and no images are to be transferred.
      </jsdl:Description>

Page 17

JSDL (2)

    </jsdl:JobIdentification>
    <jsdl:Application>
      <jsdl:ApplicationName>gnuplot</jsdl:ApplicationName>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>
          /usr/local/bin/gnuplot
        </jsdl-posix:Executable>
        <jsdl-posix:Argument>control.txt</jsdl-posix:Argument>
        <jsdl-posix:Input>input.dat</jsdl-posix:Input>
        <jsdl-posix:Output>output1.png</jsdl-posix:Output>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
    <jsdl:Resources>
      <jsdl:IndividualPhysicalMemory>
        <jsdl:LowerBoundedRange>2097152.0</jsdl:LowerBoundedRange>
      </jsdl:IndividualPhysicalMemory>
      <jsdl:TotalCPUCount>
        <jsdl:Exact>1.0</jsdl:Exact>
      </jsdl:TotalCPUCount>
    </jsdl:Resources>

Page 18

JSDL (3)

    <jsdl:DataStaging>
      <jsdl:FileName>control.txt</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://foo.bar.com/~me/control.txt</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://foo.bar.com/~me/input.dat</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>output1.png</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Target>
        <jsdl:URI>rsync://spoolmachine/userdir</jsdl:URI>
      </jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
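
To show how little of this XML a tool actually needs to read, here is a minimal sketch that pulls the stage-in and stage-out pairs out of a JSDL document using Python's standard `xml.etree.ElementTree`. The embedded document is a trimmed copy of the DataStaging elements above; the function name `staging_plan` and the tuple layout are assumptions made for the example, not part of JSDL or of any grid middleware.

```python
import xml.etree.ElementTree as ET

# JSDL namespace as used in the example document.
NS = {"jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl"}

# Trimmed copy of the DataStaging section shown on the slides.
DOC = """<jsdl:JobDefinition xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl">
  <jsdl:JobDescription>
    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:Source><jsdl:URI>http://foo.bar.com/~me/input.dat</jsdl:URI></jsdl:Source>
    </jsdl:DataStaging>
    <jsdl:DataStaging>
      <jsdl:FileName>output1.png</jsdl:FileName>
      <jsdl:Target><jsdl:URI>rsync://spoolmachine/userdir</jsdl:URI></jsdl:Target>
    </jsdl:DataStaging>
  </jsdl:JobDescription>
</jsdl:JobDefinition>"""

def staging_plan(xml_text):
    """Return (stage_in, stage_out) lists of (from, to) pairs."""
    root = ET.fromstring(xml_text)
    stage_in, stage_out = [], []
    for ds in root.iterfind(".//jsdl:DataStaging", NS):
        name = ds.findtext("jsdl:FileName", namespaces=NS)
        src = ds.findtext("jsdl:Source/jsdl:URI", namespaces=NS)
        dst = ds.findtext("jsdl:Target/jsdl:URI", namespaces=NS)
        if src:
            stage_in.append((src, name))   # remote URI -> local file
        if dst:
            stage_out.append((name, dst))  # local file -> remote URI
    return stage_in, stage_out

stage_in, stage_out = staging_plan(DOC)
print("stage in: ", stage_in)
print("stage out:", stage_out)
```

Roughly thirty lines of XML reduce to two short lists, which is the readable-but-not-writable gap that the gsub work is meant to close.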

Page 19

Portals

• Web browser is the interface to everything…
• Present simple interface to the underlying resource
• Good model for many users and applications
• GridSphere adopted as the web grid standard
  – GT4 based ngportal VM
• Java CoG Kit for standalone “portal” apps
  – Chemistry Java application with molecular editor being developed
  – Desktop job submission tool from iVEC

http://www.grid.apac.edu.au/Services/ProductionPortals

Page 20

Data services

• Data is hard
  – Different communities have different needs
  – Complex access controls
• We have gridftp between sites
  – Network consistency is interesting …
  – Data staging has to
• SRB today; iRODS later
• dCache, SRM, Gfarm as communities require
• Credible use cases for a global file system?

Page 21

Registry services

• MDS2, MDS4 running
• About to deploy Modular Information Provider to present site and aggregated information more easily
• Using GLUE schema, but it’s far from satisfactory for describing real-world production HPC resources

Page 22

Improved AAA Services

• NCRIS will require e-research services for a much wider community than traditional HPC
  – PKI doesn’t scale and is conceptually difficult for non-IT focused users
• Australian Access Federation funded (2007-)
• IAM Suite from MELCOE
  – Shibboleth authentication plus appropriate attributes generates short lived certificate (www.identiy
  – Tools for users to easily create shared workspaces and manage attribute release
• Only a few people will need real certificates
• But probably a year away before being ready for prime time

Page 23

IAM Suite

[Diagram: IAM Suite architecture. Recoverable labels: GridSphere, Federation SP, Group Module, VO-IdP, VO-WAYF, AuthN IM, AuthZ Mgnr, VO-SP, Fedora (internal or external, e.g. IR), Fedora Web, Forum, Wiki, LMS, Calendar, Presence, PeoplePicker, ShARPE, Autograph, MyProxy, AFS adaptor, Federation, Search, GTK (Storage, Cluster, Equipm., specific tools). Flow labels: Login via IdP, Receive assertions, Receive proxy cert.]

Macquarie University’s E-Learning Centre of Excellence (MELCOE), Erik Vullings

www.federation.org.au

Page 24

APAC National Grid Status

• Essentially operational
  – core services implemented
    • APAC CA and myproxy, VOMRS, GT2, GT4, GridSphere, SRB
  – some applications close to ‘production’ mode
  – See http://goc.grid.apac.edu.au; http://www.grid.apac.edu.au
• Systems coverage
  – users can access ALL systems at APAC partners
    • via gateways
    • access from the desktop is needed
  – about 4600 processors and 100s of Tbytes of disk
  – around 3 Pbytes of disk-cached HSM systems

Page 25

Future Strategies

• Expand the user base
  – NCRIS, Merit Allocation Scheme, Partners
  – Open access to core grid services
• Expand the services
  – Workflow engines and tools – Kepler, Taverna
  – Data management: metadata support
• Expand the facilities
  – Include major data centres
    • data from instruments, government agencies
  – Include institutional systems and repositories
• Resulting changes:
  – Policies: acceptable service provision
  – Organisation: coordinated user support
  – Architecture: scaling gateways
  – Technologies: Attribute-based authorisation

Page 26

Changing User Base

• National Collaborative Research Infrastructure Strategy
  – Ambitious plan to hand out $0.5B of federal money to fund research infrastructure collaboratively
    • Evolving Biomolecular Platforms and Informatics $50.0M
    • Integrated Biological Systems $40.0M
    • Characterisation $47.7M
    • Fabrication $41.0M
    • Biotechnology Products $35.0M
    • Networked Biosecurity Framework $25.0M
    • Optical and Radio Astronomy $45.0M
    • Integrated Marine Observing System $55.2M
    • Structure and Evolution of the Australian Continent $42.8M
    • Terrestrial Ecosystem Research Network $20.0M
    • Population Health and Clinical Data Linkage $20.0M
    • Platforms for Collaboration $75.0M
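
As a quick arithmetic check, the twelve allocations listed on this slide add up to about $496.7M, consistent with the quoted $0.5B. The short script below reproduces the sum and the share going to Platforms for Collaboration; the figures are copied from the slide and the variable names are only illustrative.

```python
# NCRIS allocations in millions of dollars, copied from the slide.
ALLOCATIONS = {
    "Evolving Biomolecular Platforms and Informatics": 50.0,
    "Integrated Biological Systems": 40.0,
    "Characterisation": 47.7,
    "Fabrication": 41.0,
    "Biotechnology Products": 35.0,
    "Networked Biosecurity Framework": 25.0,
    "Optical and Radio Astronomy": 45.0,
    "Integrated Marine Observing System": 55.2,
    "Structure and Evolution of the Australian Continent": 42.8,
    "Terrestrial Ecosystem Research Network": 20.0,
    "Population Health and Clinical Data Linkage": 20.0,
    "Platforms for Collaboration": 75.0,
}

total = sum(ALLOCATIONS.values())
share = ALLOCATIONS["Platforms for Collaboration"] / total

print(f"Listed total: ${total:.1f}M")
print(f"Platforms for Collaboration share: {share:.1%}")
```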

Page 27

Summer in Australia?

Page 28

New Names and Structures

National Compute Infrastructure (NCI)
• APAC Nat. Fac. >1600cpu Altix
• Shoulder clusters

Interoperation and Collaboration Services (ICS)
• Old APAC Grid

Aust. Nat. Data Service (ANDS)
• Federation of Mass Data Stores
• Long term archiving and curation

Australian Access Federation (AAF), AREN - Network

National Coordination Council

E-Research services

Page 29

Bringing it all together - real applications

Page 31

Future Strategies

• Expand the user base
  – NCRIS, Merit Allocation Scheme, Partners
  – Open access to core grid services
• Expand the services
  – Workflow engines and tools – Kepler, Taverna
  – Data management: metadata support, collections registry
• Expand the facilities
  – Include major data centres
    • data from instruments, government agencies
  – Include institutional systems and repositories
• Resulting changes:
  – Policies: acceptable service provision
  – Organisation: coordinated user support
  – Architecture: scaling gateways
  – Technologies: Attribute-based authorisation


Page 33

[Figure not available in transcript]

Source: Office of Integrative Activities, NSF


Page 38

Three “views” of the Grid

“The grid lets me run lots of jobs all over the place” – Nimrod, Gridbus

“The grid lets me build a workflow that uses distributed resources” – Kepler, Taverna

“The grid lets me scale my workstation model to a supercomputer seamlessly” – DEISA

So grid means different things to different communities

We must deliver production quality services for all of them?

Page 39

Grid Collaboration

• Beyond Data and Compute is the AccessGrid
• Australia has had a long-term commitment to the AG
• Small, highly dispersed population
  – Queensland has the most distributed population in Aust.
• AG is still burdened by its “antique” media tools, but the concept is essential
  – Skype and IP video conferencing are insufficient

Page 40

Access Grid - ATP, Sydney: 1st in Australia… participating in SC-Global (Nov 2001)

Page 41

Australia’s 2nd AG node - in Qld. JCU, Townsville, April 2002

Minister Hon Paul Lucas

Page 42

AccessGrid

• AccessGrid is now very widely available in Australia
  – Most Universities have several nodes
  – Extensive use in teaching (AMSI)
• New tools being developed
  – HD codecs (Chris Willing UQ)
  – SRB data grid integration with AccessGrid (Atkinson / Willing)
  – International Quality Assurance Program
  – Better Multicast / Unicast integration

Page 43

SRB Browser
Connecting the DataGrid to the AccessGrid

Nigel Bajema

[Screenshot labels: AG VenueClient, AG Vic video, AG Rat audio]

SRB Browser
• AG Shared Application
• New SRB Java/Python interface library written (now part of SRB)
• All AG clients can share in data from SRB Data store
• Cross platform
• Exposes SRB metadata

Files moved from SRB to/from AG data manager

Page 44

Acknowledgment

• Lindsay Hood, APAC Grid Manager
• Rhys Francis, Former APAC Grid Manager
• David Bannon, VPAC Gateway project manager
• Rob Woodcock, CSIRO Minerals Exploration