Introduction to the Grid Roy Williams, Caltech

Post on 27-Mar-2015


Enzo Case Study

Simulated dark matter density in early universe

• N-body gravitational dynamics (particle-mesh method)

• Hydrodynamics with PPM and ZEUS finite-difference

• Up to 9 species of H and He

• Radiative cooling

• Uniform UV background (Haardt & Madau)

• Star formation and feedback

• Metallicity fields

Enzo Features

– N-body gravitational dynamics (particle-mesh method)
– Hydrodynamics with PPM and ZEUS finite-difference
– Up to 9 species of H and He
– Radiative cooling
– Uniform UV background (Haardt & Madau)
– Star formation and feedback
– Metallicity fields

Adaptive Mesh Refinement (AMR)

• multilevel grid hierarchy

• automatic, adaptive, recursive

• no limits on depth, complexity of grids

• C++/F77

• Bryan & Norman (1998)

Source: J. Shalf
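
The "automatic, adaptive, recursive" idea can be sketched in one dimension. This is only an illustration and not Enzo's algorithm: the refinement criterion (the jump of a sample function across a cell), the names `refine` and `steep`, and the `max_level` cap are all assumptions made for the example. The cap is there so the toy always terminates.

```python
import math

def refine(f, lo, hi, tol, level=0, max_level=8):
    """Return a list of (lo, hi, level) leaf cells covering [lo, hi].

    A cell is split in two whenever f changes by more than `tol`
    across it (a hypothetical refinement criterion), and the same
    rule is then applied recursively to each half.
    """
    if level < max_level and abs(f(hi) - f(lo)) > tol:
        mid = 0.5 * (lo + hi)
        return (refine(f, lo, mid, tol, level + 1, max_level)
                + refine(f, mid, hi, tol, level + 1, max_level))
    return [(lo, hi, level)]

def steep(x):
    # Smooth almost everywhere, with a sharp transition near x = 0.5.
    return math.tanh(50.0 * (x - 0.5))

# Start from a coarse grid of 4 root cells on [0, 1].
cells = []
for i in range(4):
    cells += refine(steep, i / 4, (i + 1) / 4, tol=0.2)

finest = max(level for _, _, level in cells)
print(len(cells), "cells, finest level", finest)
```

Cells far from the transition stay at level 0, while cells near x = 0.5 are subdivided several times, which gives the multilevel hierarchy the slide describes.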

Distributed Computing Zoo

• Grid Computing
  – Also called High-Performance Computing
  – Big clusters, Big data, Big pipes, Big centers
  – Globus backbone, which now includes Services and Gateways
  – Decentralized control

• Cluster Computing
  – Local interconnect between identical CPUs

• Peer-to-Peer (Napster, Kazaa)
  – Systems for sharing data without a central server

• Internet Computing
  – Screensaver cycle scavenging
  – e.g. SETI@home, Einstein@home, ClimatePrediction.net, etc.

• Access Grid
  – A videoconferencing system

• Globus
  – A popular software package to federate resources into a grid

• TeraGrid
  – A $150M award from NSF to the supercomputer centers (NCSA, SDSC, PSC, etc.)

What is the Grid?

• The World Wide Web provides seamless access to information that is stored in many millions of different geographical locations.

• In contrast, the Grid is an emerging infrastructure that provides seamless access to computing power and data storage capacity distributed over the globe.

What is the Grid?

• The term “Grid” was coined by Ian Foster and Carl Kesselman in “The Grid: Blueprint for a New Computing Infrastructure”.

• Analogy with the electric power grid: plug in to computing power without worrying where it comes from, just as a toaster plugs in to electricity.

• The idea has been around under other names for a while (distributed computing, metacomputing, …).

• Technology is in place to realise the dream on a global scale.

How will it work?

• The Grid relies on advanced software, called middleware, which ensures seamless communication between different computers and different parts of the world.

• The Grid search engine will not only find the data the scientist needs, but also the data processing techniques and the computing power to carry them out.

• It will distribute the computing task to wherever in the world there is spare capacity, and send the result to the scientist.

How will it work?

The Grid middleware:

• Finds convenient places for the scientist’s “job” (computing task) to be run
• Optimises use of the widely dispersed resources
• Organises efficient access to scientific data
• Deals with authentication to the different sites
• Interfaces to local site authorisation / resource allocation
• Runs the jobs
• Monitors progress
• Recovers from problems

… and tells you when the work is complete and transfers the result back!
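
The middleware steps listed on this slide can be sketched as a toy broker. This is a minimal illustration under stated assumptions, not the interface of Globus or any real middleware: the `Site` class, the `run_job` function, the site names, and the "most spare CPUs first" policy are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    free_cpus: int
    trusted_users: set
    healthy: bool = True

def run_job(user, cpus_needed, work, sites):
    """Toy broker: pick a site, run the job, recover from failure."""
    # Find candidate sites: the user is authorized and capacity is spare.
    candidates = [s for s in sites
                  if user in s.trusted_users and s.free_cpus >= cpus_needed]
    # Assumed policy: prefer the site with the most spare capacity.
    candidates.sort(key=lambda s: s.free_cpus, reverse=True)
    for site in candidates:
        if not site.healthy:
            # "Recovers from problems": move on to the next candidate.
            continue
        site.free_cpus -= cpus_needed      # reserve the CPUs
        try:
            result = work()                # "Runs the jobs"
        finally:
            site.free_cpus += cpus_needed  # release them again
        # Report where the job ran and hand the result back.
        return site.name, result
    raise RuntimeError("no authorized site with spare capacity")

sites = [
    Site("alpha", free_cpus=64, trusted_users={"alice"}, healthy=False),
    Site("beta", free_cpus=16, trusted_users={"alice", "bob"}),
    Site("gamma", free_cpus=128, trusted_users={"bob"}),
]
where, answer = run_job("alice", 8, lambda: sum(range(10)), sites)
print(where, answer)
```

Here the largest site that trusts the user is down, so the job falls through to the next candidate; the user never has to know which site was chosen.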

Benefits for Science

• More effective and seamless collaboration of dispersed communities, both scientific and commercial

• Ability to run large-scale applications comprising thousands of computers, for a wide range of applications

• Transparent access to distributed resources from your desktop, or even your mobile phone

• The term “e-Science” has been coined to express these benefits

Five Big Ideas of Grid

• Federated sharing
  – independent management

• Trust and Security
  – access policy; authentication; authorization

• Load balancing and efficiency
  – Condor, queues, prediction, brokering

• Distance doesn’t matter
  – 20 Mbyte/sec, global certificates

• Open standards
  – NVO, FITS, MPI, Globus, SOAP
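
To put the 20 Mbyte/sec figure in perspective, the arithmetic below estimates how long a transfer takes at that rate. It assumes the figure is a sustained wide-area throughput and uses decimal units (1 Mbyte = 10^6 bytes); the function name `transfer_hours` is just for this sketch.

```python
def transfer_hours(size_bytes, rate_bytes_per_s=20e6):
    """Hours needed to move size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_s / 3600.0

one_gb = transfer_hours(1e9)    # 1 GB takes 50 seconds
one_tb = transfer_hours(1e12)   # 1 TB takes 50,000 seconds
print(f"1 GB: {one_gb * 3600:.0f} s, 1 TB: {one_tb:.1f} h")
```

At this rate a gigabyte moves in under a minute, while a terabyte takes a little under 14 hours.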

Grid as Federation

• independent centers
• flexibility
• unified interface
• power and strength
• large/small state compromise

Grid projects in the world

• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF TeraGrid
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• DOH BIRN
• NSF iVDGL

• UK e-Science Grid
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland – Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• Norway, Sweden – NorduGrid

• DataGrid (CERN, ...)
• EuroGrid (Unicore)
• DataTag (CERN, …)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (Industrial applications)
• GridLab (Cactus Toolkit)
• CrossGrid (Infrastructure Components)
• EGSO (Solar Physics)

TeraGrid Wide Area Network

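TeraGrid Resources

Sites: ANL/UC, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC

Compute resources:
• ANL/UC: Itanium2 (0.5 TF), IA-32 (0.5 TF)
• IU: Itanium2 (0.2 TF), IA-32 (2.0 TF)
• NCSA: Itanium2 (10 TF), SGI SMP (6.5 TF)
• ORNL: IA-32 (0.3 TF)
• PSC: XT3 (10 TF), TCS (6 TF), Marvel (0.3 TF)
• Purdue: Hetero (1.7 TF)
• SDSC: Itanium2 (4.4 TF), Power4 (1.1 TF)
• TACC: IA-32 (6.3 TF), Sun (Vis)

Network (Gb/s, hub):
• ANL/UC: 30, CHI
• IU: 10, CHI
• NCSA: 30, CHI
• ORNL: 10, ATL
• PSC: 30, CHI
• Purdue: 10, CHI
• SDSC: 30, LA
• TACC: 10, CHI

Other rows (values in source order; the transcript does not show which sites have no entry):
• Online storage: 20 TB, 32 TB, 600 TB, 1 TB, 150 TB, 540 TB, 50 TB
• Mass storage: 1.2 PB, 3 PB, 2.4 PB, 6 PB, 2 PB
• Data collections: Yes, Yes, Yes, Yes, Yes
• Visualization: Yes, Yes, Yes, Yes, Yes
• Instruments: Yes, Yes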

The TeraGrid Vision

Distributing the resources is better than putting them at one site

• Recently awarded $150M by NSF

• Build new, extensible, grid-based infrastructure to support grid-enabled scientific applications
  – New hardware, new networks, new software, new practices, new policies

• Expand centers to support cyberinfrastructure
  – Distributed, coordinated operations center
  – Exploit unique partner expertise and resources to make whole greater than the sum of its parts

• Leverage homogeneity to make the distributed computing easier and simplify initial development and standardization
  – Run single job across entire TeraGrid
  – Move executables between sites

TeraGrid Allocations Policies

• Any US researcher can request an allocation
  – Policies/procedures posted at: http://www.paci.org/Allocations.html
  – Online proposal submission: https://pops-submit.paci.org/

• NVO has an account on TeraGrid
  – (just ask RW)

Wide Variety of Usage Scenarios

• Tightly coupled simulation jobs storing vast amounts of data, performing visualization remotely as well as making data available through online collections (ENZO)

• Thousands of independent jobs using data from a distributed data collection (NVO)

• Science Gateways – "not a Unix prompt"!
  – from web browser with security
  – SOAP client for scripting
  – from application, e.g. IRAF, IDL

Cluster Supercomputer

Diagram labels:
• user
• login node
• job submission and queueing (Condor, PBS, ...)
• 100s of nodes
• parallel I/O
• metadata node
• parallel file system
• purged /scratch
• /home (backed-up)

MPI parallel programming

• Each node runs same program
  – first finds its number (“rank”)
  – and the number of coordinating nodes (“size”)

• Laplace solver example

Algorithm:
• Each value becomes average of neighbor values

Parallel (diagram: node 0, node 1):
• Run algorithm with ghost points
• Use messages to exchange ghost points

Serial:
• for each point, compute average
• remember boundary conditions
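
The ghost-point scheme can be sketched without an MPI installation by simulating node 0 and node 1 inside one Python process. In a real MPI program each process would ask the library for its rank and size and exchange ghost values with send/receive calls; here `exchange_ghosts` is a hypothetical stand-in that simply copies values. The one-dimensional problem, the grid size, and the boundary values (0 on the left, 1 on the right) are assumptions chosen for illustration.

```python
def serial_step(u):
    """One sweep: each interior value becomes the average of its neighbors."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new  # the two end points are left unchanged

def exchange_ghosts(left, right):
    """Stand-in for a message exchange between node 0 and node 1."""
    left[-1] = right[1]   # node 0's ghost <- node 1's first owned point
    right[0] = left[-2]   # node 1's ghost <- node 0's last owned point

def solve_serial(n, steps):
    u = [0.0] * n
    u[-1] = 1.0  # boundary conditions: 0 on the left, 1 on the right
    for _ in range(steps):
        u = serial_step(u)
    return u

def solve_two_nodes(n, steps):
    half = n // 2
    # Each node stores its owned points plus one ghost point at the cut.
    left = [0.0] * (half + 1)        # owned: global 0..half-1, ghost last
    right = [0.0] * (n - half + 1)   # ghost first, owned: global half..n-1
    right[-1] = 1.0
    for _ in range(steps):
        exchange_ghosts(left, right)   # refresh ghosts with current values
        left = serial_step(left)       # the ghost acts as a fixed end point
        right = serial_step(right)
    return left[:-1] + right[1:]       # drop ghosts, stitch halves together

n, steps = 16, 200
same = solve_two_nodes(n, steps) == solve_serial(n, steps)
print("two-node result matches serial:", same)
```

Because each node only needs its neighbor's edge value once per sweep, the amount of data exchanged stays small compared with the amount of computation on each node.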

Storage Resource Broker (SRB)

• Single logical namespace while accessing distributed archival storage resources

• Effectively infinite storage

• Data replication

• Parallel transfers

• Interfaces: command-line, API, SOAP, web/portal.
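
The idea of a single logical namespace over replicated storage can be sketched as a small catalog. This is not the SRB API: the `LogicalNamespace` class, its methods, and the store names and paths below are hypothetical and exist only to illustrate the concept.

```python
class LogicalNamespace:
    """Toy catalog mapping one logical path to replicas on several stores."""

    def __init__(self):
        self.replicas = {}   # logical path -> list of (store, physical path)
        self.metadata = {}   # logical path -> dict of user metadata

    def register(self, logical, store, physical, **meta):
        """Record one more physical copy of a logical file."""
        self.replicas.setdefault(logical, []).append((store, physical))
        self.metadata.setdefault(logical, {}).update(meta)

    def locate(self, logical, online):
        """Return the first replica held on a store that is currently online."""
        for store, physical in self.replicas.get(logical, []):
            if store in online:
                return store, physical
        raise LookupError(f"no online replica for {logical}")

ns = LogicalNamespace()
ns.register("/home/rw/survey.fits", "disk-a", "/d1/0001.fits", band="r")
ns.register("/home/rw/survey.fits", "tape-b", "/vault/ab/77.fits")

# The user always asks for the same logical name, whichever store answers.
print(ns.locate("/home/rw/survey.fits", online={"disk-a", "tape-b"}))
print(ns.locate("/home/rw/survey.fits", online={"tape-b"}))
```

The caller only ever sees the logical path; which physical copy serves the request depends on which stores are reachable at the time.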

Storage Resource Broker (SRB): Virtual Resources, Replication

Diagram labels:
• Clients: browser, SOAP client, command-line, ...
• certificate
• Storage resources: casjobs at JHU, tape at SDSC, myDisk
• File may be replicated
• File comes with metadata ... may be customized
• Similar to NVO VOStore concept

Globus

• Security
  – Single sign-on, certificate handling, CAS, MyProxy

• Execution Management
  – Remote jobs: GRAM and Condor-G

• Data Management
  – GridFTP, reliable file transfer, third-party file transfer

• Information Services
  – aggregating information from federated grid resources

• Common Runtime Components
  – New web service

Public Grids for Astronomy

• Data Pipelines
  – split into independent pieces, send to scheduler
    • Condor, PBS, Condor-G, DAGMan, Pegasus
  – big data storage
    • infinite tape, purged disk, scratch disk
    • no permanent TByte disk

• Services
  – VOStore, SIAP
  – Science gateways
    • asynchronous, secure, web, scripted
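
The "split into independent pieces, send to scheduler" pattern can be sketched with Python's standard library. A thread pool stands in for a real scheduler such as Condor or PBS, which is an assumption made so the example runs anywhere; `split`, `process`, and `run_pipeline` are hypothetical names, and the sum of squares is a placeholder for real per-piece work.

```python
from concurrent.futures import ThreadPoolExecutor

def split(items, n_pieces):
    """Cut the input into roughly equal, independent pieces."""
    return [items[i::n_pieces] for i in range(n_pieces)]

def process(piece):
    """Stand-in for one pipeline job working on one piece."""
    return sum(x * x for x in piece)

def run_pipeline(items, n_pieces=4, workers=4):
    pieces = split(items, n_pieces)
    # The pool plays the scheduler's role: it queues the jobs and
    # starts each one as soon as a worker is free.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(process, pieces))
    return sum(partial)  # merge the per-piece results

total = run_pipeline(list(range(1000)))
print(total)
```

Since no piece depends on another, the jobs can run in any order and on any machine, which is what makes this kind of pipeline a good fit for a grid.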

Public Grids for Astronomy

• Databases
  – Not really supported (note: ask audience if this is true)
  – VO effort for this (Casjobs, VOStore)

• Simulation
  – Forward: 100s of synchronized nodes, MPI
  – Inverse: independent trials, 1000s of jobs
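
The "inverse" pattern, many independent trials of a forward model, can be sketched as a parameter search. The decay model, the candidate values, and the names `forward` and `misfit` are assumptions invented for this example; a real inverse problem would replace `forward` with an actual simulation.

```python
def forward(decay, steps=50, x0=1.0):
    """Toy forward simulation: repeated multiplication by a decay factor."""
    x = x0
    for _ in range(steps):
        x *= decay
    return x

def misfit(decay, observed):
    """One independent trial: run the forward model and score it."""
    return abs(forward(decay) - observed)

observed = forward(0.93)                            # synthetic "observation"
candidates = [0.80 + 0.01 * k for k in range(20)]   # trial values 0.80 to 0.99

# No trial depends on any other, so on a grid each one could be a separate job.
scores = {d: misfit(d, observed) for d in candidates}
best = min(scores, key=scores.get)
print(f"best decay = {best:.2f}")
```

The forward case is the opposite situation: a single large run whose nodes must stay synchronized, as in the MPI example earlier in the talk.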