The Australian Virtual Observatory
e-Science Meeting
School of Physics, March 2003
David Barnes
What is a Virtual Observatory?
• A Virtual Observatory (VO) is a distributed, uniform interface to the data archives of the world’s major astronomical observatories.
• A VO is explored with advanced data mining and visualisation tools which exploit the unified interface to enable cross-correlation and combined processing of distributed and diverse datasets.
• VOs will rely on, and provide motivation for, the development of national and international computational and data grids.
Scientific motivation
• Understanding of astrophysical processes depends on multi-wavelength observations and input from theoretical models.
• As telescopes and instruments grow in complexity, surveys generate massive databases which require increasing expertise to comprehend.
• Theoretical modeling codes are growing in sophistication to consume available compute time.
• Major advances in astrophysics will be enabled by transparently cross-matching, cross-correlating and inter-processing otherwise disparate data.
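As a rough illustration of what cross-matching disparate data involves, the sketch below pairs sources from two catalogues by angular separation on the sky. The catalogue names, positions and match radius are made up for the example, and the brute-force nearest-neighbour search is only an assumption about how a match might be done, not a description of any Aus-VO tool.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))

def cross_match(cat_a, cat_b, radius_deg):
    """For each source in cat_a, find the nearest cat_b source within radius_deg.

    Each catalogue is a list of (name, ra_deg, dec_deg) tuples.
    Returns a list of (name_a, name_b, separation_deg).
    """
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1]))
    return matches

# Hypothetical positions, invented purely for illustration.
radio = [("R1", 150.000, -30.000), ("R2", 200.500, -45.250)]
optical = [("O1", 150.002, -30.001), ("O2", 10.000, 5.000)]

matches = cross_match(radio, optical, radius_deg=0.01)
for name_a, name_b, sep in matches:
    print(f"{name_a} <-> {name_b}: {sep * 3600:.1f} arcsec")
```

A real archive would hold far too many sources for a brute-force loop; a spatial index would be needed, but the matching criterion stays the same.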
Aus-VO in 2003
• “Phase A” funded AUD 260K by a 2003 ARC grant:
– The University of Melbourne
– The University of Sydney
– CSIRO Australia Telescope National Facility
– Anglo-Australian Observatory
• Funded common format on-line archive projects:
– HIPASS: HI spectral line and 1.4-GHz continuum survey
– SUMSS: 843 MHz continuum survey
– ATCA archive: spectral line and radio continuum images
– 2dFGRS: optical spectra of >200K southern galaxies
www.aus-vo.org
www.aus-vo.org/twiki
[Diagram: the proposed Aus-VO Grid linked by GrangeNet. Sites shown: Melbourne, Adelaide, Canberra, Sydney, Swinburne and Parkes(?). Data labels: HIPASS, ATCA, 2dFGRS, SUMSS, MSO, RAVE and Gemini(?). Compute labels: APAC and VPAC, with further sites marked "CPU?". Other labels: ATNF/AAO, Theory.]
... thinking about the Aus-VO Grid, having data nodes and compute nodes...
GrangeNet: Grid and Next Generation Network – a 10 Gbit backbone
VO Interface & Portal
• Agreement with AstroGrid (UK e-Science project) to be testers for their data publication and portal creation code.
• We are collecting the necessary resources and intend to have an AstroGrid-based portal serving HIPASS catalogue data ready for demonstration at the IAU General Assembly in July 2003.
The MACHO Grid!
• MACHO: 8-yr lightcurves for >18 million stars
• ANU, APAC and MSO have the data on mass store, and are working on a VOTable XML description of the data (metadata).
• Agreement with San Diego Supercomputer Center to install a storage resource broker (SRB) at ANU, with a view to making the MACHO data available on an international Grid.
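To give a feel for what a VOTable XML description looks like, here is a minimal sketch that builds one with Python's standard library. The element names (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD) follow the VOTable format, but the table name, field names, units and the single data row are invented for illustration and are not the actual MACHO metadata.

```python
import xml.etree.ElementTree as ET

def build_votable(table_name, fields, rows):
    """Build a minimal VOTable element tree.

    fields: list of dicts of FIELD attributes (name, datatype, unit, ...).
    rows:   list of tuples, one value per field.
    """
    votable = ET.Element("VOTABLE")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", {"name": table_name})
    for attrs in fields:
        ET.SubElement(table, "FIELD", attrs)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return votable

# Hypothetical schema and values; not the real MACHO column set.
fields = [
    {"name": "star_id", "datatype": "char", "arraysize": "*"},
    {"name": "ra", "datatype": "double", "unit": "deg"},
    {"name": "dec", "datatype": "double", "unit": "deg"},
]
rows = [("star-0001", 80.8942, -69.7561)]

votable = build_votable("lightcurve_index", fields, rows)
xml_text = ET.tostring(votable, encoding="unicode")
print(xml_text)
```

The FIELD elements carry the metadata (names, types, units), which is what lets generic VO tools interpret a table they have never seen before.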
Grid-based Visualisation
• ATNF will build a Java PixelCanvas so that AIPS++ visualisation applications can be deployed as Web-Service and Grid-Service Java Applets
• AIPS++ is modern, open-source software for reducing (radio) astronomy data, comprising 1.6M lines of code.
Grid-based Volume Rendering
• Agreement between Melbourne and AstroGrid to develop our existing distributed-data volume rendering code into a fully-fledged Grid-Service.
• Challenge is to interactively render a multi-GB cube at the IAU GA 2003, using GridFTP to transfer the data volume from a remote data warehouse to a remote rendering cluster.
[Figure: time to render a 512x512 view of a 1024x1024x1024 volume, in seconds (axis ticks 1, 10, 100, 1000), against number of nodes (0 to 40).]
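One way to picture distributed-data volume rendering is to split the cube into slabs, let each node render its own slab, and then composite the partial images. The sketch below does this with a maximum-intensity projection on a tiny synthetic volume. The choice of maximum-intensity projection, the slab decomposition along one axis and all function names are assumptions made for illustration; the slides do not say how the Melbourne code renders or partitions the data.

```python
def split_into_slabs(volume, n_nodes):
    """Split volume[z][y][x] along z into up to n_nodes contiguous slabs."""
    nz = len(volume)
    bounds = [i * nz // n_nodes for i in range(n_nodes + 1)]
    return [volume[lo:hi] for lo, hi in zip(bounds, bounds[1:]) if hi > lo]

def render_slab_mip(slab):
    """Maximum-intensity projection of one slab along z (work done on one node)."""
    ny, nx = len(slab[0]), len(slab[0][0])
    return [[max(plane[y][x] for plane in slab) for x in range(nx)]
            for y in range(ny)]

def composite(images):
    """Combine per-node images; for a maximum projection this is a pixelwise max."""
    ny, nx = len(images[0]), len(images[0][0])
    return [[max(img[y][x] for img in images) for x in range(nx)]
            for y in range(ny)]

# Small synthetic volume (8 planes of 3x4 pixels) standing in for a multi-GB cube.
volume = [[[(x + 2 * y + 3 * z) % 7 for x in range(4)]
           for y in range(3)]
          for z in range(8)]

single_node = render_slab_mip(volume)
slabs = split_into_slabs(volume, n_nodes=3)
distributed = composite([render_slab_mip(slab) for slab in slabs])

print(distributed == single_node)
```

Because each node only touches its own slab and returns a small image, the per-node work shrinks as nodes are added, while the compositing step stays cheap.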
DataGrids for Aus-VO
• Australian archives range from ~10 GB to ~10 TB in processed (reduced) size.
• Providing just the processed images and spectra on-line requires a distributed, high-bandwidth network of data servers – that is, a DataGrid.
• Users may want some simple operations, such as smoothing or filtering, applied at the data server. This is a Virtual DataGrid.
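The Virtual DataGrid idea, a simple operation applied at the data server before the data are returned, can be sketched as follows. The in-memory archive, the dataset identifier and the `serve` function are hypothetical stand-ins for a real data server, and boxcar smoothing is just one example of the "smoothing or filtering" mentioned above.

```python
def boxcar_smooth(values, width):
    """Running mean with an odd window width; the window shrinks at the edges."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical archive holding one tiny spectrum.
ARCHIVE = {"spectrum-001": [0.0, 0.0, 3.0, 0.0, 0.0]}

def serve(dataset_id, smooth_width=None):
    """Return a dataset, optionally smoothed on the server side first."""
    data = ARCHIVE[dataset_id]
    if smooth_width is None:
        return list(data)
    return boxcar_smooth(data, smooth_width)

raw = serve("spectrum-001")
smoothed = serve("spectrum-001", smooth_width=3)
print(raw)
print(smoothed)
```

Doing the operation next to the data means only the derived product crosses the network, which matters most for the larger archives.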
ComputeGrids for Aus-VO
• More complex operations may be applied requiring significant processing:
– source detection and parameterisation
– reprocessing of raw or intermediate data products with new calibration algorithms
– combined processing of raw, intermediate or "final product" data from different archives
• These operations require a distributed, high-bandwidth network of computational nodes – that is, a ComputeGrid.
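As a toy version of farming out source detection and parameterisation to several computational nodes, the sketch below runs a simple threshold detector over a few spectra in parallel. A local thread pool stands in for the compute nodes, and the detector, the threshold and the spectra are all invented for illustration; none of this reflects actual Aus-VO software.

```python
from concurrent.futures import ThreadPoolExecutor

def detect_sources(spectrum, threshold):
    """Find contiguous runs of channels above threshold and parameterise each run."""
    sources = []
    start = None
    # A trailing sentinel closes any run that reaches the last channel.
    for i, value in enumerate(list(spectrum) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            run = spectrum[start:i]
            sources.append({
                "first_channel": start,
                "last_channel": i - 1,
                "peak": max(run),
                "sum": sum(run),
            })
            start = None
    return sources

# Hypothetical spectra, one per "archive".
datasets = {
    "a": [0, 5, 6, 0, 0, 7],
    "b": [0, 0, 0],
}
THRESHOLD = 4

# Each worker plays the role of one compute node.
with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = pool.map(lambda s: detect_sources(s, THRESHOLD), datasets.values())
    results = dict(zip(datasets, outputs))

for name, found in results.items():
    print(name, found)
```

The same pattern, independent tasks dispatched to whichever node is free, is what a ComputeGrid scheduler would provide at much larger scale.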