An Astronomical Image Mosaic Service for the
National Virtual Observatory
http://montage.ipac.caltech.edu/
ESTO
Contributors
Attila Bergou - JPL Bruce Berriman -
IPAC Ewa Deelman - ISI John Good - IPAC Joseph C. Jacob - JPL Daniel S. Katz - JPL
Carl Kesselman - ISI Anastasia Laity - IPAC Thomas Prince - Caltech Gurmeet Singh - ISI Mei-Hui Su - ISI Roy Williams - CACR
What is Montage? Delivers custom, science grade image mosaics
User specifies projection, coordinates, spatial sampling, mosaic size, image rotation
Preserve astrometry & flux
Modular “toolbox” design Loosely-coupled Engines for Image Reprojection,
Background Removal, Co-addition Control testing and maintenance costs
Flexibility; e.g., custom background algorithm; use as a reprojection and co-registration engine
Implemented in ANSI C for portability
Public service will be deployed on the Teragrid Order mosaics through web portal
Public Release of Montage Version 1.7.1 and User’s Guide available for
download at http://montage.ipac.caltech.edu/ Emphasizes accuracy in photometry and astrometry
Images processed serially Tested and validated on 2MASS 2IDR images on Linux Red
Hat 8.0 (Kernel release 2.4.18-14) on a 32-bit processor Tested on 10 WCS projections with mosaics smaller than 2
x 2 degrees and coordinate transformations Equ J2000 to Galactic and Ecliptic
Extensively tested 2,595 test cases executed 119 defects reported and 116 corrected 3 remaining defects to be corrected in future Montage
release
Applications of Montage Large scale processing of the sky; e.g., Atlasmaker Mosaics of the Infrared Sky
This is the age of Infrared Astronomy! Infrared astronomers study regions much larger than
covered by individual cameras ==> need to make mosaics to investigate star formation, redshift distribution of galaxies
Mosaics of the far infrared sky a primary data product of the SIRTF mission
Two SIRTF Legacy teams using Montage as a co-registration and mosaic engine to generate science mosaics, perform image simulations, mission planning & pipeline testing
Public Outreach - the wow factor! Combine single color images in Photoshop Example: back-lit display in NASA booth See next image - Rho Ophiuchi
Rho Ophiuchi 324 2MASS images in each band => 972 images
On a 1 GHz Sun, mosaicking takes about 15 hours
Montage: The Grid Years Re-projection is slow (100 seconds for one 1024 x 512
pixel 2MASS image on a single processor 1.4 GHz Linux box), so use parallelization inherent in design Grid is an abstraction - array of processors, grid of clusters, …
Montage has modular design - run on any environment
Prototype architecture for ordering a mosaic through a web portal Request processed on a computing grid
Prototype uses the Distributed Terascale Facility (Teragrid) This is one instance of how Montage could run on a grid
Atlasmaker is another example of Montage parallelization
Montage: The Grid Years (cont.)
Prototype version of a methodology for running on any “grid environment” Many parts of the process can be parallelized
Build a script to enable parallelization Called a Directed Acyclical Graph (DAG)
Describes flow of data and processing Describes which data are needed by which part of the
job
Describes what is to be run and when
Standard tools can execute a DAG
Using Montage Grid Prototype Web service at JPL creates an abstract workflow description of
Montage run (in XML)
Workflow description run through Pegasus to create concrete DAG Pegasus includes transfer nodes in the concrete DAG for staging in
the input image files and transferring out the generated mosaic
Concrete DAG submitted to Condor
Region Name,
Degrees
Pegasus
Concrete DAG
Condor DAGMAN
Teragrid Cluster
SDSC
NCSA
ISI Condor Pool
JPL
Abstract DAG
1 2 3
mProject 1 mProject 2 mProject 3
1 2 3
mDiff 1 2 mDiff 2 3
mFitplane D12 mFitplane D23
ax + by + c = 0 dx + ey + f = 0
a1x + b1y + c1 = 0
a2x + b2y + c2 = 0
a3x + b3y + c3 = 0
mBackground 1 mBackground 2 mBackground 3
1 2 3
mAdd
Final MosaicDescribed as abstract DAG - specifies:
• Input, output, and intermediate files
• Processing jobs• Dependencies between them
D12D23
Montage Workflow
mConcatFit
mBgModel
ax + by + c = 0
dx + ey + f = 0
Montage – Concrete DAG (single pool)
Data Stage in nodes
Montage compute nodesData stage out nodes
Registration nodes
Example DAG for 10 input files
mAdd
mBackground
mBgModel
mProject
mDiff
mFitPlane
mConcatFit
Montage Runs on the Teragrid
Test runs were done on the 1.5 degree x 1.5 degree area including M42 (Orion Nebula)
Required 113 input image files
Single pool DAG for SDSC consisted of 951 jobs 117 were data transfer jobs
113 for transferring the input image files
3 for transferring other header files
1 for transferring the final output mosaic
Run took 94 minutes
Montage Runs on the Teragrid (2)
Using same abstract DAG for multi pool DAG at SDSC and NCSA created 1202 jobs 367 were data transfer jobs
Some of these jobs transferred multiple files
249 of these 367 were inter pool data transfer jobs
Montage Computations Building a mosaic from N 1024 x 512 pixel 2MASS images
on a single processor 1.4 GHz Linux box takes roughly (N x 100) seconds
98-99% of this time is in the reprojection, which can be perfectly parallelized (this doesn’t embarrass us)
Dataset
# of images
Size of each image
Sky coverag
e
Total numbe
r of pixels (x 1012
)
Storage size (TB)
Processing time for all data in 1.4 GHz
IA32 processor hours (x 1,000)
2MASS
~ 4 million
~ 17’ x 8.5’ at 1”
~ 100% ~ 2.1 ~ 8 ~ 111
DPOSS
~ 2,600~ 6.6° x 6.6° at 1”
~ 50% ~ 1.4 ~ 3 ~ 74
SDSS (DR1)
~ 50,000~ 13.6’ x 9’ at 0.4”
~ 25% ~ 1.2 ~ 2.4 ~ 65
Summary Montage is a custom astronomical image mosaicking
service that emphasizes astrometric and photometric accuracy
First public release, Montage_v1.7.1, available for download at the Montage website
A prototype Montage service has been deployed on the Teragrid; ties together distributed services at JPL, Caltech IPAC, and ISI
More on Montage at SC2003: See ISI at ANL booth for demo of Pegasus portal running
Montage on the Teragrid See back-lit display of Montage images at NASA booth
Montage website: http://montage.ipac.caltech.edu/