The Swinburne Pulsar Portal: Real-time Supercomputing Processing of Big Data
Presented at the Astroinformatics 2013: Knowledge from Data conference - December 11, 2013
Arna Karick
eResearch Consultant/Data Analyst/Astro
Swinburne Research
Swinburne Pulsar Portal
Real-time Supercomputing Processing of Big Data
This project is an extension of
The Swinburne University of Technology Metadata Stores Project
and partly supported by the Australian National Data Service (ANDS)
ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the
Education Investment Fund (EIF) Super Science Initiative
The ERA of All-Sky Science @ radio, optical and infrared wavelengths...
ASKAP: Wallaby all-sky HI survey - 620 Giga voxels/2.5 TB data cubes
MWA: All-sky radio survey, ~ 6 PB/yr archived data
WISE: All-sky IR survey - recent AllWISE data release (Nov 2013). Source catalog: 747 million objects
VST ATLAS: 4500 square deg. of the Southern sky (U, V, R, I, Z)
VPHAS+: VST H-alpha Survey of the Southern Galactic Plane
+ Molonglo Observatory Synthesis Telescope (MOST)
WISE IR Survey
Value of Data Access & Analysis Tools for researchers and citizen scientists...
Hubble Legacy Archive
HST & SDSS obviously....
• Dealing with the data deluge
• Efficient research (less reinventing the wheel)
• Greater Exposure
• New Collaborations
• Publications & increased citations
• New Discoveries
Swinburne Pulsar Portal - The Who?
Matthew Bailes: Research Astronomer & Pro-Vice Chancellor (Research)
Andrew Jameson: Software Developer & Systems Engineer
Chris Flynn: Research Astronomer (Molonglo Telescope)
Willem van Straten: Research Astronomer (Software & Instrumentation)
Ewan Barr: Pulsar Postdoc (HTRU reprocessing & Molonglo)
Arna Karick: eResearch (Research Data Management & Policy) & Astronomer (optical: galaxy clusters & ETGs)
Swinburne Pulsar Portal - The What?
Online tool facilitating remote access to and processing of CSIRO Parkes pulsar data
Survey snapshot
High Time Resolution Universe - HTRU (P630)
• Paper I - Keith et al. (2010) + discovery papers
• Collaborative Research (Swinburne, Manchester, ATNF, Cagliari)
• Low-lat survey: thin strip of the Galactic Plane (deep) -> faint pulsars
• Med-lat survey: bright MSPs for timing array projects
• High-lat survey: snapshot of transient sky (Sth +10)
• Rotating radio transients, short duration radio bursts
• Running for ~5 years, over 100 new pulsars, including 26 millisecond pulsars
• Survey has produced over 600 TB of raw data (Total ~875 TB)
• Data archived to tape & streamed to Swinburne via 1 Gb/s link - cont. observing
Pulsar Timing Array projects (P140)
• Detection of gravitational waves
High Time Resolution Universe North (HTRU - North)
• Effelsberg Radio Telescope
Molonglo Observatory Synthesis Telescope (MOST): ??
possibly... in consultation with research groups
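As a back-of-envelope check on the HTRU figures quoted above (over 600 TB of raw data, ~5 years of observing, a 1 Gb/s link), the sketch below estimates how long the raw data would take to move over the link and what average rate the survey implies. It assumes decimal units (1 TB = 10^12 bytes), a fully saturated link with no protocol overhead, and data spread evenly over the survey; none of these assumptions come from the slides.

```python
# Back-of-envelope data-rate check; all assumptions are illustrative.
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def transfer_days(volume_tb, link_gbps):
    """Days needed to move volume_tb terabytes over a saturated link."""
    bits = volume_tb * 1e12 * 8          # decimal terabytes -> bits
    seconds = bits / (link_gbps * 1e9)   # gigabits per second -> bits per second
    return seconds / SECONDS_PER_DAY


def average_rate_mbps(volume_tb, years):
    """Average rate in Mb/s if the volume is spread evenly over the period."""
    bits = volume_tb * 1e12 * 8
    return bits / (years * SECONDS_PER_YEAR) / 1e6


raw_days = transfer_days(600, 1.0)       # 600 TB over a 1 Gb/s link
avg_mbps = average_rate_mbps(600, 5)     # 600 TB accumulated over ~5 years

print(f"Transfer time at 1 Gb/s: {raw_days:.1f} days")
print(f"Average survey data rate: {avg_mbps:.1f} Mb/s")
```

Under these assumptions the raw data corresponds to roughly 56 days of saturated transfer, while the survey-averaged rate is only about 30 Mb/s, so a 1 Gb/s link leaves substantial headroom for streaming during continuous observing.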
Swinburne Pulsar Portal - The How?
• User friendly web interface
• Sophisticated analysis tools backed by significant processing power
• MySQL database with a PHP frontend
• Accesses a PB-scale database
• XML headers for instrumental and astrophysical metadata - format independent & editable - facilitates easy indexing
• Uses the supercomputer (gSTAR) batch queue system - email alerts - currently has ‘timeout’ in place
• Modular - datasets and analysis tools can be added over time
• Attempt to write ‘non-expert’ analysis tools
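The idea of XML headers feeding a searchable index can be sketched as follows. This is a minimal illustration only: the tag names, the sample header and the table layout are hypothetical and are not the portal's schema, and Python's built-in sqlite3 stands in for the MySQL database so that the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical header: tag names and values are illustrative only.
SAMPLE_HEADER = """
<observation>
  <instrument>
    <telescope>Parkes</telescope>
    <beam>7</beam>
  </instrument>
  <astro>
    <source>J0000+0000</source>
    <survey>HTRU</survey>
  </astro>
</observation>
"""


def flatten_header(xml_text):
    """Turn a two-level XML header into {'group.field': 'value'} pairs."""
    root = ET.fromstring(xml_text.strip())
    fields = {}
    for group in root:
        for leaf in group:
            fields[f"{group.tag}.{leaf.tag}"] = (leaf.text or "").strip()
    return fields


def index_header(conn, obs_id, fields):
    """Store every header field as a (obs_id, key, value) row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS header_index "
        "(obs_id TEXT, key TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO header_index VALUES (?, ?, ?)",
        [(obs_id, key, value) for key, value in fields.items()],
    )


def find_observations(conn, key, value):
    """Return the IDs of observations whose header has key == value."""
    rows = conn.execute(
        "SELECT DISTINCT obs_id FROM header_index WHERE key = ? AND value = ?",
        (key, value),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")   # in-memory stand-in for the real database
fields = flatten_header(SAMPLE_HEADER)
index_header(conn, "obs-0001", fields)
print(find_observations(conn, "astro.survey", "HTRU"))
```

Storing headers as generic key/value rows is one way to stay format independent: new metadata fields need no schema change, at the cost of less type checking than dedicated columns.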
Swinburne Pulsar Portal - The Why?
• Sharing of collaborative datasets - secure / proprietary periods
• Target: project collaborators, registered astronomers.. public?
• Enables users to query AND analyse data and download data products (metadata available via CSIRO's Data Access Facility)
• Alleviates TB-PB storage issues, and the guesswork associated with setup & maintenance of software & hardware infrastructure
• Access pulsar observations, search object catalogues, process time-series data with sophisticated analysis software
• Test & validate analysis techniques (for Parkes data, Molonglo & SKA)
• Improved multi-processing (e.g. orbital solutions for high-eccentricity binaries)
• Science-ready results & greater discovery potential
Swinburne Pulsar Portal – Data Processing
Swinburne Pulsar Portal – Data Tools
standard routines + novel techniques
candidate sorting
folding & optimisation
pulse periods
plotting software
editable? user software? e.g. Geophysics VO on NeCTAR
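To illustrate what "folding & optimisation" of pulse periods involves, here is a toy sketch in pure Python. It folds a time series at several trial periods and keeps the one whose folded profile has the largest variance. This is a generic textbook approach, not the portal's actual software; the function names, the synthetic data and the variance criterion are all assumptions made for illustration.

```python
def fold(samples, tsamp, period, nbins=16):
    """Average the samples into nbins phase bins at a trial period."""
    sums = [0.0] * nbins
    counts = [0] * nbins
    for i, value in enumerate(samples):
        phase = (i * tsamp / period) % 1.0      # rotational phase in [0, 1)
        b = int(phase * nbins) % nbins
        sums[b] += value
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def profile_contrast(profile):
    """Variance of the folded profile: large when pulses line up in phase."""
    mean = sum(profile) / len(profile)
    return sum((p - mean) ** 2 for p in profile) / len(profile)


def best_period(samples, tsamp, trial_periods, nbins=16):
    """Pick the trial period that gives the sharpest folded profile."""
    return max(
        trial_periods,
        key=lambda p: profile_contrast(fold(samples, tsamp, p, nbins)),
    )


def synthetic_series(period, tsamp, nsamples, duty=0.1):
    """Noise-free toy signal: 1.0 during the pulse, 0.0 otherwise."""
    return [
        1.0 if (i * tsamp / period) % 1.0 < duty else 0.0
        for i in range(nsamples)
    ]


tsamp = 0.001                                    # 1 ms sampling (illustrative)
series = synthetic_series(period=0.5, tsamp=tsamp, nsamples=20_000)
trials = [0.45, 0.48, 0.5, 0.52, 0.55]
print("Best trial period:", best_period(series, tsamp, trials))
```

Folding at the wrong period lets the pulse drift through phase and smears the profile, which is why the contrast peaks at the true period. Real searches also have to handle noise, dispersion and binary motion, which this toy example ignores.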
Swinburne Pulsar Portal – Data Products
failures ~2%
e.g. beams with interference/timeouts
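A timeout guard of the kind mentioned for the batch jobs might look like the sketch below. It is an assumption-laden illustration: the portal's real jobs run through the gSTAR batch queue, whereas here a local subprocess stands in for a processing job, and the function names and the 3-second limit are hypothetical.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_s):
    """Run one job and classify it as 'ok', 'failed' or 'timeout'."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"                 # the child is killed by subprocess.run
    return "ok" if result.returncode == 0 else "failed"


def failure_rate(outcomes):
    """Fraction of jobs that did not finish with 'ok'."""
    if not outcomes:
        return 0.0
    return sum(1 for o in outcomes if o != "ok") / len(outcomes)


# Stand-in jobs: one finishes quickly, one runs past the time limit.
jobs = [
    [sys.executable, "-c", "print('beam processed')"],
    [sys.executable, "-c", "import time; time.sleep(30)"],
]
outcomes = [run_with_timeout(job, timeout_s=3.0) for job in jobs]
print(outcomes, f"failure rate: {failure_rate(outcomes):.0%}")
```

Recording the outcome per job, rather than only a pass/fail count, makes it possible to separate timeouts from other failures such as interference-affected beams when reporting a figure like the ~2% above.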
Coming Soon...
Mid-2014
Other Projects
• MyTardis@Swinburne data solutions
  - Brain Imaging: MEG & EEG
  - Microscopy/Eng: Raman spectrometer & confocal microscope
• Research Data Management & Policy
  - Institutional research storage, Cloudstor+ (file sharing & storage)
  - Research Conduct policy, analytics & strategy
• Swinburne (ANDS) Metadata Store Project
  - Research data collections for Research Data Australia
  - Copyright, software licensing, DOIs
• Astronomy Research
  - HST/ACS Coma Cluster Treasury Survey
  - HST imaging of the Atlas3D galaxy sample