Astronomy toolkits and data structures
Adrian Jenkins
Durham University
Data requirements of cosmological simulations
Adrian Jenkins
Durham University
Talk outline
• DiRAC and its major users
• New astronomical instruments and missions
• Mock catalogues
• Millennium simulation and database
• Future directions for simulations
DiRAC 2 facility
• Cambridge HPC Service: data analytic cluster
• Cambridge COSMOS shared memory service
• Durham ICC Service: data centric cluster (6720-core iDataPlex)
• Edinburgh: 6144-node Blue Gene/Q
• Leicester IT Services: complexity cluster
DiRAC 2 facility used by
Time allocated by RAC. Supports large projects (up to 3 years) and smaller allocations.
• Large users: UKQCD, Virgo Consortium (UK), UKMHD, Horizon, Leicester …
JWST
Launch date: ~2017-18
Cost >$5 billion
Euclid
Launch date: ~2019
Cost ~€500 million
Future large surveys
• Photometric e.g. Pan-STARRS, DES, LSST, Euclid-VIS
• Spectroscopic e.g. BOSS, BigBOSS, Euclid-NIS
• Multi-wavelength e.g. SKA (HI)
Wide-field (>10,000 sq deg), wide redshift range (z = 0-3)
Redshift surveys: 10-50 million galaxies
Imaging surveys: ~billions of galaxies
Why build a mock?
• Test galaxy formation models
• Test algorithms - validation
• Test processing pipelines
• Assess survey performance (FoM)
Large surveys need mocks now!
Mock catalogues need observables
• SFR
• SFH
• Stellar mass
• Cold gas mass
• Black hole mass
• Images
• Full SED (UV, optical, FIR, radio)
• Galaxies: stars, gas, AGN
Euclid OU-LE3 requirements for simulations
CSWG OU-SIM
Cosmological simulators
Instrument simulators
Generic needs from Euclid
• Position, redshift
• Emission line properties/spectra: line flux, equivalent width
• Broad-band photometry to AB~24-24.5: Euclid NIR; Euclid VIS; Pan-STARRS griz; DES grizy; CFHTLS ugriyz; WFCAM ZYJHK; SDSS ugriz; VISTA-VHS-VIDEO ZYJHKs
• Photometric redshifts
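To give a feel for the photometric depth quoted above, the following minimal sketch converts AB magnitudes to flux densities. It assumes the conventional AB zero point of approximately 3631 Jy; the function names are illustrative and not taken from any Euclid pipeline.

```python
import math

# Assumption: conventional AB zero point, m_AB = 0 at approximately 3631 Jy.
AB_ZERO_POINT_JY = 3631.0


def ab_mag_to_flux_jy(m_ab):
    """Flux density in Jy for a given AB magnitude."""
    return AB_ZERO_POINT_JY * 10.0 ** (-0.4 * m_ab)


def flux_jy_to_ab_mag(flux_jy):
    """AB magnitude for a given (positive) flux density in Jy."""
    if flux_jy <= 0:
        raise ValueError("flux density must be positive")
    return -2.5 * math.log10(flux_jy / AB_ZERO_POINT_JY)


if __name__ == "__main__":
    # Depth range quoted on the slide.
    for m in (24.0, 24.5):
        print(f"AB = {m}: {ab_mag_to_flux_jy(m) * 1e6:.3f} microJy")
```

At this depth a mock galaxy needs fluxes tracked below the microjansky level in every listed band.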
Specific needs: clustering
• 1% P(k) accuracy
• Covariance estimates: P(k) etc.
• Initial conditions for reconstruction
• Different cosmologies
• Different galaxy formation models (vary bias)
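One common way to obtain covariance estimates is to measure P(k) in many independent mocks and take the sample covariance across them. The sketch below illustrates this with NumPy. The input array is synthetic placeholder data (5% Gaussian scatter around an arbitrary spectrum), not simulation output, and the function name is hypothetical.

```python
import numpy as np


def pk_covariance(pk_mocks):
    """Mean P(k) and sample covariance across mocks.

    pk_mocks: array of shape (n_mocks, n_k), one measured P(k) per row.
    """
    pk_mocks = np.asarray(pk_mocks, dtype=float)
    if pk_mocks.ndim != 2 or pk_mocks.shape[0] < 2:
        raise ValueError("need a 2-D array with at least two mocks")
    mean_pk = pk_mocks.mean(axis=0)
    # rowvar=False: columns are k-bins, rows are realisations (n-1 normalisation).
    cov = np.cov(pk_mocks, rowvar=False)
    return mean_pk, cov


# Placeholder data only: arbitrary spectrum with 5% uncorrelated scatter per bin.
rng = np.random.default_rng(0)
n_mocks, n_k = 200, 8
true_pk = np.linspace(1000.0, 100.0, n_k)
pk_mocks = true_pk * (1.0 + 0.05 * rng.standard_normal((n_mocks, n_k)))

mean_pk, cov = pk_covariance(pk_mocks)
frac_err_single = np.sqrt(np.diag(cov)) / mean_pk    # scatter of one mock
frac_err_mean = frac_err_single / np.sqrt(n_mocks)   # error on the mean of all mocks
```

Comparing `frac_err_mean` with 0.01 gives a rough check of how many mocks a 1% target would need under these assumptions.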
Specific needs: clusters of galaxies
• DM haloes with M > 1e13 Msun: r(Δ = 2500, 500, 200); velocity dispersion along axes from DM particles
• For each galaxy: host halo ID, central or satellite?
• Simulated images for cluster detection and mass determination through weak/strong lensing
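The halo and galaxy properties listed above map naturally onto two linked record types. The following is a minimal sketch of such a data structure; all field names, units and example values are hypothetical and are not the Euclid or Millennium database schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Halo:
    halo_id: int
    mass_msun: float                           # halo mass in solar masses
    r2500: float                               # radius at overdensity 2500 (assumed Mpc)
    r500: float                                # radius at overdensity 500
    r200: float                                # radius at overdensity 200
    sigma_v_axes: Tuple[float, float, float]   # velocity dispersion along 3 axes (km/s)


@dataclass
class Galaxy:
    galaxy_id: int
    host_halo_id: int    # link back to the host halo
    is_central: bool     # True for the central galaxy, False for a satellite


def cluster_haloes(haloes: List[Halo], min_mass_msun: float = 1e13) -> List[Halo]:
    """Keep haloes above the cluster mass threshold."""
    return [h for h in haloes if h.mass_msun > min_mass_msun]


def satellites_per_halo(galaxies: List[Galaxy]) -> Counter:
    """Count satellite galaxies for each host halo ID."""
    return Counter(g.host_halo_id for g in galaxies if not g.is_central)


# Made-up example records.
haloes = [
    Halo(1, 5e14, 0.4, 0.9, 1.4, (900.0, 850.0, 800.0)),
    Halo(2, 3e12, 0.08, 0.17, 0.27, (150.0, 140.0, 145.0)),
]
galaxies = [
    Galaxy(10, 1, True),
    Galaxy(11, 1, False),
    Galaxy(12, 1, False),
    Galaxy(13, 2, True),
]
clusters = cluster_haloes(haloes)
n_sat = satellites_per_halo(galaxies)
```

Storing the host halo ID on each galaxy keeps the two tables joinable, which is the pattern a relational database would use.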
Specific needs: weak lensing
• Galaxies and DM to generate kappa map
• Galaxy shapes with noise (no intrinsic alignments, IA)
• Galaxy shapes with IA
• Shear at each galaxy position
• Image properties: mask, bright stars, chip boundaries, CCD defects, ghosts, variations in depth & background
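As an illustration of "galaxy shapes with noise (no IA)", the sketch below adds uncorrelated random intrinsic ellipticities to a known shear, using the first-order weak-lensing approximation e_obs ≈ e_int + shear. The shape-noise level of 0.3 per component, the Gaussian noise model and the function name are assumptions made for this example only.

```python
import numpy as np


def noisy_ellipticities(shear, n_gal, sigma_e=0.3, seed=0):
    """Observed complex ellipticities for n_gal galaxies.

    Assumes the weak-lensing limit e_obs ~ e_int + shear and no intrinsic
    alignments: each e_int is an independent Gaussian draw with standard
    deviation sigma_e per component (an assumed value).
    """
    rng = np.random.default_rng(seed)
    intrinsic = sigma_e * (
        rng.standard_normal(n_gal) + 1j * rng.standard_normal(n_gal)
    )
    return intrinsic + shear


true_shear = 0.02 - 0.01j            # complex shear g1 + i*g2 (made-up value)
n_gal = 100_000
e_obs = noisy_ellipticities(true_shear, n_gal)

shear_estimate = e_obs.mean()        # averaging beats down the shape noise
shear_error = 0.3 / np.sqrt(n_gal)   # expected error per component
```

A catalogue with IA would replace the independent draws with ellipticities correlated with the local matter field, which this sketch does not attempt.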
Infrastructure required to make mocks
• Require large simulations
• To date these have been simulations of dark matter in large cosmological volumes.
Input simulations
• Large N-body simulations
• Approaching a trillion particles
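A back-of-the-envelope estimate shows why particle counts of this order are a data problem. The sketch assumes each particle stores three single-precision positions, three single-precision velocities and one 64-bit ID (32 bytes in total); real snapshot formats differ, so treat the layout as an assumption.

```python
# Assumed per-particle layout: 3 x float32 position, 3 x float32 velocity, 1 x 64-bit ID.
BYTES_PER_PARTICLE = 3 * 4 + 3 * 4 + 8  # 32 bytes


def snapshot_bytes(n_particles, bytes_per_particle=BYTES_PER_PARTICLE):
    """Raw size in bytes of one snapshot holding n_particles."""
    return n_particles * bytes_per_particle


def to_terabytes(n_bytes):
    """Convert bytes to decimal terabytes (1 TB = 1e12 bytes)."""
    return n_bytes / 1e12


one_trillion = 10**12
size_tb = to_terabytes(snapshot_bytes(one_trillion))
print(f"One snapshot of 1e12 particles: {size_tb:.0f} TB")
```

Under these assumptions a single trillion-particle snapshot is tens of terabytes before any halo or galaxy catalogues are added.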
MXXL simulation
Future needs
Simulations for Euclid: multi-trillion particle simulations
Produce multi-petabyte datasets
Data growing faster than network capabilities
Need to scale databases up
Ideally would like to serve the raw simulation data - two or more orders of magnitude larger.
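The gap between data volume and network capability can be made concrete with simple arithmetic. The sketch below estimates how long a dataset would take to move over a link; the 1 PB size and the sustained 10 Gbit/s rate are illustrative assumptions, not figures from the talk.

```python
def transfer_days(n_bytes, link_bits_per_s):
    """Days needed to move n_bytes over a link at a sustained bit rate."""
    if link_bits_per_s <= 0:
        raise ValueError("link speed must be positive")
    seconds = n_bytes * 8 / link_bits_per_s
    return seconds / 86400.0


# Assumed values: 1 PB (1e15 bytes) over a fully utilised 10 Gbit/s link.
days = transfer_days(1e15, 10e9)
print(f"1 PB at 10 Gbit/s: {days:.1f} days")
```

Even with the link fully utilised, each petabyte takes more than a week, which is one argument for serving queries next to the data rather than shipping the data to users.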
Current and future simulations
Summary
• Cosmological simulations are required to make the best use of observatories and space missions
• The size of the required simulations makes this a Big data problem
• Databases have proved a very successful way of presenting processed data
• Making the raw simulation data public is desirable, but very challenging given financial constraints.