

July 2010 Cospar10 Bremen Slide 1

SDO Data Access and Distribution in Europe and the WisSDOm Data Centre in ROB, Brussels

Authors: David Boyes, Benjamin Mampaey, Cis Verbeeck, Veronique Delouille, Jean-François Hochedez

STCE + ROB


What will be covered

• Where the data is, and the access architecture for users

• Some basic terms

• User access methods

– modules

– basic web access

– virtual observatories

– simplified web access

– pseudo files and other developments

• Interesting issues

– Retention

– Saved searches

– Evolving calibration

• Neat stuff to come

– Cutouts

– Helioviewer

– Grid integration


Where is the data for the users

• Data is available from one or more data centre(s) - all are networked

• Some users are "close", some are "far" - distance matters

• All data is available somewhere

• Users can get data (an "export")

– from the nearest centre directly

– via the nearest centre from a remote centre

– directly from another centre

• Most of this is automatic

– you will see differences in e.g. delays
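The three export routes listed above can be pictured as a simple choice. The following is only a sketch and not netDRMS code: the `Centre` type, the `choose_route` function and the notion of a numeric "distance" are all hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Centre:
    """A data centre as seen by one user (hypothetical model, not netDRMS code)."""
    name: str
    distance: float                      # assumed "network distance" to the user
    online_series: set = field(default_factory=set)


def choose_route(series, centres, allow_relay=True):
    """Pick one of the three export routes listed above for a series."""
    nearest = min(centres, key=lambda c: c.distance)
    if series in nearest.online_series:
        return ("direct from nearest", nearest.name)
    holders = [c for c in centres if series in c.online_series]
    if not holders:
        raise LookupError(f"no listed centre holds {series}")
    source = min(holders, key=lambda c: c.distance)
    if allow_relay:
        # The nearest centre fetches the data from the remote one on the user's behalf.
        return ("via nearest from remote", f"{nearest.name} <- {source.name}")
    return ("direct from remote", source.name)
```

In this picture the extra delay mentioned above would show up in the second and third routes, where the data has to cross the network from a more distant centre.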


How the data is accessed (a bit technical)

• the system is the netDRMS

– created by the JSOC at Stanford

• files are generated by content

• system holds data files + metadata

– SUMS + DRMS

• mediator is an "export" module

• makes your very own file

– FITS, tar of FITS etc.

• SQL etc. is hidden from user


Access summary ...

• No files until you ask for them

• Data is referenced by content - provided as one or more files with whatever name you want

• The exported files are built from stored elements, so e.g. FITS with Rice compression is quite direct, as AIA data is stored internally in this format

• Can get anything but...

– you may as well ask for all metadata

– the files can be large - best not to ask for hundreds


Some basic terms

• series

– basic collection of data items with shared properties

– by convention named <project>.<data>

– all series records share a metadata format (i.e. keywords)

• keywords

– FITS style keywords plus added metadata only keywords

– correspond to columns in the metadata (DRMS) database

• online means

– available from a disk at the site

– so offline means : not yet arrived/available, or deleted but can be fetched again

• data format

– whatever is stored is native (FITS, JPEG2000), conversion is post-processing

– characterised by resolution, cadence (e.g. 4K x 4K at 10s, 1K x 1K at 90s)

– naturally can't do better, but can reduce by "cutouts" in time or space

• data records

– can be several items as a group (e.g. image + bad pixel map + alternative format)

– data is SUMS plus metadata, referenced by metadata tables (DRMS) - usually one to one

– each is self-contained, for example cadence is not part of the data
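As a rough picture of these terms, the sketch below models a series name and a data record. It is not the DRMS schema: the `Record` class, its field names and `split_series_name` are hypothetical, chosen only to mirror the bullets above (keywords as metadata columns, one or more stored items per record, an online flag).

```python
from dataclasses import dataclass, field


def split_series_name(name):
    """Split a series name of the conventional form <project>.<data>."""
    project, sep, data = name.partition(".")
    if not sep or not project or not data:
        raise ValueError(f"expected <project>.<data>, got {name!r}")
    return project, data


@dataclass
class Record:
    """One data record: metadata keywords plus one or more stored items."""
    series: str
    recnum: int
    keywords: dict = field(default_factory=dict)   # metadata side (DRMS)
    items: dict = field(default_factory=dict)      # item name -> stored file (SUMS side)
    online: bool = True                            # False: must be fetched before use


rec = Record(
    series="aia_test.lev1",
    recnum=1,
    keywords={"WAVELNTH": 171},                    # illustrative keyword only
    items={"image": "image.fits", "bad_pixels": "bad_pixels.fits"},
)
```

Note that nothing in `Record` carries the cadence: as stated above, each record is self-contained, and cadence is a property of how the records of a series are spaced.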


Example series

• aia_test.lev1 : AIA images, 4Kx4K, full disk, full cadence

• aia_test.synoptic2 : AIA images reduced to 1Kx1K, full disk, 90s cadence

• hmi_test.M_45s : magnetograms, 45s cadence

• hmi_test.v_45s : dopplergrams, 45s cadence

• JPEG2000 series to come, for browsing and forecasting


User access methods

• Direct via “modules”

– on site of data centre

• Query based

– precursor to full data access

– checks a part of the data (metadata) without having to retrieve the very large part

• Indirect via network

– web/http based

– delivers data somewhere - maybe to fetch immediately or later

• Direct via wrapper

– on site e.g. IDL (Matlab on the way)


A practical pause - limitations

• Sheer size of request - even if you have a 2TB USB stick, that's only 2 days

• Network speed - at about 200Mb/s it takes a day to get a day's worth

• Search/database speed - millions of records

• Raw data access/retrieval speed - the basic image data takes time to get from disk

• Retention time - you can get anything, but you will probably have to wait if you ask for a full day of data from 2 years ago that nobody else has ever used
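The size and network figures above can be checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes), reads "2 TB is only 2 days" as roughly 1 TB per day, and ignores protocol overhead and shared links; the function names are illustrative only.

```python
TB = 10**12                      # decimal terabyte, an assumption
DAY_SECONDS = 24 * 3600


def transfer_hours(volume_bytes, rate_bits_per_s):
    """Ideal transfer time in hours at a sustained line rate."""
    return volume_bytes * 8 / rate_bits_per_s / 3600


def required_rate_mbps(volume_bytes, seconds):
    """Sustained rate in Mb/s needed to move a volume in the given time."""
    return volume_bytes * 8 / seconds / 10**6


daily_volume = 2 * TB / 2        # "a 2 TB stick is only 2 days" -> about 1 TB per day
hours = transfer_hours(daily_volume, 200 * 10**6)
print(f"one day's data at 200 Mb/s: {hours:.1f} h")
print(f"rate needed just to keep up: {required_rate_mbps(daily_volume, DAY_SECONDS):.1f} Mb/s")
```

Under these assumptions a day's worth takes about 11 hours at a perfect 200 Mb/s, so with real-world overhead the "about a day" on the slide is the right order of magnitude.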


Access by : modules - the basic bricks

• At the data centres, for example

– show_series

– show_info

– jsoc_export_as_fits

[jdb@db1 ~]$ show_info -s ds=aia_test.synoptic2
First Record: aia_test.synoptic2[2010-05-21T15:00:00.57Z][171] is first of 6 records matching first keyword, Recnum = 1
Last Record: aia_test.synoptic2[2010-07-14T11:58:41.07Z][335] is first of 2 records matching first keyword, Recnum = 445376
Last Recnum: 445377

[jdb@db1 ~]$ show_series
aia_test.lev1 aia_test.synoptic2 drms.sites hmi.doptest hmi_test.m_45s hmi_test.s_720s lm_jps.lev1_test4k10s

[jdb@db1 ~]$ jsoc_export_as_fits reqid=REQ_FTP expversion=0.5 rsquery=aia_test.lev1[:#209866] path=tmp method=url protocol=FITS
'10552320' bytes exported.
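The record names in the show_info output above follow a visible pattern: a series name followed by bracketed values. The sketch below builds and parses strings of that shape only. It is an assumption-laden illustration: `record_set` and `parse_summary_line` are hypothetical helpers, the full query grammar is defined by JSOC and is richer than this (the `[:#209866]` form in the export example is not handled), and the output format is taken solely from the session shown here.

```python
import re


def record_set(series, *key_values):
    """Build a 'series[value][value]...' string like those printed by show_info."""
    return series + "".join(f"[{value}]" for value in key_values)


# Matches the "First Record:" / "Last Record:" lines of the session above.
_SUMMARY = re.compile(
    r"^(First|Last) Record: (\S+?)((?:\[[^\]]*\])+) .*Recnum = (\d+)$"
)


def parse_summary_line(line):
    """Return the parts of a summary line, or None if the line has another shape."""
    match = _SUMMARY.match(line.strip())
    if match is None:
        return None
    which, series, keys, recnum = match.groups()
    return {
        "which": which,
        "series": series,
        "keys": re.findall(r"\[([^\]]*)\]", keys),
        "recnum": int(recnum),
    }
```

Parsing text output like this is fragile; it is shown only to make the structure of the record names explicit.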


Access by : basic web access

• System developed by JSOC : lookdata.html

• Online via JSOC web site, but heavily loaded

• Being tested at ROB

• Provides easy access to an overview of all the available data

• Formulating a selection query does require knowledge of query syntax

• Provides for a wide variety of data packaging

– normal user FITS or internal format (FITS with no keywords)

– via web for immediate or later access, as one or more individual files or as tar

– ROB working on fewer packaging options



Access by : Virtual Observatories

• VSO

– development of existing VSO

– prototype for SDO running and definitive version in preparation

– http://sdac.virtualsolar.org/cgi/search

• Soteria

– demo provider made for ROB/USET, SDO provider being coded now

– http://soteria-space.eu/

• Uniform search paradigm

• Infrastructure hides the complex syntax behind efficient searches, e.g. SQL in various flavours


Access by : Soteria Virtual Observatory

• One part of an EU project

• Based on current web access technology

• The example is for the ROB USET telescope as a data provider; each SDO site will also be able to act as a provider


Access by : simplified web access

• Work in progress

• Offer limited to direct requests for tar files or individual FITS files; acts as a front end for PFS

• Based on a simplified enquiry such as :

– aia.lev1 + time + period + cadence + wavelengths

• Preparation is actually more complex than basic access - for example it requires decisions as to what keys are useful for what series
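To make the simplified enquiry concrete, the sketch below expands "series + time + period + cadence + wavelengths" into the list of individual targets it implies. This is not the ROB front end: `expand_enquiry` and its parameter names are hypothetical, and a real implementation would still have to map each target onto the keywords of the series in question.

```python
from datetime import datetime, timedelta
from itertools import product


def expand_enquiry(series, start, period, cadence, wavelengths):
    """List the (series, time, wavelength) targets implied by a simplified enquiry."""
    if cadence <= timedelta(0):
        raise ValueError("cadence must be positive")
    steps = period // cadence                      # whole cadence steps within the period
    times = [start + i * cadence for i in range(steps)]
    return [(series, t, w) for t, w in product(times, wavelengths)]


targets = expand_enquiry(
    "aia.lev1",
    start=datetime(2010, 6, 17, 0, 0),
    period=timedelta(hours=1),
    cadence=timedelta(minutes=15),
    wavelengths=[171, 193],
)
```

One hour at 15 minute cadence for two wavelengths gives 4 x 2 = 8 targets, which shows how quickly a modest enquiry multiplies into many files.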


Access by : pseudo files (PFS)

• Systematically named files in a directory tree with no real files until you access them

• Typically based on query covering a much wider range than you really need (or could use)

• Real files kept in cache so further access very cheap
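The idea of names that exist before the files do can be sketched as a lazy cache. This is not the PFS prototype: `PseudoFileStore` and the `fetch` callable are hypothetical stand-ins for whatever actually performs the export behind the scenes.

```python
class PseudoFileStore:
    """Names are listed up front; content is fetched on first read and then cached."""

    def __init__(self, fetch):
        self._fetch = fetch          # hypothetical callable: name -> bytes
        self._cache = {}
        self.names = set()

    def register(self, names):
        """Advertise names without creating any real files."""
        self.names.update(names)

    def read(self, name):
        if name not in self.names:
            raise FileNotFoundError(name)
        if name not in self._cache:
            self._cache[name] = self._fetch(name)   # the expensive step, done once
        return self._cache[name]
```

Registering a wide range of names costs almost nothing, which is why the underlying query can cover far more than will ever be read.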


Access by : pseudo files (PFS)

mnt
`-- aia_test.lev1
    `-- 2010
        `-- 06
            `-- 17
                |-- H0000
                |   |-- AIA20100617_000000570000_0171.fits
                |   |-- AIA20100617_000003570000_0304.fits
                |   |-- AIA20100617_000009580000_94.fits
                |   |-- AIA20100617_000018570000_1600.fits
                |   |-- AIA20100617_000050070000_211.fits
                |   |-- AIA20100617_000053050000_335.fits
                |   |-- AIA20100617_000056100000_193.fits
                ......
                |   |-- AIA20100617_004505070000_335.fits
                |   |-- AIA20100617_004506570000_1600.fits
                |   |-- AIA20100617_004508070000_193.fits
                |   |-- AIA20100617_004509580000_94.fits
                |   `-- AIA20100617_004511070000_131.fits
                |-- H0100
                |   |-- AIA20100617_010000580000_0171.fits
                |   |-- AIA20100617_010002080000_211.f.......
                    |-- AIA20100617_043008060000_193.fits
                    |-- AIA20100617_043009550000_94.fits
                    |-- AIA20100617_043011090000_131.fits
                    |-- AIA20100617_043018580000_1600.fits
                    |-- AIA20100617_044500560000_0171.fits
                    |-- AIA20100617_044502050000_211.fits
                    |-- AIA20100617_044503570000_0304.fits
                    |-- AIA20100617_044505070000_335.fits
                    |-- AIA20100617_044506570000_1600.fits
                    |-- AIA20100617_044508070000_193.fits
                    |-- AIA20100617_044509580000_94.fits
                    `-- AIA20100617_044511070000_131.fits

9 directories, 160 files

• Example with 160 file names, all AIA wavelengths, 15min cadence

• In prototype at ROB, source downloadable
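The file names in the listing above appear to encode the observation date, time and wavelength. The sketch below parses them under that reading, which is an inference from the listing and not a documented naming rule: the 12 digits after the underscore are taken as HHMMSS followed by microseconds, and the wavelength field is accepted with any number of digits because the padding varies in the listing (0171 but 94). `parse_pfs_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# AIA + YYYYMMDD + "_" + HHMMSS + 6 fractional digits + "_" + wavelength + ".fits"
_NAME = re.compile(r"^AIA(\d{8})_(\d{6})(\d{6})_(\d+)\.fits$")


def parse_pfs_name(filename):
    """Return (observation time, wavelength) for a name shaped like the listing."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognised name: {filename!r}")
    date, hms, micro, wavelength = match.groups()
    when = datetime.strptime(date + hms, "%Y%m%d%H%M%S")
    return when.replace(microsecond=int(micro)), int(wavelength)
```

For example, `parse_pfs_name("AIA20100617_000003570000_0304.fits")` gives 17 June 2010 at 00:00:03.57 and wavelength 304, consistent with the H0000 directory it sits in.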


Access by : useful methods in development

• Order and notify via e-mail for manual fetch

• Order and automatic delivery (e.g. sftp)


Interesting issue - Retention

• All netDRMS sites have full information for selected series - their “subscribed” series

• But is it on line?

– sites keep the latest, but must selectively discard

• Enquiry modules can tell whether data is online, but what are the implications (delay...) if it is not?

• You can request it, but it can take some time to obtain

– for now it is quick, but after a year or so a record nobody has looked at will have to come from tape
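The slides do not say how a site decides what to discard. Purely as an illustration of "keep the latest, selectively discard", the sketch below uses a generic least-recently-accessed rule; the `select_for_discard` function and the tuple layout are assumptions, not the netDRMS or SUMS policy.

```python
def select_for_discard(records, capacity_bytes):
    """Choose records to take offline until the rest fits within the capacity.

    records: list of (name, size_bytes, last_access) tuples, where last_access
    is any sortable value (e.g. a timestamp). Oldest access goes first.
    """
    total = sum(size for _, size, _ in records)
    discard = []
    for name, size, _ in sorted(records, key=lambda record: record[2]):
        if total <= capacity_bytes:
            break
        discard.append(name)
        total -= size
    return discard
```

Under a rule like this, the record "nobody has looked at" is exactly the one that ends up offline and has to be fetched again on request.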


Interesting issue - Saved searches

• How to describe a selection of data

• Can save result as a record list for a reasonable number of records but this does not save the query

– save both query and result?

• For both your own use and publication

• Saved query might give different results (e.g. online only)

• Relates to the issue of calibration
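One way to picture "save both query and result" is a small saved-search document. The sketch below is an assumption about what such a document could hold, not an existing netDRMS feature: `save_search`, `same_result` and the JSON field names are all hypothetical.

```python
import json
from datetime import datetime, timezone


def save_search(query, recnums, online_only, run_at=None):
    """Serialise the query together with the record numbers it returned."""
    run_at = run_at or datetime.now(timezone.utc)
    return json.dumps(
        {
            "query": query,
            "online_only": online_only,
            "run_at": run_at.isoformat(),
            "recnums": sorted(recnums),
        },
        indent=2,
    )


def same_result(saved_json, new_recnums):
    """True if re-running the query returned exactly the saved records."""
    return json.loads(saved_json)["recnums"] == sorted(new_recnums)
```

Keeping the record list alongside the query makes it possible to detect when a re-run gives a different answer, for instance because some records have since gone offline.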


Interesting issue - Evolving calibration and which data did I use?

• More accurate calibration will be available as time goes on and more calibration points are acquired

• So the newest and best data can change

• For most users this is done by applying a calibration series, e.g. via Solarsoft

• But there can also be metadata changes

• The raw data is unlikely to change


Neat stuff to come - cutouts

• This is well on the way, again being developed by JSOC and LMSAL, for those who don't need the full 4Kx4K

• Very much reduced data storage requirements

• Closely related to event tracking and the HEK
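The storage saving from a spatial cutout follows directly from the pixel count. The sketch below is a toy illustration on a nested list, not the JSOC/LMSAL cutout service: `crop` and `cutout_fraction` are hypothetical helpers, and the fraction counts pixels only, ignoring compression and header size.

```python
def crop(image, row0, col0, height, width):
    """Return the sub-image of the given size whose top-left corner is (row0, col0)."""
    return [row[col0:col0 + width] for row in image[row0:row0 + height]]


def cutout_fraction(full_side, cutout_side):
    """Fraction of the pixels of a square image kept by a square cutout."""
    return (cutout_side / full_side) ** 2


# A 4x4 toy "image" with pixel values 0..15, row by row.
image = [[4 * r + c for c in range(4)] for r in range(4)]
centre = crop(image, 1, 1, 2, 2)
```

For a 512 x 512 region out of a 4096 x 4096 image, `cutout_fraction(4096, 512)` is 1/64, so a cutout of that size needs under 2% of the pixels of the full frame.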


Neat stuff to come - Helioviewer

• www.helioviewer.org

• Existing project now being directed towards use with SDO data

• JPEG2000 based viewer with event marker overlay

• integration with JPEG2000 series

• rapid browsing with links to full data

• ROB is a CoI in the requested next stage


Neat stuff to come - grid integration

• The data element size (tens of MB) is natural for use in a high performance grid

• The data is already geographically distributed - variety of access routes

• Distributed variety of resources - large clusters, pipelines, GPUs

• Sites are on high performance research networks


Thanks to

• JSOC at Stanford

• LMSAL

• Belnet and Geant2 for networking

• The enthusiastic cooperation from the partner data centres

• Our sister institutes at the ROB site for hosting the data centre and infrastructure


Web addresses

• The main source : JSOC at jsoc.stanford.edu

• HEK : www.lmsal.com/hek

• ROB : wissdom.oma.be

• SAO : www.cfa.harvard.edu/sao

• GDS : www.mps.mpg.de/projects/seismo/GDC-SDO

• UCLan : www.star.uclan.ac.uk

• IAS : idc-medoc.ias.u-psud.fr