IVOA Interop, Cambridge UK, 2007 1 IVOA Data Access Layer Table Access Protocol Analysis Doug Tody (NRAO/NVO) INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCE


IVOA Data Access Layer
Table Access Protocol Analysis

Doug Tody (NRAO/NVO)

INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCE


TAP Context

• Architecture
  – Cross-match portal/application
  – Table Access Protocol
  – ADQL specification
  – VOSpace, UWS, SSO, etc.

• Role of TAP
  – Direct access to table data at a single site
  – Support for higher level distributed queries
  – Broader future role in DAL (complex data etc.)


Primary TAP Use-Cases

• Complex/Large Table Query
  – Large query (async, vospace, authentication)
  – Multi-table operations (join etc.)
  – Multi-region queries (table upload)
  – Advanced ADQL/SQL capabilities

  Required to support cross-match portal, advanced apps
  Full functionality is required

• Simple Table Query
  – Filter-type operation upon a single table
  – Most basic astronomical catalog access is of this type
  – ADQL, async useful but not required for simple queries

  Probably sufficient for most small data providers
  Cone search is not enough


Primary TAP Use-Cases

• Table Metadata Query
  – Metadata describing stored data is also data
  – can be virtual, subsetted, transformed, etc.
  – Client application queries TAP service for available data
  – tables, table columns, relationships, etc.
  – service metadata (capabilities etc.) is a separate issue

  Basic metadata model should be simple
  Extensibility required for advanced query support

• Data Access Query
  – This is "ADQL integration into DAL"
  – ADQL query against a DAL data model (complex data etc.)


Key Requirements/Issues

• ADQL and Grid capabilities

  – Motivation
    • Required for portals and advanced applications
    • Needs ADQL, multi-region, async, vospace, sso, etc.

  – Issues or Options
    • Not controversial: everyone agrees we need this
    • Not required however for basic usage
    • Complex; will take time to prototype, specify, standardize
    • Unrealistic to expect community implementation w/o frameworks


Key Requirements/Issues

• Simple Query capability

  – Motivation
    • Provide simple basic table access capability
    • Needed anyway for simple table metadata queries
    • Adequate for most simple filter-type queries of single table
    • Supplants cone search; much more powerful but still simple
    • Provide robust implementation while we develop advanced stuff

  – Issues or Options
    • Some want to make ADQL mandatory
      – "all data should be at data centers"
    • Options: legacy cone search plus ADQL-TAP
      – but cone search is too limited


Key Requirements/Issues

• TAP Information Schema

  – Motivation
    • Provide uniform access to both table data and metadata
    • Same query/access interface used for both
    • Supports virtual data, dynamic queries, format options, etc.
    • Easily extended without changing interface
    • Don't do one thing now, another later

  – Issues or Options
    • Need to specify/agree upon minimal core metadata
    • Strategy: Adopt registry table model with minor changes
    • Other options: VOTable with no data, literal registry XML


Key Requirements/Issues

• Proposed Core TAP/Registry Table Schema

  – Table
    • name          [[catalog.]schema.]table
    • type          base table, view, output, etc.
    • description   table description

  – Column
    • name          column name
    • tableName     table name
    • description   column description
    • unit          unit in VO standard format
    • ucd           UCD if any
    • utype         UTYPE if any
    • dataType      dataType as in VOTable/registry
    • arrayShape    array "shape"/size as in VOTable/registry
    • std           standard column (else custom addition)
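The proposed core schema can be pictured as two small record types. The following is only a minimal sketch: the field names come from the slide, while the Python types, the default values, the helper `columns_of`, and the sample catalog `public.sources` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TableMeta:
    name: str              # [[catalog.]schema.]table
    type: str              # "base table", "view", "output", etc.
    description: str = ""  # table description


@dataclass
class ColumnMeta:
    name: str                         # column name
    tableName: str                    # name of the owning table
    description: str = ""             # column description
    unit: Optional[str] = None        # unit in VO standard format
    ucd: Optional[str] = None         # UCD if any
    utype: Optional[str] = None       # UTYPE if any
    dataType: str = "char"            # dataType as in VOTable/registry (default is an assumption)
    arrayShape: Optional[str] = None  # array shape/size as in VOTable/registry
    std: bool = False                 # True for a standard column, False for a custom addition


def columns_of(table: TableMeta, columns: List[ColumnMeta]) -> List[ColumnMeta]:
    """Return the column records that belong to the given table."""
    return [c for c in columns if c.tableName == table.name]


# Hypothetical sample metadata, for illustration only.
tables = [TableMeta("public.sources", "base table", "hypothetical source catalog")]
columns = [
    ColumnMeta("ra", "public.sources", "right ascension", unit="deg",
               ucd="pos.eq.ra", dataType="double", std=True),
    ColumnMeta("dec", "public.sources", "declination", unit="deg",
               ucd="pos.eq.dec", dataType="double", std=True),
    ColumnMeta("flux", "public.sources", "measured flux",
               ucd="phot.flux", dataType="double"),
    ColumnMeta("id", "public.other", "row identifier", dataType="int"),
]

names = [c.name for c in columns_of(tables[0], columns)]
print(names)
```

Keeping `tableName` on each column record mirrors the slide: the Column table refers back to the Table table by name, so both can be served as ordinary flat tables.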


TAP Design Study

• History
  – Based upon work done by ESAC/VOQL-TEG and DAL WG in spring 2007
  – Also NVO tiger team, SkyNode experience, data center experience

• TAP Design Goals
  – Provide capability for ADQL queries to support advanced analysis
  – Define minimal implementation
    • for small data provider, common queries
    • replace legacy cone search with more general facility
  – Both data access and metadata access supported natively by service
  – Provide for scalability, in particular multi-position queries
  – Support Grid capabilities, i.e., async, staging, authentication
  – TAP should be consistent with other DAL interfaces where possible
  – Provide registry integration for automated service discovery


TAP Interface Summary

• Form of interface
  – HTTP GET/POST based (other protocols possible, e.g. SOAP, CEA)
  – Multiple output formats (VOTable, CSV/TSV, XML, VOSpace, etc.)

• Operations
  – AdqlQuery         ADQL-based queries, full functionality
  – SimpleQuery       Simple data queries, metadata queries
  – GetCapabilities   Return metadata describing the service
  – GetAvailability   Monitor runtime service function and health


AdqlQuery Operation

• Scope and Form of Interface
  – General capability for ADQL-based queries
  – Both GET and POST versions are required
    • GET is synchronous, idempotent, simple, RESTful
    • POST required for async, staging, large queries
  – Semantics, e.g., parameters, identical for both versions
  – ADQL query is URL-encoded so use in GET is not a problem

• Parameters
  – QUERY       The query string (ADQL; URL-encoded)
  – FORMAT      Output data format (VOTable, CSV, XML, etc.)
  – <staging>   Only used in POST version; for VOSpace
  – <async>     Only used in POST version; for driving UWS
  – MAXREC      Maximum records in the output table
  – RUNID       Pass-through; used for logging
  (others TBD)
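To illustrate why URL-encoding makes an ADQL string safe to carry in a GET request, here is a minimal sketch that builds such a request. The endpoint `http://example.org/tap` is hypothetical, and `REQUEST=AdqlQuery` is an assumption by analogy with the `REQUEST=SimpleQuery` examples elsewhere in these slides; only QUERY, FORMAT, MAXREC and RUNID are parameter names taken from the slide.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://example.org/tap"  # hypothetical service endpoint


def adql_query_url(query, fmt="votable", maxrec=None, runid=None, base_url=BASE_URL):
    """Build a synchronous GET URL for an ADQL query (sketch only)."""
    params = {
        "REQUEST": "AdqlQuery",  # assumed operation selector
        "QUERY": query,          # raw ADQL; urlencode() percent-encodes it
        "FORMAT": fmt,
    }
    if maxrec is not None:
        params["MAXREC"] = str(maxrec)
    if runid is not None:
        params["RUNID"] = runid
    return base_url + "?" + urlencode(params)


adql = "SELECT ra, dec FROM foo WHERE flux > 5"
url = adql_query_url(adql, maxrec=100)

# Decoding the query string recovers the original ADQL unchanged.
decoded = parse_qs(urlsplit(url).query)
print(url)
print(decoded["QUERY"][0])
```

Since the slide states that parameter semantics are identical for both versions, the same `params` dictionary could equally be sent as the body of a POST.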


AdqlQuery Operation

• Field Names, UTYPE and UCD
  – Suggest this be done at level of field rather than by operation
  – Literal field names directly access database table
  – A UTYPE reference resolves into a literal table field name
    • e.g., "ssa:Target.Name" resolves to table field "TargetName"
  – UCD (in this context) is a special case of UTYPE ("ucd:")

• Field name resolution
  – Both literal and UTYPE/UCD field names resolve to table field
  – All queries evaluated equivalently after field name resolution
  – Data models, at the level of TAP, involve only mappings
  – UFI can automate this, or it can be done client side
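Field name resolution as described above amounts to a lookup step applied before the query is evaluated. The sketch below assumes a per-table mapping dictionary; the `ssa:Target.Name` entry comes from the slide, while the `ucd:` entries, the table field list, and the function names are hypothetical.

```python
# Fields physically present in a hypothetical table.
TABLE_FIELDS = {"TargetName", "ra", "dec", "flux"}

# Mapping from UTYPE references to literal field names. A UCD is treated
# as a UTYPE in the "ucd:" namespace, so one dictionary serves both.
MAPPINGS = {
    "ssa:Target.Name": "TargetName",  # example from the slide
    "ucd:pos.eq.ra": "ra",            # hypothetical
    "ucd:pos.eq.dec": "dec",          # hypothetical
}


def resolve_field(name, table_fields=TABLE_FIELDS, mappings=MAPPINGS):
    """Resolve a literal, UTYPE, or UCD field reference to a table field."""
    if name in table_fields:  # a literal name accesses the table directly
        return name
    target = mappings.get(name)
    if target is not None and target in table_fields:
        return target
    raise KeyError(f"cannot resolve field reference: {name!r}")


def resolve_fields(names):
    """Resolve every reference so the query can be evaluated uniformly."""
    return [resolve_field(n) for n in names]


resolved = resolve_fields(["ssa:Target.Name", "ucd:pos.eq.ra", "flux"])
print(resolved)
```

Because the mapping is plain data, the same table could be applied on the client side before a request is sent, which matches the slide's remark that resolution can be done in either place.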


AdqlQuery Operation

• Multi-Position Queries
  – AKA multi-cone search; but doesn't have to be limited to position
  – Common use-case involves user source list with thousands of positions
  – Required for scalability to reduce operation overhead

• How It Works
  – Uses ADQL, REGION, POST form of operation
  – VOTable used to upload source table (ID, POS, SIZE, etc.)
    • other fields are passed through to output
    • output is tagged by source ID
    • can be generalized to any input parameter, not just position
  – POST (e.g., multipart/form-data) used to upload params, VOTable
  – Parameters are common to both GET and POST forms

• Data Scoping
  – Query, Local (DBMS), and VOSpace (Net) tables are equivalent
  – POST is a Query space table
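The intended semantics of a multi-position query (one uploaded row per search position, output tagged by source ID, extra fields passed through) can be sketched in a few lines. This is an in-memory illustration only, not the service protocol: the uploaded table is represented as a list of dictionaries rather than a VOTable, POS is split into assumed `RA`/`DEC` columns in degrees, SIZE is assumed to be a search diameter in degrees, and all data values are invented.

```python
import math


def ang_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def multi_position_query(sources, catalog):
    """Match every uploaded source position against the catalog."""
    reserved = {"ID", "RA", "DEC", "SIZE"}
    out = []
    for src in sources:
        radius = src["SIZE"] / 2.0  # assumption: SIZE is a diameter
        # Fields other than ID/position/size are passed through to the output.
        passthrough = {k: v for k, v in src.items() if k not in reserved}
        for row in catalog:
            if ang_sep_deg(src["RA"], src["DEC"], row["ra"], row["dec"]) <= radius:
                # Each output row is tagged with the ID of the source that matched.
                out.append({"source_id": src["ID"], **passthrough, **row})
    return out


# Hypothetical uploaded source list and catalog.
sources = [
    {"ID": "s1", "RA": 180.0, "DEC": 12.5, "SIZE": 0.2, "note": "target A"},
    {"ID": "s2", "RA": 10.0, "DEC": -5.0, "SIZE": 0.2, "note": "target B"},
]
catalog = [
    {"name": "c1", "ra": 180.01, "dec": 12.51},
    {"name": "c2", "ra": 10.5, "dec": -5.0},
    {"name": "c3", "ra": 10.02, "dec": -5.03},
]

result = multi_position_query(sources, catalog)
for r in result:
    print(r["source_id"], r["name"], r["note"])
```

Running all positions in one request is what gives the scalability benefit named on the slide: the per-operation overhead is paid once rather than once per source.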


SimpleQuery Operation

• Scope and Form of Interface
  – Provides capability for simple non-ADQL queries
  – Used for both data queries and metadata queries (like ADQL/SQL)
  – Only a synchronous GET version is required
  – Only a single table is queried at a time

• Motivation
  – Simple to implement, easy to use
  – >90% of actual catalog queries are simple filters of a single table
  – We need something like this anyway for simple metadata queries
    • but why limit it to only metadata?
  – Small data providers publish a few simple catalogs
  – Simpler to implement, likely to be more robust implementation


SimpleQuery Operation

• Parameters
  – SELECT     Table fields to be returned (default all)
  – FROM       The table (or view) to be accessed
  – WHERE      A filter to be applied to the table (default none)
  – POS,SIZE   Find data only in this spatial region
  – FORMAT     Output data format
  – MAXREC     Maximum records out
  – RUNID      Pass-through for logging
  (etc.)

• Provides
  – Simplified SQL-lite query (90/10 rule)
  – Both data and metadata queries
  – Simple cone search capability
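A service receiving a SimpleQuery request has to turn the query string into a small, well-defined parameter set with the defaults listed above. The sketch below shows one way to do that. The parameter names are from the slide; treating SELECT as a comma-separated field list, defaulting FORMAT to `votable`, and the returned dictionary layout are assumptions.

```python
from urllib.parse import parse_qs


def parse_simple_query(query_string):
    """Parse a SimpleQuery GET query string into a parameter dictionary."""
    raw = {k.upper(): v[0] for k, v in parse_qs(query_string).items()}
    if "FROM" not in raw:
        # Exactly one table (or view) is queried at a time.
        raise ValueError("FROM is required")
    select = raw.get("SELECT")
    return {
        "select": select.split(",") if select else None,  # None means all fields
        "from": raw["FROM"],
        "where": raw.get("WHERE"),                         # None means no filter
        "pos": tuple(float(x) for x in raw["POS"].split(",")) if "POS" in raw else None,
        "size": float(raw["SIZE"]) if "SIZE" in raw else None,
        "format": raw.get("FORMAT", "votable"),            # assumed default
        "maxrec": int(raw["MAXREC"]) if "MAXREC" in raw else None,
        "runid": raw.get("RUNID"),                         # pass-through for logging
    }


q = parse_simple_query("REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2&MAXREC=50")
print(q)
```

Rejecting a request without FROM keeps the operation within its stated scope of a single table per query.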


SimpleQuery Operation

• Metadata Queries
  – Information Schema concept
    • great concept; definition/implementation imperfect
    • but it is a standard, widely (but not completely) implemented

  – Concept
    • represent database/table metadata as data tables (views)
    • allows use of standard data table interface to query metadata
    • easily extensible without changing service interface
    • views can be used for things such as registry view

  – Examples
    • FROM=SCHEMA.tables
    • FROM=SCHEMA.columns&WHERE=tableName,foo
    • FROM=SCHEMA.columns&WHERE=tableName,foo&FORMAT=xml
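The Information Schema idea, metadata stored as ordinary tables and queried through the same interface as data, can be demonstrated with an in-memory SQLite database. This is a sketch under several assumptions: the physical table names `schema_tables` and `schema_columns`, the sample rows, and the reading of `WHERE=tableName,foo` as an equality test on `tableName` are all illustrative and are not specified by the slides.

```python
import sqlite3

# Map the logical names used in requests to physical metadata tables.
META_TABLES = {
    "SCHEMA.tables": "schema_tables",
    "SCHEMA.columns": "schema_columns",
}


def make_demo_db():
    """Create an in-memory database holding hypothetical metadata."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE schema_tables (name TEXT, type TEXT, description TEXT);
        CREATE TABLE schema_columns (name TEXT, tableName TEXT, description TEXT, unit TEXT);
        INSERT INTO schema_tables VALUES ('foo', 'base table', 'hypothetical catalog');
        INSERT INTO schema_columns VALUES ('ra', 'foo', 'right ascension', 'deg');
        INSERT INTO schema_columns VALUES ('dec', 'foo', 'declination', 'deg');
        INSERT INTO schema_columns VALUES ('name', 'bar', 'object name', NULL);
    """)
    return conn


def metadata_query(conn, from_, where=None):
    """Answer FROM=<metadata table>[&WHERE=field,value] as a plain table query."""
    sql = f"SELECT * FROM {META_TABLES[from_]}"
    args = ()
    if where is not None:
        field, value = where.split(",", 1)  # assumed "field,value" equality form
        if not field.isidentifier():
            raise ValueError(f"bad field name: {field!r}")
        sql += f" WHERE {field} = ?"
        args = (value,)
    sql += " ORDER BY rowid"  # deterministic output order
    return conn.execute(sql, args).fetchall()


conn = make_demo_db()
tables = metadata_query(conn, "SCHEMA.tables")
foo_columns = metadata_query(conn, "SCHEMA.columns", "tableName,foo")
print(tables)
print(foo_columns)
```

Adding a new kind of metadata then means adding a table or view and an entry in `META_TABLES`; the query function and the service interface stay unchanged.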


Simple Cone Search

• Approach
  – Integrate into SimpleQuery to allow additional constraints
    • would probably be too ambitious in a separate SCS standard
  – Re-use common DAL position syntax (POS, SIZE)
    • extensible in terms of region type and spatial frame
  – UTYPE/UCD field syntax allows data models to be used
  – Table to be queried is specified with FROM
  – ADQL,REGION provides an advanced alternative with common semantics

• Examples
  – REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2
  – REQUEST=SimpleQuery&FROM=foo&POS=180.0,12.5&SIZE=0.2&WHERE=flux,5/
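The two example requests can be generated and taken apart programmatically. The sketch below builds the query strings and splits the WHERE constraint into a field and a numeric range. The reading of `flux,5/` as "flux of at least 5" (an open-ended `lo/hi` range) is an assumption based on the range syntax used in other DAL interfaces; the slides do not define the WHERE grammar, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def simple_query_string(table, pos=None, size=None, where=None):
    """Build a SimpleQuery cone search query string (sketch only)."""
    params = {"REQUEST": "SimpleQuery", "FROM": table}
    if pos is not None:
        params["POS"] = f"{pos[0]},{pos[1]}"
    if size is not None:
        params["SIZE"] = str(size)
    if where is not None:
        params["WHERE"] = where
    # Keep "," and "/" literal so the output matches the slide examples.
    return urlencode(params, safe=",/")


def parse_where(where):
    """Split 'field,lo/hi' into (field, lo, hi); an empty bound is open."""
    field, spec = where.split(",", 1)
    if "/" not in spec:
        value = float(spec)
        return field, value, value  # single value: closed at both ends
    lo, hi = spec.split("/", 1)
    return field, (float(lo) if lo else None), (float(hi) if hi else None)


plain = simple_query_string("foo", pos=(180.0, 12.5), size=0.2)
filtered = simple_query_string("foo", pos=(180.0, 12.5), size=0.2, where="flux,5/")
constraint = parse_where("flux,5/")

print(plain)
print(filtered)
print(constraint)
```

Because the cone constraint and the WHERE constraint travel as separate parameters, a service can apply the spatial filter first and the field filter second, or hand both to a database back end.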


Minimal TAP Service

• Requirements
  – Implements SimpleQuery operation
    • possibly getCapabilities and getAvailability as well?
  – Provides basic data query capability
  – Provides basic metadata query capability (tables, columns)
  – No ADQL support required (but may use SQL back end)
  – No UTYPE support required