2003 April 15
Data Centres: Connecting to the Real World
Clive Page
Aim
To provide Web Services interfaces to a few data centres, so that realistic data is available for the current iteration.
How many data centres? Maybe three…
XML Formats for Output
• Images: nothing yet
• Time series: nothing yet
• Spectra: nothing yet
• Tables: VOTable
Conclusion: in this iteration, handle only tabular datasets.
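As a rough illustration of what returning a tabular dataset as VOTable involves, the sketch below serialises rows into the basic VOTABLE / RESOURCE / TABLE / FIELD / TABLEDATA nesting. The function name `rows_to_votable`, the table name and the sample values are assumptions made for this example, not part of any data centre's service.

```python
import xml.etree.ElementTree as ET


def rows_to_votable(fields, rows):
    """Serialise tabular data as a minimal VOTable-style document.

    fields: list of dicts holding FIELD attributes (name, datatype, unit, ...).
    rows:   list of tuples, one value per field.
    """
    votable = ET.Element("VOTABLE", version="1.0")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name="results")  # hypothetical name
    for attrs in fields:
        ET.SubElement(table, "FIELD", attrs)
    data = ET.SubElement(table, "DATA")
    tabledata = ET.SubElement(data, "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")


if __name__ == "__main__":
    # Illustrative values only.
    example_fields = [
        {"name": "RA", "datatype": "double", "unit": "deg"},
        {"name": "DEC", "datatype": "double", "unit": "deg"},
    ]
    print(rows_to_votable(example_fields, [(180.0, -1.5), (10.25, 41.0)]))
```

A production service would also need the header and description elements required by the VOTable specification; they are omitted here to keep the sketch short.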
XML Formats for Input
• AQL (Astronomical Query Language): much discussion, not much progress.
• ASU (Astronomical Server URL): ad-hoc pre-XML definition of CGI parameters by CDS, used since 1996 by a number of archives.
• SIAP (Simple Image Access Protocol) – CGI parameters defined by NVO for their prototypes in 2002.
• XPath: draft standard for querying XML documents – designed for tree-structured data.
• XQuery: based on XPath, includes WHERE section similar to SQL, more suitable for tabular data.
• SQL: standard for RDBMS but only used by some astronomical archives
Query Language: proposal for ad-hoc solution
Use simplified form of SQL in an XML wrapper:
SELECT <list of columns>
FROM <list of tables>
WHERE <selection expression>
• <list of columns> includes UCDs so generic queries possible
• <list of tables> allows same query to be sent to >1 archive
• <selection expression> may include column/UCD names and the usual syntax of relational expressions
• Need special provision for cone-search in selection, e.g.
– Boolean pseudo-function CONE(RA, DEC, RADIUS)
• No joins, no sub-selects, no sorting or grouping, at present.
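One possible shape for the XML wrapper is sketched below. The element names (`query`, `select`, `column`, `from`, `table`, `where`) and the sample column and table names are assumptions for illustration; the proposal above does not fix a schema.

```python
import xml.etree.ElementTree as ET


def build_query(columns, tables, where):
    """Wrap a simplified SELECT/FROM/WHERE query in XML (hypothetical element names)."""
    query = ET.Element("query")
    select = ET.SubElement(query, "select")
    for col in columns:
        ET.SubElement(select, "column").text = col
    from_ = ET.SubElement(query, "from")
    for tab in tables:
        ET.SubElement(from_, "table").text = tab
    # The selection expression travels as text; "<" and ">" are escaped on output.
    ET.SubElement(query, "where").text = where
    return ET.tostring(query, encoding="unicode")


def parse_query(xml_text):
    """Recover the three parts of the query from the XML wrapper."""
    root = ET.fromstring(xml_text)
    return {
        "columns": [c.text for c in root.findall("./select/column")],
        "tables": [t.text for t in root.findall("./from/table")],
        "where": root.findtext("where"),
    }


if __name__ == "__main__":
    wrapped = build_query(
        ["RA", "DEC"], ["apm", "usnob"], "CONE(180.0, -1.5, 0.1) AND BMAG < 18"
    )
    print(wrapped)
    print(parse_query(wrapped))
```

Listing two tables under `from` is how this sketch expresses sending the same query to more than one archive. Keeping the selection as plain text leaves its parsing to each data centre.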
Possible Datasets
| Dataset | Location | System/DBMS | Input | Output |
|---|---|---|---|---|
| APM catalogue | Cambridge | Solaris/Sybase | CGI | VOTable |
| 6dF galaxy survey | Edinburgh | Win-NT/SQL server | ? | VOTable (nearly done) |
| USNO-B | Leicester | Linux/DB2 | ? | XML |
| STP datasets | RAL | Various/home grown | ? | VOTable |
| USNO-B | Leicester (LEDAS) | Solaris/WCStools | CGI | VOTable |
| SuperCOSMOS | Edinburgh | Win-NT/SQL server | ? | VOTable (in progress) |
| Vizier collection | Leicester | Linux/Sybase | ASU | VOTable |
Problem Areas
• All current services are synchronous: user waits while HTML is generated and streamed to the browser.
– How to set up an asynchronous service where results appear later, and are sent to MySpace or elsewhere?
• How is the query generated in XML format?
• How is the query in pseudo-SQL parsed into the CGI parameters or SQL the local DBMS needs?
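The last question could be approached along the lines sketched below: recognise the CONE(...) pseudo-function and rewrite it either as a local SQL predicate or as CGI parameters. Everything specific here is an assumption: the CGI parameter names `ra`, `dec`, `radius` and `columns`, the SQL function `dist_deg`, the table and column names, and the idea of a per-archive template for the cone predicate.

```python
import re
from urllib.parse import urlencode

NUM = r"[-+]?\d+(?:\.\d+)?"
# Matches CONE(ra, dec, radius) with plain decimal arguments.
CONE_RE = re.compile(r"CONE\(\s*(%s)\s*,\s*(%s)\s*,\s*(%s)\s*\)" % (NUM, NUM, NUM))


def to_sql(query, table, cone_template):
    """Expand CONE(...) with an archive-specific predicate and build local SQL."""
    def expand(match):
        ra, dec, radius = match.groups()
        return cone_template.format(ra=ra, dec=dec, radius=radius)

    where = CONE_RE.sub(expand, query["where"])
    return "SELECT {} FROM {} WHERE {}".format(
        ", ".join(query["columns"]), table, where
    )


def to_cgi(query, base_url):
    """Map a cone-only selection onto CGI parameters (hypothetical names)."""
    match = CONE_RE.fullmatch(query["where"].strip())
    if match is None:
        raise ValueError("this sketch only maps a bare CONE(...) selection to CGI")
    ra, dec, radius = match.groups()
    params = {
        "ra": ra,
        "dec": dec,
        "radius": radius,
        "columns": ",".join(query["columns"]),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    q = {"columns": ["RA", "DEC", "BMAG"],
         "where": "CONE(180.0, -1.5, 0.1) AND BMAG < 18"}
    print(to_sql(q, "apm", "dist_deg(ra, decl, {ra}, {dec}) <= {radius}"))
    cone_only = {"columns": ["RA", "DEC"], "where": "CONE(180.0, -1.5, 0.1)"}
    print(to_cgi(cone_only, "http://example.org/cgi-bin/search"))
```

Building SQL by string substitution is only acceptable in a sketch; a real service would have to validate column names and the rest of the selection expression before passing anything to the DBMS.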
Metadata Problems
• How does the query system know which column names exist, or how to translate UCDs to columns?
– It gets the information from the Registry
• How does the Registry get its information on columns and UCDs in each table? Answer: either
– It gets the information from the Data Service
– OR it gets filled in laboriously by hand.
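A minimal sketch of the UCD-to-column lookup, assuming the Registry can be viewed as a per-table mapping from UCD to local column name. The table names, column names and UCD strings below are illustrative, not taken from any real registry.

```python
# Hypothetical registry content: table -> {UCD: local column name}.
REGISTRY = {
    "apm": {"POS_EQ_RA_MAIN": "ra", "POS_EQ_DEC_MAIN": "decl"},
    "usnob": {"POS_EQ_RA_MAIN": "RAJ2000", "POS_EQ_DEC_MAIN": "DEJ2000"},
}


def resolve_columns(requested, table, registry):
    """Translate a list of UCDs or column names into the table's own column names."""
    mapping = registry[table]
    known_columns = set(mapping.values())
    resolved = []
    for name in requested:
        if name in mapping:            # a UCD the table provides
            resolved.append(mapping[name])
        elif name in known_columns:    # already a local column name
            resolved.append(name)
        else:
            raise KeyError("%r is not known for table %r" % (name, table))
    return resolved


if __name__ == "__main__":
    generic = ["POS_EQ_RA_MAIN", "POS_EQ_DEC_MAIN"]
    for table in REGISTRY:
        print(table, resolve_columns(generic, table, REGISTRY))
```

The same generic request resolves to different column names per table, which is what lets one query be sent to more than one archive.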
Conclusion
• Data centres must implement a Web Service which responds to queries about their metadata
• AQL must be extended to deal with these queries.
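To make the first conclusion concrete, the sketch below shows one way a data centre could answer a metadata query and how a Registry could harvest the reply instead of being filled in by hand. The XML element names (`tableMetadata`, `column`), the schema contents and both function names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical local schema: table -> list of (column name, UCD, unit).
LOCAL_SCHEMA = {
    "apm": [
        ("ra", "POS_EQ_RA_MAIN", "deg"),
        ("decl", "POS_EQ_DEC_MAIN", "deg"),
    ],
}


def describe_table(table, schema):
    """Data-centre side: describe a table's columns as XML."""
    root = ET.Element("tableMetadata", name=table)
    for name, ucd, unit in schema[table]:
        ET.SubElement(root, "column", name=name, ucd=ucd, unit=unit)
    return ET.tostring(root, encoding="unicode")


def harvest(xml_text):
    """Registry side: turn the metadata reply into a UCD -> column mapping."""
    root = ET.fromstring(xml_text)
    return {col.get("ucd"): col.get("name") for col in root.findall("column")}


if __name__ == "__main__":
    reply = describe_table("apm", LOCAL_SCHEMA)
    print(reply)
    print(harvest(reply))
```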